the subversion of bonsai

Over the last couple of nights I've been trudging thru some code, mostly in Perl, to allow bonsai to work with Subversion repositories. All in all pretty fun stuff.

The first thing I did was kind of a sanity check to see if svn commit data has all of the information that bonsai expects -- I wanted to see if I would need to generate any information required. To determine this I took a look at bonsai's dolog.pl and rebuildcvshistory.cgi -- both populate the files and checkins tables in the bonsai database.

dirs table

| Column | Type | Comment |
|--------|------|---------|
| id | Integer | Primary key |
| dir | VarChar(255) | directory (relative to repository) |

files table

| Column | Type | Comment |
|--------|------|---------|
| id | Integer | Primary key |
| file | VarChar(255) | filename (filename only - does not contain path) |

checkins table

| Column | Type | Comment |
|--------|------|---------|
| type | Enum | 'Change', 'Add', 'Remove' |
| ci_when | DateTime | Timestamp of the revision |
| repositoryid | Integer | Foreign Key (FK) into repositories table |
| dirid | Integer | FK into dirs table |
| fileid | Integer | FK into files table |
| revision | VarChar(32) | revision identifier |
| stickytag | VarChar(255) | tag identifier |
| branchid | Integer | FK into branches table |
| addedlines | Integer | number of lines added for this revision |
| removedlines | Integer | number of lines removed for this revision |
| descid | Integer | FK into descs table |

Now, while I haven't dumped all of the tables, the above should give you a feel for how it's structured. What is interesting to note is that while the checkins table has columns for branch and even stickytag, they don't seem to be populated.
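To make the relationship between the three tables concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is an illustration only, not bonsai's actual Perl or its real schema: only a handful of the columns listed above are included, the UNIQUE constraints are an assumption, and the helper names (`open_demo_db`, `lookup_id`, `add_checkin`) are made up for this example.

```python
import sqlite3


def open_demo_db():
    """Create a cut-down version of the dirs, files and checkins tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE dirs  (id INTEGER PRIMARY KEY, dir  VARCHAR(255) UNIQUE);
        CREATE TABLE files (id INTEGER PRIMARY KEY, file VARCHAR(255) UNIQUE);
        CREATE TABLE checkins (
            type     TEXT CHECK (type IN ('Change', 'Add', 'Remove')),
            ci_when  TEXT,
            dirid    INTEGER REFERENCES dirs(id),
            fileid   INTEGER REFERENCES files(id),
            revision VARCHAR(32)
        );
    """)
    return db


def lookup_id(db, table, column, value):
    """Return the id for value in a lookup table, inserting it if missing."""
    # table and column are hard-coded by the caller, never user input.
    row = db.execute(
        f"SELECT id FROM {table} WHERE {column} = ?", (value,)
    ).fetchone()
    if row:
        return row[0]
    cursor = db.execute(f"INSERT INTO {table} ({column}) VALUES (?)", (value,))
    return cursor.lastrowid


def add_checkin(db, kind, when, directory, filename, revision):
    """Insert one checkins row: one revision of one file."""
    db.execute(
        "INSERT INTO checkins VALUES (?, ?, ?, ?, ?)",
        (
            kind,
            when,
            lookup_id(db, "dirs", "dir", directory),
            lookup_id(db, "files", "file", filename),
            revision,
        ),
    )
```

Two revisions of the same file end up as two checkins rows that share a single dirs row and a single files row, which is the point of splitting the path into a directory part and a filename part.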

The heart of the bonsai system is the checkins table shown above, where each row holds the information for one unique revision of one file. When this table is populated from CVS, the RCS file list is crawled (all of the ,v files you see if you look at an actual CVS repository) and rlog is run on each file to pull its revision history, which is inserted into the table.

With SVN it is a little bit different - not much, but still different. SVN uses atomic commits, so a single revision can include multiple files, and files and directories can be renamed and moved. You end up with an interesting design hitch: do you use svn list to get the files and directories under the root of the SVN repository and walk that list running svn log on each one, or do you run svn log on the root of the repository and pick up all of the files and directories as you walk thru the revision history?

Now for the part of bonsai that updates the tables for each new checkin it's kind of a moot point (still a fun point, but made moot by how SVN does its hooks). During the post-commit hook you get the repository and revision number, so it's really a no-brainer to use the svn log command and parse the info into checkins table insertions for each file touched.
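Here is a rough sketch of what that parsing step could look like, written in Python for illustration rather than the Perl that bonsai uses. It assumes the log was fetched with `svn log --verbose --xml`, and it only fills in the columns that come straight from the log. The function name, the sample data, and the choice to treat a replaced path (action `R`) as a 'Change' are assumptions of this example, not anything bonsai defines.

```python
import posixpath
import xml.etree.ElementTree as ET

# Assumed mapping from svn path actions to the checkins.type enum.
ACTION_TO_TYPE = {"M": "Change", "A": "Add", "D": "Remove", "R": "Change"}

# Hand-written sample in the shape of `svn log --verbose --xml` output.
SAMPLE_LOG = """<log>
<logentry revision="42">
<author>alice</author>
<date>2005-03-01T12:34:56.000000Z</date>
<paths>
<path action="M">/trunk/src/main.c</path>
<path action="A">/trunk/src/util.c</path>
</paths>
<msg>Split helpers out of main.c</msg>
</logentry>
</log>"""


def parse_svn_log_xml(xml_text):
    """Yield one dict per touched path, shaped like a checkins row.

    Note: svn lists directories as changed paths too; this sketch does
    not try to tell them apart from files.
    """
    root = ET.fromstring(xml_text)
    for entry in root.iter("logentry"):
        revision = entry.get("revision")
        when = entry.findtext("date", default="")
        message = entry.findtext("msg", default="")
        for path in entry.iter("path"):
            # Split "/trunk/src/main.c" into ("trunk/src", "main.c").
            full_path = (path.text or "").strip().lstrip("/")
            directory, filename = posixpath.split(full_path)
            yield {
                "type": ACTION_TO_TYPE.get(path.get("action"), "Change"),
                "ci_when": when,
                "dir": directory,
                "file": filename,
                "revision": revision,
                "description": message,
            }
```

Because SVN commits are atomic, one log entry fans out into several rows that all carry the same revision, timestamp and log message.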

For the part of bonsai that does the initial building I can see it going either way but currently I'm leaning towards the svn log method simply because that would allow me to have a common svn log parser routine. Code reuse is a Good Thing :)
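One way to picture the reuse: if both code paths go thru svn log, the only thing that differs is the revision range that gets asked for. A small Python sketch under that assumption (the function name and the example URL are hypothetical; the `--verbose`, `--xml` and `--revision` options are standard svn log options):

```python
def svn_log_command(repo_url, revision=None):
    """Build the svn log command line for one revision or the full history.

    revision=None means the initial build: every revision, oldest first.
    """
    rev_range = str(revision) if revision is not None else "1:HEAD"
    return ["svn", "log", "--verbose", "--xml", "--revision", rev_range, repo_url]


# Post-commit hook: a single revision.
hook_cmd = svn_log_command("file:///var/svn/example", 42)

# Initial build: the whole history, fed to the same parser.
rebuild_cmd = svn_log_command("file:///var/svn/example")
```

Either command list could be handed to something like `subprocess.run(..., capture_output=True)` and the resulting XML passed to one shared parsing routine.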

I still need to chew it over in my head but so far I haven't seen any show stoppers, only a couple of questions that need working out. One of the questions is that by doing it the svn log way it changes how bonsai works with SVN from how it works with CVS. Not that I consider that a "bad" thing - just making note of it.

Hmm, this has turned out to be a nice rambling post so let me end it now and I'll continue it after I've run OSAF's cvs repository thru one bonsai and the svn repository thru another to make sure I'm getting the same insertions. woo fun stuff!

