the subversion of bonsai
Over the last couple of nights I've been trudging thru some code, mostly in Perl, to allow bonsai to work with Subversion repositories. All in all, pretty fun stuff.
The first thing I did was kind of a sanity check to see if svn commit data has all of the information that bonsai expects -- I wanted to see whether I would need to generate any of the required information myself. To determine this I took a look at bonsai's dolog.pl and rebuildcvshistory.cgi -- both populate the files and checkins tables in the bonsai database.
dirs table
Column | Type         | Comment
id     | Integer      | Primary key
dir    | VarChar(255) | directory (relative to repository)
files table
Column | Type         | Comment
id     | Integer      | Primary key
file   | VarChar(255) | filename (filename only - does not contain path)
checkins table
Column       | Type         | Comment
type         | Enum         | 'Change', 'Add', 'Remove'
ci_when      | DateTime     | Timestamp of the revision
repositoryid | Integer      | Foreign Key (FK) into repositories table
dirid        | Integer      | FK into dirs table
fileid       | Integer      | FK into files table
revision     | VarChar(32)  | revision identifier
stickytag    | VarChar(255) | tag identifier
branchid     | Integer      | FK into branches table
addedlines   | Integer      | number of lines added for this revision
removedlines | Integer      | number of lines removed for this revision
descid       | Integer      | FK into descs table
Now, while I haven't dumped all of the tables, the above should give you a feel for how it's structured. What is interesting to note is that while the checkins table has columns for branch and even stickytag, they don't seem to be populated.
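To make the dirs/files/checkins relationship a bit more concrete, here is a minimal sketch in Python using an in-memory SQLite database. It is not bonsai's actual schema or code: only a handful of the checkins columns are included, the UNIQUE constraints are my assumption, and the helper names (open_sketch_db, lookup_id, record_checkin) are made up for illustration. The point is only that a path gets split into a directory row and a filename row, and the checkins row refers to both by id.

```python
import sqlite3


def open_sketch_db():
    """Create a cut-down, in-memory version of the three tables (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dirs  (id INTEGER PRIMARY KEY, dir  VARCHAR(255) UNIQUE);
        CREATE TABLE files (id INTEGER PRIMARY KEY, file VARCHAR(255) UNIQUE);
        CREATE TABLE checkins (
            type     TEXT CHECK (type IN ('Change', 'Add', 'Remove')),
            ci_when  TEXT,
            dirid    INTEGER REFERENCES dirs(id),
            fileid   INTEGER REFERENCES files(id),
            revision VARCHAR(32)
        );
        """
    )
    return conn


def lookup_id(conn, table, column, value):
    """Return the id for value, inserting a new row if it is not there yet."""
    # Table and column names cannot be bound as parameters, so only allow known pairs.
    if (table, column) not in {("dirs", "dir"), ("files", "file")}:
        raise ValueError("unexpected table/column")
    row = conn.execute(
        f"SELECT id FROM {table} WHERE {column} = ?", (value,)
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(f"INSERT INTO {table} ({column}) VALUES (?)", (value,))
    return cur.lastrowid


def record_checkin(conn, kind, when, path, revision):
    """Split path into directory and filename, then insert one checkins row."""
    directory, _, filename = path.rpartition("/")
    dirid = lookup_id(conn, "dirs", "dir", directory)
    fileid = lookup_id(conn, "files", "file", filename)
    conn.execute(
        "INSERT INTO checkins (type, ci_when, dirid, fileid, revision) "
        "VALUES (?, ?, ?, ?, ?)",
        (kind, when, dirid, fileid, revision),
    )
    return dirid, fileid
```

Two files checked in under the same directory end up sharing one dirs row while each gets its own files row, which is what keeps the checkins table narrow.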
The heart of the bonsai system is the checkins table shown above, where each row holds the information for one unique revision of one file. When this table is populated from CVS, the RCS file list is crawled (all of the ,v files you see if you look at an actual CVS repository) and rlog is run on each file to pull its revision history, which is inserted into the table.
With SVN it is a little bit different - not much, but still different. SVN uses atomic commits, so a single revision can include multiple files, and since you can also rename and move files and directories, you end up with an interesting design hitch: do you use svn list to return the list of files and directories in the root of the SVN repository and walk that list using svn log, or do you run svn log on the root of the SVN repository and pick up every file and directory as you walk thru the revision history?
Now for the part of bonsai that updates the tables for each new checkin it's kind of a moot point (still a fun point, but made moot by how SVN does its hooks). During the post-commit hook you get the repository and revision number, so it really is a no-brainer to use the svn log command and parse the info into checkins table insertions for each file touched.
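As a rough illustration of the "parse svn log into checkins rows" idea, here is a small Python sketch that reads the XML that svn log --xml --verbose produces and yields one row per touched file. This is not the code described in this post (that is mostly Perl); the function name, the row keys, and the choice to treat a replaced path (action R) as a 'Change' are all my assumptions.

```python
import xml.etree.ElementTree as ET

# svn path actions mapped onto the checkins.type enum.
# Mapping "R" (replaced) to 'Change' is an assumption made for this sketch.
ACTION_TO_TYPE = {"A": "Add", "M": "Change", "D": "Remove", "R": "Change"}


def parse_svn_log_xml(xml_text):
    """Turn `svn log --xml --verbose` output into one dict per touched file."""
    rows = []
    root = ET.fromstring(xml_text)
    for entry in root.iter("logentry"):
        revision = entry.get("revision")
        when = entry.findtext("date", default="")
        message = entry.findtext("msg", default="")
        for path in entry.iter("path"):
            # Newer svn clients tag each path with kind="file" or kind="dir".
            # Older output has no kind attribute, in which case nothing is skipped.
            if path.get("kind") == "dir":
                continue
            full_path = (path.text or "").lstrip("/")
            directory, _, filename = full_path.rpartition("/")
            rows.append(
                {
                    "type": ACTION_TO_TYPE.get(path.get("action"), "Change"),
                    "ci_when": when,
                    "dir": directory,
                    "file": filename,
                    "revision": revision,
                    "description": message,
                }
            )
    return rows
```

Because the commit is atomic, every row that comes out of one logentry carries the same revision, timestamp and description; only the type, dir and file differ.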
For the part of bonsai that does the initial building I can see it going either way but currently I'm leaning towards the svn log method simply because that would allow me to have a common svn log parser routine. Code reuse is a Good Thing :)
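One way to picture the code reuse: the post-commit path and the initial build could differ only in the revision range handed to svn log, with everything downstream shared. The sketch below just builds the command line for both cases. The function name and the repository path are hypothetical, and it assumes a local repository on a Unix-style absolute path reachable via a file:// URL.

```python
def svn_log_argv(repos_path, revision=None):
    """Build an svn log command line for one revision or for the full history.

    repos_path is assumed to be an absolute Unix-style path to the repository,
    as handed to a post-commit hook. With revision=None the range 1:HEAD is
    used so that history comes back oldest first.
    """
    url = "file://" + repos_path
    rev_range = str(revision) if revision is not None else "1:HEAD"
    return ["svn", "log", "--xml", "--verbose", "-r", rev_range, url]


# Post-commit hook case: one revision (hypothetical path and revision number).
hook_cmd = svn_log_argv("/var/svn/example", 42)

# Initial build case: walk everything from the first revision forward.
build_cmd = svn_log_argv("/var/svn/example")
```

Either list could be passed to subprocess.run and the captured output fed to the same parser routine.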
I still need to chew it over in my head but so far I haven't seen any show stoppers, only a couple of questions that need working out. One of them is that doing it the svn log way changes how bonsai works with SVN compared to how it works with CVS. Not that I consider that a "bad" thing - just making note of it.
Hmm, this has turned out to be a nice rambling post, so let me end it now. I'll continue it after I've run OSAF's CVS repository thru one bonsai and the SVN repository thru another to make sure I'm getting the same insertions. Woo, fun stuff!
