FreeGrep is now at GitHub

Wednesday November 25, 2009

• bsd • c • computer science • FreeBSD • FreeGrep • grep • information technology • NetBSD • OpenBSD • software • software engineering • systems science • UMD •

I have been learning to use Git, the distributed version control system. I am more than a bit familiar with both CVS and Subversion, and I am finding Git to be interesting and capable of some incredible feats of source management. But when working with new tools, it is often easier to learn by experimenting on a non-trivial example. This brings us to FreeGrep.

Back in July of 1999, I was between freshman and sophomore years as a computer science major at the University of Maryland (when I returned a month later, I switched to the Department of Mathematics). When you are a compsci major, you tend to get excited over things like the Towers of Hanoi problem, self-balancing binary search trees, or NP-completeness. Back then, what got me going were finite state automata. One in particular, the Boyer-Moore search algorithm, caught my attention. (Also, see item 179 in HAKMEM.)

I had used FreeBSD, the 4.4BSD-derived operating system, for several years, finding it to be a sane and sensible alternative to the madness that surrounded the Linux development process, especially ten years ago. A number of the tools used in 4.4BSD-derived operating systems, especially toolchain components, are the GNU versions, leading to more licensing holy wars than necessary (and more than I can keep up with). So with toys-a-plenty, and some fancy-pants education, I took to writing my own version of grep(1), a regular expression pattern matcher and a cornerstone of any respectable Unix-like operating system. I posted an initial announcement to the FreeBSD Hackers mailing list (with a disturbingly misspelled subject line) where I expressed my hope that it would someday be the version shipped with FreeBSD. I had barely scratched the surface of the problem, but a few others were inspired and patches started flying across the world with incredible alacrity. Source control was nowhere to be seen, but it was just for fun, anyway. As the summer wound down, I worked less and less on the program. It was almost completely correct but was terribly slow in some pathological cases.

For reasons I can no longer recall, on September 14, 2002, I created a new CVS repository and checked in FreeGrep v0.16. In early 2003, a man named Sean Farley contacted me with some patches. I committed them to my repository, but never cut a new release. Later that year, both the NetBSD and OpenBSD projects dropped GNU grep from their source trees, added FreeGrep v0.16, and started doing a lot of heavy work, dramatically increasing the speed and solving many lingering bugs in the code. I watched, but did not say much.

Because FreeGrep is a small program (2121 SLOC in the current release of OpenBSD, the largest implementation) with a complex development history, it was the perfect place to experiment with Git. So this week, I used Git’s CVS import feature to import my CVS tree, the OpenBSD CVS tree, and the NetBSD CVS tree into one repository. Now the full history of these development strands is available. I did the obvious thing: I merged the work done by the OpenBSD project directly into the master branch and now consider it the new base. Because I have been occasionally remiss in keeping this code available to all, I have uploaded everything I have, including fifteen pre-CVS source releases, to GitHub, where it can remain available to any interested party. Please enjoy.

And submit patches.