Wednesday, July 04, 2007

Using Lucene.Net to Index And Search C# Source

SimoneB's Blog - Indexing and searching source code with Lucene.Net

"...

My idea was to create a homemade source code indexing and search service, so I started fiddling with Lucene.Net, CastleProject, C# Parser and a couple other open source projects to see what I could come up with. There are already a lot of services which allows to search source code online, see Krugle, Google Code Search and Koders among others.

Well, of course I couldn't use one of them as my course project, so I started implementing my own. I called it CS2 - C Sharp Code Search, and its source code is available under the MIT license on its Google Project Hosting website. I think it's a good example of the usage of Lucene.Net and CastleProject's IoC container in a wanna be real life project.

At the moment only the indexing part is implemented and you can see it working launching the console application project contained in the solution. ..."

Wow, do you see the brightly lit 30W CFL light bulb floating over my head?

This post is the basis for a great idea. Internal/in-house/behind the firewall/IP protected source code full text indexing and searching.

Think how cool it would be to full-text search ALL the source code hosted on a TFS server. Tie into the TFS check-in event and a service could index code as it's checked in. With a web front end (or a Web Part in the SharePoint team portal), all the developers in-house (or connected to your network via VPN, etc.) could search your source code repository.

And not just boring/normal (Windows Search, Google Desktop, X1, etc.) full-text indexing, but fully parsed indexing. So searches could be limited to methods, properties, etc. ("method:DoSomeWork CONTAINS blablabla" or "property:SillyFlag" or "comment:TODO" or "ProjectAssembly:InHouseAssembly.DLL" or...)
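To make the "fully parsed" idea a bit more concrete, here is a rough sketch of a fielded index. It is plain Python, not Lucene.Net, and it is not how CS2 works. The regexes are a crude stand-in for a real C# parser, and the names (`FieldedIndex`, `EXTRACTORS`, `Worker.cs`) are made up for illustration. It only handles simple "field:term" lookups, not the "CONTAINS" style query.

```python
import re
from collections import defaultdict

# Assumption: crude regexes standing in for a real C# parser.
# They only catch simple declarations with an explicit access modifier.
_DECL = r"\b(?:public|private|protected|internal)\s+(?:static\s+)?[\w<>\[\]]+\s+(\w+)\s*"
EXTRACTORS = {
    "method": re.compile(_DECL + r"\("),
    "property": re.compile(_DECL + r"\{\s*get"),
    "comment": re.compile(r"//(.*)"),
}


def tokenize(text):
    """Split text into lowercase word tokens."""
    return [t.lower() for t in re.findall(r"\w+", text)]


class FieldedIndex:
    """Maps (field, term) pairs to the set of files containing them."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add_file(self, path, source):
        # Run every field extractor over the file and record each term.
        for field, pattern in EXTRACTORS.items():
            for match in pattern.findall(source):
                for term in tokenize(match):
                    self.postings[(field, term)].add(path)

    def search(self, query):
        # A query looks like "field:term"; matching is case-insensitive.
        field, _, term = query.partition(":")
        return sorted(self.postings.get((field.lower(), term.lower()), set()))


SAMPLE = """
public class Worker
{
    // TODO: remove SillyFlag
    public bool SillyFlag { get; set; }

    public void DoSomeWork(int count)
    {
    }
}
"""

index = FieldedIndex()
index.add_file("Worker.cs", SAMPLE)
method_hits = index.search("method:DoSomeWork")
property_hits = index.search("property:SillyFlag")
comment_hits = index.search("comment:TODO")
```

In a real setup the parser would supply the fields and Lucene.Net would store them; the point is only that each term gets indexed under the kind of code element it came from, so a search can be scoped to that element.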

Think about dependency searching (list all projects using a given assembly, etc.) or re-use scenarios (find other projects/code snips where object/property/method XYZ is used) or refactoring (how many times has this same code snip been copied over and over) or code review or... or...
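The dependency search could start from something as simple as the sketch below. It assumes the projects are MSBuild-style .csproj files that list their assemblies in `<Reference Include="...">` elements; the project names and contents are invented for the example, and the helper names (`referenced_assemblies`, `build_usage_map`) are mine.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# XML namespace used by MSBuild project files of this era.
MSBUILD_NS = "{http://schemas.microsoft.com/developer/msbuild/2003}"


def referenced_assemblies(csproj_xml):
    """Return the simple names of assemblies referenced by one project file."""
    root = ET.fromstring(csproj_xml)
    names = set()
    for ref in root.iter(MSBUILD_NS + "Reference"):
        include = ref.get("Include", "")
        # Include may be a full name ("Foo, Version=1.0.0.0, ..."); keep "Foo".
        name = include.split(",")[0].strip()
        if name:
            names.add(name)
    return names


def build_usage_map(projects):
    """Invert project -> assemblies into assembly -> projects that use it."""
    usage = defaultdict(set)
    for project_name, xml_text in projects.items():
        for assembly in referenced_assemblies(xml_text):
            usage[assembly].add(project_name)
    return usage


# Hypothetical project files, inlined so the sketch is self-contained.
PROJECTS = {
    "WebApp": """<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <Reference Include="System" />
    <Reference Include="InHouseAssembly, Version=1.0.0.0, Culture=neutral" />
  </ItemGroup>
</Project>""",
    "Tools": """<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <Reference Include="InHouseAssembly" />
  </ItemGroup>
</Project>""",
}

usage = build_usage_map(PROJECTS)
in_house_users = sorted(usage["InHouseAssembly"])
system_users = sorted(usage["System"])
```

Feed that map into the index as one more field and "list all projects using a given assembly" becomes just another query.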

I mean, we don't have anything like this available to us now, do we (remember, I'm talking server based, full repository, in house parsed source code full text indexing)? Doesn't this seem like a no-brainer?

Hmm....

(via DotNetKicks.com - Indexing and searching source code with Lucene.Net)

6 comments:

  1. You just described the Koders Enterprise Edition product

    It indexes source code repositories such as TFS, Visual SourceSafe, Subversion, CVS, Perforce, etc... All the major ones that I'm aware of.

    If you want to get a taste for how it works, just try out http://koders.com/. It's like that, but inside your firewall indexing your own code.

  2. I forgot to mention we're coming out with the Pro edition for small teams. It'll be really cheap.

  3. Rock on! I'll check it out...

    Thank you

  4. Thanks Greg,

    I found the link below useful for creating a Lucene index in .NET:
    create lucene index in c#

  5. @Yogesh
    Nice! Thanks for commenting with that... :)

