Wednesday, July 04, 2007

Using Lucene.Net to Index And Search C# Source

SimoneB's Blog - Indexing and searching source code with Lucene.Net

"...

My idea was to create a homemade source code indexing and search service, so I started fiddling with Lucene.Net, CastleProject, C# Parser and a couple other open source projects to see what I could come up with. There are already a lot of services which allows to search source code online, see Krugle, Google Code Search and Koders among others.

Well, of course I couldn't use one of them as my course project, so I started implementing my own. I called it CS2 - C Sharp Code Search, and its source code is available under the MIT license on its Google Project Hosting website. I think it's a good example of the usage of Lucene.Net and CastleProject's IoC container in a wanna be real life project.

At the moment only the indexing part is implemented and you can see it working launching the console application project contained in the solution. ..."

Wow, do you see the brightly lit 30 W CFL light bulb floating over my head?

This post is the basis for a great idea: internal, in-house, behind-the-firewall, IP-protected full text indexing and searching of source code.

Think how cool it would be to full text search ALL the source code hosted on a TFS server. Tie into the TFS check-in event and a service could index code as it's checked in. With a web front end (or a Web Part in the SharePoint team portal), all the developers in-house (or connected to your network via VPN, etc.) could search your source code repository.

And not just boring/normal full text indexing (Windows Search, Google Desktop, X1, etc.), but fully parsed indexing. So searches could be limited to methods, properties, etc. ("method:DoSomeWork CONTAINS blablabla" or "property:SillyFlag" or "comment:TODO" or "ProjectAssembly:InHouseAssembly.DLL" or...)
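To make the "field:term" idea concrete, here's a rough sketch of a fielded inverted index. It's plain Python with no libraries, not Lucene.Net's API, and the class name, field names and file names are all made up for illustration. It assumes something else (a C# parser, say) has already split each file into fields like method, property and comment.

```python
import re
from collections import defaultdict


class FieldedIndex:
    """Toy inverted index keyed by (field, term). Hypothetical; not Lucene.Net."""

    def __init__(self):
        # (field, term) -> set of document ids containing that term in that field
        self._postings = defaultdict(set)

    @staticmethod
    def _tokens(text):
        # Lowercase and split on non-word characters
        return re.findall(r"\w+", text.lower())

    def add(self, doc_id, fields):
        """Index a document given as {field_name: text}."""
        for field, text in fields.items():
            for term in self._tokens(text):
                self._postings[(field.lower(), term)].add(doc_id)

    def search(self, query):
        """AND together space-separated field:term clauses."""
        result = None
        for clause in query.split():
            field, sep, value = clause.partition(":")
            if not sep:
                raise ValueError("expected field:term, got %r" % clause)
            for term in self._tokens(value):
                hits = self._postings.get((field.lower(), term), set())
                result = hits if result is None else result & hits
        return sorted(result or [])


index = FieldedIndex()
index.add("Worker.cs", {"method": "DoSomeWork", "comment": "TODO: handle retries"})
index.add("Flags.cs", {"property": "SillyFlag", "comment": "TODO: remove this flag"})

print(index.search("method:DoSomeWork"))
print(index.search("comment:TODO"))
print(index.search("comment:TODO property:SillyFlag"))
```

A real index would also store positions, handle phrases and rank results. The point here is only that once the parser tags text by field, a field-limited search is a dictionary lookup.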

Think about dependency searching (list all projects using a given assembly, etc.) or re-use scenarios (find other projects/code snips where object/property/method XYZ is used) or refactoring (how many times has this same code snip been copied over and over) or code review or... or...
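The "how many times has this been copied" question could be approached by hashing small windows of normalized lines and looking for repeats. The sketch below is one possible way to do it, again in plain Python. The function name, the window size and the sample files are assumptions for illustration, and the normalization (trim whitespace, drop blank lines) is deliberately naive.

```python
import hashlib
from collections import defaultdict


def find_copied_snippets(files, window=3):
    """Return {digest: [(path, index), ...]} for snippets seen more than once.

    `files` maps a path to its source text. `index` counts non-blank,
    whitespace-trimmed lines, so it is not the original line number.
    """
    seen = defaultdict(list)
    for path, source in files.items():
        # Naive normalization: trim each line and skip blank ones
        lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
        for i in range(len(lines) - window + 1):
            snippet = "\n".join(lines[i:i + window])
            digest = hashlib.sha1(snippet.encode("utf-8")).hexdigest()
            seen[digest].append((path, i))
    # Keep only snippets that occur in more than one place
    return {d: locs for d, locs in seen.items() if len(locs) > 1}


files = {
    "A.cs": "int x = 1;\n  x++;\nreturn x;\n",
    "B.cs": "// copy\nint x = 1;\nx++;\n\nreturn x;\n",
}
for locations in find_copied_snippets(files).values():
    print(locations)
```

This only catches copies that match exactly after trimming. Renamed variables or reordered statements would need token-level or syntax-tree comparison, which is where the fully parsed part would earn its keep.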

I mean, we don't have anything like this available to us now, do we? (Remember, I'm talking server-based, full-repository, in-house, parsed source code full text indexing.) Doesn't this seem like a no-brainer?

Hmm...

(via DotNetKicks.com - Indexing and searching source code with Lucene.Net)

6 comments:

Anonymous said...

You just described the Koders Enterprise Edition product.

It indexes source code repositories such as TFS, Visual SourceSafe, Subversion, CVS, Perforce, etc... All the major ones that I'm aware of.

If you want to get a taste for how it works, just try out http://koders.com/. It's like that, but inside your firewall indexing your own code.

Anonymous said...

I forgot to mention we're coming out with the Pro edition for small teams. It'll be really cheap.

Greg said...

Rock on! I'll check it out...

Thank you

Yogesh said...

Thanks Greg,

I found the link below useful for creating a Lucene index in .NET:
create lucene index in c#

Greg said...

@Yogesh
Nice! Thanks for commenting with that... :)

Anonymous said...

How about this: Search Lucene Index in C# with Sorting Options