Photo from Flickr by smee.bruce
The IDE is a software developer's toolbench.  For the most part it's an ordered chaos, with all of a developer's favorite windows and extensions at the ready.  Unfortunately, the IDE's complexity helps to hide some of its glaring shortcomings -- it's easy to focus on the shiny wrenches across the front while ignoring your Grandfather's 1970s disc sander that's about to melt down from misuse. Like Grandpa's sander, code search* is a tool that's in dire need of an overhaul.

* In this article code search refers to the act of searching over the code base that one is currently working on, not searching over repositories of code.

Code Search Sucks.... Really?
Developers, having grown up on awful search tools, have become numb to the problem.  Many may not even believe it is a problem! In order to illustrate the sorry state of code search let's explore an analogy based on web search...

I've got a new web search service that I'd like for you to try, called Gaagle.  It's incorporated all of the key features of modern code search.  Let's try searching for "Visual Studio" using Gaagle:
Oops! I misspelled "Visual Studio", but Gaagle doesn't fix my mistake (or even point it out). Gaagle obediently reports that there are, in fact, no matches for the search terms "Vizual Sudio". Thanks Gaagle!
Having corrected my search terms to "Visual Studio" I search using Gaagle again.  This time I get lots of hits!  However, the hits are visualized as a list of paths, leading to an exciting game of "what's behind that link!"  After clicking on each link in the list I have a pretty good idea (and a sore mouse-finger).
But wait, there's been an upgrade to Gaagle! Instead of returning results in an ordered list, Gaagle now returns unordered results. That means that each and every web page in the world that contains the terms "Visual Studio" is returned to me for my search. No worries! I start wading through the thousands of (mainly irrelevant) hits until I find what I'm looking for.

Not ready to switch to Gaagle? Well surprise... you're already using it! If you're using Visual Studio 2010 or earlier your code search (i.e., Edit > Find and Replace > Find in Files) operates almost exactly like Gaagle.

Join Us in Fixing Code Search!
Visual Studio, so powerful in many other ways, has neglected code search. This January we're starting a project called Sando to modernize code search. If you long for a modern code search experience, please join us at http://sando.codeplex.com/. We're looking for great developers (with C# or Java experience), but also for any suggestions or requests that can help shape our search tool.



Sean Garrett
01/05/2012 9:17pm

um... this has been solved: http://hub.opensolaris.org/bin/view/Project+opengrok/

Also, if you're specifically worried about C#, creating a parser for C# is trivial.

01/06/2012 6:21am

@Sean Garrett

Hi Sean, thanks for your reply! I agree that OpenGrok is a great project and solves some of the problems I mention in my blog. However, it differs from the vision for our new project in several important ways:

1. OpenGrok's GUI is web-based.
2. OpenGrok ignores some advances in code search research.

Let me elaborate on both points....

On point #1, OpenGrok is most often used as an indexer on central code repositories so that users can search their company's code base (e.g., http://src.opensolaris.org/source/index.jsp). Thus, the search UI and results are provided via a web interface. In our project (Sando) we want to bring the power of a tool like OpenGrok *into* the IDE. We want developers to be able to stay in their IDE to search the code they're working on and, as they click on results, we want to be able to open the file in the IDE. I believe that tight IDE integration is an important feature for any software development tool.

On point #2, I will certainly concede that OpenGrok is a great, multi-language source code search engine. However, without diving into the implementation itself, I have found a few issues that make it a poor fit for our use case. For instance, when searching the OpenGrok source code base using OpenGrok, it can be tricky to find the code responsible for summarizing hit results. I searched for "summarize" and was disappointed with the results (see http://src.opensolaris.org/source/search?q=summarize&project=opengrok&defs=&refs=&path=&hist=). This search probably failed because OpenGrok isn't using stemming when indexing and searching, which would have reduced "summarizer", "summarize", and "summarizing" to the same stem. I also searched for "indexer exception", expecting it to return the IndexerException class, which it did not (see http://src.opensolaris.org/source/search?q=+indexer+exception&project=opengrok&defs=&refs=&path=&hist=). This is likely because OpenGrok is not splitting identifiers (e.g., IndexerException -> {indexer, exception}) before indexing the file.
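
To make the two ideas above concrete, here is a minimal Python sketch of identifier splitting and stemming. It is only an illustration, not how OpenGrok or Sando actually work: the function names are hypothetical, and crude_stem is a deliberately rough stand-in for a real stemmer such as Porter's.

```python
import re


def split_identifier(identifier):
    """Split camelCase, PascalCase, or snake_case identifiers into lowercase terms."""
    parts = re.findall(
        r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", identifier
    )
    return [part.lower() for part in parts]


def crude_stem(term):
    """Strip one common suffix. A rough, hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "er", "es", "ed", "e", "s"):
        # Only strip when at least four characters of the word remain.
        if term.endswith(suffix) and len(term) - len(suffix) >= 4:
            return term[: -len(suffix)]
    return term


def index_terms(identifier):
    """Terms that would be stored in the index for one identifier."""
    return [crude_stem(term) for term in split_identifier(identifier)]


def query_terms(query):
    """Apply the same normalization to a free-text query."""
    return [crude_stem(term) for term in query.lower().split()]


if __name__ == "__main__":
    print(index_terms("IndexerException"))
    print(query_terms("indexer exception"))
    print({crude_stem(w) for w in ("summarize", "summarizer", "summarizing")})
```

Because the query and the identifier pass through the same normalization, a search for "indexer exception" yields the same terms as the class name IndexerException, and the three "summarize" variants collapse to a single term.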

All of that being said, I was very impressed when I was using OpenGrok and we may look into leveraging it as a back end to our project. Thanks for the pointer Sean!

01/06/2012 3:46am

I've never used code search before. I have tried searching for Vizual Sudio as you stated and it did indeed return an empty result (no correction).

Searching for Visual Studio returned mainly C code.

I don't think it was helpful at all...

01/06/2012 6:20am

@itoctopus I searched for "vizuale sudio" and Google automatically corrected it for me and searched for "visual studio" instead. Try that out...

Otherwise, I'm not sure I completely understand your comment, but the web search was just an analogy to show how bad code search is currently. Check out these slides for further clarification: http://www.slideshare.net/davidcshepherd/code-search-sucks-10676069

01/07/2012 10:42pm

Nice idea. I assume the plan is to create an extension/plugin for Visual Studio, i.e., for .NET users. Any thoughts on Java and its popular IDEs like IntelliJ, Eclipse, etc.?

01/08/2012 8:15am

Hi Andy,
Yes, the current project on codeplex is focused on creating a Visual Studio extension so that users can have a good code search tool in their IDE. It'd be *great* to attack the same problem for IntelliJ, Eclipse, etc. in the future. We'll keep that in mind as we design the back end for this extension, hopefully capturing the overlap in reusable modules. Thanks for the comment... comments like these help us keep focused on solving the broader problem (i.e., code search across all IDEs).

01/24/2012 4:09pm

Nice ambition and vision. Are you doing anything for conferences or workshops this year? Are you looking at symbols, structural and semantic relation based queries, ranking by historical recency in edits or navigation, execution frequency?

01/25/2012 9:41am

Thanks Chris. We're excited about this project and so far it's getting a lot of great contributions from several individuals I haven't even met in person before. Amazing!

So far we haven't looked into submitting to any conferences or workshops but I hope to in the future. If you have any ideas and/or want to collaborate don't hesitate to contact me!

As for the advanced software search measures that you mention, we have not investigated anything beyond the straightforward application of obviously superior results from the past few years of research. However, part of this project is to make sure it has really clear, easy-to-use APIs so that others can use it to investigate these ideas.

For instance, if you wanted to investigate improving search results by also including program structure you could leverage our API to do all the indexing and searching, adjust the search results according to your new scheme using program structure info, and then show the new results in our viewer.
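
As a rough illustration of that workflow, the re-ranking step might look something like the Python sketch below. Everything in it is hypothetical: Hit, KIND_BOOST, and rerank are made-up names for this example and are not part of any existing Sando API, and the boost weights are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    path: str
    element_kind: str  # hypothetical labels, e.g. "class", "method", "comment"
    score: float       # relevance score from the underlying text search


# Arbitrary example weights: favor hits on declarations over hits in comments.
KIND_BOOST = {"class": 1.5, "method": 1.2, "comment": 0.8}


def rerank(hits, kind_boost=KIND_BOOST):
    """Re-order text-search hits using program structure information."""
    return sorted(
        hits,
        key=lambda hit: hit.score * kind_boost.get(hit.element_kind, 1.0),
        reverse=True,
    )


if __name__ == "__main__":
    hits = [
        Hit("Summarizer.cs", "comment", 1.0),
        Hit("IndexerException.cs", "class", 0.9),
    ]
    for hit in rerank(hits):
        print(hit.path, hit.element_kind, hit.score)
```

The indexing and searching would still be done by the tool; only the ordering step in the middle is swapped out before the results reach the viewer.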


06/22/2012 7:23am

This blog is pretty interesting, will add a bookmark, thanks.

05/24/2013 3:28am

I am surprised to see how an intelligent mind is working! This "Gaagle" might come to use for me too! Code failure is common in searching and we have to create alternate methods or amendments! Thanks for the share! All the best for your efforts!

09/05/2013 5:37pm

Which template is this for your blog?
