Stop Searching Code Like a Chump 05/12/2012
Can you remember the last time you tried to search your code base for a relevant method or code snippet? I can, and it was a disaster. I used Visual Studio’s “Find in Files”, which returned hundreds of results. After reviewing about twenty of them I gave up and started surfing the Solution Explorer, opening files one by one. After about 20 minutes of reading source files I began to wonder if there was a better way... Unfortunately, this pattern of failure is all too common: recent studies show that close to 90% of code searches during software maintenance tasks fail. While fruitless searches themselves can be annoying, it’s the time spent reading and exploring irrelevant code that really compounds the cost. If you’re tired of failed searches and costly sidetracks, read on to discover Sando, a new Visual Studio extension that leverages recent research advances to significantly improve code search.
Why Use Sando?
Sando, our new Visual Studio extension for searching source code, is built upon solid research findings and is, in our opinion, much better than the regular expression technology it replaces. We could argue that leveraging information retrieval instead of regex technology naturally leads to faster search execution. We could claim that multi-term search is difficult and annoying with regular expressions. We might even point out that regex-based searches return unranked results instead of highlighting the most relevant ones. However, having been burnt by savvy marketing in the past, we prefer to simply show a few concrete results of running Sando against Visual Studio's "Find in Files" and let you decide for yourself (note: these example searches were run on Sando's own source code).
One of the trickiest types of searches involves a term that is common in the given domain. Imagine searching for “text” in a text editor project or “paint” in a drawing project; you’re likely to be overwhelmed with hundreds of results. This is exactly what happens when searching Sando’s source code for the term “search” using "Find in Files" (shown above, left). With Sando, the same search term leads users to relevant source code elements faster. Sando leverages information retrieval to find elements where the term “search” is more central. Thus, as you can see above (right), methods like “Search” are ranked highly in the search results.
While unranked results can be annoying to sift through, a more pressing problem may be missing results. Imagine using "Find in Files" with the search string “file open”. The result, as shown above on the left, is an empty set, yet I know that relevant code exists in this solution. Fortunately, since Sando is not sensitive to word ordering, it returns the method OpenFile as the first result, shown above right. Manually re-ordering search terms becomes a thing of the past when using Sando.
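To make the last two examples concrete, here is a minimal Python sketch of order-insensitive, ranked matching over identifier names. It is not Sando's implementation, which this post describes only as information-retrieval based; the function names (split_identifier, score, rank) and the scoring rule are assumptions chosen purely for illustration.

```python
import re


def split_identifier(name):
    """Split a camelCase, PascalCase, or snake_case name into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [part.lower() for part in parts]


def score(query, name):
    """Hypothetical relevance score in [0, 1]; 0 means "not a match".

    Every query term must appear in the name, in any order. The score is the
    fraction of the name's words that are query terms, so a name built mostly
    from the query terms outranks a long name that merely mentions them.
    """
    query_terms = set(query.lower().split())
    name_terms = split_identifier(name)
    if not name_terms or not query_terms.issubset(name_terms):
        return 0.0
    matched = sum(1 for term in name_terms if term in query_terms)
    return matched / len(name_terms)


def rank(query, names):
    """Return the matching names, most relevant first."""
    scored = [(score(query, name), name) for name in names]
    hits = [(s, name) for s, name in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in hits]
```

With this toy scoring, the query "file open" matches a method named OpenFile even though the words are reversed, and the query "search" places a method named Search ahead of a longer name such as SearchResultViewModel (a made-up name). A literal substring search for "file open" would find neither.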
While creating this post I noticed another difference between these two tools: the wait time. The times are not drastically different, with "Find in Files" taking 5 to 10 seconds and Sando returning almost instantly, but the lag imposed by "Find in Files" discourages follow-up queries and often leads to abandoned searches. As you might imagine, abandoned searches often lead to sub-optimal behavior (e.g., Solution Explorer surfing), and thus the instant results offered by Sando can be a significant advantage.
Try Sando's Beta Release!
Today we have only scratched the surface of the ways in which Sando’s technology outperforms regular-expression-based search tools. However, we don’t expect you to take our word for it. Please try Sando's Beta Release today and don't hesitate to let us know what we should improve!
Code Search Sucks. Join Us in Fixing It. 01/04/2012
The IDE is a software developer's toolbench. For the most part it's an ordered chaos, with all of a developer's favorite windows and extensions at the ready. Unfortunately, the IDE's complexity helps to hide some of its glaring shortcomings: it's easy to focus on the shiny wrenches across the front while ignoring your grandfather's 1970s disc sander that's about to melt down from misuse. Like Grandpa's sander, code search* is a tool that's in dire need of an overhaul.
* In this article, code search refers to the act of searching over the code base that one is currently working on, not searching over repositories of code.
Code Search Sucks.... Really?
Developers, having grown up on awful search tools, have become numb to the problem. Many may not even believe it is a problem! To illustrate the sorry state of code search, let's explore an analogy based on web search...
I've got a new web search service that I'd like you to try, called Gaagle. It incorporates all of the key features of modern code search. Let's try searching for "Visual Studio" using Gaagle:
Oops! I misspelled "Visual Studio", but Gaagle doesn't fix my mistake (or even point it out). Gaagle simply and obediently reports that there are, in fact, no matches for the search terms "Vizual Sudio". Thanks, Gaagle!
Having corrected my search terms to "Visual Studio", I search using Gaagle again. This time I get lots of hits! However, the hits are displayed as a list of paths, leading to an exciting game of "what's behind that link!" After clicking on each link in the list I have a pretty good idea (and a sore mouse finger).
But wait, there's been an upgrade to Gaagle! Instead of returning results in an ordered list, Gaagle now returns unordered results. That means that each and every web page in the world that contains the terms "Visual Studio" is returned to me for my search. No worries! I start wading through the thousands of (mainly irrelevant) hits until I find what I'm looking for.
Not ready to switch to Gaagle? Well, surprise... you're already using it! If you're using Visual Studio 2010 or earlier, your code search (i.e., Edit > Find and Replace > Find in Files) operates almost exactly like Gaagle.
Join Us in Fixing Code Search!
Visual Studio, so powerful in many other ways, has neglected code search. This January we're starting a project, called Sando, to modernize code search. If you long for a modern code search experience, please join us at http://sando.codeplex.com/. We're looking for great developers (with C# or Java experience), and also for any suggestions or requests that can help shape our search tool.