Musings of a Tech Transfer Enthusiast
Can you remember the last time you tried to search your code base for a relevant method or code snippet? I can, and it was a disaster. I used Visual Studio’s “Find in Files”, which returned hundreds of results. After reviewing about twenty of them, I gave up and started surfing the Solution Explorer, opening files one by one. After about 20 minutes of reading source files, I began to wonder if there was a better way...
Unfortunately, this pattern of failure is all too common: recent studies show that close to 90% of code searches during software maintenance tasks fail. While fruitless searches are annoying in themselves, it’s the time spent reading and exploring irrelevant code that really compounds the cost. If you’re tired of failed searches and costly sidetracks, read on to discover Sando, a new Visual Studio extension that leverages recent research advances to significantly improve code search.
Why Use Sando?
Sando, our new Visual Studio extension for searching source code, is built upon solid research findings and is, in our opinion, much better than the regular expression technology it replaces. We could argue that leveraging information retrieval instead of regex technology naturally leads to faster search execution. We could claim that multi-term search is difficult and annoying with regular expressions. We might even point out that regex-based searches return unranked results instead of highlighting the most relevant ones. However, having been burnt by savvy marketing in the past, we prefer to simply show a few concrete results of running Sando against Visual Studio's "Find in Files" and let you decide for yourself (note: these example searches were run on Sando's own source code).
One of the trickiest types of searches involves searching for a term that is common in the given domain. Imagine searching for “text” in a text editor project or “paint” in a drawing project; you’re likely to be overwhelmed with hundreds of results. This is exactly the case when searching Sando’s source code for the term “search” using "Find in Files" (shown above, left). With the same search term, however, Sando users can find relevant source code elements faster. Sando leverages information retrieval to find elements where the term “search” is more central. Thus, as you can see above (right), methods like “Search” are ranked highly in the search results.
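To make the idea of a term being "more central" concrete, here is a minimal sketch in Python. It is not Sando's actual ranking; it assumes a simple scheme in which an element scores higher when the query term makes up a larger share of its name and body, with the name weighted more heavily. The function names, the weight of 3.0, and the sample elements are all hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase terms, breaking camelCase identifiers apart."""
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", text)
    return [w.lower() for w in words]

def score(query, name, body, name_weight=3.0):
    """Score an element by the share of its name and body taken up by query terms.

    name_weight is an illustrative assumption, not a value taken from Sando.
    """
    name_terms = Counter(tokenize(name))
    body_terms = Counter(tokenize(body))
    name_total = max(1, sum(name_terms.values()))
    body_total = max(1, sum(body_terms.values()))
    total = 0.0
    for term in tokenize(query):
        total += name_weight * name_terms[term] / name_total
        total += body_terms[term] / body_total
    return total

def rank(query, elements):
    """Return (name, body) pairs ordered from most to least relevant."""
    return sorted(elements, key=lambda e: score(query, e[0], e[1]), reverse=True)

# Hypothetical code elements: (method name, text of the method body)
elements = [
    ("LogMessage", "write message to log file"),
    ("UpdateSearchBoxColor", "set the color of the search box border"),
    ("Search", "run the search over the index and return search results"),
]

ranked_names = [name for name, _ in rank("search", elements)]
print(ranked_names)
```

A plain text match would treat the last two elements as equal hits. Under this scoring, the method whose whole name is the query term comes first, and the method that merely mentions it comes second.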
While unranked results can be annoying to sift through, a more pressing problem may be missing results. Imagine using "Find in Files" with the search string “file open”. The result, as shown above on the left, is an empty set, yet I know that relevant code exists in this solution. Fortunately, since Sando is not sensitive to word ordering, it returns the method OpenFile as the first result (shown above, right). Manually re-ordering search terms becomes a thing of the past when using Sando.
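The difference between the two behaviors can be sketched in a few lines of Python. This is an illustration only, under the assumption that identifiers are split into individual words and compared as an unordered set; it does not reproduce how Sando or "Find in Files" is implemented, and the function names are hypothetical.

```python
import re

def terms(text):
    """Return the set of lowercase words in text, splitting camelCase identifiers."""
    return {w.lower() for w in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", text)}

def literal_match(query, identifier):
    """Literal substring search: the query must appear exactly as typed."""
    return query.lower() in identifier.lower()

def unordered_match(query, identifier):
    """Order-insensitive search: every query word must occur somewhere in the identifier."""
    return terms(query) <= terms(identifier)

print(literal_match("file open", "OpenFile"))    # False: "file open" never appears verbatim
print(unordered_match("file open", "OpenFile"))  # True: both words occur in the identifier
```

Because the query and the identifier are both reduced to sets of words, "file open" and "open file" produce the same matches.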
While creating this post I noticed another difference between these two tools: the wait times. While the times are not drastically different, with "Find in Files" taking 5 to 10 seconds and Sando returning almost instantly, the lag imposed by "Find in Files" discourages follow-up queries and often leads to abandoned searches. As you might imagine, abandoned searches often lead to sub-optimal behavior (e.g., Solution Explorer surfing), and thus the instant results offered by Sando can be a significant advantage.
Try Sando's Beta Release!
Today we have only scratched the surface of the ways in which Sando’s technology outperforms regular-expression-based search tools. However, we don’t expect you to take our word for it. Please try Sando's Beta Release today and don't hesitate to let us know what we should improve!
David Shepherd leverages software engineering research to create useful additions to the IDE.