In the past three weeks we have collected anonymous usage data from developers (with their knowledge and permission!!) using our search tool in Visual Studio. Learning from the failed Usage Data Collection effort in Eclipse, which perhaps had too broad a scope, our data collection is tightly focused on improving search, and nothing else. Thus, after reviewing even three weeks of data we have several useful conclusions. Primarily, that developers who utilize recommendations (shown above) when searching code are more satisfied with the results and execute 20% less failed queries than those not using recommendations.
As you can see in pane (b) above we collected data for 25 days. We collected 204 log files from 44 unique users (excluding all Sando developers). Interestingly, as a comparison, Eclipse's Usage Data Collection collected only 453 new users' data over a 14 day period in 2008, even though they were potentially collecting from all Eclipse users.
During this 25 day period we collected anonymous metrics about 363 user queries executed on local projects via Sando. As seen in pane (a) 90 utilized the recommendations provided by Sando. Of those 90, 41 were utilized pro-actively (e.g., by selecting an entry from a dropdown) and 49 were triggered automatically (e.g., when the user searched for a word that did not exist in the code base). Of the 41 active queries 25 were "lookup"-style searches, such as using the autocomplete to help remember the name of a method, and 16 were exploratory-style searches, such as searching for phrase like "open file".
Using the data from these 363 queries we tried to gauge user satisfaction level using a few different metrics. First, we measured the query failure rate. We considered a query to fail if a user did not review any of the results, as indicated via UI events. As you can see in pane (e) queries that utilized recommendations had a 20% lower failure rate, which certainly leads to higher user satisfaction.
We also used both the short-click and long-click satisfaction measures, as are often used in web search. The intuition is that if a user clicks and views a result and then immediately returns to the result list that result was not helpful (a short-click), where as if a user clicks and views a result for a long period before (if ever) returning that result was helpful. As you can see in pane (d) queries with and without recommendations provided about the same rate of long-clicks, which is a positive metric, whereas queries with recommendations showed a significantly lower short-click rate, which is good as short-clicks are a negative satisfaction indicator.
As I mentioned at the beginning of this post, we only collect information aimed at helping us improve code search. So what's useful about the information I've presented today? Well, in the short term, Sando users should start using recommendations, immediately. They will be more satisfied with the results and fail 20% less. While we intend to do our part to improve the recommendation UI, possibly the most useful thing about this data is that it gives Sando users field evidence to support feature adoption. And when you think about it...how often have your other software tools given you evidence that a new feature actually makes you better? ;)
David Shepherd leverages software engineering research to create useful additions to the IDE.