Eclipse Code Recommenders Proposes Code Based On Bayesian Networks
The idea of the code recommenders is to adjust and filter the set of proposals shown when the code completion key sequence is triggered. By default, Eclipse shows the list of public methods (or fields) in alphabetical order. However, when coding against unfamiliar APIs, or ones with many overloaded methods (Quick! Which of the 6 Date constructors is the right[1] one to use?), it is not always clear which one should be called.
The code recommenders tool has a database of prior code samples, along with the frequency with which each method is called, and uses that to prioritise which method or constructor to propose. If most Date constructor calls use either the zero-argument form or the single long argument form, then these two choices will be presented first, with the others filtered out. In addition, the proposal can use context-sensitive information, so when completing a method call such as timezoneOffset = date.get it will offer the getTimezoneOffset() method as the first selection.
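The frequency-based filtering described above can be illustrated with a minimal sketch. This is not the Code Recommenders implementation: the class FrequencyRanker, its methods, and the call counts are all hypothetical, chosen only to show how a prefix filter combined with observed call frequencies could push the two common Date constructors to the top and drop the rarely used ones.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: ranks completion proposals by how often each one
// was observed in example code. Not the actual Code Recommenders engine.
class FrequencyRanker {
    // Observed call counts per proposal.
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    // Records that a proposal was seen timesSeen times in example code.
    void observe(String proposal, int timesSeen) {
        counts.merge(proposal, timesSeen, Integer::sum);
    }

    // Returns the proposals starting with the typed prefix, most frequent
    // first, dropping any whose share of the matching calls is below minShare.
    List<String> propose(String prefix, double minShare) {
        List<Map.Entry<String, Integer>> matches = new ArrayList<>();
        int total = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                matches.add(e);
                total += e.getValue();
            }
        }
        matches.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : matches) {
            if (total > 0 && (double) e.getValue() / total >= minShare) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        FrequencyRanker ranker = new FrequencyRanker();
        // Made-up counts, for illustration only.
        ranker.observe("Date(int, int, int)", 3);
        ranker.observe("Date()", 60);
        ranker.observe("Date(long)", 35);
        ranker.observe("Date(String)", 2);
        // Keep only proposals that account for at least 10% of observed calls.
        System.out.println(ranker.propose("Date(", 0.10)); // [Date(), Date(long)]
    }
}
```

With these invented counts, only the zero-argument and long constructors survive the 10% cut-off. A real engine would presumably derive such counts from mined code rather than hard-coded numbers.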
The recommenders project also provides a list of context-sensitive snippets of code. These can be constructed manually, or inferred from existing code samples. As with other Java templates (such as syserr), these can be used to quickly implement code.
InfoQ caught up with Marcel Bruch, creator of the recommenders project, and started by asking what prompted the creation of the project:
Marcel Bruch: Code Recommenders started out as a research project at Darmstadt University of Technology roughly 3 years ago, but its roots go back to 2006. Around that time, I started developing my first Eclipse plug-ins and reading the source code of other plug-ins to learn how to use the Eclipse APIs. At the same time, mining source code was quite popular in the research community, and as a PhD candidate, applying machine learning to source code to help developers (me) learn how to use an API was the obvious thing to do.
In 2009 I published the first version of Recommenders Intelligent Code Completion on Eclipse Plugin Central and presented it at a local Eclipse DemoCamp. That version had quite a few limitations that made it barely usable for developers. The presentation was average, but from there on one event followed the next, and it finally became an Eclipse project in January 2011.
What finally made me start Code Recommenders at Eclipse was the idea of creating something that has the power to change the way we develop software – and the encouraging feedback from a few Eclipse guys at the right time.
InfoQ: What kind of recommendations does the plug-in make?
Marcel Bruch: As of 1.0, Code Recommenders recommends which methods developers likely want to call on an object and which methods to override when extending a framework base class, and it shows developers how to obtain an instance of a given type from the current location in code.
Essentially, it helps you learn how to use new APIs and prevents you from making unnecessary mistakes. And if you are already an expert in that API, it just makes you much faster, as it already knows what you want to type before you have actually thought about it.
And all that information is tightly integrated into Eclipse JDT’s code completion as well as into an Extended Javadoc view which nicely summarizes all recommendations in one place.
InfoQ: Where does it get its data to make its recommendations from?
Marcel Bruch: Code Recommenders learns its recommendations by analyzing the source code of existing applications that have successfully used a given API. For Recommenders 1.0, the data is taken from the Eclipse Juno release train repository, a code repository consisting of 72 projects and more than 50 million lines of code.
Code Recommenders currently supports many Eclipse APIs, and partially supports the Java Standard Library, namely the main packages under java.* and some packages under javax.*. As the recommendation models are generated from the Eclipse Juno Release Train code base only, packages like javax.swing are not yet supported as no data was available at generation time.
In future releases, we are looking for larger data sources to broaden the scope of Recommenders to support frequently used libraries like Apache Commons and the like. Another thing we would like to support is the Android Platform. It’s pretty close to Java and very popular even for semi-professional software developers which makes it a perfect candidate for Code Recommenders.
But supporting the fairly large Java and Android ecosystem requires quite a lot of data. We are in discussion with some of the largest software repository maintainers to get access to their huge sets of example applications for Android and Java. No decision has been made yet, but I hope they find Recommenders convincing enough to support it in the near future.
InfoQ: Can it be made to work with custom APIs or ones that it currently doesn’t know about?
Marcel Bruch: Code Recommenders can learn API usage patterns for any Java API – also for custom APIs. The Eclipse Code Recommenders team is currently working on a Recommenders Developer Kit that will enable developers to build recommenders for their own APIs from their Eclipse workspace.
However, we have learned that building excellent recommenders is not trivial. It quickly gets tricky when dealing with Big (Software Engineering) Data combined with large-scale machine learning and static code analyses. We thus decided to offer commercial services in the near future to assist companies to make Code Recommenders work for their APIs and to support their developers or customers using their APIs.
InfoQ: Does the list of proposals come from an analysis of the frequency of the client use base?
Marcel Bruch: Yes – but just looking at how frequently each method is used in client code wouldn’t help you much. For illustration, consider a code completion engine that would always propose to call setText() on a text widget. Clearly, this may make sense when working in a method like, say, Dialog.createContents() but doesn’t make sense in Dialog.close(), right? Here only a call to text.getText() makes sense, as the developer probably wants to read the user input from the text widget.
A smart completion engine must consider this, and so Recommenders does. For intelligent calls completion, for instance, we track where an object is used, how the object was defined, and which methods have been invoked on it so far to make the most likely recommendations.
This model is currently being extended into a more advanced recommender that also considers information such as which other objects are visible in the current completion context and how they have been used so far. Under the covers there are Bayesian Networks at work that make all these computations possible in just a few milliseconds.
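To make the setText()/getText() example concrete, here is a minimal sketch that uses naive Bayes, the simplest form of Bayesian network, to score candidate calls given the enclosing method. It is a deliberate simplification and not the model Code Recommenders ships: the class ContextRecommender, the feature strings, and the training counts are all hypothetical, and only features present in the context are scored.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: naive Bayes over context features. It illustrates
// context-sensitive ranking; it is not the Code Recommenders model.
class ContextRecommender {
    private final Map<String, Integer> methodCounts = new HashMap<>();
    // featureCounts.get(method).get(feature) = examples containing both.
    private final Map<String, Map<String, Integer>> featureCounts = new HashMap<>();
    private int examples = 0;

    // Records one observed call together with the context it occurred in.
    void train(String calledMethod, String... contextFeatures) {
        examples++;
        methodCounts.merge(calledMethod, 1, Integer::sum);
        Map<String, Integer> perMethod =
                featureCounts.computeIfAbsent(calledMethod, k -> new HashMap<>());
        for (String feature : contextFeatures) {
            perMethod.merge(feature, 1, Integer::sum);
        }
    }

    // Unnormalised P(method) * product of P(feature | method),
    // using add-one smoothing so unseen features do not zero the score.
    double score(String method, String... contextFeatures) {
        int n = methodCounts.getOrDefault(method, 0);
        double s = (double) n / examples;
        Map<String, Integer> perMethod =
                featureCounts.getOrDefault(method, Collections.<String, Integer>emptyMap());
        for (String feature : contextFeatures) {
            s *= (perMethod.getOrDefault(feature, 0) + 1.0) / (n + 2.0);
        }
        return s;
    }

    // Returns the highest-scoring method for the given context.
    String recommend(String... contextFeatures) {
        String best = null;
        double bestScore = -1.0;
        for (String method : methodCounts.keySet()) {
            double s = score(method, contextFeatures);
            if (s > bestScore) {
                bestScore = s;
                best = method;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        ContextRecommender rec = new ContextRecommender();
        // Made-up training data: setText() is more frequent overall.
        for (int i = 0; i < 4; i++) rec.train("setText", "in:Dialog.createContents");
        rec.train("setText", "in:Dialog.close");
        for (int i = 0; i < 3; i++) rec.train("getText", "in:Dialog.close");

        System.out.println(rec.recommend("in:Dialog.createContents")); // setText
        System.out.println(rec.recommend("in:Dialog.close"));          // getText
    }
}
```

With this invented data, a pure frequency count would always suggest setText() (5 observations against 3), whereas conditioning on the enclosing method flips the recommendation to getText() inside Dialog.close().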
InfoQ: Is it currently Java only? Are there plans for other languages?
Marcel Bruch: But to tell the truth: we still have a long way to go to support other languages. We could certainly spend all our time improving Code Recommenders for Java, and supporting other languages would require some external investment and more team resources to make it happen soon.
InfoQ: What’s in the pipeline for code recommenders?
Marcel Bruch: First, we are continuing the development of intelligent code completion by adding a new method arguments guesser. Developers familiar with SWT know what trouble SWT’s style bits cause when, for instance, registering a listener or creating a text widget. But which constant, or even which objects, to pass to a method call can be learned in almost the same way as we currently learn which methods to call, as Cheng Zhang from Shanghai University has recently shown. Cheng is currently working towards integrating his tool into Eclipse Code Recommenders.
Another area where we are making exciting progress is mining code snippets from example code. All developers know the template completion of their IDE, which offers snippets for iterating over an array, doing type casts, and so on. But much more often we need code snippets that show how to use an object and how it is used in combination with other objects. Eclipse SWT Templates provides some such examples that show how we typically instantiate and configure SWT widgets. Creating these templates, however, is costly, and thus only a few of them exist. We at Code Recommenders are working on tools that can mine frequent code patterns from example code and integrate these mined patterns back into your IDE. The Snipmatch completion engine, developed by Doug Wightman from Queen’s University and Chen Cheng via the Google Summer of Code programme, will integrate these mined snippets in a new kind of template completion.
The third area is the development of a stacktrace search engine. Maven users are already familiar with the idea: whenever a Maven build fails with an exception, the last line printed suggests that the developer visit a wiki page that potentially describes the cause of the exception. Unfortunately, these pages are mostly empty. The idea, however, is great! Stacktraces contain much more information than you might expect, and by leveraging all the implicit information available at runtime, resources on the web that discuss and solve your issue can be found quite easily if you know how. Johannes Lerch from Darmstadt University of Technology is working on this project, and I’m curious to see how much time we’ll save in the future by reusing solutions to problems others have already solved.
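As a rough illustration of the kind of information a stacktrace carries, the following sketch turns an exception into a search query made of the exception type and its topmost frames. It says nothing about how the search engine mentioned above actually works: the class StacktraceQuery and its build method are hypothetical, and only the standard Throwable and StackTraceElement accessors are used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: derives search terms from a stacktrace.
// Not the approach used by the project described in the interview.
class StacktraceQuery {
    // Joins the exception class name and the first maxFrames
    // "declaringClass.method" entries into a space-separated query.
    static String build(Throwable t, int maxFrames) {
        List<String> terms = new ArrayList<>();
        terms.add(t.getClass().getName());
        StackTraceElement[] frames = t.getStackTrace();
        for (int i = 0; i < frames.length && i < maxFrames; i++) {
            terms.add(frames[i].getClassName() + "." + frames[i].getMethodName());
        }
        return String.join(" ", terms);
    }

    public static void main(String[] args) {
        try {
            Object missing = null;
            missing.toString(); // deliberately triggers a NullPointerException
        } catch (NullPointerException e) {
            // Prints the exception type followed by the frames closest to the failure.
            System.out.println(build(e, 3));
        }
    }
}
```

The frames nearest the failure are usually the most specific, which is why this sketch keeps only the first few. A real engine would likely also weigh the exception message and library versions.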
Lastly, we are working on a code search engine for example snippets that finds interesting code snippets related to a developer’s task at hand.
These are the candidates we have in mind for Eclipse 4.3 (Kepler) coming in 2013. But behind the scenes we are working on something larger – the vision of IDE 2.0.
What do you think of the Code Recommenders project?
1 - None of them, use JSR 310 instead. The Date class is one of the worst APIs ever.