AI-Powered Autocomplete for Programming Languages
Developing good software is tough. That’s because finding rock-star developers is extremely tough. These are the people who spend their free time coding because they love it so much. They’re good and they know it, they don’t suffer fools gladly, and any tools they use had better be built to meet their extremely high standards.
In past articles, we looked at how artificial intelligence is being used for software development in the areas of quality assurance, security, and hiring. Today, we’re going to talk about another cool application of AI for software development – autocomplete for programming languages.
By now we’re probably all familiar with how autocomplete functionality works. It all started with spell-check automatically correcting spelling errors and moved on to Smart Compose in Gmail deciding what the rest of your sentence ought to be (it works surprisingly well). Perhaps the most entertaining implementation of this functionality is the autocomplete for Google’s own search engine.
Google Autocomplete
When you type words into the Google search engine, it suggests what you might be looking for. This is based on what other people are commonly searching for. For example, you could say “why are the English…” and here’s what you get:
It’s an interesting look into people’s perception of others. You used to be able to do that with any culture or race until Google removed that functionality because a minuscule fraction of the population found it offensive and spoiled all the fun. If Google is smart enough to pick up on English people’s obsession with tea, perhaps this autocomplete functionality could be used for other things – like coding. That’s what Google was thinking when they rolled out their own machine learning autocomplete tool for Dart, a programming language they developed. Today, we’ll look at some startups building AI-powered tools for almost all widely used programming languages.
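The “suggest what other people commonly type” idea can be sketched in a few lines of Python. This is a toy illustration only, not how Google’s search autocomplete actually works: the query log is made up, and the function names build_index and suggest are our own.

```python
from collections import Counter


def build_index(queries):
    """Count how often each full query appears in the (hypothetical) log."""
    return Counter(q.strip().lower() for q in queries)


def suggest(index, prefix, limit=3):
    """Return the most frequent logged queries that start with the prefix."""
    prefix = prefix.strip().lower()
    matches = [(q, n) for q, n in index.items() if q.startswith(prefix)]
    # Sort by descending count, then alphabetically so ties are stable.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in matches[:limit]]


# Made-up query log, purely for illustration.
QUERY_LOG = [
    "why is the sky blue",
    "why is the sky blue",
    "why is the sky blue",
    "why is the ocean salty",
    "why is the ocean salty",
    "why is the moon red",
    "how to boil an egg",
]

index = build_index(QUERY_LOG)
print(suggest(index, "why is the"))
# → ['why is the sky blue', 'why is the ocean salty', 'why is the moon red']
```

The more often a query shows up in the log, the higher it ranks, which is why popular searches float to the top of the suggestion list.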
Java and JavaScript Autocomplete
Founded in 2013, Israeli startup Codota has taken in $14.6 million in funding so far from investors that include Khosla Ventures. (Twelve million of that came in the form of a Series A that just closed last week.) The company builds tools that make developers more productive by automating mundane and repetitive parts of programming. “Codota understands the world’s code and provides you with the right suggestion at the right time,” says the company, and they’re expanding organically and through acquisition. In December of last year, Codota acquired a firm called TabNine which increased the breadth of programming languages they can support. Says the company:
Codota tries to deeply understand the semantic structure of the code, whereas TabNine optimizes for the widest language compatibility
Credit: Codota
The press release announcing the event implied that TabNine had a backlog of support issues, and other comments suggest that the product wasn’t as usable as it could be. Now that the two companies have joined together, Codota expects to add other languages “in the next few months” on top of its current support for Java and Kotlin. JavaScript support is currently in beta for some users and will be released to all users soon. For the majority of readers who don’t speak nerd, here’s a brief glossary of the terms we just used:
- Java – A general-purpose programming language developed by Sun Microsystems in 1995. Standalone language that requires the Java Virtual Machine (JVM) to run (you’ll often be prompted to download the latest JVM if you use lots of applications).
- JavaScript – A programming language that makes webpages interactive. “Java is to JavaScript as ham is to hamster,” as the old saying goes. Usually runs in a web browser as part of an HTML page, though it can also run outside the browser (on Node.js, for example).
- jQuery – jQuery simplifies the execution of common JavaScript tasks, making it easier to use JavaScript on websites. Used by more than 70% of the most popular websites.
- Kotlin – A new programming language initially designed for the JVM and Android.
In its present form, Codota’s machine learning algorithms suggest code completions and related content by looking at millions of Java programs along with the current context present in whatever integrated development environment (IDE) you are developing in. (People who write code for a living – programmers – use various IDEs to create software.) While many development environments already have autocomplete functionality, Codota enhances it by combining techniques from program analysis, natural language processing, and machine learning to learn from code that’s already been written. Basic integration is merely a matter of adding a single step to your build script.
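To give a feel for what “learning from code that’s already been written” can mean, here is a minimal Python sketch that counts which token tends to follow the previous two tokens in a small corpus and ranks completions by frequency. It is emphatically not Codota’s method, which the company describes only in general terms; the corpus, the tokenizer, and the names train and complete are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

# Crude tokenizer (an assumption): identifiers, or any single non-space symbol.
TOKEN = re.compile(r"[A-Za-z_]\w*|\S")


def tokenize(line):
    return TOKEN.findall(line)


def train(snippets):
    """Count which token follows each pair of consecutive tokens."""
    model = defaultdict(Counter)
    for line in snippets:
        tokens = tokenize(line)
        for i in range(len(tokens) - 2):
            model[(tokens[i], tokens[i + 1])][tokens[i + 2]] += 1
    return model


def complete(model, context, limit=2):
    """Suggest the tokens seen most often after the last two context tokens."""
    tokens = tokenize(context)
    if len(tokens) < 2:
        return []
    counts = model.get(tuple(tokens[-2:]))
    if counts is None:
        return []
    # Most frequent first; alphabetical order breaks ties.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [token for token, _ in ranked[:limit]]


# Tiny made-up "corpus" standing in for millions of real programs.
CORPUS = [
    "list.add(item);",
    "list.add(value);",
    "list.size();",
    "map.put(key, value);",
]

model = train(CORPUS)
print(complete(model, "list."))  # → ['add', 'size']
print(complete(model, "map."))   # → ['put']
```

A real tool would look at far more context than the last two tokens (types, imports, the surrounding method), which is where the program analysis part comes in.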
Any company that’s developing proprietary code would have concerns about security. That’s why Codota only extracts an anonymized summary of the current IDE scope. It does not access other files in your codebase and does not access other resources on your machine. (A codebase is a collection of source code used to build a particular software system.) The anonymized summary sent to Codota is only used for prediction and suggesting code to the user, and is not stored on their servers.
Codota is free and will always be free when serving results based on publicly available code. If you want Codota to learn from your own code, you’ll need to pony up some money. They’re not the only company working on autocomplete for programming languages.
Autocomplete for Python
You know how to tell if someone’s an Apple user? They’ll tell you. That old joke is a reminder of the constant tension between Apple users and everyone else. This same sort of elitist behavior manifests itself in programming languages, many of which have their own cultures. The programming language Python was named after famous comedy troupe Monty Python, and fashions itself around values like minimalism. The “Zen of Python” is used to communicate the community’s values, things like “Simple is better than complex,” and “Complex is better than complicated.” It’s the perfect sort of environment for some machine learning algorithms to step in.
Founded in 2014, San Francisco startup Kite has raised $21 million in funding so far from all kinds of notable people to serve the Python community with a tool that helps Python developers write better code faster. In order to feed their hungry machine learning algorithms, Kite fed them all the code found on GitHub. (Used by more than 50 million developers, GitHub is a popular free tool for version control and the largest open source community in the world. The CEO of GitHub happens to be an investor in Kite.) Now, Kite’s tool can start to predict what code ought to be written next based on all the examples it’s been given, and continues to get from GitHub, which is always being populated with fresh examples.
In addition to providing line-of-code completions for Python, the tool also provides one-click documentation which helps developers quickly look things up or see examples while coding. In some cases, developers using Kite’s “Copilot” tool use 47% fewer keystrokes when coding. The tool runs locally on your machine, eliminating many security concerns. Similar to Codota, Kite uses a freemium business model. It appears the machine learning features may soon be available only to subscribers.
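The article doesn’t say how the keystroke figure was measured, so treat the following as one plausible way to read it: the relative drop in keys pressed when completions are accepted. The function name keystroke_savings and the sample counts below are hypothetical.

```python
def keystroke_savings(keys_without, keys_with):
    """Percentage reduction in keystrokes when using autocomplete.

    Assumes savings are measured relative to typing everything by hand.
    """
    if keys_without <= 0:
        raise ValueError("keys_without must be positive")
    return 100.0 * (keys_without - keys_with) / keys_without


# Hypothetical session: 200 keys by hand versus 106 keys with completions.
print(keystroke_savings(200, 106))  # → 47.0
```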
The Kite plugin is available for all popular editing tools like Atom, PyCharm, Sublime, VS Code, Vim, Spyder, and IntelliJ. The chart below, taken from a TechCrunch article, shows just how effective these improvements can be compared to traditional autocomplete functionality.
Update 05/12/2020: Today Kite launched Kite Pro, their first paid product for professional developers. They’re also launching JavaScript completions and an all-new engine for Python completions, both powered by deep learning. Users say they are, on average, 18% more productive using Kite and over 250,000 people are coding with Kite every month.
Conclusion
Development environments like Microsoft Visual Studio already offer some autocomplete functionality out of the box. (It’s called IntelliSense.) The startups we’ve talked about are trying to improve the helpfulness of autocomplete tools by analyzing loads of examples. When coding, programmers often turn to Google for answers since whatever problem you have has probably been encountered already by someone else.
Machine learning algorithms are now clever enough to start predicting what problems you’ll run into and providing a solution before you even know you need it. If you’re a CTO looking to create some efficiencies before bonus time rolls around, just mandate that your teams use these autocomplete tools going forward and take credit for all the time savings you’ll enjoy. Soon, intelligent autocomplete for all programming languages will just become the norm, and investors in startups like these will be cashing some big checks.