Automated Machine Learning from DataRobot
The notion of nanoscale self-replicating machines getting out of control and covering the earth in “gray goo” is an idea proposed by Eric Drexler in his 1986 book Engines of Creation. Similar to how cancer cells begin multiplying in an out-of-control fashion and wreaking havoc in a host organism, machines might be capable of doing something very similar – but with planet Earth as the host organism. One mistake in the master control program and the nano-sized machines suddenly start replicating out of control, using all the building blocks they can get their grubby little nano-paws on, until there’s this giant puddle of gray goo that eventually coats the entire planet and we all die. Then, that Italian guy can stop asking “where are they” because we’ll have found at least one proven cause for “the Great Filter.”
While we may not have machines that are capable of self-replicating at that scale, we do have things like Robotic Process Automation (RPA) where the machines are learning how to behave like your average Mumbai back-office worker. We’re now seeing code that codes itself. Machines that learn in an unsupervised fashion. And something called “automated machine learning.”
You know the old joke that people used to crack about how machine learning is like high school sex? Everyone talks about it, everyone says they do it because they think everyone else is doing it, but the reality is nobody’s doing it. Fast forward to today and everyone’s doing it and far too many people are talking about it. Anyone who uses basic Natural Language Processing (NLP) algorithms to screen scrape a few websites now claims to be using AI as a competitive advantage. AI has been commoditized, and anyone who doesn’t use AI is already being left behind. If you’re a CTO who thinks your company might be in that bucket, you need a quick low-risk fix so you can “fail fast” before bonus time rolls around. What you need to do is look for companies that are making artificial intelligence easy to use. We wrote about three such companies, one being DataRobot.
The Growth of DataRobot
Founded in 2012, Bahstun-based startup DataRobot has taken in a total of $226.4 million in funding that came from a slew of big names in the venture capital world along with corporate investors – chipmaker Intel and the third-largest life insurance company in the United States, New York Life Insurance. They’ve used all that money to build a platform that ingests your raw data, cleanses it, and subjects it to a multitude of algorithms before choosing the best one using – you guessed it – more AI algorithms. Now, here’s where we feel compelled to address the elephant in the room.
Everyone’s talking about a Series E funding round in the range of $200 million that supposedly took place last week, but the company told us they haven’t said squat yet, so that means nothing has actually happened. However, the geniuses at CB Insights have now valued the startup at $1 billion, giving them a spot on the CB Insights Unicorn List which now has 384 members. Since CB Insights is largely infallible, we’re not sure what to make of the whole thing except to say we’ll update this article when there’s more color around this mythical funding round.
Update 09/17/19: DataRobot has raised $206 million in Series E funding to continue building out its product line while looking for acquisition opportunities where it makes sense. This brings the company’s total funding to $430.6 million to date.
With 472 open jobs right now, the company describes a “hypergrowth mode where groups within the company work like start-ups within the start-up.” That sounds about right considering they’ve been acquiring other companies as they go along. According to CrunchBase, they’ve now made four acquisitions. One we featured before – Nutonian – which made our list of 10 Data Science and Predictive Analytics Startups. Let’s talk about the other three.
When Startups Eat Startups
When a larger company acquires a smaller company, that’s the last chance you get to see what the smaller company was doing before everything goes behind the curtains. The most recent acquisition made by DataRobot was a startup called ParallelM which was founded in 2016 and had taken in an undisclosed amount of funding to “help their customers automate the deployment, scaling and ongoing management of ML services in production,” something they refer to as “MLOps.” That’s just a play on the term “DevOps,” something we talked about in our article on “Algorithmia – The World’s Biggest Algorithm Marketplace.” An Israeli tech news site, C Tech, wrote a piece about the acquisition titled “Disappointing Exit for Machine Learning Company ParallelM” which states that the “sum of the acquisition, which stood at several tens of millions of dollars, did not yield a return on investment for the company’s shareholder.” Sounds like DataRobot came out ahead in that transaction.
The second acquisition DataRobot made this year was a San Francisco startup called Cursor which was founded in 2017 and had taken in $2 million in funding. They’re now “part of the DataRobot family” which means their corporate website has disappeared and all that remains is a picture of the smiling founders who are probably thinking about which overpriced Bay Area property investments they plan to make with their newly acquired windfall. Press releases about the event say nothing of value, but rather make vague mentions of “quests,” “joining forces,” “unleashing value,” and “key pieces of the puzzle,” which is exactly why we try to avoid PR people like the plague. Best we can tell, Cursor was building a data collaboration platform which used AI algorithms to scour disparate data sources both internal – and more recently, external – to a firm. An article by TechCrunch last year elaborates on this a bit:
Cursor more or less behaves like a search engine internally. Users can search for information, which will surface up anything from a Tableau worksheet to an actual segment of SQL. Users can then comment on the information coming up from those searches, which are tagged with metadata to help employees find that information more easily. The idea is that if someone over on one side of a production team needs something (like a segment of code), they should have some kind of intuitive way for finding it rather than having to start an email chain with dozens of people on it.
No mention was made of how much Cursor was acquired for.
The third acquisition took place in 2018 – the only acquisition that year – and involved an Ohio startup called Nexosis that pulled out of the public’s view so fast that their domain is now for sale. Founded in 2015, Nexosis had taken in $7 million in funding to develop “a machine learning API for developers that featured automatic data processing, model selection, and multiple time series specific algorithms,” according to CB Insights. Said Jeremy Achin, CEO and Co-Founder, DataRobot:
Other companies in the data ecosystem looking to try to copy DataRobot should seriously think about firing their corporate development people for missing an opportunity like this.
Looks like DataRobot knows where to find value. Now, let’s look at what they’re doing with all this firepower.
Automated Machine Learning
The idea behind automated machine learning is exactly what it says on the tin. Companies have tons of data lying around in disparate systems and they want to extract insights from that data that they can use to create efficiencies which generate more profits for shareholders. More than 3,100 companies have used DataRobot’s automated machine learning platform to build 1,268,134,812 models – so apparently, it scales really well.
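For the technically curious, here is roughly what “try a bunch of algorithms and keep the best one” looks like in miniature. This is a minimal sketch under our own assumptions, not DataRobot’s actual method or API: the three candidate models, the function names (`cleanse`, `auto_select`), the holdout split, and the toy data are all hypothetical and chosen only for illustration.

```python
import random
import statistics

# Hypothetical candidate "algorithms": each takes training data and
# returns a prediction function. Real platforms try far more than three.
def fit_mean(xs, ys):
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    # Ordinary least squares for a single feature.
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_nearest_neighbor(xs, ys):
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

CANDIDATES = {
    "mean": fit_mean,
    "linear": fit_linear,
    "nearest_neighbor": fit_nearest_neighbor,
}

def cleanse(rows):
    """Drop rows with missing values (a stand-in for real data cleansing)."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def mean_squared_error(model, rows):
    return statistics.fmean([(model(x) - y) ** 2 for x, y in rows])

def auto_select(rows, candidates, holdout_fraction=0.25, seed=0):
    """Fit every candidate on a training split and rank by holdout error."""
    rows = cleanse(rows)
    random.Random(seed).shuffle(rows)
    split = int(len(rows) * (1 - holdout_fraction))
    train, holdout = rows[:split], rows[split:]
    xs, ys = [x for x, _ in train], [y for _, y in train]
    scores = []
    for name, fit in candidates.items():
        model = fit(xs, ys)
        scores.append((name, mean_squared_error(model, holdout)))
    # Lowest holdout error first: the top row is the "best" model.
    return sorted(scores, key=lambda item: item[1])

# Toy data: a roughly linear relationship plus two dirty rows.
rows = [(float(i), 3.0 * i + 2.0 + ((i * 7) % 5 - 2) * 0.1) for i in range(40)]
rows += [(None, 1.0), (5.0, None)]

leaderboard = auto_select(rows, CANDIDATES)
for name, mse in leaderboard:
    print(f"{name}: holdout MSE = {mse:.3f}")
```

On this toy data the straight-line model tops the leaderboard because the data was built to be nearly linear. The point of ranking on a holdout split rather than on the training data is that a model that simply memorizes what it has seen (like the nearest-neighbor candidate) doesn’t get rewarded for it.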
Here’s a look at just a few of the wide variety of use cases for automated machine learning. (By the way, if your company has a Marcoms team, make them spend a good chunk of their time putting together case studies because it helps these articles write themselves.)
- Concert Tickets – The marketing team at DR Koncerthuset sold 83% more tickets when emails were personalized and sent to specific subscribers that DataRobot predicted would buy tickets to particular events.
- Kobe Bryant – Sports is largely the opiate of the people, particularly for the Americans who have a bunch of sports the rest of the world largely doesn’t care about. In America, there’s this guy who plays basketball called Kobe Bryant and someone used DataRobot to analyze all 30,699 shots he took. You can read about it here if that’s your cup of tea.
- Supply Chain Models – Lenovo – big Chinese firm that sells $45 billion worth of computing stuff a year – needed to predict sell-out volume among their retailers using 59 variables. Previously, a single model took 4 weeks to build and 2 days to deploy. With DataRobot, models were created in 3 days and deployed in 5 minutes – and accuracy increased from under 80% to 90%.
- Water Availability – Using a database of half a million water points around the world, a nonprofit was able to predict which water points were likely to break in the future.
- Oil Drilling – DataRobot was used to analyze core samples taken from oil wells and predict where recoverable oil or gas is likely to be.
As you can see, if you have data from nearly any domain then DataRobot can extract useful insights from it. You don’t need to hire a team of data scientists in order to casually mention in the next board meeting how you used machine learning to extract some grand insight from your company’s data.
Before we wrap this up, it’s useful to see who else is dabbling in this space. Research firm Gartner puts together these “magic quadrants” which are the sort of thing an MBA might hand you and walk away thinking they actually did something of value. The accompanying data in the report is decent, and it’s good for letting us know who else is playing in this space.
Some of the names in Gartner’s quadrant could represent a future exit for DataRobot. Maybe IBM can sort something out, because the growth in our quarterly dividend checks needs to come from somewhere and it’s not looking like Watson will be the source of that growth, though we could be wrong. Here’s a blurb from that report on how DataRobot’s aforementioned acquisitions have helped them become a leader in this space:
DataRobot sets the standard for augmented data science and ML. Significant funding has enabled expansion via acquisitions to address time series modeling (Nutonian in May 2017) and an augmented approach for developers to incorporate models into applications (Nexosis in July 2018). These acquisitions give DataRobot the opportunity to extend its capabilities to new types of user, while focusing on its core competency of augmentation.
It then goes on to mention that DataRobot is quite pricey – but you get what you pay for, right?
One mistake that newbie investors make is to look at their portfolio too often. Every time a stock price fluctuates by more than 3% on a given day, they’ll panic and begin to question their investment decisions. While the “set it and forget it” approach has its downsides – at least, when it comes to the fast-moving world of tech investments – there’s a certain advantage you gain by looking at a company every several years and assessing their progress. Since we first looked at DataRobot over two years ago, they appear to be moving away from the pack and successfully scaling. Fresh funding means they can now fill some of those open job requisitions through “acqui-hires” as opposed to waiting for the recruiting department to win “the war for talent” – a phrase some PR person came up with to describe the act of persistently pestering people on LinkedIn.