Dataminr – AI for Social Media Big Data Mining
Before the dawn of the Digital Age, there was an oft-used expression, “I’m an open book,” meaning it’s easy to know what someone is feeling or thinking because they hide nothing. These are also the same people who probably drive windowless cargo vans filled with shiny toys. The rise of social media empires like Facebook and Twitter means that millions, if not billions, of our fellow humans are an open book these days. And companies like the infamous (and now defunct) Cambridge Analytica harvested all that personal data from nearly 100 million people to allegedly get politically motivated dirt on politicians.
We’ve preached before how our “data exhaust” (online behaviors, especially on social media) is used to make money. That data is also valuable when aggregated.
While social media listening tools have been around for years, few have attracted the sort of investments that have come to New York-based Dataminr.
Founded in 2009 by some dudes with degrees from Ivy League schools, Dataminr first came across our radar in an article we published in early 2017 titled “Big Data: Big Brother or Bigger Than Any of Us?” Since that article, funding has poured in. Late last month, Dataminr closed a $391.6 million Series E, bringing the startup’s total funding up to a staggering $577 million. That bought the company a $1.6 billion valuation, per Axios. As you might expect, Dataminr has quite the A-list cast of investors, with Morgan Stanley, Fidelity and Goldman Sachs among the more-than-20 backers posted on Crunchbase. Even EquityZen offered shares in a secondary market offering in September 2015. But that doesn’t tell you the whole story. Twitter reportedly has a 5 percent stake in the company and provides Dataminr with exclusive access to its “firehose,” or the full feed of users’ public posts. In addition, a couple of years ago, The Intercept reported that Dataminr was one of a number of social mining companies, including a competitor called Geofeedia, that received undisclosed funding from In-Q-Tel, the CIA’s venture capital firm.
Mining Social Media for Actionable Insights
Before we get to the Spy vs Spy chapter in our story, let’s talk a bit more about what exactly Dataminr does. On the surface, it sounds pretty simple: Dataminr aggregates social media data in real time, including upwards of 500 million tweets per day, and uses machine learning to identify and deliver relevant news to its clients before the breaking news headline hits CNN (though CNN is a media partner).
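To make "identify relevant news from a stream" a bit more concrete, here is a toy sketch of one common idea, burst detection: flag a term when it suddenly shows up far more often than it did in earlier time windows. To be clear, this is our own illustration and not Dataminr's actual method, which isn't public. The function name detect_bursts, the window format, and the thresholds are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean, pstdev


def detect_bursts(windows, min_count=3, threshold=3.0):
    """Return terms whose count in the latest window is unusually high.

    windows: list of lists of post texts, one inner list per time window,
             oldest first (a hypothetical input format for this sketch).
    """
    def term_counts(posts):
        counts = Counter()
        for text in posts:
            # Count each term once per post so one repetitive post
            # cannot trigger a burst on its own.
            counts.update(set(text.lower().split()))
        return counts

    history = [term_counts(w) for w in windows[:-1]]
    latest = term_counts(windows[-1])

    bursts = []
    for term, count in latest.items():
        if count < min_count:
            continue
        past = [h[term] for h in history]  # Counter returns 0 if absent
        baseline = mean(past) if past else 0.0
        spread = pstdev(past) if past else 0.0
        # Floor the spread at 1.0 so rare terms with no past variance
        # still need to clear a meaningful margin.
        if count > baseline + threshold * max(spread, 1.0):
            bursts.append(term)
    return sorted(bursts)
```

A real system would work on far more than whitespace-split words (entities, locations, images, account reputation), but the core intuition is the same: something is news when a lot of people start talking about it at once.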
The company serves five broad industries: finance, corporate security, public sector (including law enforcement), news and public relations. It got its start serving hedge funds and asset managers, which can use alerts from Dataminr to react to events that might affect different markets. For example, last December, an explosion ripped through a major natural gas facility in Austria. Dataminr notified its clients of the blast about 90 minutes before the official word came in from the gas company. The event impacted gas supplies across Europe and sent gas futures soaring. The early warning from Dataminr gave its clients time to react before the market moved.
Mining Social Media for Breaking News
Dataminr says its social media surveillance platform is used by more than 400 newsrooms. The New York Post, which has a huge digital media presence, has used Dataminr for news tips on many major stories, such as the death of American student Otto Warmbier following his release from North Korea and the U.S. Congressional baseball practice shooting in Del Ray, Virginia, last year. Or remember the story that circulated in 2017 about how Microsoft founder Bill Gates regrets the “ctrl-alt-delete” function in Windows? He made that comment at the Bloomberg Global Business Forum and someone tweeted that little nugget, which Dataminr’s algorithms detected, alerting newsrooms to what would become a viral and rankings-boosting news story for little-known news sites like Inverse.com. Even journalists for the New York Times say Dataminr is the most important tool for staying on top of breaking news.
Meanwhile, companies use the platform to monitor and protect their brands. Take the unfortunate PR event that befell clothing retailer H&M when the Huffington Post tweeted an image that showed a young black model in a sweatshirt that read: “Coolest Monkey in the Jungle.” That went over like a barrel full of monkeys over Niagara Falls. In that case, there wasn’t much H&M could do, as it lost celebrity endorsements, while stores in South Africa were vandalized. But Dataminr was there to cover the ongoing controversy for its clients.
Mining Social Media for Emergency Response
In 2017, Dataminr rolled out a new tool in New York City that identifies emergency situations through the Twitter firehose and alerts the city’s first responders. The system can reputedly “filter out fake announcements, jokesters, pranksters” and other non-events, so emergency personnel don’t waste their time chasing cats stuck in trees, according to TechCrunch. The company’s algorithms can reportedly discern fake news from real news because the signals that come in during an actual emergency usually include details such as photos, TechCrunch reported.
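How might a filter like that work? The only concrete hint in the reporting is that real emergencies tend to come with details such as photos. As a rough sketch under that assumption, the toy check below treats a cluster of posts as credible only if several distinct accounts report it and at least one post carries a photo. The Post class, the looks_credible function, and the cutoffs are hypothetical names and values we made up for illustration; this is not how Dataminr actually does it.

```python
from dataclasses import dataclass


@dataclass
class Post:
    user: str
    text: str
    has_photo: bool = False


def looks_credible(posts, min_users=3, min_photos=1):
    """Crude corroboration check for a cluster of posts about one event.

    Assumed rule (ours, not Dataminr's): require reports from several
    distinct accounts and at least one attached photo.
    """
    distinct_users = {p.user for p in posts}
    photo_count = sum(1 for p in posts if p.has_photo)
    return len(distinct_users) >= min_users and photo_count >= min_photos
```

Under this rule, one prankster tweeting the same thing five times never gets through, while three different people posting about smoke on the same block, one of them with a picture, does.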
In a real-life example from a few years ago, a gas explosion in New York City’s East Village caused a massive blaze that killed two people and injured a couple dozen others. Dataminr sent its first alert about the disaster about 12 minutes ahead of major news reports and helped emergency responders get to the scene that much quicker.
Mining Social Media for Surveillance
We invoked the story of Cambridge Analytica at the beginning of this article for an obvious reason. Like any technology, Dataminr can be used for evil as well as good.
The company got into its own PR nightmare in 2016 when news broke that Dataminr was selling geospatial intelligence data (i.e., location information) to police intelligence centers, which in government-speak are known as fusion centers. Dataminr had allowed these fusion centers to identify individual tweets and users linked to events and keywords, according to The Verge. In fact, the American Civil Liberties Union of California discovered Dataminr helped track social media posts relating to protests in one case. It turns out that sort of surveillance is against Twitter’s policies and access was eventually blocked.
That same year Dataminr had a contract with the FBI for access to the company’s early-warning system, which included permitting the federal law enforcement agency to “search the complete Twitter firehose, in near real-time, using customizable filters” in the name of fighting terrorism.
In 2017, a short series of investigative pieces by The Verge and MapLight revealed more details about how Dataminr allegedly offered its services to foreign governments in ways that the more cynical of us might construe as surveillance of political dissidents. One possible use case mentioned in the reporting is that Dataminr could be used “to explore an individual’s past digital activity on social media and discover an individual’s interconnectivity and interactions with others on social media.” That’s one way to turn a movement like the Arab Spring, which relied heavily on social media sites like Twitter, into one long, deadly winter.
Just in case you’re not paranoid enough, The Intercept reported earlier this year about a former VP at Dataminr named Patrick Ryan (running for office at the time of the report) who allegedly used his access to various data mining technologies to spy on left-wing activists.
The Future of Social Media Mining
Despite privacy scandals linked to sites like Twitter and Facebook, it would be naive to think the value of social media mining will suddenly evaporate. Dataminr can count 391 million new reasons why that won’t be the case any time soon.
In a recent interview, Dataminr Chief Strategy Officer Peter Bailey (brother of founder and CEO Ted Bailey) said that the 300-person company aims to get its arms around even more big data to provide additional context to its breaking news alerts. For example, Twitter users might be the first to report on a chemical plant explosion, but they’re not likely to know what exactly exploded that’s turning them into an army of clown-faced zombies. But something like a space-based radar can perform an analysis of the atmosphere in real time, adding an additional layer of information before the world turns into a post-apocalyptic hellscape.
Dataminr is hardly the only player in the social media data mining game, but with all this new funding they should find it a lot easier to accelerate out ahead of the competition. Companies like Dataminr will continue to find new opportunities to leverage big data and AI, and scandals will continue to happen as companies dabble in all this newly found big data. As long as people are willing to give away their data for free, there are plenty of companies out there willing to monetize it.