Perhaps no other technology term has been more overused recently than “big data”. Whatever happened to “data mining” and “data warehouses”? Aren’t those pretty much the same thing? Not really. A data warehouse is a large collection of connected databases and schemas, which you then query using “data mining” tools to learn things. “Big data” is essentially the same concept, but the data sets are much larger and often contain lots of unstructured data as well. Take the multi-terabyte big data set example below, which shows a graphical depiction of all Wikipedia daily edits over a period of time:
That data wasn’t neatly stored in some nice structured table on a database in Wikipedia’s server room. IBM (NYSE:IBM) used a robot called Pearle to capture that “big data” set. Being able to capture big data and structure it is a prerequisite to analysis. Once you’ve defined all the datasets and created relationships within the data, you can then start to ask questions. Sure, every company stands to benefit from looking at “big data”, but it’s these “big data tools” that we want a piece of as investors. One company out there is seriously dominating this space. With their cool-sounding Lord of the Rings name and a $20 billion valuation, Palantir is the fourth-highest-valued private company today and the most revered “big data” player out there.
Founded in 2004, Palantir Technologies has taken in a staggering $2.42 billion in funding so far, with their latest round of $880 million closing in December of last year. While Palantir is backed by more than 12 different investors including the CIA’s investment arm In-Q-Tel, the Company’s biggest shareholder is Peter Thiel, who also co-founded Palantir. With a stated mission to use software to improve the world, Palantir has all but taken over Silicon Valley, now occupying 23 buildings or about 250,000 square feet of office space, which represents 10-15% of all commercial inventory. With some of these leases exceeding 10 years in duration, Palantir is here to stay. Over the past 3 years, the Company has been using their war chest to buy 5 other companies, allowing them to grow by acquisition:
Their latest acquisition, Kimono Labs, develops a technology that lets you screen scrape data from web pages. Over 125,000 developers were using that freely available tool, and now that the company has been acquired, the tool will no longer be offered as of February 29th, 2016. That says something about how valuable Palantir considers this tool, which will effectively let them turn the entire content of the Internet into a massive collection of data sets. We wrote about a company recently called Diffbot which is doing something similar in an article titled “Diffbot: Extracting Structured Data from the Internet”.
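To make “screen scraping” a bit more concrete, here is a minimal sketch of the general idea using only Python's standard library. It is not how Kimono Labs or Diffbot actually work; the sample page, the `TableScraper` class and the `scrape_table` function are all hypothetical names invented for this illustration. It simply turns an HTML table into a list of structured records:

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return one dict per data row, keyed by the header row's cell text."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# A made-up page standing in for any site that publishes data as HTML.
SAMPLE_PAGE = """
<html><body><table>
<tr><th>name</th><th>price</th></tr>
<tr><td>Alpha</td><td>10</td></tr>
<tr><td>Beta</td><td>25</td></tr>
</table></body></html>
"""

records = scrape_table(SAMPLE_PAGE)
print(records)
```

Real pages are far messier than this sample, which is presumably why a polished tool for the job is worth acquiring.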
Palantir offers two products, Gotham and Metropolis, which can be used for multiple industries and applications as seen below:
Now we can try to understand the Palantir product offering by spending hours trying to regurgitate their marketing material or by interviewing one of their PR people to get the canned media spiel, but instead we’ll give you an example of what they can do.
In a demonstration of how powerful their platform is, the Company put together a Palantir Gotham instance that integrated anonymized data from Medicare and various other data sources to show the potential of a fully integrated, interactive system. The instance integrated the following 6 data sets:
- Medicare data representing 100 million claims, 1 billion medical procedures, 30 million individual beneficiaries, and 700,000 physicians.
- Data from the National Plan and Provider Enumeration System, used to standardize identifiers across payers and providers.
- Data from the Dartmouth Atlas Project—a well-curated collection of hospital-specific performance data.
- Data from PubMed, representing 22 million biomedical journal articles.
- Data from the Department of Health and Human Services Office of the Inspector General, composed of entities excluded from participation as Medicare providers due to past fraudulent behavior.
- Data from the US Census showing demographic trends across the country.
Can you even imagine how tough it would be to link all these datasets? If you were building a data warehouse, you could just import all the databases into a master database and begin to query it. The problem here is that these aren’t all databases. The fourth dataset listed above consists of 22 million biomedical journal articles. How do you even begin to analyze those? That’s where Palantir’s tools can add value. And this example is just 6 datasets. The U.S. government has posted almost 200,000 datasets from 170 different sources for public access. There is an incredible amount of “big data” available for free, and the first company that can start to link all these data sets and learn from them will have the one ring that rules them all. Palantir Gotham allows you to do this.
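For the structured datasets in a list like this, “linking” usually comes down to joining records on a shared identifier. The sketch below shows that idea in plain Python. It is only an illustration under stated assumptions, not Palantir Gotham's actual method: every record, field name and identifier in it is invented, and `link_claims` is a hypothetical helper.

```python
# All records below are invented for illustration only.
claims = [
    {"claim_id": "C1", "provider_id": "P100", "amount": 250.0},
    {"claim_id": "C2", "provider_id": "P200", "amount": 90.0},
    {"claim_id": "C3", "provider_id": "P100", "amount": 410.0},
    {"claim_id": "C4", "provider_id": "P999", "amount": 75.0},
]

# Stand-in for a provider registry keyed by a standardized identifier.
providers = [
    {"provider_id": "P100", "name": "Provider A"},
    {"provider_id": "P200", "name": "Provider B"},
]

# Stand-in for a list of providers excluded from the program.
excluded_ids = {"P100"}


def link_claims(claims, providers, excluded_ids):
    """Attach provider details and an exclusion flag to every claim."""
    by_id = {p["provider_id"]: p for p in providers}
    linked = []
    for claim in claims:
        provider = by_id.get(claim["provider_id"])
        linked.append({
            **claim,
            # None marks a claim whose provider is missing from the registry.
            "provider_name": provider["name"] if provider else None,
            "excluded": claim["provider_id"] in excluded_ids,
        })
    return linked


linked = link_claims(claims, providers, excluded_ids)

# Once the datasets are linked, questions become simple queries.
flagged_total = sum(r["amount"] for r in linked if r["excluded"])
print(f"Amount billed by excluded providers: {flagged_total}")
```

The hard part in practice is everything this sketch skips: identifiers that don't match cleanly across sources, and unstructured sources such as journal articles that have no identifier column to join on at all.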
After you’ve created your ginormous big data set, you can then use Palantir Metropolis to analyze the data and learn from it. Examples given by Palantir include tracking and analyzing insurance claims data, network traffic flow, and financial trading patterns:
Metropolis ingests terabytes of claims data from Fortune 500 payers to identify plan members in need of critical services like immunization and chronic disease screenings. End-to-end workflows in Palantir lead to direct action to connect these members intelligently with providers based on distance, barriers to service, and provider quality.
Palantir is one of those companies that we actually get excited about investing in. The problem is, unless you’re a client of one of the VC firms that have investments in Palantir, you’re just going to have to wait for an acquisition or IPO to participate. While the IPO market is starting to get a bad case of the shakes, Palantir would be one offering we’d be keen to get a piece of. The alternative is an acquisition. With Google’s mission being to “organize the world’s information and make it universally accessible and useful”, Palantir would make a great addition to Alphabet’s portfolio. Then again, all this data is exactly what Watson thrives on crunching over at IBM (NYSE:IBM), and an acquisition is just what IBM needs to get out of the doldrums.
Let's hope we see an IPO for Palantir so that retail investors can have a pure-play big data stock to invest in. One firm that allows you to buy shares in startups before they IPO is Motif Investing. You can open a Motif Investing account for free with no deposit required so you are ready to buy shares of future IPOs before they begin trading.