Analyzing Financial Forecasting Models Using AI

In the world of finance, financial data is the foundation of every company that offers a financial product. Financial data is so revered by the industry that MBAs are told a company can be valued based on data alone. The thing is, it actually works surprisingly well. You can get a basic idea of a company’s true valuation using various methods that are all based on underlying financial data. When we talk about financial data, we’re usually referring to two broad types of data:

  • Market Data – Asset identifiers, daily volume, open/close prices, often sourced from exchanges
  • Fundamental Data – Company financials found in regulatory databases

The first type of data is useful, but entirely structured and quantified in a manner that can’t really be improved upon, aside from market data vendors improving their often substandard data processing methods. The second type of data is where all the meat is at, and usually comes from regulatory filings.

When we write about companies that may be throwing off red flags, we completely avoid investor presentation decks and only look for information in the regulatory filings. This is where everything the company needs to disclose can be found. It’s the purest form of information that comes right from the horse’s mouth. Financial filings are required by law to disclose what is pertinent to investors and accurate to the “best of the company’s knowledge”.

Most of the data found in regulatory filings are already labeled and structured, like this example taken from the annual report of a space stock we looked at recently named ORBCOMM:

Source: ORBCOMM 10-K

Any half-decent programmer could easily collect data from tables like the one above and shove it into a database, but now we’re well beyond that. With the advent of newfound Natural Language Processing algorithms that use machine learning, analysts can begin to parse unstructured information. For example, check out how important the information contained within this line of unstructured data would be for investors to know (again, from the ORBCOMM 10-K):

Significant customers such as JB Hunt, Walmart, Caterpillar, Komatsu, Hub Group Inc., Onixsat and Satlink S.L. collectively, represented 31.9% and 26.5% of our revenues in 2017 and 2016, respectively, and are expected to represent a substantial portion of our revenues in the near future.
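To make the idea concrete, here is a minimal sketch of pulling the customer names and revenue percentages out of that one sentence. It uses plain regular expressions as a stand-in for a real NLP pipeline, and the function name `extract_customer_concentration` is our own invention, not anything ORBCOMM or any vendor provides. The patterns are tuned to this sentence's wording and would need real entity recognition to cope with the variety found across filings.

```python
import re

SENTENCE = (
    "Significant customers such as JB Hunt, Walmart, Caterpillar, Komatsu, "
    "Hub Group Inc., Onixsat and Satlink S.L. collectively, represented 31.9% "
    "and 26.5% of our revenues in 2017 and 2016, respectively, and are "
    "expected to represent a substantial portion of our revenues in the "
    "near future."
)


def extract_customer_concentration(sentence):
    """Hypothetical helper: extract customers and revenue share by year."""
    # In this sentence the names sit between "such as" and "collectively".
    names_match = re.search(r"such as (.+?),? collectively", sentence)
    customers = []
    if names_match:
        # Split the list on commas and on the final "and".
        parts = re.split(r",\s*|\s+and\s+", names_match.group(1))
        customers = [p.strip() for p in parts if p.strip()]

    # Percentages and four-digit years, in the order they appear.
    percents = [float(p) for p in re.findall(r"(\d+(?:\.\d+)?)%", sentence)]
    years = [int(y) for y in re.findall(r"\b(?:19|20)\d{2}\b", sentence)]

    # "respectively" pairs the first percentage with the first year, and so on.
    return {
        "customers": customers,
        "revenue_share_pct": dict(zip(years, percents)),
    }
```

Calling `extract_customer_concentration(SENTENCE)` returns the seven customer names plus a year-to-percentage mapping, which is the kind of structured record that could then be stored alongside the tabular data.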

Imagine how useful it would be to identify company relationships like these scattered throughout the millions of financial filings around the globe – in almost real-time? It’s not unreasonable when you consider that Digital Reasoning can take a stack of documents that would take 300 analysts an entire year to read and munch through it in just several hours. A startup called Visible Alpha wants to change the way that analysts construct financial forecasting models by making the data they need (and plenty of data they didn’t know they needed) all readily available at the click of a button.

About Visible Alpha

Founded in 2012, New Yawk startup Visible Alpha has taken in $34 million in funding from all the big names in banking like Bank of America, UBS, Goldman Sachs (lead investor), Wells Fargo, Morgan Stanley, and HSBC (lead investor). The Visible Alpha product is based on “deep data” that is derived from a large number of data sources as seen below:

Source: Visible Alpha

While some of these data sources are vanilla flavored, like balance sheets and cash flow statements, some others like “segment financial forecasts” are quite exotic. Remember how we talked about MBAs that use fundamental data to value companies and make predictions about future cash flows and earnings forecasts? Investopedia describes this process far better than we ever could:

Earnings forecasts are based on analysts’ expectations of company growth and profitability. To predict earnings, most analysts build financial models that estimate prospective revenues and costs. … Analysts’ forecasts are critical because they contribute to investors’ valuation models.

Those financial models built by the thousands of analysts out there have historically existed in disparate sources, using varying accounting methods to arrive at their conclusions. What Visible Alpha does is take all the underlying data for the financial models and standardize it so you can see where the data is coming from and how analysts compare when they forecast key numbers – like the number of iPhones sold by Apple seen in the below example:

Source: Visible Alpha
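A rough sketch of what that kind of standardization involves: map each analyst's own label onto one common line item, then compare the forecasts side by side. Everything here is an assumption made for illustration, including the broker names, the labels, the `iphone_units_m` key, and the forecast numbers, which are invented. This is not Visible Alpha's actual schema or method.

```python
from statistics import mean, median

# Hypothetical mapping from each analyst's own label to one standard line item.
LABEL_MAP = {
    "iphone units": "iphone_units_m",
    "iphone unit sales (m)": "iphone_units_m",
    "iphones sold": "iphone_units_m",
}

# Invented forecasts in millions of units, purely for illustration.
RAW_FORECASTS = [
    {"analyst": "Broker A", "label": "iPhone Units", "value": 78.0},
    {"analyst": "Broker B", "label": "iPhone unit sales (m)", "value": 80.5},
    {"analyst": "Broker C", "label": "iPhones Sold", "value": 76.0},
]


def standardize(rows, label_map):
    """Rewrite each row's label to its standard line item; drop unknown labels."""
    out = []
    for row in rows:
        item = label_map.get(row["label"].strip().lower())
        if item is None:
            continue  # an unmapped label would need a human to look at it
        out.append({"analyst": row["analyst"], "item": item, "value": row["value"]})
    return out


def consensus(rows, item):
    """Summarize all forecasts for one line item and each analyst's gap to the mean."""
    selected = [r for r in rows if r["item"] == item]
    values = [r["value"] for r in selected]
    avg = mean(values)
    return {
        "mean": avg,
        "median": median(values),
        "low": min(values),
        "high": max(values),
        "vs_mean": {r["analyst"]: round(r["value"] - avg, 2) for r in selected},
    }
```

Running `consensus(standardize(RAW_FORECASTS, LABEL_MAP), "iphone_units_m")` gives a consensus range and shows which analyst sits above or below it, which is only possible once the three differently labeled rows are recognized as the same number.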

This sort of aggregated data is extremely valuable to everyone involved in the investment process, from the people who sell this research to the people that buy it, and even the companies that are being analyzed in the reports. Having transparency into how analysts come up with their forecasts would be extremely valuable for Apple as they try to explain why they’ve hit or missed earnings forecasts based on a large number of cumulative assumptions. With over 450 research providers using the Visible Alpha platform to manage their reports and financial models, it makes sense for companies to see what is being said about them in numbers.

Getting these rich datasets at scale is not easy because each forecasting model will be constructed and labeled differently, not to mention the various accounting methods being used. That’s why the Visible Alpha process still uses high and low touch processes that involve humans:

Source: Visible Alpha

Of course these “human-in-the-loop” processes are ripe for automation when you have AI algorithms start to “learn” based on human decisions. As the process becomes more automated, the likelihood of data errors decreases. For firms that sell financial data, incorrect data incur a huge support cost.
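Here is one way a human-in-the-loop step like that might be sketched, assuming a simple setup of our own: labels that closely match a known line item are mapped automatically, anything uncertain goes to a person, and the person's answer is remembered for next time. The standard items, the 0.8 threshold, and the `learned` lookup table are all illustrative assumptions, and the lookup table is a crude stand-in for a model that actually learns from human decisions.

```python
from difflib import SequenceMatcher

# Hypothetical standard line items that analyst labels get mapped onto.
STANDARD_ITEMS = ["total revenue", "operating income", "net income"]


def best_match(label, candidates):
    """Return (similarity, candidate) for the closest standard item."""
    scored = [
        (SequenceMatcher(None, label.lower(), c).ratio(), c) for c in candidates
    ]
    return max(scored)


def route(labels, candidates, learned, threshold=0.8):
    """Map labels automatically when confident, else queue them for a human."""
    auto, review = {}, []
    for label in labels:
        key = label.lower()
        if key in learned:
            # A human already decided this one; reuse the decision.
            auto[label] = learned[key]
            continue
        score, item = best_match(label, candidates)
        if score >= threshold:
            auto[label] = item
        else:
            review.append(label)
    return auto, review


def record_decision(learned, label, item):
    """Store a human reviewer's mapping so the label is automatic next time."""
    learned[label.lower()] = item
```

On a first pass over `["Total Revenues", "Net Income", "EBIT"]`, the first two map automatically and "EBIT" lands in the review queue. Once a reviewer records it as "operating income", a second pass needs no human at all, which is the direction the automation is heading.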

There’s also a regulatory driver here, something really boring called MiFID II, which is a European Union directive that requires firms to now pay for all research as opposed to purchasing a “bundle” of services that includes research. It’s all about increasing transparency, they say. Of course, a number of startups cropped up to address this new need, several of which were acquired by Visible Alpha last year:

  • Alpha Exchange (acquired Nov 2017) – This London-based startup states “we are bringing you a comprehensive end-to-end solution to discover, consume, track, budget, value and pay for research content” which will help “facilitate research monetization in a MiFID II world”.
  • (acquired Jan 2017) – Helps asset management firms “monitor, aggregate and analyze all of their interactions with research providers in one place”. Also focused on MiFID II compliance.

The end result is that now Visible Alpha can offer investors an award-winning “comprehensive end-to-end solution to discover, consume, track, budget, value and pay for research content”. Perhaps the most interesting of these is “value”. Wouldn’t it be great to see which one of the 450 firms selling research reports out there is getting things right? Remember, this is the same “research” that 80% of active portfolio managers use to underperform the market benchmark every year.

There are some challenges when it comes to global financial data. While it’s easy enough to scrape every single SEC filing for every single company out there and sell the data, it’s not that easy when you try to do that internationally. We recently talked about how AI can’t even understand the 1.3 billion Chinese people on this planet. Expect even more problems when you encounter some “creative accounting” along with a vague explanation that doesn’t translate properly. These are all challenges to be worked through, but Visible Alpha has a consortium of banks that are incented to help them succeed along with some deep pockets.


The application of AI that we’ve discussed so far has involved parsing all the data and making sense of it in a structured model so that it can then be used by MBAs to make bad decisions with. Why not feed this data to another set of AI algorithms and replace the analysts entirely? Are we to believe that a program that can master the most complex game known to man, Go, without any instructions, is somehow incapable of forecasting 3M’s next dividend increase?

Turns out lots of firms are using AI algorithms for forecasting, and lately they’ve been getting trounced by the market. “Hedge Funds That Use AI Just Had Their Worst Month Ever” says an article by Bloomberg published a few weeks back. Maybe the market is truly a random walk after all.

