Should Social Media Data be Used for ESG Investing?

August 8, 2019. 7 min read

You have to beat the system to change it. In today’s capitalist society, this means climbing the ladder alongside the rest of the grunts and accumulating wealth that you can use to change the world. It’s a far more effective strategy than waving a hand-painted sign around while chanting slogans. Another way to help change the world is by directing investment capital towards those enterprises that are doing good things. Responsible investing, sustainable investing, socially responsible investing: these are all names for something the finance industry calls Environmental, Social, and Governance (ESG) investing, which refers to three categories of “good” that companies should all strive towards:

  • Environmental – Climate Change, Natural Resources, Pollution & Waste, Environmental Opportunities
  • Social – Human Capital, Product Liability, Stakeholder Opposition, Social Opportunities
  • Governance – Corporate Governance, Corporate Behaviour

ESG is an extremely lucrative business today. The amount of money being allocated to ESG investments is staggering, around $12 trillion in U.S. assets alone.

Sustainable growth
Source: Bloomberg

The drive behind the growth of ESG is often attributed to the changing preferences of millennial investors who want to put their newly accumulated wealth to good use. Whether or not ESG outperforms the broader market is a topic of debate.

Does ESG Generate Alpha?

Contrary to what you’ll hear from the mouths of those who sell ESG investment products, much debate has been had around whether or not these products can consistently outperform their “not so good” counterparts. If you cherry-pick time frames, you can make just about anything outperform. The argument that “doing good mitigates operational risk” seems to ignore the fact that in today’s globalized world, foreign competitors aren’t all subjected to the same sorts of rules and regulations that the ethnocentric ESG rule-makers think ought to be in place. Perhaps it’s best to forget about convincing people there’s alpha to be had in ESG. Instead, let’s make absolutely certain that we’re accurately measuring ESG so that investors can rest assured their investments are making the world a better place. One company that’s trying to do that is Sensefolio.


Founded in 2015, Chicago-based startup Sensefolio uses Natural Language Processing (NLP) to analyze an incredibly large amount of information to determine ESG scores for over 20,000 companies around the globe. On the surface, it looks like other ESG methodologies we’ve seen – it uses natural language processing to scour the world’s information to assess how good a company is – but there’s one thing that seems out of place:

Sensefolio data sources
Source: Sensefolio

Sensefolio is scanning more than 500 million social media posts and company reviews and using this information to assess how good or bad a company is. Here are some of the data sources they are pulling from.

Social media posts and reviews
Source: Sensefolio

Sensefolio considers “company reviews from former and current employees as highly significant in the prospects and culture assessment of the company and its structure.” So, let’s look at some of those data sources.

Company Reviews

Gender equality in the workplace represents a notable social focus in the context of ESG. Deriving meaningful data to support associated scores and ratings is very delicate, and by its subjective nature is susceptible to tremendous bias if not composed as consistently and scientifically as possible.  Sensefolio has decided to take into account data from a website called Fairygodboss where female employees discuss things like whether or not your company holds unconscious bias training or whether or not the CEO “supports gender diversity.”

Another hard-hitting website that gets screen-scraped is InHerSight which also lets women anonymously discuss companies they work for. Sensefolio then looks at things like whether or not your company holds “social activities” (so Christmas parties are back now?), if women are happy with their salaries (is anyone, of any gender?), or if there are “learning opportunities” (managers know how ambiguous this one is).

Then, there’s a data source they use called Comparably which looks at the “quality of coworkers” and a “diversity score” which probably balls up half the world’s population and labels them as “Asians.” Ever quit an employer and leave a scathing review on Glassdoor? That’s being taken into account too. It’s hard to see how any of this data could be considered objective, but it gets even worse.

Then, There’s Twitter

There’s nothing more annoying than going over to HuffPo and reading an article which largely consists of mundane tweets the author copy-pasted off Twitter, uttered by a bunch of irrelevant people. This is now frequently passed off as “journalism.” Imagine if we used that stuff to calculate ESG scores? Sensefolio scans tweets that “can be from the companies themselves or individuals” and they look for “at least one external news provider from the Sensefolio’s database of news vendors to support the statement.” The problem is, when the mob of armchair CEOs on Twitter starts bashing companies over perceived indiscretions, it often gets picked up by the media.

Remember that belligerent old fellow who was causing problems on a plane and everyone was up in arms? How would that have affected United Airlines’ ESG score, and for how long would they have needed to self-flagellate on Twitter before we forgave them and moved on to policing the next injustice? Then, there’s that politically charged Gillette ad that was deemed by some as the worst marketing move ever and is now one of the most disliked videos on YouTube. (In unrelated news, did you see what happened to Gillette in P&G’s earnings report last week?) Was that ad a good thing or a bad thing? Woke or broke?

Even if the majority of weight is given to company Twitter feeds, this sort of “self-reporting” makes little sense and leads to erroneously classifying big oil companies as some of the greenest companies on the planet. Do companies now need to check boxes each week to make sure they’re saying all the right things? It’s very hard to be authentic when every word you say is being evaluated in real time. So far, we can’t see any good reason why social media data should be used to calculate ESG scores.

NLP and ESG Investing

One leader in the ESG space is MSCI, a $20 billion financial services company whose CEO, Henry Fernandez, recently told the Financial Times that “MSCI is obsessed about becoming the world’s biggest supplier of ESG tools…there might be a point where MSCI gets defined by ESG.” Their methodology is elaborated upon in a solid white paper which details what they do in an academic manner. That’s because they have a very strong research team in house that constantly assesses what they’re uncovering. Social media isn’t included in their list of data sources. Instead, they use NLP to scour things like news sources, company disclosures, government data sources, NGO data sources, and the like.

ESG Rating Framework
Source: MSCI

The advantage of using NLP is speed in that they’re able to react more quickly to changes in the global environment. Keeping the data sources limited to what’s relevant allows you to always keep “humans in the loop” (in the case of MSCI, 185 research analysts), something that helps mitigate what Deutsche Bank has coined the “Hathaway effect.”

In 2012, a researcher noticed that each time a film was released starring Anne Hathaway there was a jump in the share price of Warren Buffett’s Berkshire Hathaway. News articles about the actress had a similar effect on Berkshire’s stock. The likely cause of this odd relationship was algorithmic systems that traded based on keyword searches of news articles. That story neatly highlights the problems with using traditional trading algorithms to search unstructured data sources such as news and social media.
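To make the keyword problem concrete, here is a minimal Python sketch. It is not how any real trading system works; the headlines are made up for illustration, and the function names `naive_match` and `entity_match` are hypothetical. It only shows why a bare keyword search cannot tell the actress from the company, while a stricter match on the full company name can.

```python
import re

# Hypothetical headlines, invented purely for illustration.
HEADLINES = [
    "Anne Hathaway's new film opens to strong reviews",
    "Berkshire Hathaway reports record operating earnings",
    "Hathaway wins award for best supporting actress",
]


def naive_match(headline: str, keyword: str = "hathaway") -> bool:
    """Flag any headline that contains the keyword, regardless of context."""
    return keyword in headline.lower()


# Stricter rule (an assumption for this sketch): require the full company name.
COMPANY_PATTERN = re.compile(r"\bberkshire\s+hathaway\b", re.IGNORECASE)


def entity_match(headline: str) -> bool:
    """Flag a headline only when the full company name appears."""
    return COMPANY_PATTERN.search(headline) is not None


naive_hits = [h for h in HEADLINES if naive_match(h)]
entity_hits = [h for h in HEADLINES if entity_match(h)]

print(len(naive_hits), "headlines flagged by keyword search")
print(len(entity_hits), "headlines flagged by full-name match")
```

The keyword search flags all three headlines, including the two about the actress, while the full-name match flags only the one about the company. Real-world text is far messier than this, which is part of the case for keeping humans in the loop.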

The article by Deutsche titled “Big data shakes up ESG investing” goes on to talk about how slow ESG methodologies are to react to the news. For example, in the case of a litigation settlement, “companies outperform their peers by two percentage points,” but almost none of that alpha comes on the day of the announcement or the day after. Instead, it takes four months to be realized.

If MSCI needs 185 research analysts on staff to monitor their existing methodology, just imagine how many they would need to add if social media data were thrown into the mix. While we may not all agree on whether or not this ESG methodology generates alpha, most can probably agree that social media is best avoided when trying to correctly identify good and bad.


For ESG to truly measure whether or not a company is doing good, we need to move away from self-reporting as it allows companies to game the system. The same holds true for social media. Taking into account what a Twitter mob thinks about a company is equally inappropriate, especially when people can attack you under the veil of anonymity so you have no way to defend yourself in the case of slander or libel. Since not everyone agrees what ESG should measure or how it should be measured, we need to focus on transparency, and we’re not getting any closer to that by screen-scraping bitching sessions from disgruntled employees on company review sites. If we’re going to sell ESG to investors based on accuracy and transparency, will sifting through 500 million social media posts really help get us there?


Comments

  1. This article made me want to learn more about Sensefolio, and thus made me read their technical paper. I can see that integrating social media posts constitutes only around 1/5 of their whole ESG Score calculation. Indeed, Sensefolio also integrates news, company reports, NGO reports, etc. Moreover, they seem to also assess the level of subjectivity or objectivity of the reviews and social media posts.

    1. 1/5 of the ESG calculation is still meaningful. The author of this article challenges whether social media signals should be used at all, using Sensefolio’s methodology as an example of a firm that has decided to use social media for calculating ESG scores. A firm like MSCI has 185 analysts assessing what their NLP covers and that excludes social media. On the other hand, Sensefolio appears to have a small handful of analysts according to LinkedIn. One would hope they have the manpower needed to separate the wheat from the chaff.

      Leaving Sensefolio out of the picture entirely, taking into account what a rabid Twitter mob has to say about any given company seems like a flawed approach to ESG.