Moogsoft – Machine Learning Takes Over IT Operations

April 8, 2017. 4 min read

If you work in a corporation where your job involves sitting in front of a keyboard all day, you probably have your own version of an “IT Helpdesk” that most likely drives you absolutely nuts. IT Helpdesk is your resource for rectifying any computer-related issues you might have, because the “local IT support” team was canned during the last cost savings initiative, or “strategic realignment” as it was officially called. If you log a ticket with IT Helpdesk like you’re supposed to, you won’t hear back for at least 48 hours, so you pick up the phone and call them:

  • IT Helpdesk: Hello IT Helpdesk, this is Abishek John. How can I help you?
  • Corporate Slave: Hi John, I’m having a problem with my mouse where it “lags” when I move between screens. I just need a new mouse.
  • IT Helpdesk: Please share your screen.
  • Corporate Slave: No John, you don’t understand. You can’t actually experience the lag unless you use the mouse yourself. Just order me a new mouse and have it delivered ASAP please.
  • IT Helpdesk: We cannot issue any new hardware until the problem has been identified according to our procedures, after which you need VP approval.
  • Corporate Slave: I am a VP, you recipe-driven muppet. Just send me a new mouse and I’ll leave you alone.
  • IT Helpdesk: Now sir, there is no need to talk like that. Please just share your screen so I can help you.
  • Corporate Slave: <strangles self with mouse cord>

We’ve all been there, and the only solace we have is that John in Mumbai’s job is not going to be around for too much longer. You see, pretty soon computers are going to fix themselves. Sound a bit far-fetched? It’s not at all. Recently, researchers at Google Brain ran an experiment in which AI Agent John talked to AI Agent Sally. Google told them to converse secretly so that nobody could understand what they were saying, and they created their own cryptographic method of communicating. Then Google created AI Agent Abishek to harass them, so John and Sally actually started modifying their cryptographic language to thwart AI Abishek’s attempts at breaking the code. Simply amazing stuff, and it demonstrates how AI will be a critical factor in cybersecurity going forward.

The use of artificial intelligence for cybersecurity is simply a form of monitoring where the machine learning algorithms learn to distinguish “real threats” so they can sound the alarm. This sounds an awful lot like the job of IT operations teams, which largely exist to sift through reams of “false positives” in production environments to try and identify real problems. One startup looking to address this pain point, which most firms suffer from, is Moogsoft.

Moogsoft – The Future of IT Operations


Founded in 2011, San Francisco startup Moogsoft has taken in $53 million in funding from investors including Cisco to develop “technology that helps enterprise IT operations and development operations teams become faster, smarter and more effective”. In other words, they are “freeing people up to do more value added activities” with their flagship product called Incident.MOOG. For those of us who don’t speak geek, here’s what this tool does.

Organizations everywhere use technology in production environments. Think about the company you work for and their corporate website. Right now, someone’s job is to watch for alerts from that corporate website to make sure that it stays online and that the people using it don’t run into any problems. Enterprises today use anywhere from 10 to 25 different vendor tools to monitor their production stack of applications, networks, and infrastructure, and those tools generate millions of events and alerts every day. We call this “operational noise” because the majority of those events and alerts are not actual issues, but need to be analyzed anyway to make sure they’re benign. What Moogsoft has built is called “algorithmic IT operations” or AIOps for short. Here’s a largely useless chart that some MBA put together which tells you next to nothing about what AIOps is but looks cool:

Moogsoft AI Technology

Algorithmic IT operations, or AIOps, is when you let unsupervised machine learning algorithms figure out which alerts are real instead of having humans monitor the alerts. How well does it work? Here are some actual results:

  • A top-tier web-scale company with hundreds of millions of active users was able to detect incidents more than 24 hours before corresponding tickets were created by IBM Netcool.
  • A $450M SaaS company cut its DevOps time to software release from 14 days to 1 day.
  • A Fortune 100 manufacturer cut its raw events from 115 million a day to 250 actionable situations.
  • Events were reduced to actionable incidents (called situations in Moogsoft) by up to 99.998% after a brief learning/priming period, or by 90% with minimal configuration.
  • Royal Bank of Canada reduced “operational noise” by 50% and realized a 4X ROI in the first year of using the Moogsoft platform
  • HCL Technologies saw a 62% reduction in helpdesk tickets
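
For the curious, here is a rough idea of how a pile of raw alerts can collapse into a handful of “situations”. This is a minimal Python sketch and not Moogsoft’s actual algorithm, which isn’t described here. It simply assumes that alerts arriving close together in time and sharing enough words in their messages belong to the same underlying problem. The `Alert` fields, the tool names, the five-minute window, and the 0.3 similarity threshold are all made up for illustration. The `reduction_pct` helper just does the arithmetic behind the percentages quoted above.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    timestamp: float  # seconds since an arbitrary start time
    source: str       # monitoring tool that raised the alert (hypothetical names)
    message: str


@dataclass
class Situation:
    alerts: list = field(default_factory=list)
    tokens: set = field(default_factory=set)
    last_seen: float = 0.0


def tokenize(message: str) -> set:
    """Lowercase the message and split it into a set of words."""
    return set(message.lower().split())


def jaccard(a: set, b: set) -> float:
    """Share of words two word sets have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_alerts(alerts, window=300.0, min_similarity=0.3):
    """Greedy single-pass clustering (an assumed, simplified approach):
    an alert joins the first situation that is both recent enough and
    textually similar enough; otherwise it starts a new situation."""
    situations = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        words = tokenize(alert.message)
        for s in situations:
            recent = alert.timestamp - s.last_seen <= window
            if recent and jaccard(words, s.tokens) >= min_similarity:
                s.alerts.append(alert)
                s.tokens |= words
                s.last_seen = alert.timestamp
                break
        else:  # no existing situation matched
            situations.append(Situation([alert], set(words), alert.timestamp))
    return situations


def reduction_pct(raw_events: int, actionable: int) -> float:
    """Percentage of raw events that no longer need a human to look at them."""
    return 100.0 * (1 - actionable / raw_events)


# Invented sample data: six alerts from two hypothetical tools.
sample_alerts = [
    Alert(0.0, "hostmon", "db01 disk latency high"),
    Alert(12.0, "appmon", "db01 query latency high"),
    Alert(20.0, "hostmon", "db01 disk latency critical"),
    Alert(45.0, "netmon", "switch7 port flap"),
    Alert(50.0, "netmon", "switch7 port flap"),
    Alert(4000.0, "hostmon", "db01 disk latency high"),
]

situations = group_alerts(sample_alerts)
print(len(sample_alerts), "alerts ->", len(situations), "situations")
print(round(reduction_pct(115_000_000, 250), 4))
```

Plugging in the manufacturer’s numbers from the list above (115 million raw events a day down to 250 situations) works out to a reduction of roughly 99.9998%. A real system would need far smarter similarity measures than word overlap, but the shape of the problem is the same: many alerts in, few situations out.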

That’s what machine learning is capable of doing, folks. We can almost guarantee you that the efficiencies described above resulted in meaningful headcount reductions, or “cost savings” as they’re called these days. If you’re a CTO, prepare to see that diagram from Gartner above in a presentation deck you’re going to run into this year or next, when you get pitched on some AIOps technology that’s going to make those IT support monitoring teams in your emerging market centers redundant.

Moogsoft isn’t the only player in the “algorithmic IT operations” game. There’s a report out by Gartner called “An Introduction to AIOps” which lists the players below in addition to Moogsoft:

Source: Gartner

As software development professionals, we felt all smug when it turned out that all the jobs that senior management decided to outsource to Mumbai didn’t go exactly as planned. Just because you get 2-for-1 headcount doesn’t mean you’re getting the same value. AI, on the other hand, will never make a mistake, call in sick, or relentlessly tell you to share your screen. Anyone who works in IT operations for a living should be watching this space closely.

