IT deserves clear answers, but let’s be careful not to equate transparency in how technology works with optimal results. Nearly four years ago, while I was an analyst at Gartner, I warned vendors and users alike that IT Operations teams would not take up AIOps technologies with any enthusiasm unless those technologies provided the means to trace and make explicit the key steps AI algorithms execute on the path from data to pattern, anomaly, and causal analysis. There were a couple of reasons for this warning.

First, memories are short in the vendor community, but it is important to recall that AIOps represents the second major attempt to commercialize the deployment of AI for IT Operations use cases. The first attempt, in the late 1980s and early 1990s, did lead to market successes (e.g., the previous generation of help desk technologies and the Prolog-based Tivoli Management Environment), but also to some undeniable failures (e.g., CA’s Neugents). Unfortunately, the failures seared themselves into the experience of IT Operations teams more deeply than the successes did, and they are preserved in the institutional memory of many a datacenter. Hence there is a predisposition toward skepticism about AIOps that vendors and practitioners need to overcome.

Second, most of the algorithms that drive modern AIOps are based on mathematics and statistical theory that go beyond what most computer science undergrads have been exposed to. Hence the algorithms themselves are a psychological “black box” to many IT Ops professionals.

Now even if these professionals come to trust the technology they are deploying, they will frequently find themselves having to defend cost-incurring decisions to executives who are, in most cases, even more math-phobic than they are. If an IT Ops professional cannot make the rationale for a decision plausible, at least at a high level (say, taking down a given server in order to fix a problem), then the executive is unlikely to give his or her approval.

The Need for Sufficient Transparency with AI and Machine Learning

Take note that it is algorithmic transparency that is critical here. It can be defined as the ability to see the steps taken, starting with the data set you are working from, through the various operations applied to that data in sequence, until you reach the final result.
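To make this concrete, here is a minimal sketch in Python (not any particular vendor’s API; the function, metric, and threshold names are hypothetical) of what recording that path can look like: a toy latency check that logs each operation it applies on the way from raw data to verdict.

# Minimal sketch of step-by-step transparency: an anomaly check that records
# every operation applied on the way from raw data to final result.
from statistics import mean, stdev

def detect_latency_anomaly(samples_ms, threshold_sigmas=3.0):
    trail = []  # ordered record of the steps taken
    trail.append(f"input: {len(samples_ms)} latency samples (ms)")

    baseline = samples_ms[:-1]  # everything except the newest point
    mu, sigma = mean(baseline), stdev(baseline)
    trail.append(f"baseline: mean={mu:.1f} ms, stdev={sigma:.1f} ms")

    latest = samples_ms[-1]
    score = (latest - mu) / sigma if sigma else 0.0
    trail.append(f"latest sample {latest:.1f} ms is {score:.1f} sigmas from the baseline")

    is_anomaly = score > threshold_sigmas
    trail.append(f"verdict: {'anomaly' if is_anomaly else 'normal'} (threshold = {threshold_sigmas} sigmas)")
    return is_anomaly, trail

flag, steps = detect_latency_anomaly([102, 98, 105, 99, 101, 97, 250])
for step in steps:
    print(step)

Nothing here is sophisticated, and a real AIOps platform does far more, but the point is the trail: every result arrives together with the sequence of operations that produced it.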

Transparency is relative, of course. Each operation applied in sequence to a given data set can be sub-analyzed into a more fine-grained sequence of sub-operations. Furthermore, at a certain point this analysis will fork. One branch will yield fine-grained structures that ultimately take us into the realm of combinators or lambda calculus terms. Another branch will yield machine code and, ultimately, electrical signals routed through silicon.

So when we at Gartner originally warned the market about the need for algorithmic transparency, we should have been a bit more precise. Total algorithmic transparency is probably an impossible goal and, even if it were achievable, it would not be very helpful.

“OK, executive decision maker — this algorithm uses the Newton-Cotes method to approximate a Gaussian quadrature. But you don’t care about that. What it’s actually doing is performing a reduction on a fixed-point lambda term and making use of the side effects! Now can I turn off that server?”

So our general warning about algorithmic transparency back then was problematic in the form that Gartner first gave it. What we should have said was that algorithmic transparency is required up to the point where an executive decision maker can grasp the rationale behind the pattern, anomaly, or causal analysis.

There is No Such Thing as an ‘Open Box’ Solution

For better or worse, many other analysts, as well as vendors, have recently decided to echo Gartner’s 2015-vintage warning.

Flattered as I am, they should be collegially reminded to take into account the modification just offered. It is also important to stress one thing that algorithmic transparency is NOT: it is NOT the ability to look at the results delivered by a pattern discovery algorithm, decide that one does not like them, and arbitrarily introduce changes into the pattern to make it accord better with one’s intuition.

Let’s analyze why this approach to transparency makes no sense.

First of all, it renders whatever mathematical integrity the original algorithm had completely inoperative. If you change the results, you have undermined the rationale of the algorithm. Remember, one deploys such algorithms in the first place precisely because human intuition in the face of large, complex, evolving data sets is useless. If a practitioner feels free to alter the results based on his or her “gut” (and whatever they may ‘feel,’ they are just arbitrarily altering the results), there was no reason to deploy the algorithm in the first place. Why not just look at the data and draw a curve that comports with your feelings? Using the algorithm to kick off the process, so to speak, adds nothing to the rationale of your result.

Second, and in some sense more importantly, adding such a capability does nothing to solve the problem of algorithmic transparency. Let’s return to the situation where one needs an OK for a significant intervention. The executive decision maker, still math-phobic, asks the practitioner to justify the action, and the practitioner replies: “Well, some black box algorithm using math that neither you nor I understand yielded the result. I didn’t like the outcome, so I tweaked it because … well, the curve just did not look right.”

The bottom line is that no explanation — not even at a high level — is being provided. There is no ‘open box.’ There is only the same old black box algorithm with some completely unjustified tinkering after the fact. Pro tip: The executive decision maker will not OK a significant intervention on that basis.

Vendors and practitioners alike need to work toward the goal of sufficient algorithmic transparency by making explicit the steps by which a given platform or complex, multi-part algorithm transforms data into a result. A good case can be made that the level of algorithmic transparency we at Gartner argued was critical for AIOps market acceptance and enthusiasm is indeed upon us, and of course it will only get better over time. Even now, however, it is a substantially better way of addressing the concern than the alternative: using a black box to generate patterns and then encouraging the practitioner to tinker with the result. That methodology renders the algorithm originally deployed pointless and makes the whole exercise more obscure than ever.
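A minimal follow-on sketch (again hypothetical, not any vendor’s method): given an ordered trail of recorded steps like the one above, it takes only a few lines to collapse that trail into the kind of high-level rationale an executive decision maker can actually read before signing off.

# Follow-on sketch with hypothetical names: turn an ordered trail of recorded
# steps into a short plain-language rationale an approver can read without
# working through the underlying math.
def summarize_for_approval(trail, proposed_action):
    evidence = "; ".join(trail)
    return f"Proposed action: {proposed_action}. Basis (traceable steps): {evidence}"

example_trail = [
    "input: 7 latency samples (ms)",
    "baseline: mean=100.3 ms, stdev=2.9 ms",
    "latest sample 250.0 ms is 50.8 sigmas from the baseline",
    "verdict: anomaly (threshold = 3.0 sigmas)",
]
print(summarize_for_approval(example_trail, "restart checkout-service node 3"))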

Remember: unprincipled tinkering will never be accepted as justification for significant interventions.

Download Moogsoft's new ebook here: "4 Simple Reasons Why Rules-Based Solutions Are Failing IT Operations".

Moogsoft is a pioneer and leading provider of AIOps solutions that help IT teams work faster and smarter. With patented AI analyzing billions of events daily across the world’s most complex IT environments, the Moogsoft AIOps platform helps the world’s top enterprises avoid outages, automate service assurance, and accelerate digital transformation initiatives.
