By IAIDL Staff
As AI becomes a more mainstream approach to software development, enterprise IT operations teams need to get involved in managing its complexity. The need for AI to help IT operations, known as AIOps, has accelerated as organizations attempt to incorporate AI systems into their production environments.
Tools positioned in the AIOps market incorporate analytics and machine learning to help get the job done. Gartner projects that use of tools in this category will grow from 5% of large enterprises in 2018 to 30% by 2023.
Tools for AIOps “provide modern ITOps teams a real-time understanding of any type of issue,” stated Venugopala Chalamala, founder and CEO of Atlas, an IT services company, in a recent account in Forbes. “Traditional IT management solutions can’t keep up with the volume as well as provide real-time insight and predictive analysis.”
Asked to provide advice to organizations starting out with AIOps, Wilson Pang, the CTO of Appen, suggested the problem to be addressed needs to be clearly defined. “Is the goal to detect anomalies that are hard to find by a human? Or do you want a tool to enable your Ops team to identify root causes quickly when an issue occurs? Or do you want to deploy some automatic recovery mechanism through AI? AIOps can help in many areas,” he stated. Appen is an AI services company that assists in the collection of images, text, speech, audio, video, and other data needed to build AI systems.
As the number of AI systems increases, the monitoring strategy needs to be adjusted, another executive suggested. “You need a good understanding of what is necessary to monitor and store. The more AI models, the more complex the monitoring strategy. Then, you need to define the criteria of acceptable performances by a model or a group of models,” stated Rosaria Silipo, Ph.D., the principal data scientist at KNIME, a Zurich-based company offering software for machine learning and data mining analysis. “Finally, a strategy is needed to retrigger training when performance drops below an acceptance threshold,” she stated.
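The retraining strategy Silipo describes can be illustrated with a minimal Python sketch. It is not KNIME's implementation; the `ModelPolicy` class, the model names, the metric names, and the threshold values are all hypothetical and chosen only for illustration. Each model gets an acceptance threshold, and any model whose latest metric falls below it is flagged for retraining.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ModelPolicy:
    """Hypothetical per-model acceptance criterion."""
    metric: str        # name of the monitored metric, e.g. "accuracy"
    threshold: float   # minimum acceptable value for that metric


def models_needing_retraining(
    policies: Dict[str, ModelPolicy],
    latest_metrics: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the names of models whose monitored metric fell below its threshold."""
    flagged = []
    for name, policy in policies.items():
        value = latest_metrics.get(name, {}).get(policy.metric)
        # A missing measurement is flagged too, so it is not silently ignored.
        if value is None or value < policy.threshold:
            flagged.append(name)
    return flagged


# Illustrative values only (assumptions, not data from the article).
policies = {
    "churn_model": ModelPolicy(metric="accuracy", threshold=0.90),
    "fraud_model": ModelPolicy(metric="recall", threshold=0.80),
}
latest_metrics = {
    "churn_model": {"accuracy": 0.93},
    "fraud_model": {"recall": 0.72},
}

print(models_needing_retraining(policies, latest_metrics))  # ['fraud_model']
```

In practice the flagged list would feed whatever retraining pipeline the organization already runs. The point of the sketch is only that the acceptance criteria are stated explicitly per model, which becomes more important as the number of models grows.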
Forrester Analyst Suggests Automated Tools Can Free Up IT Ops Staff
To the extent an AIOps tool can automate IT operations tasks, IT operations staff can be freed up for other work, suggested Rich Lane, senior research analyst for infrastructure and operations at Forrester Research, in an article in SiliconANGLE.
It would be better for the IT staff to concentrate on “project work that brings better digital services to customers and get them out of doing the low-complexity and high-volume tasks that they’re spending at least 20% of their day on, if not more,” Lane stated.
Tools with smart analytics that can review data collected from a range of applications and end-user devices and automatically react to issues in real time are preferable, he suggested. “If you look at where infrastructure operations people are today, and especially during the many months of the pandemic, many of them are getting really burnt out by doing the same tasks over and over again just trying to keep the lights on,” Lane stated. “We should automate those things.”
From the point of view of application performance management company AppDynamics, AIOps refers to the use of AI and machine learning to ingest and analyze large volumes of data from every corner of the IT environment, reducing its complexity by bringing data silos together with the means to filter them, detecting patterns, and clustering meaningful information for more efficient actioning.
This enables IT teams to manage performance challenges proactively, in real-time, before they become system-wide issues. AIOps tools are also capable of predicting when issues are likely to happen, so they can be prevented.
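As a rough illustration of detecting a problem in a metric stream as it happens, the Python sketch below flags values that deviate sharply from the recent past. It is a simple statistical baseline (a rolling z-score), not the method of AppDynamics or any particular AIOps product; the function name, window size, limit, and latency values are assumptions made for the example.

```python
from statistics import mean, stdev
from typing import List


def find_anomalies(values: List[float], window: int = 5, z_limit: float = 3.0) -> List[int]:
    """Return indices whose value lies more than z_limit standard deviations
    from the mean of the preceding `window` values."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change at all as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > z_limit:
            anomalies.append(i)
    return anomalies


# Hypothetical response-time samples in milliseconds.
latency_ms = [100, 102, 99, 101, 100, 103, 98, 250, 101, 100]
print(find_anomalies(latency_ms))  # [7]
```

A baseline like this only reacts to a spike once it appears. Predicting issues before they occur, as the article describes, would require forecasting models trained on historical data, which is beyond this sketch.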
Currently, AIOps can be applied to the following use cases, the company suggests:
Intelligent alerting: By ingesting data from any part of the IT environment, AIOps filters and correlates the meaningful data into incidents. This prevents alert storms coming from domino effects. Intelligent alerting also reduces alert fatigue and helps with prioritization based on user and business impact.
Cross-domain situational understanding: AIOps aggregates all the data and creates causality/relationships, providing IT with an overview of what’s at stake and enabling it to slice and dice the information as needed for a better understanding of the situation.
Automating the identification of probable root causes: Once alerted, IT is presented with the top suspected causes and evidence leading to AIOps’ conclusions. This helps to build trust and provides an opportunity for feedback, enabling the AI engine to learn from human expertise.
Cohort analysis: AIOps shines in the analysis of vast amounts of data. With modern, highly distributed architectures where tens of thousands of instances are running at the same time, identifying outliers in configuration or deployed application versions is an insurmountable task for humans.
Automating remediation: AIOps helps automate closed-loop remediation for known issues. Once problems are identified—and based on historical data from past issues—AIOps suggests the best approach to accelerate remediation.
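Two of the use cases above, intelligent alerting and cohort analysis, can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions, not how any vendor's product works: alerts are correlated purely by arrival time (real tools typically also use topology and learned relationships), and outliers are simply instances whose version differs from the majority. The `Alert` tuple layout, function names, and sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Alert = Tuple[float, str, str]  # (timestamp in seconds, service, message)


def correlate_alerts(alerts: List[Alert], window_s: float = 60.0) -> List[List[Alert]]:
    """Group alerts into incidents: an alert joins the current incident
    if it arrives within window_s seconds of the previous alert."""
    incidents: List[List[Alert]] = []
    for alert in sorted(alerts, key=lambda a: a[0]):
        if incidents and alert[0] - incidents[-1][-1][0] <= window_s:
            incidents[-1].append(alert)
        else:
            incidents.append([alert])
    return incidents


def version_outliers(deployed: Dict[str, str]) -> List[str]:
    """Return instances whose version differs from the most common one in the cohort."""
    if not deployed:
        return []
    majority, _ = Counter(deployed.values()).most_common(1)[0]
    return sorted(name for name, version in deployed.items() if version != majority)


# Hypothetical alert storm: three related alerts, then an unrelated one much later.
alerts = [
    (0, "db", "high latency"),
    (12, "api", "timeouts"),
    (30, "web", "5xx errors"),
    (600, "batch", "job failed"),
]
print([len(incident) for incident in correlate_alerts(alerts)])  # [3, 1]

# Hypothetical cohort of instances and their deployed versions.
deployed = {"node-1": "2.4.1", "node-2": "2.4.1", "node-3": "2.3.9", "node-4": "2.4.1"}
print(version_outliers(deployed))  # ['node-3']
```

Collapsing four alerts into two incidents is the small-scale version of preventing an alert storm. The same majority-vote check stays cheap when the cohort holds tens of thousands of instances, which is where a manual review stops being practical.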
In advice for IT managers on how to proceed, the AppDynamics experts recommended three things: identify your AIOps goals, go step by step, and watch the AIOps market closely, because it is evolving rapidly.
In other suggestions from executives quoted in Forbes, Ali Siddiqui, the Chief Product Officer at BMC, a leading enterprise software company, suggested that the value of an AIOps tool increases the more data it can observe and analyze.
“It is also important that there is an open approach that can integrate with your existing IT tools and data sources. Once you have your tools, identify the right processes that support agility and collaboration across functions to integrate across Dev, Ops, and security,” he stated. “Finally, organizations have to think about the people–redeploy your most valuable resource to ensure the right tools and processes are in place and you can act on insights.”
It’s important that an organization seeking AIOps have a system in place to track what’s going on in IT operations. Muddu Sudhakar, the founder and CEO of Aisera, a supplier of AI service management software, stated, “The key is to have a good incident management system. You also need to have a very good logging system in place. Also, there should be proactive and predictive management of incidents and outages. You don’t want humans doing this.”