Anyone in IT responsible for service delivery or infrastructure operations knows the situation: we collect and store an ever-growing volume of the data necessary to do our jobs, but at a rate that drives up cost.
However, the problem with infinite detail is much like trying to organize and analyze noise: there is plenty of it, but finding the signal underneath is the difficult, and critical, part. Take event and audit logs, for example: such data is important and valuable only to the extent that it helps simplify problem resolution, avoid service mishaps and keep operations running smoothly. All too often, IT staff get reports consisting of siloed information that requires significant further interpretation before any actionable information can be gleaned. In effect, we are drowning in data while searching for intelligence.
What’s the issue?
Most IT departments have logs filled with data about events, intended to help identify, alert to, avoid or remediate a problem. Typically, log data is unstructured developer shorthand that follows no consistent standard of style, format, delineation or content. The generated reports typically sort or group records by specific data fields or event classifications, with some summarization. The net result is that teasing out information for decision making can be a painful, tedious and drawn-out process, and the results may not justify the effort.
All of this is to support three goals — inform, describe and predict. First, it’s to inform IT about what is happening — in a timely manner, with the intent to detect and avoid service disruptions and performance problems. An example is an alert to the fact that inquiry response times are trending toward a length that, if continued, will cause a violation of a Service Level Agreement (SLA). Second, it’s to describe what has happened — to analyze data to uncover what went wrong to cause the event and to gain insight into how to avoid the problem in the future. An example is identifying that a fan failure led a component to overheat, function erratically and disrupt a service. Finally, it’s to identify anomalies that predict a failure — an alert to take action before the problem occurs, such as identifying a device that has not had a recent patch applied. Without the patch, the device will likely fail and cause an operational problem.
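The first goal, the SLA example with response times trending upward, can be sketched with a simple least-squares extrapolation: fit a line to recent samples and estimate when it will cross the threshold. The SLA value, the per-minute sampling interval and the numbers are all hypothetical; production monitoring would use more robust statistics than a straight-line fit.

```python
# Hypothetical example: response-time samples (ms) taken once a minute.
# A least-squares slope estimates when the trend crosses the SLA threshold,
# so an alert can fire before the violation actually occurs.
SLA_MS = 500.0

def minutes_until_breach(samples, sla=SLA_MS):
    """Fit a line to the samples; return estimated minutes until it crosses
    the SLA, or None if the trend is flat or improving."""
    n = len(samples)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    if slope <= 0:
        return None                        # not trending toward a breach
    intercept = mean_y - slope * mean_x
    t_breach = (sla - intercept) / slope   # minute at which the line hits the SLA
    remaining = t_breach - (n - 1)         # minutes past the most recent sample
    return max(remaining, 0.0)

# Response times creeping upward by roughly 20 ms per minute:
eta = minutes_until_breach([300, 320, 340, 360, 380])
```

For this series the fitted slope is 20 ms/minute, so the 500 ms SLA is about six minutes away: exactly the kind of early warning the paragraph above describes.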
The goal is to get actionable information in enough time to remedy the anomaly and assure reliable, efficient and effective operations. Too often, the reports from packaged log analysis tools are barely comprehensible, only partially analyzed, and include no actionable information.
What to do?
What can a mid-market IT person do to resolve this? It is possible to get more integrated and comprehensive analysis and reporting of log data without purchasing an expensive, high-end data analysis tool targeted at large enterprises. Make sure you understand the data available in your logs and the analysis already being done; then identify what is missing from reports, what additional analysis you want and how you want it presented. Finally, prioritize your requirements list.
Look for an integrated log management solution with analysis as well as pre-packaged and custom reporting capabilities built in; this allows reports to be modified and created to yield actionable insights. You want a solution that integrates multiple functions and takes a comprehensive approach to management: automated relationship discovery, log review and analysis, behavior analysis, event correlation, and anomaly and change detection, all supported by a strong analytics engine.
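As a rough illustration of the anomaly detection such an analytics engine performs, a minimal z-score check flags values that deviate strongly from the baseline. Real products use far more sophisticated models; the login-rate numbers and the threshold here are invented for the sketch.

```python
import statistics

# Hypothetical sketch of baseline anomaly detection: flag values that sit
# more than `threshold` standard deviations from the mean of the series.
def anomalies(values, threshold=3.0):
    """Return indices of values deviating strongly from the series mean."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []                          # perfectly flat series, nothing to flag
    return [i for i, v in enumerate(values)
            if abs(v - mean) / stdev > threshold]

# A steady login rate (events per minute) with one suspicious spike:
idx = anomalies([52, 48, 50, 51, 49, 50, 48, 52, 51, 49, 50, 200])
```

The point of the sketch is the division of labor: the engine does this kind of statistical work continuously across all collected data, so IT staff see the flagged spike rather than the raw series.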
Understanding the results is critical, so the solution should include customizable, pre-packaged reports that are understandable and suggest ‘next steps’ using built-in best practices. All of this must be easy for IT staff to use and manage; they are not always analytics experts, but they understand what information is useful to them.
There are lots of solutions that perform some of these functions. There are also expensive, sophisticated solutions that provide all of them, but require a large, skilled operations staff. The tough part is finding a solution that meets these requirements but is designed and priced for the mid-market.
SIEM and log management solutions on the market range widely in the breadth of their capabilities: from sophisticated solutions that require a large, skilled staff able to write scripts to get the desired results, to basic solutions that simply collect logs and let the user search through the data for the information they want, with everything in between. By carefully defining the exact requirements and results your organization needs, you can be sure the solution you choose will satisfy those requirements at an appropriate price. The common fate of a SIEM and log management solution that is too complicated to use, or that doesn’t do what the organization needs, is to become an expensive, idle device.
Publication Date: October 18, 2011
This document is subject to copyright. No part of this publication may be reproduced by any method whatsoever without the prior written consent of Ptak Noel & Associates LLC.
To obtain reprint rights contact email@example.com
All trademarks are the property of their respective owners.
While every care has been taken during the preparation of this document to ensure accurate information, the publishers cannot accept responsibility for any errors or omissions. Hyperlinks included in this paper were available at publication time.
About Ptak, Noel & Associates LLC
We help IT organizations become “solution initiators” in applying IT management technology to business problems. We do that by translating vendor strategy and deliverables into a business context that the IT manager can communicate and act on, and by helping our clients understand how other IT organizations are effectively implementing solutions with their business counterparts. Our customers recognize the meaningful breadth and objectivity of our research in IT management technology and process.