Troubleshooting problems with enterprise applications and services is often an exercise in frustration for IT and business staff. The reasons are well documented: complex architectures; disparate, unintegrated monitoring solutions; and minimal coordination between technology and product experts, who must pinpoint and resolve problems under mounting pressure as delays and downtime erode revenues, customer satisfaction and the delivery of services.
Simplifying infrastructure and application performance analysis across multiple technologies, and coordinating the efforts of all staff involved in problem resolution, are top priorities for IT and operations staff, whether the organization is in research, government, finance, education or the enterprise sector. The goal is to minimize downtime risk and lower the cost of delivering reliable services.
SIEM solutions are increasingly proving their worth and their vital role in addressing this challenge. Such efforts do pay off: one company saved $1 million in one month after it implemented an integrated incident and problem management workflow. However, success isn’t automatic. It is important to have a structured process for evaluating potential solutions. This holds true whether the search is for a SIEM solution or a comprehensive infrastructure performance management one. Consider including the following criteria and attributes in the solution evaluation process.
Solution Evaluation and Selection Checklist
- Rules must be easy to create and maintain.
- Rule evaluation and execution must be based on real-time data, collected and interpreted within an operational context that reflects business demands as well as process and infrastructure realities.
- The time period within which rules are evaluated and responses automatically initiated must be configurable and highly agile: seconds (or less) matter.
- Policies must be applied with a level of granularity that is specific to both the application and the business process.
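To make the rule requirements above concrete, here is a minimal sketch of a rule that is evaluated against live samples over a configurable time window and triggers an automated response. The class name `WindowRule`, the metric name, the threshold and the window length are all hypothetical illustrations and are not taken from any particular SIEM product.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List, Tuple


@dataclass
class WindowRule:
    """Hypothetical rule: respond when the mean of recent samples exceeds a threshold."""
    name: str
    threshold: float
    window_seconds: float                      # configurable evaluation period
    action: Callable[[str, float], None]       # automated response to initiate
    samples: Deque[Tuple[float, float]] = field(default_factory=deque)

    def observe(self, timestamp: float, value: float) -> bool:
        """Record one sample and evaluate the rule; return True if it fired."""
        self.samples.append((timestamp, value))
        # Discard samples that have fallen outside the evaluation window.
        cutoff = timestamp - self.window_seconds
        while self.samples and self.samples[0][0] < cutoff:
            self.samples.popleft()
        mean = sum(v for _, v in self.samples) / len(self.samples)
        if mean > self.threshold:
            self.action(self.name, mean)
            return True
        return False


# Illustrative usage with made-up latency samples (timestamp in seconds, value in ms).
alerts: List[Tuple[str, float]] = []
rule = WindowRule(
    name="checkout-latency-ms",
    threshold=500.0,
    window_seconds=5.0,
    action=lambda rule_name, mean: alerts.append((rule_name, mean)),
)
results = [rule.observe(t, v) for t, v in [(0.0, 200.0), (2.0, 400.0), (6.0, 900.0)]]
```

Because the rule is scoped by name to a single metric, separate instances with different thresholds and windows can be attached to each application or business process, which is one way to get the granularity the checklist asks for.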
2. Non-invasive operation:
- The solution must leverage existing processes, application operation and infrastructure realities.
- It must not require application modifications or extensive, proprietary modifications to the operational infrastructure to be effective.
- It must automatically integrate and adapt to infrastructure and business process changes.
3. Platform agnostic:
- The solution must accommodate a heterogeneous IT environment. There should be no “designed in” dependencies on hardware or software features.
- It must be built and based on the utilization and application of open standards and interfaces.
- The solution must be able to expand and scale with the operational environment.
- The solution must be modular in design so that functionality can be extended as needed.
- The solution should have the ability to utilize both in- and out-of-band operational metrics.
- There must be no architectural, structural or operational bottlenecks that will prevent the solution from operating as the infrastructure and business environment grows.
- It must interoperate and integrate with existing on-site commercial/proprietary tools.
- The solution must provide comprehensive automated monitoring, management, and control of operations.
- Monitoring and response capabilities must cover, manage (as appropriate) and report on performance and events across all desired infrastructure and devices.
6. Distributed, fault tolerant operation:
- If the solution supports business critical operations it must operate in a fault tolerant manner with no single point of failure.
- Data collection, data analysis, operational intelligence, policy-definition and implementation must be distributed to assure reliable operation even if part of the environment fails.
7. Operational and state reports that are easy to use and understand:
- The solution must provide clear reporting to aid in designing and defining policies for the appropriate remedial, repair or avoidance response.
- Both the user interface and reporting must be designed to present an easily understood and consumable holistic view of infrastructure data and business information as desired and needed.
- Hard-won experience in automating business processes leads us to conclude that model-based, policy-driven automation helps to relieve the burden and risk associated with manual processes.
- In addition, such a solution delivers flexibility, adaptability, and scalability within a timeframe and with an ease of implementation that exceed those of alternative approaches.
This is not an exhaustive list. These important requirements must be supplemented with those unique to the specific situation and organization. Each organization has operational idiosyncrasies that affect the selection of an appropriate solution approach to its problems. These can be based on process, technology, procedure, or even politics. In any case, they need to be identified and considered when defining solution requirements.
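One way to give the evaluation the structure recommended above is a simple weighted scoring matrix. The sketch below is only an illustration: the criterion keys, the weights, the 1 to 5 rating scale and the two candidate solutions are all hypothetical, and an organization would substitute its own criteria, including the situation-specific ones.

```python
from typing import Dict


def score_solution(weights: Dict[str, float], ratings: Dict[str, int]) -> float:
    """Return the weighted average rating of one candidate solution.

    weights: relative importance of each criterion (hypothetical values).
    ratings: the evaluator's rating of the candidate for each criterion.
    """
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"no rating supplied for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


# Hypothetical weights for four of the checklist criteria.
weights = {
    "rules": 3,
    "non_invasive": 2,
    "platform_agnostic": 2,
    "fault_tolerance": 3,
}

# Hypothetical ratings (1 = poor, 5 = excellent) for two candidates.
candidates = {
    "Solution A": {"rules": 4, "non_invasive": 5, "platform_agnostic": 3, "fault_tolerance": 2},
    "Solution B": {"rules": 3, "non_invasive": 3, "platform_agnostic": 4, "fault_tolerance": 5},
}

scores = {name: score_solution(weights, r) for name, r in candidates.items()}
best = max(scores, key=scores.get)
```

A score like this is a starting point for discussion rather than a verdict; requirements that are hard constraints (for example, no single point of failure for business-critical operations) are better treated as pass/fail gates before any scoring is done.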