Performing well during a security crisis
“Every crisis offers you extra desired power” William Moulton Marston
Jasmine’s corollary: “Only if you perform well during that crisis.”
Crises will happen no matter how many precautions we take. The need to blame someone is a human desire, and it is easy to focus that blame on the crisis response team because they are visible. Yet when teams perform well during the crisis they don’t merely avoid blame. They gain the potential to become powerful advisors or outright leaders. It’s even better if you can also demonstrate that lessons learned from past crises are making the current environment more secure. After all, the Justice League members wouldn’t be heroes if no one knew about their actions. But what does it mean to perform well in a crisis?
Not so long ago, performing well during an IT security crisis was about how rapidly the security administrator could shore up firewall breaches or deliver anti-virus patches. But times have changed. Now performing well in a security crisis is a team effort – security, network, system, application and desktop folks are all involved. Team performance, however, is not simply the sum of the individual talent of team members – just ask the 2004 US Olympic basketball team, or the current Cincinnati Bengals for that matter.
Joking aside, I’m sure that if you look at every large scale disaster you will find dozens, if not hundreds, of competent people working extremely hard to deal with the situation. Yet their individual efforts are often overwhelmed by the complexity of the situation and the lack of coordination (the broad brush of 20-20 hindsight doesn’t help either). IT security situations are no different. A diverse team of people must perform well during the crisis to protect not only corporate infrastructure and business intelligence, but the “digital lives” of their customers as well. Which raises the question: how can IT increase its odds of performing well in these stressful situations? As far as I can tell, the basics involve:
1. Understanding what is happening
This starts with real-time collection and correlation of subtle configuration changes or seemingly disconnected events that span systems, applications, and network infrastructure. It’s likely that the next big security crisis will be a multi-stage attack designed by organizations employing well-trained programmers (see the discussion in the Symantec Internet Security Threat Report, published in April 2008). Since enterprise environments are getting more complex and more dynamic, it is more difficult to rapidly investigate cause and effect during the crisis without some level of automated analysis. The automation must sift through large volumes of semi-structured IT data and produce customized reporting that allows each team member to understand the significance of the situation so they can act effectively.
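To make the correlation idea a little more concrete, here is a minimal Python sketch, not a description of any particular product. It assumes events have already been parsed into a timestamp, source, host, and message. The `Event` type, the sample data, the five-minute window, and the "at least two sources" threshold are all illustrative assumptions of mine. The sketch groups each host's events into bursts and flags bursts that span several kinds of sources.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import NamedTuple


class Event(NamedTuple):
    """Hypothetical shape of an already-parsed log event."""
    time: datetime
    source: str   # e.g. "firewall", "system", "application"
    host: str
    message: str


def _flag(host, burst, min_sources):
    """Return the burst as an incident only if it spans enough distinct sources."""
    sources = {event.source for event in burst}
    if len(sources) >= min_sources:
        return [{"host": host, "sources": sorted(sources), "events": burst}]
    return []


def correlate(events, window=timedelta(minutes=5), min_sources=2):
    """Group each host's events into bursts and flag multi-source bursts."""
    by_host = defaultdict(list)
    for event in sorted(events, key=lambda e: e.time):
        by_host[event.host].append(event)

    incidents = []
    for host, host_events in by_host.items():
        burst = [host_events[0]]
        for event in host_events[1:]:
            if event.time - burst[-1].time <= window:
                burst.append(event)          # still part of the same burst
            else:
                incidents.extend(_flag(host, burst, min_sources))
                burst = [event]              # start a new burst
        incidents.extend(_flag(host, burst, min_sources))
    return incidents


# Made-up sample data, for illustration only.
EVENTS = [
    Event(datetime(2008, 11, 3, 2, 14, 5), "firewall", "web01", "rule change: port opened"),
    Event(datetime(2008, 11, 3, 2, 15, 40), "system", "web01", "new local admin account created"),
    Event(datetime(2008, 11, 3, 2, 17, 12), "application", "web01", "config file modified"),
    Event(datetime(2008, 11, 3, 9, 30, 0), "system", "db02", "scheduled backup completed"),
]

incidents = correlate(EVENTS)
for incident in incidents:
    print(incident["host"], incident["sources"], len(incident["events"]))
```

With this sample data, the three web01 events form one flagged burst and the lone db02 backup event is ignored. A real deployment would need far richer rules, but even this toy shows why seemingly disconnected events look different once they are lined up by host and time.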
2. Having well known contingency configurations and plans
You can work with various experts to develop responses to different scenarios (rerouting traffic, isolating systems, disabling accounts, etc.). Luckily, computing contingency plans are more readily automated than other types of disaster plans. Automation means that the plans can be executed the same way every time a particular situation occurs. However, this automation can’t be the ‘set it and forget it’ type. Enterprise computing environments and IT staff change too frequently. The automation itself needs to be reviewed and updated regularly to accommodate infrastructure, application, and regulatory changes. The last thing you need is for the automation to violate a compliance policy. New IT employees also need education about these automated responses. The second-to-last thing you need is a clueless admin mistaking the automated response for the attack itself.
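As a rough illustration of automation that is repeatable but not ‘set it and forget it’, the Python sketch below models a contingency plan as an ordered list of steps plus a last-reviewed date, and refuses to run a plan whose review is overdue. Everything here is hypothetical: the `Playbook` class, the 90-day review interval, and the step functions, which only return a description of what they would do rather than touching any real system.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, List, Optional


def disable_accounts(target: str) -> str:
    """Placeholder step: describes the action instead of performing it."""
    return f"disable accounts on {target}"


def isolate_system(target: str) -> str:
    """Placeholder step: describes the action instead of performing it."""
    return f"isolate {target} from the network"


@dataclass
class Playbook:
    scenario: str
    steps: List[Callable[[str], str]]
    last_reviewed: date
    max_age: timedelta = timedelta(days=90)  # assumed review interval

    def is_stale(self, today: date) -> bool:
        """True when the plan has not been reviewed within max_age."""
        return today - self.last_reviewed > self.max_age

    def run(self, target: str, today: Optional[date] = None) -> List[str]:
        """Execute every step in order, the same way each time."""
        today = today or date.today()
        if self.is_stale(today):
            raise RuntimeError(
                f"playbook '{self.scenario}' was last reviewed on "
                f"{self.last_reviewed}; review it before relying on it"
            )
        return [step(target) for step in self.steps]


playbook = Playbook(
    scenario="compromised host",
    steps=[disable_accounts, isolate_system],
    last_reviewed=date(2008, 10, 1),
)
actions = playbook.run("web01", today=date(2008, 11, 3))
print(actions)
```

Whether a stale plan should block execution or merely raise a warning during a real crisis is a judgment call. The point of the sketch is only that the review date travels with the automation instead of living in someone's memory.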
Contingency planning is not only about to-do lists. It is also about decision-making and responsibilities. There are lots of people who can make good decisions under pressure. But a worse disaster will ensue if every one of them goes off and does their own thing, in their own way, without telling anyone. This will happen every time if the crisis management team is poorly defined and no one has established:
- who on that team is responsible for specific duties and decisions,
- how people on that team interact with each other and with related organizations,
- and most importantly, how information flows into, within, and out of that team.
If critical information doesn’t reach the right people, in the right way, at the right time, then you are in for many, many sleepless nights of preventable remediation work. It pays to clearly define the team, their responsibilities and information needs first – and then set up the emergency information consoles, reports, etc. that each team member needs.
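One lightweight way to check that a team definition covers responsibilities and information needs is to write it down as data and query it. The Python sketch below does that under heavy assumptions: the role names, decisions, and information topics are invented for illustration, and a real team would have its own.

```python
# Hypothetical crisis team definition: who decides what, and who needs which information.
TEAM = {
    "incident lead": {"decides": ["declare crisis", "isolate systems"], "needs": {"summary"}},
    "network admin": {"decides": ["reroute traffic"], "needs": {"firewall", "summary"}},
    "system admin": {"decides": ["disable accounts"], "needs": {"system", "summary"}},
    "communications": {"decides": ["customer notification"], "needs": {"summary"}},
}


def recipients(topic, team=TEAM):
    """Roles that should receive information on the given topic."""
    return sorted(role for role, info in team.items() if topic in info["needs"])


def unowned(decisions, team=TEAM):
    """Decisions from the plan that no role on the team is responsible for."""
    owned = {d for info in team.values() for d in info["decides"]}
    return sorted(set(decisions) - owned)


print(recipients("firewall"))
print(unowned(["reroute traffic", "notify regulators"]))
```

Here the firewall topic goes only to the network admin, and "notify regulators" shows up as a decision nobody owns, which is exactly the kind of gap you want to find before the crisis rather than during it.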
3. Practicing for the crisis
While I think the various YouTube creations based on Allen Iverson’s practice rant are hilarious, I also know that practicing for a crisis is important. First, when people don’t know what they are supposed to do, they waste a lot of time figuring out what they should be doing. They are usually doing this with inaccurate or incomplete information, which means they will get it right only if they are very, very lucky.
Secondly, practice helps everyone understand that the crisis response plan is not a blame game in disguise. Instead, it is an opportunity to get people to trust the plan and the people involved. This is particularly important in large enterprises because there are more people involved, and those people are often not in the habit of collaborating. It is hard to work with someone new in stressful conditions because no one knows what they’ll do. Practice overcomes that.
4. Auditing everything and then some
You can never go wrong documenting everything: what is part of the plan, what shows the ongoing efforts to comply with any related regulations, what happens during practices, and what happens during the actual crisis. Remember, you’ll still need to demonstrate that your crisis efforts are compliant with various regulations. Auditors will want some visibility into what, where, why, and how financial systems or private information were handled. They’ll also take a fine-toothed comb to your compliance documentation. Lack of evidence (or the inability to find it in a sea of poorly archived log data) is the quickest path to nasty fines.
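To show what findable evidence might look like in its simplest form, here is a small Python sketch that writes each crisis action as one structured JSON line (who, what, where, why, when) and searches those lines later. The field names and the sample entries are assumptions for illustration. A real audit trail would also need retention, access control, and integrity protection, which this sketch does not attempt.

```python
import io
import json
from datetime import datetime, timezone


def record(log, actor, action, system, reason, when=None):
    """Append one audit entry as a single JSON line to a file-like object."""
    entry = {
        "when": (when or datetime.now(timezone.utc)).isoformat(),
        "actor": actor,
        "action": action,
        "system": system,
        "reason": reason,
    }
    log.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


def search(lines, **criteria):
    """Return entries whose fields equal every given criterion."""
    matches = []
    for line in lines:
        entry = json.loads(line)
        if all(entry.get(key) == value for key, value in criteria.items()):
            matches.append(entry)
    return matches


# Made-up entries, written to an in-memory buffer instead of a real log file.
log = io.StringIO()
record(log, "jdoe", "disabled account", "payroll-db", "suspected credential theft",
       when=datetime(2008, 11, 3, 2, 20, tzinfo=timezone.utc))
record(log, "asmith", "rerouted traffic", "web01", "isolate compromised host",
       when=datetime(2008, 11, 3, 2, 25, tzinfo=timezone.utc))

hits = search(log.getvalue().splitlines(), system="payroll-db")
print(hits)
```

Because every entry carries the same fields, answering an auditor's question about a specific system becomes a lookup rather than an archaeology project.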
5. Dealing with the aftermath
Most technical folks assume this is mostly about in-depth forensic analysis: determining how to undo any damage that occurred, and deciding whether your strategic security plan needs tweaking or whether a tactical prevention (such as changing an operational policy, adding a new configuration check, or implementing a new event analysis rule) will do. While all of this is absolutely necessary, it is only part of the story.
The other part of the aftermath is dealing with the hordes of misinformation that will be disseminated about the situation. Blogs, posted comments, and poorly worded customer notifications can add up to chaos. And good luck if you find yourself setting up a customer call center without a pre-negotiated contract; or you set up a ‘crisis info’ website that promptly crashes from zillions of hits; or you are dragged to a press conference without being able to explain everything from why it happened to the extent of the damage in non-technical terms.
But really, things don’t have to go this way. That’s what crisis planning, solutions and practice are for. Real IT executives have lived through these things and still have their jobs. Hopefully we can all be as effective.
Jasmine Noel is founder and partner of Ptak, Noel & Associates. With more than 10 years of experience in helping clients understand how adoption of new technologies affects IT management, she tries to bring pragmatism (and hopefully some humor) to the business-IT alignment discussion. Send any comments, questions or rants to firstname.lastname@example.org
Credit-card security standard issued after much debate
The Payment Card Industry Security Standards Council, the organization that sets technical requirements for processing credit and debit cards, has issued revised security rules. The council also indicated that next year it will focus on new guidelines for end-to-end encryption, payment machines and virtualization.
Did you know? EventTracker enables compliance with PCI sections 10 and 11 with its integrated Log Management and Change Monitoring solution
Data breaches reach record high
The hits keep coming when it comes to U.S. data breaches. The Identity Theft Resource Center reports data breaches in 2008 have already exceeded the record breaches of 2007. Enterprise breaches continue to lead the pack with breaches tied to mobile data topping the incident reports.
Did you know? EventTracker helps safeguard critical data, whether at rest, in use or in motion
Cool Tools and Tips
Understanding Change Management
Understand how Change Management can help you:
- Analyze change data to quickly identify and back-out faulty changes.
- Identify new viruses before your Anti-Virus provider comes up with a patch.
- Have insurance when installing new software or making major configuration changes.
- Enhance security by having detailed information about all changes and accesses.
- Reduce dependence on human input to diagnose and resolve system/application problems.