Turning log information into business intelligence with relationship mapping
Now that we’re past January, most of us have received all of our W-2 and 1099 tax forms. We all know that it’s important to keep these forms until we’ve filed our taxes, and most of us also keep the forms for seven years after filing in case there is a problem with a previous year’s filing. But how many of us keep those records past the seven-year mark? Keeping too much data can be as problematic as not keeping records at all. One of the biggest problems with retaining too much information is that storage needs increase and it becomes difficult to parse through the existing data to find what’s most important.
The challenge of balancing information with intelligence is often referred to as a “signal-to-noise ratio” problem. When there is too much noise, the signal gets lost. Without proper management, log data collection can quickly turn into a classic “white noise” scenario. In the worst case, everything is stored, there is little organization, and the utility of the business intelligence is lost in terabytes of unsorted log entries.
Finding the Signal in the Noise
For log data collection to be of highest value to an organization, it needs to be filtered and parsed for business intelligence purposes. Consider a log aggregation system that captures failed logins. The raw log data shows the logins that have failed, but without contextual awareness and business intelligence filtering, it says little about why they failed. Are the failed logins due to sleepy Monday morning fat fingers? Or are they an indication that an account, and possibly the organization, is under attack?
Without applying rules and intelligence to the analysis of the log data, it would be hard to determine the underlying cause and assess the potential risk to the organization. However, if additional information is parsed using correlation rules, the picture of what’s really going on becomes clearer. For example, did the login come from the user’s PC on the internal corporate network during normal business hours? Did the user access approved systems within their business role? If so, this probably indicates the failed login was user error rather than an attack. But if the login came from outside the company on the remote access system at 3 a.m. on a Saturday, it’s more likely to be a true attack scenario.
This is why the first step in turning mountains of log data into readable and useful business intelligence is to understand the business and to baseline normal activity. Keep in mind that what’s normal will vary from business to business and even user role to user role. In the above login scenario, the remote login looks far less suspicious if it comes from a remote worker on the weekend shift in a different time zone.
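The baseline idea above can be sketched in a few lines of Python. This is a minimal illustration, not a SIEM rule language: the `Baseline` and `FailedLogin` structures, the source labels ("internal", "vpn"), and the fixed UTC offsets are all assumptions made for the example. It shows the same failed login being judged differently depending on whose baseline it is compared against.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta, timezone


@dataclass
class Baseline:
    """Hypothetical per-user baseline of normal login behavior."""
    utc_offset_hours: int      # user's local time zone as a fixed UTC offset
    work_start: time           # start of the normal shift, local time
    work_end: time             # end of the normal shift, local time
    work_days: frozenset       # weekdays worked, Monday=0 ... Sunday=6
    usual_sources: frozenset   # where this user normally logs in from


@dataclass
class FailedLogin:
    user: str
    when_utc: datetime         # timezone-aware UTC timestamp
    source: str                # "internal" or "vpn" in this sketch


def classify(event: FailedLogin, baseline: Baseline) -> str:
    """Compare one failed login against the user's baseline."""
    local_tz = timezone(timedelta(hours=baseline.utc_offset_hours))
    local = event.when_utc.astimezone(local_tz)
    in_hours = (local.weekday() in baseline.work_days
                and baseline.work_start <= local.time() <= baseline.work_end)
    known_source = event.source in baseline.usual_sources
    return "likely user error" if in_hours and known_source else "investigate"


# 08:00 UTC on Saturday, 6 February 2010 is 3 a.m. for an office at UTC-5.
event = FailedLogin("someuser", datetime(2010, 2, 6, 8, 0, tzinfo=timezone.utc), "vpn")

# Office worker: weekdays, 8:00 to 18:00, internal network only.
office = Baseline(-5, time(8), time(18), frozenset({0, 1, 2, 3, 4}), frozenset({"internal"}))
# Remote weekend-shift worker at UTC+8, for whom the same moment is 4 p.m. Saturday.
remote = Baseline(8, time(9), time(17), frozenset({5, 6}), frozenset({"vpn"}))

office_verdict = classify(event, office)   # "investigate"
remote_verdict = classify(event, remote)   # "likely user error"
```

A real deployment would likely score several signals rather than return a binary verdict, but the sketch shows why the baseline has to be per role or per user rather than company-wide.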
Map out user flows and business processes before trying to implement correlation rules into the log management or SIEM. Understand which services and devices user roles require access to and any time or location related information associated with their required business activities. To ensure that the log data and event information is focused into an accurate picture of business risk and activity, write rules that can correlate flows between devices and users. Make sure that you are able to track what a user does from the point of login entry through the network as zones and servers are accessed, and even what activities are completed within applications and services.
Single sign-on and other identity aggregation solutions are useful for generating a picture of a specific user’s activity throughout a day or week. Although many services and servers require multiple logins, it’s possible to bring the disparate user IDs back together with rules in the log management or SIEM console.
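As a rough illustration of bringing disparate user IDs back together, the sketch below maps each (system, account) pair to one canonical identity and builds a per-person timeline. The mapping table, account names, and event fields are hypothetical; in practice this mapping would come from a directory or identity management system rather than a hard-coded dictionary.

```python
from collections import defaultdict

# Hypothetical mapping from (system, local account name) to one canonical identity.
ACCOUNT_TO_PERSON = {
    ("vpn", "jsmith"): "jane.smith",
    ("mail", "jane.smith@example.com"): "jane.smith",
    ("hr_app", "JS1042"): "jane.smith",
}


def build_timelines(events, account_map):
    """Group events by canonical person and sort each timeline by timestamp.

    Events whose account is not in the map are collected under "unmapped"
    so they stay visible instead of being silently dropped.
    """
    timelines = defaultdict(list)
    for ev in events:
        person = account_map.get((ev["system"], ev["account"]), "unmapped")
        timelines[person].append(ev)
    for evs in timelines.values():
        # ISO 8601 timestamps in one format and time zone sort correctly as strings.
        evs.sort(key=lambda e: e["ts"])
    return dict(timelines)


events = [
    {"ts": "2010-02-08T09:15:00Z", "system": "hr_app", "account": "JS1042", "action": "open_record"},
    {"ts": "2010-02-08T08:58:00Z", "system": "vpn", "account": "jsmith", "action": "login"},
    {"ts": "2010-02-08T09:02:00Z", "system": "mail", "account": "jane.smith@example.com", "action": "login"},
    {"ts": "2010-02-08T09:30:00Z", "system": "vpn", "account": "guest7", "action": "login"},
]

timelines = build_timelines(events, ACCOUNT_TO_PERSON)
jane_systems = [ev["system"] for ev in timelines["jane.smith"]]  # ["vpn", "mail", "hr_app"]
```

Keeping unmapped accounts in their own bucket is a deliberate choice here: an account that belongs to no known person is itself worth a look.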
Once the baseline of business flow and user activity has been determined, it’s possible to turn up the signal even louder by diagramming and understanding the unique relationships between devices, systems, users, and applications. This data can feed risk posture awareness reporting and link back into compliance assessments and analysis.
Consider an unpatched operating system that is running in a low security zone and houses PDF scans of publicly available marketing brochures. This server may be considered a low security priority with limited monitoring in effect. What if a member of the development team, needing some extra processing power to test a new internal HR module, puts a VM on the server and links it to the development network? Now a high priority service is linked to the low security server, and the relationship and risk posture have been altered in a way that may put the organization at risk.
Although it’s true that strong change management procedures may have prevented the unauthorized installation, it’s also true that when changes slip through the cracks, an intelligent log management or SIEM system can catch those changes quickly.
Questions to ask when implementing relationship mapping:
- What systems support what applications?
- What systems contain what data?
- What compliance mandates govern this data?
- Which systems and services are connected, whether directly or via APIs or other connectors?
- Who needs to be informed of what (prioritization and remediation)?
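One way to make these questions concrete is to record the answers as a small graph and check it whenever a new connection shows up in the logs. The sketch below is an assumption-laden illustration: the system names, the 1 to 3 sensitivity and monitoring scales, and the rule "a system inherits the highest sensitivity of anything it can reach" are all invented for the example. It replays the brochure server scenario described earlier.

```python
from collections import deque


def make_links(pairs):
    """Build an undirected adjacency map from (system, system) pairs."""
    links = {}
    for a, b in pairs:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def effective_sensitivity(start, links, sensitivity):
    """Highest sensitivity among `start` and every system reachable from it."""
    seen = {start}
    queue = deque([start])
    highest = sensitivity[start]
    while queue:
        node = queue.popleft()
        highest = max(highest, sensitivity[node])
        for neighbor in links.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return highest


def find_mismatches(links, sensitivity, monitoring):
    """Systems monitored at a lower level than their connections now require."""
    return sorted(
        name for name in sensitivity
        if effective_sensitivity(name, links, sensitivity) > monitoring[name]
    )


# Hypothetical inventory: 1 = low, 3 = high.
sensitivity = {"brochure_server": 1, "hr_test_vm": 3, "dev_network": 3}
monitoring = {"brochure_server": 1, "hr_test_vm": 3, "dev_network": 3}

before = find_mismatches(make_links([]), sensitivity, monitoring)
after = find_mismatches(
    make_links([("brochure_server", "hr_test_vm"), ("hr_test_vm", "dev_network")]),
    sensitivity,
    monitoring,
)
# before == [] and after == ["brochure_server"]
```

The mismatch list is what would feed the last question in the list: it identifies which system changed posture, so the right owner can be notified.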
Business Intelligence Use Case Walkthrough
Bringing the general concepts down to a very specific use case will help to illustrate what is meant by translating log data into business intelligence. In this scenario, adapted from a real-world example, we follow a celebrity as she checks into the hospital.
1. Celebrity actress/singer Lady Jen checks into a hospital in Los Angeles, CA
2. The hospital is a covered entity under HIPAA and is required to ensure that only approved, authorized staff view patient records
3. For HIPAA purposes, the hospital must also ensure patient records are not duplicated or distributed without proper approvals
4. Raw log data is aggregated and normalized, but without relationship mapping it does not prove to the HIPAA auditor that only approved, authorized staff are accessing the patient records
5. For high security patients, access to their records is restricted to only certain terminals and user IDs
6. The tabloids get wind of Lady Jen’s hospitalization and offer staffers cash to report on her health status
7. One unscrupulous staffer accepts the offer and attempts to access Lady Jen’s record from a shared terminal
8. The staffer logs in with credentials stolen from Lady Jen’s attending physician
9. The log management system flags the login – although it came from a user approved for access, it did not come from a server/terminal approved for access
10. The log management system sends an alert to the security team, the unscrupulous staffer is located and fired, and Lady Jen’s privacy is protected
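The rule that fires in steps 9 and 10 can be sketched as a check that requires both the user and the terminal to be approved for a restricted record. The policy table, patient and terminal identifiers, and event fields below are hypothetical and stand in for whatever the log management system actually stores.

```python
# Hypothetical access policy for restricted (high security) patient records.
RESTRICTED_RECORDS = {
    "patient-lady-jen": {
        "users": {"dr_attending"},
        "terminals": {"ward7-secure-01", "ward7-secure-02"},
    },
}


def check_record_access(event, restricted):
    """Return the reasons an access should be flagged (empty list if none)."""
    policy = restricted.get(event["patient_id"])
    if policy is None:
        return []  # not a restricted record in this sketch
    reasons = []
    if event["user_id"] not in policy["users"]:
        reasons.append("user not approved")
    if event["terminal_id"] not in policy["terminals"]:
        reasons.append("terminal not approved")
    return reasons


# Stolen credentials used from a shared terminal: the user ID passes, the terminal does not.
suspicious = {
    "patient_id": "patient-lady-jen",
    "user_id": "dr_attending",
    "terminal_id": "shared-breakroom-03",
}
# The physician's own access from an approved terminal.
legitimate = {
    "patient_id": "patient-lady-jen",
    "user_id": "dr_attending",
    "terminal_id": "ward7-secure-01",
}

suspicious_reasons = check_record_access(suspicious, RESTRICTED_RECORDS)  # ["terminal not approved"]
legitimate_reasons = check_record_access(legitimate, RESTRICTED_RECORDS)  # []
```

A check on the user ID alone would have passed the suspicious event, which is the point of the walkthrough: the relationship between user, terminal, and record is what exposes the misuse.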
Millions of unsorted log events can be too much of a good thing. Cut down on the noise and turn up the signal intelligence on log data by using relationship mapping, usage baselines and carefully written correlation rules.
Next Month: Why Anomaly Detection in Financial Fraud doesn’t work for IT/Log Mgmt Fraud – And what we can learn from it
Critical infrastructure security a mixed bag, report finds
A new report from the Center for Strategic and International Studies highlights the financial damage of cyber-attacks on critical infrastructure, but also paints a picture of IT security that is by turns good and bad. Among the findings is that only one-third of executives reported their organization had policies restricting or prohibiting the use of USB sticks or removable media, which have become a popular attack vector for malware.
Did you know? Prohibiting the use of USB devices can lead to unhappy employees and slash productivity. There is now another effective way to prevent costly damage from data stolen on USB devices without taking drastic measures.
Data breaches get costlier
The average total cost of a data breach rose from $6.65 million in 2008 to $6.75 million in 2009. The Ponemon Institute conducted the study and said that 2009 brought “more sophisticated criminal attacks that didn’t show up on our radar screen” the previous year. These malicious attacks often involved botnets and were carried out for reasons of financial gain.
Did you know? Traditional perimeter defense systems do not present a comprehensive defense, especially against the more sophisticated, targeted attacks that are currently being witnessed. A comprehensive SIEM solution can address a number of security concerns including insider theft, website attacks, brute-force attacks, external hacking, spyware, botnets and zero-day attacks.
Strong demand for full-featured SIEM drives 3rd consecutive year of double-digit growth for Prism Microsystems
Despite a sluggish worldwide economy, Prism charted double-digit gains in annual sales, driven by strong demand across major verticals, increased government spending for IT security and compliance initiatives, and stiffer non-compliance penalties under the HITECH Act for healthcare organizations. The company closed 2009 with over 120 new customers including Nintendo, the Salvation Army, the US Senate, MITRE and NASA.