EventTracker Search Performance

EventTracker 7.6 is a complex software application and, while there is no easy formula to compute its performance, there are ways to configure and use it so as to get better performance. All data received, whether in real time or by file ingest (called the Direct Log Archiver), is first indexed and then archived for optimal disk utilization. When a search is performed across these indexed, compressed archives, the speed with which results are returned depends on the type of search as well as the underlying hardware.

Searches can be categorized as:
Dense – at least one result per thousand (1,000) events
Sparse – at least one result per million (1,000,000) events
Rare – at least one result per billion (1,000,000,000) events
Needle in a haystack – one event in more than a billion events

Based on the provided search criteria, EventTracker consults the indexing metadata to determine whether any archive contains events matching the search terms and, if so, which ones. As searches go from dense to needle-in-a-haystack, they move from being CPU bound to I/O bound.

Dense searches are CPU bound because matches are found easily and there is plenty of raw data to decompress. For the fastest possible response on default hardware, EventTracker limits the results returned to the first 200 (sorted by time, newest on top). This setting can of course be overridden, but it is provided because it satisfies the most common use case.

As the frequency of events containing the search term drops to one in a hundred thousand (100,000) or lower, performance becomes more I/O bound. The reason is that there is less and less data to decompress, but more and more index files have to be consulted.

I/O performance is measured as latency: the time delay from when a disk I/O request is issued until it is completed by the underlying hardware. Windows perfmon can measure this as the Avg. Disk sec/Transfer counter; a quick way to spot-check it is sketched after the list below. A rule of thumb is to keep this below 25 milliseconds for best I/O performance.

This can be realized in various ways:
– Having different drives (spindles) for the OS/program and the archives
– Using faster disks (15K RPM disks perform better than 7,200 RPM disks)
– Using a SAN
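
As a quick spot check of that rule of thumb, the counter can be sampled from the command line and averaged. Below is a minimal sketch, assuming an English-language Windows installation where the typeperf utility and the \LogicalDisk(_Total)\Avg. Disk sec/Transfer counter path are available:

```python
import csv
import io
import subprocess

# Sample the average disk transfer latency ten times, one second apart.
# Counter path assumes an English-language Windows installation.
COUNTER = r"\LogicalDisk(_Total)\Avg. Disk sec/Transfer"
THRESHOLD_SECONDS = 0.025  # the 25 millisecond rule of thumb

output = subprocess.run(
    ["typeperf", COUNTER, "-sc", "10", "-si", "1"],
    capture_output=True, text=True, check=True,
).stdout

samples = []
for row in csv.reader(io.StringIO(output)):
    # Data rows look like: "04/02/2012 01:34:03.000","0.012345"
    if len(row) >= 2:
        try:
            samples.append(float(row[1]))
        except ValueError:
            continue  # skip the header row and blank lines

if not samples:
    raise SystemExit("no samples collected; check the counter path")

average = sum(samples) / len(samples)
verdict = "OK" if average < THRESHOLD_SECONDS else "investigate I/O"
print(f"Avg. Disk sec/Transfer over {len(samples)} samples: {average:.4f}s ({verdict})")
```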

In larger installations with multiple Virtual Collection Points (VCPs), dedicating a separate disk spindle to each VCP can help.

Trending Behavior – The Fastest Way to Value

Our SIEM Simplified offering is manned by a dedicated staff overseeing the EventTracker Control Center (ECC). When a new customer comes aboard, the ECC staff is tasked with getting to know the new environment: identifying which systems are critical, which applications need watching, what access controls are in place, and so on. In theory, the customer would bring the ECC staff up to speed (this is their network, after all) and keep them up to date as the environment changes. Reality bites, and this is rarely the case. More commonly, the customer is unable to provide the ECC with anything other than the most basic of information.

How then can the ECC “learn” and why is this problem interesting to SIEM users at large?

Let’s tackle the latter question first. A problem facing new users of a SIEM installation is that they get buried in the work of learning the enterprise and its baseline patterns (the very same problem the ECC faces). See this article from a practitioner.

So it’s the same problem. How does the ECC respond?

Short answer: By looking at behavior trends and spotting the anomalies.

Long answer: The ECC first discovers the network and learns the various device types (OS, application, network devices, etc.). This is readily automated by the StatusTracker module. If we are lucky, we get to ask the customer specific questions to bolster our understanding. Next, based on this information and the available knowledge packs within EventTracker, we schedule suitable daily and weekly reports and configure alerts. So far, so good, but really no cigar. The real magic lies in taking these reports and creating flex reports, where we control the output format to focus on parameters of value that are embedded within the description portion of the log messages (this is always true for syslog-formatted messages, and holds for Windows-style events as well). When these parameters are trended in a graph, all sorts of interesting information emerges.
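
To illustrate the kind of trending a flex report drives, here is a minimal sketch; the log format, the status field and the anomaly threshold are hypothetical placeholders, not EventTracker's actual flex report engine. It pulls one parameter out of each log description, counts occurrences per day, and flags days that sit well above the average:

```python
from collections import Counter
from datetime import datetime

# Hypothetical syslog-like lines; in practice these come from the log archive.
LOGS = [
    "2012-04-02T01:34:03 host1 app: user=alice action=login status=failed",
    "2012-04-02T01:34:09 host1 app: user=alice action=login status=success",
    "2012-04-03T11:02:44 host2 app: user=bob action=login status=failed",
]

def extract(line, key):
    """Pull a key=value parameter out of the free-text description."""
    for token in line.split():
        if token.startswith(key + "="):
            return token.split("=", 1)[1]
    return None

# Trend: count failed logins per day.
per_day = Counter()
for line in LOGS:
    day = datetime.fromisoformat(line.split()[0]).date()
    if extract(line, "status") == "failed":
        per_day[day] += 1

average = sum(per_day.values()) / len(per_day)
for day, count in sorted(per_day.items()):
    flag = "  <-- well above average, worth a look" if count > 2 * average else ""
    print(f"{day}: {count} failed logins{flag}")
```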

In one case, we saw that a particular group of users was putting their passwords in the username field and then logging in correctly, much more often than usual — you see a failed login followed by a successful one; combine the two and you have both the username and the password. In another case, we saw repeated failed logons after hours from a critical IBM i-Series machine and hit the panic button. It turned out someone had left a book on the keyboard.
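
The first anecdote boils down to a simple correlation: for each source, look for a failed logon followed shortly by a successful one. A hedged sketch of that pairing logic follows; the event fields, sample data and five-minute window are illustrative assumptions, not how EventTracker implements it:

```python
from datetime import datetime, timedelta

# Each event: (timestamp, source_host, username_field, outcome).
# In practice these fields are parsed out of the logon events.
events = [
    (datetime(2012, 4, 2, 1, 34, 3), "wks-17", "hunter2", "failure"),  # password typed into username field
    (datetime(2012, 4, 2, 1, 34, 9), "wks-17", "alice",   "success"),
    (datetime(2012, 4, 2, 2, 10, 0), "wks-23", "bob",     "failure"),
]

WINDOW = timedelta(minutes=5)  # assumed pairing window

last_failure = {}  # source host -> (time, username_field) of most recent failed logon
for when, host, name, outcome in sorted(events):
    if outcome == "failure":
        last_failure[host] = (when, name)
    elif outcome == "success":
        prior = last_failure.pop(host, None)
        if prior and when - prior[0] <= WINDOW:
            gap = int((when - prior[0]).total_seconds())
            print(f"{host}: failed logon as '{prior[1]}' then success as '{name}' "
                  f"{gap}s later; possible password exposure")
```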

Takeaway: Want to get useful value from your SIEM but don’t have gobs of time to configure or tune the thing for months on end? Think trending behavior, preferably auto-learned. It’s what sets EventTracker apart from the search engine based SIEMs or from the rules based products that need an expen$ive human analyst chained to the product for months on end. Better yet, let the ECC do the heavy lifting for you. SIEM Simplified, indeed.

Surfing the Hype Cycle for SIEM

The Gartner hype cycle is a graphic “source of insight to manage technology deployment within the context of your specific business goals.” If you have already adopted Security Information and Event Management (SIEM, aka log management) technology in your organization, how is that working for you? As a candidate, Reagan famously asked, “Are you better off than you were four years ago?”

Sadly, many buyers of this technology are wallowing in the “trough of disillusionment.” The implementation has been harder than expected, the technology more complex than demonstrated, the discipline required to use and tune the product is lacking, resources are constrained, hiring is frozen, and the list goes on.

What next? Here are some choices to consider.

Do nothing: Perhaps the compliance check box has been checked off; auditors can be shown the SIEM deployment and sent on their way; the senior staff have moved on to the next big thing; the junior staff have their hands full anyway; leave well enough alone.
Upside: No new costs, no disturbance in the status quo.
Downside: No improvements in security or operations; attackers count on the fact that even if you do collect log SIEM data, you will never really look at it.

Abandon ship: Give up on the whole SIEM concept as yet another failed IT project; the technology was immature; the vendor support was poor; we did not get resources to do the job and so on.
Upside: No new costs; in fact, perhaps some cost savings from dropping the annual maintenance, and one less technology to deal with.
Downside: Naked in the face of attack or an auditor visit; expect an OMG crisis situation soon.

Try managed service: Managing a SIEM is 99% perspiration and 1% inspiration; offload the perspiration to a team that does this for a living; they can do it with discipline (their livelihood depends on it) and probably cheaper too (passing on savings to you); you deal with the inspiration.
Upside: Security usually improves; compliance is not a nightmare; frees up senior staff to do other pressing/interesting tasks; cost savings.
Downside: Some loss of control.

Interested? We call it SIEM Simplified™.

Big Data Gotchas

Jill Dyche, writing in the Harvard Business Review, suggests that “the question on many business leaders’ minds is this: Does the potential for accelerating existing business processes warrant the enormous cost associated with technology adoption, project ramp up, and staff hiring and training that accompany Big Data efforts?”

A typical log management implementation, even in a medium enterprise, is usually a big data endeavor. Surprised? You should not be. A relatively small network of a dozen log sources easily generates a million log messages per day, with volumes of 50-100 million per day being commonplace. With compliance and security guidelines requiring that logs be retained for 12 months or more, pretty soon you have big data.
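
Some back-of-the-envelope arithmetic using the figures above shows how quickly this becomes big data; the average event size used here is an assumption for illustration:

```python
# Back-of-the-envelope sizing from the figures above.
events_per_day = 100_000_000   # upper end of the 50-100 million/day range
avg_event_bytes = 500          # assumed average raw event size
retention_days = 365           # 12-month retention requirement

raw_bytes = events_per_day * avg_event_bytes * retention_days
print(f"Events retained : {events_per_day * retention_days:,}")
print(f"Raw volume      : {raw_bytes / 1e12:.1f} TB before compression")
```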

So let’s answer the question raised in the article:

Q1: What can’t we do today that Big Data could help us do?   If you can’t define the goal of a Big Data effort, don’t pursue it.

A1: Comply with regulations such as PCI-DSS, SOX 404, and HIPAA; be alerted to security problems in the enterprise; control data leakage via insecure endpoints; improve operational efficiency.

Q2: What skills, technologies, and existing data development practices do we have in place that could help kick-start a Big Data effort? If your company doesn’t have an effective data management organization in place, adoption of Big Data technology will be a huge challenge.

A2: Absent a trained and motivated user of the power tool that is the modern SIEM, an organization that acquires such technology is consigning it to shelfware. Recognizing this as a significant adoption challenge in our industry, we offer Monitored SIEM as a service; the best way to describe this is SIEM simplified! We do the heavy lifting so you can focus on leveraging the value.

Q3: What would a proof-of-concept look like, and what are some reasonable boundaries to ensure its quick deployment? As with many other proofs-of-concept the “don’t boil the ocean” rule applies to Big Data.

A3: The advantage of a software-only solution like EventTracker is that an on-premises trial is easy to set up. A virtual appliance with everything you need is provided; set it up as a VMware or Hyper-V virtual machine within minutes. Want something even faster? See it live online.

Q4: What determines whether we green light Big Data investment? Know what success looks like, and put the measures in place.

A4: Excellent point; success may mean continuous compliance;   a 75% reduction in cost of compliance; one security incident averted per quarter; delegation of log review to a junior admin.

Q5: Can we manage the changes brought by Big Data? With the regular communication of tangible results, the payoff of Big Data can be very big indeed.

A5: EventTracker includes more than 2,000 pre-built reports designed to deliver value to every interested stakeholder in the enterprise, ranging from dashboards for management, to alerts for Help Desk staff, to risk-prioritized incident reports for the security team, to system uptime and performance results for the operations folks, to detailed cost-savings reports for the CFO.

The old adage “If you fail to prepare, then prepare to fail” applies. Armed with these questions and answers, you are closer to gaining real value with Big Data.

Learning from JPMorgan

The single most revealing moment in the coverage of JPMorgan’s multibillion dollar debacle can be found in this take-your-breath-away passage from The Wall Street Journal: On April 30, associates who were gathered in a conference room handed Mr. Dimon summaries and analyses of the losses. But there were no details about the trades themselves. “I want to see the positions!” he barked, throwing down the papers, according to attendees. “Now! I want to see everything!”

When Mr. Dimon saw the numbers, these people say, he couldn’t breathe.

Only when he saw the actual trades — the raw data — did Mr. Dimon realize the full magnitude of his company’s situation. The horrible irony: The very detail-oriented systems (and people) Dimon had put in place had obscured rather than surfaced his bank’s horrible hedge.

This underscores the new trust versus due diligence dilemma outlined by Michael Schrage. Raw data can have enormous impact on executive perceptions that pre-chewed analytics lack.   This is not to minimize or marginalize the importance of analysis and interpretation; but nothing creates situational awareness faster than seeing with your own eyes what your experts are trying to synthesize and summarize.

There’s a reason why great chefs visit the farms and markets that source their restaurants:   the raw ingredients are critical to success — or failure.

We have spent a lot of energy in building dashboards for critical log data and recognize the value of these summaries; but while we should trust our data, we also need to do the due diligence.

Big Data: does more data mean more insight?

In information technology, big data consists of data sets that grow so large that they become unwieldy to work with using available database management tools. How big is big? It depends on when you need to reconsider your data management options: in some cases that point may be 100 gigabytes; in others, as much as 100 terabytes.

Does more data necessarily mean more insight?

The argument in favor is that larger data sets allow for a greater incidence of patterns, facts, and insights. Moreover, with enough data, you can discover trends using simple counting that are otherwise undiscoverable in small data even with sophisticated statistical methods.

On the other hand, while this is perfectly valid in theory, for many businesses the key barrier is not the ability to draw insights from large volumes of data; it is asking the right questions for which insight is needed.

The ability to provide answers does depend on the question being asked and the relevance of the big-data set to that question. How can one generalize to an assumption that more data will always mean more insight?   It isn’t always the answer that’s important, but the questions that are key.

Silly human – logs are for machines (too)

Here is an anecdote from a recent interaction with an enterprise application in the electric power industry:

1. Dave the developer logs all kinds of events. Since he is the primary consumer of the log, the format is optimized for human-readability. For example:

02-APR-2012 01:34:03 USER49 CMD MOD0053: ERROR RETURN FROM MOD0052 RETCODE 59

Apparently this makes perfect sense to Dave:   each line includes a timestamp and some text.

2. Sam from the Security team needs to determine the number of daily unique users. Dave quickly writes a parser script for the log (something like the sketch after this list) and schedules it. He also builds a little Web interface so that Sam can query the parsed data on his own. Peace reigns.

3. A few weeks later, Sam complains that the web interface is broken. Dave takes a look at the logs, only to realize that someone else has added an extra field in each line, breaking his custom parser. He pushes the change and tells Sam that everything is okay again. Instead of writing a new feature, Dave has to go back and fill in the missing data.

4. Every 3 weeks or so, repeat Step 3 as others add logs.
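
Dave's script from step 2 probably looks something like the sketch below (a hypothetical reconstruction of the anecdote, not actual code from the application). The fragility is plain: the username is whatever happens to sit in the third whitespace-separated field, so any newly inserted field silently shifts it and breaks the report.

```python
from collections import defaultdict

# Sample lines in Dave's human-oriented format.
LOG_LINES = [
    "02-APR-2012 01:34:03 USER49 CMD MOD0053: ERROR RETURN FROM MOD0052 RETCODE 59",
    "02-APR-2012 09:12:40 USER12 CMD MOD0011: OK",
    "03-APR-2012 14:07:55 USER49 CMD MOD0053: OK",
]

users_per_day = defaultdict(set)
for line in LOG_LINES:
    fields = line.split()
    date, user = fields[0], fields[2]   # positional parse: breaks if a field is inserted
    users_per_day[date].add(user)

for date, users in sorted(users_per_day.items()):
    print(f"{date}: {len(users)} unique users")
```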

What is your maximum NPH?

In The Information Diet, Clay Johnson wrote, “The modern human animal spends upwards of 11 hours out of every 24 in a state of constant consumption. Not eating, but gorging on information … We’re all battling a storm of distractions, buffeted with notifications and tempted by tasty tidbits of information. And just as too much junk food can lead to obesity, too much junk information can lead to cluelessness.”

Audit yourself and you may be surprised to find that you get more than 10 notifications per hour; they can be disruptive to your attention. I find myself trying hard (and often failing) to ignore the smartphone as it beeps softly to indicate a new distraction. I struggle to remain focused on the person in my office as the desktop tinkles for attention.

Should you kill off notifications though? Clay argues that you should and offers tools to help.

When designing EventTracker v7, minimizing notifications was a major goal. On Christmas Day in 2008, nobody was stirring, but the “alerts” console rang up over 180 items demanding review. It was obvious these were not “alerts.” This led to the “risk” score, which dramatically reduces notifications.

We know that not all “alerts” are equal: some merit attention before going to lunch, some before the end of the day, and some by the end of the quarter, budget permitting. A very rare few require us to drop the coffee mug and attend instantly. Accordingly, a properly configured EventTracker installation will rarely “notify” you; but when you need to know — that alert will come screaming for your attention.
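
Conceptually, the risk score turns a flood of alerts into a short, prioritized list in which only the top tier interrupts a human. Here is a minimal sketch of that gating idea; the scoring inputs, sample alerts and threshold are illustrative assumptions, not EventTracker's actual risk formula:

```python
# Illustrative risk gating: score = severity of the alert x criticality of the asset.
ALERTS = [
    {"rule": "failed logon burst", "severity": 3, "asset_criticality": 2},
    {"rule": "malware signature",  "severity": 9, "asset_criticality": 9},
    {"rule": "disk 80% full",      "severity": 4, "asset_criticality": 3},
]

NOTIFY_THRESHOLD = 60  # assumed cut-off: only alerts above this interrupt a human

for alert in sorted(ALERTS, key=lambda a: a["severity"] * a["asset_criticality"], reverse=True):
    risk = alert["severity"] * alert["asset_criticality"]
    action = "NOTIFY NOW" if risk >= NOTIFY_THRESHOLD else "queue for daily review"
    print(f"{alert['rule']:<22} risk={risk:<3} -> {action}")
```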

I am frequently asked what maximum events-per-second rate can be managed. I think I’ll begin to ask how many notifications per hour (NPH) the questioner can handle. I think Clay Johnson would approve.

Data, data everywhere but not a drop of value

The sailor in The Rime of the Ancient Mariner relates his experiences after a long sea voyage when his ship is blown off course:

“Water, water, every where,

And all the boards did shrink;

Water, water, every where,

Nor any drop to drink.”

An albatross appears and leads them out, but is shot by the Mariner and the ship winds up in unknown waters.  His shipmates blame the Mariner and force him to wear the dead albatross around his neck.

Replace water with data, boards with disk space, and drink with value and the lament would apply to the modern IT infrastructure. We are all drowning in data, but not so much in value. “Big data” are datasets that grow so large that managing them with on-hand tools is awkward. They are seen as the next frontier in innovation, competition, and productivity.

Log management is not immune to this trend. As the basic log collection problem (different sources, different protocols and different formats) has been resolved, we’re now collecting even larger datasets of logs. Many years ago we refuted the argument that log data belonged in an RDBMS, precisely because we saw the side problem of efficient data archival begin to overwhelm the true problem of extracting value from the data. As log data volumes continue to explode, that decision continues to be validated.

However, while storing raw logs in a database was not sensible, the power of the relational database in extracting patterns and value from data is well established. Recognizing this, EventVault Explorer was released in 2011. Users can extract selected datasets to their choice of external RDBMS (a datamart) for fuzzy searching, pivot tables, etc. As was noted here, the key to managing big data is to personalize the results for maximum impact.
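
The workflow is roughly: extract a selected dataset, load it into a relational store, then slice it with ordinary SQL. Here is a hedged sketch using SQLite as a stand-in datamart; the table layout and sample rows are assumptions for illustration, not the EventVault Explorer interface:

```python
import sqlite3

# Stand-in for an extracted dataset: (timestamp, source, user, event_type).
ROWS = [
    ("2012-04-02 01:34:03", "host1", "alice", "logon_failure"),
    ("2012-04-02 01:34:09", "host1", "alice", "logon_success"),
    ("2012-04-03 11:02:44", "host2", "bob",   "logon_failure"),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (ts TEXT, source TEXT, user TEXT, event_type TEXT)")
con.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", ROWS)

# Pivot-style question: logon failures per user per day.
query = """
    SELECT date(ts) AS day, user, COUNT(*) AS failures
    FROM events
    WHERE event_type = 'logon_failure'
    GROUP BY day, user
    ORDER BY day
"""
for day, user, failures in con.execute(query):
    print(day, user, failures)
```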

As you look under the covers of SIEM technology, pay attention to that albatross called log archives. It can lead you out of trouble, but you don’t want it around your neck.

Top 5 Compliance Mistakes

5.   Overdoing compensating controls

When a legitimate technological or documented business constraint prevents you from satisfying a requirement, a compensating control can be the answer after a risk analysis is performed. Compensating controls are not specifically defined inside PCI; they are instead defined by you (as a self-certifying merchant) or your QSA. They are specifically not an excuse to push PCI compliance initiatives through to completion at a minimal cost to your company. In reality, most compensating controls are actually harder to do and cost more money in the long run than fixing or addressing the original issue or vulnerability. See this article for a clear picture on the topic.

4. Separation of duty

Separation of duties is a key concept of internal controls. Increased protection from fraud and errors must be balanced with the increased cost/effort required.   Both PCI DSS Requirements 3.4.1 and 3.5 mention separation of duties as an obligation for organizations, and yet many still do not do it right, usually because they lack staff.

3. Principle of Least privilege

PCI DSS Requirement 2.2.3 requires organizations to “configure system security parameters to prevent misuse.” This means drilling down into user roles to ensure they’re following the rule of least privilege wherever PCI regulations apply. This is easier said than done; more often it’s “easier” to grant all possible privileges than to determine and assign just the correct set. Convenience is the enemy of security.

2. Fixating on excluding systems from scope

When you make the process of getting things out of scope a higher priority than addressing real risk, you get in trouble. Risk mitigation must come first and foremost. In far too many cases, out-of-scope becomes out-of-mind. This may make your CFO happy, but a hacker will get past weak security and not care if the system is in scope or not.

And drum roll …

1. Ignoring virtualization

Many organizations have embraced virtualization wholeheartedly, given its efficiency gains. In some cases, virtualized machines are now off-premises and co-located at a service provider like Rackspace; this is a trend at federal government facilities. However, “off-premises” does not mean “off your list.” Regardless of the location of the cardholder data, such systems are within scope, as is the hypervisor. In fact, PCI DSS 2.0 says that if cardholder data is present on even one VM, then the entire VM infrastructure is “in scope.”

The 5 Most Annoying Terms of 2011

Since every cause needs “Awareness,” here are my picks for management speak to camouflage the bloody obvious:

  5. Events per second

Log Management vendors are still trying to “differentiate” with this tired and meaningless metric as we pointed out in The EPS Myth.

  4. Thought leadership

Mitch McCrimmon describes it best.

  3. Cloud

Now here is a term that means all things to all people.

  2. Does that make sense?

The new “to be honest.” Jerry Weismann discusses it in the Harvard Business Review.

  1. Nerd

During the recent SOPA debate, so many self-described “country boys” wanted to get the “nerds” to explain the issue to them; as Jon Stewart pointed out, the word they were looking for was “expert.”

SIEM and the Appalachian Trail

The Appalachian Trail is a marked hiking trail in the eastern United States extending between Georgia and Maine. It is approximately 2,181 miles long and takes about six months to complete. It is not a particularly difficult journey from start to finish; yet even so, completing the trail requires more from the hiker than just enthusiasm, endurance and will.

Likewise, a SIEM implementation can take from one to six months to complete (depending on the level of customization) and, like the Trail, appears deceptively simple. It, too, can be filled with challenges that reduce even the most experienced IT manager to despair, and there is no shortage of implementations that have been abandoned or left uncompleted. As with the Trail, SIEM implementation requires thoughtful consideration.

1) The Reasons Why

It doesn’t take too many nights scurrying to find shelter in a lightning storm, or days walking in adverse conditions before a hiker wonders: Why am I doing this again? Similarly, when implementing any IT project, SIEM included, it doesn’t take too many inter-departmental meetings, technical gotchas, or budget discussions before this same question presents itself: Why are we doing this again?

All too often, we don’t have a compelling answer, or we have forgotten it. If you are considering a half-year-long backpacking trip through the woods, there is presumably a really good reason for it. In the same way, one embarks on a SIEM project with specific goals, such as regulatory compliance, IT security improvement or controlling operating costs. Define the answer to this question before you begin the project and refer to it when the implementation appears to be derailing. This is the compass that should guide your way. Make adjustments as necessary.

2) The Virginia Blues

Daily trials can include anything from broken bones to homesickness, a slump that occurs on the Appalachian Trail about four to eight weeks into the journey, within the state lines of Virginia. Getting through requires not just perseverance but also an ability to adapt.

For a SIEM project, staff turnover, false positives, misconfigurations or unplanned explosions of data can potentially derail the project. But pushing harder in the face of distress is a recipe for failure. Step back, remind yourself of the reasons why the project is underway, and look at the problems from a fresh perspective. Can you be flexible? Can you find new avenues around the problems?

3) A Fresh Perspective

In the beginning, every day is chock-full of excitement; every summit view or wild animal encounter is a thrill. But life in the woods becomes routine, and exhilaration eventually fades into frustration.

In  much the same way, after the initial thrill of installation and its challenges, the SIEM project devolves into a routine of discipline and daily observation across the infrastructure for signs of something amiss.

This is where boredom can set in, but the best defense against the lull that comes with the end of the implementation is the expectation of it. The journey is going to end, but completing it does not occur when the project is implemented. Rather, when the installation is done, the real journey and the hard work begin.

Threatscape 2012 – Prevent, Detect, Correct

The past year has been a hair-raising series of IT security breakdowns and headline events, reaching as high as RSA itself falling victim to a phishing attack. But as the sun set on 2011, the hacker group Anonymous remained busy, providing a sobering reminder that IT security can never rest.

It turned out that attackers sent two different targeted phishing e-mails to four workers at RSA’s parent company, EMC. The e-mails contained a malicious attachment, identified in the subject line as “2011 Recruitment plan.xls,” which was the point of attack.

Back to Basics:

Prevent:

Using administrative controls such as security awareness training, and technical controls such as firewalls, anti-virus and IPS, to stop attacks from penetrating the network. Most industry and government experts agree that security configuration management is probably the best way to ensure the strongest security configuration allowable, along with automated patch management and up-to-date anti-virus software.

Detect:

Employing a blend of technical controls such as anti-virus, IPS, intrusion detection systems (IDS), system monitoring, file integrity monitoring, change control, log management and incident alerting   can help to track how and when system intrusions are being attempted.

Correct:

Applying operating system upgrades, backup data restores, vulnerability mitigation and other controls to make sure systems are configured correctly and to prevent the irretrievable loss of data.

Echo Chamber

In the InfoSec industry, there is an abundance of familiar flaws and copycat theories and approaches. We repeat ourselves and recommend the same approaches. But what has really changed in the last year?

The emergence of hacking groups like Anonymous, LulzSec, and TeaMp0isoN.

In 2011, these groups brought the fight to corporate America, crippling firms both small (HBGary Federal) and large (Stratfor, Sony). As the year drew to a close, these groups shifted from prank-oriented hacks for laughs (or “lulz”) to aligning themselves with political movements like Occupy Wall Street, and hacking firms like Stratfor, an Austin, Texas-based security “think tank” that releases a daily newsletter concerning security and intelligence matters all over the world.

After HBGary Federal CEO Aaron Barr publicly bragged that he was going to identify some members of the group during a talk at RSA Conference week in San Francisco, Anonymous members responded by dumping a huge cache of his personal emails and those of other HBGary Federal executives online, eventually leading to Barr’s resignation. Anonymous and LulzSec then spent several months targeting various retailers, public figures and members of the security community. Their Operation AntiSec aimed to expose alleged hypocrisies and sins by members of the security community. They targeted a number of federal contractors, including IRC Federal and Booz Allen Hamilton, exposing personal data in the process. Congress got involved in July when Sen. John McCain urged Senate leaders to form a select committee to address the threat posed by Anonymous/LulzSec/Wikileaks.

The attack on RSA SecurId was another watershed event. The first public news of the compromise came from RSA itself, when it published a blog post explaining that an attacker had been able to gain access to the company’s network through a “sophisticated” attack. Officials said the attacker had compromised some resources related to the RSA SecurID product, which set off major alarm bells throughout the industry. SecurID is used for two-factor authentication by a huge number of large enterprises, including banks, financial services companies, government agencies and defense contractors. Within months of the RSA attack, there were attacks on SecurID customers, including Lockheed Martin, and the current working theory espoused by experts is that the still-unidentified attackers were interested in LM and other RSA customers all along and, having run into trouble compromising them directly, went after the SecurID technology to loop back to the customers.

The specifics of the attack were depressingly mundane (targeted phishing email with a malicious Excel file attached).

Then too, several certificate authorities were compromised throughout the year. Comodo was the first to fall when it was revealed in March that an attacker (apparently an Iranian national) had been able to compromise the CA infrastructure and issue himself a pile of valid certificates for domains belonging to Google, Yahoo, Skype and others. The attacker bragged about his accomplishments in Pastebin posts and later posted evidence of his forged certificate for Mozilla. Later in the year, the same person targeted the Dutch CA DigiNotar. The details of the attack were slightly different, but the end result was the same: he was able to issue himself several hundred valid certificates and this time went after domains owned by, among others, the Central Intelligence Agency. In the end, all of the major browser manufacturers had to revoke trust in the DigiNotar root CA.   The damage to the company was so bad that the Dutch government eventually took it over and later declared it bankrupt. Staggering, isn’t it? A lone attacker not only forced Microsoft, Apple and Mozilla to yank a root CA from their list of trusted roots, but he was also responsible for forcing a certificate authority out of business.

What has changed in our industry? Nothing, really. It’s not a question of “if” but “when” the attack will arrive on your assets.

Plus ça change, plus c'est la même chose (the more things change, the more they stay the same), I suppose.

Taxonomy of a Cyber Attack

New Bill Promises to Penalize Companies for Security Breaches

On September 22, the Senate Judiciary Committee approved Sen. Richard Blumenthal’s (D-Conn.) bill, the “Personal Data Protection and Breach Accountability Act of 2011,” sending it to the Senate floor. The bill would penalize companies for online data breaches and was introduced on the heels of several high-profile security breaches and hacks that affected millions of consumers. These included the Sony breach, which compromised the data of 77 million customers, and the DigiNotar breach, which resulted in 300,000 Google Gmail account holders having their mail hacked and read. The measure addresses companies that hold the personal information of more than 10,000 customers and requires them to put privacy and security programs in place to protect the information, and to respond quickly in the event of a security failure.

The bill proposes that companies be fined $5,000 per day per violation, with a maximum of $20 million per infringement. Additionally, companies that fail to comply with the data protection law (if it is passed) may be required to pay for credit monitoring services and be subject to civil litigation by the affected consumers. The bill also aims to increase criminal penalties for identity theft, as well as for crimes including installing a data collection program on someone’s computer and concealing any security breach in which personal data is compromised.

Key provisions in the bill include a process to help companies establish appropriate minimum security standards, notification requirements, information sharing after a breach, and company accountability.

While the intent of the bill is admirable, the problem is not a lack of laws to deter breaches, but the insufficient enforcement of these laws. Many of the requirements espoused in this new legislation already exist in many different forms.

SANS is the largest source for information security training and security certification, and their position is that we don’t need an extension to the Federal Information Security Management Act of 2002 (FISMA) or other compliance regulations, which have essentially encouraged a checkbox mentality: “I checked it off, so we are good.” This is the wrong approach to security, but companies get rewarded for checking off criteria lists. Compliance regulations do not drive improvement. Organizations need to focus on the actual costs that can be incurred by not being compliant:

  • Loss of consumer confidence: Consumers will think twice before they hand over their personal data to an organization perceived to be careless with that information which can lead to a direct hit in sales.
  • Increased costs of doing business as with PCI-DSS: PCI-DSS is one example where enforcement is prevalent, and the penalties can be stringent. Merchants who do not maintain compliance are subject to higher rates charged by VISA, MasterCard, etc.
  • Negative press: One need only look at the recent data breaches to consider the continuing negative impact on the compromised company’s brand and reputation. In one case (DigiNotar), the company folded.

The gap does not exist in the laws, but rather, in the enforcement of those laws. Until there is enforcement any legislation or requirements are hollow threats.

Top 10 Pitfalls of Implementing IT Projects

It’s a dirty secret: many IT projects fail, maybe even as many as 30% of all IT projects.

Amazing, given the time, money and mojo spent on them, and the seriously smart people working in IT.

As a vendor, it is painful to see this. We see it from time to time (often helplessly from the sidelines), we think about it a lot, and we’d like to see it eliminated along with malaria, cancer and other “nasties.”

They fail for a lot of reasons, many of them unrelated to software.

At EventTracker we’ve helped save a number of nearly failed implementations, and we have noticed some consistent reasons why they fail.

From the home office in Columbia MD, here are the top 10 reasons IT projects fail:

10. “It has to be perfect”

This is the “if you don’t do it right, don’t do it at all” belief system. With this viewpoint, the project lead person believes that the solution must perfectly fit existing or new business processes. The result is a massive, overly complicated implementation that is extremely expensive. By the time it’s all done, the business environment has changed and an enormous investment is wasted.

Lesson: Value does not mean perfection. Make sure the solution delivers value early and often, and let perfection happen as it may.

9. Doesn’t integrate with other systems

In almost every IT shop, “seamless integration with everything” is the mantra. Vendors tout it, management believes it, and users demand it. In other words, to be all things to all people, an IT project cannot exist in isolation; integration has become a key component of many IT projects.

Lesson: Examine your needs for integration before you start the project. Find out if there are pre-built tools to accomplish this, and plan accordingly if there aren’t.

8. No one is in charge, everyone is in charge

This is the classic “committee” problem. The CIO or IT Manager decides the company needs an IT solution, so they assign the task of getting it done to a group. No one is accountable, no one is in charge. So they deliberate and discuss forever. Nothing gets done, and when it does, no one makes sure it gets driven into the organization. Failure is imminent.

Lesson: Make sure someone is accountable in the organization for success. If you are using a contractor, give that contractor enough power to make it happen.

7. The person who championed the IT solution quits, goes on vacation, or loses interest

This is a tough problem to foresee because employees don’t usually broadcast their departure or disinterest before bailing. The bottom line is that if the project lead leaves, the project will suffer. It might kill the project if no one else is up to speed. It’s a risk that should be taken seriously.

Lesson: Make sure that more than just one person is involved, and keep an interim project manager shadowing the lead and up to date.

6. Drive-by management

IT projects are often as much about people and processes as they are about technology. If the project doesn’t have consistent management support, it will fail. After all, if no one knows how or why to use the solution, no one will.

Lesson: Make sure you and your team have allocated time to define, test, and use your new solution as it is rolled out.

5. No one thought it through

One day someone realized, “hey, we need a good solution to address the compliance regulations and these security gaps.” The next day someone started looking at packages, and a month later you buy one. Then you realize that there are a lot of things this solution affects, including core systems, routers, applications and operations processes. But you’re way too far down the road on a package and have spent too much money to switch to something else. So you keep investing until you realize you are dumping money down a hole. It’s a bad place to be.

Lesson: Make sure you think it all through before you buy. Get support. Get input. Then take the plunge. You’ll be glad you did.

4. Requirements are not defined

In this all-too-common example, half way through a complex project, someone says “we actually want to rework our processes to fit X.” The project guys look at what they have done, realize it won’t work, and completely redesign the system. It takes 3 months. The project goes over budget. The key stakeholder says “hey this project is expensive, and we’ve seen nothing of value.” The budget vanishes. The project ends.

Lesson: Make sure you know what you want before you start building it. If you don’t know, build the pieces you do know, then build the rest later. Don’t build what you don’t understand.

3. Processes are not defined

This relates to #4 above. Sometimes requirements are defined, but they don’t match good processes, because those processes don’t exist. Or no one follows them. Or they are outdated. Or not well understood. The point is that the solution is computer software: it does exactly what you tell it, the same way, every time, and it’s expensive to change. Sloppy processes are impossible to encode in software, making the solution more of a hindrance than a help.

Lesson: Only implement and automate processes that are well understood and followed. If they are not well understood, implement them in a minimal way and do not automate until they are well understood and followed.

2. People don’t buy in

Any solution with no users is a very lonely piece of software. It’s also a very expensive use of 500 MB on your server. Most IT projects fail because they just aren’t used by anyone. They become a giant database of old information and spotty data. That’s a failure.

Lesson: Focus on end user adoption. Buy training. Talk about the value that it brings your customers, your employees, and your shareholders. Make usage a part of your employee review process. Incentivize usage. Make it make sense to use it.

1. Key value is not defined

This is by far the most prevalent problem in implementing IT solutions: Businesses don’t take time to define what they want out of their implementation, so it doesn’t do what they want. This goes further than just defining requirements. It’s about defining what value the new software will deliver for the business. By focusing on the nuts and bolts, the business doesn’t figure out what they want from the system as a whole.

Lesson: Instead of starting with “hey, I need something to accomplish X,” the organization should be asking “how can this software bring value to our security posture, to our internal costs, to our compliance requirements?”

This list is not exhaustive – there are many more ways to kill your implementation. However if your organization is aware of the pitfalls listed above, you have a very high chance of success.

A.N. Ananth

Five Reasons for Log Apathy – and the Antidote

How many times have you heard that people just don’t care about logs? That IT guys are selfish, stupid or lazy? That they would rather play with new toys than do serious work?

I argue that IT guys are amazing, smart and do care about the systems they curate, but native tools are such that log management is often like running into a brick wall — they encourage disengagement.

Here are five reasons for this perception and what can be done about them.

#1 Obscure descriptions: Ever see a raw log? A Cisco intrusion event, a Windows failed object access attempt, or a Solaris BSM record for mounting a volume? Blech… it’s a description even the author would find hard to love. It is not written to be easy to understand; rather, its purpose is either to aid debugging by the developer or to satisfy certification requirements. This is not apathy, it’s intentional exclusion.

To make logs relevant, you need a readable description that highlights the elements of value, enriches the information (e.g., by looking up an IP address or event ID), and presents entries not just in time sequence but in priority order of risk.

#2 Lack of access: What easier way to spur disengagement than by hiding the logs away in an obscure part of the file system, out of sight to any but the most determined? If they cannot see it, they won’t care about it.

The antidote is to centralize logging and put up an easy-to-understand display that presents relevant information, preferably risk-ordered.

#3 Unsexiness: All the security stories are about WikiLeaks and credit card theft. Log review is considered dull and boring; it rarely makes it into the plot line of Hawaii Five-O.

Compare it to working out at the gym: it can be boring, and there are 10 reasons why other things are more “fun,” but it’s good for you and pays handsomely in the long run.

#4 Unsung Heroes: Who is the Big Man on your Campus? Odds are, it’s the guys who make money for the enterprise (think sales guys or CEOs).

Rarely is it the folks who keep the railroad running or, god forbid, reduce costs or prevent incidents.

However, they are the wind beneath the wings of the enterprise. The organization that recognizes and values the people who show up for work every day and do their job without fuss or drama is much more likely to succeed. Heroes are the ones who make a voluntary effort over a long period of time to accomplish serious goals, not chosen ones with marks on their foreheads, destined from birth to save the day.

#5 Forced Compliance: As long as management looks at regulatory compliance as unwarranted interference, it will be resented, and IT is forced into a checkbox mentality that benefits nobody.

It’s the old question “What comes first? Compliance (chicken) or security (egg)?” We see compliance as a result of secure practices. By making it easy to crunch the data and present meaningful scores and alerts, there is less need to force this.

I’ll say it again: I know many IT guys and gals who are amazing, smart and care deeply about the systems they manage. To combat log apathy, make logs easier to deal with.

Tip of the hat to Dave Meslin, whose recent TEDx talk in Toronto spurred this blog entry.

A.N. Ananth

Personalization wins the day

Despite tough times for the corporate world in the past year, spending on IT security was a bright spot in an otherwise gloomy picture.

However, if you’ve tried to convince a CFO to sign off on tools and software, you know just how difficult this can be. In fact, the most common way to get approval is to tie the request to an unrelenting compliance mandate. Sadly, a security incident can also help focus attention and trigger budget approval.

Vendors have tried hard to showcase their value by appealing to the preventive nature of their products. ROI calculations are usually provided to demonstrate quick payback, but these are often dismissed by the CFO as self-serving. Recognizing the difficulty of measuring ROI, an alternate model called ROSI (return on security investment) has been proposed, but it has met with limited success.

So what is an effective way to educate and persuade the gnomes? Try an approach from a parallel field: the presentation of medical data. Your medical chart is hard to access, impossible to read, and full of information that could make you healthier if you just knew how to use it — pretty much like security information inside the enterprise. And if you have ever seen lab results, you know that even motivated persons find them hard to decipher and act on, much less the disinclined.

In a recent talk at TED, Thomas Goetz, the executive editor of Wired magazine, addressed this issue and proposed some simple ideas to make this data meaningful and actionable: the use of color, graphics and, most important, personalization of the information to drive action. We know from experience that posting the speed limit is less effective at getting motorists to comply than a radar sign that posts the speed limit framed by “Your speed is __.” It’s all about personalization.

To make security information meaningful to the CFO, a similar approach can be much more effective than bland “best practice” prescriptions or questionable ROI numbers. Gather data from your enterprise and present it with color and graphs tailored to the “patient”.

Personalize your presentation; get a more patient ear and much less resistance to your budget request.

A. N. Ananth

Best Practice vs. FUD

Have you observed how “best practice” recommendations are widely known but not as widely followed? While it seems especially true in IT security, the same holds in every other sphere as well. For example, dentists repeatedly recommend brushing and flossing after each meal as best practice, but how many follow this advice? And then there is the clearly posted speed limit on the road; more often than not, motorists are speeding.

Now, the downside to non-compliance is well known to all and for the most part well accepted – no real argument. In the dentist example, the consequences include social hardships ranging from bad teeth and breath to health issues and the resulting expense. In the speeding example, there is potential physical harm and, of course, monetary fines. However, it would appear that neither the fear of “bad outcomes” nor of “monetary fines” spurs widespread compliance. Indeed, one observes that the persons who do comply appear to do so because they wish to; the fear and fine factors don’t play a major role for them.

In a recent experiment, people visiting the dentist were divided into two groups. Before the start, each patient was asked to indicate whether they classified themselves as someone who “generally listens to the doctor’s advice.” After the checkup, people from one group were given the advice to brush and floss regularly, but then given a “fear” message on the consequences of non-compliance — bad teeth, social ostracism, high cost of dental procedures, etc. People from the other group got the same checkup and advice but were given a “positive” message on the benefits of compliance — nice smile, social popularity, less cost, etc. A follow-up was conducted to determine which of the two approaches was more effective in getting patients to comply.

Those of us in IT security battling for budget from unresponsive upper management have been conditioned to think that the “fear” message would be more effective… but… surprise: neither approach was more effective than the other in getting patients to comply with “best practice.” Instead, those who classified themselves as ones who “generally listen to the doctor’s advice” were the ones who did comply. The rest were equally impervious to either the negative or positive consequences, while not disputing them.

You could also point to the great reduction in the incidence of smoking, but this best practice has required more than three decades of education to achieve the trend, and smoking still can’t be stamped out.

Lesson for IT security: education takes time, and behavior modification even more so.

Subtraction, Multiplication, Division and Task Unification through SIEM and Log Management

When we originally conceived the idea of a SIEM and log management solution for IT managers many years ago, it was because of the problems they faced dealing with high volumes of cryptic audit logs from multiple sources. Searching, categorizing, analyzing, and performing forensics and remediation for the system security and operational challenges evidenced in disparate audit logs were time-consuming, tedious, inconsistent and unrewarding tasks. We wanted to provide technology that would make problem detection, understanding and therefore remediation faster and easier.

A recent article in Slate caught my eye; it was all about infomercials, that staple of late-night TV, and a pitch-a-thon that was conducted in Washington, DC for new ideas. The question is: just how would you know a “successful” idea if you heard it described?

By now, SIEM has “Crossed the Chasm”; indeed, the Gartner MQ puts it well into mainstream adoption. But in the early days, there was some question as to whether this was a real problem or whether, as is too often the case, SIEM and log management was a solution in search of a problem.

Back to the question — how does one determine the viability of an invention before it is released into the market?  Jacob Goldenberg, a professor of marketing at Hebrew University in Jerusalem and a visiting professor at Columbia University, has coded a kind of DNA for successful inventions. After studying a year’s worth of new product launches, Goldenberg developed a classification system to predict the potential success of a new product. He found the same patterns embedded in every watershed invention.

The first is subtraction—the removal of part of a previous invention.

For example, an ATM is a successful invention because it subtracts the bank teller.

Multiplication is the second pattern, and it describes an invention with a component copied to serve some alternate purpose.  Example: the digital camera’s additional flash to prevent “red-eye.”

A TV remote exemplifies the third pattern: division. It’s a product that has been physically divided, or separated, from the original; the remote was “divided” off of the TV.

The fourth pattern, task unification, involves saddling a product with an additional job unrelated to its original function. The iPhone is the quintessential task unifier.

SIEM and log management solutions subtract (liberate) embedded logs and log management functionality from source systems.

SIEM and log management solutions multiply (via aggregation) the problems that can be detected; correlation surfaces issues that would have gone unnoticed otherwise.

EventTracker also meets the last two criteria. Arguably, decent tools for managing logs ought to have been included by OS and platform vendors (Unix, Linux, Windows and Cisco all have very rudimentary tools for this, if anything); so one can say EventTracker provides something needed for operations (like the TV remote) but not included in the base product.

With the myriad features now available, such as configuration assessment, change audit, netflow monitoring and system status, the task unification criterion is also satisfied; you can now address a lot of security and operational requirements that are not strictly “log” related – “task unification.”

When President Obama praised innovation as a critical element in the recovery in his State of the Union, he may not have had “As Seen on TV” in mind but does SIEM fit the bill?

What’s the message supposed to be?  That SIEM and log management solutions are (now?) a good invention? SIEM has crossed the chasm!

5 Myths about SIEM and Log Management

In the spirit of the Washington Post’s regular column, “5 Myths,” here is a “challenge everything you think you know” look at SIEM/Log Management.

Driven by compliance regulation and the unending stream of security issues, the IT community has, over the past few years, accepted SIEM and Log Management as must-have technology for the data center. The analyst community lumps a number of vendors together as SIEM, and marketing departments are always in overdrive to claim any or all possible benefits or budget. Consequently, some “truths” are bandied about. This misinformation affects the decision-making process, so let’s look at these myths.

1. Price is everything…all SIEM products are roughly equal in terms of features/functions. 

An August 2010 article in SC Magazine points out that “At first blush, these (SIEM solutions) looked like 11 cats in a bag,” quickly followed by “But a closer look shows interesting differences in focus.” Nice save, but the first impression was that the products were roughly equal, and for many that was the key take-away. Because so many are influenced by the Gartner Magic Quadrant, the picture is taken to mean everything, separated from the detailed commentary, even though that commentary states quite explicitly to look closely at features.

Even better, look at where each vendor started. Very different places, it turns out; each then added features and functionality to meet market (or marketing) needs. For example, NetForensics preaches that SIEM is really correlation; LogRhythm believes that focusing on your logs is the key to security; Tenable thinks vulnerability status is the key; Q1 Labs offers network flow monitoring as the critical element; eIQ’s origins are as a firewall log analyzer. So, while each solution may claim “the same features,” under the hood they each started in a certain place and packed additional features and functionality around their core. They continue to focus on their core as their differentiator, adding functionality as the market demands.

Also, some SIEM vendors are software-based, while others are appliance-based, which in itself differentiates the players in the market.

All the same? Hardly.

2. Appliances are a better solution.

Can you spell groupthink? An appliance is just one way to deliver the solution; neither better nor worse as a technical approach; perhaps easier for resellers to carry.

When does a software-based solution win?

– Sparing.  To protect your valuable IT infrastructure, you will need to calculate a 1xN relationship of live appliances to back-ups.  If your appliance breaks down and you don’t have a spare, you have to ship the appliance and wait for a replacement.  With software, if your device breaks down, you can simply install the software on existing capacity in your infrastructure, and be back up and running in minutes versus potentially days.

– Scalability.  With an appliance solution, your SIEM has a floor and a ceiling.  You need at least one device to get started, and it has a maximum capacity before you have to add another appliance at a high price.  With a software solution, you can scale incrementally… one IT infrastructure device at a time.

– Single Sign-On. Integrate easily with Active Directory or LDAP; the same username/password or smartcard authentication; very attractive.

– Storage. What retention period is best for your logs? Weeks? Months? Years? With appliances, it’s dictated by the disk size provided; with software, you decide, or you can use network-based storage.

So appliances must be easier to install? Plug in the box, provide an IP address and you are done? Not really – more than 99% of the configuration work is specific to the user’s environment.

3. Your log volumes don’t matter…disk space is cheap.

Sure… but as Defense Secretary Rumsfeld used to say, $10B here and $10B there and pretty soon you’re talking real money.

Logs are voluminous; a successful implementation leads to much higher log volume, and terabytes add up very quickly. Compression is essential, but the ability to access network-based storage is even more important. The ability to back up and restore archives easily and natively to nearline or offline storage is critical.

If you consider an appliance solution, it is inherently limited by the available disk.

4. The technology is like an antivirus… just install it and forget it, and if anything is wrong, it will tell you.

Ahh, the magic bullet!  Like the ad says, “Set it and forget it!”  If only this were true… wishing will not make it so.  There is not one single SIEM vendor that can justify saying “open the box, activate your SIEM solution in minutes, and you will be fine!”  To say so, or even worse, to believe it would just be irresponsible!

If you just open the box and install it, you will only have the protection offered by the default settings. With an antivirus solution, that approach works because you get all of the virus signatures to date, and the product automatically checks the virus database for updates as new signatures are added. Too bad antivirus cannot recognize a “Zero Day” attack when it happens, but that, for now, is impossible.

With a SIEM solution, you need something you don’t need with an antivirus…  you need human interaction.  You need to tell the SIEM what your organization’s business rules are, define the roles and capabilities of the users, and have an expert analyst team monitor it, and adapt it to ever-changing conditions.  The IT infrastructure is constantly changing, and people are needed to adjust the SIEM to meet threats, business rules, and the addition or subtraction of IT components or users.

Some vendors imply that their SIEM solution is all that is needed, and you can just plug and play. You know what the result is? Unhappy SIEM users chasing down false positives or, much worse, false negatives. All SIEM solutions require educated analysts to understand the information being provided and turn it into actions. These adjustments can be simplified, but again, it takes people. If you are thinking about implementing a SIEM and forgetting about it… then fuhgeddaboutit!

5. Log Management is only meaningful if you have a compliance requirement.

Seen the recent headlines? From Stuxnet to Wikileaks to Heartland? There is a lot more to log management than merely satisfying compliance regulations. This myth exists because people are not aware of the internal and external threats that exist in this century!  SIEM/Log Management solutions provide some very important benefits to your organization beyond meeting a compliance requirement.

– Security.  SIEM/Log Management solutions can detect and alert you to a “Zero-Day” virus before the damage is done…something other components in your IT infrastructure can’t do.  They can also alert you to brute force attacks, malware, and trojans by determining what has changed in your environment…

– Improve Efficiency. Face it! There are too many devices transmitting too many logs, and the IT staff doesn’t have the time to comb through them all or to know whether they are performing the most essential tasks in the proper order. Too often the order is defined by who is screaming the loudest. A SIEM/Log Management solution helps you learn of a potential problem sooner, can automate the log analysis, and can prioritize the order in which issues are addressed, improving the overall efficiency of the IT team! It also makes forensic analysis to determine the cause and effect of an incident much more efficient.

– Improve Network Performance. Are the servers not working properly? Are applications running slowly? The answer is in the logs, and with a SIEM/Log Management solution you can quickly locate the problem and fix it.

– Reduce Costs. Implementing a SIEM enables organizations to reduce the number of threats, both internal and external, and to reduce the operating cost per device. A SIEM can dramatically reduce the number of incidents that occur within your organization, which eliminates the cost of figuring out what actually happened. Should an incident occur, the time it takes to perform the forensic analysis and fix the problem can be greatly shortened, reducing the total loss per incident.

– Ananth

Panning for gold in event logs

Ananth, the CEO of Prism, is fond of remarking that “there is gold in them thar logs…” This is absolutely true, but the really hard thing about logs is figuring out how to get the gold out without having to be the pencil-necked guy with 26 letters after his name who enjoys reading logs in their original arcane format. For the rest of us, I am reminded of the old western movies where prospectors pan for gold: squatting by the stream, scooping up dirt and sifting through it looking for gold, all day long, day after day. Whenever I see one of those scenes my back begins to hurt and I feel glad I am not a prospector. At Prism we are in the business of gold extraction tools. We want more people finding gold, and lots of it. It is good for both of us.

One of the most common refrains we hear from prospects is that they are not quite sure what the gold looks like. When you are panning for gold and you are not sure whether that glinty thing in the dirt is gold, well, that makes things really challenging. If very few people can recognize the gold, we are not going to sell large quantities of tools.

In EventTracker 6.4 we undertook a little project where we asked ourselves, “What can we do for the person who does not know enough to really look, or to ask the right questions?” A lot of log management is looking for the out-of-the-ordinary, after all. The result is a new dashboard view we call the Enterprise Activity Monitor.

Enterprise Activity uses statistical correlation to look for things that are simply unusual. We can’t tell you they are necessarily trouble, but we can tell you they are not normal and enable you to analyze them and make a decision. Little things that are interesting: a new IP address hitting your enterprise 5,000 times, a user who generally performs 1,000 activities in a day suddenly performing 10,000, or something as simple as a new executable showing up unexpectedly on user machines. Will you chase the occasional false positive? Definitely, but a lot of the manual log review being performed by the guys with the alphabets after their names is really just manually chasing trends. This lets you stop wasting significant time detecting the trend, and it recovers all the myriad clues that are easily lost when you are aggregating 20 or 100 million logs a day.
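
For readers who like to see the idea in code, here is a minimal sketch of this kind of behavior trending. It illustrates the general statistical technique only, not the actual Enterprise Activity Monitor implementation, and the three-standard-deviation threshold is an arbitrary assumption.

import statistics

def is_unusual(history, today, z_threshold=3.0):
    """Flag today's count if it falls far outside the historical pattern.
    `history` is a list of daily activity counts for one user or source."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0   # avoid dividing by zero on a flat history
    return (today - mean) / stdev > z_threshold

# A user who normally performs about 1,000 actions a day suddenly performs 10,000
history = [980, 1020, 1005, 990, 1010, 995, 1000]
print(is_unusual(history, 10000))   # True  -> worth a closer look
print(is_unusual(history, 1015))    # False -> normal variation

The same test applies to any count you track per day: logons per user, connections per source IP, or first-seen executables per machine.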

The response from the Beta customers indicates that we are onto something. After all, anything that makes our (hopefully ever more numerous) customers’ lives less tedious and their backs hurt less is all good!

Steve Lafferty

Let he who is without SIM cast the first stone

In a recent post, Raffael Marty points out the shortcomings of a “classic” SIM solution, including high cost due in part to a clumsy, expensive tuning process.

More importantly, he points out that SIMs were designed for network-based attacks, and these are on the wane, replaced by host-based attacks.

At Prism, we’ve long argued that a host-based system is more appropriate and effective. This is further borne out by the appearance of polymorphic strains such as Nugache that now dominate Threatscape 2008.

However, is “IT Search” the complete answer? Not quite. As a matter of fact, no such “silver bullet” has ever worked out. The fact is that users (especially in the mid-tier) are driven by security concerns, so proactive correlation is useful (in moderation); compliance remains a major driver; and event reduction with active alerting is absolutely essential for the overworked admin. That said, “IT Search” is a useful and powerful tool in the arsenal of the modern, knowledgeable Security Warrior.

A “Complete SIM” solution is more appropriate for the enterprise. Such a solution blends the “classic” approach, based on log consolidation and multi-event correlation from host and network devices, PLUS a white/greylist scanner PLUS the Log Search function. Long-term storage and flexible reporting/forensic tools round out the ideal feature set. Such a solution has a better chance of satisfying the different user profiles: Auditors, Managers and Security Staff, many of whom are less comfortable with query construction.

One-dimensional approaches such as “IT Search” or “Network Behavior Anomaly Detection” or “Network Packet Correlation,” while undeniably useful, are by themselves limited.

Complete SIM, IT Search included, that’s the ticket.

– Ananth

When you can’t work harder, work smarter

In life and business, the smart approach is to make the most of what you have. You can work 8 hours, then 10, then 12 hours a day and still hit your performance limit. How do you get more out of your work? By working smarter, not harder: get others on board, delegate, communicate. Nowhere is this truer than with computer hardware. Poorly written software makes increasing demands on resources but cannot deliver quantum jumps in performance.

As we evaluated earlier versions of EventTracker, it became clear that we were reaching the physical limits of the underlying hardware, and that the path to faster reports was not to work harder (optimize code) but to work smarter (plan up front, divide and conquer, avoid searching through irrelevant data).

This is realized in the Virtual Collection Point architecture available in version 6. By segregating log sources up front into virtual groups, each with its own stack of software processes from reception to archiving, performance improves FOR THE SAME HARDWARE!
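
Conceptually, the divide-and-conquer idea looks like the sketch below: log sources are assigned to groups up front and each group gets its own processing pipeline, so no pipeline (and no later search) has to wade through another group’s data. This is only an illustration of the principle; the group names and the worker function are hypothetical, not EventTracker’s actual Virtual Collection Point code.

from multiprocessing import Process, Queue

def pipeline(name, q):
    """One dedicated reception -> index -> archive pipeline per virtual group."""
    while True:
        event = q.get()
        if event is None:                          # sentinel: shut this pipeline down
            break
        print(f"[{name}] archived: {event}")       # stand-in for indexing and archiving

if __name__ == "__main__":
    # Assumed grouping of log sources into virtual collection points
    groups = {"firewalls": Queue(), "windows_servers": Queue(), "databases": Queue()}
    workers = [Process(target=pipeline, args=(name, q)) for name, q in groups.items()]
    for w in workers:
        w.start()

    # Events are routed to their group's queue on arrival, so each pipeline
    # only ever touches the data that belongs to it.
    groups["firewalls"].put("deny tcp 10.1.1.5 -> 8.8.8.8:53")
    groups["windows_servers"].put("EventID 4625 failed logon")

    for q in groups.values():
        q.put(None)
    for w in workers:
        w.join()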

When comparing SIEM solutions for scalability, remember that if the only path is to add more hardware, it’s a weaker approach than making the best of what you already have.

– Ananth

Are you worth your weight in gold?

There is an interesting article by Russell Olsen on Windows Change Management on a Budget.

He says: “An effective Windows change management process can be the difference between life and death in any organization. Windows managers who understand that are worth their weight in gold…knowing what changed and when it changed makes a big difference especially when something goes wrong. If this is so clear, why do so many of us struggle to implement or maintain an adequate change control process?”

Olsen correctly diagnoses the problem as one of discipline and commitment. Like exercising regularly, it’s hard… but there is overwhelming evidence of the benefits.

The EventTracker SIM Edition makes it a little easier by automatically taking system (file and registry) snapshots of Windows machines at periodic intervals for comparison either over time or against a golden baseline.
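
The file half of such a snapshot can be pictured with the short sketch below: hash the files of interest, store the result as a golden baseline, and report anything that was added, removed or modified. The monitored path and the choice of SHA-256 are illustrative assumptions, not the EventTracker implementation.

import hashlib
import os

def snapshot(root):
    """Map each file under `root` to a SHA-256 hash of its contents."""
    result = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    result[path] = hashlib.sha256(f.read()).hexdigest()
            except OSError:
                pass                     # skip files we cannot read
    return result

def compare(baseline, current):
    """Return files added, removed, or modified since the baseline."""
    added = current.keys() - baseline.keys()
    removed = baseline.keys() - current.keys()
    modified = {p for p in baseline.keys() & current.keys() if baseline[p] != current[p]}
    return added, removed, modified

# Take a golden baseline once, then compare against fresh snapshots periodically
golden = snapshot(r"C:\Windows\System32\drivers")    # hypothetical monitored folder
added, removed, modified = compare(golden, snapshot(r"C:\Windows\System32\drivers"))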

Given the gap between outbreak and vaccine for malware and attacks, as well as the potential for innocuous human error when dealing with complex machinery, the audit function makes it all worthwhile. The CSI 2007 survey shows the annual loss from such incidents to be $350,000.

Avoiding such losses (and getting regular exercise) will make you worth your weight in gold.

– Ananth

Why words matter…or do they?

This is starting to turn into a bit of a bun fight, which was not my intent; I was merely attempting to clarify some incorrect claims in the Splunk post. Now Anton has weighed in with his perspective:

“I think this debate is mostly about two approaches to logs: collect and parse some logs (typical SIEM approach) vs collect and index all logs (like, ahem, “IT search”).”

Yes, he is right in a sense. It is a great, concise statement, but it needs a closer look, as there are some nuances here that need to be understood.

Just a bit of level-setting before getting to the meat of the statement.

Most SIEM solutions today have a real-time component (typically a correlation engine) and some kind of analytics capability. Depending on the vendor, some do one or the other better (and of course we all package and price them differently).

Most of the “older” vendors started out as correlation vendors targeting the Fortune 2000, enabling real-time threat detection in the SOC. The analytics piece was a bit of a secondary requirement, and secure, long-term storage not so much at all. The Gartner guys called these vendors SEM, or Security Event Management, providers, which is instructive: “event” to me implies a fairly short-term context. Since 2000, the analytics and reporting capability has become increasingly important as compliance has become the big driver. Many of the newer vendors in the SIEM market focused on solving the compliance use case, and these solutions typically featured secure, long-term storage, compliance packs, good reporting, etc. These newer vendors were sometimes referred to as SIM, or Security Information Management, vendors, and they fit a nice gap left in the capabilities of the correlation vendors. Some of them, like Loglogic, made a nice business selling log collection solutions to large enterprises, typically as an augmentation to an existing SIM. Others, like Prism, focused on the mid-tier and provided lower-cost, easy-to-deploy solutions that covered compliance and also provided real-time capabilities to companies that did not have the money or the people to afford the enterprise correlation guys. These companies had a compliance requirement and wanted to get some security improvements as well.

But really, all of us (SIM/SEM, enterprise, mid-tier, Splunk) were and are collecting the same darn logs; we were just doing slightly different things with them. So of course the correlation guys have released log aggregators (like Arcsight Logger), and the Log Management vendors have added, or always had, real-time capability. At the end of the day we all got lumped into the SIEM bucket, and here we are.

For anyone with a SIEM requirement: understand what your business requirements are, and then look long and hard at each vendor’s capability, preferably by getting them in house to do an evaluation in your own environment. Buying according to which one claims the most events per second, or supports the most devices, or even has the most mindshare in the market is really short-sighted. Nothing beats using the solution in action for a few weeks; this is a classic case of “the devil is in the details.”

So, back to Anton’s statement (finally!). When Anton refers to “collect and parse some logs,” that is the typical simplification of the real-time security use case: you are looking for patterns of behavior, and only certain logs matter because you are looking for attack patterns in specific event types.

“Collect and index all the logs” is the typical compliance use case. The indexing is simply the method of storage that allows efficient retrieval during analysis, again a typical analytics requirement.
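
To make the “index for efficient retrieval” point concrete, here is a toy inverted index: every token maps to the set of log records containing it, so a search only reads records that can possibly match instead of scanning everything. This illustrates the general technique only; it is not any vendor’s storage format, and the sample log lines are made up.

import re
from collections import defaultdict

logs = [
    "Oct 12 10:01:33 fw01 deny tcp 10.1.1.5 -> 93.184.216.34:443",
    "Oct 12 10:01:35 srv07 EventID 4625 failed logon for user jsmith",
    "Oct 12 10:02:01 fw01 permit udp 10.1.1.9 -> 8.8.8.8:53",
]

index = defaultdict(set)                 # token -> ids of records containing it
for i, line in enumerate(logs):
    for token in re.findall(r"[\w.]+", line.lower()):
        index[token].add(i)

def search(*terms):
    """Return only the records containing every term (AND semantics)."""
    hits = set.intersection(*(index.get(t.lower(), set()) for t in terms))
    return [logs[i] for i in sorted(hits)]

print(search("fw01", "deny"))            # only the matching firewall record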

Another side note: whether to collect all the logs is a risk assessment the end user should make. Many people collect “all” the logs because they don’t know what is important, and collecting everything is deemed the easiest and safest approach. The biggest beneficiaries of that approach are the SIEM appliance vendors, who get to sell another proprietary box when the event volume goes through the roof, and of course those individuals who hold stock in EMC. Despite compression, a lot of logs is still a lot of logs!

Increasingly, the customers I talk to are making a conscious decision not to collect or retain all the logs, because storing them carries overhead and a security risk; they consider logs sensitive data. Quite frankly, you should look for a vendor that allows you to collect all the data but also provides fairly robust filtering capability in case you don’t want or need to. That is a topic for another day, however.

So when Anton claims that you need to do both, if you want to do real-time analysis as well as forensics and compliance, then yes, I agree. But when he claims that “collect and parse” is the typical SIEM approach, that is an overgeneralization, which really was the point of my post to begin with: I tend not to favor overgeneralizations because they simply misinform the reader.

– Steve Lafferty

More thoughts on SIEM vs. IT Search

I posted a commentary a while ago on a post by Raffy, who discussed the differences between IT Search (or Splunk, as they are the only folks I know who are trying to make IT Search a distinct product category) and SIEM. Raffy posted a clarification in response to my commentary. What I was pointing out in my original post was that all vendors, SIEM or Splunk, are loading the same standard formats, and what needs to be maintained is, in fact, not the basic loader but the knowledge (the prioritization, the reports, the alerts, etc.) of what to do with all that data. That knowledge is a core part of the value that SIEM solutions provide. On that we seem to agree. And as Raffy points out, the Splunk guys are busily beavering away producing knowledge as well. Although be careful: you may wake up one morning and find that you have turned into a SIEM solution!

Sadly, the notion of the bad “parser” or loader continues to creep in: Splunk supposedly does not need one, which is good; SIEM systems do, which is bad.

I am reasonably familiar with quite a few of the offerings out there for doing SIEM/log management, and quite frankly, outside of perhaps Arcsight (I am giving Raffy the benefit of the doubt here, as he used to work at Arcsight, so he would know better than I), I can’t think of a vendor that writes proprietary connectors or parsers simply to load raw data. We (EventTracker) certainly don’t. From an engineering standpoint, when there are standard formats like Windows EVT, Syslog and SNMP, it would be pretty silly to create something else. Why would you? You write them only when there is a proprietary API or data format, like Checkpoint, where you absolutely have to. No difference here. I don’t see how this parser argument is in any way, shape or form indicative of a core difference.

I am waiting on Raffy’s promised follow-on post with some anticipation; he states that he will explain the many other differences between IT Search and SIEM, although he prefaced some of it with “Splunk is Google-like, and Google is God, ergo…”

Google was, and is, a game-changing application, and there are a number of things that made it unique: easy to use, fast, and able to return valuable information. But what made Google a gazillion-dollar corporation is not natural-language search; that is nice, but simple “and,” “or,” “not” is really not a breakthrough in the grand scheme of things. The speed of a Google search is pretty impressive, but that is due to enormous server farms, so that part is mechanical. Most of the other early internet search vendors had both of these capabilities. My early personal favorite was AltaVista, but I switched to Google a long time ago.

Why? What absolutely blew my socks off, and continues to do so to this day, is Google’s ability to figure out which of the 10 million entries for my arbitrary search string are the ones I care about, and to provide them, or some of them, to me in the first hundred entries. They find the needle in the proverbial haystack. Now that is spectacular (and highly proprietary), and the ranking algorithm is a closely guarded secret, I hear. Someone once told me that a lot of it is based on ranking derived from the millions of people doing similar searches; it is the sheer quantity of search users on the internet. The more searches they conduct, the better they become. I can believe that. Google works because of the quantity of data and because the community is so large, and they have figured out a way to put the two together.

I wonder, however, how an approach like that would work when you have a few admins searching a few dozen times a week. Not sure how that will translate, but I am looking forward to finding out!

– Steve Lafferty