NY Times article on Stuxnet worm

The Sunday NY Times article has interesting background on the Stuxnet worm. It highlights how effective state-sponsored cyber warfare can be.

How will history perceive the cyber events of 2010, including Google’s standoff with China earlier in the year?

In this environment Data is King, and logs are the data for detecting and understanding all the cyber warfare events of last year.

Posted in Logging

Evolution of Log Management

Log management (sometimes referred to as SIEM) has evolved over the years that I have been working on it.  I have seen several significant stages of how organizations create, collect, search, and report on their log data.  It is interesting to look at the past decade of log management and try to think about what will influence the next major changes.  I’ve tried to summarize some of the stages from my perspective:

1990s – Roll your own logging solution

A decade ago most organizations were doing spotty log analysis.  Usually a log server was set up for a specific application or device on the network.  An engineer would set up a syslog server and run some Perl and grep scripts against existing log data.  Log files were rotated on a scheduled interval, and only the most knowledgeable engineer could write the scripts and know what to look for in the data.  Occasionally, a Perl script would run as a CGI application on a web server so the log data could be viewed through a web browser.  Application developers were typically writing the log analysis tools for their own application.

Best practices were not well documented and there were few organizations that specialized in analyzing logs.  Security focused organizations like SANS and CERT had written a few articles about log management, but there were no guidelines like ITIL and no regulations existed that would require companies to review any of their log data.  Logs were not perceived as very important and in many cases were not reviewed or stored for very long.

There weren’t many commercial tools that made centralized log collection, retention, reporting, and alerting easy.  Very few organizations wanted to develop their own solutions and there wasn’t a big enough financial incentive for executives to  invest in those technologies, but that was all about to change.

2001 – Security Information Management (SIM)

Security was the first comprehensive driver for IT organizations to look at their log data.  Security Information Management tools came along soon after the proliferation of Intrusion Detection Systems (IDS) and were primarily used to reduce the number of false positives those systems were notorious for generating.  The goal of SIM products was to uncover the real security issues from the noise that was in the log data.

SIM applications identified and alerted users to security threats by normalizing and correlating log data from multiple sources.  For example, if an IDS sent a log message that an attack signature was recorded on the network and vulnerability assessment software indicated that the targeted system was vulnerable to that kind of attack, then the SIM alerted someone to the attack.  This correlation became especially useful between security, network, and system devices.
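The IDS example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular SIM product worked: the record types, field names, addresses, and the use of a CVE ID as the shared "signature" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdsAlert:
    target_host: str
    signature: str  # assumed: a shared attack identifier such as a CVE ID


@dataclass(frozen=True)
class VulnFinding:
    host: str
    signature: str  # assumed: same identifier scheme as the IDS


def correlate(alerts, findings):
    """Keep only alerts whose target is known to be vulnerable to that attack."""
    vulnerable = {(f.host, f.signature) for f in findings}
    return [a for a in alerts if (a.target_host, a.signature) in vulnerable]


# Hypothetical sample data: two IDS alerts, one vulnerable host.
alerts = [
    IdsAlert("10.0.0.5", "CVE-2008-4250"),
    IdsAlert("10.0.0.9", "CVE-2008-4250"),
]
findings = [VulnFinding("10.0.0.5", "CVE-2008-4250")]

actionable = correlate(alerts, findings)
```

Only the alert against 10.0.0.5 survives, because that is the only host the scanner reported as vulnerable.  The hard part in practice was getting both sources normalized so that the host and signature fields could be compared at all.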

SIM vendors needed to develop log parsing software and normalization for each source device they collected so they could run them through their correlation engines.  This made SIM tools very expensive and difficult to maintain because it required an organization to constantly write correlation rules about what was normal traffic.   It usually required a staff of very talented (and expensive) security experts.  It also required a large investment in hardware and software because correlation rules required a lot of CPU cycles and memory.

SIM tools were very useful because they identified and alerted their users to potential security issues without having to read through all the log data, but because of their complexity and expensive price tag, a lot of organizations couldn’t use them.  One valuable feature that the SIM products showed to all organizations was the centralized collection and retention of log data that was then available for reporting and forensics analysis.

2004 – PCI and log management appliances

In 2004 the Payment Card Industry Data Security Standard (PCI) was created.  It set a requirement that any business that wanted to accept credit cards for payments had to implement a long list of security controls that included log management.  Those best practices included how long to keep the log data (1 year), in what format (tamper-proof), how often someone had to review it (daily), and many other requirements.  If organizations wanted to continue to collect credit card payments they would have to meet the guidelines and be audited regularly or they would face financial penalties.

When PCI was created companies across several industries started looking for log management solutions.  Security was no longer the only reason; they now needed to meet PCI guidelines.  The IT department that was required to meet the guidelines was typically short-staffed, so the most likely choice was to acquire tools that wouldn’t take much time to set up and maintain, but could meet all the requirements.  IT managers often turned to log management appliances that would collect the log data, parse it into reports, and provide secure storage of the sensitive data.

These log management appliances often were built for creating reports that could be used for security and to pass compliance audits.  The appliance would collect the log data and then parse it into a relational database for normalization and reporting.  These appliances excelled at finding the log data from a specific time period, but as the volume of log data increased or the period that people wanted to analyze it grew, the database became very slow.
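As a rough sketch of the parse-into-a-database approach, the Python below loads a few log lines into an in-memory SQLite table and runs a time-bounded report.  The line format, table layout, and sample data are assumptions for illustration; real appliances used their own schemas and parsers.

```python
import re
import sqlite3

# Assumed line format: "<timestamp> <host> <program>: <message>"
LINE_RE = re.compile(r"^(\S+) (\S+) (\w+): (.*)$")


def load(conn, lines):
    """Parse each line into fields and insert it; return the number of rows loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(ts TEXT, host TEXT, program TEXT, message TEXT)"
    )
    # Lines that do not match the expected format are silently skipped.
    rows = [m.groups() for m in map(LINE_RE.match, lines) if m]
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    return len(rows)


# Hypothetical sample log lines.
lines = [
    "2010-01-15T08:00:01 web01 sshd: Failed password for root",
    "2010-01-15T08:00:05 web01 sshd: Accepted password for alice",
    "2010-01-16T09:30:00 db01 cron: job started",
    "unparseable line",
]

conn = sqlite3.connect(":memory:")
loaded = load(conn, lines)

# Report: events per host for one day (ISO timestamps sort correctly as text).
report = conn.execute(
    "SELECT host, COUNT(*) FROM events "
    "WHERE ts >= ? AND ts < ? GROUP BY host ORDER BY host",
    ("2010-01-15", "2010-01-16"),
).fetchall()
```

The sketch also shows the weakness of the approach: the fourth line does not fit the expected format, so it never reaches the database and can never appear in a report.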

Most of the SIEM and log management products support specific source log types because the products need to parse the log data into common formats or database schemas for analysis.  These common formats allow the software to apply meaning to the log data so it can be used for correlation rules or a common reporting engine (e.g., compliance reports).  These common formats or database schemas also prevent the products from being flexible enough to easily support other applications.

2008 – Full text indexing and fast search

By 2008 most log management products had added full text indexing to provide search and troubleshooting capabilities for applications, especially the ones for which vendors weren’t going to build parsing code or support.  By indexing all messages (full text indexing), the data could be searched quickly for key words and organized by words instead of parsed fields.  Usually a ‘word’ in a full text index was anything that was between two delimiters like a space or a comma.  Full text indexing allowed all the words, or the full messages, in a log file to be put in an index and retrieved when the software searched the data.  Simple reporting could be generated by doing a full text index search and then parsing the results, thereby parsing only the records that meet the search criteria rather than the whole data set.
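A minimal sketch of that idea in Python follows: split each message on spaces and commas, map every word to the messages that contain it, and parse only the search hits.  The function names, the sample messages, and the choice to lowercase words are assumptions for the example; production indexes are far more elaborate.

```python
import re
from collections import defaultdict

# A "word" is anything between delimiters; here, spaces or commas.
DELIMS = re.compile(r"[ ,]+")


def build_index(messages):
    """Map each lowercased word to the set of message numbers containing it."""
    index = defaultdict(set)
    for i, msg in enumerate(messages):
        for word in DELIMS.split(msg):
            if word:
                index[word.lower()].add(i)
    return index


def search(index, *words):
    """Return the sorted message numbers that contain every given word."""
    sets = [index.get(w.lower(), set()) for w in words]
    return sorted(set.intersection(*sets)) if sets else []


# Hypothetical sample messages.
messages = [
    "sshd Failed password for root from 10.0.0.7",
    "sshd Accepted password for alice from 10.0.0.8",
    "cron job started, user=root",
]

index = build_index(messages)
hits = search(index, "failed", "password")

# Simple reporting: parse only the records that matched the search.
sources = [messages[i].rsplit(" ", 1)[-1] for i in hits]
```

Note that a search for "root" finds only the first message.  In the third message the indexed word is "user=root", because "=" is not a delimiter in this sketch, which hints at why word-based search alone does not carry the meaning that parsed fields do.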

Searching log data and using the simple reports changed the way organizations used log data and identified the information they needed.  It is the best way to find the needle in the haystack, but it still requires talented engineers to know what to search for in their applications.  Full text indexing and search also allow for less work on application logs that were not traditionally supported by vendor common formats or database schemas.

Searching could be used for forensics or troubleshooting an application, but since the data was not parsed and normalized into fields it couldn’t be used effectively for security correlation, anomaly detection, or other analysis that requires semantic information or relationships stored with the data.

2010 and beyond – What is next?

Log management will continue to evolve over the coming years.  With the volume of log data growing an average of 20% (in some cases over 100%) annually at organizations, the log management tools that were designed five or ten years ago are not scalable for the log volume expected tomorrow.  Scalability, flexibility, and functionality need to continue to evolve in log management products.

Several developments are changing the way logs are utilized:

  • The Cloud.  Many people point to the cloud as the future of log management.  Cloud applications will definitely impact who owns the log data and where it originates.  Also, it should change the volume and complexity of centralized log management.  Data center logs won’t go away; they will probably continue to grow.  Most organizations will need to come up with a strategy on how to monitor the log data that they have in the data center and the log data from cloud applications.
  • HIPAA.  In the US, HIPAA regulations have the potential to drive log management and security in the health care and insurance industries.  Also, there are regulations coming in European countries that will influence many international organizations.
  • New Applications.  Fraud detection and cyber warfare are driving many of the financial and government organizations to make large investments in SIM and log management tools.
  • Big Data.  Hadoop/MapReduce and NoSQL applications are often discussed as having potential for large scale log management applications.  I see more and more organizations turning to these tools for their largest log data archives and problems.  I heard from one Hadoop vendor this year that over half of their customers were using Hadoop for log analysis.

As log data continues to grow in organizations and the requirements for security, application troubleshooting, and business intelligence increase, there will be a need to utilize new techniques or technologies.  This is what keeps me interested and excited about the future of log management.  I’m just trying to stay ahead of the curve.

Posted in Logging

What I learned from Malcolm Gladwell

Do you ever get the feeling that you should be doing something else? Do you wonder what you would do if you had a year to do research without having to worry about getting paid? I’ve always thought that academic sabbaticals would be great for everyone, even people who don’t work in academia. Over the last year I have had a rare opportunity to try different jobs, research some of my interests, and basically take time to figure out what I want to do next.

After leaving my last log management startup, I spent several months in a fog trying to figure out what to do next without doing much at all. I spent some time working with a couple of startups that had nothing to do with log management, but I felt like I was wasting both my time and the knowledge that I had developed over the previous ten years. Also, I felt like I knew too little about the other technology at the time; it felt like I was starting all over again. It just wasn’t fun or interesting.

During this time I started reading the book Outliers by Malcolm Gladwell. In his book Gladwell examines the factors that contribute to high levels of success for people. He examines how some people become very successful – from professional hockey players, to musicians, to technology tycoons (like Steve Jobs and Bill Gates). The book shows that these people weren’t overnight sensations, but rather that it took many years of work, a passion for what they were doing, good timing, and lucky breaks along the way to become successful and become an ‘Outlier’.

All this got me to thinking about my own situation. To be most successful at what I was doing, I needed to focus on my knowledge base and the things about which I was passionate. This took me down the path of researching the opportunities in log management and the direction technology was heading. Finally, I hoped luck and good timing would be on my side.

My conclusion was that log management is a very hot field, and I had spent over ten years helping to develop it. It was where I needed to be.

I have spent the past nine months doing research and talking to friends and former colleagues who work with logs. I have found that there are many opportunities for innovative software companies working with log data. It’s exciting to see that in the large enterprise there has been a tremendous shift in the IT operations staff and an increased focus on log data. Governance, Risk, and Compliance have really pushed security and IT operations staff to focus on collecting and analyzing corporate log data. It started with the SIEM vendors collecting security logs. It has moved to log management vendors expanding to network and operating system log data.

My self-imposed sabbatical is almost over. Thank you, Malcolm Gladwell!

It has been a few months since my last post. I have been busy working on my next startup idea. I plan to announce what I am doing over the next couple of weeks.

Posted in loglogic, startup