Going through HOOPs pt.2

Last week we posted part 1 of this blog series to try and introduce the vision behind HOOP Cyber. Our key mission is to enable our customers’ success in their journey to cybersecurity maturity. This week we want to talk about data storage considerations.

At HOOP we understand the importance of data-source heterogeneity: the need to collect and operationalise data from multiple sources in order to effectively know about, detect and respond to your cybersecurity incidents. It's not an easy task, and in part 1 we explored some of the challenges of collecting that data. This week we focus on what to think about once we get our hands on the data: where do we store it, what options are there, how long do we keep it, and what does it mean for us to have different data stores?

The “Four Seasons” of Data Readiness

Traditionally, we think of data stores in terms of readiness: hot, warm, cold, frozen. This split exists because of technological constraints, driven primarily by cost! As technology improves and scales, costs come down, and before you know it everyone is walking around with an LLM in their pocket. We are not there yet, so for now we have to deal with storing data in multiple locations, each serving different needs.

At HOOP we break the data store problem down into two main storage mechanisms: short-term storage and long-term storage.

For us, short-term storage holds the very active, readily available data sets that allow customers to perform day-to-day operations. For that we tend to want to store data for 30 to 90 days; some might argue you need 180 days or even 12 months, others only 7 days! The reality is that everyone is right, as different security use cases require different retention periods. As an example, if you read IBM’s Cost of a Data Breach report you will see that the average ransomware breach takes 237 days to identify and 89 days to contain, while if you read Mandiant’s M-Trends report the average dwell time for ransomware is 5 days! At the same time, Mandiant states that the most successful customers were the ones that retained their data for 10 times the global dwell time of an attack, which is 24 days (so 240 days!!). So, that gives you a rough idea of the landscape. The bottom line is that whatever you do, you need to be able to get to your data fast! And that, right there, is what we consider “short-term” storage.
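To make the "10 times the dwell time" rule of thumb concrete, here is a minimal sketch of the arithmetic. The function name and the default multiplier are our own illustrative choices, not something taken from either report, and the 24-day figure is simply the one quoted above.

```python
def retention_days(dwell_time_days: int, multiplier: int = 10) -> int:
    """Return a retention window as a multiple of attacker dwell time.

    Both the helper and the default multiplier are illustrative
    assumptions based on the rule of thumb quoted in the text.
    """
    if dwell_time_days < 0 or multiplier < 1:
        raise ValueError("dwell time must be >= 0 and multiplier >= 1")
    return dwell_time_days * multiplier


# Global dwell time figure quoted in the text above.
GLOBAL_DWELL_DAYS = 24

print(retention_days(GLOBAL_DWELL_DAYS))  # 240
```

Swap in the dwell time that matters for your own use case to see how quickly the target retention period grows.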

For long-term storage, the first two things that come to mind are S3 and compliance. S3 is the most common long-term storage solution right now, and for good reason. The downside is that it takes “ages” to get to that data, and it also costs a lot, as you need to retrieve it and ingest it into your platform. The most common use case is compliance or regulatory needs, where customers are asked to store data for longer periods of time.

Breaking Paradigms

Wouldn't it be great to have a place where you can have all your data in one place, to be able to search it and create dashboards based on the whole picture rather than a subset? What a dream, right? I know! 

Well, we’ve come a long way and technology has become better. There are up-and-coming platforms that provide great compression - to minimise storage cost - and great search times - to quickly get to your answers - which ultimately minimises the TCO and maximises the ROI you get from putting your data in. They are also open platforms that allow you to work with a variety of open source tools to get your data in. These are not here to replace your existing tooling, but rather to enhance it and minimise your security gaps - your MTTRs and MTTDs. And let's not forget to make you look good (I'm looking at you, CISOs)!

There are now ways to clean and optimise your data sets so that your SIEM has what it needs to perform optimally, while keeping the noise handy for when you need to dive deeper. A platform that offers such a capability is what most still refer to as Humio (now known as Falcon LogScale). I was lucky to have been part of the field team for a time and I can say with confidence that it works. It makes it easy to get your data in, provides great cost savings - primarily due to the high compression it applies to the data in its data store - and it’s super fast to search through your data with a very powerful query language.

So what should you do? Consider a data strategy with two data platforms that let you search your data with equal search power: one for short-term needs, your SIEM for example, and one that lets you store data for the longer term and gives you answers across larger data sets - when you most need them.
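As a rough illustration of the two-platform idea, the sketch below picks which store to query based on how old the data is. Everything here is hypothetical: the store labels, the function name and the 90-day window are assumptions made for the example, not a description of any particular product.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed short-term retention window; tune this to your own use cases.
SHORT_TERM_RETENTION = timedelta(days=90)


def pick_store(
    event_time: datetime,
    now: Optional[datetime] = None,
    short_term_retention: timedelta = SHORT_TERM_RETENTION,
) -> str:
    """Return which (hypothetical) store should hold an event of this age."""
    if now is None:
        now = datetime.now(timezone.utc)
    age = now - event_time
    # Recent events live in the short-term platform (for example a SIEM);
    # anything older is expected in the long-term platform.
    return "short_term" if age <= short_term_retention else "long_term"


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(pick_store(now - timedelta(days=7), now=now))    # short_term
print(pick_store(now - timedelta(days=200), now=now))  # long_term
```

In practice the routing would more likely be by data type and use case than by age alone, but the age cut-off is the simplest way to show the split.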

As ever, if you have any questions feel free to reach out to us. We are more than happy to help you with your data needs for cybersecurity!

References

  1. https://www.ibm.com/reports/data-breach
  2. https://mandiant.widen.net/s/bjhnhps2mt/m-trends-2022-report
