Cyber Security Breakdown 101 Series. Part 5 – What is a SIEM really?
A SIEM, or Security Information and Event Management solution, is in its base form a database that collects logs from your environment, with some form of search layer on top that makes these logs searchable by parameters such as time, date, or the system or service the logs were collected from. That way you can go back and look at what happened at a given time or within a given timeframe, to verify or rule out that something particular happened.
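To make the "database plus search layer" idea concrete, here is a minimal sketch in Python. It is not how any particular SIEM product works. The names LogEvent, LogStore, ingest and search, and the sample log lines, are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogEvent:
    timestamp: datetime
    source: str   # system or service the log was collected from
    message: str


class LogStore:
    """Toy stand-in for the storage and search layer of a base SIEM."""

    def __init__(self):
        self._events = []

    def ingest(self, event):
        self._events.append(event)

    def search(self, start, end, source=None):
        """Return events inside [start, end], optionally limited to one source."""
        return [
            e for e in self._events
            if start <= e.timestamp <= end
            and (source is None or e.source == source)
        ]


# Hypothetical sample data
store = LogStore()
store.ingest(LogEvent(datetime(2024, 3, 1, 9, 15), "fileserver01", "anna opened report.docx"))
store.ingest(LogEvent(datetime(2024, 3, 1, 22, 3), "fileserver01", "anna copied customers.xlsx"))
store.ingest(LogEvent(datetime(2024, 3, 1, 22, 5), "vpn-gw", "login for anna from new address"))

# "What happened on the file server that evening?"
evening = store.search(
    datetime(2024, 3, 1, 20, 0),
    datetime(2024, 3, 1, 23, 59),
    source="fileserver01",
)
for event in evening:
    print(event.timestamp, event.source, event.message)
```

Note that nothing here raises an alarm. A human has to ask the question, which is exactly why the base SIEM is a reactive tool.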
So, in its base form a SIEM is very much a regulatory system to verify that you live as you should. For instance, ISO 27001 requires you to be able to verify that you adhere to the standard over time, not just “right now”.
But it also comes in handy if you experience a breach or a data leak, where a SIEM may be able to tell you how, what, when and why, by letting a human analyse the logs and follow the thread back to the breach or leak.
Actually a base SIEM is quite dumb really, and is only used for reactive validation purposes.
To get a SIEM to go from reactive to active you need to add intelligence, and this is usually done by adding layers referred to as UEBA and SOAR to the log-collection and storage that is the SIEM base.
UEBA or User and Entity Behaviour Analytics
Or UBA in the case of IBM and Splunk, which stands for User Behaviour Analytics.
A UBA analyses user behaviour and builds a baseline of what is “normal” for each user, so that if the user starts behaving outside that baseline the UBA module can detect this and set off an alarm.
This can be a lot of things. For example, the user normally works between 08:30 and 16:30, Monday to Friday, and during this time works in the Office and Adobe suites and only touches files in the marketing department file share.
If that user suddenly starts fiddling with files in the sales folders, downloading a copy of “presumptive customers 2024.xlsx” on a normal weekday but at 22:00, from an IP address in a whole other country where the user has not worked before, the UBA would flag this as suspicious behaviour and put a risk score on it. The risk score usually runs from 0 to 100. A “normal” user might sit between 25 and 40, whilst IT personnel might be given a risk score of 40 to 65 or even higher depending on their tasks.
For example, using RDP to connect to a lot of machines and logging on to them with elevated privileges would put the risk score significantly above that of a normal user, and if the user also uses PowerShell to make changes on servers, the score would go even higher.
So the UBA module takes the logs that have been stored, sorted and made searchable in the SIEM and uses machine learning to work out which of them are RELEVANT, so that the SIEM can also become an alerting system that detects and alerts in “almost” real time.
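The baseline and risk score idea above can be sketched roughly like this. A real UBA module learns the baseline with machine learning. This sketch uses fixed rules instead, and the Baseline class, the risk_score function, the country codes and all the weights are assumptions picked only to illustrate the principle.

```python
from dataclasses import dataclass, field


@dataclass
class Baseline:
    """What "normal" looks like for one user (all values are illustrative)."""
    start_hour: int = 8
    end_hour: int = 17
    usual_shares: set = field(default_factory=lambda: {"marketing"})
    usual_countries: set = field(default_factory=lambda: {"SE"})
    base_risk: int = 30  # a "normal" user sits somewhere around 25 to 40


def risk_score(baseline, hour, weekday, share, country):
    """Score one file access from 0 to 100; weekday uses Monday=0 .. Sunday=6."""
    score = baseline.base_risk
    in_hours = baseline.start_hour <= hour < baseline.end_hour and weekday < 5
    if not in_hours:
        score += 20  # assumed weight: outside normal working hours
    if share not in baseline.usual_shares:
        score += 20  # assumed weight: file share the user never touches
    if country not in baseline.usual_countries:
        score += 30  # assumed weight: country the user has never worked from
    return min(score, 100)


marketing_user = Baseline()
normal = risk_score(marketing_user, hour=10, weekday=1, share="marketing", country="SE")
suspicious = risk_score(marketing_user, hour=22, weekday=1, share="sales", country="BR")
print(normal, suspicious)
```

With these made-up weights the ordinary access stays at the base score, while the 22:00 download from the sales share in an unfamiliar country hits the ceiling.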
The “almost” in real time comes from the way logs are collected. Say you pull logs from a client endpoint: the logs are generated within Windows, for instance, and then stored on the local hard drive. Once stored, some form of agent hands them over, at given polling intervals, to a log collector on the network. That log collector then pushes the logs to a backend (or a cloud SIEM), which receives, sorts and stores them. At that point the U(E)BA starts analysing the logs for sketchy behaviour, and this takes some time as well, as there is a metric shit ton of logs ingested into the solution at any given time.
All SIEM vendors seem to do stuff their own way, so the above is just an example of how a SIEM could be implemented.
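Each hop in the example pipeline above adds its own wait, and they add up. The arithmetic can be sketched as below. The stage names and all the numbers are hypothetical, since real intervals depend entirely on vendor and configuration.

```python
def worst_case_delay(stages):
    """Worst case in seconds: the event just misses every interval
    and has to wait a full one at each hop."""
    return sum(stages.values())


# Hypothetical per-hop delays in seconds
well_tuned = {
    "agent_handover_interval": 5,
    "collector_push_interval": 5,
    "backend_sort_and_store": 2,
    "analytics_run_interval": 10,
}
poorly_tuned = {
    "agent_handover_interval": 60,
    "collector_push_interval": 60,
    "backend_sort_and_store": 30,
    "analytics_run_interval": 120,
}

print(worst_case_delay(well_tuned))    # 22 seconds
print(worst_case_delay(poorly_tuned))  # 270 seconds, i.e. 4.5 minutes
```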
For UBA/UEBA, most SIEMs offload the machine learning to the SIEM vendor's own datacentre/private cloud to be able to do it efficiently, so keep that in mind if you are shopping for a SIEM with UBA/UEBA and want an on-prem solution.
So for a log to be generated on the client, sent to the SIEM and analysed for sketchy behaviour, it could take anything from a few seconds up to a few minutes for not so smartly configured or undersized implementations.
If you can ingest logs via API integrations from, say, cloud services such as MS Azure and 365, however, the API can stream data continuously to the SIEM, cutting out the local retention delay you get with the client. That gives you a very efficient system with near real time alarms, and in some cases even automatic responses.
For instance, there is something often called “land speed detection” (also known as “impossible travel”) that finds logins so far apart in distance and so close in time that no human being could have travelled that fast. So for example, if a user logs in from New York, US at 13:00 and then from Amsterdam, Netherlands 45 minutes later… There is no way a user can travel from New York to Amsterdam within 45 minutes, and thus this is sketchy. If the user also enters the wrong password, this is an indicator of a breach attempt, and the SIEM would then be able to block any further connections to that account from that IP via the MS APIs. (Not all SIEMs have this function, and some have it only in the SOAR module and not in the U(E)BA module.)
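The land speed check itself is simple arithmetic: distance between the two login locations divided by the time between them. Below is a minimal sketch using the haversine great-circle formula. The 1000 km/h threshold, the function names and the city coordinates are assumptions for illustration, and timestamps are assumed to already be normalised to one time zone. Real products also have to deal with VPNs and imprecise IP geolocation, which this ignores.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_SPEED_KMH = 1000.0  # assumption: a bit above airliner cruise speed


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def is_land_speed_violation(login_a, login_b):
    """Each login is (timestamp, latitude, longitude) of the geolocated IP."""
    (time_a, lat_a, lon_a), (time_b, lat_b, lon_b) = login_a, login_b
    km = distance_km(lat_a, lon_a, lat_b, lon_b)
    hours = abs((time_b - time_a).total_seconds()) / 3600
    if hours == 0:
        return km > 0  # two places at the same instant
    return km / hours > MAX_PLAUSIBLE_SPEED_KMH


# Approximate city coordinates, same time zone assumed for both timestamps
new_york = (datetime(2024, 1, 1, 13, 0), 40.7128, -74.0060)
amsterdam = (datetime(2024, 1, 1, 13, 45), 52.3676, 4.9041)
amsterdam_next_day = (datetime(2024, 1, 2, 13, 0), 52.3676, 4.9041)

print(is_land_speed_violation(new_york, amsterdam))           # True
print(is_land_speed_violation(new_york, amsterdam_next_day))  # False
```

New York to Amsterdam is a bit under 6000 km, so 45 minutes would require several thousand km/h, while a login a day later is perfectly plausible.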
The E that is missing from UBA stands for Entity, meaning we extend the machine learning to more than just the logs collected in the SIEM. The Entity part extends detection to other sources, for example flow records, packet capture data and other datasets of interest. This increases the inflow of data and extends the detection capability, as we get a bigger dataset to look into.
Side note: This is just the same as with all "AI"... The "new AI wave" going on now isn't "new" technology, the techniques have been around for 10 years or more... Sure, we do have beefier hardware and such that increases the processing speed in all aspects, but the AI revolution isn't driven by new AI technology... The AI revolution is primarily driven by the amount of data we now offer the AI algorithms to analyse... So in essence you, and what you share on for instance social media, are what drives AI, not the code. Digital books and companies such as Storytel that digitalise books, in combination with sales figures, give the AI the raw data to become an author. The pictures combined with likes and dislikes on Instagram give the AI the possibility to create good images, etc.
In short, this means that the UBA can ONLY detect human and account behaviour, whilst the UEBA can also detect machine threats (machine actors). If, for instance, an application has been infected with some form of malware and starts doing stuff it generally does not, say establishing connections to a new DNS server, the UEBA would see this, detect it as an anomaly, and could trigger an alarm or let the SOAR module respond.
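The DNS example can be sketched as a per-entity baseline of destinations seen before. Again, a real UEBA would learn this statistically from flow records and similar data. The EntityBaseline class, the application name and the addresses (taken from the documentation IP ranges) are made up for illustration.

```python
from collections import defaultdict


class EntityBaseline:
    """Remembers which DNS servers each entity (application, host) normally uses."""

    def __init__(self):
        self._seen = defaultdict(set)

    def learn(self, entity, dns_server):
        self._seen[entity].add(dns_server)

    def is_anomalous(self, entity, dns_server):
        # .get() avoids creating an empty baseline for unknown entities
        return dns_server not in self._seen.get(entity, set())


baseline = EntityBaseline()
# Learning phase: hypothetical flow records from normal operation
baseline.learn("invoice-app", "192.0.2.53")
baseline.learn("invoice-app", "192.0.2.54")

print(baseline.is_anomalous("invoice-app", "192.0.2.53"))    # False, seen before
print(baseline.is_anomalous("invoice-app", "198.51.100.7"))  # True, new DNS server
```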
SOAR or Security Orchestration, Automation and Response then…
SOAR is the responding part of an NG (Next Generation) SIEM. Whilst the SIEM in itself is the storage and search function and the UEBA is the triage, the SOAR is the guardian.
When the UEBA detects something malicious and there is a suitable playbook for it, the SOAR playbooks automate the response.
A playbook is a set of static rules that work together, in a somewhat dynamic way, under the “If this, then do that” premise.
So if we take the Land Speed example above and extend it to SOAR: a connection from an IP in another country hits Office 365, the UEBA detects this and in some installations can block that specific connection, but it won’t take preemptive action. If a SOAR playbook is implemented, however, it could stipulate that, IF:
Land Speed and failed logon
Then: do nothing
Land Speed and failed logon a second time
Then: demote Azure AD account privileges, force two-factor authentication for successful logon, trigger alert to SOC
Land Speed and successful logon with failed 2-factor authentication
Then: lock Azure AD account, force account logout, kick all open sessions globally, reset two-factor token, blacklist IP in central firewalls, trigger critical alarm to SOC
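Written as code, the three rules above could look something like the sketch below. The function and the action names are hypothetical labels only. A real SOAR playbook would be built in the vendor's own tooling and would call the relevant APIs instead of returning strings.

```python
def land_speed_playbook(land_speed, failed_logons=0,
                        logon_succeeded=False, two_factor_failed=False):
    """Return the list of actions for one detection, checking the most severe rule first."""
    if not land_speed:
        return []
    if logon_succeeded and two_factor_failed:
        return [
            "lock_account",
            "force_logout",
            "kick_all_sessions",
            "reset_two_factor_token",
            "blacklist_ip_in_firewalls",
            "critical_alarm_to_soc",
        ]
    if failed_logons >= 2:
        return [
            "demote_account_privileges",
            "force_two_factor",
            "alert_soc",
        ]
    # First failed logon (or nothing else suspicious yet): do nothing
    return []


print(land_speed_playbook(True, failed_logons=1))
print(land_speed_playbook(True, failed_logons=2))
print(land_speed_playbook(True, logon_succeeded=True, two_factor_failed=True))
```

Note how static this is: the rules fire on exactly the conditions written down, nothing more and nothing less.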
So by implementing the SOAR module you can tailor responses to threats, which then become fully automated. When purchasing a SIEM with a SOAR module, there will be a lot of playbooks readily available that have been written or verified by the SIEM vendor's staff, so you don’t have to write everything yourself. But be aware… SOAR acts, so the playbooks you implement in your environment had better be tested thoroughly before going live. SOAR will not ask before executing, as playbooks are static rules without any machine learning to speak of.
In some cases, as with IBM QRadar, the SOAR module also integrates with a lot of other stuff. So when purchasing the SOAR module from IBM you will not only get the playbooks, but also the integrations to a CMDB such as ServiceNow, for instance, whilst other vendors such as LogPoint integrate the CMDB at the SIEM level.
So the NG SIEM is a magic box, that’s what you’re saying, right?
NO! The NG SIEM is not a magic box, and even though what I’ve written above might seem cool-ish, the NG SIEM is one of the most complex solutions you can have in your environment, and thus it will probably also take up the most time from your personnel to set up and, specifically, to manage. SIEMs are not a “Set and Forget” solution. They require constant maintenance and a delicate hand when making changes, as these can affect the whole organisation in a very potent and brutal way.
Just changing a minor setting in a playbook without testing it thoroughly first could kick out your entire organisation's logons to a specific service. Taking that into account, plus the fact that every software update of the SIEM is a possible source of failed data parsers, playbooks, rules and whatnot, I would NOT recommend anyone to rely on detection within an NG SIEM alone.
The SIEM is primarily very good for regulatory and compliance purposes, and as an added detection tool alongside the ones you already have, such as an EDR and maybe an NDR, but it should not be your primary line of detection or defence!
Security is never a checkbox, it’s a bunch of them…