Boosting Your SOC Operations with Optimized Automation Using Security Copilot
Security Copilot supporting automations in SOC operations


TL;DR

In this article, I describe the techniques I adopted to reduce SCU consumption in an automation solution for investigating potential user compromises. The prototype of the solution is available here: LINK.

It can be reused as-is, freely modified, or used as a starting point for other types of automations, always “optimized” in terms of SCU usage.

In the usage tests conducted, a single execution consumed just 0.7 SCU.

Update - June 2nd: Check out this new article describing an evolution of the solution presented here: The Impact of Direct Skill Invocation in Automations | LinkedIn


Objective

The solution described in this article was developed in response to a customer request to automate the generation of a table with predefined information during investigations of incidents involving—or potentially involving—suspected user compromises. Examples of such information include: whether the access IP is a TOR node, whether the password was successfully entered during the suspicious authentication event, whether the MFA challenge was successfully passed, and so on.

More generally, this solution can be applied to scenarios where, given any incident involving one or more identities among the affected resources, you want to automate actions that provide a preliminary understanding of what happened to those identities—at the time of the incident, in the days leading up to it, as well as during and after the incident, up to the present day.

In particular, I identified the following list of actions to be executed:

  1. Retrieve the UPNs of the identities involved and any associated IP addresses.

  2. Gather contextual and reputation information for the IP addresses involved in the incident.

  3. Extract details about the sign-in event that may have triggered the incident: Was the password authentication successful? Was MFA requested? Was it successfully completed? From which country and city did the sign-in request originate?

  4. For the identities involved, summarize the “unusual” locations from which they successfully logged in—from X days before the incident up to the current date.

  5. For the identities involved, summarize the types of alerts they were associated with in the X days leading up to the incident.

  6. For the identities involved, summarize the types of actions recorded across various audit systems (e.g., Microsoft Entra, Azure Activity, etc.) from X days before the incident, as well as during and after the incident.

  7. (Optional – Requested by the customer who inspired this deep dive) Summarize the processed information in a table with predefined fields.


General Considerations

Note: if you're short on time, feel free to skip this section; it's mainly for readers interested in the rationale behind the design choices made in the solution.

Why Security Copilot?

To build this automation, I found multiple advantages in using Security Copilot:

  • It eliminates the need to interface with multiple APIs, avoiding the associated complexities of authentication, authorization requests, parameter configuration, result parsing, and error handling.

  • It allows for processing of data sent and received through these calls using natural language instructions, without the need to manipulate data structures manually.

  • It enables formatting of data for display simply by specifying the desired format in natural language, with the flexibility to switch formats “on the fly” by changing just a word (e.g., from “HTML” to “Markdown”).

  • When needed, it can summarize, expand, or interpret the semantic meaning of data, allowing for reprocessing and presentation in the desired format.

Achieving economic sustainability in SCU consumption

We know that data processing through LLMs requires significant computational power, which still comes at a high cost. Specifically, Security Copilot allows you to pre-allocate the desired computational capacity in terms of Security Compute Units (SCUs), and to set a threshold for any potential overages. These thresholds—both the base and the overage limits—can be adjusted on an hourly basis to account for recurring or occasional fluctuations in usage volumes by operators or automations.

To make this processing economically sustainable—that is, to ensure that automations consume only a “reasonable” portion of the pre-allocated computational capacity—it’s essential to implement techniques that minimize Security Copilot’s compute consumption as much as possible.

This means, for example:

  • Specifying a response format for each prompt that minimizes overall text volume while maintaining readability. The “efficiency champion” in this regard is Markdown, which is in fact the default format used by Security Copilot. However, it’s important to note that Markdown may not be optimal in certain contexts. For instance, in Sentinel incident comments, Markdown tables can be hard to read; in such cases, it’s better to instruct Security Copilot to avoid tables and use bullet lists instead. Additionally, Markdown-formatted responses may be difficult to read when sent via email.

  • Reducing the number of calls to Security Copilot by combining multiple requests into a single one where possible. For example, when using custom KQL plugins, if two or more prompts translate into separate KQL queries that can be merged into one, it’s best to do so and issue a single prompt (see the sketch after this list). Another example is requesting that a prompt’s response be directly formatted (e.g., in HTML) only when such specific formatting is strictly necessary.

  • Minimizing the amount of data sent in each request. This might even justify—where feasible—executing requests in isolated “sessions”, meaning not including the full interaction history in every request.

  • Avoiding the use of high-cost skills where possible. For instance, replacing calls to the Natural Language to KQL skill with calls to custom KQL plugin-defined skills.

  • Limiting the amount of data expected in the response to only what is strictly necessary. In other words, ensure that the request clearly specifies which data to return—exactly what is needed, and nothing more.

  • Avoiding compute consumption for intent interpretation.
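
As a minimal sketch of the query-merging idea from the list above, assuming a Sentinel workspace: instead of one prompt counting recent successful sign-ins and another counting recent alerts for a user, a single skill can return both figures in one result set. The UPN, lookback window, and metric names are illustrative and not taken from the prototype.

    // One skill/query returning two metrics that would otherwise need two prompts
    let upn = "user@contoso.com";            // hypothetical UPN from the incident
    let lookback = 30d;
    let signins = SigninLogs
        | where TimeGenerated > ago(lookback)
        | where UserPrincipalName == upn and ResultType == "0"   // "0" = success
        | summarize Value = count()
        | extend Metric = "SuccessfulSignIns";
    let alerts = SecurityAlert
        | where TimeGenerated > ago(lookback)
        | where Entities has upn
        | summarize Value = count()
        | extend Metric = "Alerts";
    union signins, alerts
    | project Metric, Value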

Direct skill invocation

To achieve this last objective—ensuring that Security Copilot consumes compute capacity only to generate the response and not to interpret the request—there is now the option of using direct skill invocation. This means specifying within the prompt which skill should be used and how to populate the parameters expected by that skill.

As of today—and based on my own experimental findings—direct skill invocation is not consistently available across all interaction modes with Security Copilot. Specifically, here are the current capabilities and limitations (Disclaimer: please do not consider this summary as official documentation—it's simply what I’ve been able to deduce through hands-on testing, and there may be inaccuracies):

  • Full direct skill invocation—specifying both the skill and its parameter values—is possible at the individual prompt level on the standalone Security Copilot portal.

  • Partial direct skill invocation—specifying only the skill but not the parameter values—is possible within individual prompts of a promptbook on the standalone portal. In this case, parameter values must be provided in natural language within the prompt text.

  • Full direct skill invocation is also possible when calling a single prompt from an Azure Logic App.

  • Direct skill invocation is not possible when calling promptbooks from Azure Logic Apps. If a promptbook includes prompts configured with specified skills and those skills require mandatory parameters, attempting to pass those values via natural language in the prompt text results in an error.

The standalone Security Copilot portal is the primary interface for “ad hoc” interactions with the service—whether through individual prompts or the more convenient execution of promptbooks. However, for cost-optimized automation scenarios, it may be worth the effort to shift interactions into Logic Apps, applying all the compute-saving techniques described earlier.

While Logic Apps can call entire promptbooks, invoking individual prompts can be more advantageous in terms of flexibility—particularly when setting up logical conditions, which are not supported within promptbooks. It also enables the use of direct skill invocation, offering greater control over how each interaction is executed.
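
As a hedged sketch of what a direct skill invocation from a Logic App might look like in the workflow's code view: the body below reflects my understanding of the Security Copilot connector's "Submit a prompt" action, but the exact field names may differ across connector versions, and the skill name, parameters, and variables are hypothetical.

    "body": {
        "PromptContent": "",
        "SessionId": "@{variables('SessionId')}",
        "SkillName": "GetSignInDetails",
        "SkillInputs": {
            "upn": "@{variables('UserUpn')}",
            "incidentNumber": "@{variables('IncidentNumber')}"
        }
    }

With the skill name and its inputs provided explicitly, no compute is spent on interpreting the request, which is exactly the optimization discussed above.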

Custom Skills Design

It is essential to design the skills in the custom plugins so that they return the smallest possible amount of information. If you're searching for a specific event or a small set of events, the response will likely be limited in size: it will be enough to restrict the number of details requested for each event. Conversely, if you're dealing with a very large number of equally important events, it is unfeasible to design a skill that returns them all, as the response would be truncated due to interaction limits with the LLM. Moreover, even if it were possible to receive the entire dataset in the response, the token consumption—and thus the computational load—would be extremely high. In such cases, it is better to design skills that return useful "statistical" information to provide an initial overview of the entire set of events.

This is the technique I adopted in the prototype for skills related to searching for alerts and recent actions for the selected users over a broad time range: rather than attempting to return all these events, I return the number of occurrences for each type of alert and action.
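
A minimal sketch of this "statistical" pattern, assuming a Sentinel workspace (the actual skills ship with the shared plugin; the table, columns, and filter below are illustrative):

    // Return per-type counts instead of the raw alert events
    SecurityAlert
    | where TimeGenerated between (ago(60d) .. now())
    | where Entities has "user@contoso.com"       // hypothetical UPN
    | summarize Occurrences = count(),
                FirstSeen = min(TimeGenerated),
                LastSeen = max(TimeGenerated)
            by AlertName, AlertSeverity
    | order by Occurrences desc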

Clearly, the process can be further refined by returning only those actions that are particularly relevant - within the context of the prototype, for identifying potential account compromise signals. These types of statistical responses help form a preliminary understanding of what has occurred, enabling faster triage. The investigator will then need to delve deeper into the more suspicious event types by running targeted queries in management tools or by continuing to query that data source in Security Copilot with a focus on specific events.


Shared Prototype

The solution shared as a prototype and described in this article is available at the following link: LINK.

It consists of:

  • A custom KQL plugin that connects to a Sentinel workspace - NOTE: the custom plugin is configured to use Microsoft Sentinel in order to enable a wider time interval when querying the SecurityAlert table. Modify it to use Microsoft Defender if needed.

  • An Azure Logic App that executes a specified sequence of prompts.

The Logic App executes the automation described at the beginning of this article, applying the SCU consumption optimization techniques outlined earlier. Specifically, it was designed and implemented with the aim to:

  • Minimize the number of calls to Security Copilot

  • Avoid using computationally expensive skills like NL2KQL by replacing them with the use of the optimized custom KQL plugin

  • Minimize both the input text for prompts and, most importantly, the amount of content requested in the responses

  • Perform each call using direct skill invocation, specifying the skill name and the expected parameter values

The Logic App is conceptually divided into three sections:

  1. Initialization - The first section receives data from the trigger and initializes variables using the input data (e.g., incident details) and parameters defined in the flow. One of these parameters contains a JSON data structure that describes the individual prompts to be sent to Security Copilot, including how to invoke them directly by specifying the skill and its parameters. This section also includes a step that processes the JSON by replacing placeholders in the prompts with actual input values (e.g., incident number); a sketch of this replacement follows the list.

  2. Prompt Execution - The second section executes the prompts defined in the JSON. Note: If you plan to reuse this Logic App as a base for other flows, this section can typically be reused as-is, even if the logic in the first and third sections changes.

  3. Response Delivery - The third section processes the responses as needed. In the shared solution prototype, it sends the responses via email and/or posts them as comments on the related incident, depending on how the Logic App parameters are configured.
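
As a sketch of the placeholder replacement mentioned in step 1, a Compose action could apply nested replace() expressions to the JSON parameter. The placeholder tokens and variable names below are illustrative, not necessarily those used in the prototype:

    @replace(
        replace(string(parameters('PromptsJson')),
                '{{IncidentNumber}}', variables('IncidentNumber')),
        '{{IncidentArmId}}', variables('IncidentArmId'))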

Logic App's Structure

I want to emphasize that if you plan to reuse this Logic App as a foundation for other cost-optimized automations, you should:

  1. Modify the initial trigger if needed (currently it’s based on a Microsoft Sentinel Incident, but it could be adapted to start from other services or event types).

  2. Create your own JSON with the desired list of prompts, keeping the same structure (schema) as the one provided here—only updating the content. When filling in the JSON fields, be sure to include placeholders that the Logic App will dynamically replace with values received at execution time. Save this JSON in the corresponding Logic App parameter.

  3. Update the first section of the Logic App to insert your own variables and to handle appropriately the replacement of placeholders in the JSON (read from the parameter) with the actual values received at runtime.

  4. Adapt the third section of the Logic App to fit your specific needs, using the content generated by the LLM in the way that best suits your scenario.

Steps implemented by the automation

For the format (schema) of the JSON data structure used to list the prompts, I based it on the one exposed by the unofficial API available for exporting and importing prompts into promptbooks via scripting (the solution is described here: https://github.com/Azure/Security-Copilot/tree/main/Promptbook%20samples/Powershell%20to%20Manage%20Promptbooks).

To this structure, I added two attributes for each element (i.e., for each prompt description):

  • position: the position of the prompt in a list of questions and answers, useful when prompts are processed in parallel rather than sequentially.

  • replacePrompt: a simplified version of the prompt, used in place of the actual prompt sentence submitted to Security Copilot when the results are delivered by email or written in the incident’s comments.

As an example, here is an excerpt showing the first two prompts included in the JSON structure used:

Example of the first two prompts in the JSON structure set as parameter in the Logic App
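
Since the excerpt above is shown as an image, here is a hypothetical reconstruction of what one such entry could look like. The attribute names follow the promptbook export schema mentioned earlier plus the two added fields, but the skill name, parameter, and placeholder are illustrative:

    {
        "position": 1,
        "skillName": "GetIPReputation",
        "inputs": {
            "ipAddress": "{{FirstPublicIP}}"
        },
        "content": "",
        "replacePrompt": "Reputation of the first public IP involved in the incident"
    }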

As shown, both of these prompts are invoked using the direct skill invocation technique, by passing the skill name and its parameters.

As another example, below is the final prompt from the JSON structure:

Example of the last prompt in the JSON structure set as parameter in the Logic App

When invoking this last prompt, which uses the "AskGPT" skill, it is not currently possible (as far as I’m aware) to use the direct skill invocation technique. Therefore, the instruction was written entirely in natural language.

In the solution, I also applied a few additional techniques aimed at further reducing the computational cost of interactions with the LLM:

  • In the Logic App, before analyzing the IP addresses identified in the incident, I filtered out non-public IPs.

  • In the tables returned by the various skills created in the custom KQL plugin, I always included a column for the UPN to indicate which user (involved in the incident) each row refers to. However, when the incident involves only one user, I ensured that this column remains empty across all rows. As far as I know, KQL does not allow conditional removal of an entire column from the result set—so I opted to leave it unpopulated instead (see the sketch after this list).

  • Currently, in the first prompt defined in the JSON, I explicitly requested a concise style of responses. This means that many potentially available details in each prompt’s response are intentionally not displayed.
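
For the single-user case described in the second bullet, a minimal KQL sketch of the technique (table and column names are illustrative, not the prototype's actual code):

    let upns = dynamic(["user1@contoso.com"]);   // UPNs extracted from the incident
    let multiUser = array_length(upns) > 1;
    AuditLogs
    | where TimeGenerated > ago(30d)
    | extend Actor = tostring(InitiatedBy.user.userPrincipalName)
    | where Actor in (upns)
    | extend UPN = iff(multiUser, Actor, "")     // leave the column empty for one user
    | summarize Occurrences = count() by OperationName, UPN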

It’s worth noting that I had to use Sentinel as the destination for the custom KQL plugin because, based on my testing, specifying Defender does not allow retrieval of alerts older than 30 days—even when broader time ranges are specified. Specifically, below are the insights I gathered during testing. Once again, I want to emphasize that these are not official statements, but rather personal observations that have not (yet) been validated by official Microsoft sources. As such, they may contain inaccuracies:

  • The AlertInfo and AlertEvidence tables in Defender do not contain data older than 30 days.

  • The SecurityAlert table, although no longer listed in the schema of available tables in the unified portal, is still queryable in KQL. However, when queried from a skill within a custom KQL plugin targeting Defender, queries appear to be limited to a maximum of 30 days in the past, even when broader time ranges are specified in the code (those ranges are simply ignored). The issue disappears when the plugin is configured to target Sentinel instead of Defender.


Results

In the tests conducted, each execution consumed 0.7 SCU (approximately $2.80). In practice, each individual prompt showed a minimum consumption of 0.1 SCU.

Prototype's execution consumption

Of course, this consumption is only indicative: the actual usage for each execution may vary depending on contextual factors—such as the number of identities and IP addresses involved in the incident under investigation or the number of results returned by each prompt (e.g. the number of the existing alerts involving the analyzed accounts).

As an example, below are some of the responses received during execution.

Prompt #1 - Extract information about the first IP in the incident from AbuseIPDB

Prompt #2 - Extract information about all the IPs in the incident from MDTI

The IP—although we could have more public IPs, this incident involves only one—has no known reputation.

Prompt #3 - Retrieve the details of the sign-ins nearest to the incident start time for the users involved in the incident

The logon events near the beginning of this incident succeeded at both the first and second factors.

Prompt #4 - Summarize the successful sign-in events that originated abroad by users involved in the incident

Recently, the user has successfully logged in from GB and IE.

Prompt #5 - Retrieve the alerts in the 30 days before the incident for the users involved in the incident

The user involved in this incident had multiple serious alerts in the days leading up to the incident.

Prompt #6 - Retrieve the actions (in audit logs) in the 30 days before the incident and until now registered for the users involved in the incident

Recently, the user involved in this incident has performed several actions in Entra and Azure, including creating a Resource Group, creating a user, and assigning the user to an Entra role.

Prompt #7 - Summarize all the information retrieved

Prompt #7 - Summarize all the information retrieved - Full Table

In the third section of the currently shared Logic App prototype, the available actions for delivering results include writing them into the comments of the originating incident and/or sending them via email. The choice of one or both delivery channels is made by setting the appropriate parameters.

For easier readability, each response is preceded by its corresponding prompt in the simplified form provided in the prompt JSON passed to the Logic App.

This is an example of a comment generated by the Logic App within the Sentinel incident. The responses are concatenated in their original Markdown format.

Example of the results in the comments of the incident

It’s important to note that, as shown in the previous image, comment text in Sentinel does not support Markdown formatting: content is displayed with the full Markdown structure in plain text. Markdown tables, in particular, are difficult to read in this format. For this reason, it’s advisable to instruct Security Copilot—via the prompt—to structure content using bullet points instead of tables.

If you wish to retain the final summary prompt, it’s advisable to choose between the following two alternatives:

  • [Preferred solution for reducing SCU consumption] Modify the final prompt so that the summary content is structured as a bullet list instead of a table.

  • Configure the Logic App to convert the content into HTML. This way, the final table will be displayed in a readable format.

For convenience, the Logic App prototype also includes a feature that writes a link to the Security Copilot session in the incident comments. This allows easy access to a well-formatted view of the investigation results, as well as the option to export the initial summary table to Excel.

Link to the Security Copilot session within the comments of the incident

The content of the responses viewed via email is more easily readable if the Logic App—or Security Copilot—is instructed to convert all results from Markdown to HTML. This is an example of an email received when the Logic App is configured with the parameter ConvertToHtml set to true.

Example of the email sent by the Logic App with the results of the prompts

As previously mentioned, converting content to HTML generates significantly more text compared to Markdown, which can lead to slightly higher SCU consumption—especially for prompts with more verbose responses. In the specific example described in this article, with HTML formatting the total SCU consumption reached 1.3, due to the following:

  • For one of the prompts, the Logic App detected Markdown-specific characters in the response and, following the implemented logic, triggered an additional prompt to convert that response into HTML (+0.1 SCU).

  • Prompts #5 and #6 had particularly verbose responses, increasing their SCU usage from 0.1 to 0.2 each.

  • The final summary prompt, which had to process all the HTML generated up to that point, saw its SCU usage rise from 0.1 to 0.3.

For this reason, as already noted, it is recommended to keep the output in the more efficient Markdown format whenever possible to reduce SCU consumption.


Considerations on Sessions and Parallelism

It’s worth exploring the topic of sessions and parallelism in prompt execution.

Session Management and Prompt Dependencies

The first aspect concerns sessions in Security Copilot. From a Logic App—or manually via the portal—it’s possible to chain multiple prompts within the same session or to launch each prompt in separate sessions. The advantage of the latter approach is that each individual prompt typically consumes less compute capacity, since the LLM does not receive, along with the prompt, the potentially lengthy summary of everything processed so far in the current session.

On the other hand, when a prompt needs to reference information defined in previous prompts, keeping all prompts within the same session becomes particularly convenient. An alternative approach could involve managing the "retrieval and passing" of information outside the session—for example, within the Logic App itself. However, this method is not only more complex but could ultimately be more "expensive" than simply using a session.

A typical case where this need arises is when generating a final summary prompt that must take into account everything processed in the previous prompts. Without a session, you would need to reconstruct and send the entire sequence of prior interactions to the prompt—potentially exceeding the token limit for a single interaction.

In the specific example described in this article, I chose to retain data within the session precisely because a final summary was required. That said, I also added a dedicated UseSingleSession parameter to the Logic App to allow immediate use without a session in contexts where that’s feasible and beneficial. This parameter is considered for both prompts invoked via direct skill invocation and those triggered using natural language.

Logic implemented to use a single session or separate sessions
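
As a sketch of how such a parameter might gate the session passed to each prompt call: the expression below assumes that supplying an empty SessionId causes the service to open a new session per prompt, and the field, variable, and parameter names are illustrative.

    "SessionId": "@{if(parameters('UseSingleSession'), variables('SessionId'), '')}"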

Parallel vs. Sequential Prompt Execution

The second consideration involves whether to parallelize or serialize the execution of the loop that invokes prompts from the JSON.

The main advantage of parallelization is execution speed: the total duration of the entire prompt cycle becomes equivalent to that of the slowest individual prompt. In practice, tests showed a reduction from 3–4 minutes down to about 1 minute or less.

However, parallel execution means that the results collected in a dedicated Logic App variable are not in the same order as in the original JSON. To address this, I added a "position" field to each prompt in the JSON, making it possible to reconstruct the correct order of prompts and responses in the third section of the Logic App—where the results are processed and delivered.

That said, using parallelism requires giving up on ordered execution within a session. In the tests conducted with the solution shared here, I opted to forgo parallelism in favor of maintaining both session continuity and serialized prompt execution.

To ensure that prompts are executed sequentially, you need to adjust the default setting for “Concurrency control / Degree of parallelism” in the “For-Each” action. By default, this value is set to 20. To enforce serial execution, it should be limited to 1.
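
In the workflow's code view this corresponds to the For-Each action's runtimeConfiguration block; setting repetitions to 1 forces serial execution (the action and source names here are illustrative):

    "For_each_prompt": {
        "type": "Foreach",
        "foreach": "@body('Parse_prompts_JSON')",
        "runtimeConfiguration": {
            "concurrency": {
                "repetitions": 1
            }
        },
        "actions": {}
    }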

Default settings in the For-Each action: parallelism enabled with a degree of 20

Modified settings in the For-Each action - In this way, parallelism is disabled

It will be interesting to experiment with solutions that adopt parallelism—in order to significantly speed up execution—by forgoing sessions while still enabling the reconstruction and delivery of previously processed information to prompts that require it (e.g., the final summary prompt). The challenge lies in doing so without increasing compute costs or exceeding token limits.


Current Limitations and Future Directions

I’m fully aware that the logic implemented in my solution may appear—especially to those reading this—somewhat incomplete or even of limited practical use when compared to the established procedures adopted within their own SOCs for incident handling or user compromise investigations.

However, I believe the true value of what I’m presenting lies not so much in the solution itself, but in the thought process behind its design, as well as—perhaps—in some of the practical techniques applied during its development.

That said, I’d like to explicitly highlight what I consider to be clear limitations—as well as opportunities for evolution—of the proposed solution:

  1. The AbuseIPDB call is made only on the first IP address found in the incident. This is because the available custom plugin for AbuseIPDB does not support passing multiple IPs. The issue could be addressed by implementing a loop of calls—which would significantly increase Security Copilot’s compute cost—or, preferably, by creating a custom plugin (e.g., a Logic App-based one) that loops through the IPs, queries the API, and returns results in batch.

  2. Other threat intelligence services can be used in addition to—or instead of—AbuseIPDB and/or MDTI. If connectors are available, these calls can simply be added to the prompt JSON passed to the Logic App. Naturally, each additional interaction increases SCU consumption (with a minimum of 0.1 SCU per interaction).

  3. The investigation logic for suspected user compromise scenarios can be significantly enriched—for example, by querying anomaly events generated by Microsoft Sentinel’s UEBA component.

  4. The KQL code in the custom plugin is not optimized for performance and has only undergone limited testing. It may contain bugs.

  5. In the custom KQL plugin, the skill that searches for actions in the audit logs can be improved by specifying which actions to include or exclude in the search criteria and by expanding the scope of the search beyond just Entra and Azure—for example, by including the Unified Audit Log from Microsoft Purview.

  6. Currently, the solution can only be triggered from an incident via the Defender or Sentinel portals. It would be valuable to extend its usage to allow launching the solution by directly specifying one or more users, without being tied to a specific incident.

Regarding the previous point:

  • The solution could be packaged as a custom plugin or Logic App–based skill, making it callable directly from individual prompts in the standalone Security Copilot portal (for example, with a prompt like: “Help me investigate the possible compromise of account XYZ”).

  • Through integration with Copilot Studio, it should be possible to make the solution accessible also from Teams (for example, via chat with a custom agent). This will be my next area of investigation.

  • The custom KQL plugin already includes skills that can be invoked without specifying an incident; in such cases, a time range must be explicitly provided.

  • Packaging the entire investigation as a single skill is an ideal scenario for leveraging the parallelism offered by Logic Apps to significantly reduce execution time—refer to the earlier considerations on this topic.


Conclusion

I hope the shared prototype and the concepts described throughout this in-depth article will be helpful to those looking to accelerate and enhance automation efforts within their SOC operations.
