Don't Use AI
Analyst work is built on the human capacity for creativity, memory recall and information gathering, and using so-called 'AI tools' will actively diminish you in these areas. Your ability to form useful thoughts is built on the continued labour of your mind. Each aspect of the labour you partake in develops its own unique quality in your cognition. Your drive connects you to this labour but does not assure you of any benefits. If you ask AI tools to format documents you will get worse at building narratives, and if you ask AI tools to process registry modification events you will get worse at pattern recognition.
After slowly eroding the individual, minute qualities of your cognition you will one day be asked to solve a difficult problem. You will fail. You will dismiss the problem as impossible or unfair, blinded by the damage you have already done to yourself. People who lack the capacity to understand may even turn to frustration and anger, but it will be in vain. Set in you will be behaviours that have overwritten what took perhaps 18 years to develop.
Our world is full of complexity and challenge. Our drive pushes us to explore mysteries and grow. These things build you with incredible depth. Inserting AI into the middle means you are not being built by your experiences; you are merely speeding through them. The experiences you can derive from using AI are as shallow as the puddle at the end of every street. Do not trade the ocean for a puddle just because it's easier to swim in. Love the labour. Become better.
Introduction
AI tools are being gradually integrated into Security Operations Centres to either increase the quality of investigations or expand a given analyst's capacity to handle workload. This blog post walks through what you should consider when evaluating AI tools for these purposes. In particular, the focus of this document is to enable people to review the text outputs and establish whether what the AI tool is generating is valuable.
In the future I will publish further content on how to generate simulations and measure triage responses and case management functions of any given AI tool.
Understanding AI Tool triage efficacy
When assessing the quality of an AI tool for SOC triage workflows,
each individual component of the process must be considered.
These components can be articulated as follows:
Investigation
Investigation describes a process whereby an analyst processes a provided
alert, generates potential investigative avenues, weighs those avenues for
appropriateness and finally translates selected investigative avenues into
actions or tasks that collect evidence. By assessing an AI tool's outputs at
each of the stages below, you can judge how well it will perform during a
live incident.
- Alert comprehension
- Investigative Avenues
- Collection of evidence
- Analysis of evidence
- Conclusions
Alert Comprehension
In most instances an analyst engages with an investigation due to the triggering
of an alert. Alerts capture behaviours that have been observed within telemetry
and provide a high-level description of what the behaviour is. Analysts apply
this information against their own understanding of the relevant systems and
adversary behaviour to create a list of potential explanations for the behaviour
and also a list of additional behaviours that would further suggest that the alert
is a true positive.
This can be measured through the following products:
- Is the AI tool's description of the behaviour functionally similar to the alert
- Has the AI tool identified a relationship between the behaviour and adversary playbooks
  - Is the alleged association known to be correct
  - Does the AI tool state a confidence rating for the association
- Has the AI tool identified other possibilities
  - items that increase likelihood of a true positive
  - items that increase likelihood of a false positive
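One way to record a reviewer's answers to a checklist like the one above is sketched below. This is a minimal illustration, not a prescribed scoring method: the `Check` class, the `stage_score` function and the equal weighting of every question are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Check:
    """One reviewer question (hypothetical structure).

    answer is True (yes), False (no) or None (not applicable).
    """
    question: str
    answer: Optional[bool] = None
    follow_ups: List["Check"] = field(default_factory=list)


def flatten(checks):
    """Yield every check, including nested follow-up questions."""
    for check in checks:
        yield check
        yield from flatten(check.follow_ups)


def stage_score(checks):
    """Fraction of applicable checks answered 'yes'; None if none applied.

    Equal weighting is an assumption; a team may prefer its own weights.
    """
    applicable = [c for c in flatten(checks) if c.answer is not None]
    if not applicable:
        return None
    return sum(c.answer for c in applicable) / len(applicable)


# Illustrative answers for a single AI tool output (made up for the example).
alert_comprehension = [
    Check("Description functionally similar to the alert", True),
    Check("Relationship to adversary playbooks identified", True, [
        Check("Association known to be correct", True),
        Check("Confidence rating stated for the association", False),
    ]),
    Check("Other possibilities identified", None),
]

print(stage_score(alert_comprehension))  # 0.75
```

Recording "not applicable" separately from "no" keeps a tool from being penalised for questions that did not apply to a given alert.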
Investigative Avenues
Once an alert's basic content has been captured, a list of unanswered questions
is typically generated by an analyst. These questions will either enable an
absolute conclusion on the alert's nature or inform the analyst's further
reasoning. Investigative avenues are usually weighed by likely effort
expenditure and closeness to previously successful investigations: avenues
that would take very little effort to explore and are known to have been
useful in past investigations are selected first.
This can be measured through the following products:
- Do the avenues selected by the AI tool reflect those found to be successful
in past incidents?
- Do the avenues appear relevant to the alert?
- Are the investigative avenues sorted or ordered by priority?
  - Are justifications provided for sorting?
- How broad are the investigative avenues?
  - Do the avenues utilise multiple data sources?
  - Do the avenues utilise all the data sources available?
- Did the investigative avenues return results that advanced the investigation towards a confident conclusion?
  - Is a false positive justification provided?
  - Is a true positive justification provided?
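The weighing of avenues described in this section (low effort and previously useful first) can be sketched as a simple sort, which gives a reference ordering to compare against the order an AI tool proposes. The `Avenue` fields, the 1 to 5 effort scale and the sort key are assumptions for illustration, not an established method.

```python
from dataclasses import dataclass


@dataclass
class Avenue:
    """A candidate investigative avenue (hypothetical structure)."""
    question: str
    effort: int          # assumed scale: 1 (trivial) to 5 (heavy)
    past_successes: int  # times this avenue helped in earlier investigations


def prioritise(avenues):
    """Order avenues by lowest effort first, then by most past successes."""
    return sorted(avenues, key=lambda a: (a.effort, -a.past_successes))


# Illustrative avenues with made-up effort and history values.
avenues = [
    Avenue("Review the full process tree on the host", effort=3, past_successes=4),
    Avenue("Check the user's recent sign-in locations", effort=1, past_successes=6),
    Avenue("Search proxy logs for the domain", effort=1, past_successes=2),
]

for avenue in prioritise(avenues):
    print(avenue.question)
# Check the user's recent sign-in locations
# Search proxy logs for the domain
# Review the full process tree on the host
```

Sorting on effort before history is only one possible choice; a ratio of past successes to effort would rank the same inputs differently in some cases.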
Collection of evidence
Collecting evidence can be considered a demonstration of an analyst's forensic
ability. Translating investigative avenues into evidence collection requires broad
domain knowledge of computer systems and available tools, and the rigour to
ensure that items collected are easy to process and store.
This can be measured through the following products:
- Did the AI tool collect evidence from all the relevant sources
- Was the evidence collected easily searchable
Analysis of evidence
Once evidence has been gathered it must be associated with the drafted
investigative avenues and used to provide either an absolute conclusion to
what is being explored or justification for further evidence collection. Where an
analyst is aiming to understand a behaviour in more detail, the evidence should
contain data not found in the triggering alert, and where questions about the
existence of additional behaviours are being explored, the evidence should be
from a surface or data source known to be interacted with by adversaries.
Using the gathered evidence, analysts assess whether its properties are
suggestive of malicious intent and compare what is being analysed to known
good examples. In some instances this is done cognitively, particularly for
evidence such as authentication attempts from Russia, but in others, where the
evidence's properties are highly variable, this is achieved through direct
comparison of different data sets. This direct comparison is used to arrive at
assertions about how unlike the baseline the properties are. Where certain
variables are deemed highly unlike a known good baseline, these are considered
to contribute positively to the investigation. Examples include the volume of
data downloaded from a SharePoint site over time and the distribution of file
types in the download.
This can be measured through the following products:
- Has the AI tool used the evidence to generate verifiable facts
- Has analysis of the evidence generated confidence statements
  - likelihood of true positive
  - likelihood of false positive
- Are references provided for the AI tool's reasoning
- How much of the evidence was used by the AI tool
  - Facts derived
  - Associated with adversary behaviour
- If minimal or no context is derived from a piece of evidence
  - Does the AI tool prompt for more evidence collection
  - Is reasoning provided for the absence of information
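The direct comparison against a known good baseline described earlier in this section, for example for download volume from a SharePoint site, can be sketched with a z-score. This is one simple technique chosen for illustration; the function name, the sample figures and the idea of treating a large z-score as "highly unlike baseline" are assumptions, and real telemetry may need a more robust method.

```python
from statistics import mean, stdev


def deviation_from_baseline(baseline, observed):
    """Return how many standard deviations `observed` is from the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline observations")
    spread = stdev(baseline)  # sample standard deviation
    if spread == 0:
        raise ValueError("baseline has no variation; z-score is undefined")
    return (observed - mean(baseline)) / spread


# Made-up daily download volumes in MB for one user on ordinary days.
baseline_mb = [100, 110, 90, 105, 95]

# Volume seen on the day of the alert (also made up).
z = deviation_from_baseline(baseline_mb, 500)
print(f"{z:.1f} standard deviations from baseline")
```

A reviewer can run the same comparison on the evidence the AI tool used and check that any statement such as "unusually high volume" is backed by a figure that can be reproduced.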
Conclusions
Investigative work must end with conclusions. These conclusions can either be a statement of uncertainty and of what work is required to resolve it, or a statement on the nature of the behaviours captured in the alert. To arrive at a conclusion, analysts fairly evaluate each component of their investigation and associate their findings with their own understanding of how adversary behaviour manifests in the products or systems. Conclusions often contain some uncertainty due to the generally open nature of computer systems; however, this uncertainty should be justified and surrounded by context that describes the specific circumstances under which the behaviour would be a false positive.
This can be measured through the following products:
- Did the AI tool conclude a true or false positive
- Is the provided conclusion correlated or linked to evidence
  - Does the evidence support the conclusion
- Is the conclusion clear and without ambiguity
- Does the conclusion provide insight into causality
  - Is evidence used to arrive at causality
- Is the degree of confidence in the conclusion stated
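Some of the checks above are mechanical once a reviewer has transcribed the AI tool's conclusion into a structured record. The sketch below assumes such a record exists; the `Conclusion` fields, the verdict labels and the evidence identifiers are hypothetical. Whether the evidence actually supports the conclusion still needs human judgement.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed verdict labels; adjust to whatever the tool under review emits.
VERDICTS = {"true_positive", "false_positive", "uncertain"}


@dataclass
class Conclusion:
    """A reviewer's transcription of the AI tool's conclusion (hypothetical)."""
    verdict: str
    confidence: Optional[str] = None  # e.g. "high", "medium", "low"
    evidence_ids: List[str] = field(default_factory=list)


def review_conclusion(conclusion, collected_evidence_ids):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if conclusion.verdict not in VERDICTS:
        problems.append("verdict is not a clear true or false positive, or a statement of uncertainty")
    if not conclusion.confidence:
        problems.append("no degree of confidence stated")
    if not conclusion.evidence_ids:
        problems.append("conclusion is not linked to any evidence")
    unknown = sorted(set(conclusion.evidence_ids) - set(collected_evidence_ids))
    if unknown:
        problems.append("cites evidence that was never collected: " + ", ".join(unknown))
    return problems


# Illustrative example: the tool cites "ev-9", which was never collected.
collected = {"ev-1", "ev-2"}
example = Conclusion("true_positive", confidence=None, evidence_ids=["ev-1", "ev-9"])
for problem in review_conclusion(example, collected):
    print(problem)
```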
Assessment of language
Beyond measuring an AI tool's outputs for investigative efficacy, it's important to understand how well the tool writes with regard to syntax and semantics. AI tools deployed in the SOC are typically multimodal large language models that tokenise content and generate output token by token based on probability. This probability is primarily shaped by pre-deployment training, where a model vendor controls inputs and programmatically adjusts parameters and conditions until the model generates outputs that are shaped in a desired way.
These shaped outputs are what analysts will consume when using AI tools to aid in investigations and alert triage, so it's important to measure whether the model vendor is designing the models to communicate in a way that analysts can apply to their work.
The general framework below has been created to help measure the model's outputs when using the measurements described in the ‘Investigation’ section. This section is slightly subjective, as the shape and style of communication preferred by an individual analyst are dictated by what they have been exposed to in the past.
When making an assessment of the AI tool's output, each item below should be easily extractable from the text:
Point
Where an AI tool is creating an output, the point must be clear. What is being said should be easily identified amongst facts or statistics, and it should be associated with surrounding context. The point acts as the idea the analyst wields when they read the literature and review any findings.
Reason
A reason is a collection of details on why the point is being made. It is usually the most comprehensive component and reveals to the analyst surrounding supporting context, processes of thought and any assumptions that are being made.
Evidence
Evidence consists of artefacts, either forensic data points or facts that any reasonable person would arrive at based on the given information. Often forensic data and facts about the data are paired together to create a piece of evidence. Evidence must be relevant to the point and must have integrity: it should carry references that make it easy to verify that it was not hallucinated.
Poor output
Considering the three aspects of the outputs above, you can identify poor content either through the absence of one or more of the items or where continuity between the aspects is missing. Continuity throughout the text is important to ensure analysts are able to weigh the total set of circumstances. Where continuity is poor, analysts are unlikely to understand the relationships between different pieces of content, or how to attribute severity or causality, because the information is fragmented.
Beyond the three aspects of writing, you should also consider whether the AI tool is overly repeating points or highlighting facts or information with significant bias. Analysts using AI tools must be presented with content fairly. A structure or shape to the text that makes clear what the most critical information is can be useful, but it must not obscure other pieces of information. If an AI tool repeats a single point for several distinct purposes in the structure of the text, it is likely acting without sufficient evidence.
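A rough first pass at spotting repeated points is to compare the sentences of an output against each other. The sketch below uses Python's standard `difflib.SequenceMatcher`; the function name and the 0.8 threshold are assumptions. It only catches repetition in surface wording, so a point restated in different words still needs a human reader.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def repeated_points(text, threshold=0.8):
    """Return pairs of sentences whose wording is at least `threshold` similar."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    pairs = []
    for first, second in combinations(sentences, 2):
        ratio = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if ratio >= threshold:
            pairs.append((first, second))
    return pairs


# Made-up AI tool output in which one point appears twice.
output = (
    "The login came from an unusual country. "
    "The user reset their password. "
    "The login came from an unusual country!"
)
for first, second in repeated_points(output):
    print("Repeated:", first, "|", second)
```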
Ending
AI tools are still struggling to integrate well with Security Operations Centres, particularly due to the cyber security industry's tendency to act without rigour or deliberateness. Establish small use cases and aim to achieve those perfectly before attempting to use AI tools for all analyst work.