Goblin Detection Diary #1 - Data is queen

Introduction

Detection engineering underpins half of the entire cybersecurity industry, yet it is only ever softly spoken about or kept to some corner of the conference. So I've started this diary to capture the work I do in my roles and demonstrate how different best practices are implemented in the real world.

This diary has five goals:

  • share the complexity of working in detection engineering
  • highlight best practice and where it fits
  • share candid details on inter-team work and break/fix tasks
  • ensure every entry contains something other people can use immediately
  • keep the format diary-like, with no formal structure and a personal tone


Note: The content I generate will be sanitised, but I'll avoid retreating into high-level overviews. I hope other detection engineers enjoy my pain with me and that new aspirants come away brimming with new ideas.


Starting Slowly..

This entry was made at the start of the week, so I spent time running my regular reporting. I run reporting and metrics at the start of the week because it lets me set up my planner (Microsoft ToDo) with tasks and deadlines.

As always, the key to creating a good reporting schedule is understanding where your data is, so I maintain a map of all the alerting sources I manage and how to query them. It looks something like this:

[Diagram: a map of alerting sources and the interfaces used to query them]

I've already built the reporting to happen automatically, but what the diagram doesn't show are the analytics I use to discover new elements of infrastructure or newly generated data. These analytics essentially capture what was seen last week and compare it against what was captured this week to highlight new additions. This matters because I rely on third-party platforms like Azure/Microsoft and CrowdStrike, who don't keep their customers up to date on everything they change; whenever they add new logs or expand old ones, these analytics keep me in the loop.
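As a rough illustration of the idea (my actual analytics aren't shown here), in a Sentinel / Log Analytics workspace the built-in Usage table can drive a minimal version of this comparison:

```kql
// Minimal sketch of the "what's new this week" comparison, assuming a
// Sentinel / Log Analytics workspace. The built-in Usage table records
// ingestion per data type, so a data type first seen within the last
// 7 days suggests a newly added or expanded log source.
Usage
| where TimeGenerated > ago(14d)
| summarize FirstSeen = min(TimeGenerated) by DataType
| where FirstSeen > ago(7d)
| project DataType, FirstSeen
```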

This week I shared a few observations with other teams; they don't have the capacity to dig into the data the way I can, but they still benefit from the information I generate:

  • The quietest periods of the week went to SOC leaders for rota management
  • Playbook errors caused by case sensitivity went to the other engineers

Besides sharing what I find in a reporting cycle, I also set a new task in my planner to explore user account creation patterns at a particular client. They had generated in excess of 30 new accounts recently, and a cursory check shows it's not the first time, so I want to explore whether that behaviour is relevant enough to adjust my own models and baselines.
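For anyone wanting to run the same kind of cursory check, a minimal KQL sketch, assuming Entra ID audit logs landing in the AuditLogs table, might look like this:

```kql
// Hedged sketch for exploring account-creation patterns, assuming Entra ID
// audit logs in AuditLogs. Binning by week shows whether a burst of ~30 new
// accounts is a one-off or a recurring pattern worth baselining.
AuditLogs
| where OperationName == "Add user"
| summarize NewAccounts = count() by bin(TimeGenerated, 7d)
| order by TimeGenerated asc
```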

Now the cogs are turning..

Another engineer had been working on changes to the deployment pipeline and was exploring detections I've written that were being flagged by his new linter. This was my favourite part of the day because I got to showcase detection methodologies I apply across a large number of rules, and to demo the advantages and disadvantages of different algorithms and functions.

In particular, I walked through how we evaluate events in sequence where two behaviours must occur and the first is required to occur x times before the second. This is really common logic, found across many different detection ideas.

For new detection engineers, I would recommend learning query optimisation and good data modelling practices from outside the cybersecurity industry, as the large vendors such as Splunk and Microsoft have, in their generosity, fostered a lot of lazy attitudes. In addition, exploring different logic concepts in pure code will help build skills that are universally applicable, as opposed to being tied to SPL or KQL.

There are a few ways to approach this, and largely you want to make the decision based on the nature of the platform you're using. SaaS platforms like Sentinel and CrowdStrike allow inefficient approaches to function well. If you're using one of these, you can usually approach the problem by generating a list of records in series and aggregating them into bins of a timespan you choose, say 45 minutes. You can then evaluate patterns inside the list/index, and hopefully it's obvious to most which of the two options below is better:

store it in a string

  user.name:       fluffiest.cat@supercat.org
  @timestamp:      Oct. 28, 2025 14:07:13.000
  _duration:       25000
  event.type:      denied denied denied denied denied allowed
  source.ip:       172.16.8.11
  AccessDenied:    36
  AccessAllowed:   139

store it in an array

  event.type:      ["denied", "denied", "denied", "denied", "denied", "allowed"]
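
To make the array option concrete, here is a minimal KQL sketch, assuming Entra ID sign-in logs; the table, field names and threshold are illustrative rather than my production logic:

```kql
// Sketch of the array approach, assuming Entra ID sign-in logs.
// Sorting first so the list reflects event order, make_list() collects
// the outcomes inside each 45-minute bin; "x denials before an allow"
// then becomes a question about the position of the first success in
// the array rather than string parsing.
SigninLogs
| where TimeGenerated > ago(1d)
| extend Outcome = iff(ResultType == "0", "allowed", "denied")
| sort by UserPrincipalName asc, TimeGenerated asc
| summarize Sequence = make_list(Outcome) by UserPrincipalName, bin(TimeGenerated, 45m)
| where array_index_of(Sequence, "allowed") >= 5   // at least five denials before the first allow
```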


New Data, New Problems

Later in the day the SOC raised with me that a detection evaluating Entra ID authentication events for properties that are either anomalous or known bad was firing too many false positives, because a client had onboarded a new region. Looking at the data, I immediately had two avenues to explore:


  • What activity are the new users performing in this new region?
  • How much variance do the authentication properties have in a given week?

Modelling the data to explore the two points above is really easy and can be done using basic aggregate functions. I added some additional logic to look at the volumes of activity for the users and calculate percentage increases or decreases, which lets me quickly understand how significant a change has occurred in the data.
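A minimal sketch of that percentage logic, assuming sign-in volumes per user in SigninLogs (my real detection covers more properties than this):

```kql
// Compare a current window against a historic one and derive the
// percentage change per user from the two counts.
let Current = SigninLogs
    | where TimeGenerated > ago(7d)
    | summarize CurrentCount = count() by UserPrincipalName;
let Historic = SigninLogs
    | where TimeGenerated between (ago(14d) .. ago(7d))
    | summarize HistoricCount = count() by UserPrincipalName;
Current
| join kind=inner Historic on UserPrincipalName
| extend PercentChange = round(100.0 * (CurrentCount - HistoricCount) / HistoricCount, 2)
| project UserPrincipalName, HistoricCount, CurrentCount, PercentChange
```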

Across most tools you will simply need to compare two time windows of data and define your basic median and upper bound, along with calculating the standard deviation. You should get a result like this!

Present90            Historic50   Historic90            avg_daily_events   daily_event_variation   PercentDecrease   PercentIncrease
0.4149804220344742   0.42         0.45448793562740775   231765.47          18291.06                -8.55             0.91
1.5183902122100648   1.63         2.802939744889062     173693.43          80118.5                 -75               0.25
1.0924868727462969   0.84         1.200818894660522     215086.27          52950.67                15.96             1.16
1.6016681095583383   1.63         1.7209205113771453    218395.55          45422.7                 -6.88             0.93
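
A hedged sketch of how columns like these could be produced in KQL, assuming daily event counts from a single source; running the same aggregation over a historic window gives the Historic50/Historic90 equivalents:

```kql
// Roll events up to daily counts, then derive the average, standard
// deviation, median (50th percentile) and upper bound (90th percentile)
// for the current window.
SigninLogs
| where TimeGenerated > ago(7d)
| summarize DailyEvents = count() by bin(TimeGenerated, 1d)
| summarize avg_daily_events = avg(DailyEvents),
            daily_event_variation = stdev(DailyEvents),
            P50 = percentile(DailyEvents, 50),
            P90 = percentile(DailyEvents, 90)
```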

Often in analytics shared online you will see hard-coded threshold values, but understanding volume through basic statistics like this allows teams to move away from them: in the best of worlds, hard-coded thresholds require regular manual reviews to stay coherent with a changing understanding of the business, and at worst they are left unchanged for months.

Iterate 

Another thing I was exploring today was how our analysts find documentation. While working across the different detections I described above, I had the thought that some of the fields I drop (remove from an output) before presenting the final result might be useful. Inside the logic for all our detections is a name for the detection, and the documentation assigned to each one uses this same value, so if I kept that key:value pair for analysts they could copy and paste the whole string into our documentation store instead of searching on keywords. Luckily I keep logs on the searches analysts make, so I can answer questions like which paths analysts take to find materials and which key phrases they use, which might help me improve search optimisation.
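As a tiny, hypothetical illustration (the rule name below is invented), the point is simply to keep the key:value pair in the final projection instead of dropping it:

```kql
// Keep the detection's name in the final output so analysts can paste the
// exact string into the documentation store. The name is an example, not
// one of our production rules.
SigninLogs
| where ResultType != "0"
| extend DetectionName = "entra_anomalous_auth_properties"   // matches the doc title verbatim
| project TimeGenerated, UserPrincipalName, IPAddress, DetectionName
```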

Constantly identifying points of improvement in the analyst experience, through a robust understanding of how analysts interact with your systems and data, can help alleviate fatigue and decrease both triage time and complexity. At the moment I'm migrating our old rule sets to a new framework that outlines the absolute minimum needed from any given alert. In some cases this means our analytics grow in complexity by about a third (in the number of functions and operators), but the value gained from the significant investment in time will be realised forever and, hopefully, appreciated by all analysts.

Example items in the alert framework:
  • All values with a relationship to an identity or asset, such as a user name or IP, must be surrounded by metadata such as assigned privileges or the country the user typically authenticates from (a sketch follows this list).
  • All results are transformed into a table.
  • All results are compatible with graph.
  • Where external interfaces are required to triage the alert, a link to that interface is surfaced in the alert.
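
A sketch of what the first item can look like in practice, assuming Sentinel's UEBA IdentityInfo table is available; the enrichment fields here are examples, not the full set we attach:

```kql
// Surround the identity value in a result with metadata (assigned roles,
// usual country) rather than presenting it bare, by joining against the
// most recent IdentityInfo record per account.
SigninLogs
| where TimeGenerated > ago(1h)
| join kind=leftouter (
    IdentityInfo
    | summarize arg_max(TimeGenerated, AssignedRoles, Country) by AccountUPN
  ) on $left.UserPrincipalName == $right.AccountUPN
| project TimeGenerated, UserPrincipalName, IPAddress, AssignedRoles, UsualCountry = Country
```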



