How to Measure SOC Efficiency and Performance (Lessons from the Frontlines)
Published 11/24/2025
Written by Ben Brigida, Expel.
This blog is based on a recent session where Ray and I (Ben) discussed the key aspects to measuring security operations center (SOC) effectiveness.
Over the years leading SOCs, I've learned that measuring success is one of the toughest challenges we face. A SOC requires both speed and quality, and balancing the two can sometimes feel like a contradiction.
The stakes couldn't be higher. Poor SOC efficiency and performance can cause burnout, human error, missed threats, and even lead to huge financial losses, fines, or the end of a business.
Recently, I sat down with my colleague Ray Pugh to share what we've learned from our time in the trenches. Here's what I wish I'd known when I started.
Start small, build momentum
Don't wait for the perfect measurement system—start small, trend your data over time, and learn from that initial subset. The key is measuring outcomes that indicate team success and efficiency.
I recommend focusing on industry standard metrics that answer fundamental questions: How long does it take to look at something? How long to decide if it's malicious? How long to take action? Core metrics include mean time to detect (MTTD), mean time to respond (MTTR), work time, and mean time to triage (MTTT).
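As a minimal sketch of how these metrics fall out of alert timestamps, the snippet below computes MTTD, MTTT, and MTTR from a couple of hypothetical alert records (field names, values, and the minutes-based units are illustrative, not any particular product's schema):

```python
from statistics import mean

# Hypothetical alert records: timestamps in minutes relative to when the
# malicious activity occurred. Field names are illustrative.
alerts = [
    {"occurred": 0, "detected": 5, "triaged": 12, "resolved": 40},
    {"occurred": 0, "detected": 3, "triaged": 9, "resolved": 25},
]

mttd = mean(a["detected"] - a["occurred"] for a in alerts)  # mean time to detect
mttt = mean(a["triaged"] - a["detected"] for a in alerts)   # mean time to triage
mttr = mean(a["resolved"] - a["occurred"] for a in alerts)  # mean time to respond
print(f"MTTD={mttd} MTTT={mttt} MTTR={mttr} (minutes)")
```

The point isn't the arithmetic; it's that once these timestamps are captured consistently, trending them over time becomes trivial.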
Understanding capacity is fundamental. I often refer to Kingman's formula from queueing theory to measure and monitor capacity. Past about 70% capacity utilization, work time increases and decision quality decreases precipitously.
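To see why ~70% is the cliff, here's a small sketch of Kingman's approximation for mean queue wait in a single-server queue. The variability coefficients and 10-minute mean work time are illustrative assumptions; the shape of the curve is what matters:

```python
def kingman_wait(utilization, mean_service_time, ca2=1.0, cs2=1.0):
    """Approximate mean wait time via Kingman's formula (G/G/1 queue).

    ca2, cs2: squared coefficients of variation for arrivals and service
    (1.0 each assumes roughly exponential variability).
    """
    if not 0 < utilization < 1:
        raise ValueError("utilization must be strictly between 0 and 1")
    return (utilization / (1 - utilization)) * ((ca2 + cs2) / 2) * mean_service_time

# Wait time explodes nonlinearly as utilization climbs past ~70%.
for u in (0.5, 0.7, 0.9):
    print(f"{u:.0%} utilization -> ~{kingman_wait(u, 10):.1f} min expected wait")
```

With a 10-minute average work time, expected waiting roughly doubles between 50% and 70% utilization, then nearly quadruples again by 90% — which is why running a SOC "hot" quietly destroys response times.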
Balancing speed and quality
SOC efficiency often gets reduced to speed metrics, but sacrificing quality for efficiency is dangerous. The solution involves developing clear quality standards and measuring against them consistently.
My approach is being opinionated about what good looks like, defining it as a rubric, then scoring against those criteria. Key strategies include using AI tools for 100% sampling instead of random sampling, creating feedback loops where quality inspection drives system improvements, and building cultures where mistakes become learning opportunities.
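A rubric-based score can be as simple as a weighted checklist. The criteria and weights below are purely hypothetical placeholders for whatever "good" looks like in your shop:

```python
# Hypothetical quality rubric: criteria and weights are illustrative only.
RUBRIC = {
    "correct_disposition": 3,  # did the analyst get the verdict right?
    "evidence_cited": 2,       # is the conclusion supported by artifacts?
    "clear_writeup": 1,        # could a peer follow the reasoning?
}

def score_alert_review(checks):
    """Return the weighted fraction of rubric criteria met (0.0 to 1.0)."""
    earned = sum(w for criterion, w in RUBRIC.items() if checks.get(criterion))
    return earned / sum(RUBRIC.values())

print(score_alert_review({"correct_disposition": True, "evidence_cited": True}))
```

Once the rubric is explicit like this, scoring every alert (whether by an AI tool or a reviewer) beats spot-checking a random sample, and the scores feed directly back into tuning.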
As Ray emphasizes in our session, if you're just inspecting quality off to the side without taking action, what's the point? You need an environment where you can have open, honest conversations as a team.
Understanding bottlenecks
The alert lifecycle typically involves triage (evaluating "is it bad or not bad?"), investigation (when needed), outcomes (where alerts aren't malicious or need more context), and incident handling.
Common bottlenecks include vendor alerts generating high volumes while requiring constant tuning, specific alert types or environments that consistently take longer, and capacity management during volume spikes. Good instrumentation helps you see specific steps that bog things down.
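Instrumentation for this can start very small: tag each alert's work time with its type, then rank. Everything below — alert type names and timings — is made up for illustration:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-alert work times in minutes, tagged by alert type.
work_times = [
    ("vendor_x_dlp", 22), ("vendor_x_dlp", 30),
    ("edr_process", 6), ("edr_process", 8),
]

by_type = defaultdict(list)
for alert_type, minutes in work_times:
    by_type[alert_type].append(minutes)

# Rank alert types by average work time to surface where analysts get bogged down.
for alert_type, times in sorted(by_type.items(), key=lambda kv: -mean(kv[1])):
    print(f"{alert_type}: avg {mean(times)} min over {len(times)} alerts")
```

The alert types that float to the top of this list are your tuning and automation candidates.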
Context is everything
Isolated metrics can be misleading without proper context. I've seen this play out countless times. For example, our best analyst's individual metrics showed their incident investigation took longer than anyone else. At surface level, this isn't ideal. However, a deeper look revealed they were identifying threats faster than anyone, declaring incidents within 10 minutes from low-severity alerts.
In another case, an analyst had low alert closure numbers but was mentoring others and handling the most incidents. Without understanding their day-to-day movements within the team, they looked unproductive.
Anytime a metric becomes a target, it changes behavior. You have to inspect the data and understand all variables at play before making assessments.
Measuring accuracy
Beyond speed metrics, SOCs should track true positive and false negative rates, alert determination changes when initial decisions get reversed, and detection gap analysis (identifying where activity could have been detected but wasn't).
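Computing these rates just requires comparing each triage decision against its eventual ground truth. The decisions below are fabricated examples:

```python
# Hypothetical (decision, ground_truth) pairs from post-incident review.
decisions = [
    ("malicious", "malicious"),  # true positive
    ("benign", "benign"),        # true negative
    ("benign", "malicious"),     # false negative: a missed threat
    ("malicious", "benign"),     # false positive
]

tp = sum(1 for d, truth in decisions if d == truth == "malicious")
fn = sum(1 for d, truth in decisions if d == "benign" and truth == "malicious")
actual_malicious = sum(1 for _, truth in decisions if truth == "malicious")

true_positive_rate = tp / actual_malicious
false_negative_rate = fn / actual_malicious
print(true_positive_rate, false_negative_rate)
```

Tracking how these rates move when an initial determination gets reversed is what turns accuracy from a one-time audit into a trend you can act on.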
We think about how many unique decisions go into each incident. If it's a single alert for an incident, that's a near miss to us. We need as many detections as possible to give us multiple opportunities to make the right decision.
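Flagging those near misses is a one-liner once incidents record their contributing detections. The incident IDs and detection names below are hypothetical:

```python
# Hypothetical incidents mapped to the distinct detections that fired for them.
incidents = {
    "INC-101": ["edr_lateral_move", "ids_beacon", "auth_anomaly"],
    "INC-102": ["phishing_url"],  # only one detection fired: a near miss
}

# An incident backed by a single unique detection had no redundancy.
near_misses = [inc for inc, dets in incidents.items() if len(set(dets)) == 1]
print(near_misses)
```

Each near miss on that list is a prompt to ask: what other detections should have fired for this activity, and why didn't they?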
Building high-performance culture
Before getting into cybersecurity, I didn't understand how emotionally challenging the SOC job is. I thought it was highly technical, but it's scary.
Team culture is critical for reducing the fear factor in this job. For Ray and me, this means creating environments where asking for help is encouraged, building momentum around saying "I don't know" and "I was wrong," focusing on collective success over individual metrics, and hiring for traits like candor, passion for helping others, and growth mindset.
If we unintentionally build a culture where people protect their ego, we're going to fail and get beat.
And a little silliness doesn't hurt either. Making time for banter and puns is a small thing that goes a long way, and it can help your team feel safe taking breaks in this high-stress environment.
Technology's role
Technology should enhance human capability, not replace human judgment. Practical applications include using AI tools for consistent quality assessment with rubrics, automating routine tasks to give analysts time for quality work, and implementing systems that provide better data for decision-making.
As Ray puts it, the technology has to serve the analysts, not the other way around. We use data to inform where to focus automation efforts.
Getting started
If you measure what you're able to do and then set goals around that, you're going to fail. You have to set goals around what must be accomplished, then figure out how to accomplish it.
My implementation strategy involves starting with constraints on quality deliverables, defining what "good" looks like in your context, beginning with incremental measurements, getting hands-on with data rather than relying solely on dashboards, and cross-referencing metrics with regular analyst conversations.
We're never completely satisfied—always seeking ways to improve because we feel there's more out there and we're always learning more.
About the Author
Ben Brigida is the Senior Director of SOC Operations at Expel. Ben has been on the frontlines of cybersecurity for over a decade, with practitioner and manager-level experience at both Expel and FireEye, Inc. Ben brings a wealth of knowledge to his day-to-day work and holds a passion for both SOC efficiency and measurement and analyst development.
