
Enterprise Cybersecurity Measurement

by Paul Rosenzweig and Adam Isles

March 29, 2021

On Feb. 4, 2021, the New York State Department of Financial Services issued guidance on the cyber insurance market to foster more robust industry approaches to “managing and reducing the extraordinary risk we face from cyber intrusions.” Critical elements of that guidance include expectations that insurance companies should “rigorously measure insured risk” and “incentivize the adoption of better cybersecurity measures by pricing policies based on the effectiveness of each insured’s cybersecurity program.” This follows a separate recommendation last year from the Cyberspace Solarium Commission to establish a Bureau of Cyber Statistics. The commission envisioned that the bureau would be “the government statistical agency that collects, processes, analyzes, and disseminates essential statistical data on cybersecurity, cyber incidents, and the cyber ecosystem to the American public, Congress, other federal agencies, state and local governments, and the private sector.”

A key impetus behind both initiatives is to foster cyber metrics that bring transparency, accuracy and scalability to the way that industries understand, manage and communicate around cybersecurity risk. Transparency means clear traceability from threat and impact to defensive countermeasures and their effectiveness. Accuracy means that metrics present an authentic representation of threat, countermeasure value and countermeasure effectiveness. And scalability means that methodologies should generate similar results when consistently applied in a decentralized ecosystem.

In earlier posts, Paul examined the problem of evaluating cyber metrics from a general perspective and asked whether a set of external metrics might serve as a good proxy predictor of an enterprise’s security. Adam’s more practical work developing a security risk management framework for private enterprises has been designated under the Safety Act and forms much of the basis for our review here.

Our review here examines a more “traditional” vision of evaluating security. Most ideas about cyber metrics conceive of the assessment as one that works from the inside-out. The thought is to conduct an internal analysis of an enterprise based on some combination of assessments of its governance, its processes and the ways in which it implements security solutions. To a large degree, this is an attractive and familiar construct. It is modeled on how policymakers and government officials normally think about assessing other areas of concern such as the environmental compliance, health, or safety of an enterprise. In regulatory America, any enterprise would be experienced with this audit, assessment or compliance construct. Is such a construct feasible and practical for cybersecurity?

Broadly speaking, our intent in conducting this sort of internal assessment is to allow an enterprise to manage, mitigate and monitor its cyber risk: that is, to implement a set of safeguards accurately aligned with reasonably foreseeable threats and to know that these safeguards are operating as intended. Nobody thinks that eliminating all risk is feasible, so the objective here, as with any system of measurement, is to enable an organization to answer critical questions about its ability to manage cyber risk effectively. From this perspective, an organization should be able to: reflect on its own business model and identify reasonably foreseeable threat actor groups who might have an interest in the organization based on that business model; identify how those threat actors could actually compromise its operational environment; and assess whether its existing security approach provides reasonable coverage against this kind of threat tradecraft.

An organization armed with this situational awareness might then ask a series of architectural, engineering and operational questions: how it should weigh tradeoffs in alternative security investments under consideration; whether the security countermeasures in place actually work; and, of course, whether the organization is prepared to respond effectively in the event of a compromise. If our goal is to measure security performance, it may be enough to quantify risk “likelihood” as against a predefined set of impacts in some meaningful and intuitive manner. If our goal is to justify security investment, we need an additional layer that quantifies risk in terms of dollars.

In this paper, we want to describe, in some detail, exactly how such an inside-out measurement system might work in practice. First, we’ll describe, generically, what it means to do a risk assessment of an enterprise. Second, we’ll explain the MITRE ATT&CK framework—a knowledge-based tool that allows a company to understand the tactics, techniques and procedures (TTPs) that an adversary might use (the overarching thesis of the analysis is that TTPs are hard for an adversary to change, so understanding them is the key to understanding the threat to an enterprise). Third, we’ll describe how knowledge of a threat model allows an enterprise to assess whether the countermeasures being deployed are in sync with those threats, and thus what vulnerabilities may be in place. Fourth, we’ll examine how this kind of mapping effort gives an enterprise the ability to conduct an internal evaluation of its own security, providing a valid internal measurement that is reproducible and auditable. And finally, the paper concludes with some considerations about the utility of this form of measurement more broadly, identifying mechanisms needed for this to scale while maintaining accuracy and transparency.

Defining the Metric

Our goal is simple: to describe a transparent, accurate, scalable and generally agreed upon metric of cyber security performance or, if you prefer a corollary, a similar measure of the degree to which an enterprise has reduced its cyber risk.

When we speak of risk reduction for an enterprise, our focus is on the traditional risk assessment conceptual framework. In this construct, risk is a function of the likelihood of a harm occurring (a given threat source exploiting a potential vulnerability) and the degree of harm (the resulting impact or consequences of that adverse event on an organization or entity). Risk evaluation can take place at a strategic level or, as for many cyber enterprises, at a tactical level. What we describe here is a tactical security assessment that is informed by strategic-level knowledge.

The initial challenge for an enterprise is to define more precisely the nature of the threat, that is, to collect information that will allow the enterprise to assess the likelihood of some event occurring. As Figure 1 below shows, threat information can take many forms, each of which has utility in differing contexts. Some of the information is about the threat actor, some about the nature of the threat and some about its operation: in effect, the who, what, when, where, why and how of the threat. Put another way, threat information combines assessments of intent and capability that allow one to ask: Who can harm us, who wants to, and how would they go about it?

Figure 1 illustrates that threat information can take many different forms.

The last decade has seen a proliferation of cybersecurity guidance and large increases in cyber spending (over $123 billion globally on products and services in 2020), and yet large, market-leading companies continue to be victimized by major attacks. Why? Because, in addition to constant challenges with cyber hygiene, the universe of threats that could target an enterprise is opaque, and efforts to define it often focus on techniques for initial access: for example, phishing emails, exploiting public-facing applications, brute forcing valid account passwords and using USB sticks. Defending against these techniques is, in turn, different from defending an organization once an adversary has gained an initial foothold inside it.

Similarly, most security guidance is controls-focused but provides little direction on how to map threats to control choices, and usually provides very limited guidance on how to link threats with controls assurance and testing. Security vendors have historically offered poor characterizations of tool coverage against threat techniques, and defenders often struggle in tuning tools, leading to a false sense of security.

Here we describe a more nuanced approach, one that is informed by a structured knowledge of threats and an enterprise-specific assessment of vulnerability and consequence.

Assessing Threats: the MITRE ATT&CK framework

On the threat side of the equation, one relatively recent innovation to improve the use of threat information is the MITRE Corporation’s Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) framework. MITRE is the United States’s oldest and largest operator of federally funded research and development centers, and ATT&CK, which was first publicly released in 2015, has evolved significantly since then. The ATT&CK framework is the most comprehensive and leading approach to the mapping of threat actors; related tactics, techniques, and procedures (TTPs); and mitigations openly available today (ATT&CK is copyright 2018 The MITRE Corporation. The work is reproduced and distributed with the permission of The MITRE Corporation.). The ATT&CK knowledge base gives users a new level of accuracy in risk-based planning and evaluation.

The framework is a knowledge base and behavioral model that consists of the following core components: first, tactics, denoting step-by-step tactical adversary goals during an attack lifecycle; second, techniques, describing the means by which adversaries achieve each tactical goal; and third, documented mappings of technique usage to specific adversary groups.

How does this help? The ATT&CK framework’s library of adversary groups and related motivations helps tie business objectives to threat actors, and its mapping of actors to TTPs establishes the foundation for evaluating defensive countermeasure coverage (in particular, assuming initial access has been achieved by an adversary). Likewise, security teams can use testing tools based on the ATT&CK framework to measure control performance with added accuracy, instilling greater confidence that controls are operating effectively.

In practice, what this means is that professionals can link threats to anticipated consequences in an enterprise. The ATT&CK framework can serve as the underlying knowledge base for an organization to consider how its business objectives influence its attractiveness to reasonably sophisticated threat actors defined in MITRE’s authoritative library. Put another way, it enables the mapping of business models like banking and credit card processing to the threat actor groups that target these sectors (such as Lazarus Group and FIN6) and to consequences (such as subversion of payment system integrity and theft of consumer payment account numbers).

As a result, an enterprise can model threats more realistically by mapping threat actor groups to TTPs that are relevant to that enterprise and its operations. This allows organizations to develop a sample set of threat-actor-specific TTPs—a “threat model” for the enterprise.
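The mapping from business model to threat actor groups to TTPs can be sketched in a few lines of Python. This is a minimal illustration, not MITRE's tooling: the group names, sector assignments and group-to-technique assignments below are hypothetical placeholders, and a real threat model would draw them from the ATT&CK knowledge base.

```python
from collections import defaultdict

# Hypothetical placeholder data. In practice these mappings would be drawn
# from the ATT&CK knowledge base rather than written by hand.
GROUP_SECTORS = {
    "GroupA": {"banking", "payments"},
    "GroupB": {"payments", "retail"},
    "GroupC": {"energy"},
}
GROUP_TECHNIQUES = {
    "GroupA": {"T1566", "T1059", "T1003"},
    "GroupB": {"T1059", "T1041"},
    "GroupC": {"T1190", "T1486"},
}


def build_threat_model(sectors):
    """Return {technique: set of groups} for groups that target any given sector."""
    wanted = set(sectors)
    model = defaultdict(set)
    for group, targeted in GROUP_SECTORS.items():
        if targeted & wanted:
            for technique in GROUP_TECHNIQUES[group]:
                model[technique].add(group)
    return dict(model)


threat_model = build_threat_model({"payments"})
for technique in sorted(threat_model):
    print(technique, sorted(threat_model[technique]))
```

Techniques that appear under more than one group (here, the one shared by GroupA and GroupB) are natural candidates for early attention, since a single defense addresses several adversaries.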

The particular value of ATT&CK is that its knowledge base is both practical and powerful because it reflects tactics and techniques that have actually been used in the real world. While ATT&CK is not all-encompassing, with new techniques constantly being discovered and added, TTPs are the more invariant part of a threat and are a good place to focus analytic efforts. While adversaries can change hash values, IP addresses, domains, and other indicators leveraged as part of their tradecraft, it is much more difficult for them to change tactics and techniques. David Bianco’s “Pyramid of Pain” illustrates this difficulty and how orienting defenses around TTPs thus makes it substantially harder for an adversary to change course. Indeed, while the recent SolarWinds breach was both highly sophisticated and devastating, a number of the techniques deployed against SolarWinds had previously been seen in other software supply chain attacks.

Figure 2 shows David Bianco’s “Pyramid of Pain.”

Looking at threats through the lens of TTPs allows an enterprise to leverage the many threat intelligence resources now available that use the ATT&CK framework to categorize confirmed threats and provide a view of likely adversary behavior.

For example, Red Canary’s annual Threat Detection Report provides an in-depth look at the most prevalent ATT&CK techniques based on thousands of threats detected across its customers’ environments, presenting overall and sector-specific views. Similarly, CrowdStrike’s annual Global Threat Report organizes techniques it observes against the ATT&CK framework to illuminate changes and trends in adversary behavior.

In sum, instead of just thinking “we are under threat,” the first step for an enterprise is understanding who might particularly threaten them, and how that threat is likely to be realized.

Countermeasure Coverage Assessment

With a threat model in hand, the ATT&CK framework then allows an enterprise to map its countermeasures against that threat to see if the internal security work it is doing is directed at the most critical parts of the system. It makes little sense to buy a new data loss prevention (DLP) system, for example, if your anticipated adversary will be using TTPs to attack a critical asset or process that the DLP system doesn’t protect. In short, by overlaying a threat model onto a coverage map of defensive controls, organizations can gain an understanding of which technologies and standards applied in their environments potentially address which TTPs.

How to do so? The ATT&CK framework provides a mechanism to support this mapping through the “data sources” and “mitigations” ascribed to each TTP contained in the ATT&CK framework. In other words, for each TTP, the cybersecurity community generally knows what the currently most effective defensive mitigation is. Some of these are identified in authoritative frameworks like NIST SP 800-53, NIST Cybersecurity Framework, ISO and CIS 20 Critical Security Controls. MITRE has released formalized mappings to NIST SP 800-53, and mappings to additional controls frameworks are expected. As enterprises generate overlays, they can also start to see relative values of countermeasures—for example, how a single countermeasure like process monitoring might provide broad coverage against large swaths of, but not all, TTPs in our particular threat model. These coverage maps help organizations justify and prioritize purchasing decisions, building credibility in the budgeting process.
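The overlay described above can be sketched as a set intersection between a threat model and a countermeasure coverage map. The following is an illustrative sketch only: the countermeasure names and the techniques assigned to each are assumptions made for the example, not vendor claims or MITRE mappings.

```python
# Hypothetical inputs: technique IDs in the enterprise threat model, and the
# techniques each deployed countermeasure is believed to address.
THREAT_MODEL = {"T1566", "T1059", "T1003", "T1041", "T1021"}
COVERAGE = {
    "process_monitoring": {"T1059", "T1003"},
    "email_filtering": {"T1566"},
    "dlp": {"T1041", "T1567"},  # T1567 falls outside this threat model
}


def overlay(threat_model, coverage):
    """Overlay a threat model on a countermeasure coverage map."""
    # For each technique, list the countermeasures that claim to address it.
    by_technique = {
        technique: sorted(
            name for name, covered in coverage.items() if technique in covered
        )
        for technique in sorted(threat_model)
    }
    # Techniques in the threat model that no countermeasure addresses.
    gaps = [technique for technique, names in by_technique.items() if not names]
    # Share of the threat model that each countermeasure addresses.
    breadth = {
        name: len(covered & threat_model) / len(threat_model)
        for name, covered in coverage.items()
    }
    return by_technique, gaps, breadth


by_technique, gaps, breadth = overlay(THREAT_MODEL, COVERAGE)
print("Uncovered techniques:", gaps)
for name, share in breadth.items():
    print(f"{name}: covers {share:.0%} of the threat model")
```

Breadth here counts only techniques inside the threat model, so a tool that covers many techniques the enterprise is unlikely to face does not score well for that reason alone.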

Data on the relative priority of each TTP adds further depth to the overlays we just described: now, an enterprise can not only see whether a countermeasure provides broad coverage against TTPs, but also the extent to which it covers critical or high priority TTPs. Put another way, if countermeasure A provides narrow coverage, but addresses TTPs that are all critical and high priority, it may be more impactful than countermeasure B, which provides broad coverage but mostly against medium and lower priority TTPs.
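The comparison between countermeasure A and countermeasure B can be made concrete by weighting each covered TTP by its priority. In this sketch the TTP labels, their priorities and the weight scale are all hypothetical; only the idea of ranking by priority-weighted coverage rather than raw breadth comes from the discussion above.

```python
# Assumed weight scale for illustration; an enterprise would set its own.
PRIORITY_WEIGHTS = {"low": 3, "medium": 6, "high": 12, "critical": 24}

# Hypothetical threat model: TTP label -> priority.
TTP_PRIORITY = {
    "ttp_01": "critical", "ttp_02": "critical", "ttp_03": "high",
    "ttp_04": "medium", "ttp_05": "medium", "ttp_06": "medium",
    "ttp_07": "medium", "ttp_08": "low", "ttp_09": "low",
}

# Countermeasure A is narrow but aimed at top-priority TTPs; B is broad.
COUNTERMEASURES = {
    "A": {"ttp_01", "ttp_02", "ttp_03"},
    "B": {"ttp_04", "ttp_05", "ttp_06", "ttp_07", "ttp_08", "ttp_09"},
}


def coverage_value(covered):
    """Sum the priority weights of the covered TTPs that are in the threat model."""
    return sum(
        PRIORITY_WEIGHTS[TTP_PRIORITY[ttp]] for ttp in covered if ttp in TTP_PRIORITY
    )


values = {name: coverage_value(ttps) for name, ttps in COUNTERMEASURES.items()}
ranked = sorted(values, key=values.get, reverse=True)
for name in ranked:
    print(f"Countermeasure {name}: {len(COUNTERMEASURES[name])} TTPs, value {values[name]}")
```

With these assumed weights, A covers half as many TTPs as B but carries twice the weighted value, which is the situation the paragraph describes.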

The ultimate result is exactly what the government, or a corporate board, or anyone responsible for risk mitigation in an enterprise might want in an inside-out assessment of security—a rank ordering of countermeasure priorities informed by threat information. Key factors in the evaluation will, ultimately, be the result of considerations such as the ease of attack, the difficulty of defense and organization-specific considerations about criticality. This kind of evaluation leads to the realization that not all relevant threat techniques represent the same level of risk to an organization. One can, in fact, prioritize performance measurement and defensive countermeasure investment based on risk reduction value.

Testing Performance

The creation of this prioritized list of threat-informed countermeasures also allows an enterprise to test these countermeasures and thereby evaluate how well its defensive program is performing. With such testing comes the ability for an enterprise to meaningfully strengthen its programs by validating the extent of protective and detective capabilities’ performance against simulated threat activity. Indeed, breach and adversarial simulation tools are now available that can run TTP-specific diagnostics on an organization’s technology stack.

To implement testing, an organization can select a sample of assets or images that reflect both “crown jewel” considerations and machines that may be representative of the information technology (IT) environment as a whole. The diagnostic assessments are conducted on these sample sets. Besides being retrospective, so as to determine how the enterprise is currently doing, this type of testing can also be prospective, allowing an enterprise to evaluate the potential positive impact of investment in a tool or standard on security performance. That sort of scoring is the beginning of an ability to achieve the holy grail of cybersecurity—realistic assessments of the costs and benefits of future security investments. This allows an organization to measure its own efforts and answer the important questions: Are we performing at a sufficiently robust level? Is the risk reduction worth the investment?

In the end, an evaluation based on the ATT&CK framework will give IT enterprises the same level of insight that currently exists in the healthcare field. Consider: Medical professionals have comprehensive knowledge bases that map disease attributes to diagnostics, therapies and vaccines, and related quality assurance steps. In cybersecurity, practitioners too often focus on therapies and vaccines without linking them to specific diseases on the front-end or quality assurance on the back-end. As this paper suggests, we now have the beginning of an enterprise-level ability to connect disease attributes to diagnostics, therapies and vaccines. New testing tools connect diagnostics, therapies and vaccines to calibration and quality assurance. ATT&CK’s mappings (threat to TTP; TTP to defensive countermeasure) plus TTP-specific tests now give us the ability to connect risk to risk treatment.

Quantifying Risk–Measuring Security

But how to quantify the end result? We cannot reasonably expect our effort to produce a single metric. But in some circumstances, where the enterprise-level framework is applicable, we can develop concrete measures of security performance. Recall that risk is a function of likelihood (threat x vulnerability) and impact:

Likelihood. Better data on countermeasure coverage (based on both the breadth and criticality of TTPs addressed) and performance (TTP-specific pass/fails on testing) is now available. This data can be converted into weighted numerical values (a low-value TTP might be a “3” where a high-priority TTP might be a “12”). Results can then be aggregated to generate an overall risk score, such as a 550 on a 300-850 scale (akin to a credit score) or a 65 on a 0-100 scale (akin to a grading scale). Results can be tracked over time to generate performance data and trending insights. As these approaches are repeated over time, benchmarking data starts to become available. For example, in September 2020, the Cybersecurity and Infrastructure Security Agency (CISA) mapped its aggregated risk and vulnerability assessment findings to each technique in the ATT&CK framework. CISA’s report highlights relative performance rates within its constituency in blocking or detecting ATT&CK TTPs.
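One way the weighting and aggregation just described might work is sketched below. The weights of 3 and 12 come from the example in the text; the intermediate weight of 6, the test results and the linear rescaling onto each scale are assumptions made for illustration, not a prescribed scoring method.

```python
# 3 and 12 follow the example in the text; 6 for "medium" is an assumption.
WEIGHTS = {"low": 3, "medium": 6, "high": 12}

# Hypothetical diagnostic results: (technique, priority, blocked or detected?)
RESULTS = [
    ("T1566", "high", True),
    ("T1059", "high", False),
    ("T1003", "medium", True),
    ("T1041", "medium", True),
    ("T1021", "low", False),
]


def weighted_pass_rate(results):
    """Share of total priority weight earned by TTPs that passed testing."""
    total = sum(WEIGHTS[priority] for _, priority, _ in results)
    earned = sum(WEIGHTS[priority] for _, priority, passed in results if passed)
    return earned / total


def rescale(rate, low, high):
    """Map a 0-to-1 pass rate linearly onto a reporting scale."""
    return low + rate * (high - low)


rate = weighted_pass_rate(RESULTS)
print(f"Weighted pass rate: {rate:.3f}")
print(f"Score on a 0-100 scale: {rescale(rate, 0, 100):.0f}")
print(f"Score on a 300-850 scale: {rescale(rate, 300, 850):.0f}")
```

Because the same pass rate feeds both scales, the choice between a credit-style and a grade-style score is purely one of presentation; tracking the underlying rate over time is what produces the trend data.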

Impact. By evaluating threat actor motivations, an enterprise can also generate data on primary impacts—consider, for example, financial impacts such as theft, subversion of payment systems and extortion. MITRE has created entire exfiltration and impact categories within ATT&CK. Merging these impacts with insurance loss modeling (still evolving as a model) can supplement our impact assessment in two respects: (1) by adding in secondary impact factors (third party liability, regulatory enforcement impacts, customer substitution effects); and (2) by generating dollar impact estimates.

Publicly available data on financial impacts from cyber intrusions is sparse for several reasons, including that cyber incidents are, while headline-grabbing, a relatively new phenomenon and that victims often do not report a loss or its full impact. That said, data on primary and secondary losses (particularly for tail risks) could be derived in a couple of ways:

Modeling publicly-disclosed data breach costs. Companies’ 10-Q and 10-K disclosures, and related investor presentations, are providing increasing data on the costs of cyber data breaches. For example, Equifax has disclosed that the three-year cost of its 2017 data breach, which resulted in the loss of roughly 146 million consumer records, is $1.8 billion, including technology and data security, product liability, investigative, litigation and regulatory expense accruals, all net of $187.4 million in insurance recoveries. This amounts to a cost of $12.57 per consumer record, which can be modeled onto other organizations by multiplying the Equifax ratio by an organization’s total number of records. While most data breaches will not entail Equifax-level costs, regulatory fines and potential litigation exposure are growing: consider how potential fines for violations of the European Union General Data Protection Regulation can amount to up to four percent of a firm’s annual revenue for the preceding year.
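The per-record modeling described above is simple arithmetic, sketched here with hypothetical inputs for the organization being modeled. The $12.57 ratio and the four percent ceiling are taken from the text as given; note that dividing the rounded figures quoted above ($1.8 billion by 146 million records) yields roughly $12.33, so the quoted ratio is assumed to rest on unrounded inputs.

```python
# Ratio quoted in the text. The rounded figures ($1.8B / 146M records) give
# about $12.33, so this value is assumed to come from unrounded inputs.
EQUIFAX_COST_PER_RECORD = 12.57


def modeled_breach_cost(records, cost_per_record=EQUIFAX_COST_PER_RECORD):
    """Scale a disclosed per-record cost onto another organization's record count."""
    return records * cost_per_record


def gdpr_fine_ceiling(annual_revenue, rate=0.04):
    """Four percent of prior-year revenue, the ceiling cited in the text."""
    return annual_revenue * rate


# Hypothetical organization: 2 million records, $500 million annual revenue.
cost = modeled_breach_cost(2_000_000)
fine = gdpr_fine_ceiling(500_000_000)
print(f"Modeled breach cost: ${cost:,.0f}")
print(f"Potential fine ceiling: ${fine:,.0f}")
```

A linear per-record ratio is a coarse estimate that is likely to overstate costs for most breaches, as the text notes, so it is better read as a tail-risk figure than as an expected loss.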

Modeling publicly-disclosed downtime from disruptive cyber incidents. Organizations are also disclosing downtime as a result of recent disruptive cyber incidents, which can also be modeled onto other organizations. For example, consider the June 2020 ransomware attack on Japanese motor company Honda. With the disclaimer that authoritative impact data for Honda is hard to come by, here’s our best sense (from piecing together open-source reporting and company statements) for impact to Honda:

Honda experienced a ransomware attack that reportedly started Sunday, June 7 (U.S. time), Monday June 8 (Japan time)—in addition to IT impacts, manufacturing operations were disrupted for several days (it is unclear whether this was as a direct result of ransomware or based on precautionary measures).
The ransomware’s impact on manufacturing operations reportedly included car and engine factories in North America and Turkey, plus motorcycle plants in India and Latin America. According to successive company statements, Honda resumed production at most plants by Tuesday, June 9, but auto and engine plants in Ohio continued to be impacted for some additional period—the company reported all plants were up and running by Thursday, June 11. U.S. auto manufacturing accounts for roughly 23 percent of Honda’s global auto manufacturing capacity, and Honda’s Ohio auto manufacturing plants account for roughly one-third of that (or roughly eight percent of global production capacity). That said, Honda’s Ohio engine plant is one of the company’s largest engine plants, accounting for slightly more than 20 percent of Honda’s global engine capacity and includes Honda’s most technically advanced engine assembly line in North America. So, assume that approximately eight percent of global auto capacity plus 20 percent of global engine capacity was potentially offline for three to five days (not including impacts in Turkey, India and Latin America).
Now, we have impacts measurable in terms of both time and overall manufacturing capacity. Another manufacturing organization can now model these impacts onto its own environment by asking: What would a three-to-five-day impact on eight percent of global manufacturing capacity for one core product, plus 20 percent of key-component capacity that feeds into multiple products, mean for us?
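The downtime question posed above can be turned into a rough calculation. The capacity shares (eight percent and 20 percent) and the three-to-five-day window come from the Honda example; the daily output figures are hypothetical values for the organization doing the modeling.

```python
def lost_output(daily_output, share_offline, days):
    """Units not produced when a share of capacity is offline for some days."""
    return daily_output * share_offline * days


def downtime_range(daily_output, share_offline, min_days=3, max_days=5):
    """Low and high estimates of lost units for a range of outage lengths."""
    return (
        lost_output(daily_output, share_offline, min_days),
        lost_output(daily_output, share_offline, max_days),
    )


# Hypothetical global daily output for the modeled organization.
core_product = downtime_range(daily_output=10_000, share_offline=0.08)
key_component = downtime_range(daily_output=12_000, share_offline=0.20)

print(f"Core product units lost: {core_product[0]:,.0f} to {core_product[1]:,.0f}")
print(f"Key component units lost: {key_component[0]:,.0f} to {key_component[1]:,.0f}")
```

Multiplying the lost units by a margin per unit would convert the result into dollars, and a shortfall in a key component would also need to be traced through to every product that depends on it.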

With this information in hand, an industry or an enterprise can now make risk appetite judgments about the level of security performance expected given a potential loss. For example, if 20 percent of a company’s global engine capacity going offline for three to five days is highly problematic, then a high standard of security performance might be expected in defending against threat activity that could achieve that outcome.

The Factor Analysis of Information Risk (FAIR) framework can be useful in formalizing this approach to quantifying and managing risk. FAIR helps organizations quantify risk in financial terms by categorizing loss event frequency and loss magnitude scenarios, making estimates around ranges of minimum, maximum and most likely distributions, running Monte Carlo simulations around these scenarios and distributions, and then expressing resulting risk calculations in dollar terms.
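A simplified version of the simulation step might look like the following. This is a sketch under stated assumptions rather than a FAIR implementation: it substitutes triangular distributions for whatever distribution a FAIR tool would use, treats annual loss as frequency multiplied by magnitude in each trial, and uses made-up minimum, most likely and maximum estimates.

```python
import random


def simulate_annual_loss(frequency, magnitude, trials=10_000, seed=0):
    """Monte Carlo over (minimum, most_likely, maximum) estimates.

    frequency: loss events per year; magnitude: dollars per loss event.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    losses = []
    for _ in range(trials):
        # random.triangular takes (low, high, mode), in that order.
        events = rng.triangular(frequency[0], frequency[2], frequency[1])
        loss_per_event = rng.triangular(magnitude[0], magnitude[2], magnitude[1])
        losses.append(events * loss_per_event)
    losses.sort()
    return {
        "mean": sum(losses) / trials,
        "median": losses[trials // 2],
        "p90": losses[int(trials * 0.9)],
    }


# Hypothetical estimates for a single loss scenario.
summary = simulate_annual_loss(
    frequency=(0.1, 0.5, 2.0),
    magnitude=(50_000, 400_000, 5_000_000),
)
for name, value in summary.items():
    print(f"{name}: ${value:,.0f}")
```

Reporting a median and a 90th percentile alongside the mean keeps the tail of the distribution visible, which matters when the scenarios of greatest concern are rare but severe.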

Loss event frequency (analogous to likelihood within a given timeframe, such as a year) is defined in FAIR as a function of threat event frequency and vulnerability. Vulnerability, in turn, is determined by estimates of threat capability and resistance strength. In our model, threat event frequency would be roughly analogous to a basic threat profile (that is, the threat campaigns likely to be deployed against you this year). Threat frequency analysis could be enriched as data on sector-specific campaigns and TTP sightings, such as in the above-described Red Canary report, becomes more available. Threat capability would be analogous to a TTP’s risk rating, and resistance strength could potentially be derived from whether a TTP is successfully blocked or detected in a diagnostic.
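The analogy drawn above can be sketched as follows. This is a deliberately crude stand-in for FAIR's treatment of vulnerability: it assumes vulnerability can be approximated as the risk-rating-weighted share of TTPs that a diagnostic failed to block or detect, and the ratings, results and threat event frequency are hypothetical.

```python
# Hypothetical diagnostics: TTP label -> (risk rating, blocked or detected?).
# The risk rating stands in for threat capability; the pass/fail result
# stands in for resistance strength.
DIAGNOSTICS = {
    "ttp_a": (12, True),
    "ttp_b": (12, False),
    "ttp_c": (6, True),
    "ttp_d": (3, False),
}


def vulnerability(diagnostics):
    """Rating-weighted share of TTPs that were neither blocked nor detected."""
    total = sum(rating for rating, _ in diagnostics.values())
    missed = sum(rating for rating, stopped in diagnostics.values() if not stopped)
    return missed / total


def loss_event_frequency(threat_event_frequency, vuln):
    """Expected loss events per year: threat events times share that succeed."""
    return threat_event_frequency * vuln


vuln = vulnerability(DIAGNOSTICS)
lef = loss_event_frequency(threat_event_frequency=4, vuln=vuln)
print(f"Vulnerability: {vuln:.2f}")
print(f"Loss event frequency: {lef:.2f} per year")
```

The resulting frequency could serve as one input to a simulation of annual loss, with the caveat that a pass/fail diagnostic is only a proxy for how a control would perform against a live adversary.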

Loss magnitude in FAIR is a function of both primary losses and secondary risks. Exfiltration and impact tactics within ATT&CK are roughly analogous to primary loss types, although they would still need to be quantified in financial terms, as would secondary losses.

Limitations

There are any number of limitations that will impact the utility of this kind of detailed, enterprise-level assessment of security. First, and most obviously, the countermeasure assessment anticipated is often enterprise-specific (though in some industries, like retail, a sector-level threat model might be sufficient for all but the largest enterprises). Unlike higher strategic-level metrics, the assessment contemplated here will necessarily require a meaningful level of engagement and expertise, a requirement that suggests the need for independent, trusted intermediaries who are capable of conducting the assessment and whose conclusions are generally accepted and credited. In short, this form of security evaluation will, of necessity, lead to the creation of a security-assessment standard of practice (including training and certifications) akin in some ways to the generally accepted accounting principles used in accounting today.

Second, given the level of detail necessary for this evaluation, it seems likely that there will be questions about the scalability of implementation of this type of metric. It follows, we think, that this metric will be of greatest utility for larger, high-value, high-impact enterprises whose security is of particular importance to national security or the economy. That said, we believe that many opportunities exist to create efficiencies for smaller organizations—these include, on the front end, the development of sector-level threat models by Information Sharing and Analysis Organizations, and on the back-end, adoption of these threat models and related testing by managed security service providers.

Third, given the highly detailed nature of the inquiry involved, it is likely that much of the data necessary to complete the assessment will be highly confidential (with respect to individual countermeasures) and proprietary (with respect to TTP sightings in incidents and related risk characterizations). Thus, some level of de-identified data sharing also seems necessary if the metric is to meet the standard of transparency required for widespread utility. Such models exist in other domains: The aviation community has, for example, successfully established the Aviation Safety Information Analysis and Sharing (ASIAS) platform, run by MITRE. ASIAS fuses airline proprietary data sets, manufacturer data, Federal Aviation Administration data and other sources to proactively identify safety trends. Such models will be important prerequisites for a fully transparent metric of security.

Possible Future Avenues

Given this, we offer three thoughts on possible future avenues for developing this inside-out metric for cybersecurity assessment. The core mechanisms for incentivizing strong cybersecurity performance are regulation, procurement and insurance, and each should be pursued.

First, regulation may drive adoption of this concept. For example, bank regulators continue to emphasize the importance of sound practices to strengthen operational resilience. Firms are, for example, expected to regularly review and update their systems and controls for security against evolving threats. Regulators have indicated they intend to convene public discussions in the coming months on how to improve operational resilience. Banks should thus consider how the assessment described here could be used to address regulatory expectations.

Second, the government should consider a pilot program for this form of procurement-related assessment. Large members of the Defense Industrial Base, for example, are of sufficient national significance that they might be incentivized to use this form of self-assessment as a way of responding to the Department of Defense’s move to incorporate security performance expectations into its procurement process. For example, the Cybersecurity Maturity Model Certification program includes expectations on risk management, security assessment, situational awareness and related domains that could be addressed in part through the above-described approach.

Third, the insurance and reinsurance industries also have a client base of national economic significance. They, too, might consider a pilot project to test more fully the utility of this type of security assessment in rating the risks to their insureds.

Conclusion

Like so much of cybersecurity, the concepts we outline here are in flux. The assessment we describe necessarily changes as business drivers, threats and technology platforms evolve. That said, the ATT&CK framework brings transparency to threat tradecraft that was formerly opaque (particularly post-initial access); TTPs are significantly more difficult for adversaries to alter than other threat attributes, and individual TTPs are used by multiple threat actor groups, offering opportunities to scale defenses. To mitigate the risk of cybersecurity metrics fighting the last war, rather than the next one, threat models need to be periodically updated—perhaps annually or on the occurrence of a significant (SolarWinds) event. That is why we value so highly the nimble, flexible, modular nature of the approach we have described. While it is no silver bullet, it does offer the prospect for creating a reproducible, valid measure of security performance for an enterprise with significantly more transparency and accuracy than anything currently available.