Create and Assess Test and Evaluation Strategies

Definition: A Test and Evaluation strategy "provides program engineers and decision makers with knowledge to measure progress, identify problems, and to characterize system capabilities and limitations, and manage technical and programmatic risks" [1].

Keywords: evaluation, governance, strategy, test

MITRE SE Roles and Expectations: MITRE systems engineers (SEs) work with sponsors to create or evaluate test and evaluation strategies in support of acquisition programs. They are often asked to recommend test and evaluation approaches, which provide insights that can be used to manage acquisition risks. They also monitor government and contractor test and evaluation processes and recommend changes when they are warranted. Subsequent to contract award, MITRE SEs evaluate test and evaluation master plans produced by both contractors and government test organizations. They also evaluate test plans and procedures that are applied during development testing, operational testing, and for some sponsors, live fire testing. Occasionally they help formulate the plans and procedures as a member or adviser to the government test team. As a consequence, MITRE SEs are expected to understand why acquisition programs are required to create and execute a test and evaluation strategy. They are expected to understand where test and evaluation activities such as interoperability testing, information assurance testing, and modeling and simulation fit in the acquisition life cycle and where they can be used most effectively to identify and mitigate risk. Finally, in the course of their other activities such as requirements and design analysis, MITRE SEs are expected to include test and evaluation concerns in their analysis.


The fundamental purpose of test and evaluation (T&E) is to "enable the DoD [Department of Defense] to acquire systems that work. To that end, T&E provides engineers and decision makers with knowledge to assist in managing risks, to measure technical progress, and to characterize operational effectiveness, suitability, and survivability. This is done by planning and executing a robust and rigorous T&E program" [1].

The program manager is responsible for creating and submitting a test and evaluation strategy after the decision is made to pursue a materiel solution. The creation of the test and evaluation strategy involves planning for technology development (including risk), evaluating the system design against mission requirements, and identifying where competitive prototyping and other evaluation techniques fit in the process.

The content of a test and evaluation strategy is a function of where it is applied in the acquisition process, the requirements for the capability to be provided, and the technologies that drive the required capability. A test and evaluation strategy should lead to the knowledge required to manage risks, the empirical data required to validate models and simulations, the evaluation of technical performance and system maturity, and a determination of operational effectiveness, suitability, and survivability. In the end, the goal of the strategy is to identify, manage, and mitigate risk, which requires identifying the strengths and weaknesses of the system or service being provided to meet the end goal of the acquisition program. Ideally, the strategy should drive a process that confirms compliance with the Initial Capabilities Document (ICD), instead of discovering later that functional, performance, or nonfunctional goals are not being met. The discovery of problems late in the test and evaluation phase can have significant cost impacts as well as substantial operational repercussions.

Historically, test and evaluation consisted of testing a single system, element, or component and was carried out in a serial manner. One test would be performed, data would be obtained, and then the system would move to the next test event, often at a new location with a different test environment. Similarly, the evaluations themselves were typically performed in a serial manner: how well the system met its required capabilities was established by combining test results obtained from multiple sites with differing environments. The process was time consuming and inefficient, and with the advent of network-centric data-sharing strategies, it became insufficient. In large part, this was due to an acquisition approach that did not easily accommodate the incremental addition of capabilities. Creating and maintaining an effective test and evaluation strategy under those conditions would have been difficult at best. A test and evaluation strategy is necessary today because adding capabilities via incremental upgrades is now the norm, and there has been a shift to a network-centric construct in which data is separated from the applications, data is posted and made available before it is processed, collaboration is used to make data understandable, and a rich set of network nodes and paths provides the required supporting infrastructure.

The need to deliver a set of capabilities as quickly as possible can introduce further complexity to creating a test and evaluation strategy, especially in cases where ICDs are largely nonexistent, ambiguous, inconsistent, or incomplete. In this situation, the development of a test and evaluation strategy represents a significant challenge, and in some cases, it may be largely ignored to get a capability in the field as quickly as possible. However, this approach still involves risk assessments and mitigation strategies; they are simply accomplished at a high level very early in the process. Quick reaction capabilities (QRCs) of this sort are often followed by a more formal acquisition effort, a program of record. Nonetheless, test and evaluation of QRCs cannot be completely ignored. At the outset, the critical capabilities must be identified, and their risks must be identified, managed, and mitigated through some level of test and evaluation.

The advent of service-oriented architectures, virtualization, cloud technologies, and agile acquisition and development methodologies has further complicated the need for and development of test and evaluation strategies. Agile software development emphasizes individuals and interactions over processes and tools, working software over comprehensive documentation, sponsor collaboration over contract negotiation, and responding to change over following a plan. In this type of environment, rigid adherence to requirements documents and checklists is virtually nonexistent, and developers have to wear both test and quality assurance hats, in the same manner that the test team has to take a strong development viewpoint. Even where there are regulatory requirements for testing, the test and evaluation approach is less about finding problems and improving the product than it is about verification, regulatory compliance, and auditing. Test and evaluation strategies become a collaborative activity with strong sponsor involvement that has less focus on validation than on building quality in, recognizing that everyone has a role, and providing fast feedback to the development team and faster delivery into production for the sponsor.

Government Interest and Use

Government acquisition communities are recognizing the need for a test and evaluation strategy that is in concert with evolving department and agency network-centric data-sharing strategies. Although a test and evaluation strategy is created early in the acquisition process (Figure 1), it has to be refined as the acquisition process evolves and system details become more specific. A test and evaluation strategy needs to be developed early in the acquisition process to ensure that it is consistent with the acquisition strategy, identifies the required resources (facilities, ranges, personnel, and equipment, including government-furnished equipment), encourages shared data access, engages the appropriate government test agencies, identifies where and when modeling and simulation will be used, and establishes both the contractor's and government's test and evaluation efforts.

MITRE can and should influence how a test and evaluation strategy evolves and is applied and, in particular, should ensure that it is consistent with the acquisition strategy and the systems engineering plan, if there is one. It is rare for MITRE, or any other single organization, to be asked to independently create a test and evaluation strategy. It is far more common for MITRE to collaborate with the government stakeholders to create a test and evaluation strategy, or to be employed to evaluate and recommend changes to a strategy that is the product of a test and evaluation working group or other test and evaluation stakeholder organization. In these instances, it is important that MITRE become a collaborator and consensus builder.

In most instances, the government establishes a working group to execute the test and evaluation strategy. This group is often referred to as a test and evaluation working-level integrated product team, and it consists of test and evaluation subject matter experts from the program office, sponsor headquarters, the sponsor's user representatives, test and evaluation organizations, higher oversight organizations (e.g., Office of the Secretary of Defense for DoD systems), supporting federally funded research and development centers (FFRDCs), and other stakeholders. The test and evaluation strategy is a living document, and this group is responsible for any updates that are required over time. The program manager looks to this group to ensure that test and evaluation processes are consistent with the acquisition strategy and that the user's capability-based operational requirements are met at each milestone in the program. Finally, as a program progresses from pre-systems acquisition to systems acquisition, the test and evaluation strategy begins to be replaced by a test and evaluation master plan, which becomes the guiding test and evaluation document (Figure 1). The DoD's interest in and application of a test and evaluation strategy is documented in Incorporating Test and Evaluation into Department of Defense Acquisition Contracts [2] and Chapter 9 of the Defense Acquisition Guidebook [3].

Figure 1. T&E in the Defense Acquisition Management System [4]

Best Practices and Lessons Learned

New thinking required for T&E in network-centric and SOA environments. The transition to network-centric capabilities has introduced new test and evaluation challenges. Network capabilities can reside in both nodes and links, and the basic system capabilities can reside in service-oriented architecture (SOA) infrastructures, with the remaining capabilities provided by services that are hosted on the SOA infrastructure. Test and evaluation of capabilities in this type of framework requires new thinking and a new strategy. For example, evaluating the performance of the network itself is probably not going to be accomplished without extensive use of modeling and simulation because the expense of adding live nodes in a lab increases dramatically with the number of nodes added to the test apparatus. This places a greater burden on the veracity of the modeling and simulation because one of the keys to obtaining the metrics that will support risk mitigation is to understand how a new host platform will affect the network infrastructure, as well as how the network infrastructure will affect the new host platform. A test and evaluation strategy that mitigates risk in the development of a network infrastructure that will support network-centric warfare requires a balance of theoretical analysis and laboratory testing. MITRE can help develop a strategy that uses a mix of modeling and simulation that has been verified, validated, and accredited; laboratory testing; and distributed testing that takes advantage of other network-enabled test components and networks. The capabilities required to execute a network-centric test and evaluation strategy have evolved over the past few years, and today we have a rich set of networks (such as the Defense Research and Engineering Network [DREN] and the Secret Defense Research and Engineering Network [SDREN]) that host nodes that constitute government laboratories, university facilities, test centers, operational exercise sites, contractor facilities, and coalition partner facilities.

The network-centric transformation has emerging technology aspects with which test organizations have limited experience, and these aspects are where MITRE can help create and assess test and evaluation strategies. These new technology areas constitute the heart of the SOA that will make up the enterprise, as well as the services themselves that make up new capabilities.

Accounting for governance in T&E. The transition to a service-based enterprise introduces some new complexities that must be accounted for in the test and evaluation strategy. Service-based enterprises rely on a more formalized business model for identifying required capabilities. Though this is not a new concept, the formalization of business processes into the engineering process, and the addition of the concomitant governance, add new complexities to both the systems engineering and the test and evaluation processes. A test and evaluation strategy must account for the governance of capabilities (e.g., services) as well as the capabilities themselves. Service repositories become critical parts of the test and evaluation strategy and must encompass how services are distributed, populated, managed, and accessed because a critical aspect of service-based capabilities is the reuse of existing services to compose new capabilities.

Accounting for business process re-engineering and scalability of service-based infrastructure in T&E. The shift to network-centric service-based enterprise capabilities is rarely accomplished in a single stroke; instead, it is accomplished incrementally, beginning with business process re-engineering and the identification of scalable service-based infrastructures. Both of these activities need to be incorporated into the test and evaluation strategy, and their evaluation should begin as early as possible. Prototyping and competitive prototyping are common techniques used to evaluate service-based infrastructures, especially the ability of the infrastructure to scale to meet future needs and extend to accommodate future capabilities.

The importance of factoring in refactoring. Business process re-engineering leads to segregating capabilities into those that will be provided by newly developed services and those that will be provided by refactored legacy components. It also enables a block and spiral upgrade strategy for introducing new capabilities. Evaluating how to decide which capabilities will be newly developed and which will be refactored legacy components is critical to the health of the program and should constitute another early and critical aspect of the test and evaluation strategy. Each legacy component selected for refactoring must be analyzed to determine how tightly coupled it is to both the data and other processes. Failure to do so can lead to the sort of "sticker shock" some current programs have experienced when attempting to add capabilities through spiral upgrades.

Distributed test environments. A key distinction of, and enabling concept in, the network-centric service-based construct is the ability to reuse capabilities through a process referred to as finding and binding. Achieving the true acquisition benefits of service-based programs requires that capabilities that can be reused be discoverable and accessible. To do this, service registries must be established and a distributed test environment must be employed, which in turn places new requirements on the test and evaluation strategy for these types of programs. Distributed test and evaluation capabilities must be planned for, resourced, and staffed, and shared data repositories must be established that will support distributed test and evaluation. Network infrastructures exist that host a wide variety of nodes that can support distributed test and evaluation (e.g., DREN and SDREN). However, early planning is required to ensure they will be funded and available to meet program test and evaluation needs.

Importance of metrics for loose coupling in T&E strategy. Another area in which a test and evaluation strategy can be effective early in a service-based acquisition program is in the continuous evaluation and measurement of the loose coupling that maintains separation of data and applications and enables changes in services with minimal impact to other services. The average contractor business model leans toward tight coupling simply because it ensures that the contractor is continuously engaged throughout the program's life cycle. Failure to establish and apply metrics for loose coupling as part of the test and evaluation strategy will lead to a lack of insight into system performance; the impact of tight coupling with respect to interfaces will be unknown until the interfaces are actually in play, which is often too late to mitigate the risk involved. Consequently, the test and evaluation strategy must include an identification and metrics-based analysis of interfaces to mitigate the risk that data and applications are tightly coupled; the earlier this is accomplished, the easier it is to mitigate the problem.
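A metrics-based analysis of interfaces can start very simply. The following is a minimal sketch, assuming a hypothetical map of which services call which others; the service names, the `coupling_metrics` function, and the choice of fan-in, fan-out, and the instability ratio (fan-out divided by fan-in plus fan-out) as the metrics are illustrative assumptions, not measures prescribed by this article.

```python
from collections import defaultdict


def coupling_metrics(dependencies):
    """Return {service: (fan_in, fan_out, instability)}.

    `dependencies` maps each service name to the services it calls.
    instability = fan_out / (fan_in + fan_out); 0.0 when a service has
    no interfaces at all. Self-references are ignored.
    """
    services = set(dependencies)
    for callees in dependencies.values():
        services.update(callees)

    # Count how many distinct services depend on each service.
    fan_in = defaultdict(int)
    for caller, callees in dependencies.items():
        for callee in set(callees):
            if callee != caller:
                fan_in[callee] += 1

    metrics = {}
    for svc in sorted(services):
        fan_out = len(set(dependencies.get(svc, ())) - {svc})
        total = fan_in[svc] + fan_out
        instability = fan_out / total if total else 0.0
        metrics[svc] = (fan_in[svc], fan_out, instability)
    return metrics


# Hypothetical service map, for illustration only.
EXAMPLE_DEPENDENCIES = {
    "track_service": {"data_store", "auth_service"},
    "map_service": {"data_store", "auth_service", "track_service"},
    "auth_service": set(),
    "data_store": set(),
}

if __name__ == "__main__":
    for name, (ca, ce, inst) in coupling_metrics(EXAMPLE_DEPENDENCIES).items():
        print(f"{name}: fan_in={ca} fan_out={ce} instability={inst:.2f}")
```

Tracking numbers like these from one increment to the next gives the test team an early signal when a service begins to accumulate dependencies, well before the interfaces are exercised in an integrated test.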

Data-sharing implications for T&E strategy. Often overlooked in development and test and evaluation of service-based enterprises are the core capabilities for data sharing. Although time is devoted to the test and evaluation of services that enable data sharing, the underlying technologies that support it are often not brought into the test and evaluation process until late. The technologies critical to data discovery and sharing are embedded in metadata catalog frameworks and ontology products, both of which require a skill set that is more esoteric than most. As a consequence, aspects of discovery and federation through the use of harmonized metadata are overlooked, and instead individual contractor metadata is employed for discovery. This leads to a downstream need for resource adapters that bridge metadata used in one part of the enterprise or for one type of data to other parts of the enterprise. In several instances, the downstream requirement for resource adapters has ballooned to account for nearly every data store in the enterprise. A test and evaluation strategy that incorporated the harmonization of metadata, the development of a single ontology, and the early test and evaluation of these items would have saved time and money and delivered a capability earlier.
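The growth in resource adapters can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that every pair of independently defined metadata vocabularies eventually needs to exchange data and that one bidirectional adapter serves each pair; under that assumption the adapter count grows quadratically, whereas mapping each vocabulary once to a single harmonized ontology grows linearly. The function names are hypothetical.

```python
def pairwise_adapters(n_vocabularies):
    """Adapters needed if every pair of vocabularies is bridged directly."""
    if n_vocabularies < 0:
        raise ValueError("n_vocabularies must be non-negative")
    return n_vocabularies * (n_vocabularies - 1) // 2


def ontology_mappings(n_vocabularies):
    """Mappings needed if each vocabulary maps once to a shared ontology."""
    if n_vocabularies < 0:
        raise ValueError("n_vocabularies must be non-negative")
    return n_vocabularies


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(n, pairwise_adapters(n), ontology_mappings(n))
```

With 10 vocabularies, for example, the pairwise approach needs 45 adapters and the shared-ontology approach needs 10 mappings.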

Agile development implications for T&E strategy. Agile development, with its small teams, user stories in place of requirements, scrums, and significant change in focus, requires careful rethinking and replanning of the T&E strategy. The roles of designers, coders, and testers are blurred, as are the distinctions between managers and workers and between sponsors and developers. A T&E strategy that accommodates agile development in any of its many forms must recognize this blurring of roles and adopt a shared-roles model for the project. In addition, test and evaluation must be automated to the maximum extent possible so that testing becomes an activity that is shared across the team and not solely the province of the test team. This, in turn, suggests that test artifacts should be developed with maximum reuse in mind so that their value is captured and retained. Finally, agile development often involves a significant level of refactoring. A T&E strategy that works well with agile development methodologies will recognize this and build in mechanisms or governance rules that ensure that both developers and T&E staff are aware of any refactoring. Otherwise, that refactoring can be invisible to the T&E staff in particular, and confirmation that the system works as it did before can be lacking.
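One automated, reusable way to confirm that refactored code works as it did before is a characterization (baseline) check: record what the existing code does for a set of inputs, then compare the refactored code against that record. The following is a minimal sketch under stated assumptions; `legacy_range_km` and `refactored_range_km` are hypothetical stand-ins for real components, and the helper names are not taken from any particular test framework.

```python
def characterize(func, inputs):
    """Record the result (or exception type) of func for each input tuple."""
    observed = {}
    for args in inputs:
        try:
            observed[args] = ("ok", func(*args))
        except Exception as exc:  # record the failure mode, do not hide it
            observed[args] = ("error", type(exc).__name__)
    return observed


def behavior_changes(baseline, refactored_func):
    """Return {input: (before, after)} for inputs whose behavior differs."""
    current = characterize(refactored_func, list(baseline))
    return {
        args: (baseline[args], current[args])
        for args in baseline
        if baseline[args] != current[args]
    }


# Hypothetical legacy component and its refactored replacement.
def legacy_range_km(speed_kmh, hours):
    return speed_kmh * hours


def refactored_range_km(speed_kmh, hours):
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * speed_kmh


if __name__ == "__main__":
    cases = [(10, 2), (0, 5), (10, -1)]
    baseline = characterize(legacy_range_km, cases)
    for args, (before, after) in behavior_changes(baseline, refactored_range_km).items():
        print(f"{args}: before={before} after={after}")
```

In this example the refactoring silently changes how negative durations are handled, which is exactly the kind of difference that should be surfaced to both developers and T&E staff so they can decide whether it is intended.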


The shift to a network-centric data-sharing strategy has introduced a new set of challenges in the acquisition process. Incremental development of capabilities has become the norm, and distributed enterprise capabilities are the desired end-state. Test and evaluation must evolve to keep pace with the shift in development processes. This article captures a few of the best practices and lessons learned, but the list could go on at length to include those practices that still provide significant risk identification, management, and mitigation. In addition, as information technology in particular evolves, the risk areas will shift and coalesce, driving the need for new and updated test and evaluation strategies.

References and Resources

  1. Department of Defense Instruction Number 5000.02, January 7, 2015, Operation of the Defense Acquisition System, USD(AT&L), Enclosure 4 Developmental T&E, pp. 90, 98, accessed October 10, 2017.
  2. Incorporating Test and Evaluation into Department of Defense Acquisition Contracts, October 24, 2011, Office of the Deputy Under Secretary of Defense for Acquisition and Technology, Washington, D.C., accessed October 10, 2017.
  3. Defense Acquisition Guidebook, accessed October 10, 2017.
  4. Mosser-Kerner, D., August 17, 2009, Test and Evaluation Working Level Integrated Product Team, Department of Defense, accessed October 10, 2017.

Additional References and Resources

Department of Defense Directive Number 5000.01, May 12, 2003, certified current as of November 20, 2007, The Defense Acquisition System, USD(AT&L).

