Create and Assess Test and Evaluation Strategies


Definition: A Test and Evaluation strategy "...provide[s] information about risk and risk mitigation, ...[and] empirical data to validate models and simulations, evaluate technical performance and system maturity, and determine whether systems are operationally effective, suitable, and survivable... [1]."

Keywords: evaluation, governance, strategy, test

MITRE SE Roles & Expectations: MITRE systems engineers (SEs) work with sponsors to create or evaluate test and evaluation strategies in support of acquisition programs. They are often asked to recommend test and evaluation approaches that provide insights for managing acquisition risks. They also monitor government and contractor test and evaluation processes and recommend changes when warranted. After contract award, MITRE SEs evaluate test and evaluation master plans produced by both contractors and government test organizations. They also evaluate the test plans and procedures applied during development testing, operational testing, and, for some customers, live fire testing; occasionally they help formulate those plans and procedures as a member of, or advisor to, the government test team. Consequently, MITRE SEs are expected to understand the rationale behind the requirement for acquisition programs to create and execute a test and evaluation strategy. They are expected to understand where test and evaluation activities such as interoperability testing, information assurance testing, and modeling and simulation fit in the acquisition life cycle, and where they can be used most effectively to identify and mitigate risk. Finally, MITRE SEs are expected to address test and evaluation concerns in the course of their other activities, such as requirements and design analysis.

Background

The fundamental purpose of test and evaluation (T&E) is to "provide knowledge to assist in managing the risks involved in developing, producing, operating, and sustaining systems and capabilities. T&E measures progress in both system and capability development. T&E provides knowledge of system capabilities and limitations to the acquisition community for use in improving the system performance, and the user community for optimizing system use in operations. T&E expertise must be brought to bear at the beginning of the system life cycle to provide earlier learning about the strengths and weaknesses of the system under development. The goal is early identification of technical, operational, and system deficiencies, so that appropriate and timely corrective actions can be developed prior to fielding the system." [1] The program manager is responsible for creating and submitting a test and evaluation strategy after the decision is made to pursue a materiel solution. Creating the strategy involves planning for technology development, including its associated risks; evaluating the system design against mission requirements; and identifying where competitive prototyping and other evaluation techniques fit in the process.

The content of a test and evaluation strategy is a function of where it is applied in the acquisition process, the requirements for the capability to be provided, and the technologies that drive the required capability. A test and evaluation strategy should lead to the knowledge required to manage risks; the empirical data required to validate models and simulations; the evaluation of technical performance and system maturity; and a determination of operational effectiveness, suitability, and survivability. In the end, the goal of the strategy is to identify, manage, and mitigate risk, which requires identifying the strengths and weaknesses of the system or service being provided to meet the end goal of the acquisition program. Ideally, the strategy should drive a process that confirms compliance with the Initial Capabilities Document (ICD), instead of discovering later that functional, performance, or non-functional goals are not being met. The discovery of problems late in the test and evaluation phase can have significant cost impacts as well as substantial operational repercussions.

Historically, test and evaluation consisted of testing a single system, element, or component, and was carried out in a serial manner. One test would be performed, data would be obtained, and then the system would move to the next test event, often at a new location with a different test environment. The evaluations themselves were likewise performed serially, with determinations of how well the system met its required capabilities established by combining test results from multiple sites with differing environments. The process was time consuming and inefficient, and with the advent of network-centric data-sharing strategies, it became insufficient, in large part because the acquisition approach did not easily accommodate the incremental addition of capabilities. Creating and maintaining an effective test and evaluation strategy under those conditions would have been difficult at best. A test and evaluation strategy is a necessity today for two reasons: the addition of capabilities via incremental upgrades is now the norm, and acquisition has shifted to a network-centric construct in which data is separated from the applications; data is posted and made available before it is processed; collaboration is employed to make data understandable; and a rich set of network nodes and paths provides the required supporting infrastructure.

When a set of capabilities must be delivered as quickly as possible, creating a test and evaluation strategy becomes still more complex, especially when ICDs are largely nonexistent, ambiguous, inconsistent, or incomplete. In this situation, developing a test and evaluation strategy is a significant challenge, and in some cases it may be largely bypassed to get a capability into the field as quickly as possible. Even then, risk assessments and mitigation strategies are not omitted; they are simply accomplished at a high level very early in the process. Quick reaction capabilities (QRCs) of this sort are often followed by a more formal acquisition effort, a program of record. Nonetheless, test and evaluation of QRCs cannot be completely ignored: at the outset, the critical capabilities must be identified, and their risks must be identified, managed, and mitigated through some level of test and evaluation.

Government Interest and Use

Government acquisition communities are recognizing the need for a test and evaluation strategy that is in concert with evolving department and agency network-centric data-sharing strategies. Although a test and evaluation strategy is created early in the acquisition process (Figure 1), it has to be refined as the acquisition process evolves and system details become more specific. A test and evaluation strategy needs to be developed early in the acquisition process to ensure that it is consistent with the acquisition strategy, identifies the required resources (facilities, ranges, personnel, and equipment, including government-furnished equipment), encourages shared data access, engages the appropriate government test agencies, identifies where and when modeling and simulation will be employed, and establishes both the contractor's and government's test and evaluation efforts.

MITRE can and should influence how a test and evaluation strategy evolves and is applied, and, in particular, should ensure that it is consistent with the acquisition strategy and the systems engineering plan, if there is one. It is rare for MITRE, or any other single organization, to be asked to independently create a test and evaluation strategy. It is far more common for MITRE to collaborate with the government stakeholders to create a test and evaluation strategy, or to be employed to evaluate and recommend changes to a strategy that is the product of a test and evaluation working group or other test and evaluation stakeholder organization. In these instances, it is important that MITRE become a collaborator and consensus builder.

In most instances, the government establishes a working group to execute the test and evaluation strategy. This group is often referred to as a test and evaluation working integrated product team, and it consists of test and evaluation subject matter experts from the program office, customer headquarters, customer user representatives, test and evaluation organizations, higher oversight organizations (e.g., Office of the Secretary of Defense for DoD systems), supporting FFRDCs, and other stakeholders. The test and evaluation strategy is a living document, and this group is responsible for any updates that are required over time. The program manager looks to this group to ensure that test and evaluation processes are consistent with the acquisition strategy and that the user's capability-based operational requirements are met at each milestone in the program. Finally, as a program progresses from pre-systems acquisition to systems acquisition, the test and evaluation strategy begins to be replaced by a test and evaluation master plan, which becomes the guiding test and evaluation document (Figure 1). The DoD's interest in and application of a test and evaluation strategy is documented in Incorporating Test and Evaluation into Department of Defense Acquisition Contracts [2] and Chapter 9 of the Defense Acquisition Guidebook [3].

Figure 1. T&E in the Defense Acquisition Management System [4]

Best Practices and Lessons Learned

New thinking required for T&E in net-centric and SOA environments: The transition to network-centric capabilities has introduced new test and evaluation challenges. Network capabilities can reside in both nodes and links, and the basic system capabilities can reside in service-oriented architecture (SOA) infrastructures, with the remaining capabilities provided by services hosted on the SOA infrastructure. Test and evaluation of capabilities in this type of framework requires new thinking and a new strategy. For example, evaluating the performance of the network itself is unlikely to be accomplished without extensive use of modeling and simulation, because the expense of adding live nodes in a lab increases dramatically with the number of nodes added to the test apparatus. This places a greater burden on the veracity of the modeling and simulation, because one of the keys to obtaining the metrics that will support risk mitigation is understanding the effect of a new host platform on the network infrastructure, as well as the effect of the network infrastructure on the new host platform. A test and evaluation strategy that mitigates risk in the development of a network infrastructure supporting network-centric warfare requires a balance of theoretical analysis and laboratory testing. MITRE can help develop a strategy that employs a mix of verified, validated, and accredited modeling and simulation; laboratory testing; and distributed testing that takes advantage of other network-enabled test components and networks. The capabilities required to execute a network-centric test and evaluation strategy have evolved over the past few years, and today a rich set of networks (such as the Defense Research and Engineering Network [DREN] and the Secret DREN [SDREN]) hosts nodes that constitute government laboratories, university facilities, test centers, operational exercise sites, contractor facilities, and coalition partner facilities.
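
As a notional illustration of why modeling and simulation carries so much of this load, the following Python sketch estimates end-to-end message latency for progressively larger simulated networks. The hop-count growth and per-hop latency figures are purely hypothetical stand-ins for a verified, validated, and accredited network model, not part of any actual toolset.

    import random
    import statistics

    def simulate_mesh_latency(num_nodes, trials=1000, hop_latency_ms=(2.0, 8.0), seed=1):
        """Estimate end-to-end latency for random source/destination pairs in a
        notional mesh whose routing path length grows with network size."""
        rng = random.Random(seed)
        expected_hops = max(1, num_nodes.bit_length())  # assumed ~log2(N) hop growth
        samples = []
        for _ in range(trials):
            hops = max(1, int(rng.gauss(expected_hops, 1.0)))
            samples.append(sum(rng.uniform(*hop_latency_ms) for _ in range(hops)))
        mean_ms = statistics.mean(samples)
        p95_ms = statistics.quantiles(samples, n=100)[94]
        return mean_ms, p95_ms

    if __name__ == "__main__":
        # Compare a small "live lab"-sized network against larger simulated ones.
        for nodes in (8, 64, 512, 4096):
            mean_ms, p95_ms = simulate_mesh_latency(nodes)
            print(f"{nodes:5d} nodes: mean {mean_ms:5.1f} ms, 95th percentile {p95_ms:5.1f} ms")

Even a toy model like this makes the cost argument concrete: the simulated node count can be increased by orders of magnitude at no additional cost, while each added live node in a lab carries real expense.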

Test organizations have limited experience with some emerging technology aspects of the network-centric transformation, and these aspects are where MITRE can help create and assess test and evaluation strategies. These new technology areas constitute the heart of the SOA that will make up the enterprise, as well as the services themselves that provide new capabilities.

Accounting for governance in T&E: The transition to a service-based enterprise introduces new complexities that must be accounted for in the test and evaluation strategy. Service-based enterprises rely on a more formalized business model for the identification of required capabilities. While this is not a new concept, the formalization of business processes into the engineering process, and the addition of the concomitant governance, add new complexities to both the systems engineering and test and evaluation processes. A test and evaluation strategy must account for governance of capabilities (e.g., services) as well as the capabilities themselves. Service repositories become a critical part of the test and evaluation strategy, which must address how they are distributed, populated, managed, and accessed, since a critical aspect of service-based capabilities is the reuse of existing services to compose new capabilities.
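
A minimal sketch of the idea, using invented field names, is shown below: a registry entry that carries governance metadata alongside the service description, and a check of the kind a test and evaluation review might run against it.

    from dataclasses import dataclass, field

    @dataclass
    class ServiceRegistryEntry:
        """Notional registry record pairing a service with its governance metadata."""
        name: str
        version: str
        owner: str
        approval_status: str                      # e.g., "draft", "approved", "deprecated"
        access_roles: list = field(default_factory=list)
        endpoint: str = ""

    def governance_findings(entry):
        """Flag governance gaps a test and evaluation review might look for."""
        findings = []
        if entry.approval_status != "approved":
            findings.append(f"{entry.name} v{entry.version}: not approved for reuse")
        if not entry.access_roles:
            findings.append(f"{entry.name}: no access policy defined")
        if not entry.endpoint:
            findings.append(f"{entry.name}: no published endpoint to test against")
        return findings

    if __name__ == "__main__":
        entry = ServiceRegistryEntry(name="track-fusion", version="1.2",
                                     owner="program-office-A", approval_status="draft")
        for finding in governance_findings(entry):
            print(finding)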

Accounting for business process re-engineering and scalability of service-based infrastructure in T&E: The shift to network-centric, service-based enterprise capabilities is rarely accomplished in a single stroke; instead, it proceeds incrementally, beginning with business process re-engineering and the identification of a scalable service-based infrastructure. Both of these activities need to be incorporated into the test and evaluation strategy, and their evaluation should begin as early as possible. Prototyping and competitive prototyping are common techniques for evaluating service-based infrastructures, especially the ability of the infrastructure to scale to meet future needs and extend to accommodate future capabilities.
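
As one illustration of the kind of scalability evidence a prototype evaluation can produce, the sketch below drives a stand-in service at increasing offered loads and records the achieved rate; in a real program the service call, load levels, and pass/fail thresholds would come from the prototype infrastructure and its requirements, not from this hypothetical code.

    import time

    def measure_throughput(handle_request, offered_loads_per_s, duration_s=1.0):
        """Drive a prototype service at increasing offered loads and record the
        achieved request rate, exposing where scaling begins to fall off."""
        results = {}
        for offered in offered_loads_per_s:
            interval = 1.0 / offered              # pacing between requests
            completed = 0
            start = time.perf_counter()
            while time.perf_counter() - start < duration_s:
                handle_request()
                completed += 1
                time.sleep(interval)
            results[offered] = completed / duration_s
        return results

    if __name__ == "__main__":
        # Stand-in for a call into the prototype infrastructure under evaluation.
        def noop_service():
            sum(range(1000))

        for offered, achieved in measure_throughput(noop_service, [50, 100, 200]).items():
            print(f"offered {offered:4d}/s -> achieved {achieved:6.0f}/s")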

The importance of factoring in refactoring: Business process re-engineering leads to segregating capabilities into those that will be provided by newly developed services and those that will be provided by refactored legacy components. It also enables a block and spiral upgrade strategy for introducing new capabilities. Evaluating how capabilities are allocated between new development and refactored legacy components is critical to the health of the program and should be another early element of the test and evaluation strategy. Each legacy component selected for refactoring must be analyzed to determine how tightly it is coupled to both the data and other processes. Failure to do so can lead to the kind of "sticker shock" some current programs have experienced when attempting to add capabilities through spiral upgrades.

Distributed test environments: A key distinction of, and enabling concept in, the network-centric, service-based construct is the ability to reuse capabilities through a process referred to as finding and binding. Achieving the true acquisition benefits of service-based programs requires that reusable capabilities be discoverable and accessible. To do this, service registries must be established and a distributed test environment must be employed, which in turn places new requirements on the test and evaluation strategy for these types of programs. Distributed test and evaluation capabilities must be planned for, resourced, and staffed, and shared data repositories must be established to support distributed test and evaluation. Network infrastructures that host a wide variety of nodes (e.g., DREN and SDREN) can support distributed test and evaluation; however, early planning is required to ensure they will be funded and available to meet program test and evaluation needs.
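
The find-and-bind pattern itself is simple to illustrate. The sketch below uses a toy in-memory registry and callable endpoints as stand-ins for the enterprise service registry and the distributed test nodes a real program would use; all names are hypothetical.

    class ServiceRegistry:
        """Toy in-memory stand-in for an enterprise service registry."""

        def __init__(self):
            self._providers = {}

        def publish(self, capability, endpoint):
            """Providers (e.g., nodes in a distributed test) advertise a capability."""
            self._providers.setdefault(capability, []).append(endpoint)

        def find(self, capability):
            """Discover endpoints advertising the requested capability."""
            return self._providers.get(capability, [])

    def bind_and_invoke(registry, capability, request):
        """'Find and bind': discover an endpoint, then call it with the request."""
        endpoints = registry.find(capability)
        if not endpoints:
            raise LookupError(f"no provider registered for {capability!r}")
        endpoint = endpoints[0]                   # a real client might filter or load-balance
        return endpoint(request)

    if __name__ == "__main__":
        registry = ServiceRegistry()
        # In a distributed test, providers at different sites would publish here.
        registry.publish("geocode", lambda req: {"site": "lab-A", "result": f"geocoded {req}"})
        print(bind_and_invoke(registry, "geocode", "objective point 12"))

What the test and evaluation strategy must resource is the distributed, governed equivalent of this pattern: registries, provider nodes, and the network paths between them, available when the program needs them.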

Importance of metrics for loose coupling in T&E strategy: Another area where a test and evaluation strategy can be effective early in a service-based acquisition program is in the continuous evaluation and measurement of the loose coupling that maintains separation of data and applications, and enables changes in services with minimal impact to other services. The average contractor business model leans toward tight coupling simply because it ensures that the contractor is continuously engaged throughout the program's life cycle. Failure to establish and apply metrics for loose coupling as part of the test and evaluation strategy will lead to a lack of insight into system performance; the impact of tight coupling with respect to interfaces will be unknown until the interfaces are actually in play, which is often too late to mitigate the risk involved. Consequently, the test and evaluation strategy must include an identification and metrics-based analysis of interfaces to mitigate the risk that data and applications are tightly coupled; the earlier this is accomplished, the easier it is to mitigate the problem.
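
A sketch of one possible metrics-based interface analysis follows. It computes fan-in, fan-out, and an instability-style ratio from a declared component dependency map; the metric definitions, component names, and any thresholds applied to them are illustrative assumptions, not a prescribed standard.

    from collections import defaultdict

    def coupling_metrics(dependencies):
        """Compute fan-in, fan-out, and an instability-style ratio per component.

        `dependencies` maps a component to the components whose interfaces it calls.
        Instability = fan_out / (fan_in + fan_out): values near 1.0 indicate a
        component that depends heavily on others; values near 0.0 indicate one
        that many others depend on.
        """
        fan_out = {c: len(set(deps)) for c, deps in dependencies.items()}
        fan_in = defaultdict(int)
        for deps in dependencies.values():
            for called in set(deps):
                fan_in[called] += 1
        metrics = {}
        for component in set(dependencies) | set(fan_in):
            out, inn = fan_out.get(component, 0), fan_in.get(component, 0)
            instability = out / (inn + out) if (inn + out) else 0.0
            metrics[component] = {"fan_in": inn, "fan_out": out,
                                  "instability": round(instability, 2)}
        return metrics

    if __name__ == "__main__":
        declared_interfaces = {"mission-planner": ["track-store", "geocode"],
                               "track-viewer": ["track-store"],
                               "track-store": []}
        for name, values in sorted(coupling_metrics(declared_interfaces).items()):
            print(name, values)

Tracking measures such as these release over release gives the program quantitative evidence of creeping tight coupling long before the interfaces are exercised in integration.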

Data sharing implications for T&E strategy: Often overlooked in the development and test and evaluation of service-based enterprises are the core capabilities for data sharing. While time is devoted to the test and evaluation of services that enable data sharing, the underlying technologies that support it are often not brought into the test and evaluation process until late. The technologies critical to data discovery and sharing are embedded in metadata catalog frameworks and ontology products, both of which require a relatively specialized skill set. As a consequence, discovery and federation through harmonized metadata are overlooked, and individual contractor metadata is employed for discovery instead. This leads to a downstream need for resource adapters that bridge the metadata used in one part of the enterprise, or for one type of data, to the rest of the enterprise. In several instances, the downstream requirement for resource adapters has ballooned to cover nearly every data store in the enterprise. A test and evaluation strategy that incorporated the harmonization of metadata, the development of a single ontology, and the early test and evaluation of these items would have saved time and money and delivered a capability to the warfighter earlier.
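
The following sketch illustrates metadata harmonization with a simple crosswalk from two hypothetical contractor schemas to a common schema; unmapped fields surface exactly the gaps that would otherwise drive downstream resource adapters. All field and schema names are invented for illustration.

    # Crosswalks from two invented contractor metadata schemas to a harmonized schema.
    CROSSWALKS = {
        "contractor_a": {"lat": "latitude", "lon": "longitude", "obs_time": "observation_time"},
        "contractor_b": {"y_coord": "latitude", "x_coord": "longitude", "timestamp": "observation_time"},
    }

    def harmonize(record, source):
        """Translate a source-specific record into the harmonized schema and
        report any fields the crosswalk does not yet cover."""
        crosswalk = CROSSWALKS[source]
        harmonized, unmapped = {}, []
        for field_name, value in record.items():
            if field_name in crosswalk:
                harmonized[crosswalk[field_name]] = value
            else:
                unmapped.append(field_name)
        return harmonized, unmapped

    if __name__ == "__main__":
        record = {"y_coord": 38.92, "x_coord": -77.20,
                  "timestamp": "2010-06-01T12:00Z", "src": "sensor-7"}
        harmonized, unmapped = harmonize(record, "contractor_b")
        print(harmonized)    # fields discoverable under the common schema
        print(unmapped)      # fields that would otherwise require a resource adapter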

Summary

The shift to a network-centric data-sharing strategy has introduced a new set of challenges in the acquisition process. Incremental development of capabilities has become the norm, and distributed enterprise capabilities are the desired end state. Test and evaluation must evolve to keep pace with this shift in development processes. This article captures a few of the best practices and lessons learned; the list could be extended considerably with other practices that provide significant value for risk identification, management, and mitigation. In addition, as information technology in particular evolves, the risk areas will shift and coalesce, driving the need for new and updated test and evaluation strategies.

References & Resources

  1. Department of Defense Instruction Number 5000.02, December 8, 2008, Operation of the Defense Acquisition System, USD(AT&L), Enclosure 6 Integrated T&E, pg. 50.
  2. Incorporating Test and Evaluation into Department of Defense Acquisition Contracts, 2009, Office of the Deputy Under Secretary of Defense for Acquisition and Technology, Washington, DC.
  3. Defense Acquisition Guidebook.
  4. DoD Presentation: Test and Evaluation Working Integrated Product Team, August 17, 2009.

Additional References & Resources

Department of Defense Directive Number 5000.01, May 12, 2003, The Defense Acquisition System, USD(AT&L).
