Systems Engineering Guide

Assess Test and Evaluation Plans and Procedures

Definition: Test and evaluation is the set of practices and processes used to determine if the product under examination meets the design, if the design correctly reflects the functional requirements, and if the product performance satisfies the usability needs of personnel in the field.

Keywords: acceptance test, integration test, operational test, peer reviews, system test

MITRE SE Roles & Expectations: MITRE systems engineers (SEs) are expected to be familiar with different kinds of tests, which group conducts the tests, how to evaluate test documents, and the developer's procedures for test control. MITRE SEs are also expected to analyze test data.

Background

Testing is the way a product, system, or capability under development is evaluated for correctness and robustness and shown to meet the stated requirements. Testing is done at each stage of development and has characteristics unique to the level of test being performed. At a macro level, testing can be divided into testing conducted before the system is placed under configuration management and testing conducted afterward. Testing done before configuration management includes peer reviews (sometimes called human testing) and unit tests. Testing done after configuration management includes integration test, system test, acceptance test, and operational test. Operational test is normally conducted by government testing agencies; the other tests are conducted by the developer, although in some cases, such as acceptance test, government observers are present.

Assessing Test and Evaluation Plans and Procedures

Assessment normally begins with the Test and Evaluation Master Plan (TEMP), which is the driver for much of what follows. The TEMP is developed by the government; detailed test plans and procedures are created by the developer. The scope, direction, and content of the TEMP are driven by the nature of the program, the life cycle, the user needs, and the user mission. For example, testing software developed for the program is quite different from testing systems that are largely based on, and require considerable integration of, commercial-off-the-shelf (COTS) products. The TEMP will influence the testing documents produced by the developer, but the developer's documents are largely driven by what it produces and is working to deliver.

The government program management office (PMO) is tasked with assessing the developer's test and evaluation plans and procedures, and MITRE often plays a central role in helping the PMO perform this assessment. The requirements on which the developer's test plans and procedures are based must be well crafted: a valid requirement is measurable and testable, and a requirement that is neither is a poor one. Developer test plans and procedures should be based on the functional requirements, not the software design. Both the test community and the development community within the developer organization should base their products on the functional requirements.

When assessing the developer's test plans and procedures, the focus should be the purpose of the test: to assess the correctness and robustness of the product, system, or service. The tests should prove, first, that the product can do what it is intended to do and, second, that it can withstand anomalous conditions that may arise. This second point requires particular care because there are huge differences in how robustness is validated in a COTS-based system versus software developed for a real-time embedded system. The environment in many COTS-based business systems can be tightly bounded: a name or address field can be limited in terms of acceptable characters and field length. In a real-time embedded system, you know what the software expects to receive if all is going as it should, but you do not always know what input data might actually arrive, which can vary in data type, data rate, and so on. Denial-of-service attacks often try to overwhelm a system with data, and the developer's skill in building in robustness that allows the system to handle data it was never intended to process has a great deal to do with the eventual reliability and availability of the delivered product. It is not unusual for the error protection logic in complex government systems to be as large as, or larger than, the operational software.
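
To make this concrete, the following minimal sketch (in Python, with function and field names invented for illustration and not drawn from any program discussed here) contrasts validating a tightly bounded business field with defensively handling a message whose content cannot be predicted; in both cases anomalous input is reported and contained rather than allowed to crash the software.

    # Hypothetical sketch: two contrasting robustness checks. A bounded business
    # field can be validated against explicit limits, while a handler for a
    # real-time feed must survive data it was never intended to process.
    MAX_NAME_LEN = 64
    ALLOWED = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ -'.")

    def validate_name_field(value: str) -> bool:
        """Reject out-of-bounds or disallowed input for a tightly bounded field."""
        return 0 < len(value) <= MAX_NAME_LEN and all(c in ALLOWED for c in value)

    def handle_message(raw: bytes) -> str:
        """Report anomalous input instead of letting it propagate or crash."""
        try:
            fields = raw.decode("utf-8").split(",")
            if len(fields) != 3:
                return "rejected: wrong field count"
            float(fields[2])  # payload must be numeric
            return "accepted"
        except (UnicodeDecodeError, ValueError):
            return "rejected: malformed payload"

    if __name__ == "__main__":
        assert validate_name_field("O'Connor")
        assert not validate_name_field("x" * 200)  # exceeds field length
        assert handle_message(b"id1,track,42.0") == "accepted"
        assert handle_message(b"\xff\xfe garbage").startswith("rejected")
        print("robustness checks passed")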

Assessment of the test plans and procedures must take all of these issues into account. The assessor must understand the nature and purpose of the system and the kind of software involved, and must have the experience to examine the test plans and procedures to assure they do an appropriate job of verifying that the software functions as intended. The assessor must also verify that, when faced with anomalous data conditions, the software will respond and deal with the situation without crashing. The test conditions in the test plans and procedures should present a wide variety of data conditions and record the responses.

For software systems, especially real-time systems, it is impossible to test all possible paths through the software, but it should be possible to test all independent paths to ensure all segments of the software are exercised by the tests. Software tools such as the McCabe suite can facilitate this by identifying the independent paths and the test conditions to include in each test case. However it is accomplished, this level of rigor is necessary to assure that the requisite reliability has been built into the software.
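
As an illustration of this level of rigor, the hypothetical sketch below (not tied to any particular tool's output) shows a function with a cyclomatic complexity of three and the three test cases needed to exercise its linearly independent paths; a tool such as the McCabe suite performs the equivalent analysis on real code bases.

    # Hypothetical basis-path example. Two decision points give a cyclomatic
    # complexity of 2 + 1 = 3, so at least three test cases are needed to cover
    # three linearly independent paths through the function.
    def classify_reading(value: float, limit: float) -> str:
        if value < 0:        # decision 1
            return "invalid"
        if value > limit:    # decision 2
            return "out of range"
        return "nominal"

    # One test condition per independent path.
    BASIS_PATH_CASES = [
        (-1.0, 10.0, "invalid"),       # decision 1 taken
        (99.0, 10.0, "out of range"),  # decision 2 taken
        (5.0, 10.0, "nominal"),        # both decisions false
    ]

    if __name__ == "__main__":
        for value, limit, expected in BASIS_PATH_CASES:
            assert classify_reading(value, limit) == expected
        print("all independent paths exercised")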

Unlike unit tests, integration test plans and procedures focus on the interfaces between program elements. These tests must verify that the data being passed between program elements allows the elements to function as intended, while also assuring that anomalous data conditions are dealt with at their entry point and not passed on to other programs within the system. The assessor must pay particular attention to this when assessing the integration test plans and procedures. These tests must be driven by the functional requirements, because the functional requirements define what the software must do for the system to be accepted by the sponsor.
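
The following hypothetical sketch illustrates the kind of check an integration test procedure should exercise: a well-formed record passes across the interface between two modules, while an anomalous record is stopped at its entry point rather than handed downstream. The module and field names are invented for illustration.

    # Hypothetical integration-test sketch for an interface between two modules.
    import unittest

    def ingest(record: dict) -> dict:
        """Entry-point module: enforce the interface contract before passing data on."""
        if not isinstance(record.get("track_id"), int) or "position" not in record:
            raise ValueError("rejected at entry point")
        return record

    def correlate(record: dict) -> str:
        """Downstream module: relies on ingest() having enforced the contract."""
        return "track {} at {}".format(record["track_id"], record["position"])

    class InterfaceIntegrationTest(unittest.TestCase):
        def test_valid_record_flows_through(self):
            result = correlate(ingest({"track_id": 7, "position": (1.0, 2.0)}))
            self.assertIn("track 7", result)

        def test_anomalous_record_stopped_at_entry(self):
            with self.assertRaises(ValueError):
                ingest({"track_id": "not-an-int"})  # bad type, missing position

    if __name__ == "__main__":
        unittest.main()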

Test and Evaluation Phases

Pre-Configuration Management Testing

The two primary test practices conducted prior to configuration management are:

  • Peer Reviews are performed to find as many errors as possible in the software before the product enters integration test. Peer reviews are one of the key process areas at Level 3 of the Software Engineering Institute's (SEI) Capability Maturity Model. The SEI accepts two kinds of peer reviews: code walkthroughs and software inspections (the SEI-preferred process, sometimes called Fagan Inspections in reference to Mike Fagan, who developed the process). Software inspections have a well-defined process understood throughout the industry. Done properly, software inspections can remove as much as 87 percent of the life-cycle errors in the software. There is no standard process for walkthroughs, which can have widely differing levels of rigor and effectiveness and at best will remove about 60 percent of the errors in the software.
  • Unit Test is conducted by the developer, typically on the individual modules under development. Unit test often requires the use of drivers and stubs because other modules, which are the source of input data or receive the output of the module being tested, are not ready for test.
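
The hypothetical sketch below illustrates the second practice: a module under test depends on another module that is not yet ready, so a stub supplies its output and the test itself acts as the driver. The names and the consumption figure are invented for illustration.

    # Hypothetical unit-test sketch using a stub for an unfinished dependency.
    import unittest
    from unittest import mock

    def compute_fuel_margin(route_miles: float, fuel_lookup) -> float:
        """Module under test; fuel_lookup will eventually come from another module."""
        available = fuel_lookup(route_miles)
        required = route_miles * 0.8  # illustrative consumption model
        return available - required

    class FuelMarginUnitTest(unittest.TestCase):
        def test_margin_with_stubbed_dependency(self):
            stub_lookup = mock.Mock(return_value=1000.0)  # stub for the missing module
            margin = compute_fuel_margin(500.0, stub_lookup)
            self.assertAlmostEqual(margin, 600.0)
            stub_lookup.assert_called_once_with(500.0)

    if __name__ == "__main__":
        unittest.main()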

Post-Configuration Management Testing

Testing conducted after the product is placed under developer configuration control includes all testing beyond unit test. Once the system is under configuration management, a problem discovered during testing is recorded as a trouble report. This testing phase becomes progressively more expensive because it involves integrating more and more modules and functional units as they become available, so the system under test becomes increasingly complex. Each test requires a documented test plan and procedure, and each problem encountered is recorded on a trouble report. Each proposed fix must be validated against the test procedure during which the problem was discovered, and it must also be verified that the code inserted to correct the problem does not cause another problem elsewhere. With each change made to respond to a problem, the associated documentation must be updated, the fix must be documented as part of the configuration management process, and the fix must be included in the next system build so that testing is not conducted with patches. The longer it takes to find a problem, the more rework is likely and the more impact the fix may have on other system modules, so the expense continues to increase. This is why performing good peer reviews and unit tests is so important.
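
The minimal sketch below models the trouble-report bookkeeping described above: each report is tied to the test procedure that exposed the problem, and it closes only when the fix is carried in a build and the originating procedure has been rerun. The fields and workflow are illustrative, not a description of any particular program's configuration management system.

    # Hypothetical trouble-report bookkeeping under configuration control.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class TroubleReport:
        report_id: str
        found_in_procedure: str   # test procedure that exposed the problem
        description: str
        fixed_in_build: Optional[str] = None
        regression_verified: bool = False

        def close(self, build_id: str, reran_procedure: bool) -> None:
            """Close only when the fix is in a build and the originating
            procedure has been rerun successfully against that build."""
            if not reran_procedure:
                raise ValueError("fix must be validated against the originating procedure")
            self.fixed_in_build = build_id
            self.regression_verified = True

    if __name__ == "__main__":
        tr = TroubleReport("TR-0042", "INT-PROC-12", "message queue overflow on burst input")
        tr.close(build_id="BUILD-07", reran_procedure=True)
        print(tr)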

  • Integration Test is a developer test that becomes successively more complex. It begins by integrating the component parts, which are either the modules that have completed unit test or COTS products, to form functional elements. The integration test progresses from integration of modules to form entire functional elements, to integration between functional elements, to software-hardware integration testing. Modeling and simulation are often used to provide an operational-like testing environment. An integration test is driven by an integration test plan and a set of integration test procedures. Typically an integration test will have embedded within it a subset of tests identified as regression tests, which are conducted following a system build. Their objective is to verify that the build process did not create a serious problem that would prevent the system from being properly tested. Often regression tests can be automated (see the sketch following this list).
  • Test Data Analysis: When conducting peer reviews, unit tests, integration testing, and system tests, a significant amount of data is collected and metric analysis is conducted to show the state of the system. Significant metric data is produced on such things as defect density, pass-fail results on test procedures, and error trends. MITRE SEs should be familiar with test metrics, and they should evaluate the test results and determine the likelihood that the system will meet its performance requirements and be delivered on time and within budget.
  • System Test is an operational-like test of the entire system being developed. Following a successful system test, a determination is made as to whether the system is ready for acceptance test; a test readiness review (TRR) may be conducted after system test to assess that readiness.
  • Acceptance Test is witnessed by the government and is the last test before the government formally accepts the system. The acceptance test is often a subset of the procedures run during system test.
  • Operational Test is performed by an operational unit of the government. It is the final test before the system is declared ready for general distribution to the field.
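
The sketch referenced under Integration Test above shows one way a regression subset might be automated and run after each system build; the checks, names, and version string are invented for illustration.

    # Hypothetical automated regression subset run after a system build.
    import unittest

    def build_version() -> str:
        return "1.4.2"   # stand-in for querying the delivered build

    def core_services_available() -> bool:
        return True      # stand-in for a health check of key system functions

    class RegressionSmokeTests(unittest.TestCase):
        """Subset of integration tests rerun after every build."""

        def test_build_is_identified(self):
            self.assertRegex(build_version(), r"^\d+\.\d+\.\d+$")

        def test_core_services_respond(self):
            self.assertTrue(core_services_available())

    if __name__ == "__main__":
        # Run only the regression subset, as a build pipeline might.
        suite = unittest.defaultTestLoader.loadTestsFromTestCase(RegressionSmokeTests)
        unittest.TextTestRunner(verbosity=2).run(suite)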

Best Practices and Lessons Learned

  • Examine the reports on the pre-configuration management tests to evaluate the error density information and determine the expected failure rates that should be encountered during subsequent test periods.
  • Review the peer review and unit test results prior to the start of integration testing. Due to the expense and time needed to correct problems discovered in the post-configuration management tests, the systems engineer should understand how thorough the prior tests were, and whether there is a hint of any issues that need to be addressed before the integration test starts.
  • If peer reviews and unit tests are done properly, the error density trend data during the integration test should show an error density of 0.2 to 1.2 defects per 1,000 source lines of code (see the sketch following this list).
  • Consider modeling and simulation options to support or substitute for some aspects of integration that are either of lower risk or extremely expensive or complex to perform with the actual system.
  • Complete a thorough independent review of the test results to date prior to supporting the TRR. This is especially true for performance or design areas deemed to be of the greatest risk during the design phase. Once the TRR is passed and the program enters acceptance testing, correcting problems is extremely expensive and time consuming.
  • Involve the government Responsible Test Organization (RTO) early (during the concept development phase is not too early) so it understands the programmatic and technical issues on the program. Including the RTO on the team with the acquisition and engineering organizations will lessen conflicts that arise from lack of communication and misunderstanding of objectives.
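
The error-density check mentioned in the third bullet reduces to simple arithmetic, as in the hypothetical sketch below; the defect and line counts are made up for illustration, and the 0.2 to 1.2 band is the one cited above.

    # Hypothetical defect-density trend check (defects per 1,000 source lines).
    def defect_density(defects_found: int, source_lines: int) -> float:
        """Defects per thousand source lines of code (KSLOC)."""
        return defects_found / (source_lines / 1000.0)

    def within_expected_band(density: float, low: float = 0.2, high: float = 1.2) -> bool:
        return low <= density <= high

    if __name__ == "__main__":
        density = defect_density(defects_found=45, source_lines=60000)
        print("integration-test defect density: {:.2f} per KSLOC".format(density))
        if not within_expected_band(density):
            print("outside expected band -- re-examine peer review and unit test rigor")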


