Data-Driven Contractor Evaluations and Milestone Reviews

Definition: Data-driven contractor evaluations and milestone reviews provide an objective assessment of contractor performance at technical milestone reviews. Technical reviews and the content to be addressed are typically prescribed by government agency or department mandates available to MITRE staff and other project members prior to the actual milestone.

Keywords: empirical data, independent technical assessments, metrics, milestone reviews, performance assessments, technical reviews

MITRE SE Roles and Expectations: MITRE systems engineers (SEs) are expected to provide technical thought leadership and assessment throughout an entire government program life cycle. Although ongoing insight is needed to quickly grasp and respond to program risks and opportunities, its importance peaks at event-driven milestones when key government decisions are made. At those times, MITRE SEs are expected to lead and participate in teams reviewing the contractor-proposed technical approach. MITRE SEs analyze design review content against milestone entry and exit criteria to ensure that the contractor delivers quality products on time and within budget. They are expected to assess the contractor's technical and programmatic approaches, work packages, prototypes, and deliverables before and during reviews to identify issues and ensure that decision makers are provided with data-driven recommendations during technical and program milestone reviews [1].


Government milestone reviews over the past decade reflect current thinking within the federal government: establish affordability goals, emphasize them, and review them at major milestones and decision points so that sound trade-off decisions can be made. This has been especially apparent in the Department of Defense Better Buying Power initiatives. Without solid data-driven analysis, decision makers lack the basis for making the right decision.

MITRE SEs can assume many roles at technical milestone reviews. Depending on the size and complexity of the program, many staff members may support the same technical review; on some programs, only one or two do. Staff typically serve as subject matter experts (SMEs) for the specific technical areas to be reviewed (e.g., adequacy of requirements capture, maturity of the architecture); they provide informal and formal assessments to the government sponsor. It is also not uncommon for MITRE to develop an overall assessment of the entire technical review. This assessment may include aggregating the input from MITRE staff and other program office contractor support. Whatever the scope, focus, or size of the MITRE review effort, the overall assessment must be based largely on empirical data, metrics and the trends they indicate, and demonstrated system performance. During reviews, MITRE staff need to be prepared, inquisitive, confident, technically competent, thorough, current with program progress, tactful in dealing with the contractor, and convincing in their overall assessments. Finally, the resulting assessment of and recommendation on whether the technical review "passed" or "failed" can have a significant impact on whether the program meets its schedule or experiences long and costly delays.

Government Interest and Use

The government has myriad guidelines and mandates that define how systems should be acquired, developed, delivered, and sustained. In attempts to track the progress of a system development, the government has also defined a set of technical reviews to be conducted at various phases of development. Conducting these reviews successfully requires insight into contractor progress. Although it is a government responsibility to formally sign off on the final assessment of a technical review, federally funded research and development centers (FFRDCs) are relied on heavily to provide convincing and credible technical evidence to support the assessment.

Independent, fact-based engineering analysis is essential to government program managers (PMs) in making their assessment of whether a program meets its technical review criteria.

For large, critical, and high-visibility programs undergoing oversight by their respective department or agency acquisition authority, conducting an Independent Technical Assessment (ITA) to assess the maturity of the program at a major technical review (e.g., Preliminary Design Review [PDR], Critical Design Review [CDR]) can help develop objective evidence to inform the final assessment. An Independent Program Assessment (IPA) is performed for a government sponsor to determine the health of a project or program, either in response to known or suspected programmatic deficiencies or as a tool to regularly assess program health status. An IPA is a structured assessment, usually of short duration, that uses predefined program success criteria focused on program outcomes and the ability to meet user requirements, maintain schedule, and execute within cost and resource constraints. The assessment must be fact-based, drawing conclusions from evidence obtained through interviews with program staff and key stakeholders and through thorough examination of program artifacts. The IPA results are reported to senior government leaders, highlighting identified gaps, risks, and recommended mitigation actions.

Independent assessments can be used in other reviews to evaluate the health of a program. Major defense acquisition programs are required to conduct annual Configuration Steering Boards to review proposed changes to the program’s requirements or significant technical configuration changes that may impact cost and schedule performance.

It is important to ensure that technical recommendations are not influenced by the natural, collective desire of program stakeholders for the program to be viewed as a success and to move forward. Because of program pressures to succeed, technical assessments that indicate program problems may not be immediately embraced. In rare cases, it may be necessary to provide a formal, independent message of record to the PM documenting the technical assessment, the rationale for the perceived risk to the program (i.e., the likelihood of not meeting technical objectives, schedule, or cost, and the impact), what may happen if the situation is not addressed, and recommended steps to mitigate the risk. The PM should be made aware of such a message and its contents personally before it is issued. Although such a communication may not be welcomed in the short term, in the long run, it maintains the high standard that our customers expect of us.

Best Practices and Lessons Learned

Ensure consensus-based entry/exit criteria. The name, purpose, and general requirements of each technical review in standard acquisition processes are usually well defined in department or agency regulations [2]. What is often not done, but is essential for conducting a coordinated and successful technical review, is to ensure that the government team and contractor have documented formal entry and exit criteria and that consensus has been reached on their content. If these do not exist, it is important to ensure that they are created and defined. The entry/exit criteria should be tailored to meet the needs of each program. This is an area where it is important to emphasize criteria (e.g., data, prototypes, and metrics) that can be objectively assessed. Sample entry/exit criteria for many reviews are contained in the Mission Planning Technical Reviews [3].
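
As an illustration of what "objectively assessable" criteria can look like, the following is a minimal Python sketch, not a prescribed MITRE or government format. The criterion names, thresholds, and measured values are all hypothetical; the point is only that each criterion pairs an agreed threshold with a measured value, so whether it is met is not a matter of opinion.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One exit criterion with an agreed, measurable threshold (hypothetical)."""
    name: str
    threshold: float
    measured: float
    higher_is_better: bool = True

    def met(self) -> bool:
        # A criterion is met when the measured value is on the right
        # side of the threshold that government and contractor agreed on.
        if self.higher_is_better:
            return self.measured >= self.threshold
        return self.measured <= self.threshold


def unmet_criteria(criteria):
    """Return the names of criteria that are not yet satisfied."""
    return [c.name for c in criteria if not c.met()]


# Illustrative values only; real criteria are tailored to each program.
criteria = [
    Criterion("requirements traced to design (%)", 95.0, 97.5),
    Criterion("open priority-1 defects", 0.0, 2.0, higher_is_better=False),
    Criterion("prototype throughput (msgs/s)", 500.0, 520.0),
]

print(unmet_criteria(criteria))
```

A list like this, agreed to before the review, gives the team a shared basis for stating which criteria were satisfied and which were not.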

Prepare, prepare, prepare. The backgrounds, skill sets, and experiences of the systems engineering team supporting the government at a technical review can vary widely. Depending on our role in the supported program, MITRE can and should instigate and lead government preparation meetings to ensure that entry/exit criteria are known, the responsibilities of each SME are defined ahead of time, there is a pre-review of artifacts/contract data requirements lists, and the government leaders attending have been "prepped" on the strengths and weaknesses of the contractor and on where they should weigh in. It is also beneficial to conduct technical review "dry runs" with the contractor prior to the review. At the same time, be sensitive to the demands that dry runs place on the contractor. Structure them to be less formal and intrusive while still achieving the insight they provide. The benefits of these dry runs are:

  • They require the contractor to prepare for the review earlier and reduce the possibility that they will create "just-in-time" charts for the major review that may have disappointing content from the government perspective. If the content falls short of expectations, there is time for the contractor to correct it.
  • They allow more people to attend a version of the review and have their questions answered because meetings will be smaller. Though key PM and technical team members will attend both the dry run and the final review, others are likely to attend only one.
  • They allow a graceful way to reschedule the review if the contractor is not ready by the dry run. This is especially important for programs that are under substantial scrutiny.

Divide and conquer. No one can know all aspects of a contractor's effort, regardless of how able the staff are, how long they have been on the program, or how technically competent they are. A program's systems engineering staff resources may also be weighted toward a particular discipline (e.g., software engineers, radar engineers, network specialists). Program technical reviews are all-encompassing. They must address user requirements, risk identification and mitigation, performance, architecture, security, testing, integration, and more. If staff resources are limited, it is advisable to assign SMEs who are strong in one discipline (e.g., software engineering) secondary responsibility for another discipline (e.g., risk identification) at the technical review. This ensures that all disciplines are covered at some level during the review and provides the opportunity to train staff in secondary systems engineering disciplines, which broadens their skill set and helps the government in the long run.

Gauge "ground truth" for yourself. Be aware of the true program progress well ahead of the review. Know the "real" workers responsible for day-to-day development, who may be different from those presenting progress reports at reviews. This will allow you to gauge progress more accurately. It requires advance preparation, including meeting with programmers, attending contractor in-house peer reviews, reviewing development metrics, witnessing early prototype results, observing in-house testing, and spending time in the contractor's facility to know fact from fiction.

Assess when fresh. Recognize that technical reviews can be long, tedious, information packed, and physically and mentally draining events. As difficult as it may be, attempt to conduct a government team caucus at the end of each day to review what was accomplished and to gain preliminary team feedback. Meetings do not have to be long; a half hour can be sufficient. It is advantageous to gather team members' impressions because they can quickly confirm the review's formal presentations or uncover differences. Use the entry/exit criteria to voice what was "satisfactory" and what was not. Finally, when it is time to aggregate all input for the entire review, it is valuable to have the daily reviews to streamline the assembly of the formal assessment.

Use mostly data, part "gut feeling." Although it is desirable for the technical reviews to be civil, "just the facts" affairs, sometimes exchanges become contentious and relationships between government and contractor representatives become strained. Personalities can get involved and accusations may be made, which are driven more by defensive instincts than impartial assessment of data. This is the time to make maximum use of objective data to assess contractor progress and solution development maturity, while refraining from over-reliance on anecdotal information and subjective assertions. Use metrics and the trends they illuminate as the basis for questions during the review and assessments after the review. Assess metrics to demonstrate software size, progress, and quality. (For software-intensive systems, it may be advisable to compare productivity/defect rates to other industries [4], other military systems [5], or CMMI maturity level standards [6].) Examine and, if necessary, challenge preliminary data to indicate system performance, reliability, and user satisfaction. Use staffing metrics to corroborate sufficiency of assigned resources. Review testing metrics, as well. Don't ignore "gut feelings," but use them selectively. When the data says one thing and your intuition says another, intensify your efforts to obtain additional fact-based evidence to reconcile the disparity.
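
To make "metrics and the trends they illuminate" concrete, here is a minimal Python sketch that estimates the trend of a metric across reporting periods with an ordinary least-squares slope. The metric (open defects per month) and its values are invented for illustration, and a simple linear fit is only one of many ways to look at a trend; it is not drawn from the references cited above.

```python
def trend_slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...).

    A positive slope means the metric is growing from one reporting
    period to the next; a negative slope means it is shrinking.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Hypothetical monthly counts of open defects reported by the contractor.
open_defects = [40, 44, 47, 52, 58]

slope = trend_slope(open_defects)
if slope > 0:
    print(f"Open defects are rising by about {slope:.1f} per month.")
else:
    print(f"Open defects are falling by about {-slope:.1f} per month.")
```

A rising count of open defects shortly before a milestone is the kind of trend that can anchor a question during the review, in place of an anecdote or a subjective assertion.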

Search for independence. Regardless of how knowledgeable organic project staff is on all phases of your acquisition and the technologies responsible for the most prominent program risks, it is advisable to call on independent SMEs for selected technical reviews. In fact, Department of Defense (DoD) guidance for the development of systems engineering plans, as well as the Defense Acquisition Guidebook (DAG), calls out the need for independent SMEs. This is excellent advice. It may also be advisable to include an SME from a large, respected technical organization on the team to provide advice in their areas of special expertise (e.g., Carnegie Mellon Software Engineering Institute [SEI] on Capability Maturity Model issues). It may be advantageous to use a qualified, senior-level MITRE technical SME to lead the team, as a way of bringing the corporation to bear. It is also advisable to include a senior manager from the prime contractor being reviewed, as long as this person is not in the direct management chain of the program leadership. This can open many doors with the prime contractor that may have seemed closed in the past. Recognize that bringing on independent SMEs for a review has a distinct cost (e.g., organic staff resources will need to bring SME members up to speed). However, judiciously done, it can be worthwhile.

References and Resources

  1. MITRE Systems Engineering (SE) Competency Model, Version 1, September 1, 2007, p. 38.
  2. Defense Acquisition University, May 2013, Defense Acquisition Guidebook, Chapter 4, "Systems Engineering" (System Engineering activities to support technical reviews), accessed September 12, 2017.
  3. Defense Acquisition University, February 12, 2015, Manual for the Operation of the Joint Capabilities Integration and Development System (JCIDS), accessed September 12, 2017.   
  4. Jones, C., April 2008, Applied Software Measurement: Global Analysis of Productivity and Quality, Third Ed., McGraw-Hill Osborne Media.
  5. Reifer, J., July 2004, Industry Software Cost, Quality and Productivity Benchmarks, The DoD Software Tech News, Vol. 7, No. 2, accessed September 12, 2017.
  6. Croxford, M., and R. Chapman, May 2005, "Correctness by Construction: A Manifesto for High-Integrity Software," Crosstalk: The Journal of Defense Software Engineering, accessed September 12, 2017.  

Additional References and Resources

DoDD 5000.01, The Defense Acquisition System.

DoDI 5000.02, January 7, 2015, Operation of the Defense Acquisition System (November 26, 2013).

Jones, C., May 2000, Software Assessments, Benchmarks, and Best Practices, Boston: Addison-Wesley Longman.

Maister, D. H., C. H. Green, and R. M. Galford, 2001, The Trusted Advisor, Touchstone Books, Simon & Schuster.

Under Secretary of Defense for Acquisition, Technology and Logistics (USD AT&L), "Better Buying Power" (BBP), accessed September 12, 2017. 

