Program evaluation: Difference between revisions

From Citizendium
{{subpages}}
'''Program evaluation''' is the systematic collection, assessment, and dissemination of information to provide constructive feedback about a particular program, project, or policy. Information gleaned from a program evaluation can determine whether and why a program is needed, whether and how a program is being implemented correctly, whether and to what extent a program is actually making an impact, and other specifics useful to facilitate a program’s development, implementation, or improvement. These specifics are then used by the program's clients or stakeholders in making decisions about whether to continue or modify the program.
 
A form of applied research with wide application, program evaluation is most commonly used in health and human services, education, business administration, economic development, and public policy.  


==Types of program evaluation==
The broad scope of program evaluation has inspired the creation of multiple classification schemes dividing evaluations by strategy or purpose. The most basic distinction in evaluation types is that between formative and summative evaluation. Formative evaluations are conducted concurrently with the program's implementation and are intended to assess ongoing program activities and provide feedback to monitor and improve the program; summative evaluations are conducted after the program's implementation and are intended to assess the program's outcomes or impacts.
Formative and summative evaluations can be further subdivided into other evaluation types, but their classification is only important insofar as it helps evaluators clarify key questions, such as for what purposes an evaluation is being done and what kinds of information are needed for it. After all, different types of program evaluation yield different types of information. Although programs are most often evaluated to measure their effects, program evaluation may be conducted at any stage of a program’s life to assess the program’s necessity or goals, logic or theory, process or implementation, outcomes or impact, and cost-benefit ratio or cost-effectiveness.


When conducted in chronological order for a single program, these different types of evaluation are better thought of as stages in a multi-step evaluation process. First, a [[needs assessment]] indicates whether and to what extent a program is necessary. Second, a program theory, also known as a [[logic model]], details how and why a needed program’s activities will bring about the program’s outcomes. Third, the implementation of those activities is assessed in a [[process evaluation]]. Fourth, once the program activities have been implemented for a long enough time, the program’s outcomes may be evaluated to see what it has achieved—either through [[outcome evaluation]] or [[impact evaluation]]. Finally, after the outcomes (both the costs and the benefits) are known, a [[cost-benefit analysis]] or [[cost-effectiveness analysis]] may be done.
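The five stages just described form an ordered sequence. A minimal sketch encoding that ordering (the stage names and helper function are illustrative, not part of any published framework) can check whether a planned subset of evaluations respects the sequence:

```python
# The five evaluation stages, in the order described above (an informal
# encoding for illustration, not a standard taxonomy).
STAGES = (
    "needs assessment",
    "program theory / logic model",
    "process evaluation",
    "outcome or impact evaluation",
    "cost-benefit or cost-effectiveness analysis",
)

def in_order(plan):
    """Return True if the planned evaluations follow the stage sequence."""
    positions = [STAGES.index(step) for step in plan]
    return positions == sorted(positions)

print(in_order(["needs assessment", "process evaluation"]))  # True
print(in_order(["cost-benefit or cost-effectiveness analysis",
                "needs assessment"]))                        # False
```

A plan may skip stages (not every program needs every evaluation type), but running a later stage before an earlier one, such as a cost-benefit analysis before the outcomes are known, violates the ordering.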


==A framework for program evaluation==
By conducting a program evaluation in logical steps or stages, evaluators at least partially adhere to an evaluation framework. An evaluation framework is useful to "summarize and organize the essential elements of program evaluation, provide a common frame of reference for conducting evaluations, and clarify the steps in program evaluation."<ref name="CDC">[http://www.cdc.gov/eval/framework.htm Centers for Disease Control and Prevention (CDC) page detailing the purpose of an evaluation framework]</ref>


Treating a program evaluation as sequential steps is only one of two important components in following an evaluation framework; the other is abiding by a set of predefined standards when carrying out each step.


===Standards in program evaluation===
Standards for program evaluation address important concerns such as whether an evaluation yields accurate information and whether it is done in an ethical manner. Although the standards are not unified, the [[American National Standards Institute]] approved a set of standards published in 1994 by the [[Joint Committee on Standards for Educational Evaluation]] (JCSEE).<ref name="note1">Although the JCSEE is primarily concerned with educational evaluation, its set of standards may apply to ''all'' types of program evaluation.</ref> The JCSEE established [http://www.jcsee.org/program-evaluation-standards/program-evaluation-standards-statements thirty standards] and divided them into four categories:


*'''Utility''' standards, which are designed to ensure that an evaluation is useful to clients or stakeholders by providing timely, clear, and above all pertinent information.
*'''Feasibility''' standards, which are designed to ensure that an evaluation is practical, politically viable, and cost-effective.
*'''Propriety''' standards, which are designed to ensure that an evaluation is done legally and ethically.
*'''Accuracy''' standards, which are designed to ensure that information gleaned from an evaluation is carefully documented and considered for its validity and reliability.


====Measurement in program evaluation====
Among the standards proposed for accuracy are common standards of measurement. Measurement is essential to program evaluation. A government agency conducting a needs assessment of a financial assistance program for the indigent must measure the number of needy people to define the target population. A private school conducting a cost-benefit analysis of a merit-based promotion program for its teachers must measure both the costs and the benefits of the program to calculate the cost-benefit ratio.
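The cost-benefit ratio itself is straightforward arithmetic: total monetized benefits divided by total costs. A minimal sketch for the private school example, with entirely hypothetical figures and cost categories:

```python
# Hypothetical figures for a merit-based promotion program (illustrative only).
costs = {
    "salary_increases": 120_000,  # raises awarded to promoted teachers
    "administration": 15_000,     # staff time spent reviewing candidates
}
benefits = {
    "retention_savings": 90_000,   # avoided recruitment and training costs
    "improved_outcomes": 75_000,   # monetized gains in student achievement
}

total_cost = sum(costs.values())
total_benefit = sum(benefits.values())

# A ratio above 1.0 suggests the program's benefits exceed its costs.
ratio = total_benefit / total_cost
print(f"Total cost: ${total_cost:,}")        # Total cost: $135,000
print(f"Total benefit: ${total_benefit:,}")  # Total benefit: $165,000
print(f"Benefit-cost ratio: {ratio:.2f}")    # Benefit-cost ratio: 1.22
```

The hard part, as the paragraph above suggests, is not the division but the measurement: each entry in those dictionaries must itself be valid and reliable for the ratio to mean anything.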
When doing these evaluations, however, both the government agency and the private school must ensure that their measurement instruments are valid, reliable, and sensitive, or risk distorting the results of the evaluation. A measurement is valid if it measures what it is intended to measure, reliable if it produces the same results when used to repeatedly measure the same thing, and sensitive if it can accurately discern changes in a scrutinized measure.


Consider, for example, the government agency that measures the number of needy people in a given area. Is the U.S. federal government’s historical poverty measure, which is based exclusively on family cash income, a valid measure of neediness? The [[Department of Commerce]]’s Rebecca Blank denies that it is, citing both expansion in federal safety net programs (which may overstate a household’s neediness) and relative increases in necessary expenditures like out-of-pocket medical costs (which may understate a household’s neediness).<ref name="brookings">[http://www.brookings.edu/testimony/2008/0717_poverty_blank.aspx Brookings Institution testimony by Rebecca Blank]</ref> By failing to account for many potential changes to disposable income (e.g., [[Medicaid]] assistance for a disabled individual), the historical poverty measure is also insensitive to corresponding changes in neediness.
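The insensitivity Blank describes can be illustrated with a toy calculation. The threshold and figures below are hypothetical and deliberately simplified; they are not the official U.S. poverty methodology. The point is that a measure based only on cash income does not move when out-of-pocket medical costs rise, while a disposable-income measure does:

```python
# Toy illustration of measurement sensitivity (hypothetical numbers; not the
# official U.S. poverty methodology).
POVERTY_LINE = 25_000  # hypothetical threshold for a family

def poor_by_cash_income(cash_income, medical_costs, benefits_received):
    """Historical-style measure: considers cash income only."""
    return cash_income < POVERTY_LINE

def poor_by_disposable_income(cash_income, medical_costs, benefits_received):
    """Alternative measure: nets out necessary expenses, adds assistance."""
    return cash_income - medical_costs + benefits_received < POVERTY_LINE

# A family just above the line whose medical costs rise sharply while its
# cash income stays unchanged.
family = {"cash_income": 27_000, "medical_costs": 6_000,
          "benefits_received": 0}

cash_only = poor_by_cash_income(**family)        # False: income unchanged
disposable = poor_by_disposable_income(**family)  # True: 21,000 < 25,000

print(f"Cash-income measure says poor: {cash_only}")
print(f"Disposable-income measure says poor: {disposable}")
```

The cash-income measure is insensitive to the family's real change in neediness; the disposable-income measure detects it.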


==A brief history of program evaluation==
Standardized measurements and formalized procedures for program evaluation are relatively new and, in the United States at least, correspond to, and were practically necessitated by, the explosive growth of major domestic programs in the 1960s, such as John F. Kennedy's [[New Frontier]] and Lyndon B. Johnson's [[Great Society]]. In 1993, Congress passed the [[Government Performance and Results Act]] (GPRA), which is partly overseen by the [[Office of Management and Budget]] (OMB) and which mandates that all federal agencies "report annually on their achievement of performance goals, explain why any goals were not met, and summarize the findings of any program evaluations conducted during the year."<ref name="gaopdf">[http://www.gao.gov/new.items/gg00204.pdf GAO Report 2000 (PDF)]</ref> A 2004 report from the [[Government Accountability Office]] (GAO) remarked that the GPRA's requirements had "established a solid foundation of results-oriented performance planning, measurement, and reporting in the federal government," but concluded that more work remains to achieve the goal of a truly "results-oriented government."<ref name="gao">[http://www.gao.gov/products/GAO-04-38 GAO Report 2004]</ref>


==Notes==
<references />
[[Category:Suggestion Bot Tag]]

Latest revision as of 11:01, 7 October 2024

This article is developing and not approved.
