Friday, 25 March 2016

Evaluation of Education

Evaluation is a vital component of the continuing health of organizations. If evaluations are conducted well, organizations and their people will have the satisfaction of knowing with confidence which elements are strong and where changes are needed. Evaluation is therefore a constructive pursuit. Evaluation is perhaps society's most fundamental discipline; it is an essential characteristic of the human condition; and it is the single most important and sophisticated cognitive process in the repertoire of human reasoning and logic (Osgood, Suci, & Tannenbaum, 1957). In general, we refer to objects of evaluations as evaluands. When the evaluand is a person, however, we follow Scriven's recommendation to label the person whose qualifications or performance is being evaluated as the evaluee (Scriven, 1991). Objects of evaluations may be programs, projects, policies, proposals, products, equipment, services, concepts and theories, data and other types of information, individuals, or organizations, among others.

The extended definition of evaluation has provided an expanded look at key generic criteria for evaluating programs. From the discussion, it is evident that the Joint Committee's 1994 definition of evaluation and our adaptation focused on generic evaluative criteria are deceptive in their apparent simplicity. When one takes seriously the root term value, then inevitably one must consider value perspectives of individuals, groups, and organizations, as well as information. The combining of these in efforts to reach determinations of the value of something cannot be ignored. To serve the needs of clients and other interested persons, the information supplied to support evaluative judgments should reflect the full range of appropriate values.

We now expand the definition to outline the main tasks in any program evaluation and denote the types of information to be collected. Our operational definition of evaluation states that evaluation is the systematic process of delineating, obtaining, reporting, and applying descriptive and judgmental information about some object's merit, worth, probity, feasibility, safety, significance, and/or equity. One added element in this definition concerns the generic steps in conducting an evaluation. The other new element is that evaluations should produce both descriptive and judgmental information.

Many evaluations carry a need to draw a definitive conclusion or make a definite decision on quality, safety, or some other variable. For example, funding organizations regularly have to decide which proposed projects to fund, basing their decisions on these projects' relative quality, costs, and importance compared with other possible uses of available funds (also see Coryn, Hattie, Scriven, & Hartmann, 2007; Coryn & Scriven, 2008; Scriven & Coryn, 2008). For a project already funded, the funding organization often needs to determine after a funding cycle whether the project is sufficiently good and important to continue or increase its funds. In trials, a court has to decide whether the accused is guilty or not guilty. In determinations of how to adjudicate drunk-driving charges, state or other government agencies set decision rules concerning the level of alcohol in a driver's blood that is legally acceptable. These examples are not just abstractions. They reflect true, frequent circumstances in society in which evaluations have to be definitive and decisive.

This raises a big question: how can we say that something good is good enough? How bad is intolerable? The problem of how to reach a just, defensible, clear-cut decision never has an easy solution. In a sense, most protocols for such precise evaluative determinations are arbitrary, but they are not necessarily capricious. Although many decision rules are set carefully in light of relevant research and experience or legislative processes, the rules are human constructions, and their precise requirements arguably could vary, especially over time. The arbitrariness of a cut score (for example, a score that classifies scores above it [the cut line] as good and those below it as unsatisfactory) is also apparent in the different 𝛼 (alpha) and 𝛽 (beta) levels that investigators may invoke for determining statistical significance. Typically, 𝛼 is set, by convention, at 0.05 or 0.01, but it might as easily be set at 0.06 or 0.02. In spite of the difficulties in setting and defending criterion levels, societal groups have devised workable procedures that more or less are reasonable and defensible for drawing definitive evaluative conclusions and making associated decisions.
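The point about cut scores and alpha levels can be made concrete with a minimal sketch. The cut score of 70 and the alpha of 0.05 below are illustrative, conventional choices (not values from the text); shifting either threshold changes the verdict without changing the underlying data, which is exactly the arbitrariness the passage describes.

```python
# Sketch: cut scores and alpha levels are decision rules applied to data.
# The thresholds (cut_score=70, alpha=0.05) are hypothetical conventions.

def classify(score, cut_score=70):
    """Classify a score relative to the cut line."""
    return "good" if score >= cut_score else "unsatisfactory"

def is_significant(p_value, alpha=0.05):
    """Apply an alpha threshold to a p-value."""
    return p_value < alpha

print(classify(72))                      # "good"
print(classify(68))                      # "unsatisfactory"
print(is_significant(0.04))              # True  under alpha = 0.05
print(is_significant(0.04, alpha=0.02))  # False if alpha were set at 0.02
```

Note that the same p-value of 0.04 is "significant" under one convention and not under another: the rule, not the evidence, does the deciding at the margin.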

These procedures include applying courts' rules of evidence and engaging juries of peers to reach consensus on a defendant's guilt or innocence; setting levels for determining statistical significance and statistical power; using fingerprints and DNA testing to determine identity; rating institutions or consumer products; ranking job applicants or project proposals for funding; applying cut scores to students' achievement test results; polling constituents; grading school homework assignments; contrasting students' tested performance with national norms; appropriating and allocating available funds across competing services; and charging an authority figure with deciding, or engaging an expert panel to determine, a project's future. Although none of these procedures is beyond challenge, as a group they have addressed society's need for workable, defensible, nonarbitrary decision-making tools (also see Cizek & Bunch, 2007).

When it is feasible and appropriate to set standards, criterion levels, or decision rules in advance, a general process can be followed to reach precise evaluative conclusions. The steps suggested here would be approximately as follows: (1) define the evaluand and its boundaries; (2) determine the key evaluation questions; (3) identify and define crucial criteria of goodness or acceptability; (4) determine as much as possible the rules for answering the key evaluation questions, such as cut scores and decision rubrics; (5) describe the evaluand's context, cultural circumstances, structure, operations, and outcomes; (6) take appropriate measurements related to the evaluative criteria; (7) thoughtfully examine and analyze the obtained measures and descriptive information; (8) follow a systematic, transparent, documented process to reach the needed evaluative conclusions; (9) subject the total evaluation to independent assessment; and (10) confirm or modify the evaluative conclusions.

Although this process is intended to provide rationality, rigor, fairness, balance, and transparency in reaching evaluative conclusions, it rarely is applicable to most of the program evaluations treated in this book. This is so because often one cannot precisely define beforehand the appropriate standards and evaluative criteria, plus defensible levels of soundness for each one and for all as a group. So how do evaluators function when they have to make plans, identify criteria, and interpret outcomes without the benefit of advance decisions on these matters? There is no single answer to this question. More often than not, criteria and decision rules have to be determined along the way. We suggest that it is often best to address the issues in defining criteria through an ongoing, interactive approach to evaluation design, analysis, and interpretation and, especially, by including the systematic engagement of a representative range of stakeholders in the deliberative process.
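The middle of that ten-step process, where predefined decision rules (step 4) are applied to measurements (step 6) to reach a documented conclusion (step 8), can be sketched in a few lines. All criterion names and cut scores here are hypothetical, chosen only to show the mechanics; a real evaluation would derive them through the stakeholder deliberation the passage recommends.

```python
# Sketch of steps (3)-(4) and (6)-(8): predefined criteria and cut scores
# applied to measurements to reach a transparent, documented conclusion.
# All criterion names and thresholds below are hypothetical.

criteria = {            # steps 3-4: criteria and their decision rules (cut scores)
    "effectiveness": 70,
    "safety": 90,
    "equity": 60,
}

measurements = {        # step 6: measures taken on the evaluand
    "effectiveness": 75,
    "safety": 93,
    "equity": 58,
}

def evaluate(criteria, measurements):
    """Steps 7-8: compare each measure with its cut score and record the result."""
    findings = {
        name: ("meets" if measurements[name] >= cut else "fails")
        for name, cut in criteria.items()
    }
    conclusion = ("acceptable" if all(v == "meets" for v in findings.values())
                  else "not acceptable")
    return findings, conclusion

findings, conclusion = evaluate(criteria, measurements)
print(findings)
print(conclusion)   # "not acceptable": equity falls below its cut score
```

Because every rule is declared up front, the conclusion is reproducible and open to the independent assessment called for in step 9; the evaluand fails here solely because one measure falls below its predefined cut line.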
