Evaluation Theories/Week 15: 4/30/14: Credible Evidence in Evaluation and Professional Issues

PSYCH 315z – Comparative Evaluation Theory

Credible Evidence

I think that there is no greater topic than this, and we’ll connect theory to practice here: the way that theorists approach evaluation might be related to how they approach quality and merit.

The other thing I want to do before we get started is some announcements:

  1. We are going to be doing a review session next week, which we’ll talk about.
  2. We’ll talk about the final and all that next week.
  3. Teaching evals for the course will be ready next week.

I want you to take these seriously: we take them seriously; we use them; we pay attention to them.

If there are aspects that you think were the worst thing, let us know that, both the good and the bad, so that over time we develop better-quality classes and better-quality instruction: instruction that meets your needs.

Those are my two announcements.

Questions? Concerns? Challenges? Issues?

Format: You can be creative with the format.

Agenda:

  1. Lecture: Credible Evidence
  2. Small Group Discussions: Henry & Greene
  3. Debates on Contemporary Practice Issues
    1. Things like value judgements; credible evidence; etc.

Q1: What counts as credible evidence?

A: Based on the readings: Time dependency; evaluation

The modern-day idea of “Credible Evidence” started with Campbell’s “The Experimenting Society” (“I’m going to look at what works, and what’s using it”): trust scientists to use the most rigorous methods, and then trust the politicians to use the findings.

Repeatability is important.

Over time, information would improve society. We would learn about what works and what doesn’t. The other key issue here, in the roots piece, is that this is when methodological rigor was beginning to be connected to quality evaluation.

Quality evaluation was dependent on the methodological rigor of the study. This is when you see the preoccupation with quantitative studies, the focus on statistics and tools, and so on.

However, moving forward in time: what would be limited by this approach?

  1. Issues with withholding treatment. (Cook and Campbell would say, “We don’t know if it works!”)

Different levels of evidence are needed to answer a stakeholder question:

The Paradigm Wars (the quantitative-qualitative debate) were basically resolved until the federal government prioritized experimental designs. Before that there was an uneasy peace, but there was at least peace between the post-positivists and the constructivists.

In the 90s we had increased competition, scrutiny, and demands for results. There was increased accountability.

In the 1990s we went from “experimenting society” to “evidence-based”, and across multiple disciplines: policy, social services, management, healthcare, and decision-making all have the words “evidence-based” put before them. We’re seeing this globally.

(J: Re: you would need to show a strong link between the measure you propose and some desired criterion in research. . .)

With the US Department of Education - you’re prioritizing the kind of information that you’re looking at. J: Did the evaluation community write any documents looking at it from the DOE’s point of view; looking at their use cases, with a logic model? From their perspective, what can they use? Have we looked at the use cases?

Post-Positivists: Cochrane Collaboration (health); Campbell Collaboration (social and education programs); IES What Works Clearinghouse (different levels of evidence); SREE (Society for Research on Educational Effectiveness), with trainings on Hierarchical Statistical Modeling.

(J: But causality outside of the academic world is dependent on stakeholders: the

Submitted a study to the What Works Clearinghouse 7 years ago: it has yet to be reviewed. (J: Transparency would be really useful!)

What’s the gold standard for the Constructivist and Pragmatist Perspective? We need to have a way of doing this that’s based on A) Information Theory, B)

Pragmatism in Action

CDC Framework; (J: the quote “Step 4: Gather Credible Evidence” - “Compiling information that stakeholders perceive as trustworthy. . . experimental or observational, qualitative or quantitative…” - it doesn’t reduce uncertainty as much as calling the “RCT” the “Gold Standard” - if we were able to incorporate both, and show how they’re not necessarily the best: what are

(J: Study: at what point does the level of confidence in a result by contingent

That is the most powerful method we’ve come across. . .but if we look at how much it reduces the entropy in our system. . .

Mathematica did the most rigorous study to date: launched in 1999, with final reports in 2003, it had elementary and middle school samples. One of the key pieces (you will be aghast!): after 4 years of doing this study, they found no effects and, especially, no effects on academic achievement. And that is what they had set out to find: “Does this program impact student achievement?” They also found out that no one was coming to the programs. . . dosage was so low that there would be no way for it to impact academic achievement!
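To make the dosage point concrete, here is a minimal sketch of how low attendance dilutes an average effect. It assumes, purely for illustration, that students who never attend get no benefit, so the average effect across everyone enrolled is the effect for an attendee scaled by the attendance rate. The function name `diluted_effect` and all of the numbers are hypothetical; none of them come from the Mathematica study.

```python
def diluted_effect(effect_if_attended: float, attendance_rate: float) -> float:
    """Average effect over everyone enrolled, assuming non-attendees get no benefit."""
    if not 0.0 <= attendance_rate <= 1.0:
        raise ValueError("attendance_rate must be between 0 and 1")
    return effect_if_attended * attendance_rate


# Hypothetical numbers for illustration only; none come from the study.
HYPOTHETICAL_EFFECT = 0.30  # effect, in standard deviations, for a student who attends

for rate in (1.0, 0.5, 0.1):
    average = diluted_effect(HYPOTHETICAL_EFFECT, rate)
    print(f"attendance {rate:.0%}: average effect {average:.2f} SD")
```

Under these assumptions, even a program that helps the students who show up can look like it has no effect when very few students show up, which is one reading of “the dosage is too low.”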

It was a 5-million-dollar study, and the result was “The dosage is too low.” What would Campbell say about evaluating programs when they might not be ready? “Evaluate them when they’re proud.” The other key thing is that they prioritized measures that were well recognized and standardized in the literature. That was inappropriate because, in 1999, the focus of the programs was on “providing a safe and supervised environment for kids.” Bush proposed cutting the funding from a billion to 600 million. Part of the critique was, “To what extent did you not think about measures that were more appropriate to what we were doing?” The after-school community went ballistic:

  1. How do we use evidence from just one study?!

Q: Was this an example of credible evidence? Facetiously: “It was an RCT, so ‘Yes’!”

We had a debate on this

J: If these methods are mapped out;

- Henry: Made an argument that understanding whether public policy works is essential in a democracy.

“when we find that any particular objects are constantly conjoined with each other.” - (Hume, An Enquiry Concerning Human Understanding)

  • “Let an object be presented to a man of ever so strong natural reason and abilities; if that object be entirely new to him, he will not be able, by the most accurate examination of its sensible qualities, to discover any of its causes or effects. Adam, though his rational faculties be supposed, at the very first, entirely perfect, could not have inferred from the fluidity and transparency of water that it would suffocate him, or from the light and warmth of fire that it would consume him. No object ever discovers, by the qualities which appear to the senses, either the causes which produced it, or the effects which will arise from it; nor can our reason, unassisted by experience, ever draw any inference concerning real existence and matter of fact.”

ODD: for; EVEN: against

You need to be an expert in the area you are doing evaluation to actually do a good evaluation.

Negative argument: Fetterman: “Evaluation can take on many different ways”

You just used a term in its own definition.

Empowerment evaluation isn’t really evaluation at all, it’s a feel-good team-building exercise.

  1. J: My restatement: “Empowerment evaluation isn’t really evaluation at all, it’s training for evaluation and evaluation capacity-building: it’s only evaluation inasmuch as it makes a judgement.”


The nature of evidence: having articulated points, rationalities that can get to your perspectives.

Q2: “Multiple Realities” not scientifically sound


  1. Reality includes everything that has existed; exists; or will exist: reality is reality; there may be multiple perceptions of reality
  2. Falsifiability is important
    1. Hypothesis; scientific only if you can disprove it.


  1. Talk around the dinner table is not a waste of time
  2. Acknowledge everyone’s understanding of that context


  1. Triangulation is a good way of getting at reality; multiple perspectives can compensate for one another, but that doesn’t mean those multiple perspectives were each correct.


Q3: Slightly inaccurate


  1. You can’t escape inaccurate information; this information is better than nothing


  1. If you look at high quality evaluations with methods that are most appropriate to the context, how are you going to pass on inaccurate information to decision-makers?
  2. (J: do you know your chances of being right vs. the other person’s chances of being right by chance? - That’s assuming you even asked the same question!)
  3. You’re not doing your ethical duty for the greater good
  4. Passing on inaccurate information goes against standards and guidelines


  1. You can’t say that every evaluation you do is going to be 100% accurate
    1. Use the most sound methods; there is always room for inaccuracy


  1. Knowing it’s inaccurate but passing it on without saying that it’s inaccurate
  2. Every evaluation might have room for inaccuracy
  3. J: You don’t know what the accuracy is of your experiment if you’re not using RCTs


  1. You need a first step to ensure evaluation itself
  2. You pass on inaccurate information so that someone else can better the procedures from then on
  3. You have to acknowledge that there are inaccuracies and there is bias


  1. If the first step is in the wrong direction it might slow the accumulation of knowledge by skewing the perspective right off the bat
  2. What’s the point of doing an evaluation if you don’t get accurate results?

  1. 5 You aren’t doing eval unless you make a value judgement
  2. Society needs a science of valuing to know that products are effective.
  3. Evidence isn’t credible if you don’t make a value judgement


You know what I love as a developmental psychologist? The growth! If I were teaching this next year, I would START with this activity!

Great job: You can see how much of a challenge it is to define “What is credible evidence?”

Credible Evidence

J: “Yes, some ways of knowing should be privileged above others . . . but only when they have evidence!” What constitutes credible evidence?

  1. “when we find that any particular objects (J: phenomena…) are constantly conjoined with each other.”
  2. “when we have a logical reason”
    1. (J: logical reasons are usually based on past “constant conjoining” that has been compressed into short-cut rules, some of which we term “recognition”)
  3. All of the research methods in the social sciences are attempts to show “constant conjoinment”. RCTs are a way to be able to do this reliably with the least amount of information possible.
  4. From here, all questions of evidence are questions of communication
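
As a rough illustration of why random assignment helps show “constant conjoinment” reliably, here is a small simulation sketch. Everything in it is a made-up assumption for illustration: the unobserved trait called `motivation`, the outcome formula, the true effect of 2.0, and the sample size. It compares a self-selected comparison (motivated people opt in) with a coin-flip assignment.

```python
import random
from statistics import mean


def simulate(n: int, randomized: bool, true_effect: float, rng: random.Random) -> float:
    """Return the difference in mean outcomes, program group minus comparison group."""
    treated, untreated = [], []
    for _ in range(n):
        motivation = rng.gauss(0, 1)  # hypothetical unobserved trait
        if randomized:
            gets_program = rng.random() < 0.5  # coin flip, unrelated to motivation
        else:
            gets_program = motivation > 0  # motivated people self-select
        # Hypothetical outcome: motivation helps whether or not you get the program.
        outcome = 5 + 3 * motivation + (true_effect if gets_program else 0) + rng.gauss(0, 1)
        (treated if gets_program else untreated).append(outcome)
    return mean(treated) - mean(untreated)


rng = random.Random(0)  # fixed seed so the run is repeatable
TRUE_EFFECT = 2.0
observational = simulate(20_000, randomized=False, true_effect=TRUE_EFFECT, rng=rng)
rct = simulate(20_000, randomized=True, true_effect=TRUE_EFFECT, rng=rng)
print(f"self-selected estimate: {observational:.2f}")
print(f"randomized estimate:    {rct:.2f} (true effect {TRUE_EFFECT})")
```

In this toy setup the self-selected comparison mixes the program’s effect with the effect of motivation, while the randomized comparison lands close to the true effect without needing to measure motivation at all. That is one way to read the claim that RCTs do this “with the least amount of information possible.”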