This article is about concerns I have with the traditional school structure and assessment.
Today, many more educators are using data than ever before to improve student achievement. As a trend, it’s one of the best ones in years. With greater accountability and opportunities to reach failing students, educators are flocking towards a better understanding of how to use data. It’s used primarily for instructional analysis, but also a host of other needs. The only way to get your school where you want to be is by using more accurate information. Whether it’s in the military, engineering, the post office or in front of the water cooler, accurate information is critical. I’m going to introduce you to a different way to gather data and a different way to use the data.
Brain based education is a different way of thinking about education. It says, “What do we know about the brain, and how might we do education differently?”
The current climate, which ties together high-stakes testing and standards-based funding, means the words “research” or “scientifically-based teaching” are now as common as attendance, discipline and suspensions. This chapter will illuminate a different, complementary basis for data collection. In short, it validates the qualitative sources of data as much as the quantitative. It adds the time component whenever possible. It is, I believe, a more whole and balanced way to understand the data about our students. Part of the reason for some misconceptions about the value of quantitative research comes from the three false paradigms that have each shaped our understanding of school data.
Where It All Went Wrong
The early paradigm was the 19th century Prussian “factory model” (adapted for schools), which measured input and output efficiency. This model focused on students as “products.” This meant they wanted to measure what students achieved numerically more than who they became. The second model is the “business valuation” model. This latter half of the 20th century model focuses strongly on short-term (usually quarterly) profits. In schools, this means there’s attention (some say obsession) on this year’s profit and loss (the test scores). That’s a very different focus than a longitudinal model that measures the long-term effects including cultural, economic and environmental sustainability. And finally, to measure all these results more precisely, the scientifically based researchers draw upon their understandings and inferences from a third outdated paradigm, the Newtonian model of science. This measurement model says, “study and measure particles and matter.” It emphasized that the answer is in “the parts” and “never mind what you can’t see or touch.” It completely ignored what we now know to be fundamental: energy, waves and non-linear motion govern all of physics.
The brain of Newton was a genius one—in his time. But the Newtonian basis for science has been outdated for over 75 years. Quantum physicists have proven beyond doubt (see appendix notes for more detailed discussion on this topic) that it is energy and other non-visible forces as much or more than matter, which originates, characterizes and orchestrates life in the universe. Most research textbooks in chemistry, biology, psychology, statistics and education still do not reflect this new understanding from the quantum sciences and as a result, the information gathering methodology remains essentially bankrupt or one-sided at best. As an example, would you say that learning to learn skills or love of learning is valuable? Most educators would. Yet, how many state or national assessments measure “drive” or “love of learning?” None! In fact, the prevailing research dogma is that they are either not measurable (false) or not worth measuring (false). The dogma’s wrong on both counts.
Let’s recap for a moment. The preceding three paradigms form the philosophical and structural basis for over 90% of schools in the Western world. Imagine that: the current platform, the underlying base of assumptions on which most schools depend to operate, has been outdated or debunked for decades. Why is this relevant? If you’re driving around Detroit, Michigan looking for an address, but your map is of Albuquerque, New Mexico, you’ll NEVER, EVER get there. We will never get it “right” in education until we start with appropriate “maps” of the territory. The “maps” on which we have based much of our understandings of how to work effectively with students are bankrupt. How do we know that? It’s simple. If your school is easy to run, enjoyable to work in and it’s a whopping success (by all measures), you’re using the right “maps” for how to run a great school. If it’s hard, and you’re constantly struggling, stop reading and using the wrong directions and wrong “maps”; they’re not working. When things are done in alignment with the way the people, constraints and systems actually work, things will go smoothly. Otherwise, all your efforts are ill-advised. Until what you’re doing “matches up with reality” you’ll always be struggling, working way too hard or looking for someone or something to blame.
What would you suggest is a better path?
The Resulting Dogma
Education has always been torn between the approach characterized 100 years ago by Horace Mann and that characterized by John Dewey. The result is that we have a wide variety of approaches to education and an enormous variety in schools. But that’s changing. What’s happening, for the first time in educational history, is that the Federal government is trying to describe and mandate what is the “right way” to educate. It’s being done not by mandating the curriculum (that’s usually left to state agencies or local school boards), but by dictating what research to use to guide educational decision-making about teaching practices. That, in turn, influences what programs will or will not get funded. It’s a bit like the Federal government saying to a private school, “You can admit only Catholics if you want, but we just won’t make any grant money available for you.”
Another part of this action is captured in essence by the bold passage of the No Child Left Behind Act and the reorganization of the Office of Educational Research and Improvement as the Institute of Education Sciences. The Institute has essentially defined and pushed hard for a careful definition of “Scientifically-Based Research” (SBR) according to several investigators (Feuer et al. 2002). The way SBR is defined systematically excludes many approaches to teaching and carefully includes other, more preferred ones. You may think that’s a good idea until you remember that the definition of dogma is, “A point of view or tenet put forth as authoritative without adequate grounds” (Webster’s On-Line Dictionary).
Here is a paraphrase of the dogma put out by the government which describes what “acceptable educational research” should be like in “Identifying And Implementing Educational Practices Supported By Rigorous Evidence: A User Friendly Guide”:
“Well-designed and implemented randomized controlled trials are considered the “gold standard” for evaluating an intervention’s effectiveness, in fields such as medicine, welfare and employment policy, and psychology.” (p. 8, US Govt. Report 2003)
All of the criteria above may sound quite harmless and maybe even good to you. But educators who don’t do critical thinking can easily fall prey to the dogma these criteria represent. These qualifiers represent only one way of looking at educational data, not the only way (Ginsburg and Rhett 2003). This viewpoint is primarily applicable to quantitative research, not qualitative research. The last time I checked, schools were about people, not machine parts. This does not suggest that there’s no place for quantitative data. It does invite us to consider that a complementary approach (known as mixed methods), which considers BOTH types of research, should be used. I could summarize the current educational research dogma in the following hypothetical paragraph that represents the current “SBR” craze.
“We are looking at an institution called school. Schools have certain basic factors in common such as buildings, staff and students. Yes, we find enormous variations in the languages used, the length of the school day, the amount of technology used, the background of the students, the amount of money available, the culture and the motivations of the teachers. We also find differences in administration, curriculum, the number of hours of instruction, assessments, the food served and relationships with parents. But since our culture’s agreed-upon goal is to improve the school’s standardized test scores, we need proven test-score raising strategies.
So, let’s look at the studies already done on schools. Of course, with so many variables, we’ll just have to get some numbers to work with. Let’s first isolate all those variables, make some statistical assumptions, eliminate those fringe statistics (they’re too random), average the data, use multiple regressions, and come up with a graph or chart that ranks all the ‘proven’ strategies educators should be using. Finally, we can sell the lists to the educators, much like David Letterman’s Top Ten Lists.”
Now if you believe in the assumptions and conclusions in the paragraph above, then you may be pretty satisfied with what you already know about data and student achievement. After all, many traditional educational researchers have clung to the kind of process and conclusions described above for decades. Remember the earlier definition of “dogma”? Their attitude is, “If you can’t prove it with statistics, it’s not valid.” Well, I have news. That paradigm may be prevailing, but it’s an outdated educational dogma and millions are NOT buying it.
I do understand statistics and there are multiple, alternative ways of valuing, selecting, sourcing, organizing, understanding, manipulating, interpreting and drawing conclusions from the same information. I have a Ph.D. and used statistics in my own research. But there are a considerable number of false assumptions made by quantitative researchers that are either highly debatable or simply dead wrong. They include the beliefs that information is context-free, value free and that the researcher is an objective source.
Here’s one example.
Several researchers have shown that it’s not the test itself that is producing all of the data. It is also the surrounding conditions of the test. For example, if the student is African-American and the test giver is a racist Anglo, that typically influences the outcome. Why? The new understanding of “stereotype threat” tells us that if we feel anxious about the possibility that our performance may confirm a negative stereotype, we underperform (Steele and Aronson 1995). In one simple experiment, changing only the conditions of the test (the subjects were told either that the test measured intelligence or that it merely examined the psychology of how we problem-solve) changed the scores dramatically. The Black students solved, on average, twice as many problems under the more relaxed condition as they did under the stereotype threat condition. This concept of threat is critical because it seems to influence many possible conditions (Aronson and Steele 2005). Those who are most negatively influenced by stereotype threat are those who care the most about what others think. This is a classic case showing that test scores are not a fixed “holy grail” and that what we call intelligence is much more variable than many believe.
This simple example shows the role of environments on our learning. That’s why savvy teachers influence the environment to raise test scores.
Poor school outcomes sometimes, but not always, reflect on the teachers. As an example, should poor overall health outcomes indict the efforts of physicians in Third World countries? Many work in terrible environments, with cultures they have no influence over and often use substandard equipment and supplies. Quantitatively, the data might look bad. But for those patients who survived because of the kindness and aid, the data looks pretty good. Likewise, objective evidence of below-grade or below-standard mean performance of a group of students should not necessarily indict their teachers. On standardized tests, comparing averages or other indicators of overall performance from tests across classrooms, schools, or school districts may ignore the resources, student profiles and support provided to a school, school district, or individual professional.
Such a comparison also ignores the long-term, sometimes “difference-making” results of teachers who “plant a seed.” What if ten years from now, some teachers have 25% better graduation rates than others? Do we measure for that? Data gathering requires many types of sources. Dogma flourishes when only one belief or paradigm is considered right. Dogma is what is being “sold” to educators about “scientific research” these days. Without intentionally crashing your whole world of assumptions on the unassailable nature of statistics, I’ll just briefly mention some of the more serious, “red-flag” concerns and flaws in the current dogma of best-selling educational research.
Using Critical Thinking to Evaluate Research
In the mainstream journal of the science community, Science, one recent article stood out. It pleaded for all of us to engage in more “scientific teaching” to improve learning (Handelsman et al. 2004). I couldn’t agree more. But part of the process is to understand what real scientists should do. They should learn to question any and every assumption on which their data, their model and their eventual conclusions are based. Scientific thinking is not being on “the bandwagon” at every turn. After all, each assumption you make carries with it a whole set of values, beliefs and corresponding decisions that are made. And each of those establishes a whole new domain for statistical exploration and a resulting vector of “possible truths.” School testing should be based on an aggregate of 10-20 factors, over a year’s time and also be longitudinal.
Here is a summary of the type of dogma that’s currently influencing educational policy:
|Dogma: “You can evaluate schools based on money spent per increment of test score increased.”|
|Dogma: “Effect sizes ought to be the best way of understanding the data.”|
|Dogma: “Strategies by themselves can be evaluated and considered effective.”|
|Dogma: “Fortunately, the research needed is already done, so we can just draw from it.” (This is otherwise known as the “Always look backward dictum.”)|
|Dogma: “To understand the whole, you merely need to understand the parts.”|
|Dogma: “Our collective goal is to score higher on standardized achievement tests.”|
|Dogma: “The basic research methods used by most educational researchers are all standard scientifically accepted practices. After all, it’s science.”|
|Dogma: “We can evaluate what’s being learned effectively because we know what curriculum should be offered.”|
|Dogma: “We have a pretty good idea of what works to make a school successful.”|
|Dogma: “All learning “factors” are created the same.”|
You might have guessed that I disagree with the assumptions above. For the next few paragraphs, we’re going to ask you to suspend what you already “know” and try something different. Start fresh, with your mind unfettered by the dogma of hardened answers. Begin by questioning the things that are not typically questioned at all, or those that are simply not challenged seriously enough. Here’s where your critical thinking might lead you.
Dogma: “You can evaluate schools based on money spent per increment of test score increased.”
Reality: You can come up with any formula statistically, but the validity of the result may be near zero. There are far too many other variables to the equation of money per student per test score increase. For example, this equation leaves out variables such as physical condition of the buildings, massive changes in the socioeconomic status of kids, teacher control over the environment, skill, enthusiasm or experience level of teachers, building air quality, prevailing health care access of students, match of curriculum to students, new technology, love of learning, participation in after school programs, drug abuse, classroom lighting, changes in pop culture, parent interest, support for special needs populations, etc.
You can’t functionally compare dollars spent per student from ten years ago to today’s dollars. It was a different world ten years ago. More students have depression, are anxious, more use the Internet, use cell phones and many even eat differently. Compared to ten years ago, more students are tested, more have disabilities and even the teachers are different. In short, schools are not factories where the tool design, parts measure and output are to be efficient first. They are organizations, which influence character, achievement, socialization, learning and life skills. They must be effective first, efficient second, not the other way around. On top of all this, the assumption that the test score is a valid predictor of life success is not proven.
Dogma: “Effect sizes are a critical way of understanding the data.”
Reality: Any basic statistics course will tell you that not all effect sizes are equal; that’s easy to determine. But some factors only matter when others are done first. In isolation, they may be useless. The dogma assumes that a variable-oriented approach is superior to a process-oriented or context-driven approach. This point of view is highly debatable. The variable approach uses effect sizes. But effect sizes ignore the common sense saying, “The straw that broke the camel’s back.” You could have an effect size of 0.15, yet it might be far more important than another factor with an effect size of 0.95 when it co-occurs. Effect sizes are irrelevant when a strategy is a “necessary condition” for learning. If it’s needed, it must be included.
As an example, it may be that for 20% of your student population that struggles with reading, using more movement (Reynolds et al. 2003) and frequent energizers (Sutoo and Akiyama 2003) as teaching strategies are precisely what helps some students focus and succeed. Why? The value of those resources is, to certain populations, the difference between being attentive, engaged and learning and, well, NOT learning at all. In short, some variables are critical only when others are implemented first. Other strategies may only be effective on 20% of the population, but if you go by effect sizes, you’d exclude them. That, effectively, would lower student achievement overall. The dogma denies the possibility of multiple causation in single individual cases. Yet we all know this occurs.
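To make the point about effect sizes concrete, here is a minimal sketch using invented numbers (they are assumptions for illustration, not data from any study cited here). It supposes 100 students, 20 of whom struggle with reading, and a hypothetical strategy that raises only the strugglers’ scores by 15 points. Cohen’s d, one common effect-size measure, is computed for all students and then for the struggling group alone.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treated, control):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(treated), len(control)
    s1, s2 = stdev(treated), stdev(control)
    pooled_sd = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical scores, invented for illustration only.
typical_without = [60, 65, 70, 75, 80] * 16      # 80 typical readers
struggling_without = [30, 35, 40, 45, 50] * 4    # 20 struggling readers

# Assumption: the strategy adds 15 points for strugglers and nothing for the rest.
typical_with = list(typical_without)
struggling_with = [score + 15 for score in struggling_without]

d_whole_school = cohens_d(typical_with + struggling_with,
                          typical_without + struggling_without)
d_strugglers = cohens_d(struggling_with, struggling_without)

print(f"effect size, all students:       {d_whole_school:.2f}")
print(f"effect size, struggling readers: {d_strugglers:.2f}")
```

Under these assumptions the school-wide figure comes out small while the figure for the struggling readers is large, which is the kind of pattern a single ranked list of effect sizes would hide.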
Dogma: “Strategies by themselves can be evaluated and considered effective.”
Reality: There is always more than one factor present at any given time. When there’s the use of visual mapping tools, there is also either cooperative learning or no cooperative learning. Each variable produces a new result, which is now mediated by the social conditions. This is the nature of synergy. Combine nitrogen and hydrogen and you get ammonia. Mix two odorless, harmless elements and you get a smelly, toxic compound. That’s synergy. You’ve heard of sports teams that have the talent to win, but just don’t “click.” That’s an example of no synergy. Similarly, you can research the strength of every factor in isolation, but that’s not the final effect. It’s likely that some factors are important if and only if other factors are already in place. It’s also likely that it is the combination of certain factors that produces an effect greater than the sum of the parts.
This means it’s not very accurate to claim this factor is better than that one unless you could state the other variables present (not likely). Fifty teachers will use fifty different combinations of factors when using a strategy. How on earth can one claim a strategy works unless every other variable is the same? What’s present may not be a “silent factor.” It may, in fact, be another single variable that makes it all work! That means without certainty over all the factors used, all we have are rough probabilities. This suggests that you may want to remember that factors would rarely have a singular effect and more likely they may work in synergy with others. The central dogma of quantitative SBR says, “Causation must be regular to be considered proof.” This ignores the reality of complementary co-factors, environmental and contextual influences and the wide diversity of populations.
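The idea that a factor’s effect depends on what else is present can be sketched with a simple two-by-two comparison. The class averages below are invented for illustration (an assumption, not research findings); the point is only the arithmetic of how one strategy’s apparent effect changes depending on the other.

```python
# Hypothetical class averages, invented for illustration only.
# Keys are (uses_visual_mapping, uses_cooperative_learning).
mean_score = {
    (False, False): 60,
    (True, False): 62,
    (False, True): 63,
    (True, True): 75,
}

# Effect of visual mapping when cooperative learning is absent vs. present.
mapping_effect_alone = mean_score[(True, False)] - mean_score[(False, False)]
mapping_effect_with_coop = mean_score[(True, True)] - mean_score[(False, True)]

# The interaction is how much the effect changes depending on the other factor.
# Zero would mean the two strategies simply add up.
interaction = mapping_effect_with_coop - mapping_effect_alone

print("mapping alone:", mapping_effect_alone)
print("mapping with cooperative learning:", mapping_effect_with_coop)
print("interaction:", interaction)
```

With these made-up averages, a study that looked at visual mapping in classrooms without cooperative learning would report a small gain, while the same strategy in a cooperative classroom would look far stronger. Neither number alone describes “the” effect of the strategy.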
Dogma: “Fortunately, the research needed is already done, so we can just draw from it.” This is otherwise known as the “Always look backward dictum.”
Reality: In spite of the tomes of available research, many important studies have still not been done. Let’s go back in time just one generation. What journal articles or studies would you find on the learner’s perception of safety, problem-based learning, emotional intelligence, or cooperative learning? The answer is, not much. There was just one abstract on emotional intelligence in 1966 and not one study done on safety in learning, cooperative learning or problem-based learning before 1974. Could you “prove” their effectiveness before that time? No. Did that mean teachers who were using those strategies were not using “research-based” strategies? Yes, but were they wrong? This assumption says, “We should do what’s already been done.” That’s the message of the dogma! Unfortunately, it may discourage many teachers from using highly effective and classroom-tested ideas.
As an example, how much do you know about the relationship between various emotional states and a particular learning strategy? Affect (emotional state) is an example of a make-or-break principle. Unless affect is neutral or positive, performance drops or stops (Hancock 2001). If someone in your family is overseas at war, injured, sick, about to be evicted, short of money, doing drugs, pregnant, in trouble with the law, or violent, you’ll likely get very stressed. Stress turns out to be a significant co-factor in nearly every type of research on cognition and behavior, yet it’s not addressed in the quantitative research paradigm. In fact, too much stress could even lower your intelligence (Koenen et al. 2003). You can’t ignore these qualitative factors; their effect size is so great that it nearly invalidates all others.
Dogma: “To understand the whole, you merely need to understand the parts.”
Truth: In any complex system, an understanding of the parts does not provide an honest understanding of the whole. Context is a fundamental feature of all learning. As an example, teachers may notice a child’s feet up on a desk and a grumpy attitude. Those are certainly two parts (of the whole). But if the child’s being abused, lacks basic nutrition and gets no emotional support, lower test scores are the least of his or her problems. Maybe he’s upset with the kid sitting near him, too. Hunger, affect, distress, poverty, racism, environments, violence and even brain disorders have a huge influence on students and staff. Those are “big picture” items that create significant statistical difficulties in analysis. Yet many of those are “make or break” factors in how and whether a child even learns.
The old “parts in isolation” approach is considered a gold standard for the dogma. But what works in one environment (classroom) may be an absolute disaster in another environment (the real world). The traditional quantitative researcher thinks of context (the complex circumstances of learning: physical, social, environmental, mental, emotional, etc.) as statistical noise. But that notion is incomplete and often inaccurate (Ceci and Roazzi 1994). Studies in many countries, most notably those done with the adolescent street vendors in Brazil, show that the difference between the evidence of learning in context and being tested out of context can be staggering (Carraher et al. 1985). In these studies, 98% of math problems were solved correctly in the street context by the adolescent vendors. But in the decontextualized school-like Formal Test, the average score was 37%. This staggering difference suggests that the kind of high-stakes testing we use may not be capturing what students really know. Brain based education tells us that we may want to be thinking about education in different ways.
You simply can’t disembody students, learning or schools (every piece of the system influences every other part of the system). You can’t “rate” the value of the parts accurately. You can’t separate how students feel about their teachers from how well they do in the class (affect is just as important to measure as achievement scores). You can’t justify a policy just because there are studies to support it (what was studied may be symptomatic, not causal). You can’t offer a strategy without asking whether it’s the strategy or the teacher who uses it (how much variance is there in teacher skill?). Strategies are only useful when there are underlying principles that drive their effectiveness.
Dogma: “Our collective goal is to score higher on standardized achievement tests.”
Truth: Critical thinkers will ask questions such as “Are the standardized tests valid? Do high scores on them transfer to the real world? If one scores high on them, what does it mean? Are there any downsides to the tests?” Studies suggest that the standardized tests have little correlation with other tests or real world success. One team (Koretz et al. 1991) showed that performance on a high-stakes exam did not generalize to other tests for which students had not been specifically prepared.
Other researchers also showed that high stakes local testing rarely even transfers to national standardized testing (Amrein and Berliner 2002). This challenges the notion that they’re even relevant to learning. In fact, some have found that high stakes testing actually creates the opposite of the intended effect—lower achievement (McNeil 2000). Linda McNeil claims, “Focusing exclusively on measurements and accountability may have [had] precisely the opposite of [their] intended outcomes” (p. 93).
Many savvy educators do not and will not ever buy into the standardized testing paradigm. They’ll find a balance between what they came into the profession for (love of learning and helping students) and doing just enough to keep their job (decent test scores). The tests don’t measure many things that teachers truly believe in (e.g., love of learning, social skills, life skills). Just as importantly, in order to focus on raising most test scores, teachers are forced to ignore many other things that may be even more important. So why do we persist with them? Rick Stiggins, who has studied assessments for decades, says that our current assessments actually prevent us from improving our schools. He advocates continuous, student-involved assessments that give educators real-time tools to act on, rather than sorting students like apples (Stiggins, 2004).
Dogma: “The basic research methods used by most educational researchers are all standard scientifically accepted practices. And they must be right. After all, it’s all science.”
Truth: Absolutely false. Most teaching strategies offered are highly sensitive to curriculum, culture, teacher ability, grade level or special needs populations. That means there is considerable variability in the effectiveness of these suggestions. Certain strategies are curriculum sensitive (what works in an arts class may not work in a math class). Many teachers have found certain strategies to be culture sensitive. For example, in Native American and Hawaiian populations, storytelling is an integral part of the learning process. For them, sitting in a circle makes more sense than in teams or rows. Some strategies are more teacher ability sensitive (they simply require better teacher skills). As an example, very few teachers use the range or depth of cooperative teaching strategies taught and emphasized by Laurie and Spencer Kagan (particularly the use of “structures”). If you are contrasting an ineffective use of cooperative learning against other more easily mastered strategies, it will lead to meaningless data.
There are other problems with listing non-universal factors. Certain rankings are more grade-level sensitive. What works for a second-grade teacher does not always work for a senior high teacher. Not every teacher can make the necessary modifications and in some cases, the strategy simply won’t transfer. And finally, many strategies are student sensitive. They may be less effective with special needs populations than with more typical students. For students with dyslexia, reading is sluggish and almost painful. Yet more typical students ought to be reading a great deal on a consistent basis. These school realities suggest that there are many more nuances to teaching strategies than we see at first glance.
Dogma: “We can evaluate what’s being learned because we know what curriculum should be offered. What worked in the past can be prescriptive for the future. Policymakers and curriculum makers can successfully predict what role schools should take and make effective curriculum decisions.”
Truth: We actually don’t have a real good idea of what will be needed in the next 10 or 20 years. In December of the year 2000, the hottest fields for high school career paths were in technology, business and travel-related industries. A year later, they were security-related industries, health care and housing-related careers.
Why the change? One plausible explanation is that after 9/11/2001, more people stayed at home, were more stressed, lost jobs, and got overweight and sick more often. They refinanced their homes and remodeled them, too. Right now, much of what we learn in school becomes largely irrelevant when placed in the context of today’s rapidly changing, highly unstable global economy. It didn’t use to be like this; things used to be more, not less, predictable.
The rate of world changes is actually accelerating. That’s why test scores are less relevant than the ability to reinvent oneself, change jobs or careers. Unless tests are measuring what we really care about, an even larger rift will develop between teachers who care about students in real life and those who work to get the higher test scores. That means we all need coping skills, stronger social skills, teamwork, love of learning, relationship skills, critical thinking skills, creativity and ability to learn skills. Are you willing to sacrifice those skills for a high stakes test score measuring artificial merits that have yet to be proven in the real world? And finally, you can’t offer curriculum based solely on the past. Most of the research from the past is what got us into the mess we’re in now. We have many political, economic and educational leaders who were educated in the old, outdated paradigm that’s no longer working. That’s why it is called dogma.
Dogma: “We have a pretty good idea of what works to make a school successful.”
Truth: That’s false. The stated goal of most schools is to prepare students for life. We actually don’t know for sure what makes for successful human beings, but we can guess that it’s probably not memorizing the quadratic formula, knowing state capitals or diagramming a sentence. Naturally, no offense is meant to those who teach those subjects. I happen to love science, math and literature, but I’m not sure how much time there is in school for all those. Realistically, we only have probabilities about which factors may raise standardized test scores in selected populations under certain circumstances. That’s very different than having some certainty about what makes for successful human beings in life.
If you want to call a school effective, don’t tell me how it rates on state or national tests. I’m pretty clear that the criteria used are not very good. How do I know? As an employer, I’ve interviewed high school graduates for jobs, looked at the drug use incidence, reviewed youth crime statistics, analyzed the breakdowns in social structure, and I don’t like what I’m seeing very much. I’d like to see longitudinal studies to convince me that the high stakes testing we are pushing on educators is actually the best thing for our future generations. You and I know these tests will never be done—we just have to await the scary results of this grand experiment on another vulnerable generation.
Instead of flaunting test scores with percentile rankings, let’s ask if the school was a true success. Tell me how a school’s graduates do in real life. Tell me if the graduates could find a meaningful job or fit into the social fabric, pay taxes, if they participate in our democracy and follow the laws. Let’s ask if the school’s graduates contributed to peaceful, moral and ethical decision-making (no more Unabombers, please). Tell me if the graduates make contributions to our society, are law biding (no more Enron deceptions, please) and ask if they protect our environment (where I live, at least 20% of the beaches are closed on ANY given day of the year because of environmental violations).
Tell me if the graduates can raise their children without violence and have reasonable loving relationships. Tell me if the school was good for scientific inquiry. Can you assure me that students can express themselves emotionally? Can they take part in a conversation, understand the arts, or do they have an avocation? Can the graduates read complex material, apply problem-solving skills to real- world problems like urban sprawl, pollution, crime, ineffective family policies, bankrupt political systems and education? Or do they invent computer viruses and sell munitions to third world terrorists?
In short, if you can’t tell what your school’s graduates do in real life, don’t tell me a school is “effective” when you really have no clue if it’s effective or not. I’m tired of hearing what effective teachers do and what effective schools do when it really means, “How do I raise my standardized test scores?” We need two very critical things. One, we need better criteria for measuring the qualities of students in school than the ones we’re using now. And two, we need to study what we are doing to students over the long haul. Anyone who screams “too many variables” has a valid point. But what about all the variables in parenting? We want to hold parents accountable, but schools often have kids for more waking hours per year than parents do. Yet we don’t even have decent longitudinal data on how primary, middle school or high school graduates fare in life compared to their peers in other schools. Yes, it would be expensive to do, but it might change what we think are “effective schools.”
Dogma: “All learning ‘factors’ are created the same.”
Reality: Factors can be organized, analyzed and understood on multiple levels. Many factors are helpful to the learning process, but may not be absolutely necessary to make it happen. An example is the implementation of the arts. I’m a huge fan of the arts, and arts participation does correlate with higher test scores (Fiske 1999), but it’s not a “make or break” factor, at least not for ALL learners. Other factors may be a “necessary” but not a “sufficient” condition for learning. An example is the level of student brain maturation. That factor is necessary for learning, but it is, of course, not sufficient. The point here is simple. Educators have a right to know what’s necessary. Do educators absolutely have to use that “factor” or is it simply a good idea? If you don’t make the distinction between the two, you can’t decide what to do. You just don’t have time to try out every strategy offered.
Truths in Educational Research
You’ve started to understand that the research typically accepted as the “gospel” is much closer to dogma than the truth. In case I have not made myself clear by now, let me summarize the problems with some (not all) of the quantitative educational research.
1. It doesn’t match up with the real world. It is scientifically outdated because the three primary paradigms that traditional research is based upon are false or outdated.
2. Effect sizes are insufficient for understanding the data.
3. Strategies cannot be evaluated by themselves without context.
4. Much of the important research remains undone; there is no scientific proof for many things we believe to be true.
5. Understanding the parts does not help us understand the whole.
6. Our collective goal should not be only to score higher on standardized achievement tests.
7. The research methods used by educational researchers are not the only way to identify, test, evaluate, collect and analyze data.
8. We simply don’t know yet what curriculum should be offered or how to test for it, but we do know that some of it’s not working now.
9. We do not have a good idea of what works to make a school successful because we have not done “gold standard” studies (random, blind, longitudinal). In fact, nobody has done that. Do not believe the so-called research 100% until those studies are done. What we have so far are very rough approximations.
10. Learning “factors” are not created equal; some are not necessary, but helpful. Others are necessary.
The exclusion of qualitative research from scientifically based research (SBR) agendas is an arrogant and contemptuous treatment of the last 100 years of research on the value of emotions, context, the human experience, quantum science and complex systems thinking. When only one view is being presented, it evokes some strong reactions from those who have a different understanding. Both the National Research Council (NRC) and SBR typify the current educational dogma. Joseph Maxwell is an Associate Professor in the Graduate School of Education at George Mason University and specializes in educational research. He says,
“I have argued that the NRC (2002) report (and SBR generally) assumes a regularity, variance-oriented understanding of causation, and ignores an alternative, realist understanding. This leads the authors to ignore or deny the possibility of identifying causality in particular cases, the importance of context as integral to causal processes, and the role of meaning and interpretive understanding in causal explanation—all issues for which qualitative research offers particular strengths. In addition, the NRC report denies that there are important differences in epistemology and logic between qualitative and quantitative research. In combination, these assumptions lead to a hierarchical ordering of research methods in the report, treating qualitative methods as merely descriptive and supplementary to “causal,” quantitative methods, largely ignoring the unique contributions that qualitative methods can make to causal investigation. I believe that this is one important reason why there has been such a negative reaction to the report, and to SBR more generally, by many educational researchers.” (Maxwell 2004, pg.6)
But Dr. Maxwell is far from the lone dissenting voice. Wayne Wright at Arizona State University has shown how high stakes testing has actually been harmful to student learning (Wright 2002). Others, including the American Evaluation Association and the American Educational Research Association (AERA), advise caution (Berliner 2002) in using singular test scores. AERA’s policy statement says:
“…If high-stakes testing programs are implemented in circumstances where educational resources are inadequate or where tests lack sufficient reliability and validity for their intended purposes, there is potential for serious harm. Policy makers and the public may be misled by spurious test score increases unrelated to any fundamental educational improvement; students may be placed at increased risk of educational failure and dropping out; teachers may be blamed or punished for inequitable resources over which they have no control; and curriculum and instruction may be severely distorted if high test scores per se, rather than learning, become the overriding goal of classroom instruction.” (AERA 2000)
As a researcher myself, I couldn’t agree more. Much of what AERA warned about above (lack of funding for needed resources, misleading information, coercion, increased dropouts, etc.) has already happened (McNeil 2000). Why? The use of exclusively quantitative data in driving school performance is simply inadequate. This is in no way meant to disparage all the quantitative researchers who are highly committed to improving education. While their methods are not the only research standard, they provide a valuable piece of the larger puzzle. Here are some examples of what happens when just one achievement score (in California, it’s the Stanford-9) is used for school measurement. To boost scores at one school in the Long Beach area, “Test your best” assemblies were held, rewards (a field trip) were offered to classes with 100% attendance on test days, and for months prior to testing, teachers were required (this mandate came from the district level) to dedicate 30-60 minutes a day to test preparation. Linda, a veteran kindergarten teacher, says,
“Everything that has to do with the test has been given such a high priority, that there is no priority any more but that…The bottom line question comes down to, “Well, what’s going to help them do better on the test?” And if it’s not going to help them do better on the test, well, we don’t have time for that right now.” (Wright, 2002, online)
I’m wondering if this is what we all had in mind when we originally went into teaching years ago. I’m also wondering if anyone realizes that this is not just an event or policy; it’s a massive social and economic experimental initiative with zero data to support its long-term validity.
The Debate is Two Sides of the Same Coin
We researchers are, in a way, like the six blind men (or women) and the elephant. We each gather very different information from various parts of the elephant and each can become very convinced that he or she understands the big picture. But it’s clear from the published research that we are all in different domains that need integration. In fact, my message is to both the qualitative data gatherers (like me) and to the quantitative researchers. Both groups are guilty of dismissing the contributions of the other. We truly need both sides to appreciate each other’s point of view and to make research more relevant. There are many things done well by both types of researchers and I do not doubt their sincerity or impugn their motives. If we want to have standards that everyone buys into, both sides will have to commit to listening to the other’s view. Stephen Covey’s work has often reminded us to seek first to understand. The best research-driven practices are those built on a complementary approach. My point is not that I have the only or the right way, but that a more balanced point of view needs to be heard.
The blending of the two is known as mixed methods research. However, there must be a time component added to the “gold standard” for research, too. Madhabi Chatterji at Columbia’s Teachers College argues for Extended-term Mixed-Method (ETMM) evaluations of students (Chatterji 2005). This model would target the life span of an intervention as well as other smarter design features. You and I know that’s a pretty high standard for any research. But as a goal, it’s far better than what we have now.
I’ve tried to bring this kind of balance into this discussion. While not every study used is my ideal (preferably all would be ETMM), the same thoughtful consideration went into each of the resources used and conclusions drawn. I hope you’ll join me in moving towards a better paradigm from which to understand and evaluate education. The paradigm I’m pushing is brain-based education. It is a different way of thinking about education. It says, “What do we know about the brain, and how might we do education differently?”
References
AERA. (July 2000). AERA Position Statement Concerning High-Stakes Testing in PreK-12 Education. Retrieved October 23, 2004, from http://www.aera.net/about/policy/stakes.htm
Amrein, A. L. & Berliner, D. C. (2002, March 28). High-stakes testing, uncertainty, and student learning. Education Policy Analysis Archives, 10(18). Retrieved March 30, 2002 from http://epaa.asu.edu/epaa/v10n18/
Aronson, J. and Steele, C. (2005) Stereotypes and the fragility of human competence, motivation and self-concept. In C. Dweck and E. Elliot (Eds.) Handbook of competence and motivation. New York, Guilford Press.
Berliner, D. (2002). Educational research: The hardest science of all. Educational Researcher, 31(8), 18-20.
Carraher, T., Carraher, D. and Schliemann, A. (1985). Mathematics in the streets and in the schools. British Journal of Developmental Psychology, 3, 21-29.
Ceci, S.J., and Roazzi, A. (1994). The effects of context on cognition: Postcards from Brazil. In R. J. Sternberg & R. K. Wagner (Eds.), Mind in Context: Interactionist Perspectives on Human Intelligence. New York: Cambridge University Press.
Chatterji, M. (2005). Evidence on “What works”: An Argument for Extended-Term Mixed-Method (ETMM) Evaluation Designs. Educational Researcher, 33(9), 3-13.
Feuer, M. J., Towne, L., & Shavelson, R. J. (2002). Scientific culture and educational research. Educational Researcher, 31(8), 4–14.
Fiske, Edward B. (Editor) (1999). Champions of Change: The Impact of the Arts on Learning (New in 2nd Edition). Massachusetts: Arts Education Partnership Publications. Retrieved from http://www.aeparts.org/PDF%20Files/ChampsReport.pdf
Ginsburg, A. and Rhett, N. (2003). Building a Better Body of Evidence: New Opportunities to Strengthen Evaluation Utilization. The American Journal of Evaluation, 24(4), 489-498.
Hancock, D. (2001, May-June). Effects of test anxiety and evaluative threat on students’ achievement and motivation. Journal of Educational Research, 94(5), 284-90.
Handelsman, J., Ebert-May, D., Beichner, R., Bruns, P., Chang, A., DeHaan, R., Gentile, J., Lauffer, S., Stewart, J., Tilghman, S.M., Wood, W.B. (2004). Scientific Teaching. Science, 304(5670), 521-522, 23.
Koenen, K.C., Moffitt, T.E., Caspi, A., Taylor, A., Purcell, S. (2003). Domestic violence is associated with environmental suppression of IQ in young children. Dev Psychopathol., 15(2), 297-311.
Koretz, D. M., Linn, R. L., Dunbar, S. B., & Shepard, L. A. (1991, April 5). The effects of high-stakes testing on achievement: Preliminary findings about generalizations across tests. Paper presented at the American Educational Research Association, Chicago, IL.
Maxwell, J.A. (2004). Causal explanation, qualitative research and scientific inquiry in education. Educational Researcher, 33(1).
McNeil, L. M. (2000). Contradictions of school reform: Educational costs of standardized testing. New York: Routledge.
Reynolds, D., Nicolson, R.I. and Hambly, H. (2003). Evaluation of an exercise-based treatment for children with reading difficulties. Dyslexia, 9(1), 46-71.
Steele, C. and Aronson, J. (1995) Stereotype threat and the intellectual test performance of African Americans. Journal of Personality and Social Psychology, 69 (5), 797-811.
Stiggins, R. (2004, September). New Assessment Beliefs for a New School Mission. Phi Delta Kappan, 86(1), 22-27.
Sutoo, D. and Akiyama, K. (2003). Regulation of brain function by exercise. Neurobiol Dis., 13(1), 1-14.
U.S. Department of Education Council for Excellence in Government, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance (2003). Identifying and Implementing Educational Practices Supported By Rigorous Evidence: A User-Friendly Guide. Retrieved October 23, 2004 from: http://www.excelgov.org/displayContent.asp?NewsItemID=4885&Keyword=prppc Evidence.
Wright, W. E. (2002, June). The effects of high stakes testing in an inner-city elementary school: The curriculum, the teachers, and the English language learners. Current Issues in Education [On-line], 5(5). Retrieved from http://cie.ed.asu.edu/volume5/number5/