Children are entitled to a broadly-based assessment
of their academic progress, and grading structures that enhance individual
strengths and potential.
Jean Larson, a second-grade teacher in House Two of the Good Common School for the past five years, walks the aisles of her classroom as a test monitor must. Twenty-four young heads bend intently over test sheets; twenty-four young hands grip #2 pencils. And, she's willing to bet, half that many young stomachs are tight with anxiety, despite her careful instructions, despite group-taught relaxation lessons, and despite the fact that she knows she is a good teacher.
She does not administer standardized achievement tests to her students out of any love for large-scale, multiple-choice testing. She administers them because the school district insists. She dislikes test days, has little faith that the test measures what she has taught or what her students have learned, and dreads the reactions of students who score poorly, and of their parents as well.
She dislikes the political ways test scores are used in her school district. Last year, it was obvious that a prize principalship went to the applicant whose previous school had the highest standardized test scores. She knows that these tests yield little diagnostic information useful in planning how to instruct students better.
After administering the tests, she always finds herself filled with feelings of frustration and helplessness. On every test day, she is tempted to ask the Good Common School Council to request a waiver from the school district, allowing the school to replace standardized tests with alternative assessments of student progress. But she has never yielded to the temptation.
Jean knows she is a good teacher, but she doesn't consider herself a leader. Backing her car out of the school parking lot, she remembers the quiet courage of Rosa Parks, and thinks wryly, "This girl's soul is weary too."
As she enters her kitchen, the telephone is ringing. It's the mother of one of her students, anxious about her daughter's progress. When the call is finished, Jean picks up the phone again and dials the number of Ella Davis, who chairs the Good Common School Council. She asks for a place on the agenda of next week's meeting. Permission is quickly granted, as Jean knew it would be. Now the wheels are turning; she can't retreat.
She has two important tasks to accomplish quickly. She must develop a brief position paper based on research to back up her presentation and she must build support among teachers and parents. She doesn't expect many teachers to disagree with her. She knows most of her colleagues share her frustration over standardized tests.
She suspects Phyllis Walker, the new principal, is sympathetic but still susceptible to the influences of central office politics. She knows parents will vary in their responses. Some believe test scores are an accurate indication of how well their children are progressing. Others don't. Even among parents who support the testing program, there is ambivalence; too many youngsters don't sleep the night before test day.
Jean goes to school early the following morning to tell the principal she has taken action. The conversation goes better than she expects. First, Principal Walker reveals that some principals and central office administrators are already critical of the massive changes that have occurred at the Good Common School under her administration. Then she says, "Go for it. The tiger already has so many stripes, one more won't matter." So far, so good.
Next, Jean goes to the teachers' lounge to spend the remaining minutes before first bell buttonholing other teachers. Two of her colleagues are drinking coffee. One responds positively. The other shakes her head and says, "Hey, I just get paid for doing my job. I am not a fan of these tests, but I'm not going to make waves." Jean then heads for her classroom, mentally noting she must call a few parents tonight, including some of those who sit on the school council, to explain her concerns.
During lunch hour, she calls the Boston College Center for the Study of Testing, Evaluation, and Educational Policy and the offices of FAIRTEST, a national organization based in Cambridge, Massachusetts that advocates broadly-based assessment in public schools, to ask for resource materials describing alternatives. Both organizations promise to put information in the mail.
By the time the council meeting arrives, the memorandum is drafted, and Jean is firmly convinced she is doing the right thing. Faculty support is strong; conversations with parent members of the council have been encouraging.
Jean opens her presentation by stating firmly that she favors accountability, but she opposes both the pressures associated with testing and the tendency toward "high-stakes" testing, explaining that the latter involves making important decisions about children's futures solely on the basis of standardized test scores.
She names a few such decisions that affect students: grade promotion, graduation from high school, assignments to remedial classes, assignment to "gifted" programs. Then she gives examples of how "high-stakes" testing affects teachers when merit pay, certification, or recertification ride on a single score. She explains her view that high-stakes testing is really "automatic" decision-making deliberately designed to eliminate the input of educators.
Jean next talks about how she and other teachers often feel pressure to "teach to the test," and how, when they yield to that pressure, parts of the curriculum that deserve in-depth attention receive only a once-over-lightly treatment.
Aware her audience is interested, she scores some additional points. Students may actually possess skills or knowledge measured by the test but may, for many reasons, be unable to demonstrate that knowledge on test day. She notes that students have a legal right to receive instruction in skills and knowledge covered on tests prior to being tested.
She proposes a variety of authentic assessment alternatives, including student portfolios, student performance tasks, student projects, structured classroom observations, and curriculum-based assessment.
Next, she hands out copies of a resolution for consideration by the council. The resolution:
· Asks the superintendent for a waiver from large-scale testing of students and seeks approval of a proposal to test only a random sample of Good Common School students to provide the school district with a measure of the school's quality; these scores will not be entered in individual student records.
· Requests formation of a teacher/parent committee to develop alternative assessment strategies.
· Seeks participation of counselors and parents in the assessment of each child's progress.
· Asks for replacement of simplistic letter grades with written descriptions of progress prepared by a child's teacher.
After some discussion, the council approves the second and third points, tables the fourth point for later consideration, and agrees to send the first point to the superintendent of schools.
The next day, Phyllis Walker hand-carries the first recommendation to the superintendent's office. The superintendent, in turn, presents the request to the school board at its next meeting with a recommendation that it be approved on a one-year pilot basis, effective for the next school year.
At the meeting, two members of the school board are worried the waiver will poke a hole in the district's accountability plan and generate criticism from parents. Several Good Common School parents in the audience speak in support of the proposal. The superintendent notes that increasing numbers of educators are becoming critical of large-scale, multiple-choice testing programs; one state, North Carolina, no longer funds testing of primary school students.
After that, Reverend Washington moves for approval of the proposal. His motion is quickly seconded and the waiver is approved by the school board with a vote of three to two. Reverend Washington asks the Good Common School Council's Assessment Committee to report back to the board on the outcome of its search for assessment alternatives.
For the remainder of the current school year, Good Common School students must still take standardized tests. However, the school council is considering a policy that no decision about a child's placement be made on the basis of a single test score. Next fall, experimentation with assessment alternatives will begin and standardized tests will be used solely as one rough measure of the quality of the Good Common School's instructional program.
The school board meeting adjourns at 10:30 p.m. Jean Larson is bone-tired, but her soul feels fine. She is already thinking about another school practice she dislikes: forcing children to repeat grades.
The superintendent, who closely follows educational research, does not favor holding students back. In fact, every fall he presents research findings to principals showing that:
· children make progress during the year they repeat a grade, but not as much as similar children who are not retained;
· use of transitional grades or tracking on the basis of readiness tests is no more effective than retention.
Although the retention rate at the Good Common School is significantly lower than that of other district elementary schools, it is still high enough to concern the school council, especially since the majority of children who repeat grades are students of color or are limited-English-proficient.
As the council discusses grade retention at its next meeting, one mother in the audience asks to be recognized. When council chair Ella Davis gives her the floor, the woman rises to speak eloquently of her son's experience with retention:
"The day he brought that yellow slip of paper home I was devastated. So was he. He came home for lunch and cried. He felt like a failure at school. I felt like a failure as a mother.
"I didn't know that making children repeat grades is something schools do to cope with their own failures. I took it personally and my son took it personally. All over this city children and parents are taking it personally, while the school system does business as usual."
Another mother in the audience also raises her hand and is recognized. She stands and says she thinks it is just "good common sense" to have a student repeat a year if he is having difficulty.
The principal then reviews the district's promotion policy and gives each member of the council a copy. She notes how research indicates that a single grade retention increases a student's chance of leaving school before graduation by 40 percent, while a second retention increases it by 90 percent (Mann, 1986).
Next, Principal Walker passes out a memorandum summarizing suggestions from researchers to lower retention rates (Smith and Shepard, 1987):
· replace inflexible grade structures with classes that encompass a wide range of developmental levels and learning styles, allowing children to progress at their own pace;
· use a variety of instructional practices that consider natural variations in children's achievement, ability, linguistic competence, and background;
· provide services to enhance school success and minimize school failure, such as tutoring, summer school, learning laboratories, guidance services, parent education, and individualized instruction;
· provide enrichment and remediation in the regular classrooms as students proceed through grades.
After considerable discussion of these suggestions, Ella Davis asks the principal and the school faculty to draft a set of specific recommendations for changes in grouping practices, curriculum, and support services that could be implemented during the coming school year, and to set target goals to lower the school's future retention rates.
Finally, the council chair asks the principal to prepare statistical data at the end of each school year on the number and kinds of children by race, gender, and ethnicity who are retained in grade at the Good Common School. This information will be translated into the necessary languages and distributed to all parents who have children attending the school.
The discussion winds down. Phyllis Walker writes herself a reminder to take care of one last matter. She must ask the Pupil Personnel Services Director to generate a set of labels coded by families' first languages from the district-wide computerized data base. Last year, the Central Office produced parent information about the district's testing program, had it translated into several languages, then delivered stacks of these materials in all languages to each school. In some schools, chaos resulted when teachers who didn't know the home language of their students sent Laotian translations to Khmer-speaking families.
At the Good Common School, consciousness levels have risen a good deal. Still, there is no need to risk repeating that disrespectful and embarrassing blunder. As the meeting adjourns, Phyllis wonders how she would have responded as a young parent if she received a communication from her child's school written in Vietnamese. She might have wondered if it was a message from outer space. In those days, communities weren't nearly as diverse.
STRATEGIES FOR LOWERING RETENTION RATES
Promote an understanding that children do not develop in even stages according to a set internal timetable.
Provide training to ensure that teachers have skills to employ alternative methods of assessment, instruction, grouping, and classroom management to work successfully with highly diverse student groups.
Establish high expectations for all children.
Establish flexible standards of competence in early grades.
Provide enrichment and remediation as students proceed through grades, in the regular classroom to the maximum extent possible.
Use a variety of curricula and instructional practices that consider natural variations in achievement, ability, linguistic competence, and background.
Provide services to enhance school success and minimize school failure, such as tutoring, summer school, learning laboratories, guidance services, parent education, and individualized instruction.
Support teachers as they strive to resist sorting, labeling, tracking, and retention.
Form teacher support teams to assist regular classroom teachers to meet students' differing needs.
Adopt the assumption that promotion is preferable to retention.
Source: Smith and Shepard (1987); Dentzer and Wheelock (1990).
WHY STANDARDIZED TESTS FAIL LEP STUDENTS
Standardized tests are particularly inappropriate when assessing the academic progress of children for whom English is not a first language. Children not fluent in English are likely to score poorly on standardized tests for several reasons.
Many schools fail to instruct limited-English-proficient (LEP) students in their native language while they are learning English.
Many schools terminate English language acquisition and related support services before students have acquired academic command of their new language.
Standardized tests contain inherent cultural biases.
LEP students may lack test-taking skills, and many experience intensified anxieties during test taking.
Source: First et al. (1988).
IMPROVING ASSESSMENT OF STUDENT PROGRESS
The best schools do not rely on data generated by standardized, multiple-choice tests to assess student progress. Instead, they use a variety of assessment strategies to:
When teachers use a variety of assessment strategies to verify a child's progress, there is a high level of accountability. Among such strategies are:
Teachers' observations and notes. Teachers combine information gained from basic evaluative activities, such as focused observations of children's classroom behavior and activities, with examples of a student's work, such as writing, records of books the student has read, and evidence of how well she comprehends and uses the information the books contain.
Student portfolios. These collections contain progressive samples of a student's work, such as successive drafts of a paper he has written. Parents can look at these materials for themselves and compare the teacher's evaluation with their own. This offers important protection against teacher bias.
Checklists and inventories. These simple record-keeping techniques help teachers maintain a clear focus on the progress of an individual child. They should not be used in isolation.
Tests with open-ended questions. These tests help teachers by showing how students think and use knowledge in different subject areas. They are often teacher-made.
Products. A picture drawn in art class is a product. So is a paper based on historical research. Products offer a concrete demonstration of student progress.
These assessment techniques provide teachers and parents with useful information on which to base educational decisions about individual students. Multiple-choice tests can be one part of an overall assessment program when they are administered to a sample of students and are not administered at every grade.
Adapted from Grades 1 and 2: Assessment in Communication Skills & Mathematics, by North Carolina Department of Public Instruction; and Standardized Tests and Our Children: A Guide to Testing Reform, by FAIRTEST, the National Center for Fair and Open Testing.
Problems with Student Assessment and Grade Structure in U.S. Public Schools
Student Assessment
One of the most problematic results of recent school reform is an increased reliance on inflexible measures of student academic achievement. The drive for accountability has further rigidified public schools at precisely the time when shifting student demographics require greater flexibility.
Children perceived as "different" by virtue of their race, language, culture, or economic class are not well served by narrow assessment methods that heighten existing inequities while providing very little information actually useful for improving individual instruction.
Researcher Dennis Wolf (1989) suggests some basic underlying assumptions that shape assessment techniques currently used by most public schools.
As noted by researcher Lorrie Shepard (1989), assessment techniques that serve the growing obsession with competitively ranking schools and school systems are likely to be large-scale, formal, objective, time-efficient, cost-efficient, widely applicable, and centrally processed. In order for the results to be useful to policymakers, they must also be highly simplified. These demands are nicely filled by standardized tests.
Standardized Testing
Researchers Noe Medina and Monty Neill (1988, 1990) observe that public schools in the United States administer an estimated 100 million standardized tests to more than forty million students each year, an average of more than two and one-half tests per student.
The number of states that mandate standardized testing has increased greatly in recent years. By 1987, twenty-four states required students to pass a standardized test before graduating from high school, twelve states used standardized tests to determine grade promotion, and forty-two states used standardized tests for student assessment.
Standardized tests are now widely employed to make critical decisions in early elementary school grades. A survey conducted by M. Therese Gnezda and Rosemary Bolig (1989) reports that thirty states employ readiness testing prior to kindergarten, and finds use of readiness tests prior to first grade in forty-three states. As a result, more and more five- and six-year-old children are being denied the opportunity to enter or attend school with their age-mates.
Researchers George Madaus and Diana Pullin (1987) warn of the perils of "high-stakes" testing: the reliance on standardized tests to make important decisions such as school admission, grade promotion, assignment to remedial classes or "gifted" programs, graduation from school, allocation of funding, or certification of teachers or schools.
High-stakes testing usually begins when a state board of education or the legislature mandates a testing program for implementation by the state Department of Education, often as the result of demands for increased accountability or tougher school standards. Because of understaffing, many state education departments must contract with an independent vendor to create, validate, score, and report test results. Generally, this contract goes to the lowest bidder.
Madaus (in Brandt, 1989) notes that two types of testing companies have come into existence over the years: established, main-line businesses that publish and market their own tests, and smaller companies that build customized tests for their clients. These smaller companies have captured a large share of the lucrative state-level market, even though many states lack the personnel or training to properly evaluate the technical adequacy of test contractors.
The type of test most often used is a traditional norm- or curriculum-referenced test. Nevertheless, school districts simply set a cut-off score and assume the test is valid for use in any context. The pressure to come up with test scores, particularly higher scores, has become great, while there is little political incentive to properly evaluate the tests or their impacts.
The development and selection of standardized tests is driven more by concerns of cost and ease of use than by quality and accuracy. The American Educational Research Association (1985) has joined other national organizations to urge those who develop and use standardized tests to consider issues of 1) reliability, 2) validity, 3) norming, and 4) bias, but many test developers address these issues in either a cursory or inadequate manner.
1. Test reliability assesses the degree to which a test provides a consistent measure. In demonstrating reliability, test publishers generally examine consistency among different forms or subsections of a test. They rarely inform potential test buyers about variations in individual test scores from one administration of the test to the next.
2. Test validity assesses how well a test measures what it claims to measure and what can be accurately inferred from its measurement. Efforts to demonstrate test validity often rely on review of test items by a panel of experts or a comparison of test results with results from other standardized tests. Because validity relates to the particular use of a test, not simply to the test itself, a test can have high validity for one purpose and low validity for others. Test publishers and purchasers often ignore or fail to understand this and assume or imply that tests are equally valid for all purposes.
3. Establishing test norms to effectively gauge student scores has become controversial in recent years. Because test norms are often based on populations considerably different from those being tested, some norms are so inaccurate and out-of-date that they artificially inflate test scores, leading to the ludicrous impression that most children are performing above average.
4. Tests can be biased because they are designed by and for White, middle- to upper-class individuals and rarely offer an accurate measure of academic skills and abilities of minority, low-income, or limited-English-proficient students. Instead, such tests actually measure divergence from White, middle- and upper-class language, cultural experiences, and learning style.
While proponents base increased reliance on standardized tests on claims of improved educational accountability, evidence continues to mount that such tests: employ faulty assumptions about how children learn; fail to provide parents, students, and teachers with useful diagnostic information; and are racially, ethnically, and economically biased.
Standardized tests operate on the assumption that human intelligence is essentially one-dimensional and that learning patterns are consistent among different individuals. This simplistic view of thinking and learning is contradicted by established theories of intelligence and child development which conclude that children develop at their own pace, following individual paths to learning, problem-solving, and thinking.
Because standardized tests focus on a limited range of basic academic skills, their increasing importance distorts student instruction. "Teaching to the test" has narrowed curricula in many classrooms, a phenomenon intensified by textbook publishers who shape book content to prepare students for standardized tests, a trend that has not made the learning experience more interesting to students. It is ironic that many efforts intended to promote "higher-order" skills (such as decision-making, problem-solving, and comprehension) have also mandated the use of standardized tests that measure little beyond basic academic skills.
The National Association for the Education of Young Children (NAEYC, 1988) notes that multiple-choice, true/false, and fill-in-the-blank questions generally test basic reading, writing, or mathematics skills in an artificial manner. Even though current research supports an approach to reading instruction that integrates oral language, writing, reading, and spelling in a meaningful context that emphasizes comprehension, standardized tests view reading simply as word recognition and phonics.
NAEYC also observes that, while current theories of math instruction focus on use of first-hand experiences that allow a child to construct a concept of numbers, standardized tests continue to define math skills as knowledge of numerals. Test results reveal little about the ability to read and comprehend, to write and communicate ideas effectively, or to compute and use computations to solve real-world problems.
In most high-stakes testing programs, a single test score, called a cut-score, is used as a point at which automatic decisions are made about students, ignoring the fact that most skills, knowledge, or competencies are continuous in nature. George Madaus (in Brandt, 1989) observes that the use of cut-scores makes each question a "mini-test" in itself, because a single wrong answer or faulty, ambiguous, or miskeyed item can be the difference between passing or failing the test.
Standardized tests offer one single opportunity for students to display the skills or knowledge they possess. For many reasons (personal, health, problems at home, test anxiety, lack of test-taking skills), a child may be unable to demonstrate these skills or knowledge when taking the test.
While many seem persuaded that standardized test scores are accurate indicators of academic achievement, simplistic interpretations of test scores as a tool for educational accountability fail to consider a dark truth. There are two ways to "raise" student test scores. Children can learn more, or the composition of the testing pool can be altered. Low-achieving students, who may get lower test scores, can be removed from the testing pool by labelling them as handicapped and placing them in separate special education classes, by transferring them between elementary schools, or by pushing them out of school by "counseling" them to leave before graduation.
Readiness Testing
The most disturbing trend in recent years is the increased use of standardized tests to delay a child's enrollment in kindergarten, to retain young students in a second year of kindergarten, or to place children in "extra-year" programs such as junior kindergartens, developmental first grades, or "transition" classes. A survey by researcher Tom Schultz (1989) revealed that forty states report the use of developmental kindergartens or transitional first grades in some school systems.
Barbara Willer and Sue Bredekamp (1990) of the National Association for the Education of Young Children view the increase in readiness testing as a means of "gatekeeping" rather than as an earnest effort to determine children's needs as a basis for effective intervention and provision of services. They suggest a series of faulty assumptions that act to exclude children who fail to demonstrate certain skills in a certain way:
Learning only occurs in school. Although parents are children's first teachers, providing rich, experiential bases for language, social, emotional, physical, and cognitive development, current conditions in the U.S. work against children being prepared for formal school experience; more young children are in poverty than members of any other age group; children start life ready to learn, but the lives they live enhance or restrict their potential.
Readiness is a specific condition inherent within every child. In fact, readiness is multidimensional, including social skills, physical development, cognitive abilities, and emotional adjustment; tremendous individual differences exist; gatekeeping definitions of readiness are based primarily on a single dimension such as knowledge of numbers and letters.
Readiness can be easily measured. Researchers agree that no reliable and valid measure of readiness is available.
Readiness is more a function of time; some children need more time than others. Outdated theories of child development view growth as a function of maturation; development is now viewed as an interaction between what goes on inside a child's head and their experience with people and objects; rather than waiting for it to occur, adults have to play an active role by structuring an environment that challenges the child's construction of new knowledge.
Children are ready to learn when they can sit still and listen to the teacher. Children learn through active manipulation of materials and experiences; when forced to sit still, not talk, and circle the correct answer, they learn: 1) not to work cooperatively with peers; 2) there is only one right answer; and 3) learning takes place to pass a test, not for the sake of learning.
Children who aren't ready don't belong in school. The very traits that are used to label children as "unready" are often those best developed in school; the concept of "readiness" assumes it is the child's task to meet demands of school, rather than the school's job to be ready for the child.
Grade Structure
Public elementary schools are generally organized around a rigid set of sequential grade levels, beginning with kindergarten and, depending on the school system, continuing through grades 4, 5, 6, or 8.
Although students are expected to remain in each grade for a single school year, they are allowed to proceed to the next grade only upon demonstrating competence in knowledge and skills deemed appropriate to that grade level. Students who fail to demonstrate the necessary competence are not promoted and are generally required to repeat the grade until competency can be demonstrated.
Grade Retention
Because the practice of requiring children to repeat grades has always been viewed as a state or local issue, there are no reliable longitudinal data available on national retention rates in public schools. After reviewing available state and local data, researchers Mary Lee Smith and Lorrie Shepard (1987) estimate an overall retention rate of 15 to 19 percent, placing U.S. public schools on a par with those of countries like Haiti or Sierra Leone. By contrast, Japan has a 1 percent rate of grade retention.
In a 1989 survey of forty urban school districts, Joseph Gastright (1989) found that:
1. Before assessment procedures and exercises are developed, educational standards should specify what students should know and be able to do.
For assessment information to be valid and useful, assessment must be based on a consensus definition of what students are expected to learn and perform at various developmental stages. Such standards should address important abilities, such as problem solving, rather than discrete pieces of information or isolated skills. Standards should be determined through open discussion among experts, educators, parents, policymakers, and others, including those concerned with the relationship between school learning and life outside of school.2. The primary purpose of assessment should be to assist both educators and policymakers to improve instruction and advance student learning. Students educators, parents, policymakers, and others have different needs for assessment and different uses for assessment information. For example, teachers, students, and parents may want information on individual achievement, while policymakers and the public may want information for accountability purposes. In all cases, the system should provide not just numbers or ratings, but useful information on particular abilities. All purposes and uses of assessment should be beneficial to students; assessment that cannot be shown to be beneficial should not be used at all.
3. Assessment standards, tasks, procedures, and uses should be fair to all students. Because individual assessment results often affect both students' present situation and future opportunities, the assessment system, the standards on which it is based, and all of its parts must treat students equitably. Assessment tasks and procedures must be sensitive to cultural, racial, class, and gender differences, or disabilities, and must not penalize any groups. To ensure fairness, students should have multiple opportunities to meet standards in different ways. No student's fate should depend upon a single test score. Assessment information should be used fairly. It should be accompanied by information about access to the curriculum and about opportunities to meet the standards. Students should not be held responsible for inequities in the system.
4. The assessment exercises or tasks should be valid and appropriate representations of the standards students are expected to achieve.
A sound assessment system provides information about the full range of knowledge and abilities considered valuable and important for students to learn. Multiple-choice tests, the most common type of assessment, are inadequate to measure many of the most important educational outcomes and do not allow for diversity in learning styles or culture. More appropriate tools include student portfolios, open-ended questions, and extended reading and writing experiences which include rough drafts and revisions, individual and group projects, and exhibitions.
5. Assessment results should be reported in the context of other relevant information. Information about student performance should be one part of a system of multiple indicators of the quality of education. Multiple indicators permit educators and policymakers to examine the relationship among context factors (such as the type of community, socioeconomic status of students, and school climate), resources (such as expenditures per student, physical plant, staffing, and money for materials and equipment), programs and processes (such as curriculum, instructional methods, class size, and grouping), and outcomes (such as student performance, dropout rates, employment, and further education).
6. Teachers should be involved in designing and using the assessment system. For an assessment system to help improve learning outcomes, teachers must fully understand its purposes and procedures and must be committed to, and use, the standards on which it is based. Teachers should participate in the design, administration, scoring, and use of assessment tasks and exercises.
7. Assessment procedures and results should be understandable. Assessment information should be understandable to those who need it-students, teachers, parents, policymakers, and the general public. At present, test results are often reported in technical terms that are confusing and misleading. They should be reported, instead, in terms of educational standards.
8. The assessment system should be subject to continuous review and improvement. Large-scale, complex systems are rarely perfect. Even well-designed systems must be modified to adapt to changing conditions. Plans for the assessment system should provide for a continuing review process in which all concerned participate.
North Carolina Elementary Schools:
Grades 1 and 2
In the summer of 1988, the Mathematics Section and Communication Skills Section of the North Carolina Department of Public Instruction received the mandate to develop alternative assessments for grades 1 and 2 in lieu of the California Achievement Test.
Prior to passage of the assessment reform legislation, staff at the Communication Skills Section were concerned that standardized testing was driving the curriculum. They feared that teachers, pressured to increase test scores, were teaching isolated test skills rather than developing integrated reading, writing, and oral language abilities. Over the preceding year, they had designed an alternative assessment for use with first and second graders to assess communication skills. Their draft assessment instrument had been sent to over 300 teachers for feedback and recommendations.
While staff from the Mathematics Section had not developed an assessment tool, they too were concerned that teachers were teaching mathematics as a set of separate discrete skills instead of a series of interrelated concepts. When they received the directive from the State Board of Education, they were in the midst of a mathematics leadership skills seminar attended by more than 300 teachers from across the state identified as leaders in mathematics instruction. Workshop sessions were modified to include extensive discussion on the design of the assessment instrument. Teachers' input was used to design alternative mathematics assessment techniques.
Developers of the alternative assessment program wanted to "affect instruction through assessment," according to Jeanne Joyner, Elementary Consultant with the Mathematics Section of the North Carolina Department of Public Instruction. Her counterpart at the Communication Skills Section, Cindi Heuts, concurs: "We wanted to put the focus of assessment back where it should be-improving instruction. Educators must become keen observers of children by looking at their accomplishments and mistakes as part of the developmental process. It's not that children have failed but that they have not yet reached that level of understanding."
The assessment program is designed to gather information about individual children in order to plan future lessons, to document each child's progress toward specified goals during the year, and to evaluate each child's achievement of those goals. In both communication skills and mathematics, assessment evaluates development in several content areas. For communication skills, these include oral language, orientation to print, reading strategies, listening and silent reading comprehension, and unassisted writing. The seven content areas for mathematics, based on recommendations of the National Council of Teachers of Mathematics, include numeration, geometry, classification and patterning, measurement, problem-solving and mathematical thinking, understanding and using data, and computation.
Evaluation of a student's progress and understanding in any of the content areas is based on information from a number of sources gathered over a period of time. Responses to questions, demonstrations with materials, products and displays, samples of written work, and informal observations all provide teachers with information for completing student profiles.
Summaries made two or three times a year on the assessment profiles are to reflect multiple sources of information. The profiles are not designed to be used as check-off sheets but rather as a synthesis of anecdotal records kept by the teacher on each child. Because teachers verify children's progress through a number of methods, a high level of accountability is possible.
According to the developers, the assessment program is more appropriate for young children than standardized tests because it evaluates children's understanding through what they are able to do, as well as what they are able to explain and to write. Teachers use activities and manipulative materials as part of the evaluation. The assessment acknowledges children's early abilities to solve problems, identify patterns, and see relationships by validating other demonstrations of these capacities in addition to pencil and paper examples. The assessment assists teachers in planning appropriate experiences for individual children and identifies their progress toward curriculum goals, as well as their achievement of them.
The assessment program was piloted at eight elementary schools across the state during the 1988 to 1989 school year. In September, teachers at the pilot schools attended two days of intensive training on the assessment program.
In November, a full-day follow-up session of all pilot teachers took place to address teacher concerns, share ideas, and develop and revise strategies for implementation. Teachers met again for two days in January, not only to continue sharing their ideas, experiences, and recommendations but also to celebrate their successes. Throughout the year, state and regional staff maintained contact with teachers piloting the program in order to respond to concerns, give suggestions, and hear teachers' recommendations for revisions of the assessment tools. In the spring, videotapes were made of teachers and students engaged in assessment activities for use in the more extensive statewide training.
In the summer of 1989, the State Department of Instruction began statewide training for the assessment program. Because schools are not mandated to use the new assessment techniques-the legislation prohibited the use of state money by local school districts to purchase standardized achievement tests for first and second graders but mandated only the development of alternative assessments, not their implementation-staff can only invite school systems to participate. They encourage local school systems to select one "lead" teacher for every twenty first- and second-grade teachers to send to the statewide training-an intensive week-long summer institute conducted by a state team. The training team works with the lead teachers, who in turn conduct local training sessions using videos and other materials prepared by the state team.
Training for lead teachers focuses on the philosophy and purpose of the assessment program. Teachers examine the first- and second-grade curriculum, discuss issues and concerns related to mathematics and communication skills for young children, and work with manipulative materials and activity-oriented lessons.
The training also focuses on the specific staff development sessions lead teachers are responsible for conducting in their own school systems. Participating teachers receive classroom sets of profiles and guidebooks for teachers that review the philosophy, discuss record-keeping, and detail strategies for using the assessment program. Brochures are also provided to distribute to parents early in the school year. In addition, each school system receives a videotape for training that is also suitable for parents' nights and community awareness programs.
During the year, regional consultants in each of North Carolina's eight education regions are available for staff development. They provide support to teachers, locate appropriate materials, bring in speakers, and observe in classrooms. In addition, local school systems are encouraged to sponsor and facilitate sharing and discussion sessions for participating teachers every four to six weeks. Early informal surveys indicate that the most successful implementation of the assessment is occurring where administrative support is the strongest. This support includes purchase of appropriate materials, special training sessions for school staff, parent nights, and monthly teacher sharing sessions.
Carol Midgett, a second-grade teacher at Southport Elementary School, has "tremendous support from the principal" for the assessment program. Located in rural working-class Southport, the school hosts over 800 children in kindergarten through fifth grade. Because the school was one of the eight that originally piloted the assessment program, Midgett attended both the pilot trainings and the statewide week-long trainings, and she has conducted staff development sessions for local teachers.
Assessment has become an integral part of Carol Midgett's teaching day.
As she puts it: "I seize every opportunity for teaching moments and assessment moments from the minute I walk into the classroom until I leave for the day." She finds she has become more attuned to what her children know, what they are learning, and what they need to know and has become a "better observer of kids, the activities they engage in, and the meaning of these activities."
Portfolios of her children's work and her observations provide accurate and useful records of when children demonstrate knowledge of a concept as well as the depth of their understanding. Through the math journal each child keeps, Midgett has often discovered that children understand concepts she did not realize they had grasped, or that they have failed to grasp others she thought they understood.
A writing folder allows children to visualize their progress from the beginning of the year to the end. At the close of the year, she takes representative samples of work and binds them in a book for each child, to reaffirm their growth and achievement over the year. She also finds the portfolios "a tremendous value" in demonstrating children's progress to their parents. "They reaffirm parents' beliefs that their children are learning and picture it in a very convincing way."
To date, the State Department of Instruction has trained teachers to use the assessment program in 106 of the 134 school districts in the state. Close to 1,000 lead teachers have been trained in either the communication skills or mathematics assessments. However, the degree of implementation of the assessment program has varied from school district to school district, ranging from a few first- and second-grade teachers in scattered schools to all the first- and second-grade teachers in every elementary school in a system. Because school systems are not mandated to use the assessment instrument, it is up to individual districts to make decisions regarding use and implementation.
State staff have discovered that the largest obstacle teachers need to overcome is continuing to view assessment as something they do at the end of an activity rather than as part of ongoing observations and documentation.
As Cindi Heuts points out: "They need to fight the temptation to see assessment as something they do at the end of six weeks."
State staff have also learned that some administrators have concerns about accountability. They question how they can evaluate teachers who use the assessment program, since it does not provide hard statistics as standardized tests do. State staff stress that this assessment process has more accountability than most tests-teachers must keep observation records and samples of children's work to document their progress and achievement.
For further information, please contact: Ms. Jeanne Joyner or Ms. Cindi Heuts, Department of Public Instruction, 116 West Edenton Street, Raleigh, NC 27603.
Graham Parks School:
Cambridge, Massachusetts
At the Graham Parks School, kindergarten through eighth-grade classrooms are structured to celebrate and promote diversity. As Principal Len Solo states: "We believe that diversity is the best starting point for learning."
The 400 children enrolled at the school reflect a diverse student population. Fifty-three percent of the children are of color, 23 percent of whom are recent immigrants from Haiti. Sixty percent of the children come from working-class or poor families. Because the school has gained a reputation among parents for success with learning-disabled students, 35 percent of its student population has special learning needs.
All the classrooms at the Graham Parks School, with the exception of kindergarten, are comprised of children from more than one grade. The school has experimented with several different classroom structures over the years but has settled on one that combines two grades in each classroom with no overlap among classrooms. Children are placed in a first/second, third/fourth, fifth/sixth, or seventh/eighth classroom. They remain with the same teacher for two years.
In the past, classrooms were structured for overlap between grades; for example, one class was a second/third while another was a third/fourth, to help meet the diverse needs of students. While this structure offered optimum flexibility for placing students, it created difficulties for freeing teachers to meet for joint planning times, staff development, and support. In the end, the staff chose to give up some of the flexibility in return for consistency in staff support and development as a team.
Teachers form teams based on their classroom structure. For example, all the kindergarten and first/second classroom teachers form a team, all the third/fourth teachers form a team, and so forth. First/second, third/fourth, and fifth/sixth teams meet once a week after school. Seventh/eighth teachers meet three times a week during the school day. The school is presently examining ways to replicate the seventh/eighth team structure at the lower levels. During team meetings, teachers discuss instructional approaches, discuss and develop curriculum, share materials and ideas, and address any individual student concerns teachers may bring before the team.
A complex process that involves staff members as well as parents is used to place students in classrooms. The appropriate "sending" teacher team, the special needs and bilingual teachers, the staff developer, parent coordinator, and principal review all students moving into the next grade cluster. Using corkboards and visual aids (for example, different-colored index cards to signify various ethnic backgrounds), they create groups of students that are balanced by race, sex, academic achievement, socioeconomic status, and special education and bilingual status. According to Principal Solo, the goal is to "create groups that mirror each other as much as possible." The staff then pulls teachers' names out of a hat to match each group with a particular teacher. The school also solicits feedback from parents on what class they would like their children placed in based on their learning needs and friendship groups.
After the initial grouping decisions are completed, the staff reviews them with the appropriate "receiving" teachers and two additional parents. At this point, parent requests and teacher concerns are taken into consideration. However, maintaining the balance among the classrooms is a top priority. Children are only moved if the change does not affect the overall balance. Letters are then sent home to parents sharing placement decisions.
A grievance process is available to parents who are not satisfied with their child's classroom placement. A grievance committee composed of the principal, the staff developer, the parent coordinator, and two other parents hears parent concerns. According to Principal Solo, the committee hears an average of ten grievances a year, of which half are usually granted.
Judy Richards, a third/fourth-grade teacher, espouses an "approach to teaching and learning that embodies democratic values and seeks pluralism." According to Richards, teachers must address the status of each child in the classroom, including validating and respecting their culture. They must also recognize that children learn in a number of different ways.
Linda Fobes, a first/second-grade teacher, points out that all teachers have a group of students with a wide range of abilities, regardless of classroom structure. The teacher's role is to assess each child's strengths and needs and develop and draw on strategies, resources, and materials to facilitate each child's progress.
In Richards's and Fobes's classes, children are active participants and take considerable responsibility for their own and their peers' learning. Fobes begins the year by exploring which themes or topic areas students most want to learn about. After they make decisions about topics, she develops charts with various subject areas they plan to cover over the year. Then they decide on activities and lessons for the various subject areas.
For example, because many of the children were interested in learning about dinosaurs, this was a topic studied for a couple of months. For mathematics, the children measured dinosaurs. They also made dinosaur cookies. As a class, they collaborated on building a dinosaur skeleton out of cardboard with each child responsible for different bones. In addition, they made dinosaur eggs to study dinosaurs' reproductive process. For language arts, they each produced dictionaries on dinosaurs. In another collaborative class project, the children made a Big Book (an oversized book children can read together) on dinosaurs entitled Dinomites. Each child had to make up or "discover" a dinosaur, give it a name, and provide two interesting facts. Once the book was completed, the class read it to fifth/sixth graders.
In Judy Richards's classroom, the curriculum is also theme-driven. For example, a unit on "Changes" covered base changes in number systems for mathematics and verb changes in both English and Haitian Creole for language arts. In science, changes studied included chemical changes, the growth cycle, and the birth and death cycle. The children also read folk-tales and studied historical changes and character and plot changes.
The children in Richards's classroom often teach each other. As Richards puts it: "There are twenty-six teachers and twenty-six learners in my classroom." She has developed a mechanism for her students to let her know if they are grasping a concept; if children understand a concept, they signal with a "thumbs up" sign; if they don't, the signal is "thumbs down." Judy believes teachers too often repeat the same auditory approach-"saying it once, saying it again, and saying it yet again louder." Instead, Richards has the "thumbs up" students "repackage" or use a different approach to explain the concept to the "thumbs down" students. In this way, students have the responsibility for teaching their peers, who are more likely to grasp the concept because it is presented in a number of different ways. The students become involved in their own education and get excited about their capacity to learn.
Children teaching children and working together cooperatively in a variety of small groups is a thread that runs through all classrooms at Graham Parks. Children often tutor peers who are having difficulty and collaborate on whole-class or small-group projects. Because the curriculum focuses on themes and content rather than skills alone, all the students work together regardless of individual skill level. For example, in Linda Fobes's class, groups of children worked together to construct Native American villages. Each group decided what tribe to study and developed and assigned the various tasks they had to complete to successfully construct the village.
Teachers strive to create an atmosphere where children recognize that each of their peers has important and valuable knowledge to offer the group. Differences are recognized but never seen as deficits. Diversity and different cultural contributions are recognized and celebrated throughout the curriculum. As Judy Richards states: "It is essential to make all children's words, all children's heritages part of the curriculum."
A unit on "Islands" in Linda Fobes's class allows children to spend considerable time studying the history and culture of Haiti. When covering the topic of "Space," as with all topics, they study the contributions to space explorations by scientists and others from various cultures. Judy Richards uses the folklore of her students to present mathematics problems.
While meeting the diverse academic needs of students in multi-graded classes requires considerable skill, both Judy Richards and Linda Fobes agree that this is not the biggest challenge they face. Rather, it is the differences in social/emotional development among the children. Peer tutoring and cooperative group activities allow children to take on leadership roles and help their less mature peers. In addition, both teachers recruit parents to assist in their classrooms, providing an extra pair of hands in the classroom and allowing parents to take on an important role in their children's education. Discrepancies between home and school culture are reduced, parents and teachers acquire a better understanding of each other, and children see that their parents value their education.
The ability to develop strong, rich relationships with parents is one benefit of having children stay with the same teacher for two years. Parents and teachers are more invested in knowing each other because they share their children's lives for an extended period. Having children for two years also means that teachers can better understand and plan for the needs of half of their students before the beginning of the next school year. Moreover, because the older children have already developed a trusting relationship with the teacher and understand the routines and structures of the classroom, they become leaders-or, as Judy Richards says, "disseminators of the classroom culture"-and can help younger children to adjust successfully to their new classroom.
Linda Fobes has developed activities to promote the leadership role of her second-year students. At the end of each year, her upcoming first graders visit her classroom and meet her present first graders. Children choose buddies and before the year is over, the older children write welcoming letters to their buddies. Linda sends the children's letters in August with her cover letter telling the children to look for something special in their cubby on the first day of school. The children arrive at their first day in their new classroom knowing they have a special surprise in their cubby along with an older buddy to show them around. For the first few weeks of school, the older children are responsible for helping their buddies during lunch, recess, at the end of the day, and at any time during the day if they are having difficulty finding materials or with the general routines of the classroom. Buddies also help each other with lessons and activities.
With up to six or seven grade levels in some multi-graded classrooms, academic ability range can be very wide. Linda Fobes's first/second classroom has children working on pre-reading skills while others read at a sixth-grade level. Because children remain with the same teacher for two years, grade retention is an infrequent practice at the Graham Parks School. According to Principal Len Solo, only two or three children are retained each year, and retention is very rare beyond first grade. A full review of the child by a team of staff members with parent input is completed before any decision is made to retain a child.
Successfully teaching multi-graded classes requires careful planning and organization. Among other things, the curriculum must be planned for two-year cycles because children stay with the same teacher for two years. Adding to this difficulty is the fact that the majority of commercial curricula still focus on specific subject areas and skill development. At Graham Parks, teachers are not boxed in by a standard curriculum they must follow and complete. They are empowered to develop their own curriculum within school and city guidelines.
The team structure becomes critical in facilitating this process. Teachers share resources, materials, and ideas. They develop curriculum together and problem-solve together to address any difficulties a team member may be experiencing in his or her classroom. Ongoing communication, support, and mutual respect help teachers to build confidence and encourage them to become risk-takers in their classrooms. As Linda Fobes concludes: "The most important thing is the support we get from each other. Here one always feels that there is more to do, that whatever you are doing you can do it better. We are able to communicate with each other and have a lot of respect for each other. People go into each other's rooms. There is an openness and willingness to share that you do not find in many schools. All too often in many schools teachers go in their classrooms and close the door."
For further information, please contact: Mr. Len Solo, Principal, Graham Parks School, Upton Street, Cambridge, MA 02139.
Atlantic Center for Research in Education:
Campaign to Eliminate Testing
The campaign to end the use of norm-referenced standardized testing in the first and second grades of North Carolina's public schools began when the state's General Assembly voted in 1983 to use the California Achievement Test in early grades.
The Atlantic Center for Research in Education (ACRE) opposed the use of this test because of its low reliability and validity for young children, and its failure to provide useful instructional information to teachers. Additionally, the tests resulted in narrowed, test-driven curricula and pressured school staff to spend a disproportionate amount of instructional time testing rather than teaching their students.
According to ACRE staff, repealing the testing mandate was "as much a state of mind as any set of techniques." Working to eliminate testing required a strong conviction that these tests were ineffective and destructive to students, particularly young children. Because testing enjoyed strong popular support, as well as the backing of the governor and the State Superintendent of Public Instruction, ACRE knew it was in for a long battle.
Monitoring the Testing Environment
ACRE began by gathering research on testing and collecting information and support from testing experts. In addition, ACRE monitored meetings of the State Board of Education and the North Carolina Testing Commission-the organization responsible for the state's standardized testing programs.
The decision to monitor these meetings helped to shape ACRE's strategy in a number of ways. When discussion of the California Achievement Test appeared on the commission's agenda, ACRE was ready to provide an alternative viewpoint. Discussions and presentations in these meetings enabled ACRE to understand political and administrative attitudes toward testing, allowing them to locate the most vulnerable point of attack. This specialized knowledge led ACRE to focus on first and second grades, since people seemed reluctant to label children at an early age.
While the state's General Assembly was in session, ACRE monitored relevant committee meetings to better understand the workings of the assembly and the positions of individual legislators. From its past legislative activities, ACRE was able to gain support from a small group-spanning the political spectrum-within the assembly. However, development of a strong legislative movement was frustrated by the governor's strong support for standardized testing.
Finding a Strong Ally
ACRE found a firm ally in the North Carolina Association for the Education of Young Children (NCAEYC), composed of professors, teachers, and day care providers concerned with the quality of services for children from infancy to eight years of age. The ACRE staff had worked previously with NCAEYC and knew them to be a highly effective policy and lobbying organization. When ACRE staff formally presented their ideas to NCAEYC, both organizations discovered shared views about the use of standardized testing in early grades.
The two groups complemented each other. ACRE had experience with lobbying and cutting through the bureaucratic maze of the State Department of Public Instruction. NCAEYC had 1,700 dues-paying members with vast knowledge about testing, a base that facilitated letter-writing campaigns and telephone trees. In addition, ACRE held workshops, published articles in its newsletter, and created the Parent and Citizen Test Review Commission to champion the cause of opposing standardized testing.
A Move to Legislative Action
In 1987, NCAEYC persuaded a state legislator to introduce a bill to stop standardized testing of first and second graders. By this time, the governor was less interested in testing, and the legislature was beginning to hear complaints about the amount of classroom time devoted to testing and the stress it was causing to young children.
ACRE's organizing efforts produced a small group of dedicated teachers willing to lobby the state assembly in conjunction with the testimony of NCAEYC teachers presented to legislative committees. To quote an ACRE staff member, "Few things are more persuasive than having someone who looks like your first-grade teacher tell you what to do." The telephone tree and letter-writing campaigns, along with other organizing and lobbying efforts, resulted in unanimous passage of the anti-testing bill by the state senate.
Members of ACRE and NCAEYC attended all meetings of the House Education Committee, finding the bill stalled when opposition surfaced from the state superintendent and the former governor. Political pressure had caused the committee chair to dramatically slow down the bill's movement through the committee process.
ACRE and NCAEYC reported the stalled bill to three supportive state senators from the Appropriations Committee. While it was impossible for these legislators to get the bill out of the House Education Committee, they were able to attach an amendment to the overall education budget that removed $150,000 in funding for testing and transferred it to the Division of Instructional Services, with the express condition that it be used to produce developmentally appropriate, individualized assessment instruments for first and second graders. By continuing their telephone and letter-writing campaigns, ACRE and NCAEYC helped to enact a further provision in the spring of 1988 to eliminate the testing requirement.
What Made It Work
The greatest strength of this campaign was the combined effort of two organizations with a shared vision of eliminating standardized tests from the lives of young students. One group offered a sizable membership with education knowledge and experience. The other offered a politically experienced staff.
Six years is a long time to sustain a volunteer group as it faces off against the vast capacities of a state bureaucracy and national testing companies.
This campaign, like many grassroots efforts, suffered from a lack of time and money. Due to lack of resources, the campaign had limited success with generating parent support. But it was able to maintain an astonishingly persistent group of volunteers who became increasingly skilled in lobbying and communicating.
For further information, please contact: Atlantic Center for Research in Education, P.O. Box 1068, Durham, NC 27702. Telephone (919) 688-6464.
RESEARCH APPENDIX
Current research relating to topics discussed in Chapter 4
I. STUDENT ASSESSMENT
Increased Use of Standardized Tests
Public schools in the United States administer an estimated 100 million standardized tests to their more than 40 million students annually, an average of more than two and one-half tests per student (Medina & Neill, 1990). The volume of testing has been increasing by 10-20 percent annually over the past 40 years (Haney & Madaus, 1989).
Between 1985 and 1987: the number of states requiring students to pass a standardized test to graduate from high school increased from 15 to 24; the number of states using standardized tests to determine grade promotion increased from 8 to 12; and the number of states using standardized tests as part of a state student assessment program increased from 37 to 42 (Medina & Neill, 1990).
In many public schools, standardized tests are the primary or sole criteria for making a number of decisions affecting students, teachers, and schools. Standardized tests determine: student assignment to special education or remedial programs; admission to gifted and talented or accelerated programs; grade promotions; and high school graduation (FairTest, 1990; Medina & Neill, 1990).
Problems with Standardized Tests
Compelling evidence undermines claims by test developers that standardized tests are scientifically developed instruments which simply, objectively, and reliably measure student achievement, abilities, or skills.
When test publishers claim that standardized tests are "reliable,"
Dennie Palmer Wolf (1989) contends that current assessments not only fail to promote but actually prohibit students from becoming thoughtful critics of their own work. The message of tests is that what matters is a "slice of skills" addressed by the test; that on-the-spot work is sufficient; and that achievement takes priority over development. Along with other critics of standardized tests, Wolf advocates adoption of assessments based on students' work and growth over a period of time. A portfolio, or "longitudinal collection," of students' work would serve not only to give a full and accurate representation of their abilities and development, but also to expand the students' own understanding of the learning process.
Just as curriculum has been narrowed, so too have textbooks. A recent report by the Council for Basic Education concludes that "instead of designing a book from the standpoint of its subject or its capacity to capture the children's imagination, editors are increasingly organizing elementary reading series around the content and time of standardized tests... As a result, much of what is in the textbooks is incomprehensible" (Goodman et al., 1988; Tyson-Bernstein, 1988).