The Impact of Score Differences on the Admission of Minority Students: An Illustration
National Board on Educational Testing and Public Policy
Carolyn A. and Peter S. Lynch School of Education
Volume 1, Number 5 June 2000
In this paper, we discuss one of the arguments that has been advanced against the use of standardized college admissions tests: the notion that their use leads admissions officers to reject non-Asian minority students on the basis of small and insignificant differences in scores. We do not discuss in detail any of the numerous other arguments about college admissions testing, although we do briefly comment on two of them: the possibility of test bias against minorities, and the relative size of the gap between minority and non-minority students on admissions tests compared to the gaps shown on other achievement tests.
The average differences in performance between non-Asian minority students and majority students are very large and have a major effect on the selection of students and the composition of the selected student population. This effect becomes progressively larger as schools become more selective, not because of any idiosyncrasy of the tests used for selection, but simply because of the distribution of student performance. Some students (in any racial or ethnic group) will always fall on the margin of acceptance, of course, and for those individual students, a small change in test scores might tip the balance toward acceptance or rejection. This fact, however, should not obscure the magnitude of the average differences between the groups. In the aggregate, the disadvantage minority students face as a result of their test scores is not a matter of small differences at the margin. Efforts to improve access for minority students must address that fact.
Are Admissions Tests Biased or Atypical?
A full discussion of possible bias in college admissions testing is beyond the scope of this paper, as is an evaluation of the agreement between admissions tests and other measures of student achievement. The large differences in scores discussed here, however, cannot be interpreted without some information on these two questions, and a brief synopsis is provided here.
A simple mean difference in test scores between groups (that is, a finding that the average score of one group is markedly lower than that of another) does not in itself indicate test bias. Bias arises when the scores for one or more groups are misleading, for example, if they are low because of unfair questions.
Copious research has not shown admissions tests to be biased against minorities. Admissions tests are used to predict performance in college, and they are most often validated by assessing how well they predict the early performance of students accepted to a given college, specifically their freshman-year grades or grade-point averages (GPAs). If the tests were biased against minority students, one would expect to find that minority students perform better in college, on average, than their scores predict. But that is not the case. Among male students, research finds the opposite: on average, black and Hispanic men achieve somewhat lower freshman grades than their scores predict. Black and Hispanic women achieve on average about the GPAs their test scores predict.
The average differences between minority and non-minority students on admissions tests are not atypically large compared to the differences typically found on tests of educational achievement. In 1999, the mean differences between black and white students on the SAT-I were 0.89 standard deviation on the verbal scale and 1.0 on the mathematics scale.(note 1) (When differences are expressed in standard deviations, they can be compared across tests.) A recent review of large-scale studies of secondary-school students showed that black-white differences in composite scores (that is, scores summing across subjects) ranged from .82 to 1.18 standard deviations.(note 2) The differences in individual subjects varied somewhat more. While the gap between blacks and whites on achievement tests has narrowed in recent decades, it remains very large.(note 3) For example, in 1994 the gap in the National Assessment of Educational Progress (NAEP) trend assessment still ranged from 0.66 standard deviation in reading to 1.08 in science.(note 4)
The similarity in the results of admissions tests and other large-scale achievement tests also argues against bias. Considerable work has gone into limiting bias in these many tests. Moreover, examination of individual test questions shows large performance differences between blacks and whites on items that are clearly not biased, such as simple mathematics problems.
The lack of bias and rather typical mean differences shown by current college admissions tests do not mean that all possible admissions tests would yield the same results. Indeed, substituting different tests for the SAT-I or ACT would improve the prospects for some students while lessening them for others. These findings do suggest, however, that the current test results are not misleading in the aggregate and that only substantial changes in admissions testing, such as the measurement of important skills or content not currently assessed, could greatly change group differences.
For purposes of this paper, then, we accept that college admissions tests show fairly typical group differences and that these differences are not biased against minorities as predictors of college grades. These differences are large relative to the distribution of achievement within each group. They could stem from a variety of factors, which we will not examine here. We will merely explore the size and effects of these differences in practical terms.
Before analyzing the effects of test scores on admissions, we will translate the differences in standard deviations into another metric that is easier to understand. A mean difference of 0.60 standard deviation, which is smaller than any of the black-white differences noted above, would place the average black student at the 27th percentile among white students (Table 1). That is, only 27 percent of white students would score as low as or lower than the average black student. A more typical difference of 0.80 standard deviation would place the average black student at the 21st percentile among whites. A gap of a full standard deviation (the size of the gap on the SAT-I mathematics scale) places the average black student at the 16th percentile among whites.
Black-White Mean Score Differences
| Difference in standard deviations | White percentile of average black student |
|---|---|
| 0.60 | 27th |
| 0.80 | 21st |
| 1.00 | 16th |
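The conversion from a gap in standard deviations to a percentile can be sketched as follows. This is a minimal illustration, assuming that white students' scores are normally distributed and that the gap is expressed in units of the white standard deviation; the function name is hypothetical and not taken from the paper.

```python
from statistics import NormalDist


def white_percentile_of_average_black_student(gap_in_sd: float) -> float:
    """Percentile rank, within an assumed normal white distribution,
    of a score lying `gap_in_sd` standard deviations below the white mean."""
    return 100 * NormalDist().cdf(-gap_in_sd)


for gap in (0.60, 0.80, 1.00):
    percentile = white_percentile_of_average_black_student(gap)
    print(f"gap of {gap:.2f} SD -> percentile {percentile:.0f}")
# gap of 0.60 SD -> percentile 27
# gap of 0.80 SD -> percentile 21
# gap of 1.00 SD -> percentile 16
```

Under these assumptions the three gaps discussed in the text map to the 27th, 21st, and 16th percentiles.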
How We Carried Out Our Analysis
We wanted our analysis to reflect the general pattern of group differences in performance rather than any idiosyncrasies of any particular test or test-taking group. Accordingly, rather than using data from the SAT or ACT testing programs, we simulated data that mirror the typical differences found in large-scale testing programs.
We created databases that had a mean difference of 0.80 standard deviation between blacks and whites, a difference modestly smaller than those found on the SAT but larger than some of the most recent differences found in NAEP. We made the scores of simulated black students a little less variable than those of whites.(note 5) This is consistent with the pattern shown in numerous studies.(note 6) For simplicity, we set the mean of the scores to zero and the standard deviation to 1. Thus, a score of zero in our data corresponds roughly to an SAT-I score of 500.
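One way to see what these settings imply for each group is to solve for the group means and standard deviations directly. The sketch below is not the authors' procedure; it assumes the pooled population is 15 percent black and 85 percent white (as stated in note 7), a black/white variance ratio of .81 (note 5), and an overall mean of 0 and standard deviation of 1. The function name and the returned keys are hypothetical.

```python
import math


def group_parameters(black_share=0.15, mean_gap=0.80, variance_ratio=0.81):
    """Group means and SDs such that the pooled population has mean 0 and SD 1."""
    white_share = 1.0 - black_share
    # The share-weighted mean is zero and the white mean exceeds the black mean by mean_gap.
    white_mean = black_share * mean_gap
    black_mean = white_mean - mean_gap
    # Pooled variance = within-group variance + between-group variance = 1.
    between = white_share * white_mean**2 + black_share * black_mean**2
    white_var = (1.0 - between) / (white_share + black_share * variance_ratio)
    white_sd = math.sqrt(white_var)
    black_sd = math.sqrt(variance_ratio) * white_sd
    return {
        "white_mean": white_mean,
        "black_mean": black_mean,
        "white_sd": white_sd,
        "black_sd": black_sd,
    }


params = group_parameters()
print({name: round(value, 2) for name, value in params.items()})
```

Under these assumptions the group means come out at about +0.12 for white students and -0.68 for black students. The group standard deviations are derived quantities that the paper does not report, so they should be read as illustrative only.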
We then examined the effects of several simple admissions rules that depended solely on scores. We set a number of cut scores on the test, and all students scoring above the cut were "accepted," while all those below it were "rejected." We considered no other characteristics of students. These are overly simple selection rules that no colleges follow, and indeed using them would be inconsistent with accepted professional standards. These unrealistically simple rules, however, isolate the effect of test scores.
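A cut-score rule of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions rather than a reconstruction of the authors' databases: it draws normally distributed scores, uses group means of -0.68 and +0.12 (the values given in the text for Scenario 1), and uses standard deviations of 0.88 and 0.97, which are assumed values. The function names, sample sizes, and seed are hypothetical. Scores exactly equal to the cut are treated as rejected, a choice the paper does not specify.

```python
import random


def simulate_scores(n, mean, sd, rng):
    """Draw n scores from a normal distribution (an assumed shape)."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def acceptance_rate(scores, cut_score):
    """Share of applicants scoring above the cut; nothing else is considered."""
    return sum(score > cut_score for score in scores) / len(scores)


rng = random.Random(2000)  # fixed seed so that the run is repeatable
black_scores = simulate_scores(50_000, -0.68, 0.88, rng)  # SD is an assumption
white_scores = simulate_scores(50_000, 0.12, 0.97, rng)   # SD is an assumption

for cut in (0.0, 1.0):
    black_rate = acceptance_rate(black_scores, cut)
    white_rate = acceptance_rate(white_scores, cut)
    print(f"cut {cut:+.1f}: black {black_rate:.3f}, white {white_rate:.3f}")
```

The rates this prints depend on the assumed standard deviations and on the normal shape, so they need not match the percentages reported for the authors' simulated data.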
The Effect of Test Scores: Three Scenarios
We present three scenarios. For simplicity, all consider only black and white applicants. The first sets the cut score at the overall mean and uses equal numbers of black and white applicants. The second retains the equal numbers of applicants but imposes a higher cut score, set arbitrarily at one standard deviation above the mean, roughly the 84th percentile. These two show the pure effect of test-based selection, independent of the smaller number of black applicants at most colleges. Comparison of these two scenarios shows the effect of greater selectivity. The final scenario maintains the cut score at one standard deviation above the mean but reduces the number of black applicants to a more realistic 15 percent of the total.(note 7)
Scenario 1: 50 percent black applicants, cut score at the overall mean
The distributions of scores in our first simulated case, like those in many actual test databases, roughly follow the normal curve (Figure 1). The mean score for all students is set to zero, and the standard deviation is one. Thus, a value of +1.0 represents a score one standard deviation above the overall mean, or roughly the 84th percentile in the entire population, while a value of -1.0 represents a score one standard deviation below the overall mean, or roughly the 16th percentile in the entire population.
Cut Score at the Overall Mean, Equal Numbers of Black and White Applicants
Because black applicants have an average score .8 standard deviation below that of white applicants, black students are clustered around an average that is well below the overall mean (roughly .68 standard deviation below the mean; see Figure 1). White students are clustered around their mean, which is modestly (.12 standard deviation) above the overall mean. Because the scores of black students vary somewhat less than those of white students, black applicants are bunched a little more tightly around their average scores than are white students. The dashed vertical line in Figure 1 represents the cut score, which is set at the overall mean score of 0. Everyone with scores above this line was "accepted," while all students below the line were "rejected."
Even with the relatively low cut score of 0 (the overall mean score), a much smaller percentage of black than of white students is accepted: About 20 percent, compared with 55 percent of white applicants (Figure 2). The percentage accepted, shown as a bar chart in Figure 2, is equivalent to the portions of the distributions above the cut score in Figure 1.
Percentages Accepted by Race, Equal Numbers of Applicants and Cut Score at Mean
Use of this low cut score causes blacks to be underrepresented in the admitted student body by roughly a factor of two, relative to their representation in the pool of applicants. Although they constitute half of the applicants, they constitute only 27 percent of the selected students (Figure 3).
Composition of Applicant Pool and Admitted Group, Equal Numbers of Applicants and Cut Score at Mean
Scenario 2: 50 percent black applicants, cut score at 1 standard deviation above the overall mean
The second scenario retained the same applicant pool and distribution of scores but was more selective, setting the cut score at one standard deviation above the mean (Figure 4). This would equal 616 on the SAT I-Verbal and 625 on the SAT I-Mathematics, appreciably above the 25th percentile of scores of freshmen at the University of Pennsylvania on the SAT I-Verbal (560) but below the 25th percentile of those students on the SAT I-Mathematics (650).(note 8)
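The mapping from a standardized cut score to the SAT scale is a linear rescaling. The sketch below assumes 1999 SAT I means and standard deviations of 505 and 111 (verbal) and 511 and 114 (mathematics); these values are not stated in this paper and were chosen because they reproduce the 616 and 625 quoted above, so they should be checked against the College Board report before being relied on. The function name is hypothetical.

```python
def z_to_scale(z, scale_mean, scale_sd):
    """Convert a standardized score to a reporting scale by linear rescaling."""
    return scale_mean + z * scale_sd


# Assumed 1999 SAT I means and SDs (not given in the paper).
ASSUMED_SAT_PARAMETERS = {"verbal": (505, 111), "mathematics": (511, 114)}

for section, (mean, sd) in ASSUMED_SAT_PARAMETERS.items():
    print(section, z_to_scale(1.0, mean, sd))
# verbal 616.0
# mathematics 625.0
```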
Raising the cut score from the mean to one standard deviation above sharply reduces the percentage of both white and black applicants accepted (Figure 4). This reduction is particularly severe, however, for black applicants. Roughly 17 percent of white students are accepted (Figure 5), compared with 55 percent when the cut score is at the mean. Only about 1 percent of black applicants are accepted (Figure 5), compared with about 20 percent when the cut score is at the mean.
Cut Score at +1 Standard Deviation, Equal Numbers of Black and White Applicants
Percents Accepted by Race, Equal Numbers of Applicants and Cut Score at 1 Standard Deviation Above Mean
Raising the cut score also has a dramatic effect on the racial composition of the accepted student population. While the applicant pool is constructed to be half black and half white, black students constitute barely 6 percent of the accepted students (Figure 6). With the cut score at the mean, blacks constituted about 27 percent of the students. In other words, with a cut score at the mean blacks are underrepresented in the student population by a factor of about two; with a cut score at one standard deviation above the mean, blacks are underrepresented by roughly a factor of eight.
Composition of Applicant Pool and Admitted Group, Equal Numbers of Applicants and Cut Score at 1 Standard Deviation Above Mean
Scenario 3: 15 percent black applicants, cut score at 1 standard deviation above the overall mean
The final scenario again uses a cut score of one standard deviation above the mean but reduces the black applicant pool to a more plausible 15 percent of the total. The result is shown in Figure 7, which is identical to Figure 4 except for the smaller number of black applicants.
Cut Score at +1 Standard Deviation, 15% Black Applicants
Because the cut score and the average score for each group remain unchanged, the percentage of black and white students accepted remains the same: about 17 percent of white applicants but only 1 percent of black applicants. The smaller pool of black applicants increases the homogeneity of the accepted student population. While the applicant pool is 15 percent black, the accepted student body is roughly 99 percent white (Figure 8).
Composition of Applicant Pool and Admitted Group, 15% Black Applicants and Cut Score at 1 Standard Deviation Above Mean
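The composition of the admitted group in each scenario follows arithmetically from the pool composition and the two acceptance rates. The sketch below uses the rounded rates reported in the text (20 and 55 percent at the mean; 1 and 17 percent at one standard deviation above it); the function and variable names are hypothetical.

```python
def black_share_of_admitted(black_pool_share, black_rate, white_rate):
    """Share of admitted students who are black, given pool share and acceptance rates."""
    admitted_black = black_pool_share * black_rate
    admitted_white = (1.0 - black_pool_share) * white_rate
    return admitted_black / (admitted_black + admitted_white)


# (black share of applicant pool, black acceptance rate, white acceptance rate)
scenarios = {
    "Scenario 1": (0.50, 0.20, 0.55),
    "Scenario 2": (0.50, 0.01, 0.17),
    "Scenario 3": (0.15, 0.01, 0.17),
}

for name, inputs in scenarios.items():
    share = black_share_of_admitted(*inputs)
    print(f"{name}: {100 * share:.1f}% black")
# Scenario 1: 26.7% black
# Scenario 2: 5.6% black
# Scenario 3: 1.0% black
```

These values agree with the roughly 27 percent, 6 percent, and 1 percent (that is, about 99 percent white) reported for the three scenarios, allowing for rounding of the input rates.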
These simulations illustrate that when test scores count heavily in admissions, the large differences in scores between black and white students have a major impact both on the probability that black students will be admitted and on the composition of the accepted student population. These effects become progressively more severe as the selectivity of admissions increases. For example, with a cut score at the overall mean, black students would be underrepresented by a factor of two in the student body; with a cut score at one standard deviation above the mean, they would be underrepresented by a factor of roughly eight. The relatively small number of black applicants to college, which stems in part from their relatively small numbers in the cohort of college-age students, changes neither the probability that black students will be accepted nor the proportional underrepresentation of black students in the student body. It does, however, further increase the homogeneity of the student body.
Of course, few if any schools select students solely on the basis of a cut score on an admissions test, and more common selection processes that give weight to other factors will often place black students at less of a disadvantage. Nonetheless, unless test scores are given very little weight or are offset by other factors on which minority students have an advantage relative to whites, the average test-score disparity will generally have a severe impact on admission to selective colleges.
The values used in the simulation were chosen to be representative of a broad range of tests rather than any single college admissions test. Because the mean difference between blacks and whites is somewhat larger on the SAT-I than in these simulated databases, using values from the SAT-I would make the effects presented here more severe, albeit not greatly.
A parallel simulation for Hispanic students nationwide would also show a severe impact, but smaller than that for blacks. Data from large-scale assessments typically show Hispanic students scoring somewhat higher, on average, than blacks.
It is important to emphasize that the progressively more severe impact that accompanies greater selectivity does not stem from any peculiarity of college admissions tests. It stems primarily from the roughly normal distribution of scores, that is, from the fact that most students have scores quite close to the average for their group, while few have scores much higher or lower than the average. This pattern is slightly exacerbated by the fact that black students show modestly less variable test scores than do white students. That is, the percentage of black students scoring either much higher or much lower than the black average is smaller than the corresponding percentage for white students.
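This property of the normal curve can be illustrated by computing, for a series of cut scores, the ratio of the white to the black acceptance rate. The sketch below is an illustration under stated assumptions: both groups are modeled as exactly normal, with means of +0.12 and -0.68 taken from the text and standard deviations of 0.97 and 0.88, which are assumed values. The names are hypothetical, and the resulting rates are not expected to match the figures from the authors' simulated data.

```python
from statistics import NormalDist

WHITE_MEAN, WHITE_SD = 0.12, 0.97  # SD is an assumption
BLACK_MEAN = -0.68


def rate_above(dist, cut_score):
    """Share of a normal distribution lying above the cut score."""
    return 1.0 - dist.cdf(cut_score)


def white_to_black_ratio(cut_score, black_sd=0.88):
    """Ratio of white to black acceptance rates at a given cut score."""
    white = NormalDist(WHITE_MEAN, WHITE_SD)
    black = NormalDist(BLACK_MEAN, black_sd)
    return rate_above(white, cut_score) / rate_above(black, cut_score)


for cut in (0.0, 0.5, 1.0, 1.5):
    print(f"cut {cut:+.1f}: ratio {white_to_black_ratio(cut):.1f}")

# Effect of the smaller black variance: compare with equal SDs at a cut of +1.0.
print(white_to_black_ratio(1.0), white_to_black_ratio(1.0, black_sd=WHITE_SD))
```

Under these assumptions the ratio grows steadily as the cut score rises, and it is larger when the black standard deviation is smaller than the white one, which is the pattern the paragraph above describes.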
These results illustrate the difficulty inherent in reconciling academic selectivity with increased equity of access to post-secondary education for non-Asian minority groups, particularly at selective colleges and universities. Most colleges consider a variety of other factors in addition to test scores in making admissions decisions, and to the extent that those factors are not strongly correlated with test scores, the problems illustrated here will be ameliorated somewhat. The differences between minority and majority students in academic performance as measured by diverse standardized tests are so large, however, and their effects are so substantial at academically selective colleges, that it will be difficult to offset their impact without confronting them directly.
1 The College Entrance Examination Board (1999). 1999 College-Bound Seniors, National Report. New York: author.
2 Hedges, L. V., & Nowell, A. (1998). Black-white test score convergence since 1965. In C. Jencks and M. Phillips (Eds.), The Black-White Test Score Gap (pp. 149-181). Washington, D.C.: Brookings.
3 Campbell, J. R., Voelkl, K. E., & Donahue, P. L. (1997). Report in Brief: NAEP 1996 Trends in Academic Performance. Washington, D.C.: National Center for Education Statistics (NCES97986); Hedges and Nowell, op. cit.; Koretz, D. (1986). Trends in Educational Achievement. Washington, D.C.: Congressional Budget Office.
4 Hedges and Nowell, op. cit.
5 Specifically, we set the black/white variance ratio to .81.
6 See Hedges and Nowell, op. cit.
7 Note that the distributions of scores used in all three scenarios are based on a population that is 15 percent black and 85 percent white. This results in a white average that is slightly above the overall average and a black average that is much lower than the overall average. If we had regenerated data for the scenarios with equal numbers of applicants based on a population that is half black and half white, the mean difference between blacks and whites would have remained the same, but both group means would have increased relative to the overall mean. (They would have been equidistant from the overall mean). This would have been unrealistic and would have confounded the comparisons among the scenarios.
8 The Princeton Review (1998), The Best 311 Colleges, 1999 Edition. New York: Random House.