Alpha Beta Testing
Instructor: Dr. Kecia Scott
August 6, 2012
Psychological Testing: Alpha Beta Tests
The history of psychological testing in the armed services is the history of psychological testing itself. World War I served as a laboratory for the study of modern psychology and provided clients for the emerging field of psychological examination. The military used aptitude tests to classify recruits as eligible for officer training or to select appropriate positions within the military (Markowitz, 1998). Aptitude tests are designed to measure both learning and inborn potential for the purpose of making predictions about the test taker's future performance (Cohen and Swerdlik, 2010). Aptitude tests have been used by the military since World War I to screen prospective inductees for military service (ASVAB, 2012). Aptitude tests were administered in group format to provide information to military commanders on the abilities of military personnel (ASVAB, 2012).
The Alpha and Beta tests were the first formal tests administered in group format to individuals in military service. They were developed by psychologists for the US Army in 1917 and 1918 (Yerkes, 1921). The Army Alpha test measured an individual's numerical ability, verbal ability, ability to follow directions, and knowledge of basic information (ASVAB, 2012). The Army Beta test measured the same constructs but was the non-verbal counterpart to the Alpha test. The Beta test was used to evaluate the aptitudes of illiterate, unschooled or non-English speaking draftees and volunteers (ASVAB, 2012).
Over 1.5 million recruits were given the tests so that each could be classified for an appropriate role in the service. The tests were also used to identify those who showed promise for leadership roles in the military (ASVAB, 2012). Both the Alpha and Beta tests were based on the premise that intelligence was an inherited trait and that the tests measured native, inborn intelligence (Markowitz, 1998).
The Alpha test battery included tests of knowledge and cognitive skills (Markowitz, 1998). In initial results, poorly educated officers performed well on the Alpha test, which was taken as support for the claim that the test measured inborn, native intelligence. Because this measurement was taken among officers only, however, the findings also suggested that literacy among officers may have been higher, leading to improved performance on the Alpha test (Yerkes, 1921). In addition, the scores of whites exceeded those of blacks at all levels of education, and the scores of Northern blacks exceeded those of Southern blacks (Yerkes, 1921). It is suggested that these trends, which began with the Alpha and Beta tests, continue in intelligence testing today.
The selection of the test (Alpha or Beta) was based on the number of years of education reported by the recruit (Yerkes, 1921). Yerkes (1921) stated that recruits who reported fewer than six years of education were sent to Beta testing. Men who were non-English speakers or spoke English poorly were also sent to Beta testing, as were men who took the Alpha test and were subsequently judged to be poor readers. Yerkes (1921) noted that the procedures that determined which test a recruit took were not consistent across testing locations.
The Impact of Alpha and Beta Tests
The article Negro Intelligence and Achievement Norms (Development, 1963) concluded that before the validity of the Alpha and Beta test results could be interpreted, what constituted native or inborn intelligence had to be identified. This was another in a long line of studies that questioned what constituted native or inborn intelligence.
Results from the Alpha and Beta tests were used not only in the armed services but also as data in other studies. In one study, mental measurement figures derived from the Alpha and Beta tests were used to show a correlation between intelligence and suicide rates (Voracek, 2007). Much of this was based on the intelligence measurement provided by the results of military testing.
When the results of the army tests were published, it was stated that the average mental age of the recruits was 13, a mental age at the level then labeled "moron"; the average for African Americans was considerably lower, 10.4 (Yerkes, 1921). The revelation of these results began the continuing debate over whether or not tests measured what they were intended to measure.
According to Millsap (1995), psychometric literature has viewed testing bias as being consistent. Millsap (1995) posited:
"What we need to be sure of, of course, is that the test measures the same construct equally well in the various populations in which it is intended for use and does not also measure some other characteristic on which the groups differ but which is uncorrelated with the construct purportedly measured by the test." (p. 578)
In other words, is the test differentiating cultural, environmental and experiential constructs rather than intelligence constructs? Although Yerkes (1921) insisted that the test was measuring native, inborn intelligence, many of the questions did not demonstrate this. For example, one question on the Alpha test under the category of judgment asked: If a man made a million dollars, he ought to (1) pay off the national debt; (2) contribute to various worthy charities; or (3) give it all to some poor man. Depending on the individual's cultural and/or environmental orientation, one, some or all of these responses might be appropriate. The psychologists who developed the test, however, accepted only one response as the correct answer (Yerkes, 1921).
The debate concerning cultural and environmental considerations in test questions continues today. The findings of the original studies of the Alpha and Beta tests demonstrated disparities between ethnicities. Psychological testing today continues to study the disparities that are often evidenced in testing results. Millsap's (1995) point that tests must not measure characteristics that are uncorrelated with the construct the test is meant to measure implies that consideration has to be given to experience, exposure and cultural influences in the construction of test questions.
The use of multiple assessments in diagnosis today is a result of this debate. Given the potential bias inherent in any one examination, reliance on a single measure increases the likelihood of misinterpretation and misrepresentation. A portfolio of assessments, using tests that have been checked for validity and reliability in conjunction with other measures (case studies, observations, etc.), is a more accurate means of directing and developing appropriate interventions for all clients.
ASVAB. (2012, August 4). ASVAB: Official Site of the ASVAB. Retrieved from Official Site of the ASVAB: www.officia-asvab.com/history_coun.htm
Development, M. (1963). Negro Intelligence and Achievement Norms. Monographs of the Society for Research in Child Development, 28(6), 13-38.
Markowitz, A. (Director). (1998). A Science Odyssey: In Search of Ourselves [Motion Picture].
Millsap, R. (1995). Measurement Invariance, Predictive Invariance and the Duality Paradox. Multivariate Behavioral Research, 30(4), 577-604.
Voracek, M. (2007). Early 20th Century Social Ecology of U.S. State IQ and Suicide Rates: Evidence from the Army Alpha and Beta Intelligence Test Data of Yerkes (1921). Social Behavior and Personality, 35(8), 1027-1030.
Yerkes, R. M. (1921). Psychological Examining in the United States Army. Memoirs of the National Academy of Sciences, Vol. XV. Washington, DC: US Government Printing Office.