Flynn Effect Shows Real IQ Gains
Summaries Written by FARAgent (AI) on February 16, 2026 · Pending Verification
For much of the late 20th century, the Flynn effect looked like a plain fact with an obvious meaning: IQ scores kept rising, sometimes by roughly 3 points a decade, so people must be getting smarter. That was not a foolish inference. The gains appeared across countries and across decades, and they arrived alongside real improvements in health, schooling, nutrition, family size, and the cognitive demands of modern life. If raw scores climbed by 30 points or more from the 1920s onward, a reasonable observer could conclude that modern societies were producing a genuine increase in intellectual capacity, not just better test-taking.
The trouble began when researchers looked more closely at what was rising. The biggest gains often showed up on abstract, culture-loaded, or test-specific items, not in the broad, stable general intelligence, g, that many psychologists treated as the thing IQ tests were supposed to capture. Arthur Jensen and others warned early that large score gains over time sat awkwardly beside the stubborn rank order of individuals and groups, and beside the fact that nobody believed the average person in 1990 was a genius by 1930 standards. Later work found mixed patterns: gains that varied by subtest, country, and period, reversals in some places, and evidence that environmental improvements can raise scores without clearly raising the underlying trait in the old, simple sense.
So the old slogan, rising IQ means rising intelligence, no longer carries the easy confidence it once did. A substantial body of experts now rejects the strong version of that claim, arguing that the Flynn effect shows changes in test performance more than a clean, across-the-board increase in general cognitive ability. Others still hold that at least part of the gain is real, pointing to better nutrition, reduced disease burden, schooling, and even secular increases in brain size as signs that something substantive improved. The debate now is less about whether scores rose (they did) than about what exactly rose with them.
- James Flynn was a political scientist at the University of Otago who in 1984 published the paper that gave the phenomenon its name. He combed through old test manuals and standardization samples from the United States and found that Americans born later scored about 13.8 points higher on earlier versions of the same tests, roughly three points per decade. Flynn presented the data plainly and insisted the gains were real, though he grew increasingly skeptical they reflected genuine rises in intelligence itself. His later writings and 2013 TED Talk framed the effect as evidence of shifting habits of mind rather than raw cognitive capacity. The term Flynn effect entered textbooks and policy debates largely because of his persistence. [2][5][11]
- Arthur Jensen was a psychologist at the University of California, Berkeley, who spent decades defending the concept of g, the general factor of intelligence. He viewed the reported gains with deep suspicion and laid out strict criteria that any legitimate secular trend must satisfy: comprehensive samples, unaltered tests, mature participants, and culture-reduced instruments. Jensen repeatedly warned that the apparent rises looked more like test sophistication than increases in g. His cautions were cited by Flynn himself as the proper framework for evaluation, yet they were often sidelined in the rush to celebrate rising scores. [3][6]
- Richard Herrnstein and Charles Murray, authors of the 1994 book The Bell Curve, acknowledged the Flynn effect while advancing their argument that high heritability of IQ would sort society into cognitive classes. They coined the very term Flynn effect in their pages and treated the score increases as a rise in test performance rather than a direct contradiction of their meritocracy thesis. The book brought the phenomenon to a wide audience of policymakers and journalists who had never read the original papers. [5][11]
The American Psychological Association published Flynn’s seminal 1984 article in its journal Psychological Bulletin and thereby lent institutional weight to the idea that mean IQ scores were climbing. The APA’s platform turned an obscure finding into the standard citation for anyone arguing that environment could dramatically reshape cognitive scores. Later APA task forces on intelligence cited the gains as important context while stopping short of declaring them proof of rising intelligence. [2]
The Dutch military authorities tested nearly every 18-year-old male conscript with the same unaltered 40-item Raven’s Progressive Matrices from 1945 onward. Their records supplied some of the cleanest data on generational change, showing roughly 20 IQ points gained in thirty years. Similar comprehensive testing programs in Belgium and Norway fed the international literature and made the Flynn effect appear robust across borders. [6]
The American Association on Intellectual and Developmental Disabilities maintained an IQ cutoff of 70 for the legal definition of intellectual disability. In practice many clinicians and courts applied old norms without adjustment, producing abrupt swings in eligibility when new tests appeared. The organization’s guidelines mentioned clinical judgment but rarely emphasized the need to correct for secular gains. [7]
The strongest case for the assumption rested on decades of consistent data. Successive versions of the same IQ tests, normed on fresh representative samples, showed later cohorts scoring markedly higher on earlier editions. Meta-analyses of hundreds of studies placed the gain at about three points per decade in developed countries, larger on fluid reasoning than on crystallized knowledge. [7][11] Reasonable observers noted that nutrition had improved, schooling had lengthened, and societies had grown more complex; these changes offered plausible mechanisms. Twin and adoption studies demonstrated that within a generation heritability was high, yet the between-generation shifts looked environmental. A thoughtful scholar in the late twentieth century could therefore conclude that average intelligence really was rising, even if the gains were uneven and their ultimate meaning unclear. [2][13]
James Flynn’s 1984 analysis of U.S. standardization samples from 1932 to 1978 documented a 13.8-point gain on older tests. The pattern replicated across fourteen nations and on culture-reduced instruments such as Raven’s matrices. These numbers seemed to prove that something potent in the environment was lifting cognitive performance. [4][6] Yet the same data contained warnings. Gains were largest at the lower end of the distribution and did not produce a noticeable renaissance in scientific or cultural achievement. Flynn himself soon argued that the tests measured a correlate of intelligence rather than intelligence proper. [5][11]
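The familiar "three points per decade" figure follows from the numbers in this paragraph. The short Python check below assumes the gain accrued linearly over the 1932 to 1978 span; the function name is illustrative and does not come from Flynn's paper.

```python
def gain_per_decade(total_gain: float, start_year: int, end_year: int) -> float:
    """Average IQ-point gain per decade, assuming a linear trend over the span."""
    years = end_year - start_year
    if years <= 0:
        raise ValueError("end_year must be later than start_year")
    return total_gain / years * 10


# 13.8 points spread over the 46 years between the 1932 and 1978 samples.
rate = gain_per_decade(13.8, 1932, 1978)
print(f"{rate:.1f} points per decade")  # prints "3.0 points per decade"
```

The linear assumption is a simplification: as the surrounding text notes, gains varied by subtest, country, and period.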
Polygenic scores for educational attainment began to show declines across twentieth-century birth cohorts in European Americans, Britons, and Icelanders. Reaction times, a correlate of g, lengthened rather than shortened in controlled modern samples. Brain-volume and brain-weight studies that once appeared to show a physical enlargement of the brain accompanying the score gains turned out to rest on biased samples. These contrary indicators accumulated gradually and raised growing questions about whether the score gains reflected genuine increases in cognitive capacity. [1]
Academic journals and textbooks spread the assumption rapidly after Flynn’s 1984 paper. The term Flynn effect, introduced in The Bell Curve, entered psychology curricula and became the most cited environmentalist counterargument in debates over racial IQ differences. Richard Nisbett’s 2009 book relied heavily on the gains to argue for a purely environmental explanation of group differences. [3][5] Test publishers issued new norms without always highlighting how much higher modern samples scored on obsolete versions, quietly embedding the phenomenon in clinical practice. [2]
Military testing programs across Europe supplied raw data on unaltered instruments and lent the effect an aura of official objectivity. Scholars exchanged letters and questionnaires with test directors in thirty-five countries, each confirming local gains and reinforcing the sense of a global phenomenon. Flynn’s 2013 TED Talk brought the idea to a popular audience by noting that people from a century ago would score about 70 on today’s norms. [6][11] Yet the effect was rarely emphasized in clinical training programs, so many practicing psychologists continued to treat IQ scores as stable across generations. [7]
Atkins v. Virginia in 2002 barred execution of the intellectually disabled, defined in part by an IQ of 70 or below. Courts then faced the question of whether old test scores should be adjusted downward for the Flynn effect. Walker v. True in 2005 established precedent for such correction in capital cases, affecting hundreds of death-row inmates. [7][13] States continued to rely on whichever norms were available at the time of testing, producing inconsistent outcomes that depended on when a defendant had been assessed. [11]
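The arithmetic behind such a correction is simple. The sketch below is a minimal illustration, not a clinical or legal standard: it assumes a constant drift of 0.3 points per year (the "three points per decade" figure cited in this article) and subtracts that drift for every year between a test's norming and its administration. The function names, the default rate, and the example scores are all hypothetical.

```python
def flynn_adjusted_iq(observed_iq: float, norm_year: int, test_year: int,
                      points_per_year: float = 0.3) -> float:
    """Subtract the assumed norm drift accumulated since the test was normed.

    points_per_year=0.3 is an assumption taken from the roughly
    three-points-per-decade estimate; real drift varies by test and period.
    """
    years_obsolete = test_year - norm_year
    if years_obsolete < 0:
        raise ValueError("test_year cannot precede norm_year")
    return observed_iq - points_per_year * years_obsolete


def at_or_below_cutoff(score: float, cutoff: float = 70.0) -> bool:
    """True when a score meets the 'IQ of 70 or below' criterion."""
    return score <= cutoff


# Hypothetical case: a score of 73 on a test normed in 1978, given in 1996.
observed = 73.0
adjusted = flynn_adjusted_iq(observed, norm_year=1978, test_year=1996)
print(f"observed {observed:.1f}, adjusted {adjusted:.1f}")
print("observed at or below 70:", at_or_below_cutoff(observed))
print("adjusted at or below 70:", at_or_below_cutoff(adjusted))
```

In this invented example the 18 years of norm obsolescence remove 5.4 points, moving the score from above the cutoff to below it, which is why the date of assessment could change the outcome of a case.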
Special-education eligibility in American schools hinged on IQ cutoffs for intellectual disability and on the discrepancy between IQ and achievement for learning disabilities. Each renorming of a major test produced sudden drops in average scores, shrinking or expanding the pool of eligible children and straining district budgets. Clinicians warned of misdiagnosis patterns, but the adjustments were applied unevenly. [4][7] Educational policymakers cited longer schooling as one driver of the gains, justifying expansions of compulsory education on the assumption that more years in class produced real cognitive growth. [5]
The assumption shaped life-and-death decisions in the criminal justice system. More than eighty death sentences had been converted to life terms by 2008 because of Flynn adjustments; hundreds more cases remained active. Convicts who scored above 70 on outdated norms sometimes fell below the cutoff once scores were corrected, yet others were executed before the adjustment became routine. [7][13] In special education, renorming caused abrupt swings in diagnoses. A 5.6-point drop when switching tests could push borderline children into or out of eligibility, disrupting services and confusing parents. [4][7]
Policy debates about group differences absorbed the idea that environmental trends could close gaps, fostering unrealistic expectations. The same data were used to argue that genetic explanations for racial disparities were untenable, even as the gains failed to eliminate those disparities. Research agendas were distorted by premature causal theories that treated every point of increase as proof of effective interventions. [2][3][9]
Mounting evidence began to challenge the interpretation that the gains represented real increases in intelligence. Polygenic scores for educational attainment declined across cohorts, reaction times lengthened, and brain-size studies were exposed as biased by sampling. Flynn himself concluded that the tests measured abstract problem-solving with little practical payoff and called the gains “ersatz.” [1][5] Meta-analyses documented that the effect was slowing or reversing in developed countries after the 1990s, with different trajectories for different abilities. [5][11]
Austrian data from 2005 to 2018 showed score gains alongside a weakening positive manifold, the very pattern of correlations that defines g. Measurement invariance tests confirmed the changes were real but indicated they reflected ability differentiation rather than uniform intelligence growth. [8] Sibling fixed-effects analyses of large U.S. samples eliminated the apparent Flynn effect once maternal age and fertility timing were properly controlled, turning the trend slightly negative. [12] Critics increasingly argued that IQ tests yield only ordinal scales unsuitable for secular comparisons and that the gains largely reflect test-specific skills or familiarity. A substantial body of experts now questions whether the twentieth-century rise in raw scores ever reflected a genuine increase in the capacity for intelligence. [3][6][9]
- [1] How real was the Flynn effect? (reputable journalism)
- [2]
- [3] The theory of intelligence and its measurement (peer reviewed)
- [4]
- [5] Flynn effect - Wikipedia (unverified)
- [6]
- [7] The Flynn Effect: A Meta-analysis (peer reviewed)
- [8]
- [9]
- [11] What Is The Flynn Effect In Psychology? (reputable journalism)
- [12]
- [13]
- [14] Sadiq Khan: London needs more migrants (reputable journalism)
- [15]
- [16]
- [17]