- Review
- Open access
The elusive associations of nucleotides with human success: evolutionary genetics in education and social policies
Evolution: Education and Outreach volume 18, Article number: 4 (2025)
Abstract
Introducing the fundamental principles of evolution and genetics in the pedagogy of biology and curricula should emphasize an understanding of the basic evolutionary genetic mechanisms. These mechanisms involve a number of intervening and highly variable biological and environmental parameters that affect the inheritance and development of complex traits. This implies that an individual’s DNA sequence alone is insufficient to precisely determine what traits they would or do possess at any given time in their life course. It is not just a matter of uncoupling the genetic and environmental components of a given phenotype, but of understanding the network of causal influences and its complex genetic architecture of phenotypic components. The primary aim of this paper is to provide a general understanding of the scientific background needed by teachers and curriculum designers about the complex and often unpredictable relationships between DNA sequences and the complex traits they influence. This idea holds special importance in classrooms, because failing to integrate this perspective in the context of human genetics and evolution education reinforces essentialism about human beings, that is, the view that a person's biological, physical and intellectual abilities are fixed. Educators and students alike must avoid taking this view. Here, we aim to caution readers to be aware of the limitations of claims made on behalf of DNA-based predictive indices when applied to evaluate children for their potential for educational, financial, and social success. These indices include the classic heritability concept and, more recently, the 'polygenic risk score' (PRS). The latter is especially significant because it is often recommended in clinical diagnostics, and is inferred from 'big data' in the form of 'genome-wide association studies' (GWAS) covering millions of DNA markers, which gives it an air of credibility.
GWAS has enabled mapping of specific regions of the human genome associated with many complex polygenic traits. These individual DNA markers indicate a suggestive or real causal association, but their magnitude of influence (effect size) on complex traits is usually small; to compensate, the individual effects of markers that show association are combined into a PRS, a predictive index, which is then applied in clinical diagnostics or to estimate the magnitude of the cumulative influence of DNA markers on a quantitative trait. Behavioral scientists have extended this rationale to predict future educational achievements and financial prosperity of school children. We question the rationale behind these applications by exploring evolutionary genetic principles underlying quantitative traits, particularly the traits used to predict future social and educational achievements in children. Further, because additive effects of alleles form the basis for inferring the properties of PRS and heritability indices, they are constrained by genetic, developmental, and environmental uncertainties, and the complex architectures of correlated phenotypic traits. We assert that PRS, like heritability, is neither a static nor a deterministic property of genes for individuals or populations—it is dynamic and contextual. It can be easily modulated through socio-cultural niche construction and epigenetic reprogramming. We conclude that the application of molecular indices to predict educational and financial success of children is untenable, and should be avoided.
The topic of “race” and IQ should be buried because there is in the foreseeable future no possibility of eliminating extreme naivete of genetical, environmental and statistical modeling. (Kempthorne 1977)
Introduction
Theodosius Dobzhansky, well known for his consideration of the broader relevance of inheritance and variation, famously stated "Nothing in Biology makes sense except in the light of evolution" (1973). The source of his claim here is of special note: The American Biology Teacher. Learning, educational outcomes, and financial success dependent on success in school and educational attainment may be considered composite human phenotypes, themselves composed of many complex behavioral and cognitive traits—and so may be considered in light of Dobzhansky's claim, as a part of the study of human biology. What light can evolutionary biology and genetics shed on these nagging questions, deeply entrenched in human societies?
Here is a provocative answer to this question, to which the arguments of this paper are meant to respond. If there were a strong genetic basis for these aspects of human life, if the relationship between genetics and environment might be teased apart in order to understand success in education and financial success, then those genetically predisposed for success might be provided with the right context, so that their natural-born talents might find full efflorescence. Taking this line of thought one step further, perhaps here is a new form of injustice: someone with the genetic propensity for doing so, having made a good faith effort, fails to attain success in education, for having not been exposed to the right environment. Furthermore, one might ask, how would this differ from, say, differences in educational outcomes across ethnic groups, some held back while others succeed, due to the organization of society, historical factors, or culture, all beyond their control? If there are social policies aimed at promoting success across all types of people, why not also account for differences in genetic endowment?
This provocative suggestion about the role of genetics and cultural evolution in social policy has recently been articulated by Katherine Harden (2021) in a popular and easily accessible book, The Genetic Lottery: Why DNA Matters for Social Equality. The book, which embodies the spirit of Asbury and Plomin's (2014) G is for Genes: The Impact of Genetics on Education and Achievement, is notable for the boldness with which genetics is proposed as a guide to education and for framing social policy, especially since it highlights the application of recent advances in human genetics such as Genome-Wide Association Studies (GWAS) and Polygenic Risk Scores (PRS). What is more notable is how neatly the recommendations in the book, and the use of genetics and evolutionary biology, fit in with already existing tendencies in thought—misconceptions—about genetic determinism, and more broadly, with tendencies in thought about the usefulness of genomic data about individuals for predicting behavioral or cognitive phenotypes. The primary aim of this paper is to show that, novel methods notwithstanding, there are serious flaws in any program of predicting or controlling outcomes in education based on genomic variations, such as data on single nucleotide polymorphisms (SNPs). Information derived from nucleotide sequences does not provide an adequate basis for identifying meaningful associations, let alone causal links, for making predictions between individuals' genetics and their educational and socio-economic destiny. Instead, it is the entire system of interactions, including biological, ecological, and social factors and contexts of which individuals or a battery of genes in those individuals are only a small part, which must be seriously considered.
We proceed as follows. In Sect. "Context: genetic determinism, genetics curriculum, and eugenics", we place our argument in the context of the recent body of work on genetic determinism in the education literature, and identify the extent to which Harden's proposals align with traditional eugenic views and the more recent "gene-centric" views of human health and cognition, which have emerged largely due to newer genotyping technologies and analytical approaches. Subsequent sections of the paper address genetic architectures of complex traits and their measurement errors. Sect. "Genome-wide association studies and their application" concerns methodological aspects of Genome-Wide Association Studies, based on large numbers of individuals, that diminish their usefulness for drawing meaningful conclusions about complex polygenic traits in specific individuals based on their genomic variation. These include complexities of human population structure, genomic architecture, and epigenetic modulations in relation to both biological and socio-cultural environments.
Sect. "Heritability: an unreliable index" concerns the concept of heritability (h²), its genetic basis, and its applications. It is essential to address confounding factors and the highly variable environmental effects that influence the magnitude of its expression, as these pose serious challenges for making effective social policies. In Sect. "Epigenetics", we discuss the role of genotype-environment interactions, epigenetics, and developmental variation as yet another dimension of human biology which modulates causal links and the context-dependent flow of metabolites between genotype and phenotype via epigenetic networks. In Sect. "Niche construction", we argue that niche construction (the active shaping of an individual child's socio-cultural environment through personal choices, supported by caregivers and opportunity-providing institutions) plays an overwhelming role in the child's educational and financial success, surpassing the influence of genomic variation as reflected by predictive genetic indices.
Readers interested in an overview of the main claims and arguments of the paper are urged to consult the tables in the Appendix, which summarize the argument of each section of the paper.
Context: genetic determinism, genetics curriculum, and eugenics
Our arguments aim to advance understanding of the role of genes as determinants of cognitive abilities and future financial success of children, an understanding which recent literature shows is urgently needed in educational programs (e.g., Harden 2021; Asbury and Plomin 2014). The problem is that misconceptions about the role of genes in explaining cognitive traits are reinforced by lacunae in biology curricula. These misconceptions center around genetic determinism and essentialism. A misconception commonly held among students is that genes play a primary causal role as determinants of an individual's excellence in traits such as sports, music and socio-economic success. To generalize, the notion of genetic determinism has become a folkloric expression. This belief is unfortunately directed toward specific individuals or groups, such as members of an ethnic community in hierarchically stratified societies (Mayo and Nanjundaiah 2024), who are trapped in frozen social systems, and often have derogatory statements hurled at them, attributing their traits to genetic determinism: 'Oh, it's in their genes!' Stereotypes and misconceptions about genetics reinforce each other. At present, genetics curricula in schools do little to address these concerns, which often intersect dangerously with outdated eugenic points of view. Our arguments in this paper are meant to provide a comprehensive account of the ways in which single nucleotide substitutions are to a large extent causally disconnected from complex traits, or minimally influence them, with the aim of informing teachers, researchers, and curriculum designers about how to address these misconceptions, especially in light of advances in genetics, which are intended to establish true causal links between genes and phenotypes.
In this section, we review the broader context of genetic determinism in the recent history of evolutionary biology and in the place of current thinking about science teaching, placing Harden's focus on modern methods in genetics into this context as a form of genetic determinism (Sect. "The "Genetic Lottery" in the context of genetic determinism in evolutionary biology and science teaching"). The relationship between discredited eugenic schools of thought and Harden's understanding of what these methods tell us about social policy is then explained (Sect. "What about eugenics?"). This provides context for our arguments about the role of genes as determinants of human success: educational, economic and social.
The "Genetic Lottery" in the context of genetic determinism in evolutionary biology and science teaching
Harden's The Genetic Lottery: Why DNA Matters for Social Equality (2021) is a recent contribution aimed at articulating and defending genetic determinism in human social life. Harden's book, to a large extent, extends the view convincingly articulated in Edward Wilson’s (1975) book, Sociobiology: The New Synthesis, which generated worldwide interest, marking a signal moment in evolutionary biology, especially in the United States (Goldstein 2015). Wilson's first chapter, "The Morality of the Gene," conveys a veiled message that genes have a purposeful moral obligation to the individual who carries them, suggesting that "the organism is only DNA’s way of making more DNA. More to the point, the hypothalamus and the limbic system are engineered to perpetuate DNA" (Wilson 1975, 3). Wilson aims to extrapolate the "self-serving view" from the gene to the entire community, quoting two passages from the Gita, which extol the importance of preserving family units. Additionally, Wilson's views are rooted in the concept of kin selection (Hamilton 1964), which states that natural selection can promote a tendency for self-sacrifice to save individuals closely genetically related, so that the group can reproduce and perpetuate.
Wilson's foundational work is one of a long succession of attempts to apply discoveries in evolution and genetics (evolutionary genetics) to social science. This tradition begins with Galton (1883), who was also responsible for the origin of eugenics, "good in birth," defined as "the science which deals with all influences that improve the inborn qualities of a race; also with those that develop them to the utmost advantage" (Galton 1904). Herbert Spencer is another notable figure in this tradition. Spencer, a late 19th-century philosopher and sociologist, coined the term "survival of the fittest" (which Darwin borrowed and later adopted) and extended Darwinian principles to explain intellectual and economic disparities among peoples of the Victorian Era. Although the popularity of Wilson's work waned in his lifetime, both geneticists and social scientists of the twentieth century have continued to extend new discoveries in genetics and evolutionary biology to understand the causes and consequences of genetic variation on various dimensions of human health and behavior. These include certain Mendelian disorders, complex mental disorders, intellectual attainment, affluence, political power, risk-taking, entrepreneurship, deceits and deceptions, sexual orientation, legitimizing social hierarchies, empire building, and myriad other human affairs. Wilson's work has directly or indirectly influenced an avalanche of related works in the subsequent years, with exemplary works including Dawkins (1976, 1982), Herrnstein and Murray (1994), and Plomin (2018). To highlight the points of view articulated in these works about the importance of genes and genetics in social, political, and economic decision making, we will call it the Wilson-Dawkins-Plomin ("WDP") school of thought.
This accords as well with views articulated by Plomin in collaboration with Asbury (Asbury and Plomin 2014), in which it is urged that services and pedagogy made available to students in school should be apportioned according to their genomic variation, including an evaluation at a young age, based on their genetics, of whether they will attend a university (Asbury and Plomin 2014, 174).
Equipped with novel conceptual and technological advances in genetics and genomics, adherents of the WDP school continue to promote DNA as the deterministic force of many aspects of human affairs, in particular human cognitive abilities, school attendance among children, and socio-economic affairs (Mills and Tropf 2020). The Genetic Lottery (Harden 2021) is an eminently accessible work, aimed at the general reader. This appears to be Harden's intent, articulated in an opinion article in the New York Times (2018), in which she appeals to progressives to use genomic information about individuals, specifically their polygenic risk scores, to predict educational achievements, as a tool for advancing the goals of an egalitarian society. This would clearly represent an expansion of the WDP view, usually opposed by more established evolutionary geneticists such as the late Richard Lewontin, Marcus Feldman, Molly Przeworski, and others.
Notably, polygenic risk scores continue to attract interest, for instance in New York Times reporting on risks of heart disease, dementia, asthma, and kidney disease (Kolata 2023; Bakalar 2019; Lennon et al. 2024). Furthermore, to complicate the picture, the very same methods and conclusions of recent genetic analyses involving GWAS are used by Harden's ideological adversaries, that is, those who oppose progressive politics. They advocate the general appeal of polygenic risk scores determined from GWA studies to advance their own agenda for charting social policies. In doing so, proponents of sociogenomics perpetuate outdated and scientifically discredited conceptions of race (Smart 2022; Lala and Feldman 2024), and represent kaleidoscopic visions of the WDP school.
To be clear, Harden does not share this vision in its entirety; the key point here is that her conclusions draw on much of the same body of scientific literature that promotes a gene-centric and deterministic view of human development, which makes it all the more urgent to assess it from a scientific point of view. The accessibility of her work, and her focus on genetics and genomics, including the most recent computational methods, make Harden's recent work an ideal starting point for addressing our goals in this essay: to highlight methodological blind spots of newer and other historically important methods in genetics, and to warn of potential errors that may follow from realizing the WDP school's conclusions in social, economic, and political spheres in light of novel approaches in genetics.
Genetic determinism, essentialism, and science curricula in schools
Recent literature in science education points to the importance of our aim here. Donovan et al. (2020, Table 1) make the helpful distinction among types of genomics education: Basic, Standard, and Humane. Basic focuses purely on Mendelian and molecular genetics concepts, while Standard also incorporates multifactorial-polygenic inheritance (quantitative genetics) and population-genetic thinking. Humane genomics education extends the concepts used in Standard curricula by highlighting ways in which quantitative genetics and population thinking are not compatible with essentialism, and also ways in which essentialism has pernicious social consequences. Donovan et al. (2020, 1502) show that students exposed only to Basic genomics education are more likely to affirm essentialism rooted in genetics than those exposed to Standard or Humane genomics education. This approach aligns with that of Stern and Kampourakis (2017, 194, on "SSIs"), who emphasize the importance of designing curriculum from the perspective of the use of scientific knowledge in society.
A brief survey reveals that some influential science curricula in the USA primarily reflect the Basic level, with tentative extension, at best, to Standard. The curriculum adopted by the Portland (Oregon) Public Schools for its middle schools, The Science Education for Public Understanding Project (SEPUP 2020), explains genetics in terms of someone considering whether to be tested for a genetic disease, at a single Mendelian locus, and does not advance beyond explanations of simple one-gene Mendelian inheritance. Consideration of the larger social context is limited to the use of genetic information by insurance companies. SEPUP is authored by the Lawrence Hall of Science, and purports to be research-based. The NGSS High School Life Sciences Standards (Achieve 2013, "Inheritance and Variation of Traits"), widely adopted in the United States as something like a national science curriculum standard, clearly encompass Basic but do not clearly incorporate Standard genomics, and clearly do not address concerns required for the Humane level. Finally, AP Biology curriculum standards (College Board 2020, units 5 and 6) do incorporate elements of the Standard picture. The AP standards present topics such as gene regulation or differential expression across contexts as an extension of ideas about transcription and translation, and they focus solely on molecular contexts. Stern and Kampourakis (2017) confirm that these are not isolated cases. They highlight the simplified view of genetics found in textbooks (Stern and Kampourakis 2017, 199ff), and also the views articulated by teachers themselves, whom they observed to conflate "trait" and "gene" when addressing students (Stern and Kampourakis 2017, 205ff). The result is that "secondary students think of genes as the ultimate determinants of [diseases, i.e., complex traits], overlooking the complex mechanisms that underlie the development of phenotypes" (Stern and Kampourakis 2017, 201).
The vision of Humane Genomics education—that all causal processes responsible for an individual's traits be accounted for, not just genes, their alleles, or base pair substitutions—remains to be fulfilled. This has serious consequences for how students understand human possibilities for personal growth and success. Genetic determinism, "attributing to genes the formation of human traits at an individual level, perceiving them as having more causal power than what scientific consensus suggests" (Gericke and McEwen 2023), forms the basis for misconceptions with socially pernicious consequences. Donovan's work, taken together with the work of others, represents a consistent, unified body of research on the consequences of misconceptions about determinism (Jamieson and Radick 2017; Gericke and McEwen 2023; Donovan et al. 2021 and 2020; Gericke et al. 2017; Stern and Kampourakis 2017; Kampourakis 2017).
In short, the situation is as follows. In a naive biological worldview, which incorporates misconceptions about the causal power of genes, genes alone play the central causal role in explaining an individual's traits, including both cognitive and personality traits. This in turn supports an essentialist view of human beings, which is that human populations and ethnic groups are fixed in their characteristics due to their shared inheritance. Interestingly, this view was also advanced by a prominent geneticist (Darlington 1953). Together, these views support the belief that inequalities in society are as they must be—if individuals' traits are fixed by genes, and if genes are distributed into socially distinct subgroups of people, it must be that those groups of people have reached their natural level of social attainment. This is generally understood to be a form of the Naturalistic Fallacy, according to which actual states of affairs are understood to reflect the way things ought to be.
What about eugenics?
Harden (2021) proposes to distinguish the intended use of GWAS and other predictive genetic indices from eugenics, which may be interpreted as rank-ordering of human genotypes in a "morally arbitrary" manner (p. 13ff, especially the section on the "Davenport-Laughlin model"). Just as a reminder, the work conducted by Davenport and his colleagues from 1910 to 1939 at the Eugenics Record Office in Long Island (now Cold Spring Harbor Laboratory) directly led to the forced sterilization of thousands of American citizens, the passage of anti-immigration laws, and a worldwide interest in eugenic experiments, particularly in Germany (Rutherford 2022). While the intention of avoiding socially pernicious ideologies is laudable, the substance of Harden's proposal is not clearly distinct from either Davenport's model of eugenics or from the kind of genetic determinism embodied in both Darlington’s evolutionary thinking and the WDP view. To see how this is so, first consider the relationship between social, educational, and political outcomes as Harden sees them.
Harden points towards a correlation, if not a causal relationship, between a predictive index of human genetic variation and educational, financial, and social success. While these claims are attenuated in that they allow a multitude of causes of educational, social, and financial success, and that favorable genetics do not guarantee success, they clearly attribute causal importance to one's genome. The primary evidence base for these claims is the "educational polygenic index," a distribution of genotypes in which the top quarter of the "genetic distribution were nearly four times more likely to graduate from college than those in the bottom quarter" (Harden 2021, 9). The educational polygenic index is an extension of the polygenic risk score, a statistic derived from Genome-Wide Association Studies (Wray et al. 2007). The educational polygenic index is intended to support the claim that genes "cause important life outcomes, including educational attainment" (Harden 2021, 25), and the claim that "it would be a mistake to dismiss the relationship between genes and education as trivial or unimportant…one’s genetics might not determine your life outcomes, but they are still associated among other things with being hundreds of thousands of dollars wealthier at the end of one’s working life" (Harden 2021, 46).
Other similar claims by Harden (2021) include those about the impact of single nucleotide substitutions in education outcomes (71); their impact on years of schooling, intelligence test scores and academic test scores (67–70); general statements to the effect that genes are on par with social class as a force in society determining personal success (8–10) and in creating inequality (149); and general claims that genetics must be viewed as an "ally" in "remaking and reimagining society" (20), "improving human lives," "understanding society to improve it," (186) and in social science (187).
Harden aims to convey that GWAS-based predictive indices can inform social policies that are different from traditional eugenics by pointing to the role of chance as a way individuals can benefit from genetics. Segregation and independent assortment in sexual reproduction introduce an element of chance into which genes one inherits; presumably Harden would allow that accidents of mate choice in human pairings also introduce an element of chance, even though social factors are usually considered by her in light of their tendency to counteract beneficial genes. Harden's view is that, on the one hand, the "genetic lottery" can result in a person having a genotype that (according to the educational polygenic index) promotes that person's success, but that, on the other hand, because of unfavorable social conditions, the person does not succeed. The idea is that this is a form of injustice. Possession of a certain panel of nucleotides ought to provide someone with cognitive capacities which would promote their bearer's success; nonetheless, the genes are prevented from making a difference by factors extrinsic to the person's use of those capacities in good faith efforts at school, work, or in social life. If conditions had been favorable for these individuals, they very well might have succeeded!
This informs what Harden sees as an alternative to traditional eugenics in the following way, articulated broadly in the first section of the book. On the one hand, we now clearly recognize traditional eugenics as a pernicious form of selective breeding of human beings, and as a form of making direct changes to the human gene pool to alter the human population in ways believed by a self-identified cultural elite to be favorable. As we would also now recognize, both selective breeding in humans and manipulating the gene pool, and a cultural elite's vision of what society should look like, are not compatible with a just society. These are reasons why most people are not comfortable with taking genetics into account in educational and social outcomes. On the other hand, Harden's vision is that social structure itself, in the form of social policies and institutions, might be changed via genetics, so that everyone with a favorable array of genetic variants can succeed. This vision is bolstered, according to Harden, because it dovetails with Rawls's vision of a fair society as an unbiased lottery system. According to Rawls' logic, factors like race, location, and other related aspects of one's life should not determine whether one benefits from society (Rawls 1971). In a similar vein, Harden argues that success due to 'genetic lottery' should not be diminished by the conditions identified by Rawls.
There are two important reasons why it is unclear that Harden's vision should be fully affirmed, even by those who would promote social equality and educational success for all who deserve it. First, it is not clear how distinct it is from the traditional eugenic view. The proposed educational polygenic index, as discussed by Harden, provides an estimate of which genotypes may have a greater propensity for success in society. To state that these individual genotypes are favorable and that individuals having them should be favored by social policies or institutions begs the question of what society should look like and who should succeed. Valuing these genotypes seems all the more arbitrary, since they are just those occurring "in the wild and in large populations," as it were, against the background of how these institutions have developed based on policies in place historically. Perhaps there are good reasons to value them, but claiming that specific genotypes tending to succeed in these contexts should be identified, and the efforts of those who have them promoted in contrast to others, would seem to accord with Davenport's (1911) vision of establishing a rank-ordering of genotypes. The rank-ordering concept of genotypes suggests that genetic combinations (genotypes) are arranged from the lowest to the highest fitness (Weinreich 2005).
Second, and more immediately from a scientific point of view, there are good reasons to question whether there really exist causal links, or even meaningful associations, that explain the educational polygenic index and similar relationships between genes and social, economic, and educational outcomes. Confidence should not be placed in the view that, in a sea of over 85 million SNPs reported in the human genome (Anonymous 2015), any single aspect of an individual's genetic variation, such as a few single nucleotide polymorphisms, can be identified that causes or is associated with a composite or complex trait such as a cognitive capacity, behavior, or life outcome of economic or social attainment. For instance, consider this statement, "even one SNP associated with staying school for an extra two days may not be so important, but under a polygenic index scoring it could be important" (Harden 2021, 71). There are several reasons for questioning such claims. The aim of the remainder of this paper is to explain those reasons.
Genome-wide association studies and their application
Genome-wide association studies and their most important extension, polygenic risk scores, require a narrow set of conditions for their applicability. Outside of these contexts, which include specific population structures and mating patterns, the information they provide about composite traits (cumulative influence of many complex traits on one major complex trait) such as lifetime earnings or performance in school is limited at best.
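To make concrete what a polygenic score is computationally, the following is a minimal Python sketch. It assumes the conventional additive construction, in which a score is the sum of an individual's allele counts (0, 1, or 2 copies per marker) weighted by per-marker effect estimates taken from a GWAS. The function name, marker names, effect sizes, and genotypes are all hypothetical and chosen only for illustration; they are not drawn from any study.

```python
def polygenic_score(effect_sizes, allele_counts):
    """Return the weighted sum of allele counts: sum_i beta_i * x_i.

    effect_sizes  -- dict mapping marker name to estimated effect (beta)
    allele_counts -- dict mapping marker name to copies of the effect allele
    """
    missing = set(effect_sizes) - set(allele_counts)
    if missing:
        raise ValueError(f"genotype missing for markers: {sorted(missing)}")
    return sum(beta * allele_counts[marker]
               for marker, beta in effect_sizes.items())


# Hypothetical per-marker effect estimates (illustrative values only).
effects = {"snp_a": 0.02, "snp_b": -0.01, "snp_c": 0.03}

# Hypothetical genotypes: number of copies of the effect allele per marker.
person_1 = {"snp_a": 2, "snp_b": 0, "snp_c": 1}
person_2 = {"snp_a": 0, "snp_b": 2, "snp_c": 1}

score_1 = polygenic_score(effects, person_1)
score_2 = polygenic_score(effects, person_2)
print(round(score_1, 4), round(score_2, 4))
```

In this construction the score depends entirely on the effect estimates: the same genotype receives a different score if the weights are re-estimated in a different sample, which is one reason the conditions of applicability matter.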
A background on genetic aspects of GWAS
The central assumption of models claimed by Harden to support her view has its roots primarily in Fisher's work. In the classical Fisherian sense (Fisher 1918, 1930), an ideal breeding population consists of an infinite number of individuals that mate at random (panmixia) and remain in equilibrium, experiencing no selection, migration, or mutation. They have non-overlapping generations, and no differential viability among genotypes. Furthermore, under the infinitesimal model, a complex trait is influenced by an infinite number of alleles, each with a small additive effect. This fundamental model and its variations have guided much of research in quantitative genetics in the last century and continue to do so (Kempthorne 1957; Falconer and Mackay 1996; Walsh and Lynch 1998; Alvarez-Castro 2024). The same Fisherian approaches serve as the subterranean root system that connects the most widely accepted principle in quantitative genetic studies—additive gene action or the cumulative effects of many genes (additivity) that influence a complex trait. As will now be shown, human populations do not meet basic conditions for the applicability of models of the genotype–phenotype relationship Harden would like to use for forging social policies. In this context, there is no single critical failure; rather, careful consideration of the structure of human populations reveals multiple incongruities with extending the Fisherian model to sociological problems.
Humans are an outbreeding species; therefore, individual variation is a rule rather than an exception, as envisioned by Garrod (1902), who notes that "the individuals of a species do not conform to an absolutely rigid standard of metabolism but, differ slightly in their chemistry as they do in their structure." Further, genetically variable individuals are nested in age and stage structured populations due to past and on-going demographic factors. More recent studies (Grotzinger and Keller 2022) on human behavior suggest that an individual's tendency to choose mates with a unique set of traits, termed "cross-trait assortative mating," could distort the GWAS results; these studies caution that "complex trait genetics ignores these problems at its peril." GWAS are routinely conducted on hundreds of thousands (even millions) of individuals and subsequently compared with an individual’s genomic variation or PRS (Harden 2021, 65).
Moreover, humans are distributed in spatially structured overlapping populations of populations (i.e., metapopulations). They often originate from smaller ones, or are further divided into smaller ones due to drift (Tournebize et al. 2022), and frequently experience migration (demographic expansions); they also form different types of "clines" (gradation of allele frequencies between populations). Individuals within each of these populations often belong to different stages of growth and age groups and have overlapping generations. Sometimes consanguinity (hence increased homozygosity) could be common among related members of families within some of these populations. For instance, while the level of consanguinity is generally low in European countries, it could be as high as 50% or more in many Middle Eastern countries (Chekroun et al. 2025; Hamamy et al. 2011). Furthermore, the most useful measure of a population (effective population size) in humans hovers around 10,000; this is approximately one-third of the census size (Robertson 1960; Palstra and Fraser 2012).
Realistically then, any local (ecological) population composed of about 20,000–25,000 individuals is perhaps a more useful way to define a "genetic population." Each of the subpopulations would have different allele frequencies, at least at some loci, as well as "private and rare alleles" specific to individual populations. Further, from a population-genetic perspective, large, artificially constructed individual populations (cohorts), commonly used in GWA studies, will have subpopulations with different linkage phases, and therefore may show different degrees of association between population-specific variants and phenotypic traits. Consequently, means and variances for traits among these populations would also vary. For instance, populations across Europe exhibit substantial variation in their phenotypic traits (Robinson et al. 2015). Conclusions about the behavior of populations must therefore be restricted to small populations within a region or even a locality (for instance, Hutterites or Mennonites), rather than a seemingly genetically homogeneous population on a continental scale (e.g., White European population). In reality, even populations with Northern European White ancestry would have or be derived from many overlapping populations with different allele frequencies. Therefore, a life-course perspective on specific populations would be more useful for making broad generalizations about both public health disparities and social inequalities (Wagner et al. 2024).
Overall, the extent of differentiation between pairs of human populations measured in terms of Fst ranges from about 2.0 percent to 18.0 percent (Elhaik 2012). The latter is closer to Lewontin’s (1972) estimates on the distribution of human genetic diversity. Progeny resulting from assortative mating among members of close families or between individuals of different ethnic groups within a locality could have a differing phenotypic expression of individual traits. Clearly "a collection of individuals made by pooling two populations ought always to be more diverse than the average of their separate diversities unless the two populations are identical in composition" (Lewontin 1972). Inherent cryptic population structure and admixture could be major confounding factors which generate false positives in GWAS despite statistical adjustments (Sul et al. 2018).
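As a rough illustration of how an Fst value of the kind quoted above can be obtained, the following minimal sketch computes Fst for a single biallelic locus using the common heterozygosity-based definition, Fst = (Ht - Hs) / Ht. The function name, the equal weighting of subpopulations, the assumption of Hardy-Weinberg proportions within each subpopulation, and the example allele frequencies are all illustrative assumptions; they are not the estimators used by Elhaik (2012) or Lewontin (1972).

```python
def fst_biallelic(freqs):
    """Fst for one biallelic locus from per-subpopulation allele frequencies.

    Assumes equally weighted subpopulations, each in Hardy-Weinberg
    proportions (illustrative simplifications, not the cited estimators).
    """
    if not freqs:
        raise ValueError("need at least one subpopulation frequency")
    p_bar = sum(freqs) / len(freqs)
    # Hs: average expected heterozygosity within subpopulations
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    # Ht: expected heterozygosity of the pooled population
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # locus is monomorphic everywhere; no differentiation
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in two subpopulations
print(round(fst_biallelic([0.5, 0.5]), 4))  # identical populations
print(round(fst_biallelic([0.4, 0.6]), 4))  # modest differentiation
print(round(fst_biallelic([0.2, 0.8]), 4))  # strong differentiation
```

Under these assumptions the pooled heterozygosity Ht is never smaller than the within-population average Hs, which is a one-locus version of Lewontin's remark that a pooled collection is at least as diverse as the average of its parts.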
We often see these unpredictable population dynamics in large, diverse cities, which offer opportunities for people from different genetic backgrounds to meet and mix. For instance, over 700 languages and dialects are spoken in New York City (Anonymous 2021), and many of these languages are often represented by a few founding families. Belbin et al. (2011) conducted a fine-scale analysis of population structure in BioMe, a diverse multi-ethnic biobank from New York City. They found 17 communities that shared recent genetic ancestry, and 1177 health outcomes were associated with specific ancestral groups. Drawing on this result, they stress the need for fine-scale monitoring of health outcomes. In short, large populations do not always offer correct answers to population-specific and individual-specific genetic questions.
Genome architecture and effect sizes
Can a deterministic view be made to work for simply inherited phenotypic traits or Mendelian traits in which the relationship to genes is well-understood? One could make a direct link between major genes that influence disorders such as achondroplasia, progeria, sickle cell anemia and others. Nonetheless, the "single genes" in these disorders implicate an interdependent system of traits due to direct and indirect pleiotropic effects against a backdrop of fluctuating environmental conditions. Take for example, achondroplasia or phenylketonuria caused by single mutations. These mutations sequentially affect many organs during the growth and development of individuals depending on the environmental (nutritional) conditions. Similarly, in any reasonably complex organism, every physiological or morphological trait is potentially correlated with other traits. This means that the organism functions as a correlated system of traits governed by allometric and pleiotropic relationships; the magnitude of these relationships can shift quickly in relation to environmental variation. This has significant consequences for GWAS, because the conditions under which a single nucleotide's influence can be isolated from its genetic context are limited, and even when it is reasonable to do so, the effects of a given nucleotide on any specific phenotype are so small as to be swamped by others or barely detectable within reasonable margins of error in measurement.
To see how this works, consider metabolism, a classic example. Eating unhealthy food can lead to obesity, which can bring about different magnitudes of both short and long-term changes in cognitive, physiological, anatomical, and morphological traits. This is because the flow and distribution of metabolites among traits vary in relation to development and demographic variables and the environment, as originally investigated by Sewall Wright (1921, 1934). In fact, Wright pioneered the method of "path analysis," a graphical representation to measure the degree of influence and direction of the flow of metabolites among individuals within populations in studies that are not well-designed and replicated akin to natural populations (Wright 1921). In plants, for example, the magnitude of path coefficients among traits changes in relation to the stage, age, growth medium, and density of individuals derived from open pollinated families. This pattern is also largely true among all organisms. More recently, Wright's path-analytical approach has inspired "Mendelian randomization" studies in human genetics and public health (Smith and Ebrahim 2003).
Harden’s "genetic lottery" metaphor reflects Mendelian randomization as well. This genetic lottery principle "has a causal effect on how far one goes in school" (Harden 2021, 129). She extends the genetic lottery idea to explain associations between genetic variants and behavioral traits among individuals in populations without explicitly mentioning the concept of Mendelian randomization. The degree of influence among traits as measured by path coefficients varies in relation to the available environmental resources. The magnitude of effects that cannot be measured is often designated as unknown U among a system of paths. In other words, even causal analytical approaches cannot fully account for all the causal factors that influence a complex trait.
In the simplest view, the genetic architecture of a given trait (for example, any trait that shows simple Mendelian inheritance) represents only a few genetic factors (in the Mendelian sense; now, genes) influencing simple morphological, biochemical, or anatomical traits. This number could be easily inferred for such traits by simply counting the proportion of individuals belonging to different classes (e.g. blue vs. brown iris eye color) in a segregating population of a cross. Environmental effects on the expression of such traits are minimal (say, achondroplasia, a genetic disorder governed by dominant inheritance) relative to complex (polygenic) traits, such as human height or cognitive traits. For complex traits, however, Fisher (1918) correctly assumed an "indefinitely large number of segregating Mendelian loci with the possibility also of an arbitrary number of alleles at each locus" (Kempthorne 1977). But how many loci (genes) influence a given trait? Castle originally addressed this question to infer an approximate number of "effective factors" (note that Mendel used the term factors, in place of genes) through the Castle-Wright formula (Castle 1921), which was later improved by Lande (1981), Otto and Jones (2000), and others.
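For orientation, the Castle-Wright estimator in its simplest textbook form can be written as below. This is a sketch under the usual idealized assumptions (two parental lines fixed for alternative alleles, unlinked loci with equal additive effects, no dominance or epistasis). The symbols and the numbers in the worked line are illustrative and are not taken from the paper.

```latex
% n_e: effective number of factors; \mu_1, \mu_2: parental line means;
% \sigma_S^2: segregation variance (extra variance in F2 relative to F1)
n_e = \frac{(\mu_1 - \mu_2)^2}{8\,\sigma_S^2},
\qquad
\sigma_S^2 \approx \sigma_{F_2}^2 - \sigma_{F_1}^2

% Hypothetical example: \mu_1 - \mu_2 = 12, \sigma_S^2 = 3
n_e = \frac{12^2}{8 \times 3} = \frac{144}{24} = 6
```

When the idealized assumptions fail (for instance, unequal effects or linkage), the estimate tends to be biased downward, which is one reason it gives only an approximate count of "effective factors."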
Advances in molecular biology in the 1980s allowed cloning genes that showed a functional relationship with Mendelian disorders such as cystic fibrosis, Huntington disease, and so on. Such studies might inspire optimism about narrowing down the very large number of genes influencing complex traits to just a few. These discoveries have also led to the characterization of genes that showed plausible associations with complex traits such as height or heart disease, termed "candidate genes." While some of these genes were in fact associated with the components of heart disease and other traits, their effects on individual phenotypes were too weak and often showed contextual effects. Similarly, the effect of individual alleles in genes in terms of gene products (enzyme flux; Kacser and Burns 1981) was too small to be of any use in the over-hyped gene-therapy programs of the 1990s. In other words, when ideas about single genes and polymorphisms are extended to a complex system, any one gene (or any one enzyme) has only a minuscule influence on the performance of the overall polygenic system underlying a given complex phenotype, as expected from the classical polygenic model used by quantitative geneticists.
If a single gene or a small number of genes responsible for a given trait cannot be identified, the next question is how to discover the number of most, if not all, genes in the entire genome influencing a complex trait without resorting to statistical approaches (e.g., Otto and Jones 2000). With the discovery of DNA markers (single nucleotide polymorphisms, i.e., SNPs) in the late 1990s, the extension of classical concepts such as linkage and linkage disequilibrium, and the expansion of data-crunching capabilities, the approximate numbers of genes that influence Mendelian and complex traits and diseases have been estimated in various organisms through GWAS (e.g. Mills and Rahal 2019). Such studies require a large number of individuals in order to have sufficient power to establish a statistically significant association between candidate SNPs and individual traits (Risch and Merikangas 1996; Slatkin 2008). In brief, these studies involve genotyping tens of thousands to millions of individuals, divided into cases (e.g., those with a disease) and controls (healthy individuals), and subsequently testing the difference in allele frequencies between the two groups for statistical significance. These studies are attractive for several reasons. First, there is the relative ease of genotyping millions of SNPs for each individual; second, the availability of computational facilities; and third, because they provide a general idea about the approximate number of common genes, their location in the genome, and their alleles (see below) that influence a trait.
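The case-control comparison described above can be sketched for a single SNP as a chi-square test on a 2x2 table of allele counts. This is a deliberately minimal illustration: real GWAS pipelines typically use regression models with covariates, and the function name, the allele counts, and the use of the conventional genome-wide threshold of 5e-8 as a constant are assumptions introduced here.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # conventional GWAS threshold (assumed here)


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Chi-square test (1 df, no continuity correction) on allele counts.

    Rows are cases/controls, columns are the two alleles of one SNP.
    Returns (chi2, p_value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_cases, row_ctrls = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    denom = row_cases * row_ctrls * col_a * col_b
    if denom == 0:
        raise ValueError("every row and column needs a nonzero total")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / denom
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts: 60/40 in cases versus 40/60 in controls
chi2, p = allelic_chi_square(60, 40, 40, 60)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
print("genome-wide significant:", p < GENOME_WIDE_ALPHA)
```

In this toy table a 20-point difference in allele frequency is nominally significant yet falls far short of the genome-wide threshold, which gives a feel for why very large cohorts are needed once effects are small and millions of SNPs are tested.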
Because such large sample sizes are required to detect most (not all) genes influencing a given trait, several cohorts from different populations are merged into a single cohort to perform GWAS. Recall the discussion above (Sect. "A background on genetic aspects of GWAS") concerning population structure in space and time as required for applying classical models of additive gene action. Pooling many small populations can result in a reduction of heterozygotes called the Wahlund effect. Further, the discovered genes may not necessarily show a consistent relationship with the trait across populations of ethnic groups. In some cases, although a strong association between a trait and genetic markers is found, the markers' individual influences, or "effect sizes," on the trait under consideration are low, which would agree with the polygenic nature of complex traits as assumed by Fisher. Another consequence of requiring large sample sizes to detect genetic associations is that GWAS are largely restricted to detecting genetic variants that are commonly found in a population. Rare and private variants are not easily detected, and even greater sample sizes are often required for their detection.
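The Wahlund effect mentioned above can be stated compactly for one biallelic locus. The sketch below assumes k equally sized subpopulations, each in Hardy-Weinberg proportions; the symbols and the numerical example are illustrative and are not drawn from the paper.

```latex
% p_i: allele frequency in subpopulation i
% \bar{p}: mean allele frequency; \sigma_p^2: variance of p_i across subpopulations
H_{\mathrm{pooled}} = \frac{1}{k}\sum_{i=1}^{k} 2 p_i (1 - p_i)
                    = 2\bar{p}(1 - \bar{p}) - 2\sigma_p^2

% Hypothetical example, two subpopulations with p_1 = 0.3 and p_2 = 0.7:
% \bar{p} = 0.5, \sigma_p^2 = 0.04
H_{\mathrm{expected}} = 2(0.5)(0.5) = 0.50, \qquad
H_{\mathrm{pooled}} = 0.50 - 2(0.04) = 0.42
```

The pooled cohort therefore shows fewer heterozygotes than a single randomly mating population with the same mean allele frequency would, and the deficit grows with the variance in allele frequency among the merged groups.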
Finally, note that, in addition to these problems associated with sample size, GWA studies, despite their popularity, are plagued with still other familiar problems, which may not be unrelated to population sizes. These include difficulty pinpointing specific rare alleles and their effect size, so that heritability of the trait is not fully explained, and difficulty replicating results for other populations and ethnic groups (Tam et al. 2019).
The variation in population structure and genome architecture seen in such large populations poses risks for the reliability of GWAS in behavioral studies. On the one hand, GWAS captures a largely additive portion of the variance. On the other hand, such studies miss the influence of dominance (cis) and epistatic (trans) interactions on the trait under consideration. This is because GWAS generally assume a direct influence of a given SNP located at a specific region of the genome on the studied trait(s). By definition, allele frequencies in populations within cases and control groups are not necessarily identical, because each of these populations also contains novel mutations (private or rare alleles), and the magnitude of their effect is often contextual, depending on both genetic and non-genetic factors. To "correct" for population admixture or stratification, a method called "Principal Component Analysis" (PCA) is often used. PCA was developed over a century ago by Pearson (1901) to reduce the complex dimensionality of correlated variables into linearly uncorrelated variables. This technique has been widely applied to correct for population structure before performing GWAS.
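To make the role of PCA concrete, the following toy sketch simulates genotypes for two subpopulations and extracts the first principal component with plain power iteration. Everything here is an assumption made for illustration: the function names, the uniform allele frequencies of 0.2 and 0.8, the sample sizes, and the use of a single component. Production analyses rely on dedicated software, many components, and additional quality-control steps.

```python
import random


def simulate_genotypes(n_per_pop, n_snps, p_a, p_b, seed=0):
    """Toy genotypes (0/1/2 allele counts) for two subpopulations."""
    rng = random.Random(seed)
    rows = []
    for p in (p_a, p_b):
        for _ in range(n_per_pop):
            rows.append([(rng.random() < p) + (rng.random() < p)
                         for _ in range(n_snps)])
    return rows


def first_principal_component(genotypes, n_iter=200):
    """PC1 scores per individual via power iteration on the Gram matrix."""
    n, m = len(genotypes), len(genotypes[0])
    # Center each SNP (column) at its sample mean
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    x = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    # Individual-by-individual similarity matrix K = X X^T / m
    k = [[sum(a * b for a, b in zip(x[i], x[h])) / m for h in range(n)]
         for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # arbitrary nonzero start vector
    for _ in range(n_iter):
        w = [sum(k[i][h] * v[h] for h in range(n)) for i in range(n)]
        norm = sum(val * val for val in w) ** 0.5
        if norm == 0:
            raise ValueError("no genetic variation among individuals")
        v = [val / norm for val in w]
    return v


# Ten individuals per subpopulation, 200 SNPs, strongly divergent frequencies
genos = simulate_genotypes(n_per_pop=10, n_snps=200, p_a=0.2, p_b=0.8)
pc1 = first_principal_component(genos)
print([round(s, 2) for s in pc1])
```

With frequencies this divergent, the sign of the PC1 score separates the two groups cleanly, and including PC1 as a covariate would absorb that difference. The toy case is the easy one; with subtle or nested structure the leading components need not capture all of the stratification.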
Application of PCA in GWAS is also fraught with problems. For instance, even in well-documented and widely used samples for genetic studies such as the UK Biobank, latent structure remains a common problem despite adjustments using principal component analysis, which has serious implications for the accuracy of PRS (Haworth et al. 2019). More recent studies have questioned the applicability of PCA in GWA studies: "PCA results are artifacts of the data and … can be easily manipulated to generate desired outcomes … [results] may not be reliable, robust, or replicable as the field assumes" (Elhaik 2022). Ding and colleagues (2023) reach a similar conclusion concerning the application of risk scores to individuals: such scores are highly dependent on the reference point selected as the basis for PCA in a training data set, and vary widely depending on the composition of the training data.
Polygenic risk scores and simple Mendelian genetic architectures
What about Polygenic Risk Scores, which are derived from GWAS? Note that GWAS does indeed provide insights into an approximate number of genes and their variants influencing a given trait, identified as single nucleotide polymorphisms or SNPs with an "rs" (reference SNP) number which may be associated with the traits under study. These studies also provide an idea of the magnitude of the influence (effect size) of each of the SNPs that meet a fixed P-value threshold (5 × 10⁻⁸ or lower) for the trait employed in the association study (e.g., SNPs vs. human height). Due to the sheer number of SNPs and size of the population, GWAS in principle appears to fulfill important statistical requirements: sample size, corrections for admixture, and distribution of effect among SNPs, to name just three. Although PRS is a novel method in human genetic research, and allows extracting some useful information from GWAS, the basic idea of PRS goes back to Mendel’s original formulation of genetic principles, dominance and recessivity, as well as Fisher’s idea of average effect and average excess of an allele (Fisher 1941).
A central principle of Mendelian genetics is that dominant alleles influence the expression of a given trait differently than do recessive alleles (Fig. 1). For instance, individuals with homozygous dominant, heterozygous, and homozygous recessive genotypes carry two, one, or zero copies of a given allele (e.g., TT, Tt, and tt for tall and dwarf phenotypes, respectively), resulting in a dose-dependent effect on height. Fisher (1941) presented Mendel’s results differently, however. Suppose that a locus starts out with no polymorphism; what happens to progeny when a new mutation occurs at that locus, or an old allele (wild type) is "substituted" for a new one among the gametes of parents distributed in a population? Fisher (1941) proposed that in a random mating population, the average effect of gene (allele) substitution is equal to the difference between the weighted mean of individuals carrying an allele A and that of individuals carrying the alternate allele a. This is the magnitude of the influence of the A allele on a given trait. In other words, the average effect of the allele is a measure of the phenotypic effects of gametes carrying A or a (Falconer 1985; Templeton 1987). The same idea was discussed in the 1980s in terms of the "measured genotype approach" (Boerwinkle et al. 1986). The idea of PRS is only an extension of the above rationale.
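The average effect of an allele substitution described above can be computed directly as a difference of weighted means. The sketch below assumes random mating and uses the standard textbook coding of genotypic values (AA = +a, Aa = d, aa = -a); that coding, the function name, and the example values are assumptions for illustration rather than notation taken from the paper.

```python
def average_effect(p, a, d):
    """Average effect of substituting allele A for allele a.

    p    : frequency of allele A (random mating assumed)
    a, d : genotypic values coded as AA = +a, Aa = d, aa = -a
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    # Mean genotypic value of offspring receiving an A gamete: the other
    # gamete is A with probability p and a with probability q.
    mean_with_A = p * a + q * d
    # Mean genotypic value of offspring receiving an a gamete.
    mean_with_a = p * d + q * (-a)
    return mean_with_A - mean_with_a  # algebraically a + d * (q - p)


# Complete dominance (d = a): the same allele, three allele frequencies
for freq in (0.1, 0.5, 0.9):
    print(freq, round(average_effect(freq, a=1.0, d=1.0), 3))
```

With any dominance (d not equal to 0), the value returned changes with the allele frequency, so the "effect" of one and the same allele is a property of the population in which it is measured rather than a fixed constant.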
In accordance with Fisher's (1918, 1930, 1941) understanding of the inheritance of numerous alleles and their additive properties, Wray et al. (2007) suggested that the cumulative influence of individual SNPs on a specific polygenic trait, i.e., a trait dependent on many genes, that reach statistical significance (not just P-values of 5 × 10⁻⁸) could be weighted in accordance with their magnitude of influence on a given trait, as indicated by the strength of association (effect size) between the SNP and the trait, and pooled in relation to their effect sizes and represented by a number (recall Fig. 1). That pooled statistic may be used as a predictive index, the polygenic risk score (PRS), in genetic studies of humans or any other organism, just as it has been applied in quantitative genetics. The expected effects of these genotypes (alleles) on the phenotype are assumed to be linear (Walsh and Lynch 1998), analogous to narrow-sense heritability. However, this linearity breaks down under different environmental conditions, as demonstrated in many norm-of-reaction studies (see Sect. "Niche construction" below).
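In its simplest form, the pooled statistic described above is a weighted sum of allele counts. The following sketch shows that calculation; the SNP identifiers, effect sizes, and genotypes are hypothetical, and real scoring pipelines add steps (SNP selection, handling of linkage disequilibrium, standardization) that are omitted here.

```python
def polygenic_score(allele_counts, effect_sizes):
    """Weighted sum of allele counts: sum over SNPs of beta_j * x_j.

    allele_counts : dict SNP id -> copies of the effect allele (0, 1, or 2)
    effect_sizes  : dict SNP id -> per-allele effect estimate from a GWAS
    SNPs missing from the individual's genotype are skipped.
    """
    score = 0.0
    for snp, beta in effect_sizes.items():
        count = allele_counts.get(snp)
        if count is None:
            continue
        if count not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {snp}: {count}")
        score += beta * count
    return score


# Hypothetical SNP identifiers and effect sizes (not real GWAS results)
weights = {"snp_1": 0.02, "snp_2": -0.01, "snp_3": 0.005}
person_a = {"snp_1": 2, "snp_2": 0, "snp_3": 1}
person_b = {"snp_1": 0, "snp_2": 2, "snp_3": 1}
print(round(polygenic_score(person_a, weights), 3))
print(round(polygenic_score(person_b, weights), 3))
```

Nothing in this sum depends on which alleles occur together or on the environment in which the person develops; the score is additive by construction.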
This all sounds very tidy, and it would be a boon if PRS could indeed perform as expected. The problem is that PRS reflects additive effects only, and does not include dominance and epistatic effects on the phenotype, which are non-linear, concatenated, and unpredictable. The genotype-to-phenotype expression is contingent upon the environment (Fig. 2). Although the use of PRS in human genetics (especially clinical studies) has been advocated by many investigators (Adeyemo et al. 2012), its indiscriminate use in the social sciences (Janssens 2019) and in human disease prediction has been questioned (Wald and Old 2019; Herzig et al. 2022; Kumuthini et al. 2022). For instance, Wald and Old (2019) claim that "to our knowledge, no genome-wide polygenic score meets this requirement (i.e., even a moderate, risk ratio of 3–6) and none is likely to do so with polygenic scores that emerge in the future. It is important that the potential applications of genomic medicine are not compromised by raising unrealistic expectations in medical screening." As Turnbull et al. (2024) have cautioned, genomic screening, using indices such as PRS, like any public health program, must be grounded in rigorous testing, regardless of the field of application. Recall Kempthorne’s (1977) views as a quantitative geneticist published over four decades ago: topics such as the genetics of race or IQ should be avoided "because there is in the foreseeable future no possibility of eliminating extreme naivete of genetical, environmental and statistical modeling." Although Wald and Old’s comment is a bit harsh, especially in light of Kempthorne's remark, they do raise a legitimate concern about the over-emphasized use of PRS in clinical studies, and in particular, human behavioral genetic studies.
A general depiction of genotype-by-environment interactions and reaction norms. Each genotype exhibits a different response to varying durations of light exposure. Some genotypes, such as Aa, display average performance across different environmental conditions, while the AA and aa genotypes may differ. Additionally, when the lines of reaction norm are not parallel, it indicates the presence of genotype-by-environment interaction (Redrawn with permission from Alvarez-Castro 2024). A very similar family of curves is also found in Gupta and Lewontin (1982) and Lewontin (2006). For simplicity, curvilinear effects are not shown.
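The caption's point that non-parallel reaction norms signal genotype-by-environment interaction can be checked numerically. The sketch below computes the interaction residuals of a simple two-way (genotype by environment) table of phenotype means; the function names and all phenotype values are hypothetical and are not read off the figure.

```python
def interaction_terms(phenotypes):
    """G x E interaction residuals from a genotype-by-environment table.

    phenotypes : dict genotype -> list of mean phenotypes, one per
                 environment (same environment order for every genotype).
    Each residual is y_ge - genotype mean - environment mean + grand mean;
    all residuals are zero when the reaction norms are parallel.
    """
    genotypes = list(phenotypes)
    n_env = len(phenotypes[genotypes[0]])
    grand = sum(sum(v) for v in phenotypes.values()) / (len(genotypes) * n_env)
    g_mean = {g: sum(phenotypes[g]) / n_env for g in genotypes}
    e_mean = [sum(phenotypes[g][e] for g in genotypes) / len(genotypes)
              for e in range(n_env)]
    return {g: [phenotypes[g][e] - g_mean[g] - e_mean[e] + grand
                for e in range(n_env)]
            for g in genotypes}


def has_gxe(phenotypes, tol=1e-9):
    """True when any interaction residual exceeds the tolerance."""
    return any(abs(r) > tol
               for row in interaction_terms(phenotypes).values()
               for r in row)


# Hypothetical phenotype means in three light-exposure environments
parallel = {"AA": [10, 12, 14], "Aa": [8, 10, 12], "aa": [6, 8, 10]}
crossing = {"AA": [10, 12, 14], "Aa": [11, 11, 11], "aa": [14, 11, 8]}
print(has_gxe(parallel), has_gxe(crossing))
```

In the parallel table the ranking of genotypes is the same in every environment, so a single additive effect per genotype describes the data. In the crossing table the ranking reverses between the first and last environment, so no single environment-free ordering of genotypes exists.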
Heritability: an unreliable index
Although thousands of GWAS have been successfully conducted spanning numerous traits and global populations, they fall short of ideal conditions for applying this model for behavioral traits, which are highly influenced by environmental factors. What about that classical measure of additive gene action, often measured in terms of heritability? Harden is optimistic. "Despite pleas to abandon the concept I anticipate that heritability research will persist for good reason, because it is answering a question about whether people’s genes, an accident of birth over which they have no control…[cause] differences in education and income and well-being and health—in the societies in which we actually live" (Harden 2021, 123). This optimism is not warranted, unfortunately, because of critical imprecision in measuring heritability, rooted in the complexity of genotype–environmental (GxE) interactions (Alvarez-Castro 2024). What follows is that it is unclear what conclusions, if any, can be drawn about the impact of a single nucleotide substitution on composite, complex traits such as educational outcomes, plausibly influenced by thousands of variants (e.g., > 12,000 variants influence human height; Yengo et al. 2022) all of which are modulated by many developmental and environmental factors. What works reasonably well under carefully controlled environmental conditions, with optimal number of replications such as "greenhouse experiments" in which almost all sources of variation can be reasonably identified and partitioned, does not work for widely spread outbreeding human populations. Interestingly, a recent study by Kweon et al. (2025) on 668,288 individuals of European descent identified 162 genomic loci associated with income. However, these loci accounted for only 1–5% of income variation, with most effects being indirect and mediated through education, mental health, and social factors. What happened to the rest of 95% variation? 
This underscores that income is primarily shaped by environmental and societal influences rather than genetic factors. These findings align with previous research showing weak relationships between genetics and educational achievement.
There are two main issues. First, causal influences on the many aspects of the biological and ecological context cannot be uncoupled in a way that makes it possible to identify the role of the specific genetic variant within a given gene, especially for complex traits: there are too many confounders. Second, the heritability explained by SNPs in human populations is generally low and, as a rule, varies among traits (Mayhew and Mayre 2017). Therefore, these estimates are not useful for decision-making about individuals or society. Consider each of these points in turn.
Confounders of heritability
Briefly, heritability is a ratio of genetic (more precisely additive genetic) variance to phenotypic variance and is applicable only to a given population and environment, as most quantitative geneticists have pointed out for decades. This ratio is constrained by traits, and demographic variables such as age, stage, and gender. Although this key idea in genetics was introduced by both Fisher and Wright in the early 1900s (Visscher et al. 2008), its use as a tool to develop efficient breeding strategies was promoted by Lush (1937) and continues to be used by countless plant and animal breeders, as well as evolutionary biologists around the world.
Narrow-sense heritability h² is defined as the ratio of the additive portion of variance Va to phenotypic variance Vp, or Va/Vp (Falconer and Mackay 1996). This definition refers to the degree of direct and cumulative influence of parental genes on the progeny, estimated in terms of parent–offspring regressions. The concept of heritability is one of the most widely used and abused measures, particularly in human behavioral research and more specifically in relation to human cognitive behavior (Feldman and Lewontin 1975; Kevles 1985; Block 1995; Kempthorne 1977, 1978, 1997). To see how this works, let us take an easily measurable trait (phenotype) such as human height. Fisher (1918) recognized the phenotype P as the outcome of genetic and environmental factors G and E, and expressed their variation in terms of their variances in relation to (for simplicity) a single segregating locus. Later investigators (Cockerham 1954; Kempthorne 1957) further partitioned the genetic variance, giving the following set of variances: additive, a; dominance (intra-allelic interaction), d; epistatic (inter-allelic interaction), i; and environmental, e. This is formally expressed as Vp = Va + Vd + Vi + Ve. These components could also show additional interactions: a × a, a × d, a × i, a × e, d × d, d × i, and so on. These relationships are both non-linear and unpredictable in relation to the environment (Fivet et al. 2018). This means that heritability is not especially useful for predicting the performance of individual crosses (or a series of them) in outbreeding organisms, generally estimated in terms of specific combining ability such as in the Toystory model, discussed by Harden (2021, 32). For instance, a person could mate with individuals from distinctly different genetic and geographic backgrounds, and have many children. Accordingly, siblings within families with varying degrees of relationships might show differing degrees of dominance and epistasis (intra- and inter-allelic interactions) for phenotypic traits.
Individuals within such families might show divergent developmental patterns despite sharing a common environment.
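The variance partition above can be made concrete with a small worked example. The variance values are hypothetical and chosen only to show how the ratio behaves; the notation follows the text (Va, Vd, Vi, Ve, Vp).

```latex
% Variance partition and narrow-sense heritability as defined in the text
V_p = V_a + V_d + V_i + V_e, \qquad h^2 = \frac{V_a}{V_p}

% Hypothetical population, environment 1: V_a = 4, V_d = 1, V_i = 1, V_e = 4
V_p = 10, \qquad h^2 = \frac{4}{10} = 0.40

% Same genetic variances, more variable environment: V_e = 14
V_p = 20, \qquad h^2 = \frac{4}{20} = 0.20
```

The additive genetic variance is identical in both cases, yet the heritability halves when the environment becomes more variable. The same number therefore cannot be carried from one population or environment to another.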
Heritability also fails to account for the influence of dominance, for instance, if the Aa genotype shows intra-allelic interaction, pleiotropy, and epistatic interactions with other genes in a polygenic system. These interactions would show both superior and inferior effects of dominance and epistasis on expressed phenotypes ranging from hybrid vigor to hybrid inferiority (Govindaraju 2019). Intra- and inter-allelic interaction effects are predominantly expressed in individual-specific matings, particularly in outbred species of organisms, such as humans. In humans (and many other complex organisms such as mammals), gene effects that are largely or entirely nonadditive (over- and under-dominance) might show temporal or developmental trends, even among individuals sharing the same environment (Rice 2008). The same principle applies to gene-gene (epistatic) interactions and gene-environment interactions evaluated with stringent experimental designs such as those used by breeders.
What about variation in the environment? The classical approach to partitioning phenotypic variance is to estimate the amount of environmental variance, as well as the amount of heritability. As discussed earlier, this has worked reasonably well in statistically well-designed studies, with adequate randomization, replications, and local controls. Much of plant and animal breeding and agricultural research still relies on efficient experimental designs in order to reduce random and systematic errors, the techniques primarily developed by Fisher (1925). This cannot be extended to human populations, which are not amenable to such stringently controlled randomized studies. A special difficulty for human populations is that strictly controlling for environmental variations, which have a large number of components, is also impossible. Every individual in a sibship experiences the common environmental variation, measured in terms of common environmental variance, Vc, but in different degrees, so that every mating pair and progeny of a given mating pair experience widely different micro-environments. Quantitative geneticists who deal with natural and domesticated populations of various species suggest that Vc could include maternal effects, maternal genetic effects (mtDNA); brood/clutch effect, nested horizontal and hierarchical effects, confounding effects, and more (Kruuk and Hadfield 2007): "in many cases—for example, maternal effects due to innate differences between mothers—shared environment effects cannot be entirely eliminated by experimental design. Similarly, marker-based methods of estimating heritability will also be affected by the existence of shared environment effects."
In conclusion, heritability, perhaps useful for making approximate predictions on crop yields or in livestock breeding, under replicated and controlled experimental conditions, has limited purchase (if any) on the effects of a single allele substitution on an individual which would be of any use for making predictions about that individual's likelihood of success in school or financial success. There are simply too many non-linear, non-additive, confounding sources of variation in the environment and at the genetic and epigenetic and developmental levels.
High variability, small effect size
Even readily quantifiable traits such as height and weight are classed as complex traits, each governed by a polygenic system of thousands of mutually interacting genomic variants. For instance, human height, which can be clearly measured, appears to be influenced by over 12,000 variants, identified in a study of over 5.4 million individuals (Yengo et al. 2022). Behavioral traits such as educational and financial success, on the other hand, are influenced by a combination of many complex traits, including biological traits such as height and weight; hence, they are composite traits. In other words, a number of complex traits influence composite traits. Data on many of these composite behavioral traits are derived from observational studies and are often presented as dichotomous or categorical variables; some of these data are mere approximations and hence are internally inconsistent.
For instance, consider variables such as "average number of school days missed per student," "frequency of school absences," "are parents rich or poor?", or "income group." Such variables are correlated with other variables (which are also confounded), and some of them are subjective and internally inconsistent because they represent small segments of the observation period in the data set. Therefore, such data are often "adjusted" for confounders to improve the validity of the outcomes and to fit statistical models. Unfortunately, "residual confounding" remains in the data even after adjusting for primary confounders, which invariably leads to cryptic and latent distortion of the data structure due to measurement unreliability and model misspecification. "Error rates are highest—in some cases approaching 100%—when sample sizes are large and reliability is moderate," which often leads to drawing spurious conclusions (Rice 2008; Westfall and Yarkoni 2016). In general, confounding is frequently overlooked or downplayed in contemporary reports about genetic causes of human behavior and socioeconomic outcomes, which "fuels hereditarian fallacies" (Benning et al. 2024).
In the production sciences, high levels of heritability would serve as an index of the variation attributable to genetic factors, which in turn would be used to predict the extent to which a given trait or set of traits could be improved through mass (truncation) selection (Robertson 1966). Traits that show low levels of heritability in the range of 10–15 percent are simply discarded, because low levels of heritability are generally attributed to environmental variation. Such low levels of heritability are ephemeral because of the fluctuating nature of environments. Alternatively, under such circumstances, it might be that, depending upon the traits examined, there could be greater genetic similarities among populations than differences, or perhaps not. For instance, "In his 1919 paper, Fisher commented that even a high heritability of human height of 0.95 (his estimate for height) is consistent with large effects of environmental factors" (Visscher and Goddard 2019). In short, even high heritability could be unreliable.
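The prediction that breeders draw from heritability is conventionally written as the breeder's equation, R = h²S, where S is the selection differential and R is the expected response per generation. The equation is standard in quantitative genetics but is not stated in this article, and the numbers below are hypothetical. The following Python sketch only illustrates why traits with heritability near 10–15 percent tend to be set aside in breeding programs; it says nothing about outcomes for individuals.

```python
def expected_response(h2: float, selection_differential: float) -> float:
    """Breeder's equation R = h2 * S (textbook form; an assumption, not from this article)."""
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("narrow-sense heritability must lie in [0, 1]")
    return h2 * selection_differential


# Hypothetical case: selected parents exceed the population mean by 10 units.
S = 10.0
high = expected_response(0.60, S)  # trait with high heritability
low = expected_response(0.10, S)   # trait with low heritability

print(f"expected gain with h2 = 0.60: {high:.1f} units")
print(f"expected gain with h2 = 0.10: {low:.1f} units")
```

Under these assumed values, the same selection effort yields a much smaller expected gain when heritability is low, which is the breeder's rationale for discarding such traits.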
Lewontin (1974) has addressed chasing these mirages concerning traits that show low levels of heritability (e.g., Herrnstein and Murray 1994) quite well. "The fallacy is that a knowledge of the heritability of some trait in a population provides an index of the efficacy of the environmental or clinical intervention in altering the trait either in individuals or in the population as a whole." Lewontin continues: "misunderstanding about the relationship between heritability and phenotypic plasticity is not simply the result of an ignorance of genetics on the part of psychologists and electronic engineers. It arises from the entire system of analysis of causes through linear models, embodied in the analysis of variance and covariance and in path analysis."
Epigenetics
The Fisherian school of quantitative genetics is built on the premise that the phenotype is the product of genotype and the environment (P = G + E). A better description of this relationship would be that information from the genotype space passes through developmental space and expresses as phenotype under the influence of the environment (Lewontin 1974). Both Sewall Wright (1934) and Waddington (1942) used the term "epigenetics" to represent the developmental space (Lewontin 1974). These concatenated aspects of biological processes are linked together in networks (Fig. 3).
A schematic view of the genotype-epigenetic-phenotype (G-E-P) map of an individual at a given time. Information from genotype space to phenotype space passes through epigenetic space. Genes (G1 … G8) show cis and trans (gene-gene) interactions. Genes influence phenotypes through biochemical (metabolic) pathways embedded in epigenetic space. Some genes could have a major effect on only one phenotype (Mendelian traits, Tr5). On the other hand, many genes with pleiotropic effects influence complex traits (Tr4) and composite traits (Tr1, Tr2, Tr3), which are constellations of complex traits. Also, some traits (Tr3, Tr4) could overlap with other traits; traits that appear to be independent at the phenotype level may not be independent at all, but could be connected in the epigenetic and genotype space due to pleiotropy. Note that all traits are differentially influenced by a network of genes that show direct, indirect, mediated, conditional, reverse, truncated and merged (H = hub) paths of distribution and dissipation of gene-specific enzymes (metabolic flow) as well as post-translational modifications of histones into various components of the phenome (extra-genome). The phenotype is also influenced by age, stage, gender, and natural and constructed environments (Lewontin 1974; 2000; Wagner 1996; Houle et al. 2010).
Epigenetics has been defined as heritable and non-heritable phenotypic changes that do not involve changes in the DNA sequence. Expressions of both over- and under-dominance (heterosis) are also aspects of epigenetic phenomena. More recent studies suggest that epigenetic processes involve direct, indirect, mediated, conditional, reverse, truncated, and merged paths of distribution and dissipation of gene-specific enzymes as well as post-translational modifications of histones into various components of the phenotype (Ramazi et al. 2020; Handy et al. 2011). All of these phenomena defeat attempts to accurately identify the paths of influence of a specific gene on any complex or composite trait in an individual or a population. The problem is that epigenetic and developmental effects introduce an extraordinarily wide range of variation in the phenotypes of individuals with the same genes in relation to environments.
Wright (1920) suggested that the epigenetic (extra-genomic) variation includes an "irregularities in development" component, D. He reported that in outbred guinea pigs, variance due to environment (e²) was only 0.003% and that due to heredity (h²) was 42%, while variance due to development (d²) was 58%. In contrast, in the inbred line, variance was 3% due to heredity, 5% due to environment, and 92% due to development. In other words, developmental (epigenetic) variations could overwhelm both hereditary and environmental effects. The general importance and implications of Wright's result have been confirmed by the vast body of accumulating evidence on epigenetic variation, which suggests that the environment exerts a profound influence on the developing organism in numerous ways throughout its lifetime (Feil and Fraga 2012). These are further supported by concepts such as phenotypic plasticity, and epigenetic, developmental, and metabolic reprogramming (Feinberg 2007; Godfrey et al. 2016; Xavier et al. 2019).
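As a simple arithmetic check, the percentages attributed to Wright (1920) in the preceding paragraph can be tabulated and summed. The Python sketch below uses only the figures quoted in the text; the dictionary layout and the function name are illustrative choices, not part of Wright's analysis.

```python
# Percent of variance by source, as quoted in the text for Wright (1920).
wright_1920 = {
    "outbred": {"heredity": 42.0, "environment": 0.003, "development": 58.0},
    "inbred": {"heredity": 3.0, "environment": 5.0, "development": 92.0},
}


def largest_component(shares: dict) -> str:
    """Return the name of the source with the largest share of variance."""
    return max(shares, key=shares.get)


# Total percentage per line, to confirm the quoted shares account for ~100%.
totals = {line: sum(shares.values()) for line, shares in wright_1920.items()}

for line, shares in wright_1920.items():
    print(line, "total:", round(totals[line], 3), "largest:", largest_component(shares))
```

In both lines the quoted shares sum to roughly 100 percent, and development is the largest single component.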
For instance, the environment together with epigenetics plays a significant role in the developmental origins of health and disease during the human life course (Gluckman and Hanson 2006; Wagner et al. 2024) which influences cognitive abilities as well. In a recent study, similar to Wright’s statistical analysis, Odintsova et al. (2021) made a comparative analysis of variances explained by allelic variation via GWAS and epigenetic markers using Epigenome-Wide Association Study (EWAS) on anthropometric traits (birth weight, adult BMI) and behavioral traits (current, former smoker). They calculated both PRS (DNA-based) and methylation risk score (MRS; epigenetics-based). Subsequently, they compared the variance explained by PRS and by MRS on smoking (a behavioral trait) between current smokers and former smokers. Up to 57.5% of the variance in current smoking status and 16.3% of the variance in former smoking status were explained by MRS. Interestingly, the variance explained by MRS was twenty times more than the variance explained by PRS. Clearly, exposure to tobacco, a potent environmental factor, could overwhelm the influence of genetic factors.
Since obesity is becoming a global health phenomenon, Bann et al. (2022) sought to determine PRS for body weight among children in relation to social background. Genetic risk predicted only 10%, and social background 4%, of the differences in body weight; ultimately, neither index was reliable in relation to social classes among children who developed obesity. These investigators concluded that PRS is a poor predictor of body weight, and that neither genetics nor social disadvantage has an important influence on our body weight. Note that body weight is also a complex trait.
Failure to account for epigenetics represents a significant flaw in the WDP perspective in Harden’s treatment. Although she has attempted to incorporate some thinking on differential gene expression (Harden 2021, 119), she makes no explicit reference to epigenetic mechanisms anywhere in her narrative. Some proponents of the WDP school may yet step in to explain how, in light of the importance of epigenetics, DNA sequence differences between individuals play a causal role in social or educational outcomes. Nonetheless, it is difficult to avoid the conclusion that Harden's avoidance of the issue reflects insuperable methodological barriers to accounting for it.
Niche construction
Harden mentions that "being moved out of institutional care causes an increase in IQ, but how? No one really knows" (Harden 2021, 104). Similarly, "adopting a child into foster care causes higher IQ" (2021, 107), and "a positive environmental difference disproportionately improved the outcomes of people who were at highest genetic risk for poor outcomes" (Harden 2021, 166). These statements suggest that genetically "determined" measures indeed show plasticity in cognitive traits to changing environments. They seem out of place, coming from a proponent of the WDP school, with its emphasis on genetics. Indeed, in human society, in which there is room for individual choice in the context of options provided by school, home life, and social relationships more generally, extension of an ecological concept—niche construction—makes more sense.
Note that the three important measures of quantitative genetic variation considered so far (heritability, GWAS, and PRS) are fed by the same concealed genetic conduit: additive gene action. As a trained plant breeder focused on outbreeding species, I (DRG) was weaned on the merits of additive gene action coming from the Edinburgh and N. C. State schools of quantitative genetics. True, additive gene action has undoubtedly provided a strong theoretical foundation for mass/truncation selection programs and more (Robertson 1966; Crow and Kimura 1979), and has paved the way for spectacular advances in plant and animal breeding (Hill 2008). Nonetheless, this classic idea may not be applicable at the level of the individual (individual selection), where population-based mass selection cannot be practiced (e.g., humans). Individual selection (merit) is the predominant mode of selection in outbreeding organisms. As discussed earlier, the additive model suggests that the phenotype is the direct and linear product of the effects of all the genes and their variants, with little or no dominance and epistatic effects. As we noted, indices of additive genetic variation, particularly (even high) heritability, are affected by the environment (Lewontin 2006).
Going further, Harden mentions Lewontin’s thought experiment examining the uncertainties associated with the phenotypic expression of genotypes in relation to environments, namely norms of reaction, or simply genotype x environment (GxE) interactions (Fig. 2; Alvarez-Castro 2024). Norms of reaction and plasticity are represented by a set of genotype-specific response curves that are often nonlinear with respect to environmental variation (Lewontin 2006; Gupta and Lewontin 1982). In fact, Hogben, writing in 1933 (Mayo and Nanjundaiah 2024), highlighted nonlinear genotype-environment interactions as a factor that could challenge the validity of Fisher’s assumptions regarding additive gene activity and the linear response of phenotypes to environmental variation. Such non-linear interactions are well known both in evolutionary biology and in production agriculture. For example, the GxE principle is central to identifying varieties and breeds that show a broad or narrow range of adaptability using multilocation trials. Heterozygous individuals tend to show greater resilience and buffering to environmental fluctuations (Govindaraju 2019). Further, these are a few of the foundational concepts in ecology and evolutionary biology of plants and animals (Clausen et al. 1940; Schmalhausen 1949).
So, how can low levels of additive genetic variance and weakly predictive PRS scores based on idealizations be trusted as predictive indices for composite traits such as childhood delinquencies and skipping school? Indeed, they should not be, especially because the role of personal and individual choice among a range of options provided by society plays such a significant role in an individual's educational outcomes. Axes in reaction norm graphs depict positive, negative, or neutral influences on developing phenotypes. As Lewontin (2006) puts it, while the linear model "appears to isolate distinct causes of variation into separate elements, it does not do so because the amount of environmental variance that appears depends upon the genotypic distribution, while the amount of genetic variance depends upon the environmental distribution. Thus, the appearance of the separation of causes is a pure illusion."
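Lewontin's point about the interdependence of variance components can be illustrated with a toy calculation. The Python sketch below assumes two hypothetical genotypes, A and B, with linear reaction norms of different slopes; the norms, environment values, and function names are all invented for illustration and do not come from any data set.

```python
from statistics import mean, pvariance

# Hypothetical linear reaction norms; the slopes differ, so there is GxE.
REACTION_NORMS = {
    "A": lambda e: 10.0 + 1.0 * e,
    "B": lambda e: 10.0 + 3.0 * e,
}


def genetic_variance(environments, genotypes=("A", "B")):
    """Variance among genotype means, each mean taken over the given environments."""
    genotype_means = [
        mean([REACTION_NORMS[g](e) for e in environments]) for g in genotypes
    ]
    return pvariance(genotype_means)


def environmental_variance(genotype, environments):
    """Variance across environments for a population made up of one genotype."""
    return pvariance([REACTION_NORMS[genotype](e) for e in environments])


# Same genotypes, two different environmental distributions.
vg_low = genetic_variance([0.0, 1.0])
vg_high = genetic_variance([2.0, 3.0])

# Same environments, two different genotypic compositions.
all_envs = [0.0, 1.0, 2.0, 3.0]
ve_a = environmental_variance("A", all_envs)
ve_b = environmental_variance("B", all_envs)

print("genetic variance:", vg_low, "vs", vg_high)
print("environmental variance:", ve_a, "vs", ve_b)
```

With these assumed norms, the "genetic" variance changes when only the environments change, and the "environmental" variance changes when only the genotypic composition changes, which is the sense in which the separation of causes is illusory.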
Lewontin (1983, 2000) proposed that most organisms, including humans, actively and consciously "construct" their own extensive and elaborate environments for survival and reproduction. From an ecological perspective, the niche of any organism represents an n-dimensional hypervolume (Hutchinson 1957; Rosa and Tudge 2013). Each point within this hypervolume corresponds to a state of the environment that permits populations of a species to exist. Nonetheless, in many organisms, particularly outbreeding ones such as humans, individual phenotypes interact and develop differently in relation to the abiotic and biotic environmental conditions that influence viability and reproduction (fitness). Therefore, many have argued for extending the Hutchinsonian concept to the individual level or individual niche: an n-dimensional space that accounts for the social roles, interactions, and choices each individual makes in relation to the social and cultural conditions they need to thrive (Ihara and Feldman 2004; Bergmüller and Taborsky 2010; Kaiser et al. 2024).
In fact, Bronfenbrenner, the renowned child development specialist cited by Harden (2021, 106), proposes ecological systems theory, which suggests different environmental influences that can affect a child's development, including factors such as parents, friends, school, work, and the larger cultural context (Rosa and Tudge 2013; Bronfenbrenner 1981). These ideas complement the idea of a Hutchinsonian niche and Lewontin’s idea of niche construction. In the case of humans, niche dimensions would include food, clothing, shelter, generational wealth, social interaction, medical interventions, transportation, and landscapes, to name a few. These factors, individually or collectively, influence individuals within families and families within ethnic groups, shaping their success both within and across generations. All of these may exert both advantageous and adverse effects on cognitive and physical health as well as longevity at all levels of biological organization during the course of their development. These combined processes have been termed "niche construction" (Odling-Smee et al. 2013; Lewontin 2000), which has been suggested to influence evolutionary processes in a wide range of organisms.
Interestingly, economists Chetty and Hendren (2022), using data derived from millions of families as part of "The Equality of Opportunity Project," have shown that the area in which children grow up has significant causal effects on their prospects for upward mobility. For instance, their models suggest that a child who moves at age two from the lower-income Van Dyke public housing to the nearby higher-income Nehemiah Spring Creek affordable housing in Brooklyn will earn roughly $25,000 a year as an adult, compared with $17,000 a year, on average, had the child remained. This gain decreases with each year the child remains, particularly after 18 (see Fig. 4).
Children who moved to more upwardly mobile neighborhoods tended to have higher incomes as adults, as illustrated by children who moved within Brooklyn, New York (Chetty and Hendren 2022; reproduced with permission)
Novel and unique environments derived from niche construction, which include seemingly simple factors such as changing school districts or moving to a wealthy neighborhood, might affect the phenotypic expression of quantitative traits through various components of genotype and epigenetic spaces (Furrow and Feldman 2014). As a result, both positive and negative influences of niche construction on health and longevity, cognition, and even the aptitude for learning may be expected at all levels of biological hierarchies—from individuals to populations. From this perspective, Fisher's linear model used in quantitative genetics can be extended to include niche construction and also to represent the components of phenotypic variation (e.g., Falconer and Mackay 1996). This can be represented as P = G + Er + Enc + G × Er + G × Enc, where P = phenotype; G = genotype; Er = regular environment; Enc = environment (created by) niche construction; G × Er = interaction of a genotype in a regular environment; G × Enc = genotype response to a constructed environment. The influence of the constructed environment on a given genotype may be represented as G = G × Enc = G* (Govindaraju et al. 2015). It is assumed that Enc (just as Er) would also influence the additive (A), dominant (D), and epistatic (E) components of the genotype (where G→ G*; genotype influenced by the constructed environment) and their interactions thereof (Falconer and Mackay 1996). Additionally, environmental factors themselves show different forms of interactions (often factorial), as seen in agronomic experiments (Fisher 1925). Niche construction could be extended to include culture, nutrition, social and economic status, etc., among individuals nested in families and families in ethnic groups spanning generations (Odling-Smee 2024). 
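The extended linear model P = G + Er + Enc + G × Er + G × Enc can be made concrete with a small numerical sketch. The Python code below is only an illustration under stated assumptions: each interaction term is modeled as a genotype-specific sensitivity multiplied by the environmental value, and the class, field names, and numbers are hypothetical rather than taken from the cited sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genotype:
    g: float        # genotypic value G
    sens_r: float   # assumed sensitivity to the regular environment (scales G x Er)
    sens_nc: float  # assumed sensitivity to the constructed environment (scales G x Enc)


def phenotype(genotype: Genotype, e_r: float, e_nc: float) -> float:
    """P = G + Er + Enc + GxEr + GxEnc, with interactions as sensitivity * environment."""
    g_x_er = genotype.sens_r * e_r
    g_x_enc = genotype.sens_nc * e_nc
    return genotype.g + e_r + e_nc + g_x_er + g_x_enc


# Two hypothetical genotypes: g1 has the higher G, g2 responds more to Enc.
g1 = Genotype(g=10.0, sens_r=0.5, sens_nc=0.2)
g2 = Genotype(g=8.0, sens_r=0.5, sens_nc=1.5)

# Same regular environment; constructed environment absent (0.0) or enriched (4.0).
poor_niche = (phenotype(g1, 1.0, 0.0), phenotype(g2, 1.0, 0.0))
rich_niche = (phenotype(g1, 1.0, 4.0), phenotype(g2, 1.0, 4.0))

print("without constructed niche:", poor_niche)
print("with enriched constructed niche:", rich_niche)
```

Under these assumptions the genotype with the lower G value ends up with the higher phenotype once the constructed environment is enriched, so a ranking based on G alone would not carry over between niches.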
As a complement to these results, Falconer (1952; Falconer and Mackay 1996) suggested in quantitative genetics that a quantitative trait (such as height or cognitive traits), when measured in two different environments, should be treated as two distinct but genetically correlated traits.
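Falconer's proposal is usually summarized by the genetic correlation between the "two traits": a value below one indicates that genotypes respond differently to the two environments. The Python sketch below computes a plain Pearson correlation on hypothetical genotypic values for five genotypes; the values and variable names are invented for illustration, and a real analysis would estimate the correlation from a designed experiment.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of values."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical genotypic values of five genotypes measured in environment 1.
env1 = [10.0, 11.0, 12.0, 13.0, 14.0]

# Environment 2, case (a): every genotype shifts by the same amount (no GxE).
env2_parallel = [v + 5.0 for v in env1]

# Environment 2, case (b): genotypes change rank (GxE present).
env2_reranked = [17.0, 19.0, 15.0, 18.0, 16.0]

r_parallel = pearson(env1, env2_parallel)
r_reranked = pearson(env1, env2_reranked)

print("correlation, parallel shift:", round(r_parallel, 3))
print("correlation, re-ranked:", round(r_reranked, 3))
```

When the correlation is far from one, as in the re-ranked case, performance in one environment is a poor guide to performance in the other.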
Harden suggests we use genetics as a guide to choose the environment, so that those with a favorable array of genetic variants (rank ordered) according to the polygenic educational index will flourish; genetic isopleths could be drawn on a country's rural and urban landscapes to guide regional and national educational programs. "Other environmental inequalities could be similarly diagnosed using genetic data … Which schools have the lowest rates of disciplinary problems among youth who are currently at most genetic risk for aggression, delinquency, or substance use problems; which areas of the county are 'opportunity zones' is defined not solely in terms of... children from low-income families ..., but also in terms of how children who are genetically at risk for school problems [fare]" (Harden 2021, 242). It is not at all clear that this will lead to a favorable result. "The fallacy is that a knowledge of the heritability [additive genetic variance/PRS] of some trait in a population provides an index of the efficacy of environmental or clinical intervention in altering the trait either in individuals or in the population as a whole" (Lewontin 2006).
Niche construction and emergence: transcending genetics
As seen above, the niche construction concept suggests that individuals not only could construct their own niche but also could influence their development and growth through sib-effects, interactions between genetically unrelated individuals, indirect genetic effects, and other types of interactions (Bijma and Wade 2008; Fogarty and Wade 2022). Most importantly, complex and composite traits are terminal and telescopic features of organisms that show ontogenetic relationships such as functional integration, interaction, and modification with other preceding traits (from the cell through organs and organ systems) during the course of development and growth throughout their life span. Like all complex systems in nature, these systems are characterized by: (i) complexity, (ii) cooperation, (iii) hierarchy, (iv) self-governance, (v) emergent behavior, (vi) real-time decision making, and (vii) redundancy, robustness, resilience, and plasticity (Mitchell 2009).
Emergence, in particular, is a prime property of all biological systems in which the behavior of networks of complex traits often (positively) deviates from that of their component parts (Kauffman 1993; Nijhout et al. 2017; Alcala-Corona et al. 2021), which provides redundancy and flexibility to developing organisms. The Bijma-Fogarty-Wade model suggests that components of niche construction, particularly social environment and indirect genetic effects within an individual, could swamp the influence of additive genetic variance on a trait and in turn drive the traditional heritability bounds (of even 1.0) "out of the ballpark."
Furthermore, these emergent developmental processes could potentially boost the plasticity of individuals through epigenetic and metabolic reprogramming both within and even across generations (Fitz-James and Cavalli 2022). From a quantitative genetic perspective, the combined additive and non-additive genetic variance and covariances, as well as epigenetic properties, may be reflected in the observed, latent and emergent properties and ultimate expression of genetic, epigenetic, and morphological traits. These could help push the phenotypic performance of a given trait beyond the upper ceiling of heritability. Although Harden gives a lukewarm treatment to Urie Bronfenbrenner’s bioecological model of development, the Fogarty-Wade niche-construction model and emergent properties of biological systems appear to explain why "adopting a child into foster care causes higher IQ" (Harden 2021, 107) and why one cannot rely on a static DNA statistic in order to determine a child’s future. Clearly, "DNA is not their destiny, it is only a part of what they are" (King 2024). Epigenetic, somatic, cultural, and ecological inheritances and niche construction can affect a wide variety of human traits, including cognitive abilities of children (Lala and Feldman 2024). Human societies offer extraordinary opportunities to bring about both positive and negative changes in almost every aspect of human biology, including a capacity to learn, earn and prosper!
Our shared responsibility
A half-century later, Lewontin's (1974, 318) claim should be affirmed: "The fitness of a single locus ripped from its interactive context is about as relevant to real problems of evolutionary (and human) genetics as the study of the psychology of individuals isolated from their social context is to an understanding of man’s sociopolitical evolution. In both cases context and interaction are not simply second-order effects to be superimposed on a primary monadic analysis. Context and interaction are of the essence."
In this ever-expanding enterprise of genetics research and debates, all of us (evolutionary biologists, geneticists, molecular biologists, physicians, editors, publishers, journalists, entrepreneurs, and others) have been creating episodic "irrational exuberance" for over a century. Many influential social scientists, along with some reputable geneticists such as C. D. Darlington, have followed Herbert Spencer's lead by directly applying core ideas from genetics and evolution to human beings. Harden's work, aimed at a wide audience, reflects ongoing specialist interest among geneticists and the new discipline of data science, which is to a large extent as old as the science of genetics itself! This is seen, for instance, in a recent study in Nature Human Behaviour linking specific patterns of responses to surveys with a supposed genetic basis using PRS (Mignona et al. 2023). Due no doubt to the prestige of Nature, the results of this paper are reported in a news source under the title "Using genetics to gain better insights into human behavior" ("Using Genetics" 2023). There is no mention of methodological questions about PRS, let alone questions about identifying a specific genetic basis for specific behavior. As discussed, genes alone do not determine a child's success in life; factors such as access to education, neighborhood, wealth, and social networks play a far greater role. The use of genetic indices such as PRS poses the risk of justifying inequality, especially in traditionally socially structured hierarchical societies (Mayo and Nanjundaiah 2024). More tellingly, "One’s genetic makeup or the family and societal environment into which one is born does not dictate one’s intrinsic value. The genetic variants that matter for income, and their effects, depend on the environment, that is, on what skills are valued by the labor market and by society. As the labor market changes or as government policies change, so can the variants and their effects" (Kweon et al. 2025).
We suggest that this on-going promotion of a gene-centric point of view is something like the "winner’s curse" concept, according to which the winning bid on an item exceeds its value. Either way, society is caught in between and bears the brunt. Like many predictive genetic indices, such as heritability estimates, the application of PRS to the educability of a specific child or even a group of children is untestable and irreplicable in many environments (laid out in well-designed experiments with adequate replications, controlling both genetic and environmental variation, as done in most agricultural multilocation trials) over the stage, age, gender, birth order, and geographical origins and movements of individual children. As described throughout this paper, advocating genetic certainty in the face of extreme biological, social, environmental, and other capricious and concentric circles of uncertainties involved in the future educational and financial prosperity of individual children based on the low and fluctuating levels of PRS scores is frightening, to say the least. It is analogous to a popular Sanskrit dictum, Lalata likhitam ("fate written on the forehead")—no one can change it!
Listen to the Red Queen’s gentle whisper to Alice, "Now, here, you see, it takes all the running you can do, to keep in the same place." But now, such musings may be replaced by numerical hieroglyphs derived from counting the number of nucleotides in every child that aspires to get an education—old adages march draped in genetic clothing. This brings Albert Camus’ (on a relevant note, Camus was born extremely poor and won the Nobel Prize for literature at the age of 44) portrayal of Sisyphus to mind. Sisyphus rolls the boulder of his fate to the mountain top, but it rolls down to the foot of the mountain. Sisyphus descends with it, sighs momentarily at the static boulder, and rolls it back again to the summit. He is stuck in a perpetual ordeal. Perhaps not—even fate can be propelled to ascend or descend in relation to gravity across newer valleys and scaling greater summits. To quote George Santayana, "Those who cannot remember (understand) the past are condemned to repeat it"—the choice is ours!
Availability of data and materials
No datasets were used or generated or analyzed for writing this article.
References
Achieve. NGSS High School Life Sciences. 2013. https://www.nextgenscience.org/sites/default/files/HS%20LS%20topics%20combined%206.13.13.pdf. Accessed 17 Aug 2024.
Adeyemo A, Balaconis MK, Darnes DR, Fatumo S, et al. Responsible use of polygenic risk scores in the clinic: potential benefits, risks and gaps. Nat Med. 2021;27:1876–84.
Alcala-Corona SA, Sandoval-Motta S, Espinal-Enriquez J, Hernandez-Lemus E. Modularity in biological networks. Front Genet. 2021;12:701331.
Alvarez-Castro J. Genes, environments and interactions: evolutionary and quantitative genetics brought up-to-date. Berlin: Springer; 2024.
Anonymous. 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015;526:68–74.
Anonymous. NYC language map 2021. https://languagemap.nyc/.
Asbury K, Plomin R. G is for genes. Chichester: Wiley; 2014.
Bakalar N. Healthy lifestyle may reduce dementia risk. The New York Times. 2019.
Bann D, Wright L, Hardy R, Williams DM, Davies NM. Polygenic and socioeconomic risk for high body mass index: 69 years of follow-up across life. PLoS Genet. 2022;18(7):e1010233. https://doi.org/10.1371/journal.pgen.1010233.
Belbin GM, Cullina S, Wenric S, Gillian M, Soper ER, Glicksberg BS, Torre D, et al. Toward a fine-scale population health monitoring system. Cell. 2021;184:2068–83.
Benning JW, Carlson J, Smith OS, Shaw RG, Harpak A. Confounding fuels hereditarian fallacies. bioRxiv. 2024. https://doi.org/10.1101/2023.11.01.565061.
Bergmüller R, Taborsky M. Animal personality due to social niche specialization. Trends Ecol Evol. 2010;25:504–11.
Bijma P, Wade MJ. The joint effects of kin, multilevel selection and indirect genetic effects on response to genetic selection. J Evol Biol. 2008;21:1175–88.
Block N. How heritability misleads about race. Cognition. 1995;56:99–128.
Boerwinkle E, Chakraborty R, Sing CF. The use of measured genotype information in the analysis of quantitative phenotypes in man. Ann Hum Genet. 1986;50:181–94.
Bronfenbrenner U. The ecology of human development: experiments by nature and design. Cambridge: Harvard University Press; 1981.
Castle WE. An improved method of estimating the number of genetic factors concerned in cases of blending inheritance. PNAS USA. 1921;81:6904–7.
Chekroun I, Rabea F, Jain R. Premarital genomic screening in Arab populations of the Middle East. Nat Med. 2025.
Chetty R, Hendren N. Ensuring the American Dream. Financ Dev. 2022;38–41.
Clausen JD, Keck D, Hiesey WM. Experimental studies on the nature of species. I. Effects of varied environments on western North American plants. Washington, D.C.: Carnegie Institution of Washington; 1940. Publ. no. 520:1–452.
Cockerham CC. An extension of the concept of partitioning hereditary variance for analysis of covariances among relatives when epistasis is present. Genetics. 1954;39:859–82.
College Board. AP biology course and exam description. 2020. https://apcentral.collegeboard.org/media/pdf/ap-biology-course-and-exam-description.pdf. Accessed 17 Aug 2024.
Crow JF, Kimura M. Efficiency of truncation selection. PNAS USA. 1979;76:396–9.
Darlington CD. The control of evolution in man. Eugenics Rev. 1958;50(3):169–78.
Davenport CB. Euthenics and eugenics. Popular Sci Monthly. 1911;78:16–20.
Dawkins R. The selfish gene. Oxford: Oxford University Press; 1976.
Dawkins R. The extended phenotype: the long reach of the gene. Oxford: Oxford University Press; 1982.
Ding Y, et al. Polygenic scoring accuracy varies across the genetic ancestry continuum. Nature. 2023;618:774–81.
Dobzhansky T. Nothing in biology makes sense except in the light of evolution. Am Biol Teach. 1973;35:125–9.
Donovan BM, Weindling M, Lee DM. From basic to humane genomics literacy: how different types of genetics curricula could influence anti-essentialist understandings of race. Sci Educ. 2020;29:1479–511. https://doi.org/10.1007/s11191-020-00171-1.
Donovan BM, Weindling M, Salazar B, Duncan A, Stuhlsatz M, Keck P. Genomics literacy matters: supporting the development of genomics literacy through genetics education could reduce the prevalence of genetic essentialism. J Res Sci Teach. 2021;58:520–50. https://doi.org/10.1002/tea.21670.
Elhaik E. Empirical distributions of FST from large-scale human polymorphism data. PLoS ONE. 2012. https://doi.org/10.1371/journal.pone.0049837.
Elhaik E. Principal component analysis (PCA)—based findings in population genetic studies are highly biased and must be reevaluated. Sci Rep. 2022;12:14683.
Falconer DS. The problem of environment and selection. Am Nat. 1952;86:293–8.
Falconer DS. A note on Fisher’s ‘average effect’ and ‘average excess.’ Genet Res. 1985;46:337–47.
Falconer DS, Mackay TFC. Introduction to quantitative genetics. Harlow: Addison Wesley Longman; 1996.
Feil R, Fraga MF. Epigenetics and the environment: emerging patterns and implications. Nat Rev Genet. 2012;13:97–109.
Feinberg AP. Phenotypic plasticity and the epigenetics of human disease. Nature. 2007;447:433–40.
Feldman MW, Lewontin RC. The heritability hang-up. Science. 1975;190:1163–8.
Fisher RA. The correlation between relatives on the supposition of Mendelian inheritance. Trans Roy Soc Edinburgh. 1918;52:399–433.
Fisher RA. Statistical methods for research workers. Edinburgh: Oliver and Boyd; 1925.
Fisher RA. The genetical theory of natural selection. Oxford: The Clarendon Press; 1930.
Fisher RA. Average excess and average effect of a gene substitution. Ann Eugen. 1941;11:53–63.
Fitz-James MH, Cavalli G. Molecular mechanisms of transgenerational epigenetic inheritance. Nat Rev Genet. 2022;23(6):325–41. https://doi.org/10.1038/s41576-021-00438-5.
Fiévet JB, Nidelet T, Dillmann C, de Vienne D. Heterosis is a systemic property emerging from non-linear genotype-phenotype relationships: evidence from in vitro genetics and computer simulations. Front Genet. 2018;9:159.
Fogarty L, Wade MJ. Niche construction in quantitative traits: heritability and response to selection. Proc Roy Soc B Lond. 2022. https://doi.org/10.1098/rspb.2022.0401.
Furrow RE, Feldman MW. Genetic variation and the evolution of epigenetic regulation. Evolution. 2014;68:673–83.
Galton F. Inquiries into human faculty and its development. London: Macmillan; 1883.
Galton F. Eugenics: its definition, scope and aims. Am J Sociol. 1904;10:1–25.
Garrod A. The incidence of alkaptonuria: a study in chemical individuality. Lancet. 1902;130:1616–20.
Gericke N, Carver R, Castéra J, Evangelista NAM, Marre CC, El-Hani CN. Exploring relationships among belief in genetic determinism, genetics knowledge, and social factors. Sci Educ. 2017;26:1223–59. https://doi.org/10.1007/s11191-017-9950-y.
Gericke N, McEwen B. Defining epigenetic literacy: how to integrate epigenetics into the biology curriculum. J Res Sci Teach. 2023. https://doi.org/10.1002/tea.21856.
Gluckman P, Hanson M. Developmental origins of health and disease. Oxford: Oxford University Press; 2006.
Godfrey KM, Costello PM, Lillycrop KA. Development, epigenetics and metabolic programming. Nestle Nutr Inst Workshop Ser. 2016;85:71–80.
Goldstein AG. Darwinism. In: Montgomery G, Largent M, editors. A companion to the history of American science. New York: Wiley Blackwell; 2015.
Govindaraju DR. An elucidation of over a century old enigma in genetics-Heterosis. PLoS Biol. 2019. https://doi.org/10.1371/journal.pbio.3000215.
Govindaraju DR, Atzmon G, Barzilai N. Genetics, longevity, and lifespan: Lessons from centenarians. Appl Transl Genomics. 2015;4:23–32.
Grotzinger AD, Keller MC. Potential bias in genetic correlations: mating patterns across two traits can inflate estimates of genetic overlap. Science. 2022;378:709–10.
Gupta AP, Lewontin RC. Study of reaction norms in natural populations of Drosophila pseudoobscura. Evolution. 1982;36:934–48.
Hamamy H, et al. Consanguineous marriages, pearls and perils: Geneva international consanguinity workshop report. Genet Med. 2011;13:841–7.
Hamilton WD. The genetical evolution of social behaviour. II. J Theor Biol. 1964;7:17–52.
Handy DE, Castro R, Loscalzo J. Epigenetic modifications: basic mechanisms and role in cardiovascular disease. Circulation. 2011;123:2145–56. https://doi.org/10.1161/CIRCULATIONAHA.110.956839.
Harden KP. Why progressives should embrace the genetics of education. New York: The New York Times; 2018.
Harden KP. The genetic lottery: why DNA matters for social equality. Princeton: Princeton University Press; 2021.
Haworth S, Mitchell R, Corbin L. Apparent latent structure within the UK Biobank sample has implications for epidemiological analysis. Nat Comm. 2019;10:1–9.
Herrnstein RJ, Murray C. The bell curve. New York: Free Press; 1994.
Herzig AF, Clerget-Darpoux F, Génin E. The false dawn of polygenic risk scores for human disease prediction. J Pers Med. 2022;12:1266–77.
Hill WG, Goddard ME, Visscher PM. Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet. 2008;4: e1000008.
Houle D, Govindaraju DR, Omholt S. Phenomics: the next challenge. Nat Rev Genet. 2010;11:855–66.
Hutchinson GE. Concluding Remarks. Cold Spring Harb Symp Quant Biol. 1957;22:415–27.
Ihara Y, Feldman MW. Cultural niche construction and the evolution of small family size. Theor Popul Biol. 2004;65:105–11.
Jamieson A, Radick G. Genetic determinism in the genetics curriculum: an exploratory study of the effects of Mendelian and Weldonian emphases. Sci Educ. 2017;26:1261–90. https://doi.org/10.1007/s11191-017-9900-8.
Janssens ACJW. Validity of polygenic risk scores: are we measuring what we think we are? Hum Mol Genet. 2019;28:R143–50.
Kacser H, Burns JA. The molecular basis of dominance. Genetics. 1981;97:639–66.
Kaiser MI, Gadau J, Kaiser S, Müller C, Richter SH. Individualized social niches in animals: theoretical clarifications and processes of niche change. Bioscience. 2024;74:146–58.
Kampourakis K. Making sense of genes. Cambridge: Cambridge University Press; 2017.
Kauffman SA. The origins of order. Self-organization and selection in evolution. Oxford: Oxford University Press; 1993.
Kempthorne O. An introduction to genetic statistics. Oxford: Wiley; 1957.
Kempthorne O. A Biometrics invited paper: logical, epistemological and statistical aspects of nature-nurture data interpretation. Biometrics. 1978;34:1–23.
Kempthorne O. Status of quantitative genetic theory. In: Pollak E, Kempthorne O, Bailey TB, editors. Proceedings of the international conference on quantitative genetics. Ames, IA; 1977. p. 719–60.
Kevles DJ. In the name of eugenics: genetics and the uses of human heredity. New York: Knopf; 1985.
King T. Someone’s DNA is not their destiny; it is only a part of who they are. Irish Examiner. 2024;14.
Kolata G. To prevent heart attacks, doctors try a new genetic test. New York: The New York Times; 2023.
Kruuk LEB, Hadfield JD. How to separate genetic and environmental causes of similarity between relatives. J Evol Biol. 2007;20:1890–903.
Kumuthini J, Zick B, Balasopoulou A, et al. The clinical utility of polygenic risk scores in genomic medicine practices: a systematic review. Hum Genet. 2022;141(11):1697–704.
Kweon H, Burik CAP, Ning Y, et al. Associations between common genetic variants and income provide insights about the socio-economic health gradient. Nat Hum Behav. 2025. https://doi.org/10.1038/s41562-024-02080-7.
Lala KN, Feldman MW. Genes, culture, and scientific racism. PNAS USA. 2024;121:1–10.
Lande R. The minimum number of genes contributing to quantitative variation between and within populations. Genetics. 1981;99:541–53.
Lennon NJ, et al. Selection, optimization, and validation of ten chronic disease polygenic risk scores for clinical implementation in diverse US populations. Nat Med. 2024;30:480–7. https://doi.org/10.1038/s41591-024-02796-z.
Lewontin RC. The genetic basis of evolutionary change. New York: Columbia University Press; 1974.
Lewontin RC. The triple helix: gene, organism and environment. Cambridge: Harvard University Press; 2000.
Lewontin RC. The analysis of variance and the analysis of causes. Int J Epidemiol. 2006;35:520–5.
Lewontin RC. Gene, organism, and environment. In: Bendall DS, editor. Evolution from molecules to men. Cambridge: Cambridge University Press; 1983.
Lewontin RC. The apportionment of human diversity. In: Dobzhansky T, Hecht MK, Steere WC, editors. Evolutionary biology. New York: Springer; 1972. p. 381–98.
Lush JL. Animal breeding plans. Ames: Collegiate Press; 1937.
Mayhew AJ, Meyre D. Assessing the heritability of complex traits in humans: methodological challenges and opportunities. Curr Genomics. 2017;18:332–40.
Mayo O, Nanjundiah V. Reflections on assortative mating, social stratification, and genetics. J Genet. 2024;103:15.
Mignogna G, et al. Patterns of item nonresponse behaviour to survey questionnaires are systematic and associated with genetic loci. Nat Hum Behav. 2023. https://doi.org/10.1038/s41562-023-01632-7.
Mills MC, Rahal C. A scientometric review of genome-wide association studies. Commun Biol. 2019. https://doi.org/10.1038/s42003-018-0261-x.
Mills MC, Tropf FC. Sociology, genetics, and the coming of age of sociogenomics. Ann Rev Sociol. 2020;46:553–81.
Mitchell M. Complexity: a guided tour. Oxford: Oxford University Press; 2009.
Nijhout HF, Sadre-Marandi F, Best J, Reed MC. Systems biology of phenotypic robustness and plasticity. Integr Comp Biol. 2017;57:171–84.
Odintsova V, Rebattu V, Hagenbeek FA, Pool R, Beck JJ, Ehli EA, et al. Predicting complex traits and exposures from polygenic scores and blood and buccal DNA methylation profiles. Front Psychiatry. 2021;12:688464.
Odling-Smee J. Niche construction: how life contributes to its own evolution. Cambridge: MIT Press; 2024.
Odling-Smee J, Erwin DH, Palkovacs EP, Feldman MW, Laland KN. Niche construction theory: a practical guide for ecologists. Q Rev Biol. 2013;88:4–28.
Otto SP, Jones CD. Detecting the undetected: estimating the total number of loci underlying a quantitative trait. Genetics. 2000;156:2093–107.
Palstra FP, Fraser DJ. Effective/census population size ratio estimation: a compendium and appraisal. Ecol Evol. 2012;2:2357–65.
Pearson K. On lines and planes of closest fit to systems of points in space. Philos Mag. 1901;559–72.
Plomin R. Blueprint: how DNA makes us who we are. Cambridge: MIT Press; 2018.
Ramazi S, Allahverdi A, Zahiri J. Evaluation of post-translational modifications in histone proteins: a review on histone modification defects in developmental and neurological disorders. J Biosci. 2020;45:135.
Rawls J. A Theory of Justice. Cambridge: Harvard University Press; 1971.
Rice TK. Familial resemblance and heritability. Adv Genet. 2008;60:35–49.
Risch N, Merikangas K. The future of genetic studies of complex human diseases. Science. 1996;273:1516–7.
Robertson A. A theory of limits in artificial selection. Proc Royal Soc London Series B Biol Sci. 1960;153:234–49.
Robertson A. A mathematical model of the culling process in dairy cattle. Anim Prod. 1966;8:95–108.
Robinson M, Hemani G, Medina-Gomez C, et al. Population genetic differentiation of height and body mass index across Europe. Nat Genet. 2015;1357–62.
Rosa EM, Tudge J. Urie Bronfenbrenner’s theory of human development: its evolution from ecology to bioecology. J Family Theory Rev. 2013;5:243–58.
Rutherford A. Control: the dark history and troubling present of eugenics. New York: W. W. Norton and Co.; 2022.
SEPUP. Issues and science. 3rd ed., revised. Berkeley: Lab-Aids Inc.; 2020. https://sepup.lawrencehallofscience.org/curricula/middle/.
Schmalhausen II. The factors of evolution. Philadelphia: Blakiston; 1949.
Slatkin M. Linkage disequilibrium—understanding the evolutionary past and mapping the medical future. Nat Rev Genet. 2008;9:477–85. https://doi.org/10.1038/nrg2361.
Smart A. A field at the crossroads: genetics and racial mythmaking. In: Zeller T, editor. The persistence of race science. Cambridge, MA; 2022. https://race.undark.org/articles/a-field-at-a-crossroads-genetics-and-racial-mythmaking.
Smith GD, Ebrahim S. Mendelian randomization: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol. 2003;32:1–22.
Stern F, Kampourakis K. Teaching for genetics literacy in the post-genomic era. Stud Sci Educ. 2017;53(2):193–225. https://doi.org/10.1080/03057267.2017.1392731.
Sul JH, Martin LS, Eskin E. Population structure in genetic studies: confounding factors and mixed models. PLoS Genet. 2018;14: e1007309.
Tam V, Patel N, Turcotte M, Bossé Y, Paré G, Meyre D. Benefits and limitations of genome-wide association studies. Nat Rev Genet. 2019;20:467–84.
Templeton A. The general relationship between average effect and average excess. Genet Res. 1987;49:69–70.
Tournebize R, Chu G, Moorjani P. Reconstructing the history of founder events using genome-wide patterns of allele sharing across individuals. PLoS Genet. 2022;18: e1010243.
Turnbull C, Firth HV, Wilkie AOM, Newman W. Population screening requires robust evidence—genomics is no exception. The Lancet. 2024;583–6.
Using genetics to gain better insights into human behavior. News: Medical Life Sciences. 2023. 30 June. https://www.news-medical.net/news/20230630/Using-genetics-to-gain-better-insights-into-human-behavior.aspx.
Visscher PM, Hill WG, Wray NR. Heritability in the genomics era—concepts and misconceptions. Nat Rev Genet. 2008;9:255–66.
Visscher PM, Goddard ME. From R.A. Fisher's 1918 paper to GWAS a century later. Genetics. 2019;211:1125–30.
Waddington CH. The epigenotype. Endeavour. 1942;1:18–20.
Wagner GP. Homologues, natural kinds and the evolution of modularity. Amer Zool. 1996;36:36–43.
Wagner C, et al. Life course epidemiology and public health. Lancet Public Health. 2024;9:e261–9.
Wald NJ, Old R. The illusion of polygenic disease risk prediction. Genet Med. 2019;21:1705–7.
Walsh B, Lynch M. Genetics and analysis of quantitative traits. Sunderland: Sinauer Associates; 1998.
Weinreich D. The rank ordering of genotypic fitness values predicts genetic constraint on natural selection on landscapes lacking sign epistasis. Genetics. 2005;171:1397–405.
Westfall J, Yarkoni T. Statistically controlling for confounding constructs is harder than you think. PLoS ONE. 2016;11: e0152719.
Wilson EO. Sociobiology: the new synthesis. Cambridge: Harvard University Press; 1975.
Wray NR, Goddard ME, Visscher PM. Prediction of individual genetic risk to disease from genome wide association studies. Genome Res. 2007;17:1520–8.
Wright S. The relative importance of heredity and environment in determining the piebald pattern of Guinea-Pigs. PNAS. 1920;6:320–32.
Wright S. Correlation and causation. Part 1. Method of path coefficients. J Agric Res. 1921;20:557–85.
Wright S. Physiological and evolutionary theories of dominance. Am Nat. 1934;68:24–53.
Xavier MJ, Roman SD, Aitken RJ, Nixon B. Transgenerational inheritance: how impacts to the epigenetic and genetic information of parents affect offspring health. Hum Reprod Update. 2019;25:518–40.
Yengo L, Vedantam S, Marouli E, et al. A saturated map of common genetic variants associated with human height. Nature. 2022;610:704–12.
Acknowledgements
This paper is dedicated to the memory of the late Charles Goodnight and Richard Lewontin, for their seminal contributions to evolutionary genetics and their clear thinking on the origins and distribution of genetic variation in relation to both natural and constructed environments. I (DRG) treasure my interactions and discussions with both of them on diverse topics and express my gratitude to Drs. Laurel Fogarty, Robert Perlman, Sri Raj, Vidyanand Nanjundiah, Rama Singh, and Michael Wade for their suggestions, as well as to Dr. David Haig for much-needed support.
Funding
The authors received no financial support from any source for this work.
Author information
Authors and Affiliations
Contributions
Both authors contributed equally to the manuscript. Further, we extend our sincere apologies to any authors who may perceive our views as overly critical or dismissive. Our intention is not to undermine differing perspectives but to contribute to a constructive dialogue. We maintain deep admiration for the contributions of all scholars in this field. The ideas expressed in this paper are our own and do not represent those of our host institutions.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
The appendix provides tables summarizing the main claims of each section.
Table 1 Claims that single nucleotide substitutions should inform social and education policy reinforce essentialism and genetic determinism, and can also constitute a form of eugenics.
Table 2 Genome-wide association studies are not useful for predicting the behavior of complex traits, because Fisherian idealizations are not satisfied by them, especially in light of small effect sizes of single nucleotide substitutions.
Table 3 Heritability, a traditional measure of the influence of a gene or genes on a phenotype, cannot be used in cases of individuals, because there are too many confounding effects, it has high variability, and its effect sizes are small.
Table 4 Epigenetics (developmental processes affecting phenotypes) and niche construction (exercise of choice in social contexts especially) affect the influence of gene action.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Govindaraju, D.R., Goldstein, A.M. The elusive associations of nucleotides with human success: evolutionary genetics in education and social policies. Evo Edu Outreach 18, 4 (2025). https://doi.org/10.1186/s12052-025-00218-3
DOI: https://doi.org/10.1186/s12052-025-00218-3