Week 2: We’ve Been Wrong Before

Behavior genetics is perhaps best known for our spectacular failures, both in terms of venturing into public policy (eugenics) and producing a popular treasure trove of false-positive results (classic candidate gene studies) that are, unfortunately, the only BG research most people have ever heard of. So this is the “history of the field” week.


  • Become familiar with the history of eugenic programs in the US and worldwide
  • Understand the history, methods, and limitations of candidate gene studies

Lecture Notes

A Very Brief History of Eugenics

In 1883, Francis Galton, who is considered the father of modern behavior genetics and happened to be Charles Darwin’s cousin, coined the term “eugenics,” meaning, from the Greek terms eu- (“good”) and genos (“birth” or “stock”), roughly “good genes.” The early proposals for advancing eugenics were government incentives (essentially, payments) for “eminent” people to marry and have children. Galton defined eminence as being rich (coming from rich families), educated (certainly not universally available), and having an eminent profession (like a doctor, judge, scientist - again, not options that were equally available to everyone).

The past and present of human behavior genetics are inextricably linked with the history of eugenics, and not only because behavior genetics has been used to justify eugenic policies, which have overwhelmingly been abusive and violent, and targeted primarily at racial/ethnic/religious minorities, women, and people with disabilities. The reality is that the men and women who advocated for eugenic policies also developed the methods that we still use today, often with the express purpose of producing data and statistics to support their advocacy. We cannot simply stop using their methods – not just within behavior genetics, but across the sciences overall: Karl Pearson, who developed the correlation coefficient (the Pearson correlation), was Galton’s protege (Wikipedia’s word) and developed the statistic to formally quantify Galton’s observation that parents and children tended to be similar on “eminence,” which he took as necessarily indicative of biological transmission. We cannot simply erase these men from our history or our science, so we need to acknowledge the bad with the good, and use knowledge of the past to consider how we can do better now and in the future.

When most people think of eugenics, they think of the Nazis. In the lead-up to and during World War II, the Nazis enacted policies ranging from propaganda encouraging folks to be aware of family history of mental or physical illness when deciding whom to marry and have children with, to laws forbidding whites from marrying non-whites, making abortion illegal for women deemed to be ‘fit’ (of good, presumably genetic, quality), and requiring registration of disabilities and illnesses, which led to the forced surgical sterilization of over 400,000 people. And this is all in addition to the Holocaust, in which 6 million Jews and millions of other people from marginalized groups, including the disabled, the Romani, and people suspected of being gay, were murdered.

What is less often discussed is that many of these policies, including forced sterilization, were in fact based on existing US policies. And the Nazi versions of the US policies were not necessarily more severe or violent. For example, the US laws forbidding interracial marriage defined “white” much more strictly. Racial and ethnic identification in the US has typically followed “the one-drop rule”: one drop of non-white blood (that is, any evidence of non-white heritage) excluded a person from the “white” demographic category.

The US was the first country in which eugenic sterilization policies were enacted. Around the turn of the 1900s, several states considered increasingly far-reaching legislation, until in 1907 the first such law was enacted in Indiana. In 1927, the US Supreme Court upheld (8-1) Virginia’s eugenic sterilization policy in the case of Buck v. Bell. Similar policies then spread quickly across the US.

Non-eugenic sterilization exists, as well. Therapeutic sterilization typically refers to cases where women with extremely painful periods or uncomfortable or potentially cancerous tumors choose to have the affected organs removed, or cases where people voluntarily, with full informed consent, choose surgical interventions (tubal ligation or vasectomy) as a form of permanent birth control. Punitive sterilization is typically non-surgical (e.g. “chemical castration”) and used as a condition of release for repeat sex offenders. Eugenic sterilization, however, is specifically targeted at controlling the reproduction of others, regardless of their consent, with the goal of reducing the rates of mental and physical disabilities in the population. Over the course of the six or so decades that eugenic sterilization policies were in place in the US, over 60,000 people were forcibly sterilized. Overwhelmingly, the people targeted by the state-directed eugenics boards that made such decisions were women or girls, especially those who were poor or Black.

After World War II and the widespread revelation of the Nazis’ atrocities, public opinion of eugenic sterilization declined. But these policies didn’t simply disappear – the last eugenic sterilization in the US occurred in Oregon in 1981. 1981. That is not a typo. 1981.

This is not ancient history. Many of the victims of these policies are still alive. In the Prep Work for this week, you’ll hear members of these state eugenics boards discussing how they made these decisions. You’ll meet a woman who was sterilized by a eugenics board as a child after she became pregnant from a rape. And you’ll meet her son, who was born by c-section at the same time that she was surgically sterilized. The state that did this to her, North Carolina, has committed $10 million in compensation for victims of its eugenics program. Even if all 7,600 people sterilized by that state were alive and able to claim their share, that comes out to about $1,316 per person.

Candidate Genes

A cautionary tale of doing what we can with the technology available at the time

In more recent decades, the (intentional, widespread) involvement of behavior genetics researchers in public policy has been substantially reduced. Nonetheless, research from within the field tends to still gain widespread attention - the search for biological bases of behavior, genetic or otherwise, continues to hold our fascination worldwide. In the late 20th century, the development of novel technologies allowed us to partially read small parts of the genome (to describe this process broadly, we tend to use the word “genotype” as a verb, as shorthand for “to read the genotype”).

Example image of PCR gel electrophoresis showing several columns with variable distributions of horizontal black bands

In the olden days, we weren’t able to obtain full genome reads, however. Genotyping required targeting specific sequences within the DNA, cutting them away from the rest of the surrounding DNA, and investigating them intentionally. Similar techniques can be used for a variety of applications in genetic research and application (including cutting DNA into small bits, running them across an electrophoresis gel, and comparing the results across two individuals, say a suspect and DNA found at a crime scene - a process you’ve likely seen depicted in countless crime dramas). In behavior genetics, our application of this technology took the form of candidate gene research - if we knew in advance what gene (or, even more specifically, which variant within a gene) we wanted to study, we could cut out just that part, amplify it, and determine the genotype at that targeted location.

The classic candidate genes are the variants that you’re most likely to have heard are associated with human behavior. 5-HTT-LPR, MAOA-uVNTR, COMT-Val158Met, DRD4-7R all emerged as major candidate genes for human behavior in the 1990s and 2000s. The term candidate gene, however, is a pretty serious misnomer. Genes are huge, complicated things, with many sections that can be read in many variable ways (this is how your same DNA in every cell of your body can create so many different tissues and organs - the genotype is unchanged, but how it is read differs across tissue types). And even their “thing”-ness is questionable; debates around how to define “a gene” are still very much around, as we develop our understanding of the variable reads (including alternative start-stop locations, meaning that genes are not only next to each other but can also overlap), effects of regulatory elements upstream and sometimes downstream of the genes, and the role of intronic elements (the non-coding parts of the genes, in contrast with the coding exonic regions). Candidate genes would be more accurately called candidate VARIANTS. The term “candidate gene” seldom refers to an entire gene; rather, it typically refers to the targeted investigation of a single variant (from potentially hundreds) within a gene. (The specification or label of exactly which variant within the whole gene is what is labeled after the “-” in my gene list at the start of this paragraph.)

So, how did we choose which genes and which variants to go after? The genes were relatively easy to compile a shortlist of. We looked to non-human animal research, which had been investigating the consequences of removing an entire gene from model organisms, like mice, fruit flies, and worms. Luckily, much of the genome is conserved across species during evolution. That is, most of our DNA isn’t working at making us specifically human or specifically ourselves, it’s just trying to keep the basic functions of life going (breathing, eating, making sure our cells stick together). As a result, we share many genes (and biological systems) with other species, so when we find a gene that, say, substantially alters how dopamine is processed, and we know from dopamine administration trials in humans and non-humans that dopamine changes behavior (including things like risk-taking) we might be optimistic that the particular gene identified in the model organisms would be an interesting target within humans. Picking the specific variant within a given gene was more tricky, especially in the late 20th century when model organism research (and technology) was largely limited to targeting the removal of whole genes (or even multi-gene segments). In genotyping, a whole gene was too much, especially if we wanted to characterize naturally occurring variation among individuals (not just gene presence versus gene absence). You might think that the variants within genes (your -LPRs, -uVNTRs, -Val158Mets, -7Rs, etc.) were selected because they were known to have the greatest impact on gene function (or even organism behavior). The reality is much more mundane and practical. Specific variants were chosen within these candidate genes for one primary reason - they were relatively easy to genotype.

DNA has a physical structure, with turns and folds and repetitions that make certain parts harder to physically access than others. For the most part, what we refer to as candidate genes were selected because they were one of the hundreds of potential variants within large genes of overall consequence (that is, you don’t want to be missing the gene entirely) that could be pretty easily genotyped in a lab by someone with relatively little training. In many cases, it was because the variant regions were large - most of the variants targeted by candidate gene studies are what’s known as variable number tandem repeats (VNTRs), which means a short sequence of As, Cs, Ts, and/or Gs repeats a variable number of times. This is the case for 5-HTT-LPR, where LPR stands for Linked Polymorphic Region (polymorphic meaning “many forms”); for MAOA-uVNTR, where the lower case u stands for “upstream,” identifying the specific VNTR we’re talking about; and for DRD4-7R, which refers to the 7 Repeat version of a VNTR that can commonly have either more or fewer than 7 repeats, but the 7R form is the one that’s been proposed to increase the likelihood of ADHD-like traits.
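If it helps to see the idea of a VNTR concretely, here is a minimal Python sketch that counts how many times a short motif repeats back-to-back in a DNA string. Everything in it is made up for illustration: the motif, the flanking sequences, and the function names are hypothetical, real repeat units are tens of base pairs long and often imperfect copies of one another, and labs at the time inferred repeat number from fragment length on a gel rather than by reading sequence like this.

```python
import re


def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    # One or more consecutive copies of the motif, matched as a single run.
    run_pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    longest = 0
    for run in run_pattern.finditer(sequence.upper()):
        longest = max(longest, len(run.group(0)) // len(motif))
    return longest


# Hypothetical toy region: fixed flanking sequence around a repeated motif.
FLANK_LEFT = "GGTCA"
FLANK_RIGHT = "TTGAC"
MOTIF = "ACGT"


def make_allele(n_repeats: int) -> str:
    """Build a toy allele carrying `n_repeats` copies of MOTIF."""
    return FLANK_LEFT + MOTIF * n_repeats + FLANK_RIGHT


allele_4r = make_allele(4)
allele_7r = make_allele(7)

for name, allele in [("4R", allele_4r), ("7R", allele_7r)]:
    repeats = count_tandem_repeats(allele, MOTIF)
    print(f"{name}: {repeats} repeats, {len(allele)} bases long")
```

The two toy alleles differ only in repeat count, but that makes them different lengths, and a length difference is exactly the kind of thing that shows up as bands at different positions on a gel.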

So, big stuff is easy to see/genotype, and it was all we could do at the time, better than nothing, right?

This week you’ll read in the Slate Star Codex (2019) blog post and the Duncan & Keller (2011) paper about the history of 5-HTT-LPR research and how we wasted immense money, time, and public attention chasing and defending false positives. I also want to give you some specific background on MAOA-uVNTR (“the warrior gene”), whose story runs very much parallel to 5-HTT-LPR, but happens to be the first gene that I ever personally published on, and so is, as a cautionary tale, quite near and dear to my heart. And for that reason, and because I’ve gone on quite long enough here, dear reader, I’ll be covering the story of my own now-presumed-false-positive MAOA finding in the Class Chat this week.

Prep Work

Below is a list of materials to review early in the week. Although these activities do not earn points, they will prepare you to undertake the Participation Activities and Course Project assignments.

  • Read: Loehlin, J. C. (2009). History of Behavior Genetics. In Y.-K. Kim (ed.), Handbook of Behavior Genetics, pp. 3-11. PDF download: https://link.springer.com/content/pdf/10.1007/978-0-387-76727-7_1.pdf
    • Fun fact 1 (from the article): the first organizational meeting that would become the Behavior Genetics Association took place at UIUC in 1970.
    • Fun(?) fact 2: Dr. Derringer is the incoming Treasurer of the Behavior Genetics Association.
  • Watch an interview with a survivor of the US eugenic sterilization program.
  • Skim the Wikipedia entry on the history of eugenics: https://en.wikipedia.org/wiki/History_of_eugenics
    • It’s long and detailed and includes examples from around the world.
  • Watch How to access scholarly papers (Video)

Participation Activities

You can earn up to 4 points for participation activities each week by selecting and completing tasks from the “menu” listed below. You may complete more than four tasks if you’d like, but the maximum number of points awarded will be 4 per week. Each activity is worth 1 point.

  • Journal: Eugenics
    • Write a brief response to the material you reviewed about Eugenics (including the Lecture Notes for Week 2, the video Eugenicist Movement in America: Victims Coming Forward, and the Wikipedia entry on the History of Eugenics). Answer both of the prompts below:
      • What were the TWO most surprising things you learned?
      • What is the ONE most important piece of information or lesson you take away from these materials?
  • Read & Discuss via Perusall: Slate Star Codex 2019 5-HTTLPR A Pointed Review. https://slatestarcodex.com/2019/05/07/5-httlpr-a-pointed-review/
    • This blog post provides an excellent overview of the empirical research and research culture that led to the rise and fall of 5-HTTLPR as a candidate gene “for” depression. There are a LOT more comments on the actual blog post, but I’ve only included here the response from the senior author on the replication project being discussed, Dr. Matt Keller.
  • Read & Discuss via Perusall: Duncan & Keller 2011 A critical review of the first 10 years of candidate gene-by-environment interaction research in psychiatry. American Journal of Psychiatry, 168(10), 1041-1049. https://doi.org/10.1176/appi.ajp.2011.11020191
    • This review article gives an introduction to one challenge in statistical model testing: power and sample size.
  • Read & Discuss via Perusall: Chabris et al 2015 The Fourth Law of Behavior Genetics. Current Directions in Psychological Science, 24(4), 304-312. https://doi.org/10.1177/0963721415580430
    • This review article describes the empirical evidence that the heritability of common human behaviors must be polygenic, or involve many genes each with a tiny individual effect.
  • Find & Share a recent HUMAN CANDIDATE GENE study to the News, Memes, and Everything In Between Discussion Forum.
    • Search Google Scholar for a HUMAN study published since 2017 that SPECIFICALLY examined one or more of the candidate genes listed below *
    • Post to the News, Memes, and Everything In Between Discussion Forum, using the following format:
      • Post subject = Title of the paper (N = the sample size of or number of participants in the study)
      • Body of the post:
        • The APA-formatted citation of the paper
        • A brief (no more than 1-2 sentences) description of their finding
        • A brief (no more than 1-2 sentences) description of whether the study is a replication of another study OR if it itself has been replicated (either successfully or unsuccessfully), including either:
          • That the paper was a replication or has been replicated (and listing the APA-formatted citation of the paper it replicated or was replicated by) and a 1-sentence description of the other paper’s result; OR
          • That the paper does not appear to be a replication or have a published replication attempt, and a brief (1 sentence) description of how you arrived at that conclusion.
    • List of candidate genes to look for: 5-HT2A; 5-HTT; AVPR1A; BDNF; CHRM2; COMT; DAT1; DISC1; DRD2; DRD3; DRD4; GABRA2; MAOA; OXTR
    • Hint 1: Restrict search results to articles published in/since 2017. Once you’ve started your search from the Google Scholar homepage, there will be a link in the top left of the results page that says “Since 2017”. Click it.
    • Hint 2: Look for an empirical article; that is, one that does an analysis of data (rather than a review article, which only summarizes previous studies). An empirical article will have a Methods section giving the number of participants, what measures were used (for both phenotypes and genotypes), and a description of how the data were analyzed.
    • Hint 3: To get the APA-formatted citation for a paper, click on the quotation mark icon below its listing in the search results.
    • Hint 4: To view papers that have cited your target paper (as replications usually should), click on the “Cited by ##” link below the paper’s search result listing.
    • Hint 5: To access a copy of the paper, click on the title, or any link in the right-side column of the search result listing, or the link below the search result listing that says “All ## versions” (which will bring up all links that Google Scholar has found associated with that same article). Make good choices about clicking on links. You shouldn’t have to pay to access any of these articles. If you hit a paywall, try (1) being on campus (university subscriptions are often triggered automatically by your IP address), (2) pretending you’re on campus by signing into the VPN, or (3) searching for the article through the campus library system. I definitely wouldn’t recommend that you familiarize yourself with the Sci-Hub system (wiki article here), which breaks paywalls to make pretty much every scholarly article available to everyone for free. It’s illegal enough that it has to move server hosts frequently. To make sure you don’t end up there accidentally, definitely don’t search “where is sci-hub now”. Totally unrelated, if you want an article that you can’t get free access to, just let me know and I’ll find you a copy. Article charges go to the publishers, who don’t pay authors or reviewers. If you ever in the future want an article that you can’t get access to, a great option is to email the author to ask for a copy - we love to share because it means someone’s interested in our work, and it doesn’t cost us anything to email a pdf.
    • Caveat: A point will only be awarded to the first person to post any given paper, so check what’s already been posted before starting on your own.
  • Class Chat on Thursday, 11:00 am - 12:20 pm (CT)