- Amram Shapiro
Book of Odds
DEDICATION
To Those Who Count
To those who sample us randomly
and collect their numbers humbly
and confess how wrong they may be honestly
and report what they see whether it is
what they wished for or not
and share what they find with all of us
so we may learn something
we did not know of ourselves
CONTENTS
Dedication
Introduction
Method
Chapter 1 Sex
Chapter 2 Singles and Dating
Chapter 3 Love, Marriage, and Divorce
Chapter 4 Pregnancy and Birth
Chapter 5 Infancy and Childhood
Chapter 6 High School and College
Chapter 7 Health and Illness
Chapter 8 Looking Good and Feeling Fine
Chapter 9 Mind, Psyche, and Addiction
Chapter 10 Beliefs and Fears
Chapter 11 Accidents and Death
Acknowledgments
About Book of Odds
About the Authors
Credits
Copyright
About the Publisher
INTRODUCTION
This book is a numerical snapshot of the United States. As in a photograph, its subject is stopped for a moment and set in context. It can be looked at so closely you can count the people in the bleachers and the buttons on their shirts. The photograph is not an Instagram but a panorama or 360-degree image. It addresses the destinies of people, from health and happiness to accidents and loss. It covers the cycle of life from conception to birth to childhood to schooling to adult life to aging and decline. It covers the everyday and holidays, the serious concerns of life and its comic turns.
“What are the odds of that?” We ask this when something strikes us as unlikely. We don’t expect a reply since the question is rhetorical, an exclamation of surprise.
This book answers those questions in subject areas that tickle our curiosity or touch our anxieties and fears. Sometimes the odds surprise us. Sometimes they appall. Sometimes they amuse.
One thing the odds have in common is commonality. They are clear and simple. Every odds statement was created the same way and follows the same rules and conventions, the way entries in a dictionary do.
We look for the most fundamental units of activities or events to count, things just as we see them instead of more sophisticated, explanatory, but invisible measures. The likelihood of a batter hitting a single in a plate appearance is counted instead of his OPS (On-base Plus Slugging), or the odds a person owns blue jeans, rather than his propensity to spend.
Why? By concentrating on the experiences of normal existence, we have been able to develop a way of expressing the likelihood of these experiences. And we are able to compare likelihoods across a wide section of American life—something we call calibration. All of us already have an ability to calibrate, whether we recognize it or not. Think about how you automatically compare prices from one store to the next—you not only have a grasp of what things should cost; you also have a sense of the “reasonableness” of a price. Or how about the morning weather forecast? Without thinking you know if a projected temperature suggests the need for a coat. You not only understand the number in context; you understand the implications it has for your daily life.
The odds in this book can help us calibrate all kinds of possibilities in the same kind of way. We can judge risk or likelihood in a way we have never been able to do before. For example, the odds are 1 in 8.0 a woman will receive a diagnosis of breast cancer in her lifetime, about the odds a person lives in California, the most populous state.¹ For men the odds are 1 in 769,² about the same odds a Major League Baseball game will be a no-hitter (1 in 725).³
And speaking of baseball, one story became iconic during the three years we were developing the Book of Odds database. In those days our researchers met weekly to review their work with one another. We often had visitors. On this day our visitor was a college student, the daughter of a close friend. The presentations she saw were really varied: one researcher had just completed work on the odds associated with contraception; another one had compiled the odds of baseball.
The odds of a woman becoming pregnant after relying on one form or another of contraception were displayed. Starting with the population of women in 2002 who were of child-bearing years (15–44), the presentation identified the odds that one of these women was sexually active, the odds that she relied on condoms for contraception, and the odds that she would stop relying on condoms because she was pregnant. Each step of this “thread” of probabilities (as we term such chains) had independent odds. When put together, the odds that a woman in that original group would end up having given up condoms because there was no longer any point (she had become pregnant despite the contraceptive measure) were 1 in 142.⁴
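The arithmetic behind such a thread can be sketched in a few lines of Python. This is an illustration only: the helper name `thread_odds` is hypothetical, the step values are placeholders chosen to make the multiplication easy to follow (they are not figures from the Book of Odds database), and the sketch assumes, as a simplification, that each step's probability can simply be multiplied by the next.

```python
from fractions import Fraction
from math import prod


def thread_odds(steps):
    """Combine a chain of '1 in N' steps into a single '1 in N' figure.

    Each entry in `steps` is the N of one '1 in N' statement. The sketch
    assumes the step probabilities can be multiplied together.
    """
    probability = prod(Fraction(1) / Fraction(str(n)) for n in steps)
    return float(1 / probability)


# Hypothetical step values, chosen only to illustrate the arithmetic.
hypothetical_steps = [2, 5, 10]
print(f"1 in {thread_odds(hypothetical_steps):,.0f}")  # 1 in 100
```

Each added step can only make the combined outcome rarer or leave it unchanged, which is why a thread of fairly common events can end in odds as long as 1 in 142.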
Next came the baseball presentation, and as it happened, our visitor was a baseball fan. She seemed captivated as the Book of Odds’ Major League Baseball statistics were summarized. They were different from those she was used to on the sports pages. What will happen next on average, independent of who’s pitching and who’s batting? Viewed this way, the odds that the next batter will hit a triple are 1 in 144.⁵
Later that day I received a call from my friend, the college student’s mother. Her daughter had returned home on a mission. She had immediately called her boyfriend, and her mother overheard her daughter’s part of the conversation. “She asked her boyfriend if he knew the odds of a couple conceiving a child if they were relying solely on condoms,” my friend said. “She informed him that the odds were 1 in 142.”
Then she asked if he knew the odds of the next batter in a Major League Baseball game hitting a triple. Again, he didn’t know.
“It’s 1 in 144,” she told him. And then she added, jabbing her finger for emphasis, “And I’ve seen triples.”
We could go on and on. Deaths with HIV listed on the death certificate are becoming rarer, and the odds a death will include HIV on the death certificate are 1 in 21,774.⁶ That is about as likely as a visit to the ER being due to an accident involving a golf cart in a year: 1 in 22,325.⁷ Multiply by 10 and you have the approximate odds a person visiting the Grand Canyon will die during the trip: 1 in 232,100.⁸ Multiply by 10 again and you have the approximate odds a person will die from chronic constipation: 1 in 2,215,900.⁹ For those working on murder mysteries: the odds of being murdered during a trip to the Grand Canyon: 1 in 8,156,000. Of dying in a Grand Canyon flash flood? 1 in 14,270,000.¹⁰
As I said, we could go on and on, which is why we wrote a book. Enjoy!
METHOD
This book is constructed on a considerable foundation. We began with a rigorous methodology, creating conventions and holding ourselves to the same standards as any reference source. Only then did we begin to assemble what has become a formidable database.
The very first database had only 450 odds but already vividly demonstrated how comparing disparate subjects with similar odds could both shock and inform. Take this example: “The odds a female who is raped is under 12 are 1 in 3.4.”¹ That is shocking in and of itself, but it is made more vividly awful when one looks for other odds in the same range. “The odds a person 99–100 will die in a year are 1 in 3.3.”² The odds a female rape victim is under 12 are about the same as the odds a 99-year-old will die in the next 12 months.
From there we went to work on growing the database and making it accessible for Internet use. We needed a way to classify the subjects we would cover and created a taxonomy that aided us later in employing semantic tools. More than fifty person-years went into creating more than 400,000 odds. Each one can be compared to any other, and thus each part enriches the whole.
But what do we mean when we talk about odds? When we say, “My doctor says the odds are one in ten that the test will be positive,” we’re expressing probability. In mathematical terms, statements like these put fractions into words. When we say, “the odds are one in ten,” think of a fraction, with the first, smaller number as the numerator, or top number in the fraction, and the second, larger number as the denominator, or bottom number. So, “one in ten” literally means one-tenth, or a 10 percent chance. Each odds statement in The Book of Odds expresses the probability that a specific occurrence will take place, given the number of situations in which that occurrence might take place. Since it is past experience that provides a basis for expecting what will take place, odds are based entirely on past counts or, on rare occasions, actuarial forecasts.
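As a rough illustration of this arithmetic, the conversion between a “1 in N” statement and a probability can be sketched in Python. The function names below are hypothetical, chosen only for this example, and the sketch shows nothing more than the fraction described above.

```python
def one_in_n_to_probability(n):
    """Turn 'the odds are 1 in n' into a probability between 0 and 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 / n


def probability_to_one_in_n(p):
    """Turn a probability into the N of a '1 in N' statement."""
    if not 0 < p <= 1:
        raise ValueError("p must be greater than 0 and at most 1")
    return 1 / p


# "One in ten" is one-tenth, or a 10 percent chance.
print(f"{one_in_n_to_probability(10):.0%}")  # 10%

# A 25 percent chance reads as "1 in 4".
print(f"1 in {probability_to_one_in_n(0.25):g}")  # 1 in 4
```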
Each statement in The Book of Odds contains certain required components. Consider the example, “The odds a person will be struck by lightning in a year are 1 in 1,101,000 (US).”³ First, we have to know what will happen, in this case, a lightning strike. Second, we have to know to whom it will happen: a person, any person. As we narrow that definition (a farmer, a golfer) the odds will change. Next, the statement tells us the parameters, or limitations, of the calculation. In this case, there are parameters of time (a single year), data span (annual data from 2008–2012), and of place (US). In this book all odds are US odds, so we have left the geography off and the data spans are usually evident in the sources cited. Any change to these parameters, as well as the time frame used to collect data, may change the odds. Some odds, such as those about the ideal fair coin toss coming up heads or tails, have no such parameters, and are considered true everywhere and any time because they are defined that way.
Odds, Probability, and Chances
At Book of Odds we treat these terms as synonymous. Odds are statements of probability. So, “The odds of . . .” should be interpreted mathematically as “The chances of . . . ,” or “The probability of . . . ,” or, the ratio of favorable outcomes to total outcomes. This is a subtle but important convention to be aware of when using the odds in this book. Its purpose is to be simple, accessible, and consistent with conversational English.
Traditionally, the term “odds” refers to the ratio of favorable to nonfavorable outcomes. So, a gambler might say, “A horse that is expected to win 25 percent of the races it enters has 3 to 1 (3:1) odds against or 1 to 3 (1:3) odds to win.” This is a great tool for a bettor who is attempting to calculate the expected value of a gamble. However, this form can be troublesome for ordinary people trying to understand complex statistics. “1 in 4” is easier to grasp in your mind’s eye than “3 to 1 against.” You can picture it, can’t you? This is also the way we humans commonly think and speak when discussing uncertainty.
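The relationship between the two forms can be sketched in Python. This is a minimal illustration under our own naming; `odds_against` and `one_in_n` are hypothetical helpers, not part of any Book of Odds tooling. It uses exact fractions so the horse-racing example comes out in whole numbers.

```python
from fractions import Fraction


def odds_against(probability):
    """Return traditional odds against as (unfavorable, favorable)."""
    p = Fraction(probability)
    if not 0 < p <= 1:
        raise ValueError("probability must be greater than 0 and at most 1")
    ratio = (1 - p) / p  # unfavorable outcomes per favorable outcome
    return ratio.numerator, ratio.denominator


def one_in_n(probability):
    """Return the N of the '1 in N' form used in this book."""
    return 1 / Fraction(probability)


# A horse expected to win 25 percent of its races.
win_chance = Fraction(1, 4)
against, in_favor = odds_against(win_chance)
print(f"{against} to {in_favor} against")  # 3 to 1 against
print(f"1 in {one_in_n(win_chance)}")      # 1 in 4
```

Both lines describe the same horse; only the second counts the favorable outcome against all outcomes, which is the convention used throughout this book.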
That brings us to the question, why do we include what we do? We purposely focus on the events of everyday life, things that all or most of us will have experienced firsthand. This is vital for the exercise of calibration—understanding odds in a larger context. We also include those things we may not have experienced but whose likelihood we may worry about: misfortune, illness, death . . . We have broken the odds of human experience into three large sets: destiny, actions, and the cycle of life. Destiny is what happens to us. Actions are what we do. And cycle of life is a way of looking at the odds associated with the stages of our existence: conception, birth, childhood, schooling, adult life, work, retirement, aging, and death.
All the odds won’t be relevant or of interest to everyone, but each will be relevant or interesting to someone. We aim to present data and information objectively and without bias, but we readily acknowledge that decisions about what to include inevitably involve some subjective judgment and are subject to certain parameters: for example, we must work with the terms the data collectors have chosen to use. Our principles of selection, however, are not knowingly biased to support one position or another. And when we address controversial subjects, we seek to maintain a neutral perspective, shedding light, but not heat, on politically charged issues.
In every case we have searched for the most authoritative and reliable source for our data, but we are transparent about the fact that quality varies. For all sources we ask the same questions: who collected information from whom, in what manner, and for what purpose. Some are straightforward, actual counts like the US Census. For survey data and experimental trials, we evaluate the underlying hypothesis or research questions, study design, sample frame, and size, and make a judgment about whether it accurately reflects the population under study, as well as assess the methodology of analysis, fairness of presentation of the data, explanations of variables and limitations, reproducibility of results, and quality of peer review. Further, we examine the sponsoring body and those executing the study, looking to see if they have a vision or mission or mandate that might have had even a subtle influence on the findings. We don’t dismiss any source with an expected point of view out of hand, but we make every attempt to be mindful. There is a wealth of wonderful sources, but there are also many of limited or no value and applicability. These are either left out or, if used at all, presented with appropriate caveats attached.
Timeliness also matters, and within the time boundaries publishing affords we have updated most odds statements. Even so, some measurements are irregularly collected, and even those with regular measurements, such as economic data and annual crime and cancer statistics, have their quirks, since they rely on human input. One year New York City failed to provide crime data to the FBI, for example. And some subjects are studied sporadically. Sex, for example, is one of these, with a Kinsey Report or equivalent sometimes released only once a decade.
In addition to our internal controls, we seek independent external reviews of our sources. We consult book reviews and commentary and reviews in academic journals. We also contact relevant and appropriate specialists, including authors of related academic work, industry or research specialists, editors of and contributors to relevant journals, and any and all credible experts uncovered in our own investigations.
Tense Conventions
At its heart, the invention of a reference work is really the invention of a set of conventions followed by their application with relentless consistency. This is the work that Dr. Samuel Johnson, defining “lexicographer” in his own dictionary, called that of “a harmless drudge.”
The most subtle and important of our conventions relate to tenses. Odds naming past dates or historical events such as wars are in the past tense. Odds describing an outer or inner state of being or using the predicate nominative use the present tense. Most odds use the future tense, however, despite being based on past counts. This practice has the advantage of placing our readers and users into the condition we experience at all times, that of being about to learn what the future holds. Our internal methods document explains it this way:
We assume in virtually all of our odds that we are viewing the events and actions to be described from the time before their count began. From this perspective what is in the sentence is what a perfectly prescient forecast would have yielded. This we term the “future implicative.” From this perspective, the sentence becomes lively. It invites the reader to imagine standing poised at the beginning of the reference period, wondering perhaps what will happen next.
Caveats
Odds are based on recorded past occurrences among a large group of people. They do not pretend to describe the specific risk to a particular individual, and as such cannot be used to make personal predictions. For example, if a person learns that there is a quantifiable probability of a cure for a specific disease, those statistics cannot take into account this person’s personal genetic disposition or medical history, unique environmental factors, the experience of the treating physician, the accuracy of tests performed, the development of new treatments, and so on.
The past is the perch on which we must stand to look toward the future. Still, the view can be clouded, and the past does not always provide reliable guidance about the future. There is always the possibility “a black swan” will appear: an unexpected event with an outsize impact. Complexity theory, which is the latest way of attacking modeling and large data sets, has a great deal to say about the impact of the increasing number of “agents” in our world systems, and what this means about predictability and new sources of risk.
Statistics is divided into two camps, the frequentist camp and the Bayesians. The former puts much reliance on past distributions, the latter on learning from new information. We are both. We like counts as something factual to start with, but we accept the Bayesian view that new insight may trump old data. All our odds may be thought of as potential “priors.”
If our work helps people gain a feel for probability because the presentation is fun, easy to understand, and touches on subjects of real interest, we will be very pleased with our efforts.
CHAPTER 1
SEX
Liar, Liar
The odds a man has lied about the number of sex partners he’s had in order to protect his ego: 1 in 7.1
SOURCE: AskMen.com, “Part I: Dating & Sex,” The Great Male Survey, 2011 Edition, http://www.askmen.com/specials/2011_great_male_survey.
SEX PARTNERS:
How High Can You Count?
When it comes to sex, most people think experience is a good thing—but they also think there can be too much of a good thing.
The largest group of women, 1 in 3.2, feels comfortable with a man who has had no more than 5 previous sexual partners, but the more the number exceeds what can be counted on one hand, the less comfortable many women feel. Up to two hands? 1 in 3.6 women chooses 10 as the maximum number of former partners she’s okay with, but just 1 in 5 women feels relaxed about a tally that is no more than 20. A man who fesses up to a maximum of 50 previous partners really limits his options: only 1 in 12.5 women feels comfortable with that number, and if his count is up to twice that, only 1 in 25 women is willing to join his parade.