Using the friendship paradox to sample a social network
DOI: 10.1063/1.3518199
Knowledge of whether a disease epidemic is unfolding in a community is crucial for public health officials and policymakers. Models suggest that vaccinating even a third of the population against influenza in a metropolis like New York City or Chicago can save lives and shorten the epidemic’s course—but only if implemented early enough. Unfortunately, current methods to monitor flu epidemics rely on contemporaneous data from random people seeking outpatient care, information that lags behind the course of the epidemic by a week or more.
Social networks, such as the network of real-life contacts in a city, are heterogeneous: Some members are more connected, or central, than others. Theorists know that the centralmost hubs, by virtue of their greater exposure, are most likely to catch and spread a contagion (see the article by Mark Newman in Physics Today, November 2008, page 33).
A social-network phenomenon known as the friendship paradox ("your friends have more friends than you do") offers a strategic path around the problem. To appreciate the phenomenon, described by sociologist Scott Feld in 1991, consider a group of randomly chosen people, each asked to name a friend. More extroverts are likely to be named than loners, and the nominated friends will have, on average, more social ties than the nominators (ref. 1). How many more depends on the variance in the distribution of ties. Mathematical modeling by theorist Reuven Cohen of Israel’s Bar-Ilan University and his colleagues in 2003 bore out the potential utility of the idea. According to their simulations of computer and population networks, given a million randomly chosen nodes (computers or people), only a small fraction of random “acquaintances” of those nodes actually require immunization to arrest an unfolding computer virus or disease epidemic, compared to a large fraction needed in completely random immunization (ref. 2).
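The dependence on variance can be made concrete with a minimal sketch. It assumes that a friend is reached by following a randomly chosen friendship tie, in which case the mean number of ties of a friend equals the mean number of ties per person plus the variance divided by that mean. The toy network, the names in it, and the function names are hypothetical and chosen only for illustration; they are not taken from refs. 1 or 2.

```python
from statistics import mean, pvariance

# Hypothetical toy network: each pair is one mutual friendship tie.
edges = [
    ("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("ann", "eve"),
    ("bob", "cat"), ("dan", "eve"), ("eve", "fay"),
]

def build_adjacency(edge_list):
    """Return a dict mapping each person to the set of their friends."""
    adjacency = {}
    for a, b in edge_list:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def friendship_paradox_stats(adjacency):
    """Return (mean ties per person, mean ties per friend, variance-based prediction)."""
    degrees = {person: len(friends) for person, friends in adjacency.items()}
    degree_list = list(degrees.values())
    mean_degree = mean(degree_list)
    # Each person appears in this list once per tie they have, so
    # well-connected people are counted more often than loners.
    friend_degrees = [degrees[f] for friends in adjacency.values() for f in friends]
    mean_friend_degree = mean(friend_degrees)
    # Prediction under the random-tie assumption: mean + variance / mean.
    predicted = mean_degree + pvariance(degree_list) / mean_degree
    return mean_degree, mean_friend_degree, predicted

mean_k, mean_friend_k, predicted = friendship_paradox_stats(build_adjacency(edges))
print(f"{mean_k:.2f} {mean_friend_k:.2f} {predicted:.2f}")  # prints "2.33 2.71 2.71"
```

In this toy network the gap between the two averages is small because the spread in ties is small; if every person had the same number of ties, the variance would be zero and the gap would vanish.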
Last year, as the 2009 flu season approached, physician and sociologist Nicholas Christakis of Harvard University and political scientist James Fowler of the University of California, San Diego, realized they could experimentally test the paradox as a basis for early detection (ref. 3). In October 2009, after the H1N1 epidemic had emerged but before vaccines were available, they contacted 744 Harvard undergraduates: 319 randomly chosen students and 425 friends they nominated. By monitoring the two groups and consulting records from the campus health center dating back to 1 September, Christakis and Fowler found that the friend group tended to get the flu well before the random group, a trend first discernible about 46 days prior to the epidemic’s peak (see the figure).

Differences in flu contagion between two groups, one composed of randomly selected Harvard University undergraduates and another composed of friends they nominated. Only 8% of the 744 students contracted the flu in 2009. Medical records show that members of the friend group tended to catch it earlier than the random group, as expected based on their greater average number of social ties in the population at large. A divergence of the curves is first detected at day 16.
(Adapted from ref. 3.)

The power of the approach, Christakis points out, is that in the event of an outbreak, the behavior of a small subgroup of the population presages the evolution in the much larger, and essentially unobservable, global network, without requiring details of anyone’s actual social ties. “We’re not claiming that people necessarily got the flu from the friends they nominated, nor that the local network maps the path along which the pathogen flows,” he says. “A ‘friend’ is just a proxy for an actual location in the network.” Armed with an advance warning from the health data about friends (or even friends of friends, for a yet more central set of nodes), health officials could then decide how best to forestall an epidemic.
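The selection step described above, picking random people and then moving to a nominated friend or a friend of a friend, can be sketched as a short random walk. This is only an illustration under stated assumptions: the function names, the `hops` parameter, and the star-shaped toy network are hypothetical, nominations are modeled as uniformly random choices among a person's friends, and none of it reproduces the procedure of ref. 3.

```python
import random

def nominate(adjacency, start, hops=1, rng=random):
    """Follow `hops` randomly chosen friendship ties, starting from `start`."""
    current = start
    for _ in range(hops):
        friends = sorted(adjacency[current])  # sorted so a fixed seed is reproducible
        if not friends:
            break  # an isolated person has nobody to nominate
        current = rng.choice(friends)
    return current

def sample_sensor_group(adjacency, n_seeds, hops=1, seed=0):
    """Pick random seed people, then return them with their nominees.

    hops=1 models nominated friends; hops=2 models friends of friends.
    With hops=2 the walk may step back to the seed person.
    """
    rng = random.Random(seed)
    seeds = rng.sample(sorted(adjacency), n_seeds)
    nominees = [nominate(adjacency, person, hops, rng) for person in seeds]
    return seeds, nominees

# Hypothetical toy network: one hub with four otherwise unconnected friends.
toy = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"},
}

seeds, sensors = sample_sensor_group(toy, n_seeds=3, hops=1)
print(seeds, sensors)
```

Only the identities of the nominees are needed to form the monitored group; the code never has to map the full network, which is the point Christakis makes about the approach.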
Ideally, Christakis and Fowler note, groups should be selected before data are collected to avoid any hidden bias creeping into the data. Thanks to the proliferation of internet- and GPS-equipped mobile phones and social networking sites like Facebook, researchers can, at least in principle, quickly access staggering amounts of information in real time: where people are; with whom they interact; and what they like, buy, and blog about.
As people increasingly reveal themselves online and computational power grows to better handle the volume, researchers may gain insight into spreading processes in networks and how to anticipate their effects. In 2009 the H1N1 virus infected over 50 million Americans. There are now more than 150 million active Facebook users in the US.
References
1. S. Feld, Am. J. Sociol. 96, 1464 (1991). https://doi.org/10.1086/229693
2. R. Cohen, S. Havlin, D. ben-Avraham, Phys. Rev. Lett. 91, 247901 (2003). https://doi.org/10.1103/PhysRevLett.91.247901
3. N. A. Christakis, J. H. Fowler, PLoS One 5, e12948 (2010). https://doi.org/10.1371/journal.pone.0012948