The genetic data posted online seemed perfectly anonymous — strings of billions of DNA letters from more than 1,000 people. But all it took was some clever sleuthing on the Web for a genetics researcher to identify five people he randomly selected from the study group. Not only that, he found their entire families, even though the relatives had no part in the study — identifying nearly 50 people.
The data are from an international study, the 1000 Genomes Project, that is collecting genetic information from people around the world and posting it online so researchers can use it freely. The data also include the ages of participants and the regions where they live. That information, a genealogy Web site and Google searches were sufficient to find complete family trees. While the methods for extracting relevant genetic data from the raw genetic sequence files were specialized enough to be beyond the reach of most laypeople, no one expected it to be so easy to zoom in on individuals.