My sister's keeper?: genomic research and the identifiability of siblings
2008

Genomic Research and Sibling Identifiability

Sample size: 452684 | Publication | 15 minutes | Evidence: high

Author Information

Author(s): Christopher A. Cassa, Brian Schmidt, Isaac S. Kohane, Kenneth D. Mandl

Primary Institution: Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology

Hypothesis

How much familial information can be inferred from genomic data, particularly regarding siblings?

Conclusion

Substantial discrimination and privacy risks arise from the use of inferred familial genomic data.

Supporting Evidence

  • Sibling SNP genotypes can be inferred with substantial accuracy.
  • A very low number of matches at commonly varying SNPs is sufficient to confirm sib-ship.
  • Using HapMap trio data, the authors achieved 91.9% inference accuracy for sibling genotypes.

Takeaway

This study shows that much of a person's genetic information can be inferred from a sibling's DNA, which creates privacy risks for relatives whose own data was never shared.

Methodology

The study presents a framework for measuring the risk that one person's SNP genotypes disclose a sibling's genotypes, and demonstrates the inference techniques on HapMap data.
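To make the idea of genotype disclosure concrete, the following is a minimal sketch of how a sibling's genotype distribution at one SNP could be computed from a known genotype and a minor allele frequency. It is not the authors' framework. It assumes a biallelic SNP, Hardy-Weinberg equilibrium, and the standard full-sibling sharing probabilities (0, 1, or 2 alleles identical by descent with probability 1/4, 1/2, 1/4). The function names and the encoding of genotypes as minor allele counts (0, 1, 2) are hypothetical choices made for illustration.

```python
def hwe_freqs(q):
    """Population genotype frequencies under Hardy-Weinberg equilibrium.

    q is the minor allele frequency. The result is indexed by the number
    of minor alleles carried (0, 1, or 2).
    """
    p = 1.0 - q
    return [p * p, 2.0 * p * q, q * q]


def ibd1_transition(g1, q):
    """Sibling genotype distribution when exactly one allele is shared
    identical by descent with the known genotype g1 (assumed model)."""
    p = 1.0 - q
    table = {
        0: [p, q, 0.0],            # shared allele is major, other is random
        1: [p / 2.0, 0.5, q / 2.0],  # shared allele is major or minor, 50/50
        2: [0.0, p, q],            # shared allele is minor, other is random
    }
    return table[g1]


def sibling_genotype_dist(g1, q):
    """P(sibling genotype | known genotype g1) for full siblings.

    Mixes the three sharing states: no alleles shared (population
    frequencies), one shared, and both shared (identical genotype).
    """
    pop = hwe_freqs(q)
    one_shared = ibd1_transition(g1, q)
    return [
        0.25 * pop[g2] + 0.5 * one_shared[g2] + 0.25 * (1.0 if g2 == g1 else 0.0)
        for g2 in range(3)
    ]


if __name__ == "__main__":
    # Known sibling is homozygous for the major allele at a SNP with MAF 0.20.
    print(sibling_genotype_dist(0, 0.20))
```

Under these assumptions, a person who is homozygous for the major allele at a SNP with minor allele frequency 0.20 has a sibling who is also homozygous major with probability of about 0.81, compared with 0.64 for an unrelated person. The gap between those two numbers is one simple way to express what the known genotype discloses.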

Potential Biases

The approach does not account for potential genotypic errors and assumes independence of loci.
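The independence assumption matters because it lets evidence from many SNPs be combined by simple multiplication. The sketch below illustrates this with a per-locus likelihood ratio for "full siblings" versus "unrelated", summed on a log scale. It is an illustration under stated assumptions (biallelic SNPs, Hardy-Weinberg equilibrium, error-free genotypes, independent loci), not the procedure used in the study. All function names are hypothetical, and genotypes are encoded as minor allele counts.

```python
import math


def hwe_freqs(q):
    """Hardy-Weinberg genotype frequencies, indexed by minor allele count."""
    p = 1.0 - q
    return [p * p, 2.0 * p * q, q * q]


def sibling_genotype_dist(g1, q):
    """P(sibling genotype | genotype g1) for full siblings (assumed model:
    0, 1, or 2 alleles shared by descent with probability 1/4, 1/2, 1/4)."""
    p = 1.0 - q
    one_shared = {
        0: [p, q, 0.0],
        1: [p / 2.0, 0.5, q / 2.0],
        2: [0.0, p, q],
    }[g1]
    pop = hwe_freqs(q)
    return [
        0.25 * pop[g2] + 0.5 * one_shared[g2] + 0.25 * (1.0 if g2 == g1 else 0.0)
        for g2 in range(3)
    ]


def sibship_log_lr(genotype_pairs, mafs):
    """Sum of per-locus log likelihood ratios, siblings versus unrelated.

    genotype_pairs: list of (g1, g2) minor allele counts, one pair per SNP.
    mafs: minor allele frequency for each SNP, each strictly between 0 and 1.
    Summing the logs is only valid if the loci are independent.
    """
    total = 0.0
    for (g1, g2), q in zip(genotype_pairs, mafs):
        if not 0.0 < q < 1.0:
            raise ValueError("minor allele frequency must be in (0, 1)")
        p_sibling = sibling_genotype_dist(g1, q)[g2]
        p_unrelated = hwe_freqs(q)[g2]
        total += math.log(p_sibling / p_unrelated)
    return total


if __name__ == "__main__":
    # Two people who both carry two copies of a rare allele at three SNPs.
    pairs = [(2, 2), (2, 2), (2, 2)]
    print(sibship_log_lr(pairs, [0.1, 0.1, 0.1]))
```

A positive total favors sib-ship and a negative total favors unrelatedness. If nearby SNPs are correlated, or if genotyping errors are present, a sum of this kind would overstate the evidence, which is the bias noted above.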

Limitations

The study relies on population-based estimates of minor allele frequency drawn from the HapMap population, which is a small sample.

Participant Demographics

The study used data from the HapMap CEPH population, which includes 90 participants of northern and western European ancestry.

Statistical Information

P-Value

0.0001

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/1755-8794-1-32
