On the question of what IBD and IBS are, here is my understanding:
A SNP (or a segment of DNA) is identical by state (IBS) if two or more individuals carry the same alleles there. For example, if person A has genotype A/C at a SNP position while person B has A/T at that position, the two share one allele, so the SNP has an IBS status of 1.
Shared alleles are identical by descent (IBD) if they are IBS and were also inherited from a common ancestor. Take the example above and suppose A and B are siblings. If their parents have genotypes A/C (mom) and A/T (dad), both parents carry an “A”, so a shared “A” alone does not prove common descent. In fact, person A can only have received C from mom and therefore “A” from dad, while person B can only have received T from dad and therefore “A” from mom. The two copies of “A” come from different parents, so the SNP is IBS but not IBD. But if the parents’ genotypes are A/G (mom) and C/T (dad), then the “A” in both A and B must be mom’s single “A” allele, so the SNP is both IBD and IBS.
In practice, for a pair of individuals whose parents are unknown, we can estimate the probability of IBD at each SNP from their IBS information. I think the math behind it is Bayes' theorem.
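The per-SNP IBS status described above can be sketched in a few lines of Python. This is only an illustration: the function name `ibs_count` and the representation of a genotype as a pair of allele letters are my own assumptions, not anything taken from PLINK.

```python
from collections import Counter

def ibs_count(genotype_a, genotype_b):
    """Return how many alleles (0, 1 or 2) two unordered genotypes share by state.

    Each genotype is a pair of allele letters, e.g. ("A", "C").
    (Hypothetical helper for illustration only.)
    """
    # Counter & Counter is a multiset intersection: it keeps, for each allele,
    # the smaller of the two counts, which is the number of shared copies.
    shared = Counter(genotype_a) & Counter(genotype_b)
    return sum(shared.values())

print(ibs_count(("A", "C"), ("A", "T")))  # the example from the text: 1
print(ibs_count(("A", "C"), ("C", "A")))  # same genotype, order ignored: 2
print(ibs_count(("A", "A"), ("C", "T")))  # nothing shared: 0
```

Note that this only counts IBS. Whether a shared allele is also IBD cannot be read off the two genotypes alone; that needs pedigree information or a statistical estimate.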
PLINK uses IBS-based clustering to remove sample outliers (samples whose IBS distance from the rest is too large); see here: http://pngu.mgh.harvard.edu/~purcell/plink/strat.shtml
PLINK uses IBD to detect samples that are too close to each other (e.g. relatives or contaminated samples); see here: http://pngu.mgh.harvard.edu/~purcell/plink/ibdibs.shtml
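To make the idea of a genome-wide IBS distance between two samples concrete, here is a minimal sketch. It assumes biallelic SNPs coded as 0/1/2 copies of one allele, with `None` for a missing call; the function name `ibs_similarity` and the toy data are hypothetical. As far as I understand, the resulting proportion corresponds to the kind of pairwise IBS measure PLINK clusters on, but treat that mapping as an assumption rather than a description of PLINK's internals.

```python
def ibs_similarity(dosages_a, dosages_b):
    """Mean proportion of alleles shared IBS over SNPs typed in both samples.

    Genotypes are allele counts (0, 1 or 2); None marks a missing call.
    For biallelic SNPs the number of alleles shared IBS is 2 - |a - b|.
    """
    shared = 0
    typed = 0
    for a, b in zip(dosages_a, dosages_b):
        if a is None or b is None:
            continue  # skip SNPs missing in either sample
        shared += 2 - abs(a - b)
        typed += 1
    if typed == 0:
        raise ValueError("no SNPs genotyped in both samples")
    return shared / (2 * typed)

# Toy genotypes for two samples at six SNPs (made-up data).
sample_1 = [0, 1, 2, 2, None, 1]
sample_2 = [0, 2, 2, 0, 1, 1]

similarity = ibs_similarity(sample_1, sample_2)
distance = 1 - similarity  # larger distance = less similar samples
print(similarity, distance)
```

A sample whose distance to every other sample is unusually large is a candidate outlier; a pair with an unusually small distance is a candidate for relatedness and worth checking with an IBD estimate.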
Below is a nice passage on how to use IBD to detect relatives, from Anderson, C. et al. Data quality control in genetic case-control association studies. Nat Protoc. 5, 1564–1573 (2010).
The expectation is that IBD = 1 for duplicates or monozygotic twins, IBD = 0.5 for first-degree relatives, IBD = 0.25 for second-degree relatives and IBD = 0.125 for third-degree relatives. Owing to genotyping error, LD and population structure, there is often some variation around these theoretical values and it is typical to remove one individual from each pair with an IBD value of > 0.1875, which is halfway between the expected IBD for third- and second-degree relatives. For these reasons an IBD value of > 0.98 identifies duplicates.
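The thresholds in the quoted passage can be applied as in the sketch below. It assumes the "IBD" value in the passage is the genome-wide proportion of alleles shared IBD, computed as P(IBD=2) + 0.5 × P(IBD=1) (the quantity PLINK reports as PI_HAT). The function names, labels and sample pairs are made up for illustration; only the two cutoffs come from the quote.

```python
DUPLICATE_CUTOFF = 0.98   # from the quoted passage: duplicates / MZ twins
RELATED_CUTOFF = 0.1875   # from the quoted passage: midpoint of 0.125 and 0.25

def pi_hat(z1, z2):
    """Proportion of alleles shared IBD, given P(IBD=1)=z1 and P(IBD=2)=z2."""
    return z2 + 0.5 * z1

def classify_pair(ibd):
    """Label a sample pair by its IBD proportion (labels are hypothetical)."""
    if ibd > DUPLICATE_CUTOFF:
        return "duplicate or MZ twin"
    if ibd > RELATED_CUTOFF:
        return "related: remove one sample"
    return "keep both"

# Made-up pairs with (z1, z2) = (P(IBD=1), P(IBD=2)).
pairs = {
    ("s1", "s2"): (0.0, 1.0),   # identical genomes, IBD proportion 1.0
    ("s1", "s3"): (1.0, 0.0),   # parent-offspring, IBD proportion 0.5
    ("s2", "s4"): (0.25, 0.0),  # third-degree expectation, 0.125
}

for pair, (z1, z2) in pairs.items():
    value = pi_hat(z1, z2)
    print(pair, value, classify_pair(value))
```

Because real estimates scatter around the theoretical values, a pair sitting close to a cutoff deserves a manual look rather than an automatic decision.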
Another review on this topic is also worth reading: Joseph E. Powell, Peter M. Visscher & Michael E. Goddard. Reconciling the analysis of IBD and IBS in complex trait studies. Nature Reviews Genetics 11, 800-805 (2010)
The following slide (from http://ibgwww.colorado.edu/workshop2005/cdrom/ScriptsA/evans/IBDestimation/IBD--2005.pdf) may make IBD vs. IBS easier to understand: