cnvHap: an integrative population and haplotype–based multiplatform model of SNPs and CNVs

ABSTRACT

Although genome-wide association studies have uncovered single-nucleotide polymorphisms (SNPs) associated with complex disease, these variants account for a small portion of heritability. Some contribution to this 'missing heritability' may come from copy-number variants (CNVs), in particular rare CNVs; but assessment of this contribution remains challenging because of the difficulty in accurately genotyping CNVs, particularly small variants. We report a population-based approach for the identification of CNVs that integrates data from multiple samples and platforms. Our algorithm, cnvHap, jointly learns a chromosome-wide haplotype model of CNVs and cluster-based models of allele intensity at each probe. Using data for 50 French individuals assayed on four separate platforms, we found that cnvHap correctly detected at least 14% more deleted and 50% more amplified genotypes than PennCNV or QuantiSNP, with an 82% and 115% improvement for aberrations containing <10 probes. Combining data from multiple platforms additionally improved sensitivity.
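To make the idea of calling copy number from probe intensities more concrete, the following is a minimal sketch of a single-sample hidden Markov model decoded with the Viterbi algorithm. It is not the cnvHap implementation: cnvHap is described above as a population-based haplotype model with per-probe intensity clusters learned jointly across samples and platforms, whereas this sketch segments one sample with fixed Gaussian emissions. The function name `viterbi_copy_number`, the per-state intensity means, the standard deviation and the transition probability are all illustrative assumptions.

```python
import math

# Hypothetical emission model: mean log R ratio for each total copy number.
# These values are illustrative assumptions, not parameters taken from cnvHap.
STATE_MEANS = {1: -0.6, 2: 0.0, 3: 0.4}
EMISSION_SD = 0.2
STAY_PROB = 0.95  # assumed probability of keeping the same copy number at the next probe


def log_gaussian(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_copy_number(log_r_ratios):
    """Return the most likely copy-number state at each probe of one sample."""
    if not log_r_ratios:
        return []
    states = sorted(STATE_MEANS)
    stay = math.log(STAY_PROB)
    switch = math.log((1 - STAY_PROB) / (len(states) - 1))
    # Assumed prior that favours the normal two-copy state at the first probe.
    prior = {s: math.log(0.9 if s == 2 else 0.05) for s in states}

    score = {
        s: prior[s] + log_gaussian(log_r_ratios[0], STATE_MEANS[s], EMISSION_SD)
        for s in states
    }
    back_pointers = []
    for x in log_r_ratios[1:]:
        new_score, pointers = {}, {}
        for s in states:
            # Best previous state given a transition into state s.
            best_prev = max(
                states, key=lambda p: score[p] + (stay if p == s else switch)
            )
            pointers[s] = best_prev
            new_score[s] = (
                score[best_prev]
                + (stay if best_prev == s else switch)
                + log_gaussian(x, STATE_MEANS[s], EMISSION_SD)
            )
        score = new_score
        back_pointers.append(pointers)

    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back_pointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    intensities = [0.02, -0.05, 0.01, -0.62, -0.58, -0.65, 0.03, 0.0]
    print(viterbi_copy_number(intensities))
```

With these assumed parameters, a run of three low-intensity probes is called as a one-copy segment, while a single low probe flanked by normal probes is not, because the cost of switching state twice outweighs the evidence from one probe. This illustrates why aberrations covering few probes are hard to call from a single sample on a single platform, which is the difficulty the abstract says pooling samples and platforms is meant to address.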


ACKNOWLEDGEMENTS

We thank D. Serre, A. Montpetit and D. Vincent for advice concerning Illumina arrays and D. Peiffer (Illumina)
for providing genotype data on HapMap samples. Genome Canada and Genome Quebec funded genotyping on the Illumina Human1M platform. L.J.M.C. is funded by a Research Council UK fellowship. J.E.A. is supported by the Medical Research Council. R.G.W. is supported by Johnson & Johnson and the South East England Development Agency. J.S.E.-S.M. is supported by an Imperial College Division of Medicine PhD studentship.

AUTHOR INFORMATION

AUTHORS AND AFFILIATIONS

* Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, St. Mary's Hospital, London, UK: Lachlan J M Coin
* Department of Genomics of Common Disease, School of Public Health, Imperial College London, Hammersmith Hospital, London, UK: Julian E Asher, Robin G Walters, Julia S El-Sayed Moustafa, Adam J de Smith, Philippe Froguel & Alexandra I F Blakemore
* Departments of Medicine and Human Genetics, McGill University and Genome Quebec Innovation Centre, Montreal, Quebec, Canada: Rob Sladek
* Institute of Genetics, University College London, London, UK: David J Balding
* Centre National de la Recherche Scientifique 8090, Institute of Biology, Pasteur Institute, Lille, France: Philippe Froguel


CONTRIBUTIONS

L.J.M.C. designed the project with A.I.F.B., developed the cnvHap algorithm and software,
analyzed data and wrote the paper. J.E.A. ran cnvPartition, PennCNV and QuantiSNP on the data and helped write the paper. R.G.W. and J.S.E.-S.M. provided critical comments and helped to write the paper. D.J.B. provided statistical advice. R.S. provided SNP genotype data, advised on its interpretation and edited the paper. A.J.d.S. provided aCGH data and advised on its interpretation. P.F. provided the DNA samples and coordinated the SNP genotyping. A.I.F.B. designed the project with L.J.M.C., coordinated the aCGH analysis, contributed to writing the paper and oversaw the project.

CORRESPONDING AUTHOR

Correspondence to Lachlan J M Coin.

ETHICS DECLARATIONS

COMPETING INTERESTS

The authors declare no competing financial interests.

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY TEXT AND FIGURES

Supplementary Figures 1–9, Supplementary Tables 1–3 and Supplementary Note 1 (PDF 1513 kb)

SUPPLEMENTARY SOFTWARE

Software, documentation and an example. (ZIP 9805 kb)

ABOUT THIS ARTICLE

CITE THIS ARTICLE

Coin, L., Asher, J., Walters, R. _et al._ cnvHap: an integrative population and haplotype–based multiplatform model of SNPs and CNVs. _Nat Methods_ 7, 541–546 (2010). https://doi.org/10.1038/nmeth.1466

* Received: 01 March 2010
* Accepted: 05 May 2010
* Published: 30 May 2010
* Issue Date: July 2010
* DOI: https://doi.org/10.1038/nmeth.1466