Imputation of parent-offspring trios and their effect on accuracy of genomic prediction using Bayesian method

Document Type : Original Research Article (Regular Paper)

Authors

1 Department of Animal Science, Science and Research Branch, Islamic Azad University, Tehran, Iran.

2 Department of Animal Science, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran.

3 Department of Animal Science, College of Abouraihan, University of Tehran, Tehran, Iran.

Abstract

The objective of this study was to evaluate the imputation accuracy of parent-offspring trios under different scenarios. Using simulated datasets, the performance of the Bayesian LASSO in genomic prediction was also examined. The genome consisted of 5 chromosomes, each 1 Morgan long. The number of SNPs per chromosome was 10000. One hundred QTLs were randomly distributed across the chromosomes. Three low-density SNP panels, containing 0.5k, 1k and 5k SNPs, were generated from the 10k panel. Six scenarios were evaluated, each containing two trios (dam, sire and offspring) and the sire of each dam for parent-offspring pair data. The scenarios ranged from completely genotyped to low-density genotyped offspring, combined with dams that were completely genotyped, low-density genotyped or non-genotyped. It was assumed that the genotypes of the offspring’s sires were available. The Beagle 3.3.2 program was used for imputation of parent-offspring trios. The Bayesian LASSO, as implemented in the R package “BLR”, was used to estimate the marker effects. The results showed that the accuracy of both imputation and genomic evaluation was influenced by imputation errors. Imputation accuracy ranged from 0.67 to 0.96 for genotyped individuals. Genotype imputation accuracy increased with increasing marker density of the low-density genotyping platform and when dams had high-density genotypes. Imputation accuracy decreased significantly (P < 0.05) when the dam was non-genotyped and both offspring were low-density genotyped. Among the factors affecting imputation accuracy, the imputation accuracy of SNPs with low minor allele frequency (MAF) increased considerably when the dam was completely genotyped. Imputation of non-genotyped individuals can help to include valuable phenotypes in genome-wide association studies or in genomic prediction, especially when the non-genotyped individuals have genotyped offspring.
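The panel design and the accuracy measure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline (they used Beagle 3.3.2 and the R package BLR), and it rests on several assumptions: the low-density panels are taken as evenly spaced subsets of a 10,000-marker set, genotypes are coded as allele counts (0, 1, 2), and imputation accuracy is measured either as the concordance rate or as the Pearson correlation between true and imputed genotypes. The abstract does not state which selection rule or accuracy measure was used, and the function names here are hypothetical.

```python
import math


def evenly_spaced_panel(n_high, n_low):
    """Indices of n_low markers spread evenly over n_high markers.

    Assumption: even spacing; the source only says the low-density
    panels were generated from the 10k panel.
    """
    if not 0 < n_low <= n_high:
        raise ValueError("n_low must be between 1 and n_high")
    step = n_high / n_low
    return [int(i * step) for i in range(n_low)]


def concordance(true_geno, imputed_geno):
    """Proportion of genotypes imputed exactly right."""
    pairs = list(zip(true_geno, imputed_geno))
    return sum(t == m for t, m in pairs) / len(pairs)


def pearson(x, y):
    """Pearson correlation between true and imputed allele counts."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant genotypes")
    return sxy / math.sqrt(sxx * syy)


# The three low-density panels named in the abstract, as index lists
# into the high-density marker set.
panels = {
    name: evenly_spaced_panel(10_000, size)
    for name, size in [("0.5k", 500), ("1k", 1_000), ("5k", 5_000)]
}

# Toy genotypes for one animal (made-up values, for illustration only).
true_geno = [0, 1, 2, 1, 0, 2]
imputed_geno = [0, 1, 2, 2, 0, 2]
print(concordance(true_geno, imputed_geno))
print(pearson(true_geno, imputed_geno))
```

Concordance rewards exact matches only, whereas the correlation also reflects how far an incorrect call is from the truth, so the two measures can rank scenarios differently for SNPs with low MAF.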


