Estimation of genomic breeding values using the Horseshoe prior

Ricardo Pong-Wong

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

BACKGROUND: A method for estimating genomic breeding values (GEBV) based on the Horseshoe prior was introduced and applied to the 16th QTLMAS workshop dataset, which resembles three milk production traits. The method was compared with five commonly used methods: Bayes A, Bayes B, Bayes C, Bayesian Lasso and GBLUP.

METHODS: The main difference between the methods is the prior distribution assumed for the SNP effects during estimation. The Bayesian Lasso assumes a Laplace distribution; Bayes A assumes a Student-t; Bayes B and Bayes C assume a spike-and-slab prior that combines a proportion of SNP without effect and a proportion with effects distributed as a Student-t (Bayes B) or a Gaussian (Bayes C); GBLUP is similar to a ridge regression. The density of the Horseshoe prior behaves like log(1 + 1/β²) (up to a constant). It has an infinite spike at zero and heavy tails that decay as β⁻² (slower than the Laplace or the Student-t). All methods except GBLUP were implemented using an MCMC approach, in which the relevant parameters defining the prior distributions were jointly estimated from the data. GBLUP was run using ASREML.
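The spike and tail statements above can be checked numerically. The following is a minimal Python sketch, not the paper's implementation. It assumes the usual scale-mixture form of the Horseshoe (β | λ ~ N(0, λ²), λ ~ half-Cauchy(0, 1)) with the global scale fixed at 1, and compares it with a unit-scale Laplace and a Student-t with 4 degrees of freedom. Those scale and degrees-of-freedom choices, and all function names, are illustrative assumptions; the abstract does not state them.

```python
import math

def horseshoe_density(beta, n=20000):
    """Marginal Horseshoe density (global scale 1) by midpoint quadrature.

    Assumed model: beta | lam ~ N(0, lam^2), lam ~ half-Cauchy(0, 1).
    Substituting lam = tan(theta) turns the half-Cauchy weight into the
    constant 2/pi on (0, pi/2), so only the normal density is integrated.
    """
    h = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        lam = math.tan((i + 0.5) * h)
        total += math.exp(-0.5 * (beta / lam) ** 2) / (math.sqrt(2 * math.pi) * lam)
    return (2 / math.pi) * total * h

def horseshoe_bounds(beta):
    """Logarithmic lower and upper bounds commonly quoted for the density."""
    k = 1 / math.sqrt(2 * math.pi ** 3)
    return (k / 2) * math.log(1 + 4 / beta ** 2), k * math.log(1 + 2 / beta ** 2)

def laplace_density(beta, scale=1.0):
    """Laplace density (assumed unit scale by default)."""
    return math.exp(-abs(beta) / scale) / (2 * scale)

def student_t_density(beta, nu=4.0):
    """Standard Student-t density (nu = 4 is an assumption for illustration)."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + beta ** 2 / nu) ** (-(nu + 1) / 2)

if __name__ == "__main__":
    print("beta  horseshoe  student_t  laplace")
    for b in (0.01, 0.1, 1.0, 5.0, 20.0):
        print(f"{b:5.2f} {horseshoe_density(b):10.3e} "
              f"{student_t_density(b):10.3e} {laplace_density(b):10.3e}")
```

Near zero the Horseshoe column keeps growing as β shrinks (the logarithmic spike), while for large β it falls off roughly like β⁻², which is slower than both of the other two densities under these assumed settings.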

RESULTS: The accuracy of the methods ranged from 0.74 to 0.83, representing an improvement of 44% to 78% over the traditional BLUP evaluation. The GEBV with the highest accuracy were obtained with Bayes A, Bayes B and the Horseshoe prior. The Horseshoe tended to select a smaller number of SNP and assign them larger effects, while strongly shrinking the effects of the remaining SNP towards zero.
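The shrinkage pattern described above can be illustrated in a much simpler setting than the SNP regression analysed in the paper. The sketch below assumes a single observation y | β ~ N(β, 1) with a Horseshoe prior on β whose global scale is fixed at 1 (the paper instead estimated the prior parameters jointly by MCMC), and computes the fraction of y retained by the posterior mean, E[β | y] / y. A ridge-type N(0, 1) prior, used here as an assumed stand-in for the GBLUP comparison, retains a constant one half. Function and constant names are hypothetical.

```python
import math

def horseshoe_retention(y, n=20000):
    """Fraction of y kept by the Horseshoe posterior mean, E[beta | y] / y.

    Assumed model: y | beta ~ N(beta, 1), beta | lam ~ N(0, lam^2),
    lam ~ half-Cauchy(0, 1). With s = lam / sqrt(1 + lam^2), the posterior
    of s on (0, 1) is proportional to exp(-y^2 (1 - s^2) / 2), and the
    retained fraction given s is s^2. Both integrals use the midpoint rule.
    """
    a = 0.5 * y * y
    h = 1.0 / n
    num = 0.0
    den = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        w = math.exp(-a * (1.0 - s * s))  # unnormalised posterior weight of s
        num += s * s * w
        den += w
    return num / den

# Ridge-type N(0, 1) prior with unit noise variance: E[beta | y] = y / 2.
RIDGE_RETENTION = 0.5

if __name__ == "__main__":
    print("    y  horseshoe  ridge")
    for y in (0.5, 1.0, 2.0, 4.0, 6.0):
        print(f"{y:5.1f} {horseshoe_retention(y):10.3f} {RIDGE_RETENTION:6.3f}")
```

Under these assumptions, small observations are shrunk more strongly than under the ridge-type prior, while large observations are left almost untouched, which is consistent with the behaviour reported for the Horseshoe in the results.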

CONCLUSIONS: The Horseshoe prior showed a different shrinkage pattern from the other methods. Although this had little impact on the accuracy of the GEBV for this specific dataset, it may prove a useful property for discriminating true effects from noise and thereby improve overall prediction under different scenarios.

Original language: English
Journal: B M C Proceedings
Volume: 8
Issue number: Suppl 5
Pages (from-to): S6
ISSN: 1753-6561
DOI: 10.1186/1753-6561-8-S5-S6
Publication status: Published - 2014
Externally published: Yes

Cite this

Pong-Wong, Ricardo. / Estimation of genomic breeding values using the Horseshoe prior. In: B M C Proceedings. 2014 ; Vol. 8, No. Suppl 5. pp. S6.
@article{8d0b91963b7c4f889fe6af970153d327,
title = "Estimation of genomic breeding values using the Horseshoe prior",
abstract = "BACKGROUND: A method for estimating genomic breeding values (GEBV) based on the Horseshoe prior was introduced and applied to the 16th QTLMAS workshop dataset, which resembles three milk production traits. The method was compared with five commonly used methods: Bayes A, Bayes B, Bayes C, Bayesian Lasso and GBLUP. METHODS: The main difference between the methods is the prior distribution assumed for the SNP effects during estimation. The Bayesian Lasso assumes a Laplace distribution; Bayes A assumes a Student-t; Bayes B and Bayes C assume a spike-and-slab prior that combines a proportion of SNP without effect and a proportion with effects distributed as a Student-t (Bayes B) or a Gaussian (Bayes C); GBLUP is similar to a ridge regression. The density of the Horseshoe prior behaves like log(1 + 1/β²) (up to a constant). It has an infinite spike at zero and heavy tails that decay as β⁻² (slower than the Laplace or the Student-t). All methods except GBLUP were implemented using an MCMC approach, in which the relevant parameters defining the prior distributions were jointly estimated from the data. GBLUP was run using ASREML. RESULTS: The accuracy of the methods ranged from 0.74 to 0.83, representing an improvement of 44{\%} to 78{\%} over the traditional BLUP evaluation. The GEBV with the highest accuracy were obtained with Bayes A, Bayes B and the Horseshoe prior. The Horseshoe tended to select a smaller number of SNP and assign them larger effects, while strongly shrinking the effects of the remaining SNP towards zero. CONCLUSIONS: The Horseshoe prior showed a different shrinkage pattern from the other methods. Although this had little impact on the accuracy of the GEBV for this specific dataset, it may prove a useful property for discriminating true effects from noise and thereby improve overall prediction under different scenarios.",
author = "Ricardo Pong-Wong",
year = "2014",
doi = "10.1186/1753-6561-8-S5-S6",
language = "English",
volume = "8",
pages = "S6",
journal = "B M C Proceedings",
issn = "1753-6561",
publisher = "BioMed Central",
number = "Suppl 5",

}

TY - JOUR

T1 - Estimation of genomic breeding values using the Horseshoe prior

AU - Pong-Wong, Ricardo

PY - 2014

Y1 - 2014

N2 - BACKGROUND: A method for estimating genomic breeding values (GEBV) based on the Horseshoe prior was introduced and applied to the 16th QTLMAS workshop dataset, which resembles three milk production traits. The method was compared with five commonly used methods: Bayes A, Bayes B, Bayes C, Bayesian Lasso and GBLUP. METHODS: The main difference between the methods is the prior distribution assumed for the SNP effects during estimation. The Bayesian Lasso assumes a Laplace distribution; Bayes A assumes a Student-t; Bayes B and Bayes C assume a spike-and-slab prior that combines a proportion of SNP without effect and a proportion with effects distributed as a Student-t (Bayes B) or a Gaussian (Bayes C); GBLUP is similar to a ridge regression. The density of the Horseshoe prior behaves like log(1 + 1/β²) (up to a constant). It has an infinite spike at zero and heavy tails that decay as β⁻² (slower than the Laplace or the Student-t). All methods except GBLUP were implemented using an MCMC approach, in which the relevant parameters defining the prior distributions were jointly estimated from the data. GBLUP was run using ASREML. RESULTS: The accuracy of the methods ranged from 0.74 to 0.83, representing an improvement of 44% to 78% over the traditional BLUP evaluation. The GEBV with the highest accuracy were obtained with Bayes A, Bayes B and the Horseshoe prior. The Horseshoe tended to select a smaller number of SNP and assign them larger effects, while strongly shrinking the effects of the remaining SNP towards zero. CONCLUSIONS: The Horseshoe prior showed a different shrinkage pattern from the other methods. Although this had little impact on the accuracy of the GEBV for this specific dataset, it may prove a useful property for discriminating true effects from noise and thereby improve overall prediction under different scenarios.

U2 - 10.1186/1753-6561-8-S5-S6

DO - 10.1186/1753-6561-8-S5-S6

M3 - Journal article

C2 - 25519520

VL - 8

SP - S6

JO - B M C Proceedings

JF - B M C Proceedings

SN - 1753-6561

IS - Suppl 5

ER -