Estimation of genomic breeding values using the Horseshoe prior

Ricardo Pong-Wong

Publication: Contribution to journal › Journal article › Research › peer-reviewed

Abstract

BACKGROUND: A method for estimating genomic breeding values (GEBV) based on the Horseshoe prior was introduced and applied to the 16th QTLMAS workshop dataset, which resembles three milk production traits. The method was compared with five commonly used methods: Bayes A, Bayes B, Bayes C, Bayesian Lasso and GBLUP.

METHODS: The main difference between the methods is the prior distribution assumed for the SNP effects during estimation. For the Bayesian Lasso the prior is a Laplace distribution; for Bayes A it is a Student-t; for Bayes B and Bayes C it is a spike-and-slab prior combining a proportion of SNP without effect and a proportion with effects distributed as a Student-t (Bayes B) or a Gaussian (Bayes C); for GBLUP it is similar to that of ridge regression. The density of the Horseshoe prior behaves like log(1 + 1/β²) (up to a constant). It has an infinite spike at zero and heavy tails that decay as β⁻² (more slowly than those of the Laplace or the Student-t). All methods except GBLUP were implemented using an MCMC approach, in which the relevant parameters defining the prior distributions were estimated jointly from the data. GBLUP was fitted using ASREML.
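The tail behaviour described above can be checked numerically. The following is a minimal sketch, not the paper's implementation: it assumes unit scale for the Laplace and 4 degrees of freedom for the Student-t (both arbitrary illustrative choices), and it uses the unnormalised form log(1 + 1/β²) for the Horseshoe. All function names are hypothetical.

```python
import math


def laplace_pdf(beta, scale=1.0):
    """Laplace density (the Bayesian Lasso prior); scale=1 is an assumption."""
    return math.exp(-abs(beta) / scale) / (2.0 * scale)


def student_t_pdf(beta, df=4.0):
    """Student-t density (the Bayes A prior); df=4 is an assumption."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + beta * beta / df) ** (-(df + 1) / 2)


def horseshoe_shape(beta):
    """Unnormalised Horseshoe shape quoted in the abstract: log(1 + 1/beta^2)."""
    if beta == 0:
        return math.inf  # the infinite spike at zero
    return math.log1p(1.0 / (beta * beta))


def tail_ratio(f, a=10.0, b=20.0):
    """How much of the density at a is left at b; larger means a heavier tail."""
    return f(b) / f(a)


for name, f in [("Horseshoe", horseshoe_shape),
                ("Student-t", student_t_pdf),
                ("Laplace", laplace_pdf)]:
    print(f"{name:10s} f(20)/f(10) = {tail_ratio(f):.3g}")
```

Doubling β from 10 to 20 leaves roughly a quarter of the Horseshoe value, consistent with a β⁻² decay, whereas the Student-t and Laplace densities under these settings fall off faster.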

RESULTS: The accuracy of all methods ranged from 0.74 to 0.83, an improvement of 44% to 78% over the traditional BLUP evaluation. The GEBV with the highest accuracy were obtained with Bayes A, Bayes B and the Horseshoe prior. The Horseshoe tended to select a smaller number of SNP and assign them larger effects, while strongly shrinking the effects of the remaining SNP towards zero.
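The shrinkage pattern reported for the Horseshoe can be illustrated with the prior's standard scale-mixture form, in which each effect has its own local scale λ drawn from a half-Cauchy(0, 1) distribution. This is a sketch under assumptions not stated in the abstract: the global scale and the residual variance are fixed at 1, so the shrinkage weight is κ = 1/(1 + λ²), with κ near 1 meaning the effect is shrunk to zero and κ near 0 meaning it is left almost untouched. Function names and the sample size are illustrative.

```python
import math
import random


def sample_shrinkage_weights(n, seed=0):
    """Draw n shrinkage weights kappa = 1 / (1 + lambda^2), lambda ~ half-Cauchy(0, 1)."""
    rng = random.Random(seed)
    weights = []
    for _ in range(n):
        # Inverse-CDF draw from a standard Cauchy, folded to the positive half-line.
        lam = abs(math.tan(math.pi * (rng.random() - 0.5)))
        weights.append(1.0 / (1.0 + lam * lam))
    return weights


def fraction_in(values, lo, hi):
    """Share of values falling in the closed interval [lo, hi]."""
    return sum(lo <= v <= hi for v in values) / len(values)


kappa = sample_shrinkage_weights(100_000)
near_zero = fraction_in(kappa, 0.0, 0.1)   # effects left almost unshrunk
middle = fraction_in(kappa, 0.45, 0.55)    # effects shrunk by about half
near_one = fraction_in(kappa, 0.9, 1.0)    # effects shrunk almost to zero
print(near_zero, middle, near_one)
```

Under these assumptions κ follows a Beta(1/2, 1/2) distribution, so most of the prior mass sits near 0 or 1 and little in between, which matches the pattern of a few large effects and many effects close to zero.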

CONCLUSIONS: The Horseshoe prior showed a different shrinkage pattern from the other methods. Although for this specific dataset this had little impact on the accuracy of the GEBV, it may prove a useful property for discriminating true effects from noise, and thereby improve overall prediction under different scenarios.

Original language: English
Journal: B M C Proceedings
Volume: 8
Issue number: Suppl 5
Pages (from-to): S6
ISSN: 1753-6561
DOI: 10.1186/1753-6561-8-S5-S6
Status: Published - 2014
Published externally: Yes

Fingerprint

Students
Single Nucleotide Polymorphism
Noise
alachlor
Education
Milk
Datasets

Cite this

Pong-Wong, Ricardo. / Estimation of genomic breeding values using the Horseshoe prior. In: B M C Proceedings. 2014 ; Vol. 8, No. Suppl 5. p. S6.
@article{8d0b91963b7c4f889fe6af970153d327,
title = "Estimation of genomic breeding values using the Horseshoe prior",
abstract = "BACKGROUND: A method for estimating genomic breeding values (GEBV) based on the Horseshoe prior was introduced and applied to the 16th QTLMAS workshop dataset, which resembles three milk production traits. The method was compared with five commonly used methods: Bayes A, Bayes B, Bayes C, Bayesian Lasso and GBLUP. METHODS: The main difference between the methods is the prior distribution assumed for the SNP effects during estimation. For the Bayesian Lasso the prior is a Laplace distribution; for Bayes A it is a Student-t; for Bayes B and Bayes C it is a spike-and-slab prior combining a proportion of SNP without effect and a proportion with effects distributed as a Student-t (Bayes B) or a Gaussian (Bayes C); for GBLUP it is similar to that of ridge regression. The density of the Horseshoe prior behaves like $\log(1 + 1/\beta^{2})$ (up to a constant). It has an infinite spike at zero and heavy tails that decay as $\beta^{-2}$ (more slowly than those of the Laplace or the Student-t). All methods except GBLUP were implemented using an MCMC approach, in which the relevant parameters defining the prior distributions were estimated jointly from the data. GBLUP was fitted using ASREML. RESULTS: The accuracy of all methods ranged from 0.74 to 0.83, an improvement of 44{\%} to 78{\%} over the traditional BLUP evaluation. The GEBV with the highest accuracy were obtained with Bayes A, Bayes B and the Horseshoe prior. The Horseshoe tended to select a smaller number of SNP and assign them larger effects, while strongly shrinking the effects of the remaining SNP towards zero. CONCLUSIONS: The Horseshoe prior showed a different shrinkage pattern from the other methods. Although for this specific dataset this had little impact on the accuracy of the GEBV, it may prove a useful property for discriminating true effects from noise, and thereby improve overall prediction under different scenarios.",
author = "Ricardo Pong-Wong",
year = "2014",
doi = "10.1186/1753-6561-8-S5-S6",
language = "English",
volume = "8",
pages = "S6",
journal = "B M C Proceedings",
issn = "1753-6561",
publisher = "BioMed Central",
number = "Suppl 5",

}

Estimation of genomic breeding values using the Horseshoe prior. / Pong-Wong, Ricardo.

In: B M C Proceedings, Vol. 8, No. Suppl 5, 2014, p. S6.


TY - JOUR

T1 - Estimation of genomic breeding values using the Horseshoe prior

AU - Pong-Wong, Ricardo

PY - 2014

Y1 - 2014

N2 - BACKGROUND: A method for estimating genomic breeding values (GEBV) based on the Horseshoe prior was introduced and applied to the 16th QTLMAS workshop dataset, which resembles three milk production traits. The method was compared with five commonly used methods: Bayes A, Bayes B, Bayes C, Bayesian Lasso and GBLUP. METHODS: The main difference between the methods is the prior distribution assumed for the SNP effects during estimation. For the Bayesian Lasso the prior is a Laplace distribution; for Bayes A it is a Student-t; for Bayes B and Bayes C it is a spike-and-slab prior combining a proportion of SNP without effect and a proportion with effects distributed as a Student-t (Bayes B) or a Gaussian (Bayes C); for GBLUP it is similar to that of ridge regression. The density of the Horseshoe prior behaves like log(1 + 1/β²) (up to a constant). It has an infinite spike at zero and heavy tails that decay as β⁻² (more slowly than those of the Laplace or the Student-t). All methods except GBLUP were implemented using an MCMC approach, in which the relevant parameters defining the prior distributions were estimated jointly from the data. GBLUP was fitted using ASREML. RESULTS: The accuracy of all methods ranged from 0.74 to 0.83, an improvement of 44% to 78% over the traditional BLUP evaluation. The GEBV with the highest accuracy were obtained with Bayes A, Bayes B and the Horseshoe prior. The Horseshoe tended to select a smaller number of SNP and assign them larger effects, while strongly shrinking the effects of the remaining SNP towards zero. CONCLUSIONS: The Horseshoe prior showed a different shrinkage pattern from the other methods. Although for this specific dataset this had little impact on the accuracy of the GEBV, it may prove a useful property for discriminating true effects from noise, and thereby improve overall prediction under different scenarios.

U2 - 10.1186/1753-6561-8-S5-S6

DO - 10.1186/1753-6561-8-S5-S6

M3 - Journal article

C2 - 25519520

VL - 8

SP - S6

JO - B M C Proceedings

JF - B M C Proceedings

SN - 1753-6561

IS - Suppl 5

ER -