TY - JOUR
T1 - Comparative evaluation of the heterozygous variant standard deviation as a quality measure for next-generation sequencing
AU - Høy Hansen, Marcus
AU - Steensboe Lang, Cecilie
AU - Abildgaard, Niels
AU - Nyvold, Charlotte Guldborg
N1 - Publisher Copyright: © 2022 The Author(s)
PY - 2022/11
Y1 - 2022/11
N2 - Next-generation sequencing holds unprecedented throughput in terms of informational content to cost. The technology has entered the scene in laboratory diagnostics and offers flexible workflows in biomedical research. However, the rapid acquisition of genomic data also gives rise to a substantial fraction of sequencing artifacts, causing the detection of false-positive germline variants or erroneous somatic mutations. Consequently, there is a pressing need for efficient and practical quality assessment in sequencing projects. In this study, we investigate using heterozygous variant allele frequency (VAF) standard deviation (σ) for supplementary quality control. Whereas several proposed quality metrics are based on empirical assessments, the dispersion of the allele frequencies reflects a direct approximation of the inherent and discrete features of a diploid genome. Consequently, homologous chromosomes display heterozygous VAF of approximately 1/2. Based on the meta-analysis of 152 whole-exome sequencing data sets, we found that σ reflects both sequencing coverage and noise and can be effectively modeled. It is concluded that the relative comparison of heterozygous VAF σ provides a practical handle for quality assessment, even for samples afflicted with copy-number alterations. The approach can be implemented when performing whole-exome, whole-genome, or targeted panel sequencing and helps identify problematic samples, such as those retrieved from archived formalin-fixed paraffin-embedded tissue.
AB - Next-generation sequencing holds unprecedented throughput in terms of informational content to cost. The technology has entered the scene in laboratory diagnostics and offers flexible workflows in biomedical research. However, the rapid acquisition of genomic data also gives rise to a substantial fraction of sequencing artifacts, causing the detection of false-positive germline variants or erroneous somatic mutations. Consequently, there is a pressing need for efficient and practical quality assessment in sequencing projects. In this study, we investigate using heterozygous variant allele frequency (VAF) standard deviation (σ) for supplementary quality control. Whereas several proposed quality metrics are based on empirical assessments, the dispersion of the allele frequencies reflects a direct approximation of the inherent and discrete features of a diploid genome. Consequently, homologous chromosomes display heterozygous VAF of approximately 1/2. Based on the meta-analysis of 152 whole-exome sequencing data sets, we found that σ reflects both sequencing coverage and noise and can be effectively modeled. It is concluded that the relative comparison of heterozygous VAF σ provides a practical handle for quality assessment, even for samples afflicted with copy-number alterations. The approach can be implemented when performing whole-exome, whole-genome, or targeted panel sequencing and helps identify problematic samples, such as those retrieved from archived formalin-fixed paraffin-embedded tissue.
U2 - 10.1016/j.jbi.2022.104234
DO - 10.1016/j.jbi.2022.104234
M3 - Journal article
C2 - 36283582
AN - SCOPUS:85140907003
SN - 1532-0464
VL - 135
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
M1 - 104234
ER -