Despite the strong genetic background of cognitive function, only a limited number of single nucleotide polymorphisms (SNPs) have been identified in genome-wide association studies (GWASs). We hypothesize that this is partially due to misspecified models of the phenotype distribution and of the relationship between SNP dosage and the level of the phenotype. To overcome these issues, we introduced an assumption-free method based on the generalized correlation coefficient (GCC) in a GWAS of cognitive function in Danish and Chinese twins and compared its performance with that of traditional linear models. The GCC-based GWAS identified two significant SNPs in the Danish samples (rs71419535, p = 1.47e-08; rs905838, p = 1.69e-08) and two significant SNPs in the Chinese samples (rs2292999, p = 9.27e-10; rs17019635, p = 2.50e-09). In contrast, linear models failed to detect any genome-wide significant SNPs. More of the top significant genes overlapped between the two samples in the GCC-based GWAS than in the linear-model GWAS. The GCC model identified significant genetic variants missed by conventional linear models, with more replicated genes and biological pathways related to cognitive function. Moreover, the GCC-based GWAS was robust in handling correlated samples such as twin pairs. GCC is a useful statistical method for GWAS that complements traditional linear models by capturing genetic effects beyond the additive assumption.
- generalized correlation coefficient