Evaluating correlation coefficients for clustering gene expression profiles of cancer

Pablo A. Jaskowiak*, Ricardo J.G.B. Campello, Ivan G. Costa

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Article in proceedings, Research, peer-review

Abstract

Cluster analysis is usually the first step adopted to unveil information from gene expression data. One of its common applications is the clustering of cancer samples, associated with the detection of previously unknown cancer subtypes. Although guidelines have been established concerning the choice of appropriate clustering algorithms, little attention has been given to the subject of proximity measures. Whereas the Pearson correlation coefficient appears as the de facto proximity measure in this scenario, no comprehensive study analyzing other correlation coefficients as alternatives to it has been conducted. Considering such facts, we evaluated five correlation coefficients (along with Euclidean distance) regarding the clustering of cancer samples. Our evaluation was conducted on 35 publicly available datasets covering both (i) intrinsic separation ability and (ii) clustering predictive ability of the correlation coefficients. Our results support that correlation coefficients rarely considered in the gene expression literature may provide competitive results to more generally employed ones. Finally, we show that a recently introduced measure arises as a promising alternative to the commonly employed Pearson, providing competitive and even superior results to it.
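As a rough illustration of why the choice of proximity measure matters when clustering samples, the sketch below compares a correlation-based dissimilarity with Euclidean distance on three toy profiles. It is not the authors' evaluation protocol. The abstract names only Pearson and Euclidean distance, so the use of Spearman as a second coefficient, the `1 - r` transform, the function names, and the toy data are all assumptions made for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman coefficient: Pearson applied to ranks (assumed alternative)."""
    return pearson(ranks(x), ranks(y))

def correlation_distance(x, y, coefficient=pearson):
    """Assumed transform from a correlation in [-1, 1] to a dissimilarity in [0, 2]."""
    return 1.0 - coefficient(x, y)

def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical expression profiles for three samples (toy values).
sample_a = [1.0, 2.0, 3.0, 4.0]
sample_b = [10.0, 20.0, 30.0, 40.0]  # same trend as sample_a, larger scale
sample_c = [4.0, 3.0, 2.0, 1.0]      # opposite trend, similar scale

for name, other in (("b", sample_b), ("c", sample_c)):
    print(
        name,
        round(correlation_distance(sample_a, other), 3),
        round(correlation_distance(sample_a, other, spearman), 3),
        round(euclidean(sample_a, other), 3),
    )
```

On these toy values the correlation-based dissimilarities place `sample_a` next to `sample_b`, while Euclidean distance places it next to `sample_c`, so a clustering algorithm fed one measure or the other could group the samples differently.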

Original language: English
Title of host publication: Advances in Bioinformatics and Computational Biology - 7th Brazilian Symposium on Bioinformatics, BSB 2012, Proceedings
Publisher: Springer
Publication date: 2012
Pages: 120-131
ISBN (Print): 9783642319266
Publication status: Published - 2012
Externally published: Yes
Event: 7th Brazilian Symposium on Bioinformatics, BSB 2012 - Campo Grande, Brazil
Duration: 15 Aug 2012 to 17 Aug 2012

Conference

Conference: 7th Brazilian Symposium on Bioinformatics, BSB 2012
Country/Territory: Brazil
City: Campo Grande
Period: 15/08/2012 to 17/08/2012
Sponsor: Conselho Nacional Desenvolvimento Cientifico Tecnologico (CNPq); Coordenacao de Aperfeicoamento Pessoal de Nivel Superior (CAPES); Fund. Apoio Desenvolv. Ensino, Cienc. Tecnol. Estado, Mato Grosso do Sul (Fundect); Fundacao de Apoio a Pesquisa ao Ensino e a Cultura (FAPEC)
Series: Lecture Notes in Computer Science
Volume: 7409 LNBI
ISSN: 0302-9743

Keywords

  • clustering
  • correlation
  • gene expression
  • proximity measure
