Exact and heuristic algorithms for weighted cluster editing

Sven Rahmann, Tobias Wittkop, Jan Baumbach, Marcel Martin, Anke Truss, Sebastian Böcker

Publikation: Bidrag til tidsskriftTidsskriftartikelForskningpeer review

Resumé

Clustering objects according to given similarity or distance values is a ubiquitous problem in computational biology with diverse applications, e.g., in defining families of orthologous genes, or in the analysis of microarray experiments. While there exists a plenitude of methods, many of them produce clusterings that can be further improved. "Cleaning up" initial clusterings can be formalized as projecting a graph on the space of transitive graphs; it is also known as the cluster editing or cluster partitioning problem in the literature. In contrast to previous work on cluster editing, we allow arbitrary weights on the similarity graph. To solve the so-defined weighted transitive graph projection problem, we present (1) the first exact fixed-parameter algorithm, (2) a polynomial-time greedy algorithm that returns the optimal result on a well-defined subset of "close-to-transitive" graphs and works heuristically on other graphs, and (3) a fast heuristic that uses ideas similar to those from the Fruchterman-Reingold graph layout algorithm. We compare quality and running times of these algorithms on both artificial graphs and protein similarity graphs derived from the 66 organisms of the COG dataset.
OriginalsprogEngelsk
TidsskriftC S B. Conference Proceedings
Vol/bind6
Sider (fra-til)391-401
Antal sider11
ISSN1752-7791
StatusUdgivet - 2007
Udgivet eksterntJa

Fingeraftryk

Cluster Analysis
Computational Biology
Weights and Measures
Heuristics
Proteins
Datasets

Citer dette

Rahmann, S., Wittkop, T., Baumbach, J., Martin, M., Truss, A., & Böcker, S. (2007). Exact and heuristic algorithms for weighted cluster editing. C S B. Conference Proceedings, 6, 391-401.
Rahmann, Sven ; Wittkop, Tobias ; Baumbach, Jan ; Martin, Marcel ; Truss, Anke ; Böcker, Sebastian. / Exact and heuristic algorithms for weighted cluster editing. I: C S B. Conference Proceedings. 2007 ; Bind 6. s. 391-401.
@article{afdc0f38255d43a9b9057fdd80ab59d7,
title = "Exact and heuristic algorithms for weighted cluster editing",
abstract = "Clustering objects according to given similarity or distance values is a ubiquitous problem in computational biology with diverse applications, e.g., in defining families of orthologous genes, or in the analysis of microarray experiments. While there exists a plenitude of methods, many of them produce clusterings that can be further improved. {"}Cleaning up{"} initial clusterings can be formalized as projecting a graph on the space of transitive graphs; it is also known as the cluster editing or cluster partitioning problem in the literature. In contrast to previous work on cluster editing, we allow arbitrary weights on the similarity graph. To solve the so-defined weighted transitive graph projection problem, we present (1) the first exact fixed-parameter algorithm, (2) a polynomial-time greedy algorithm that returns the optimal result on a well-defined subset of {"}close-to-transitive{"} graphs and works heuristically on other graphs, and (3) a fast heuristic that uses ideas similar to those from the Fruchterman-Reingold graph layout algorithm. We compare quality and running times of these algorithms on both artificial graphs and protein similarity graphs derived from the 66 organisms of the COG dataset.",
author = "Sven Rahmann and Tobias Wittkop and Jan Baumbach and Marcel Martin and Anke Truss and Sebastian B{\"o}cker",
year = "2007",
language = "English",
volume = "6",
pages = "391--401",
journal = "C S B. Conference Proceedings",
issn = "1752-7791",
publisher = "Imperial College Press",

}

Rahmann, S, Wittkop, T, Baumbach, J, Martin, M, Truss, A & Böcker, S 2007, 'Exact and heuristic algorithms for weighted cluster editing', C S B. Conference Proceedings, bind 6, s. 391-401.

Exact and heuristic algorithms for weighted cluster editing. / Rahmann, Sven; Wittkop, Tobias; Baumbach, Jan; Martin, Marcel; Truss, Anke; Böcker, Sebastian.

I: C S B. Conference Proceedings, Bind 6, 2007, s. 391-401.

Publikation: Bidrag til tidsskriftTidsskriftartikelForskningpeer review

TY - JOUR

T1 - Exact and heuristic algorithms for weighted cluster editing

AU - Rahmann, Sven

AU - Wittkop, Tobias

AU - Baumbach, Jan

AU - Martin, Marcel

AU - Truss, Anke

AU - Böcker, Sebastian

PY - 2007

Y1 - 2007

N2 - Clustering objects according to given similarity or distance values is a ubiquitous problem in computational biology with diverse applications, e.g., in defining families of orthologous genes, or in the analysis of microarray experiments. While there exists a plenitude of methods, many of them produce clusterings that can be further improved. "Cleaning up" initial clusterings can be formalized as projecting a graph on the space of transitive graphs; it is also known as the cluster editing or cluster partitioning problem in the literature. In contrast to previous work on cluster editing, we allow arbitrary weights on the similarity graph. To solve the so-defined weighted transitive graph projection problem, we present (1) the first exact fixed-parameter algorithm, (2) a polynomial-time greedy algorithm that returns the optimal result on a well-defined subset of "close-to-transitive" graphs and works heuristically on other graphs, and (3) a fast heuristic that uses ideas similar to those from the Fruchterman-Reingold graph layout algorithm. We compare quality and running times of these algorithms on both artificial graphs and protein similarity graphs derived from the 66 organisms of the COG dataset.

AB - Clustering objects according to given similarity or distance values is a ubiquitous problem in computational biology with diverse applications, e.g., in defining families of orthologous genes, or in the analysis of microarray experiments. While there exists a plenitude of methods, many of them produce clusterings that can be further improved. "Cleaning up" initial clusterings can be formalized as projecting a graph on the space of transitive graphs; it is also known as the cluster editing or cluster partitioning problem in the literature. In contrast to previous work on cluster editing, we allow arbitrary weights on the similarity graph. To solve the so-defined weighted transitive graph projection problem, we present (1) the first exact fixed-parameter algorithm, (2) a polynomial-time greedy algorithm that returns the optimal result on a well-defined subset of "close-to-transitive" graphs and works heuristically on other graphs, and (3) a fast heuristic that uses ideas similar to those from the Fruchterman-Reingold graph layout algorithm. We compare quality and running times of these algorithms on both artificial graphs and protein similarity graphs derived from the 66 organisms of the COG dataset.

M3 - Journal article

C2 - 17951842

VL - 6

SP - 391

EP - 401

JO - C S B. Conference Proceedings

JF - C S B. Conference Proceedings

SN - 1752-7791

ER -

Rahmann S, Wittkop T, Baumbach J, Martin M, Truss A, Böcker S. Exact and heuristic algorithms for weighted cluster editing. C S B. Conference Proceedings. 2007;6:391-401.