SynTemp: Efficient Extraction of Graph-Based Reaction Rules from Large-Scale Reaction Databases

Tieu Long Phan*, Klaus Weinbauer, Marcos E. González Laffitte, Yingjie Pan, Daniel Merkle, Jakob L. Andersen, Rolf Fagerberg, Christoph Flamm, Peter F. Stadler

*Corresponding author for this work

Research output: Contribution to journal, Journal article, Research, peer-review

Abstract

Reaction templates are graphs that represent the reaction center as well as the surrounding context in order to specify salient features of chemical reactions. They are subgraphs of imaginary transition states, which are equivalent to double pushout graph rewriting rules and thus can be applied directly to predict reaction outcomes at the structural formula level. We introduce here SynTemp, a framework designed to extract and hierarchically cluster reaction templates from large-scale reaction data repositories. Rule inference is implemented as a robust graph-theoretic approach, which first computes an atom-atom mapping (AAM) as a consensus over partial predictions from multiple state-of-the-art tools, then augments the raw AAM with mechanistically relevant hydrogen atoms, and finally extracts the reaction center extended by relevant context. SynTemp achieves an exceptional accuracy of 99.5% and a success rate of 71.23% in obtaining AAMs on the chemical reaction dataset. Hierarchical clustering of the extended reaction centers based on topological features results in a library of 311 transformation rules explaining 86% of the reaction dataset.
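
To make the idea of a reaction center "extended by relevant context" concrete, the following is a minimal sketch in plain Python, not SynTemp's actual API. It assumes an atom-atom mapping is already available and represents each side of the reaction as a dictionary from mapped atom pairs to bond orders. The function names (`its_graph`, `reaction_center`, `extend_context`), the helper `bond`, and the toy substitution example are all hypothetical and chosen only for illustration; hydrogen completion, consensus mapping, and clustering are not covered.

```python
from collections import deque


def bond(i, j):
    """Hypothetical helper: an undirected bond between two atom-map numbers."""
    return frozenset((i, j))


def its_graph(reactant_bonds, product_bonds):
    """Overlay reactant and product bonds on the shared atom-map numbers.

    Each edge is labelled (order_before, order_after); 0 means "no bond".
    """
    edges = set(reactant_bonds) | set(product_bonds)
    return {e: (reactant_bonds.get(e, 0), product_bonds.get(e, 0)) for e in edges}


def reaction_center(its):
    """Atoms incident to at least one edge whose bond order changes."""
    return {a for e, (before, after) in its.items() if before != after for a in e}


def extend_context(its, center, radius):
    """Grow the center by breadth-first search, up to `radius` bonds away."""
    adjacency = {}
    for e in its:
        i, j = tuple(e)
        adjacency.setdefault(i, set()).add(j)
        adjacency.setdefault(j, set()).add(i)
    seen = set(center)
    queue = deque((a, 0) for a in center)
    while queue:
        atom, dist = queue.popleft()
        if dist == radius:
            continue  # do not expand beyond the requested radius
        for nb in adjacency.get(atom, ()):
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, dist + 1))
    return seen


# Toy example (assumed mapping, heavy atoms only): a substitution on a carbon chain,
# C5-C4-C1-Br2 + O3  ->  C5-C4-C1-O3 + Br2
reactants = {bond(5, 4): 1, bond(4, 1): 1, bond(1, 2): 1}
products = {bond(5, 4): 1, bond(4, 1): 1, bond(1, 3): 1}

its = its_graph(reactants, products)
center = reaction_center(its)
template_r1 = extend_context(its, center, radius=1)
template_r2 = extend_context(its, center, radius=2)
```

In this toy case the center consists of the carbon and the two exchanged atoms (map numbers 1, 2, 3); a radius of 1 additionally pulls in atom 4, and a radius of 2 pulls in atom 5. Larger radii give more specific templates, which is one plausible axis along which templates could be grouped hierarchically.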

Original language: English
Journal: Journal of Chemical Information and Modeling
Volume: 65
Issue number: 6
Pages (from-to): 2882–2896
ISSN: 1549-9596
Publication status: Published - 28 Feb 2025
