Explaining semi-supervised text alignment through visualization

Christofer Meinecke, David Wrisley, Stefan Jänicke*

*Corresponding author for this work

Research output: Contribution to journal, Journal article, Research, peer-review

Abstract

The analysis of variance in complex text traditions is an arduous task when carried out manually. Text alignment algorithms provide domain experts with a robust alternative to such repetitive tasks. Existing white-box approaches allow the digital humanities to establish syntax-based metrics taking into account the spelling, morphology and order of words. However, they produce limited results, as semantic meanings are typically not taken into account. Our interdisciplinary collaboration between visualization and digital humanities combined a semi-supervised text alignment approach based on word embeddings that take not only syntactic but also semantic text features into account, thereby improving the overall quality of the alignment. In our collaboration, we developed different visual interfaces that communicate the word distribution in high-dimensional vector space generated by the underlying neural network for increased transparency, assessment of the tool's reliability and overall improved hypothesis generation. We further offer visual means to enable the expert reader to feed domain knowledge into the system at multiple levels with the aim of improving both the product and the process of text alignment. This ultimately illustrates how visualization can engage with and augment complex modes of reading in the humanities.

Original language: English
Journal: IEEE Transactions on Visualization and Computer Graphics
Volume: 28
Issue number: 12
Pages (from-to): 4797-4809
ISSN: 1077-2626
Publication status: Published - 1 Dec 2022

Keywords

  • Collaboration
  • Data visualization
  • Human-in-the-loop
  • Pipelines
  • Professional Reading
  • Semantics
  • Task analysis
  • Text Alignment
  • Visual analytics
  • Visualization
  • Visualization in the Humanities
  • Word Embeddings
