The Henrik Pontoppidan Text Corpora and Database

Project: Private Foundations

Project Details

Description

Texts (in both printed and digital form), library catalogues and well-designed bibliographical databases are the cornerstones of research in many areas of the Humanities (Kjældgaard & Bjerring-Hansen, 2015). In literary studies especially, they are a central part of the research infrastructure. The proposed project, a collaboration between the Pontoppidan Centre (SDU) and the University Library of Southern Denmark, aims to create a full-text database containing all works by and about Nobel Laureate Henrik Pontoppidan. Such a database is first and foremost a prerequisite for the current research project “Re-Investigating Henrik Pontoppidan” (PI K. Jensen Husen), which examines earlier research on the works of Henrik Pontoppidan, maps the connections between the different critical works through network analysis, and tracks references to specific Pontoppidan texts in different parts of the reception through citation analysis. To make this project feasible, a database containing all critical literature on Henrik Pontoppidan in digital form must be created, and we are applying to the Carlsberg Foundation for the funds to do so. The records and texts are currently scattered across different collections, such as the Bibliography of Danish Literary History (hosted by the Royal Library, but currently unavailable), the bibliography Henrik Pontoppidans forfatterskab (Skjerbæk, Skjerbæk & Herring, 2006), and www.henrikpontoppidan.dk, which makes it difficult to access them all at once and work with them computationally.

As stated above, bibliographic databases are an important part of the Humanities research infrastructure, yet their creation and maintenance are often neglected. Many bibliographies are so-called enumerative bibliographies, which list all works concerning a specific topic (the works of an author, a country, etc. (Kondrup, 2011)); one example is Henrik Pontoppidans forfatterskab. What we want to create instead is a relational database that allows not only cross-referencing through metadata but also the examination of patterns in works cited (both primary and secondary literature) and citation network analysis. Are specific works of Pontoppidan cited more often within a specific research area, e.g. philosophy of cognition or Marxist literary analysis? And, if so, is it possible to extract those patterns from citations? To answer these questions, we need all secondary works collected in a single digital bibliography.
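To illustrate the kind of question a relational structure makes answerable, the following is a minimal sketch in Python using the standard-library sqlite3 module. The table names, column names and the helper functions are assumptions made for illustration only, not the project's actual schema, and the secondary works in the demo data are invented placeholders.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE work (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    kind TEXT NOT NULL CHECK (kind IN ('primary', 'secondary')),
    research_area TEXT            -- only filled in for secondary works
);
CREATE TABLE citation (
    citing_id INTEGER NOT NULL REFERENCES work(id),
    cited_id  INTEGER NOT NULL REFERENCES work(id),
    PRIMARY KEY (citing_id, cited_id)
);
"""


def citations_by_area(conn):
    """Count how often each primary work is cited within each research area."""
    query = """
        SELECT s.research_area, p.title, COUNT(*) AS n
        FROM citation c
        JOIN work s ON s.id = c.citing_id AND s.kind = 'secondary'
        JOIN work p ON p.id = c.cited_id  AND p.kind = 'primary'
        GROUP BY s.research_area, p.title
        ORDER BY s.research_area, n DESC, p.title
    """
    return conn.execute(query).fetchall()


def build_demo():
    """Create an in-memory database with invented placeholder records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO work VALUES (?, ?, ?, ?)", [
        (1, "Lykke-Per", "primary", None),
        (2, "De dødes Rige", "primary", None),
        (3, "Placeholder study A", "secondary", "Marxist literary analysis"),
        (4, "Placeholder study B", "secondary", "Marxist literary analysis"),
        (5, "Placeholder study C", "secondary", "philosophy of cognition"),
    ])
    conn.executemany("INSERT INTO citation VALUES (?, ?)", [
        (3, 1), (4, 1), (4, 2), (5, 2),
    ])
    return conn


if __name__ == "__main__":
    for area, title, n in citations_by_area(build_demo()):
        print(f"{area}: {title} cited {n} time(s)")
```

Keeping citations in their own table, rather than as a free-text field on each record, is what allows the grouping by research area; an enumerative bibliography alone would not support such a query.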

As a bonus, the database will also contain all of the works of Henrik Pontoppidan in plain text format with metadata on publication and editions, corresponding to the entries in the main database. This part of the database will have two key functions: one will be to support “Re-Investigating Henrik Pontoppidan” and other research on Pontoppidan’s works; the other will be to provide a curated dataset for researchers and students working within Digital Humanities, following in the footsteps of the MeMo project at the University of Copenhagen. Collecting and cleaning data for training purposes takes up a lot of time, and a curated and cleaned dataset of Danish literature from the time around the Modern Breakthrough could be a great help in training both computer models and students to deal computationally with a minor language like Danish. In this way the project also contributes to the national and international communities of Digital Literary Studies and to the development of language technology and tools. At the same time, the development of the database could be a starting point for the creation of a national standard for bibliographies of primary and secondary works on specific authors, e.g. H.C. Andersen and other canonical authors. The addition of more authors would also make it possible to detect patterns across publications on different authors, thus adding new perspectives to Danish literary history.



Projects supported and collaborative perspectives

This project is the basis for mapping the reception of Henrik Pontoppidan’s works, but a thorough registration and collection of all literature surrounding Pontoppidan will also support other research activities inside and outside the Pontoppidan Centre. Because the database and text corpus will be built in collaboration between the University Library and the Pontoppidan Centre, the project will draw on the strengths of both researchers and library specialists.

The second part of the database, containing all the works of Pontoppidan, is also a prerequisite for the “Pontoppidan’s Pastors” project (PI Gunder Hansen), submitted this fall to the Danish Research Council. Among other things, that project examines the appearances of pastors in Pontoppidan’s works using digital methods such as text mining and topic modelling.



Timeline:

The project will have the following work packages:

WP1: Registration and curation of the physical archive. Registration will take place in the EndNote reference tool (license paid by SDU). (Supervisor: Kamilla J. Husen; work carried out by student assistants.)

WP2: Digitisation and supplementation of the EndNote database. The Skjerbæk Archive records will be supplemented with records from bibliotek.dk, the Bibliography of Danish Literary History (data provided by the Royal Library) and the Henrik Pontoppidan website. Where possible, this includes digitising and uploading full-text versions. (Assessment of missing records: Nils Gunder Hansen; supervision of student assistants: Kamilla J. Husen; library database expertise provided by the University Library.)

WP3: Creation of web interface and mark-up of specific elements. The development of the web interface will be handled by an internal web developer (from the SDU IT section) in order to ensure the best possible usability for external access. The last task, to be carried out by a student assistant in close collaboration with Kamilla J. Husen, is the mark-up and extraction of the first set of specific data, namely the references of each secondary text. This makes it possible to start the first analysis of citation networks and of references to specific works of Pontoppidan.
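As a rough illustration of what the reference data extracted in WP3 could feed into, the sketch below counts how many secondary texts cite each primary work and how often two primary works are cited together (co-citation). The input shape (a mapping from a secondary text to the list of Pontoppidan works it cites), the function names and the sample entries are assumptions for illustration; the study identifiers are invented.

```python
from collections import Counter
from itertools import combinations


def cited_counts(references):
    """Count how many secondary texts cite each primary work."""
    counts = Counter()
    for cited in references.values():
        counts.update(set(cited))  # set(): count each work once per citing text
    return counts


def co_citations(references):
    """Count how often two primary works are cited by the same secondary text."""
    pairs = Counter()
    for cited in references.values():
        # Sorting gives every pair one canonical order, so (a, b) == (b, a).
        for a, b in combinations(sorted(set(cited)), 2):
            pairs[(a, b)] += 1
    return pairs


# Invented sample: keys stand for secondary texts, values for the works they cite.
SAMPLE = {
    "study_a": ["Lykke-Per", "De dødes Rige"],
    "study_b": ["Lykke-Per", "Det forjættede Land", "De dødes Rige"],
    "study_c": ["Lykke-Per"],
}

if __name__ == "__main__":
    print(cited_counts(SAMPLE).most_common())
    print(co_citations(SAMPLE).most_common())
```

Plain counters are enough for a first overview; a dedicated graph library could be substituted later if measures such as centrality are needed.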



Technical considerations

The use of EndNote as the registration tool will make the data accessible and reusable for other researchers, as the records can be exported in .ris format and afterwards transformed into a format of choice. Likewise, the web interface will make the .txt versions available in a form that external parties can easily download (taking any copyright restrictions into account), thus making the data accessible to anyone. Web access will be split into “access for all” (non-copyrighted material) and “restricted access” (copyrighted material).
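To show why a .ris export is easy to transform, here is a minimal sketch of a reader for the tagged RIS line format (a two-character tag, two spaces, a hyphen, then the value; a record starts with TY and ends with ER). The function name is an assumption, and the sketch deliberately ignores multi-line values; the sample record is hand-written from the bibliography cited in this description, not an actual EndNote export.

```python
import re

# RIS lines look like "TG  - value". The space after the hyphen is optional
# here so that an "ER  -" line with stripped trailing whitespace still matches.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")


def parse_ris(text):
    """Parse RIS text into a list of records, each mapping tag -> list of values."""
    records, current = [], None
    for raw in text.splitlines():
        match = TAG_LINE.match(raw)
        if not match:
            continue  # blank lines and continuation lines are skipped in this sketch
        tag, value = match.group(1), match.group(2).strip()
        if tag == "TY":            # start of a new record
            current = {"TY": [value]}
        elif tag == "ER":          # end of the current record
            if current is not None:
                records.append(current)
            current = None
        elif current is not None:  # repeated tags (e.g. AU) accumulate in order
            current.setdefault(tag, []).append(value)
    return records


# Hand-written sample record for illustration.
SAMPLE = """TY  - BOOK
AU  - Skjerbæk, E.
AU  - Skjerbæk, T.
AU  - Herring, R.
PY  - 2006
TI  - Henrik Pontoppidans forfatterskab: en bibliografi
PB  - Det Danske Sprog- og Litteraturselskab
ER  -
"""

if __name__ == "__main__":
    for record in parse_ris(SAMPLE):
        print(record["TI"][0], "by", "; ".join(record["AU"]))
```

Once records are in this dictionary form, they can be written out as CSV, JSON or database rows as needed.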





References

Kjældgaard, L. H. & Bjerring-Hansen, J. (2015). Infrastruktur i humaniora. In D. B. Pedersen, F. Stjernfelt & S. Køppe (eds.), Kampen om disciplinerne: Viden og videnskabelighed i humanistisk forskning. Copenhagen: Hans Reitzels Forlag.

Kondrup, J. (2011). Editionsfilologi. Copenhagen: Museum Tusculanum.

Skjerbæk, E., Skjerbæk, T. & Herring, R. (2006). Henrik Pontoppidans forfatterskab: en bibliografi. Copenhagen: Det Danske Sprog- og Litteraturselskab.
Short title: The Henrik Pontoppidan Text Corpora and Database
Acronym: HPTCD
Status: Active
Effective start/end date: 01/02/2022 - 31/12/2022
