About MiMoText

Mining and Modeling Text

Interdisciplinary applications, informational development, legal perspectives (MiMo Text)

The acquisition of knowledge from large amounts of text and data which can no longer be handled by individuals is becoming increasingly important due to the possibilities of digitisation. For the humanities, this means in particular that digital full texts and rich metadata must not only be available, but must also be available in a form that promotes knowledge in the humanities.

The aim of the MiMoText project is therefore to establish an information network for the humanities fed from various sources, which, by making it available as Linked Open Data, is not only freely available and can be linked to other knowledge resources of the Semantic Web, but also offers innovative and efficient access possibilities to scientific information.

MiMoTextBase was built as part of the project “Mining and Modeling Text” (2019-2023). It is implemented using a Wikibase infrastructure and integrates various data from heterogeneous sources. Note that the project is ongoing and the contents and structure of the MiMoTextBase will continually be further developed. The focus is on sources on the history of the French novel from 1751 to 1800, drawing on some existing full-text digital copies (e.g. from Gallica).

Tutorial and further information about the MiMoTextBase: https://docs.mimotext.uni-trier.de
SPARQL endpoint: https://query.mimotext.uni-trier.de
MiMoTextBase: https://data.mimotext.uni-trier.de

Bibliographic directories, specialist literature and primary texts serve as sources of information. From these, metadata, concrete text properties and descriptive or evaluative statements about relevant entities are extracted for example authors and works. For this purpose, quantitative methods for automatic text analysis as well as for the extraction and modelling of data from extensive text collections must be further and partly newly developed. After that the information is converted into a Linked Open Data format and can be linked to each other and to the outside world. From the start of the project, the legal framework will also be analysed in order to ensure that the knowledge network is set up and made available in accordance with copyright and data protection laws.

Recent slides about the project: https://dhtrier.quarto.pub/tbilisi-mmt/

Hinzmann, Maria, Matthias Bremm, Tinghui Duan, Anne Klee, Johanna Konstanciak, Julia Röttgermann, Christof Schöch, and Joëlle Weis. 2024. “Patterns in Modeling and Querying a Knowledge Graph for Literary History [Preprint].” Zenodo. https://doi.org/10.5281/zenodo.12080340.

Schöch, Christof, Maria Hinzmann, Röttgermann Julia, Anne Klee, und Katharina Dietz. „Smart Modelling for Digital Literary History“.IJHAC: International Journal of Humanities and Arts Computing [Special issue on Linked Open Data] 16, Nr. 1 (2022): 78–93. https://doi.org/10.3366/ijhac.2022.0278.

Project spokesperson: Prof Dr Christof Schöch (schoech@uni-trier.de)

Deputy speaker: Prof Dr Claudine Moulin (moulin@uni-trier.de)