Prototype of our MiMoText graph

14 March 2022

The aim of the project Mining and Modeling Text is to establish an information network for the humanities built from various sources: bibliographic metadata, primary sources and scholarly publications.


In our first prototype, we gather data on the French Enlightenment novel and store it as RDF (Resource Description Framework) triples in the free and open-source software Wikibase. These triples include statements on authors, titles, narrative forms, narrative locations, characters, themes and style, as well as metadata such as place of publication, date of first publication and number of pages per novel.


Here are some screenshots of the Wikibase instance, which will be published in mid-2022.



Figure I: MiMoText Wikibase instance


The SPARQL endpoint enables us to query our knowledge graph and to visualise the results in a multitude of ways (table, map, bar chart, scatter chart, bubble chart, timeline, tree map, etc.).
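In the standard Wikibase query service interface, the initial visualisation can be preselected with a `#defaultView:` comment on the first line of a query. Assuming the MiMoText endpoint uses that interface, a minimal sketch that counts items per thematic concept and opens as a bubble chart might look like this. The property `wdt:P25` is the theme property that appears in the publication-date query further below; treating every item with such a statement as a novel is an assumption.

```sparql
#defaultView:BubbleChart
# Sketch: number of items (assumed to be novels) per thematic concept.
# wdt: and rdfs: are assumed to be predefined by the endpoint.
SELECT ?topicLabel (COUNT(DISTINCT ?item) AS ?count)
WHERE {
  ?item wdt:P25 ?topic .                # item has a thematic concept
  ?topic rdfs:label ?topicLabel .
  FILTER(lang(?topicLabel) = "en")      # keep English labels only
}
GROUP BY ?topicLabel
ORDER BY DESC(?count)
```

The same result set can then be switched to another view (for example a table or bar chart) in the interface without changing the query itself.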



Figure II: Aboutness per novel, scatter plot



Figure III: Aboutness per novel, tree map


When querying the graph, we can combine properties: for example, we can restrict the results to novels with a certain narrative location ("imaginary place") and retrieve the themes of those novels.
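Such a combined query could be sketched as follows. This is an illustration only: `wdt:P25` is the theme property used in the publication-date query further below, whereas `wdt:P_NARRATIVE_LOCATION` is a placeholder, because the actual property ID for narrative location is not given here. The label "imaginary place" is taken from the figure caption.

```sparql
# Sketch: themes of novels whose narrative location is "imaginary place".
# ASSUMPTION: wdt:P_NARRATIVE_LOCATION is a placeholder, not a real property ID;
# replace it with the narrative-location property of the Wikibase instance.
# wdt: and rdfs: are assumed to be predefined by the endpoint.
SELECT ?item ?topicLabel
WHERE {
  ?item wdt:P_NARRATIVE_LOCATION ?place ;   # first property: narrative location
        wdt:P25 ?topic .                    # second property: thematic concept
  ?place rdfs:label ?placeLabel .
  ?topic rdfs:label ?topicLabel .
  FILTER(lang(?placeLabel) = "en")
  FILTER(lcase(?placeLabel) = "imaginary place"@en)
  FILTER(lang(?topicLabel) = "en")
}
ORDER BY ?item
```

Both triple patterns share the subject `?item`, so only novels that carry both statements are returned.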



Figure IV: MiMoText SPARQL-Endpoint



Figure V: Narrative location "imaginary place" and thematic concepts per novel


RDF statements on publication dates enable us to query for developments over time, for example the occurrence of the thematic concept "travel" in novels published between 1751 and 1800.


SELECT ?date ?countTravel ?countAll (?countTravel / ?countAll AS ?rel)
WHERE {
  {
    SELECT ?date (count(*) AS ?countTravel)
    WHERE {
      ?item wdt:P7 ?date;
            wdt:P25 ?topic .
      ?topic rdfs:label ?topicLabel .
      filter(lang(?topicLabel) = "en")
      filter(lcase(?topicLabel) = "travel"@en)
    }
    GROUP BY ?date
  }
  {
    SELECT ?date (count(*) AS ?countAll)
    WHERE {
      ?item wdt:P7 ?date;
            wdt:P25 ?topic .
      ?topic rdfs:label ?topicLabel .
      filter(lang(?topicLabel) = "en")
    }
    GROUP BY ?date
  }
}
ORDER BY desc(?rel)

Figure VI: Thematic concept "travel" per year