CITADEL

Our graph database is part of CITADEL, the PhD project of Aaron Pattee. CITADEL stands for "Computational Investigation of the Topographical and Architectural Designs in an Evolving Landscape". Besides a geo-spatial analysis of historical maps and an architectural analysis using 3D models of the remaining ruins of interest, Aaron wanted to run a network analysis on a graph database. The data come from charters dated 882 to 1589 A.D. that mention either one of the case studies or a member of the families that maintained the buildings of the case studies. The goal is to obtain additional information on the sites and the families who maintained them, to get a clearer picture of their power during the building phases of the respective sites (the sites were reworked many times after their initial construction). We hope this will let us support or challenge information gained through the other analyses. Looking at the current version of the database, we can already see connections that we did not see by just reading the charters: the sheer amount of data makes it virtually impossible to keep an overview without the help of a graphical network.

Example of a charter

The following charter is taken from the Regesta Imperii, the source of most of the charters we used.

Ludwig II. - RIplus Regg. Pfalzgrafen 1 n. 905
1273 October 11, Heidelberg
Reinhard v. Hoheneck certifies that, in return for 1000 marks of silver promised to him by the duke, he has promised to hand over the imperial regalia entrusted to him by the late King Richard to the king immediately upon the king's request. Should the latter die beforehand, he will return to the duke whatever part of the sum he may already have received from him and release the guarantors provided to him from their surety. If, however, he wishes to keep the money, the duke remains obliged to him for the remainder, in return for which he will then transfer to the duke the imperial regalia and everything that belongs to his stewardship (procurationem), namely the castle and town of Lautern and the castle of Trifels. Riezler, Forschungen 20,237 and W. Acta 1,592 ex or.

(Accessed on 09.07.2018)

Neo4j

Neo4j is a graph database written in Java. Graph databases are useful for us not only because they perform better than relational databases on highly connected data, but also because they make it very easy to explore the data visually. Neo4j itself offers an interface that gives developers and users an easy-to-use platform for viewing and altering the database.
Visual exploration through this interface (rather than writing very specific queries) also led us to connections we didn't expect and wouldn't have written queries for. However, it is important to note that this gives a very selective view of the collected data.

Further advantages of Neo4j (and graph databases in general) for our purpose are its focus on relations and its flexibility (the database model can be edited easily). It also has a query language called Cypher that was heavily inspired by SQL, as the following side-by-side comparison of the basic clauses shows.

Cypher            SQL
MATCH (n:NODE)    SELECT col
WHERE             FROM table
AND               WHERE
RETURN            AND

Database Model

[Figure: database schema]

Person

Persons found in the charters. They can be related to other persons, participate in events, and carry a few attributes such as firstMention (the year of the earliest charter in which we found them).

Charters

Charters are the documents we extracted our data from. We also split them up into events (see below), but they remain in the database for three reasons: they illustrate how the events connect, they are a quick way to refer to the source, and not all charters have been separated into events (see Non-FocusGroup below).

Appearance

We added Appearance as a meta node because it lets us store data about a person that is specific to a certain event. Most importantly, we can track how their Rank and AdminPosition change over time.
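
To make the role of the Appearance node more concrete, here is a minimal Python sketch of the idea. It is not the project's actual schema: the class and field names only mirror the node labels described here, and the sample person, years and ranks are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Event:
    event_type: str
    year: int


@dataclass(frozen=True)
class Appearance:
    """Links one person to one event and holds the event-specific data."""
    person: Person
    event: Event
    rank: Optional[str] = None
    admin_position: Optional[str] = None


def rank_history(person, appearances):
    """Return (year, rank) pairs for a person, ordered by event year."""
    own = [a for a in appearances if a.person == person and a.rank is not None]
    own.sort(key=lambda a: a.event.year)
    return [(a.event.year, a.rank) for a in own]


# Hypothetical sample data, not taken from the charters.
example_person = Person("Example Person")
appearances = [
    Appearance(example_person, Event("confirmation", 1260), rank="count"),
    Appearance(example_person, Event("transaction", 1240), rank="knight"),
]

print(rank_history(example_person, appearances))
```

Because the rank is stored on the Appearance rather than on the Person, the same person can hold different ranks in different events without any record being overwritten.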

Event

Charters of special interest (see FocusGroup below) are split up into events. We do this because charters often contain several actions. For example, there might be a trade and a confirmation within one charter.

Topic

Topic tells us what is being discussed. While events have a type (e.g. transaction) and sites refer to specific places, the topic allows for finer categorization. For example, two separate mentions of "fields around Lampertsmühle" in different charters don't necessarily refer to the same fields. So instead of having a site called "fields around Lampertsmühle", the site will be "Lampertsmühle" and the topic "fields". This also makes sorting easier.
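
The split between Site and Topic can be sketched in a few lines of Python. This is only an illustration under assumptions: the charter identifiers and the "mill" topic are hypothetical, and the real database stores these as nodes and relationships rather than tuples.

```python
from collections import defaultdict

# Hypothetical mentions as (charter id, site, topic); the ids are made up.
mentions = [
    ("charter-A", "Lampertsmühle", "fields"),
    ("charter-B", "Lampertsmühle", "fields"),
    ("charter-C", "Lampertsmühle", "mill"),
]


def topics_by_site(rows):
    """Group topics under their site, keeping the charters that mention them."""
    grouped = defaultdict(lambda: defaultdict(list))
    for charter, site, topic in rows:
        grouped[site][topic].append(charter)
    # Convert the nested defaultdicts to plain dicts for readable output.
    return {site: dict(topics) for site, topics in grouped.items()}


print(topics_by_site(mentions))
```

Both "fields" mentions end up under the same site and topic while still pointing to their own charters, so nothing claims that they describe the same piece of land.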

Site

Sites are points of interest in the charters. A site could be a forest close to an abbey. In many cases, such sites are part of a trade, and they can give us an overview of how much land certain groups or families owned in the region during a specific time period.

Rank / AdminPosition

At the time, ranks and administrative positions were very important. Aaron has designed a hierarchy to better understand each person's position in society. Both ranks and administrative positions can be secular or ecclesiastical. For example, emperor/empress and pope are the highest ranks, whereas the highest administrative positions include judges, marshals and notaries at the state/imperial level.

Location

Any city, castle or other place where a charter was issued.

Building

Our six case studies and seven other buildings nearby. They can take part in events through appearances, the same way persons do. That's because many charters record interactions with buildings rather than persons (e.g. when selling land).

FocusGroup and Non-FocusGroup

The charters Aaron selected contain over 1400 persons, some within our area of interest and some from as far away as Turin, Italy. However, our focus lies on a handful of families that lived in the case studies, and on the case studies themselves. Collecting the data needed to create Appearances in Events (including each person's specific role) for all of those persons and all possible Events in Charters would cause an amount of work that is out of scope for this project and would ultimately not affect our core interests much. That is why we created the "FocusGroup". The FocusGroup consists of our persons of interest and the "Building" nodes. Whenever someone from our FocusGroup is in a charter, we split it up into events and define their roles. Non-FocusGroup persons, on the other hand, only appear in events if FocusGroup persons are in there as well. Otherwise they are linked directly to charters without any definition of their role.
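
The rule described above can be sketched as a small Python function. This is a simplified reading that decides per charter rather than per event, and the function name `link_mode` and the sample names are hypothetical, not part of the project.

```python
def link_mode(charter_participants, focus_group):
    """Decide how the participants of one charter are linked.

    Returns "events" if at least one FocusGroup member takes part, meaning
    the charter is split into events and roles are defined. Otherwise
    returns "charter": participants are linked directly to the charter
    without a role.
    """
    if any(p in focus_group for p in charter_participants):
        return "events"
    return "charter"


# Hypothetical FocusGroup: persons of interest plus Building nodes.
focus_group = {"focus_person", "case_study_building"}

print(link_mode(["focus_person", "other_person"], focus_group))
print(link_mode(["other_person"], focus_group))
print(link_mode(["case_study_building", "other_person"], focus_group))
```

The first and third calls yield "events" because a FocusGroup member (a person or a building) is involved. The second yields "charter", so that person would only be linked to the charter itself.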

Advantages of a graph database

The goal of the project is not just systematic digitisation but to use computer science as a tool to better understand the available data (here: the charters). That's why graph databases are a huge advantage for us: we can go into the database, look at specific data, quickly see its relations, and keep exploring from there. In a relational database, we would have to write new queries to reach additional data. This is especially handy for users, who will most likely not have a background in computer science.

Below you can see an example of the visualisation within Neo4j:

Here you can find the slides of my presentation on the project.