Between May 27 and August 9, 2019, I worked as a full-time intern at TELOTA, BBAW.
The conference "Digital Humanities im deutschsprachigen Raum" (DHd 2019, see also the associated conference report) led to this internship, which fit well into the half-year break between finishing my bachelor's degree (B.A. Library Management) at the Potsdam University of Applied Sciences and starting my master's degree (M.A. Information Sciences) at Humboldt University in Berlin. Here I would like to describe my tasks and learning experiences during the internship.
The Berlin-Brandenburg Academy of Sciences and Humanities (BBAW)
"The Berlin-Brandenburg Academy of Sciences and Humanities is a learned society with a three-hundred-year-old tradition of uniting outstanding scholars and scientists across national and disciplinary boundaries. [...] As the largest non-university research institute for the humanities in the Berlin-Brandenburg region, it preserves and reveals the region’s cultural inheritance [...]."
TELOTA (The Electronic Life Of The Academy)
"is the digitalization initiative of the Berlin-Brandenburg Academy of Sciences and Humanities. Its main task is the development of tools that enable a digital creation, documentation and presentation of research findings by the academy." (own translation)
During my internship, I supported the long-term project on the Prussian Monarchy and the project correspSearch. In the first days, however, I was given the opportunity to prepare for the tasks ahead and to acquire completely new knowledge. Using the material and tips I was provided with, I refreshed my XML/XSLT skills. In addition, I learned some programming basics in an online Python course.
The long-term project “Anpassungsstrategien der späten mitteleuropäischen Monarchie am preußischen Beispiel 1786 bis 1918” ("Adaptation strategies of the late Central European monarchy using the Prussian example 1786 to 1918", own translation) examines
"the development of the monarchy using the Prussian example in the 19th century in a European and global perspective […]" (own translation)
One of my tasks was to convert existing organizational charts (which should serve as a description and visualization of Prussian court structures) from a static format to an XML/TEI format so that they can be integrated into the project's website and dynamically searched, expanded or integrated elsewhere. I also worked on an XSLT file, which automatically performs some of the transformation steps required for this.
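To give an idea of what such a conversion produces, here is a minimal sketch in Python (the project itself used XSLT for this step). It assumes the chart can be modelled with nested TEI `org` elements inside a `listOrg`, which is one possible encoding and not necessarily the one the project chose. The function `build_org` and all office names are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialize TEI as the default namespace instead of an "ns0:" prefix.
ET.register_namespace("", TEI_NS)


def build_org(name, children=()):
    """Build a TEI <org> element with an <orgName> and nested sub-units.

    `children` is a sequence of (name, children) pairs, so the function
    recurses through the hierarchy of the chart.
    """
    org = ET.Element(f"{{{TEI_NS}}}org")
    org_name = ET.SubElement(org, f"{{{TEI_NS}}}orgName")
    org_name.text = name
    for child_name, grandchildren in children:
        org.append(build_org(child_name, grandchildren))
    return org


# Hypothetical chart: placeholder names, not taken from the project data.
chart = (
    "Example court",
    [
        ("Example office A", [("Example sub-office", [])]),
        ("Example office B", []),
    ],
)

list_org = ET.Element(f"{{{TEI_NS}}}listOrg")
list_org.append(build_org(*chart))
xml_text = ET.tostring(list_org, encoding="unicode")
print(xml_text)
```

Because every unit becomes its own element, a website can search or expand individual offices, which is not possible with a static image of the chart.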
“The web service correspSearch is based on digital letter indexes that are available online and written in the Correspondence Metadata Interchange (CMI) Format.”
In a second internship project I dealt with the automated extraction of letter metadata from a printed edition. I learned more about correspSearch as a project and about its underlying data format, the Correspondence Metadata Interchange Format (CMIF), and I developed an understanding of the essential steps in automatically generating metadata records. After digitizing (scanning and OCRing) the tables of contents of some volumes of Georg Forster's works (a large edition of letters between Georg Forster and his family or other influential researchers and explorers of that time), I wrote Python scripts that transform the raw texts into corresponding CSV files and convert the dates of the listed letters into a standardized date format (ISO). Although the latter has already been implemented comprehensively (see the web service “Dates” of the Person Data Repository), this approach allowed me to understand the complexity of such applications and to write a small program with similar basic functionality myself. In addition, I used another Python script to transform the CSV files into the CMI format and OpenRefine to check the OCR results for mistakes. Here again, I recognized the relevance of structuring and simplifying your workflow.
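The date conversion and CSV step can be sketched roughly as follows. This is not the script from the internship but a minimal illustration, assuming the dates appear in a German form such as "12. März 1787" or "Mai 1790". The names `to_iso` and `rows_to_csv`, the column headers and the sample row are hypothetical.

```python
import csv
import io
import re

# Assumed month spellings; the printed edition may abbreviate or vary them.
MONTHS = {
    "januar": 1, "februar": 2, "märz": 3, "april": 4, "mai": 5, "juni": 6,
    "juli": 7, "august": 8, "september": 9, "oktober": 10, "november": 11,
    "dezember": 12,
}

# Optional day ("12."), a month name, and a four-digit year.
DATE_PATTERN = re.compile(r"(?:(\d{1,2})\.\s*)?([A-Za-zäÄ]+)\s+(\d{4})")


def to_iso(text):
    """Return an ISO 8601 date (YYYY-MM-DD or YYYY-MM), or None if unparseable."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    day, month_name, year = match.groups()
    month = MONTHS.get(month_name.lower())
    if month is None:
        return None
    if day is None:
        return f"{year}-{month:02d}"
    day_number = int(day)
    if not 1 <= day_number <= 31:
        return None
    return f"{year}-{month:02d}-{day_number:02d}"


def rows_to_csv(rows):
    """Write (correspondent, raw date) pairs to CSV text with an ISO date column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["correspondent", "date_raw", "date_iso"])
    for correspondent, date_raw in rows:
        # Unparseable dates stay empty so they can be checked by hand later.
        writer.writerow([correspondent, date_raw, to_iso(date_raw) or ""])
    return buffer.getvalue()


print(rows_to_csv([("Example correspondent", "12. März 1787")]))
```

Keeping the raw date next to the converted one makes it easier to spot OCR errors or unusual date forms during later checks, for example in OpenRefine.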
Learning success and evaluation
My overall learning success was very high. I was able to extend my theoretical knowledge of XML significantly through these practical projects. In addition, I gained a lot of new knowledge, for instance some programming basics, but also about more practical issues such as digital workflows, tools and methods. (Co-)working on real projects was one of the aspects I liked most about this internship. This way, you not only get concrete results but also experience a realistic work situation, including typical errors. Having to develop my own problem-solving strategies made the personal learning effect particularly high.
During my internship I always felt very welcome and (despite some initial uncertainties regarding my own programming skills) I was able to settle in quickly. It was very positive that I was asked about my interests even before the internship started and that these were considered in the subsequent planning of the internship as well. There was always time for discussing questions and problems. Different colleagues provided me with helpful tips as well as software and tool suggestions. I was able to participate in various meetings and events and felt very well integrated into the team. An internship plan gave me a rough structure and yet I was able to work flexibly and independently on the tasks I was given.
I can only recommend this internship at TELOTA, especially for students with a disciplinary background in the Digital Humanities, Computer and Information Sciences. If you are enthusiastic about topics related to the digital publication of humanities content and/or the development of research software, you have come to the right place. If this report has sparked your interest in such an internship, you are very welcome to contact me and I will gladly pass your inquiry on.