In Search of Europe’s Scientific Identity


Modern algorithms open up new possibilities for historians

In the past, scholars used to pore over dusty tomes. Today Dr. Matteo Valleriani, group leader at the Max Planck Institute for the History of Science as well as honorary professor at TU Berlin and fellow at the Berlin Institute for the Foundations of Learning and Data (BIFOLD), uses algorithms to group and analyze digitized data from historical works. The term used to describe this process is computational history. One of the goals of Valleriani’s research is to unlock the mechanisms involved in the homogenization of cosmological knowledge in the context of studies in the history of science.

Logo of the Sphaera Project: “Prudentia”. Fresco. Casa Minerbi e del Sale, Ferrara (Italy), 13th cent., detail. From Ragghianti, Carlo R. 1970. Gli affreschi di casa Minerbi a Ferrara. S. Miniato: Cassa di Risparmio di S. Miniato.

The project is co-financed by BIFOLD and researches the evolutionary path of the European scientific system as well as the establishment of a common scientific identity in Europe between the 13th and 17th centuries. Dr. Valleriani is working with fellow researchers from the Max Planck Institute for the Physics of Complex Systems to develop and implement empirical, multilayer networks to enable the analysis of huge quantities of data.

In Paris in the first half of the 13th century, Johannes de Sacrobosco compiled an elementary text on geocentric cosmology entitled Tractatus de sphaera. This manuscript is a simple, late medieval description of the geocentric cosmos based on a synthesis of Aristotelian and Ptolemaic worldviews.

A common illustration explaining how lunar eclipses occur. (Piccolomini 1568, 122)


“This compilation of the knowledge of its time is the result of an emerging intellectual interest in Europe. In the 13th century, a need arose for a knowledge of astronomy and cosmology on a qualitative and descriptive basis – parallel to and driven by the emergence of a network of new universities,” explains Valleriani. Over the following centuries, the Tractatus de sphaera was commented on, extended, and revised many times, and it remained a mandatory text at all European universities until the 17th century. Digitized copies of 359 printed textbooks containing modified forms of the Tractatus de sphaera, published between 1472 and 1650, are now available to researchers. During this period of about 180 years, some 30 new universities were founded in Europe.

The universal language of scholars at that time was Latin, which contributed significantly to the high mobility of knowledge even in this period. “An introductory course in astronomy was mandatory for students in Europe at that time,” explains Valleriani. “As a committed European, I am mainly interested in how this led to the emergence of a shared scientific knowledge in Europe.”

Taken together, these 359 books contain some 74,000 pages, far more text and images than any individual could examine and analyze. Working with machine learning experts from BIFOLD, the research team first had to clean, sort, and standardize this colossal corpus, drawn from a wide range of digital sources, to make it accessible to algorithms. The first step was to sort the data into texts, images, and tables. The texts were then broken down into recurring textual parts and organized according to a specific semantic taxonomy reflecting early modern modes of production of scientific knowledge.


Each of the more than 20,000 scientific illustrations had to be linked to the extensive metadata of the editions and their textual parts. In addition, more than 11,000 tables were identified in the Sphaera corpus. “To analyze the tables, we developed an algorithm to divide them into several groups with similar characteristics. This now allows us to use further analyses to compare these groups with each other,” explains Valleriani. This process may sound simple, but it in fact involves countless technical difficulties: “Developing suitable algorithms is made more difficult by four error sources. The books from this period contain many printer errors. This, together with the fact that the condition of the books varies greatly, makes them hard to digitize at times. Then there is the problem of the differing quality of the electronic copies. We also have to remember that at that time every printer used their own typeface, meaning that our algorithms have to be effectively trained for each printer to be able to even recognize the data.” In order to track how the original text was transformed across the 359 books from this 180-year period and to formalize this process of knowledge change, the researchers need to understand precisely how knowledge changed, ultimately becoming more and more homogeneous.

Instrument for calculating the length of the solar day depending on the latitude. Sacrobosco and Melanchthon 1543, B5-7; Bayerische Staatsbibliothek

“To achieve an understanding based upon data requires an intelligent synthesis of machine learning and the working practices of historians. The algorithms which we will now publish are the first capable of analyzing such data. We are also looking forward to developing further algorithms as part of our continuing cooperation with BIFOLD,” Valleriani explains.

Building and Interpreting Deep Similarity Models

Authors:
Oliver Eberle, Jochen Büttner, Florian Kräutli, Klaus-Robert Müller, Matteo Valleriani, Grégoire Montavon

Abstract:
Many learning algorithms such as kernel machines, nearest neighbors, clustering, or anomaly detection, are based on distances or similarities. Before similarities are used for training an actual machine learning model, we would like to verify that they are bound to meaningful patterns in the data. In this paper, we propose to make similarities interpretable by augmenting them with an explanation. We develop BiLRP, a scalable and theoretically founded method to systematically decompose the output of an already trained deep similarity model on pairs of input features. Our method can be expressed as a composition of LRP explanations, which were shown in previous works to scale to highly nonlinear models. Through an extensive set of experiments, we demonstrate that BiLRP robustly explains complex similarity models, e.g. built on VGG-16 deep neural network features. Additionally, we apply our method to an open problem in digital humanities: detailed assessment of similarity between historical documents such as astronomical tables. Here again, BiLRP provides insight and brings verifiability into a highly engineered and problem-specific similarity model.

Publication:
IEEE Transactions on Pattern Analysis and Machine Intelligence
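
As a rough illustration of what it means to decompose a similarity score onto pairs of input features, the following Python sketch covers only the simplest possible case: a linear feature map with a dot-product similarity. The matrix W, the two inputs, and all function names are assumptions made for this example. It is not the authors' implementation, which targets deep networks such as VGG-16.

```python
# Toy sketch (assumption): similarity(x, x') = <W x, W x'> with a linear map W.
# In this linear case the score splits exactly into one term per pair of
# input features (i, j), which is the kind of pairwise explanation described
# in the abstract above.

def feature_map(W, x):
    """Apply the linear feature map: one output value per row of W."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def dot_similarity(W, x, x_prime):
    """Dot product of the two feature vectors."""
    phi = feature_map(W, x)
    phi_prime = feature_map(W, x_prime)
    return sum(a * b for a, b in zip(phi, phi_prime))


def pairwise_relevance(W, x, x_prime):
    """Relevance R[i][j] assigned to input feature i of x and j of x_prime."""
    R = [[0.0] * len(x_prime) for _ in x]
    for row in W:  # one feature-map dimension at a time
        for i, xi in enumerate(x):
            for j, xj in enumerate(x_prime):
                # contribution of x[i] times contribution of x_prime[j]
                R[i][j] += (row[i] * xi) * (row[j] * xj)
    return R


# Hypothetical toy values, chosen only to keep the arithmetic readable.
W = [[1.0, 2.0], [0.0, -1.0]]
x = [1.0, 2.0]
x_prime = [3.0, 1.0]

R = pairwise_relevance(W, x, x_prime)
total = sum(sum(row) for row in R)
print(R)
print(total, dot_similarity(W, x, x_prime))
```

The entries of R add up to the similarity score itself, so each entry can be read as the share of the score attributable to one pair of input features. For nonlinear models this exact bookkeeping no longer holds automatically, which is where a method built on LRP explanations comes in.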

Evolution and transformation of early modern cosmological knowledge: a network study

Authors:
Maryam Zamani, Alejandro Tejedor, Malte Vogl, Florian Kräutli, Matteo Valleriani, Holger Kantz

Abstract:
We investigated the evolution and transformation of scientific knowledge in the early modern period, analyzing more than 350 different editions of textbooks used for teaching astronomy in European universities from the late fifteenth century to mid-seventeenth century. These historical sources constitute the Sphaera Corpus. By examining different semantic relations among individual parts of each edition on record, we built a multiplex network consisting of six layers, as well as the aggregated network built from the superposition of all the layers. The network analysis reveals the emergence of five different communities. The contribution of each layer in shaping the communities and the properties of each community are studied. The most influential books in the corpus are found by calculating the average age of all the out-going and in-coming links for each book. A small group of editions is identified as a transmitter of knowledge as they bridge past knowledge to the future through a long temporal interval. Our analysis, moreover, identifies the most impactful editions. These books introduce new knowledge that is then adopted by almost all the books published afterwards until the end of the whole period of study. The historical research on the content of the identified books, as an empirical test, finally corroborates the results of all our analyses.

Publication:
Scientific Reports – Nature
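
To make the terms in this abstract more concrete, here is a minimal Python sketch of a multiplex network, its aggregated network, and an average link age per edition. Everything specific in it is an assumption for illustration: the edition identifiers, the publication years, the two layer names, the direction of links (from the earlier to the later edition), and the definition of link age as the difference in publication years. The paper's actual six layers and measures are not spelled out in the abstract and may differ.

```python
from collections import Counter


def aggregate_layers(layers):
    """Superpose all layers; an edge's weight is the number of layers it is in."""
    weights = Counter()
    for edges in layers.values():
        for edge in set(edges):  # count each edge once per layer
            weights[edge] += 1
    return dict(weights)


def average_out_link_age(edges, year):
    """Mean gap in years between an edition and the editions it links to."""
    gaps = {}
    for source, target in edges:
        gaps.setdefault(source, []).append(year[target] - year[source])
    return {node: sum(g) / len(g) for node, g in gaps.items()}


# Hypothetical editions and publication years.
year = {"A": 1472, "B": 1500, "C": 1550}

# Hypothetical layers: each layer is a list of directed links between editions.
layers = {
    "text_reuse": [("A", "B"), ("A", "C")],
    "commentary": [("A", "B")],
}

aggregated = aggregate_layers(layers)
ages = average_out_link_age(aggregated.keys(), year)
print(aggregated)
print(ages)
```

Under these assumptions, an edition whose out-going links span a long time interval would be a candidate for the "transmitter of knowledge" role mentioned in the abstract.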

Further information is available from:

Dr. Matteo Valleriani
TU Berlin
Honorary professor at the Institute of History and Philosophy of Science, Technology, and Literature
Tel.: 0049 (0)30 22 667 128
Email: valleriani@mpiwg-berlin.mpg.de

BIFOLD Fellow Dr. Wojciech Samek heads newly established AI research department at Fraunhofer HHI


Dr. Samek (l.) and Prof. Müller in front of an XAI demonstrator at Fraunhofer HHI. (Copyright: TU Berlin/Christian Kielmann)

The Fraunhofer Heinrich Hertz Institute (HHI) has established a new research department dedicated to “Artificial Intelligence”. The AI expert and BIFOLD Fellow Dr. Wojciech Samek, who previously led the research group “Machine Learning” at Fraunhofer HHI, will head the new department. With this move, Fraunhofer HHI aims to expand the transfer of its AI research on topics such as Explainable AI and neural network compression to industry.

Dr. Wojciech Samek: “The mission of our newly founded department is to make today’s AI truly trustable and in all aspects practicable. To achieve this, we will very closely collaborate with BIFOLD in order to overcome the limitations of current deep learning models regarding explainability, reliability and efficiency.”

“Congratulations, I look forward to continued successful teamwork with BIFOLD fellow Wojciech Samek, who is a true AI hot shot.”

BIFOLD Director Prof. Dr. Klaus-Robert Müller

The new department further strengthens the already existing close connection between basic AI research at BIFOLD and applied research at Fraunhofer HHI and is a valuable addition to the dynamic AI ecosystem in Berlin.

“The large Berlin innovation network centered around BIFOLD is unique in Germany. This ensures that the latest research results will find their way into business, science and society.”

BIFOLD Director Prof. Dr. Volker Markl