Dissertation / PhD Thesis FZJ-2026-02802

Enabling the digital transformation in materials science and engineering: leveraging ontologies for knowledge representation, provenance, and text mining


2025
RWTH Aachen University

RWTH Aachen University, 1 online resource: illustrations [10.18154/RWTH-2025-07969] = Dissertation, RWTH Aachen, 2025



Abstract: The digital transformation of Materials Science and Engineering (MSE) is essential for accelerating the development of novel materials and enhancing understanding of the materials life cycle, encompassing raw resources, functional materials, engineered components, and beyond. This transformation entails integrating computational methods, data science, and artificial intelligence (AI) to advance the field of MSE. However, the heterogeneity of data formats, unstructured information, and the reproducibility crisis in MSE pose challenges to the effective management, reuse, and analysis of data. This thesis addresses these challenges by leveraging ontologies as the foundation for semantic data enrichment, facilitating knowledge representation, provenance documentation, and text mining. The first contribution of this work is the development of the Dislocation Ontology (DISO), an ontology that represents the domain knowledge of linear defects in crystalline materials. The development of DISO was driven by the objective of facilitating data interoperability with other MSE-related data. To that end, DISO was aligned with the Elementary Multiperspective Material Ontology (EMMO) and the Materials Design Ontology (MDO). This alignment allows dislocation simulation data to be represented efficiently. Moreover, we present a real-world use case in which discrete dislocation dynamics data are represented as a knowledge graph (DisLocKG) that makes the relationships within these data explicit. DisLocKG is also accessible through a SPARQL endpoint that we developed, which offers considerable flexibility for querying. Another contribution of this work is the PRovenance Information for MAterials Science (PRIMA) ontology, which was designed to document provenance information in MSE research, promoting data reliability, trustworthiness, and reproducibility. 
PRIMA was aligned with the Provenance Ontology (PROV-O) and the Platform Material Digital core ontology (PMDco) and was evaluated through use cases involving metallic biomaterial fabrication and microscopy data. Furthermore, this thesis presents a framework integrating Large Language Models (LLMs) and Semantic Web technologies to extract structured data from unstructured materials synthesis text. Free-text data is transformed into machine-readable formats such as JSON and further enriched semantically using an ontology. The overarching objective of this work is to demonstrate an interdisciplinary approach that integrates MSE knowledge, Semantic Web technologies, and Natural Language Processing (NLP) to facilitate the digital transformation of MSE. The developed ontologies and knowledge graphs are pivotal in data enrichment and interoperability, ensuring that materials data adhere to the FAIR (Findable, Accessible, Interoperable, Reusable) data principles. The LLM-based text mining framework provides a strategy for handling unstructured data in materials synthesis-related text, enabling the generation of linked data and knowledge extraction. Despite these advances, challenges persist in expanding DISO and DisLocKG to new use cases, refining data models, and improving long-context processing in LLMs. Future work will entail developing Application Programming Interfaces (APIs) for DisLocKG, extending PRIMA with computational modules, and exploring more efficient LLM architectures for a wider range of text-mining applications. We envision a future where Semantic Web technologies and AI converge to enable machines to extract, process, and understand scientific data, ultimately driving the digital transformation in MSE.

Keyword(s): university thesis (Hochschulschrift) ; digital transformation ; materials science and engineering ; ontology ; dislocation ; provenance ; text mining


Note: Dissertation, RWTH Aachen, 2025

Contributing Institute(s):
  1. Materials Data Science and Informatics (IAS-9)
Research Program(s):
  1. 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)


The record appears in these collections:
Document types > University theses > Doctoral theses
Institute collections > IAS > IAS-9
Workflow collections > Public records
Publications database

Record created 2026-06-22, last modified 2026-06-22


