Research data management in institutional repositories: an architectural approach using data lakehouses

Guardat en:
Dades bibliogràfiques
Publicat a:Digital Library Perspectives vol. 41, no. 1 (2025), p. 145-178
Autor principal: He, Zilong
Altres autors: Fang, Wei
Publicat:
Emerald Group Publishing Limited
Matèries:
Accés en línia:Citation/Abstract
Full Text
Full Text - PDF
Etiquetes: Afegir etiqueta
Sense etiquetes, Sigues el primer a etiquetar aquest registre!
Descripció
Resum:PurposeThis paper aims to address the pressing challenges in research data management within institutional repositories, focusing on the escalating volume, heterogeneity and multi-source nature of research data. The aim is to enhance the data services provided by institutional repositories and modernise their role in the research ecosystem.Design/methodology/approachThe authors analyse the evolution of data management architectures through literature review, emphasising the advantages of data lakehouses. Using the design science research methodology, the authors develop an end-to-end data lakehouse architecture tailored to the needs of institutional repositories. This design is refined through interviews with data management professionals, institutional repository administrators and researchers.FindingsThe authors present a comprehensive framework for data lakehouse architecture, comprising five fundamental layers: data collection, data storage, data processing, data management and data services. Each layer articulates the implementation steps, delineates the dependencies between them and identifies potential obstacles with corresponding mitigation strategies.Practical implicationsThe proposed data lakehouse architecture provides a practical and scalable solution for institutional repositories to manage research data. It offers a range of benefits, including enhanced data management capabilities, expanded data services, improved researcher experience and a modernised institutional repository ecosystem. The paper also identifies and addresses potential implementation obstacles and provides valuable guidance for institutions embarking on the adoption of this architecture. The implementation in a university library showcases how the architecture enhances data sharing among researchers and empowers institutional repository administrators with comprehensive oversight and control of the university’s research data landscape.Originality/valueThis paper enriches the theoretical knowledge and provides a comprehensive research framework and paradigm for scholars in research data management. It details a pioneering application of the data lakehouse architecture in an academic setting, highlighting its practical benefits and adaptability to meet the specific needs of institutional repositories.
ISSN:2059-5816
2059-5824
8756-5196
1065-075X
2054-1694
DOI:10.1108/DLP-02-2024-0022
Font:Library Science Database