LISA Pipeline Runner Dashboard
| Published in: | PQDT - Global (2025) |
|---|---|
| Main author: | |
| Published: | ProQuest Dissertations & Theses |
| Subjects: | |
| Online access: | Citation/Abstract Full Text - PDF |
| Abstract: | Effective resource management and workflow execution monitoring are essential for ensuring performance, stability, and minimal downtime in high-performance computing (HPC) environments. This dissertation presents the design and implementation of a modular, extensible monitoring dashboard tailored to the LISA Pipeline Runner infrastructure, which is responsible for executing complex scientific workflows within the context of the LISA mission. LISA (Laser Interferometer Space Antenna) is a space-based mission aimed at detecting gravitational waves generated by massive cosmic events such as merging black holes, supermassive black holes, and white dwarfs. To achieve this, a constellation of detectors will be deployed in space to capture these gravitational wave signals, which will then be transmitted back to Earth for analysis. This analysis will be carried out in dedicated Data Computing Centers (DCCs), each hosting a Pipeline Runner that orchestrates the data processing workflows. Given the computational demands and critical nature of the data, each Pipeline Runner must be supported by a dedicated monitoring dashboard to provide real-time visibility into system health and resource utilization during data analysis operations. A comparative analysis of existing open-source and research-developed technologies was conducted, focusing on their capabilities in collecting, processing, and visualizing telemetry data from multiple sources. Key metrics such as CPU usage, memory consumption, and workflow execution data were identified as critical for effective monitoring in the context of the LISA mission. Based on this evaluation, a monitoring solution was implemented using Prometheus for metrics collection, Grafana for visualization, and Thanos for long-term storage of time-series data. Loki was also integrated to aggregate and query workflow logs. The system was deployed within a Kubernetes cluster, tightly coupled with the developed Pipeline Runner, with special attention to monitoring Argo Workflow executions and the overall health of the cluster. The resulting dashboard offers a centralized, intuitive interface for operators and developers, supporting observability, traceability, and future scalability within the LISA mission. |
|---|---|
| ISBN: | 9798265425577 |
| Source: | ProQuest Dissertations & Theses Global |