CosmoHub: Interactive exploration and distribution of astronomical data on Hadoop

Bibliographic Details
Published in: arXiv.org (Mar 10, 2020), p. n/a
Main Author: Tallada, Pau
Other Authors: Carretero, Jorge, Casals, Jordi, Acosta-Silva, Carles, Serrano, Santiago, Caubet, Marc, Castander, Francisco J, César, Eduardo, Crocce, Martín, Delfino, Manuel, Eriksen, Martin, Fosalba, Pablo, Gaztañaga, Enrique, Merino, Gonzalo, Neissner, Christian, Tonello, Nadia
Published:
Cornell University Library, arXiv.org
Subjects:
Online Access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 2374911174
003 UK-CbPIL
022 |a 2331-8422 
035 |a 2374911174 
045 0 |b d20200310 
100 1 |a Tallada, Pau 
245 1 |a CosmoHub: Interactive exploration and distribution of astronomical data on Hadoop 
260 |b Cornell University Library, arXiv.org  |c Mar 10, 2020 
513 |a Working Paper 
520 3 |a We present CosmoHub (https://cosmohub.pic.es), a web application based on Hadoop to perform interactive exploration and distribution of massive cosmological datasets. Modern cosmology seeks to unveil the nature of both dark matter and dark energy by mapping the large-scale structure of the Universe through the analysis of massive amounts of astronomical data, whose volume has grown steadily over recent decades, and will keep growing, with the digitization and automation of experimental techniques. CosmoHub, hosted and developed at the Port d'Informació Científica (PIC), provides support to a worldwide community of scientists, without requiring the end user to know any Structured Query Language (SQL). It serves data from several large international collaborations such as the Euclid space mission, the Dark Energy Survey (DES), the Physics of the Accelerating Universe Survey (PAUS) and the Marenostrum Institut de Ciències de l'Espai (MICE) numerical simulations. While originally developed as a web frontend to a PostgreSQL relational database, this work describes the current version of CosmoHub, built on top of Apache Hive, which facilitates scalable reading, writing and management of huge datasets. As CosmoHub's datasets are seldom modified, Hive is a better fit. Over 60 TiB of catalogued information and \(50 \times 10^9\) astronomical objects can be interactively explored using an integrated visualization tool which includes 1D histogram and 2D heatmap plots. In our current implementation, online exploration of datasets of \(10^9\) objects can be done on a timescale of tens of seconds. Users can also download customized subsets of data in standard formats, generated in a few minutes. 
653 |a Cosmology 
653 |a Large scale structure of the universe 
653 |a Datasets 
653 |a Space missions 
653 |a Applications programs 
653 |a Exploration 
653 |a Histograms 
653 |a Mapping 
653 |a Dark energy 
653 |a Dark matter 
653 |a Structured Query Language-SQL 
653 |a Relational data bases 
653 |a Celestial bodies 
653 |a Computer simulation 
653 |a Sky surveys (astronomy) 
653 |a Query languages 
700 1 |a Carretero, Jorge 
700 1 |a Casals, Jordi 
700 1 |a Acosta-Silva, Carles 
700 1 |a Serrano, Santiago 
700 1 |a Caubet, Marc 
700 1 |a Castander, Francisco J 
700 1 |a César, Eduardo 
700 1 |a Crocce, Martín 
700 1 |a Delfino, Manuel 
700 1 |a Eriksen, Martin 
700 1 |a Fosalba, Pablo 
700 1 |a Gaztañaga, Enrique 
700 1 |a Merino, Gonzalo 
700 1 |a Neissner, Christian 
700 1 |a Tonello, Nadia 
773 0 |t arXiv.org  |g (Mar 10, 2020), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/2374911174/abstract/embedded/ZKJTFFSVAI7CB62C?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2003.03217