Computational storage: an efficient and scalable platform for big data and HPC applications

Bibliographic Details
Published in:	Journal of Big Data vol. 6, no. 1 (Nov 2019), p. 1
Main Author:	Torabzadehkashi, Mahdi
Other Authors:	Rezaei, Siavash; HeydariGorji, Ali; Bobarshad, Hosein; Alves, Vladimir; Bagherzadeh, Nader
Published:	Springer Nature B.V.
Subjects:	Data centers; Data management; Data processing; Storage systems; Application servers; Computer storage devices; Microprocessors; Electronic devices; Computational efficiency; Algorithms; Engines; Neon; Energy consumption; Servers; Benchmarks; Distributed processing; Big Data; High performance computing; Energy efficiency; Application; Prototypes; Storage; Motivation
Online Access:	Citation/Abstract; Full Text - PDF

MARC


LEADER	00000nab a2200000uu 4500
001	2315414097
003	UK-CbPIL
022			\|a 2196-1115
024	7		\|a 10.1186/s40537-019-0265-5 \|2 doi
035			\|a 2315414097
045	2		\|b d20191101 \|b d20191130
100	1		\|a Torabzadehkashi, Mahdi \|u University of California, Irvine (UCI), Irvine, USA; NGD Systems, Inc., Irvine, USA
245	1		\|a Computational storage: an efficient and scalable platform for big data and HPC applications
260			\|b Springer Nature B.V. \|c Nov 2019
513			\|a Journal Article
520	3		\|a In the era of big data applications, the demand for more sophisticated data centers and high-performance data processing mechanisms is increasing drastically. Data are originally stored in storage systems. To process data, application servers need to fetch them from storage devices, which imposes the cost of moving data to the system. This cost has a direct relation with the distance of processing engines from the data. This is the key motivation for the emergence of distributed processing platforms such as Hadoop, which move process closer to data. Computational storage devices (CSDs) push the “move process to data” paradigm to its ultimate boundaries by deploying embedded processing engines inside storage devices to process data. In this paper, we introduce Catalina, an efficient and flexible computational storage platform, that provides a seamless environment to process data in-place. Catalina is the first CSD equipped with a dedicated application processor running a full-fledged operating system that provides filesystem-level data access for the applications. Thus, a vast spectrum of applications can be ported for running on Catalina CSDs. Due to these unique features, to the best of our knowledge, Catalina CSD is the only in-storage processing platform that can be seamlessly deployed in clusters to run distributed applications such as Hadoop MapReduce and HPC applications in-place without any modifications on the underlying distributed processing framework. For the proof of concept, we build a fully functional Catalina prototype and a CSD-equipped platform using 16 Catalina CSDs to run Intel HiBench Hadoop and HPC benchmarks to investigate the benefits of deploying Catalina CSDs in the distributed processing environments. The experimental results show up to 2.2× improvement in performance and 4.3× reduction in energy consumption, respectively, for running Hadoop MapReduce benchmarks. Additionally, thanks to the Neon SIMD engines, the performance and energy efficiency of DFT algorithms are improved up to 5.4× and 8.9×, respectively.
653			\|a Data centers
653			\|a Data management
653			\|a Data processing
653			\|a Storage systems
653			\|a Application servers
653			\|a Computer storage devices
653			\|a Microprocessors
653			\|a Electronic devices
653			\|a Computational efficiency
653			\|a Algorithms
653			\|a Engines
653			\|a Neon
653			\|a Energy consumption
653			\|a Servers
653			\|a Benchmarks
653			\|a Distributed processing
653			\|a Big Data
653			\|a High performance computing
653			\|a Energy efficiency
653			\|a Application
653			\|a Prototypes
653			\|a Storage
653			\|a Motivation
700	1		\|a Rezaei, Siavash \|u University of California, Irvine (UCI), Irvine, USA; NGD Systems, Inc., Irvine, USA
700	1		\|a HeydariGorji, Ali \|u University of California, Irvine (UCI), Irvine, USA; NGD Systems, Inc., Irvine, USA
700	1		\|a Bobarshad, Hosein \|u NGD Systems, Inc., Irvine, USA
700	1		\|a Alves, Vladimir \|u NGD Systems, Inc., Irvine, USA
700	1		\|a Bagherzadeh, Nader \|u University of California, Irvine (UCI), Irvine, USA
773	0		\|t Journal of Big Data \|g vol. 6, no. 1 (Nov 2019), p. 1
786	0		\|d ProQuest \|t ABI/INFORM Global
856	4	1	\|3 Citation/Abstract \|u https://www.proquest.com/docview/2315414097/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch
856	4	0	\|3 Full Text - PDF \|u https://www.proquest.com/docview/2315414097/fulltextPDF/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch