Scaling out NUMA-Aware Applications with RDMA-Based Distributed Shared Memory
| Published in: | Journal of Computer Science and Technology vol. 34, no. 1 (Jan 2019), p. 94 |
|---|---|
| Main Author: | Yang, Hong |
| Other Authors: | Yang, Zheng; Yang, Fan; Bin-Yu, Zang; Hai-Bing Guan; Hai-Bo Chen |
| Published: | Springer Nature B.V. |
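| Subjects: | Data management; Synchronism; Clusters; Resource management; Distributed shared memory; Servers; Distributed memory; Redesign; Memory management; Nodes |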
| Online Access: | Citation/Abstract; Full Text - PDF |
MARC
| LEADER | 00000nab a2200000uu 4500 | ||
|---|---|---|---|
| 001 | 2171314603 | ||
| 003 | UK-CbPIL | ||
| 022 | |a 1000-9000 | ||
| 022 | |a 1860-4749 | ||
| 024 | 7 | |a 10.1007/s11390-019-1901-4 |2 doi | |
| 035 | |a 2171314603 | ||
| 045 | 2 | |b d20190101 |b d20190131 | |
| 084 | |a 137755 |2 nlm | ||
| 100 | 1 | |a Yang, Hong |u Shanghai Key Laboratory for Scalable Computing Systems, Shanghai Jiao Tong University, Shanghai, China | |
| 245 | 1 | |a Scaling out NUMA-Aware Applications with RDMA-Based Distributed Shared Memory | |
| 260 | |b Springer Nature B.V. |c Jan 2019 | ||
| 513 | |a Journal Article | ||
| 520 | 3 | |a The multicore evolution has stimulated renewed interests in scaling up applications on shared-memory multiprocessors, significantly improving the scalability of many applications. But the scalability is limited within a single node; therefore programmers still have to redesign applications to scale out over multiple nodes. This paper revisits the design and implementation of distributed shared memory (DSM) as a way to scale out applications optimized for non-uniform memory access (NUMA) architecture over a well-connected cluster. This paper presents MAGI, an efficient DSM system that provides a transparent shared address space with scalable performance on a cluster with fast network interfaces. MAGI is unique in that it presents a NUMA abstraction to fully harness the multicore resources in each node through hierarchical synchronization and memory management. MAGI also exploits the memory access patterns of big-data applications and leverages a set of optimizations for remote direct memory access (RDMA) to reduce the number of page faults and the cost of the coherence protocol. MAGI has been implemented as a user-space library with pthread-compatible interfaces and can run existing multithreaded applications with minimized modifications. We deployed MAGI over an 8-node RDMA-enabled cluster. Experimental evaluation shows that MAGI achieves up to 9.25x speedup compared with an unoptimized implementation, leading to a scalable performance for large-scale data-intensive applications. | |
| 653 | |a Data management | ||
| 653 | |a Synchronism | ||
| 653 | |a Clusters | ||
| 653 | |a Resource management | ||
| 653 | |a Distributed shared memory | ||
| 653 | |a Servers | ||
| 653 | |a Distributed memory | ||
| 653 | |a Redesign | ||
| 653 | |a Memory management | ||
| 653 | |a Nodes | ||
| 700 | 1 | |a Yang, Zheng |u Shanghai Key Laboratory for Scalable Computing Systems, Shanghai Jiao Tong University, Shanghai, China | |
| 700 | 1 | |a Yang, Fan |u Shanghai Key Laboratory for Scalable Computing Systems, Shanghai Jiao Tong University, Shanghai, China | |
| 700 | 1 | |a Bin-Yu, Zang |u Shanghai Key Laboratory for Scalable Computing Systems, Shanghai Jiao Tong University, Shanghai, China | |
| 700 | 1 | |a Hai-Bing Guan |u Shanghai Key Laboratory for Scalable Computing Systems, Shanghai Jiao Tong University, Shanghai, China | |
| 700 | 1 | |a Hai-Bo Chen |u Shanghai Key Laboratory for Scalable Computing Systems, Shanghai Jiao Tong University, Shanghai, China | |
| 773 | 0 | |t Journal of Computer Science and Technology |g vol. 34, no. 1 (Jan 2019), p. 94 | |
| 786 | 0 | |d ProQuest |t ABI/INFORM Global | |
| 856 | 4 | 1 | |3 Citation/Abstract |u https://www.proquest.com/docview/2171314603/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text - PDF |u https://www.proquest.com/docview/2171314603/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |