Balanced parallel triangle enumeration with an adaptive algorithm
| Published in: | Distributed and Parallel Databases vol. 42, no. 1 (Mar 2024), p. 103 |
|---|---|
| Published: | Springer Nature B.V. |
| Abstract: | Triangle enumeration is a building block for solving harder graph problems related to social networks, the Internet and transportation, to name a few applications. The problem is well studied in the theory literature, but remains open for big data. In this paper, we defend the idea of solving triangle enumeration with SQL queries that evaluate the steps of a new adaptive algorithm with linear speedup. Such an SQL approach provides scalability beyond RAM limits, automatic parallel processing and, more importantly, linear speedup as more machines are added. We present theoretical results and experimental validation showing that our solution works well with large graphs analyzed on a parallel cluster with many machines, producing a balanced workload even when vertex degrees are highly skewed. We consider two types of distributed systems: (1) a parallel DBMS that evaluates SQL queries, and (2) a parallel HPC cluster that uses the MPI library, called from Python. Extensive benchmark experiments with large graphs show that our SQL solution offers many advantages over MPI and competing graph analytic systems. |
|---|---|
| ISSN: | 0926-8782 1573-7578 |
| DOI: | 10.1007/s10619-023-07437-x |
| Source: | Advanced Technologies & Aerospace Database |