Investigating skew effects in shared-nothing parallel database systems

Bibliographic Details
Published in: ProQuest Dissertations and Theses (1993)
Main Author: Hu, Ron-Chung
Published:
ProQuest Dissertations & Theses
Subjects: Computer science
Online Access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 304049385
003 UK-CbPIL
020 |a 979-8-208-31641-2 
035 |a 304049385 
045 0 |b d19930101 
084 |a 66569  |2 nlm 
100 1 |a Hu, Ron-Chung 
245 1 |a Investigating skew effects in shared-nothing parallel database systems 
260 |b ProQuest Dissertations & Theses  |c 1993 
513 |a Dissertation/Thesis 
520 3 |a Larger databases and cheaper hardware have generated great interest in running database applications on parallel architectures. A database system based on multiple processors that share nothing (i.e., share neither main memory nor disks) is one way to provide the functionality of a conventional DBMS. To exploit parallelism, the shared-nothing parallel system horizontally partitions a data relation across all the processors. Proponents of this loosely coupled approach claim that such a parallel architecture can achieve high scalability and provide good cost-performance. However, the effectiveness of parallel execution on a shared-nothing system depends on our ability to divide the load equally among the nodes while minimizing the coordination overhead. In this dissertation, we investigate skew effects in parallel database systems; if improperly handled, these effects frequently cause load imbalance and impair system performance. We discuss the nature of skew effects and the reason why they cause performance problems. In order to take full advantage of parallel execution, we study three major performance-oriented topics: Query Optimization, Index Mechanism, and Parallel Join Operation. In each topic, we illustrate the flaws in existing methods, which are often straightforward generalizations of conventional database techniques, when they are applied to parallel database systems. In query optimization, we propose the two-level query optimization approach, in which query optimization functions are split into a system level and a node level. We suggest migrating to the node level all decisions that need to consider an individual node's data distribution. We show that this new approach is especially beneficial to large parallel systems, which are vulnerable to the presence of various skew effects. For the index mechanism, we present the unified index mechanism, which incorporates both local and distributive mechanisms in a single index. We perform simulation experiments to validate the effectiveness of this new index mechanism. We devise the two-threshold mechanism to maintain it efficiently. In parallel join operation, we introduce two modified parallel hash join algorithms using tuple duplication and partial duplication schemes, respectively. We identify the domains in which our algorithms can provide good performance. As we extend our knowledge of effective parallel execution, our research contributes an essential step toward achieving a high-performance database system. 
653 |a Computer science 
773 0 |t ProQuest Dissertations and Theses  |g (1993) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/304049385/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/304049385/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch