Designing an In-Memory Metadata Cache to Accelerate Object Storage Operations
| Published in: | ProQuest Dissertations and Theses (2025) |
|---|---|
| Main author: | |
| Published: | ProQuest Dissertations & Theses |
| Subjects: | |
| Online access: | Citation/Abstract Full Text - PDF |
| Abstract: | The Simple Storage Service (S3) protocol has become the de facto standard for large-scale data storage. With the widespread adoption of cloud services, the S3 protocol, initially developed by Amazon, has been adopted by all major vendors with a common set of base functionalities. S3 file operations are performed through a Representational State Transfer (REST) Application Programming Interface (API). This thesis presents the challenges of copying large amounts of data across S3 clusters (both on-premises and in the cloud) using native tools such as mc (MinIO Client), rclone, and s3cmd, and proposes the design of an in-memory metadata cache to accelerate S3 operations. The metadata cache first builds the current state of the bucket and persists the operations to disk using PostgreSQL, and then uses S3 bucket notifications to build an incremental view of the changes caused by file operations on the bucket. This eliminates the need to rescan the entire contents of the source S3 bucket to determine which files have changed, which is the current standard in replication tools such as rclone. The cache was developed in Go and tested on an 8-core Turing Pi System on Chip (SoC) module, and its impact on performance was measured. Performance evaluations show that metadata retrieval time drops to 6 minutes, compared to 4 hours with the standard method of listing objects, making this approach a practical enhancement for on-premises S3 storage solutions. |
|---|---|
| ISBN: | 9798265465979 |
| Source: | ProQuest Dissertations & Theses Global |
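
The two-phase approach described in the abstract (seed the cache from one full listing, then apply bucket notifications incrementally) can be illustrated with a minimal Go sketch. This is not the thesis implementation: the `MetaCache`, `ObjectMeta`, and `Event` types and all method names are hypothetical, the `Event` struct is a simplified stand-in for a real notification record, and the PostgreSQL persistence step is omitted. The sketch assumes event names of the form `s3:ObjectCreated:*` and `s3:ObjectRemoved:*`, with the `s3:` prefix treated as optional.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// ObjectMeta holds per-object fields a replication tool might compare (hypothetical).
type ObjectMeta struct {
	Size int64
	ETag string
}

// Event is a simplified, hypothetical stand-in for one bucket notification record.
type Event struct {
	Name string // e.g. "s3:ObjectCreated:Put" or "s3:ObjectRemoved:Delete"
	Key  string
	Meta ObjectMeta
}

// MetaCache keeps the current bucket view and the set of keys changed
// since the last call to DrainChanges.
type MetaCache struct {
	mu      sync.RWMutex
	objects map[string]ObjectMeta
	dirty   map[string]struct{}
}

func NewMetaCache() *MetaCache {
	return &MetaCache{
		objects: make(map[string]ObjectMeta),
		dirty:   make(map[string]struct{}),
	}
}

// Seed copies in the result of the initial full bucket listing.
func (c *MetaCache) Seed(listing map[string]ObjectMeta) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for k, v := range listing {
		c.objects[k] = v
	}
}

// Apply updates the cached view from a single notification event.
func (c *MetaCache) Apply(ev Event) error {
	name := strings.TrimPrefix(ev.Name, "s3:")
	c.mu.Lock()
	defer c.mu.Unlock()
	switch {
	case strings.HasPrefix(name, "ObjectCreated:"):
		c.objects[ev.Key] = ev.Meta
	case strings.HasPrefix(name, "ObjectRemoved:"):
		delete(c.objects, ev.Key)
	default:
		return fmt.Errorf("unhandled event %q", ev.Name)
	}
	c.dirty[ev.Key] = struct{}{}
	return nil
}

// Lookup returns the cached metadata for key without contacting the bucket.
func (c *MetaCache) Lookup(key string) (ObjectMeta, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	m, ok := c.objects[key]
	return m, ok
}

// DrainChanges returns the sorted keys touched since the previous call
// and resets the change set, so no rescan of the bucket is needed.
func (c *MetaCache) DrainChanges() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	keys := make([]string, 0, len(c.dirty))
	for k := range c.dirty {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	c.dirty = make(map[string]struct{})
	return keys
}

func main() {
	c := NewMetaCache()
	c.Seed(map[string]ObjectMeta{
		"a.txt": {Size: 10, ETag: "e1"},
		"b.txt": {Size: 20, ETag: "e2"},
	})
	events := []Event{
		{Name: "s3:ObjectCreated:Put", Key: "c.txt", Meta: ObjectMeta{Size: 30, ETag: "e3"}},
		{Name: "s3:ObjectRemoved:Delete", Key: "a.txt"},
	}
	for _, ev := range events {
		if err := c.Apply(ev); err != nil {
			fmt.Println("skipped:", err)
		}
	}
	fmt.Println("changed keys:", c.DrainChanges())
}
```

In this sketch a replication job would call `DrainChanges` to learn which keys to copy or delete, instead of listing the whole source bucket again. A real deployment would also need to handle missed or out-of-order notifications, which the sketch does not attempt.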