Scalable Methods for Monte Carlo Tree Search

Bibliographic Details
Published in:	ProQuest Dissertations and Theses (2025)
Main Author:	Naderzadeh Ardebili, Yashar
Published:	ProQuest Dissertations & Theses
Subjects:	Artificial intelligence; Industrial engineering; Computer science

Description
Abstract:	Monte-Carlo Tree Search (MCTS) is a versatile and adaptive heuristic tree-search algorithm, designed to uncover near-optimal actions by iteratively exploring decision-making points. Through its unique balance of exploration and exploitation, MCTS progressively constructs a search tree by gathering samples during each iteration, ultimately guiding the search towards regions of the decision space that yield higher rewards. This adaptability and efficiency have positioned MCTS as a leading technique within the realm of game theory, where it has demonstrated exceptional success in solving complex decision-making problems such as those found in Go, Chess, and other strategic games. In addition to its notable application in gaming, MCTS has proven its capability to address broader, more complex problems. Specifically, MCTS has shown significant promise when applied to NP-hard combinatorial optimization problems, which are central to many industrial and research applications. These include problems such as the Job-Shop Scheduling Problem (JSSP) and the Weighted Set-Cover Problem (WSCP), both of which require intelligent exploration of vast solution spaces to identify near-optimal solutions within reasonable timeframes.
Given the increasing demand for solving larger and more complex problems, MCTS has been extended for distributed-memory parallel platforms, a crucial step in enabling scalability across high-performance computing systems. However, the adoption of MCTS in distributed-memory environments introduces two primary challenges: (1) the considerable communication overhead that arises from coordinating parallel processes, and (2) the difficulty in maintaining an even computational load across all processes to avoid bottlenecks and inefficiencies. In this work, we introduce a novel distributed-memory parallel MCTS algorithm, termed Parallel Partial-Backpropagation MCTS (PPB-MCTS). The primary innovation of PPB-MCTS lies in its approach to minimizing communication overhead while enhancing performance in combinatorial optimization contexts. Our algorithm leverages a technique known as partial backpropagation, which reduces the frequency and size of data transmitted between processes by sending only essential backpropagation messages, rather than full state information. This minimizes the communication overhead without sacrificing the accuracy of the search.
To address the load-balancing challenge, we introduce a shared transposition table, enabling parallel processes to share information regarding explored states. This strategy not only ensures that computational work is distributed more evenly across processes, but also reduces redundant computations, thereby improving the overall efficiency of the algorithm.
Moreover, our approach addresses the issue of duplicate states in distributed-memory environments. As is common in sequential MCTS, duplicate states can cause the search tree to evolve into a Directed Acyclic Graph (DAG), complicating the search process. We adapt techniques from sequential MCTS to manage duplicate states effectively in a parallel context, thus preserving the integrity of the search structure and preventing unnecessary computational overhead.
We evaluate the effectiveness of PPB-MCTS through an extensive experimental study, focusing on the Job-Shop Scheduling Problem (JSSP) and the Weighted Set-Cover Problem (WSCP). Both problems are well known for their computational difficulty and are commonly used as benchmarks for evaluating combinatorial optimization algorithms. The experiments are conducted on a large cluster of computers, each equipped with multiple cores, allowing us to fully test the scalability and efficiency of our approach.
The empirical results demonstrate that our proposed algorithm significantly outperforms existing distributed-memory parallel MCTS algorithms, particularly in terms of scalability and load balancing. As the number of processes increases, PPB-MCTS maintains high rollout efficiency and improves the distribution of computational load, leading to faster convergence to high-quality solutions. This performance improvement makes PPB-MCTS a valuable tool for solving large-scale NP-hard problems in both research and industry.
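Illustrative note: the abstract mentions that duplicate states turn the search tree into a DAG and that a transposition table lets statistics for such states be shared. The following is a minimal single-process sketch of that general idea only. It is not the dissertation's PPB-MCTS algorithm, and it does not model partial backpropagation or distributed memory. The toy weighted set-cover instance, the reward normalization, the exploration constant, and all function names are assumptions made for illustration.

```python
import math
import random

# Hypothetical toy instance (not from the dissertation): cover the elements
# 0..4 using named subsets, each stored as (elements, cost).
UNIVERSE = frozenset(range(5))
SUBSETS = {
    "A": (frozenset({0, 1, 2}), 3.0),
    "B": (frozenset({2, 3}), 2.0),
    "C": (frozenset({3, 4}), 2.0),
    "D": (frozenset({0, 4}), 2.5),
    "E": (frozenset({1, 2, 3, 4}), 4.5),
}
TOTAL_COST = sum(c for _, c in SUBSETS.values())


def covered(state):
    """Elements covered by the subsets chosen in `state`."""
    out = set()
    for name in state:
        out |= SUBSETS[name][0]
    return out


def is_terminal(state):
    return covered(state) >= UNIVERSE


def actions(state):
    """Subsets that are not yet chosen and still add an uncovered element."""
    cov = covered(state)
    return sorted(n for n, (elems, _) in SUBSETS.items()
                  if n not in state and not elems <= cov)


def cost(state):
    return sum(SUBSETS[n][1] for n in state)


def rollout(state, rng):
    """Complete a partial cover by adding random useful subsets."""
    while not is_terminal(state):
        state = state | {rng.choice(actions(state))}
    return state


def mcts(iterations, c=1.4, seed=0):
    rng = random.Random(seed)
    # Transposition table: state -> [visits, total_reward]. A state is the
    # frozenset of chosen subsets, so {A, C} reached as A-then-C or C-then-A
    # maps to one shared entry instead of two separate tree nodes.
    table = {}
    root = frozenset()
    best_state, best_cost = None, math.inf

    def ucb(parent, child):
        visits, total = table[child]
        return total / visits + c * math.sqrt(math.log(table[parent][0]) / visits)

    for _ in range(iterations):
        state, path = root, [root]
        # Selection and expansion: descend while the state has statistics.
        while state in table and not is_terminal(state):
            children = [state | {a} for a in actions(state)]
            unvisited = [ch for ch in children if ch not in table]
            if unvisited:
                state = rng.choice(unvisited)
            else:
                parent = state
                state = max(children, key=lambda ch: ucb(parent, ch))
            path.append(state)
        # Simulation: finish the cover at random and score it.
        final = rollout(state, rng)
        final_cost = cost(final)
        if final_cost < best_cost:
            best_state, best_cost = final, final_cost
        reward = 1.0 - final_cost / TOTAL_COST  # assumed normalization to [0, 1]
        # Backpropagation along the traversed path only (a design choice here).
        for s in path:
            entry = table.setdefault(s, [0, 0.0])
            entry[0] += 1
            entry[1] += reward
    return best_state, best_cost, table
```

Calling `mcts(500)` returns the best cover found, its cost, and the table. On this toy instance the cheapest cover is the pair A and C, with cost 5.0. Updating only the states on the traversed path is one of several ways to handle shared statistics in a DAG; the dissertation's own treatment may differ.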
ISBN:	9798280720824
Source:	ProQuest Dissertations & Theses Global