On sampling error in genetic programming

Bibliographic details
Published in: Natural Computing vol. 21, no. 2 (Jun 2022), p. 173
Main author: Schweim, Dirk
Other authors: Wittenberg, David; Rothlauf, Franz
Published: Springer Nature B.V.
Subjects:
Online access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 2673710960
003 UK-CbPIL
022 |a 1567-7818 
022 |a 1572-9796 
024 7 |a 10.1007/s11047-020-09828-w  |2 doi 
035 |a 2673710960 
045 2 |b d20220601  |b d20220630 
084 |a 109034  |2 nlm 
100 1 |a Schweim, Dirk  |u Johannes Gutenberg University, Mainz, Germany (GRID:grid.5802.f) (ISNI:0000 0001 1941 7111) 
245 1 |a On sampling error in genetic programming 
260 |b Springer Nature B.V.  |c Jun 2022 
513 |a Journal Article 
520 3 |a The initial population in genetic programming (GP) should form a representative sample of all possible solutions (the search space). While large populations accurately approximate the distribution of possible solutions, small populations tend to incorporate a sampling error. This paper analyzes how the size of a GP population affects the sampling error and contributes to answering the question of how to size initial GP populations. First, we present a probabilistic model of the expected number of subtrees for GP populations initialized with full, grow, or ramped half-and-half. Second, based on our frequency model, we present a model that estimates the sampling error for a given GP population size. We validate our models empirically and show that, compared to smaller population sizes, our recommended population sizes largely reduce the sampling error of measured fitness values. Increasing the population sizes even more, however, does not considerably reduce the sampling error of fitness values. Last, we recommend population sizes for some widely used benchmark problem instances that result in a low sampling error. A low sampling error at initialization is necessary (but not sufficient) for a reliable search since lowering the sampling error means that the overall random variations in a random sample are reduced. Our results indicate that sampling error is a severe problem for GP, making large initial population sizes necessary to obtain a low sampling error. Our model allows practitioners of GP to determine a minimum initial population size so that the sampling error is lower than a threshold, given a confidence level. 
653 |a Population 
653 |a Probabilistic models 
653 |a Error analysis 
653 |a Genetic algorithms 
653 |a Confidence intervals 
653 |a Populations 
653 |a Statistical analysis 
653 |a Fitness 
653 |a Sampling 
653 |a Sampling error 
700 1 |a Wittenberg, David  |u Johannes Gutenberg University, Mainz, Germany (GRID:grid.5802.f) (ISNI:0000 0001 1941 7111) 
700 1 |a Rothlauf, Franz  |u Johannes Gutenberg University, Mainz, Germany (GRID:grid.5802.f) (ISNI:0000 0001 1941 7111) 
773 0 |t Natural Computing  |g vol. 21, no. 2 (Jun 2022), p. 173 
786 0 |d ProQuest  |t Science Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/2673710960/abstract/embedded/J7RWLIQ9I3C9JK51?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/2673710960/fulltextPDF/embedded/J7RWLIQ9I3C9JK51?source=fedsrch