Merging Distinct Sources Databases to Improve Software Estimation Models

Bibliographic details
Published in: Programming and Computer Software vol. 50, no. 8 (Dec 2024), p. 786
Published:
Springer Nature B.V.
Subjects:
Links: Citation/Abstract
Full Text
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3154524634
003 UK-CbPIL
022 |a 0361-7688 
022 |a 1608-3261 
024 7 |a 10.1134/S0361768824700762  |2 doi 
035 |a 3154524634 
045 2 |b d20241201  |b d20241231 
245 1 |a Merging Distinct Sources Databases to Improve Software Estimation Models 
260 |b Springer Nature B.V.  |c Dec 2024 
513 |a Journal Article 
520 3 |a Context. For more than six decades, software cost/effort estimation has been a relevant research topic because of its impact on industry. Although many estimation models exist, regression-based approaches have predominated in the literature. However, two problems have been observed in both industry and academia: the lack of datasets with a sufficient number of data points, and the arbitrary combination of different source databases belonging to practitioners in order to create larger datasets. Objective. Propose the application of the Kruskal–Wallis test to validate the integration of distinct source databases (independent groups), thereby avoiding the mixing of unrelated data, increasing the number of data points, and improving the estimation models. Method. We conducted a case study using real data from an international company, specifically data from its Mexico office. This office provides software development services for a technological tower identified as “Microservices and APIs.” The data were collected in 2020. Results. The quality criteria of the final estimation model improved. The MMRE was reduced by 25.4 percentage points (from 78.6% to 53.2%), the standard deviation was reduced by 97.2 percentage points (from 149.7% to 52.5%), and the Pred (25%) indicator increased by 3.2 percentage points. Additionally, the number of data points increased significantly, and the linear regression constraints were satisfied. The application of the Kruskal–Wallis test to validate the integration of distinct source databases (independent groups) proved useful in improving the estimation models.
653 |a Databases 
653 |a Software 
653 |a Datasets 
653 |a Random variables 
653 |a Hypotheses 
653 |a Integrated works software 
653 |a Software development 
653 |a Statistical methods 
653 |a Variance analysis 
653 |a Data points 
773 0 |t Programming and Computer Software  |g vol. 50, no. 8 (Dec 2024), p. 786 
786 0 |d ProQuest  |t Advanced Technologies & Aerospace Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3154524634/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text  |u https://www.proquest.com/docview/3154524634/fulltext/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3154524634/fulltextPDF/embedded/6A8EOT78XXH2IG52?source=fedsrch
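
Illustrative note on the abstract (field 520): the idea of using the Kruskal–Wallis test as a gate before pooling source databases can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not give the significance level, the variable tested, or the decision rule, so the function names (`kruskal_wallis`, `chi2_sf`, `can_merge`), the 0.05 threshold, and the sample values are all hypothetical. The test is written with the standard library only so the steps are visible; SciPy's `scipy.stats.kruskal` computes the same statistic.

```python
import math
from itertools import chain


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        terms = (half**i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * sum(terms)
    terms = (
        half ** (i - 0.5) / math.gamma(i + 0.5)
        for i in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * sum(terms)


def kruskal_wallis(*groups):
    """Return (H, p_value) for two or more independent samples."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)

    # Give tied values the mean of the ranks they span, and
    # accumulate the usual tie-correction term sum(t^3 - t).
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2
        tie_term += t**3 - t
        i = j

    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    correction = 1 - tie_term / (n**3 - n)
    if correction == 0:
        return 0.0, 1.0  # every value identical: no evidence of a difference
    h /= correction
    return h, chi2_sf(h, len(groups) - 1)


def can_merge(*groups, alpha=0.05):
    """Assumed rule: pool only if the test does not reject H0 at alpha."""
    _, p_value = kruskal_wallis(*groups)
    return p_value >= alpha


# Invented values standing in for one variable (e.g. effort) per source.
interleaved = ([1, 4, 7], [2, 5, 8], [3, 6, 9])
separated = ([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(can_merge(*interleaved))  # sources look alike under this rule
print(can_merge(*separated))    # sources differ; do not pool
```

Failing to reject the null hypothesis does not prove that the sources share a distribution, and with small groups the chi-square approximation for the p-value is rough, so a gate like this is best read as a guard against obviously incompatible data rather than a guarantee.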