PWFS: Probability-Weighted Feature Selection

Bibliographic Details
Published in: Electronics, vol. 14, no. 11 (2025), p. 2264
Main Author: Ayanoglu, Mehmet B.
Other Authors: Uysal, Ismail
Publisher: MDPI AG
Online Access: Citation/Abstract; Full Text + Graphics; Full Text - PDF
Description
Abstract: Feature selection has been a fundamental research area for both conventional and contemporary machine learning since the beginning of predictive analytics. From early statistical methods, such as principal component analysis, to more recent and data-driven approaches, such as deep unsupervised feature learning, selecting input features to achieve the best objective performance has been a critical component of any machine learning application. In this study, we propose a novel, easily replicable, and robust approach called probability-weighted feature selection (PWFS), which randomly selects a subset of features prior to each training–testing regimen and assigns probability weights to each feature based on an objective performance metric such as accuracy, mean-square error, or area under the receiver operating characteristic curve (AUC–ROC). Using the objective metric scores and weight-assignment techniques based on a golden-ratio-led iteration method, the features that yield higher performance become incrementally more likely to be selected in subsequent train–test regimens, whereas the opposite is true for features that yield lower performance. This probability-based search method has demonstrated significantly faster convergence to a near-optimal set of features compared to a purely random search within the feature space. We compare our method with an extensive list of twelve popular feature selection algorithms and demonstrate equal or better performance on a range of benchmark datasets. The specific approach to assigning weights to the features also allows for expanded applications in which two correlated features can be included in separate clusters of near-optimal feature sets for ensemble learning scenarios.
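
The selection loop described in the abstract can be illustrated with a short sketch. The following Python fragment is a minimal approximation of the idea only: the dataset, the logistic-regression scorer, the multiplicative weight update, and the specific use of the golden ratio as the update factor are assumptions made for illustration and are not taken from the paper's own PWFS formulation.

# Minimal sketch of a probability-weighted feature selection loop (illustrative only;
# the update rule and classifier below are assumptions, not the authors' exact method).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)      # stand-in benchmark dataset
n_features = X.shape[1]

PHI = (1 + 5 ** 0.5) / 2       # golden ratio, assumed here as the weight-update factor
weights = np.ones(n_features)  # unnormalized selection weights, uniform at the start
subset_size = max(1, n_features // 3)
best_score = 0.0               # running best AUC-ROC across trials

for trial in range(50):
    # Sample a feature subset with probability proportional to the current weights.
    probs = weights / weights.sum()
    chosen = rng.choice(n_features, size=subset_size, replace=False, p=probs)

    # One train-test regimen restricted to the sampled features.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X[:, chosen], y, test_size=0.3, random_state=trial)
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_tr, y_tr)
    score = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

    # Reward the selected features when the trial beats the running best score,
    # penalize them otherwise (multiplicative golden-ratio step, an assumption).
    weights[chosen] *= PHI if score > best_score else 1.0 / PHI
    best_score = max(best_score, score)

# Features with the largest accumulated weights form the candidate near-optimal set.
top_features = np.argsort(weights)[::-1][:subset_size]
print("highest-weight feature indices:", top_features)

Because features that appear in high-scoring trials accumulate larger weights, later trials are biased toward them, which is the behavior the abstract contrasts with a purely random search of the feature space.
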
ISSN: 2079-9292
DOI: 10.3390/electronics14112264
Source: Advanced Technologies & Aerospace Database