Adaptive Discretization Using Golden Section to Aid Outlier Detection for Software Development Effort Estimation

Software engineering researchers have worked along several dimensions to improve software effort estimates, including dataset quality improvement. In this research, we specifically investigated the effectiveness of outlier removal in improving the estimation performance of five machine learning (ML) methods (Support Vector Regression, Random Forest, Ridge Regression, K-Nearest Neighbor, and Gradient Boosting Machines) for software development effort estimation (SDEE). We propose a novel discretization method based on the Golden Section (dubbed Golden Section based Adaptive Discretization, GSAD) to identify the optimal number of outliers for an SDEE dataset.

The results signify the importance of removing an optimal number of outliers to improve estimates. Moreover, the results obtained after applying the GSAD technique were compared with IQR and Cook's distance based outlier identification methods over four datasets: ISBSG Release 2021, UCP, NASA93, and China. The empirical results confirm that the performance of ML-based SDEE methods generally improves when GSAD is employed, and that the proposed GSAD method can compete with other prevalent outlier identification methods.
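
For readers unfamiliar with the IQR baseline mentioned above, the following is a minimal sketch of IQR-based outlier identification using the conventional Tukey fences. It is not the GSAD method and not necessarily the exact configuration used in the study: the function name `iqr_outlier_mask`, the multiplier `k = 1.5`, the quartile interpolation method, and the sample effort values are all assumptions made for illustration.

```python
from statistics import quantiles


def iqr_outlier_mask(values, k=1.5):
    """Return a list of booleans, True where a value lies outside the Tukey fences.

    k = 1.5 is the conventional multiplier; it is an assumption here,
    not a setting taken from the study.
    """
    # Quartiles with linear interpolation over the observed range.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


# Hypothetical effort values (person-hours), for illustration only.
efforts = [120, 150, 160, 170, 180, 200, 210, 5000]
mask = iqr_outlier_mask(efforts)

# Keep only the projects that fall inside the fences.
cleaned = [v for v, is_outlier in zip(efforts, mask) if not is_outlier]
```

A fixed rule such as this flags however many points happen to fall outside the fences, whereas the abstract describes GSAD as a way to identify an optimal number of outliers for a given dataset.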
