|
Dataset Quality Assessment: An extension for analogy based effort estimation

Keywords: Analogy-Based Software Effort Estimation, Quality of Dataset, Attribute Subset Selection, Kendall Row-wise Correlation.

Abstract: Estimation by Analogy (EBA) is an increasingly active research method in the area of software engineering. The fundamental assumption of this method is that projects that are similar in terms of attribute values will also be similar in terms of effort values. It is well recognized that the quality of software datasets has a considerable impact on the reliability and accuracy of such a method. Therefore, if a software dataset does not satisfy the aforementioned assumption, it is of little use for the EBA method. This paper presents a new method based on Kendall's row-wise rank correlation that enables data quality evaluation and provides a data pre-processing stage for EBA. The proposed method provides a sound statistical basis and justification for the process of data quality evaluation. Unlike Analogy-X, our method has the ability to deal with categorical attributes individually, without the need for partitioning the dataset. Experimental results showed that the proposed method could form a useful extension for EBA, as it enables dataset quality evaluation, attribute selection, and identification of abnormal observations.
|
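The assumption stated in the abstract (projects close in attribute values are also close in effort values) can be illustrated with a small sketch of a row-wise Kendall correlation between two distance matrices. This is not the authors' implementation: the formula below is one common formulation of the row-wise statistic, and the function names (`distance_matrix`, `kendall_rowwise`), the absolute-difference distance, and the toy values are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def distance_matrix(values):
    """Pairwise absolute-difference distances for one numeric attribute.

    Assumption: a single numeric attribute; a categorical attribute would
    need a different distance (e.g. 0 if equal, 1 otherwise).
    """
    return [[abs(a - b) for b in values] for a in values]


def kendall_rowwise(x, y):
    """Row-wise Kendall correlation between two n-by-n matrices.

    For every row i, each pair of off-diagonal cells (j, k) is compared in
    both matrices; concordant pairs add +1 and discordant pairs add -1.
    Tied pairs contribute 0 and are excluded from the normalisation.
    """
    n = len(x)
    num = sx = sy = 0
    for i in range(n):
        others = [j for j in range(n) if j != i]  # skip the diagonal cell
        for j, k in combinations(others, 2):
            a = sign(x[i][j] - x[i][k])
            b = sign(y[i][j] - y[i][k])
            num += a * b
            sx += a * a
            sy += b * b
    if sx == 0 or sy == 0:
        return 0.0  # one matrix has no ordering information at all
    return num / sqrt(sx * sy)


# Made-up toy values, not data from the paper.
size = [10, 20, 40, 80]
effort = [100, 210, 390, 820]
tau = kendall_rowwise(distance_matrix(size), distance_matrix(effort))
print(tau)
```

A value near 1 would suggest that the attribute orders projects by similarity in much the same way effort does, which is the property EBA relies on; a value near 0 would suggest the attribute carries little information about effort similarity. A significance test for the statistic is outside the scope of this sketch.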