Selecting features in origin analysis

Pam Green, Peter C.R. Lane, Austen Rainer, Sven Bodo Scholz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

When applying a machine-learning approach to develop classifiers in a new domain, an important question is what measurements to take and how they will be used to construct informative features. This paper develops a novel set of machine-learning classifiers for the domain of classifying files taken from software projects; the target classifications are based on origin analysis. Our approach adapts the output of four copy-analysis tools, generating a number of different measurements. By combining the measures and the files on which they operate, a large set of features is generated in a semi-automatic manner. Standard attribute selection and classifier training techniques then yield a pool of high-quality classifiers (accuracy in the range of 90%), together with information on the most relevant features.
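
The feature-generation and selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the measure names, file roles, toy data, and the class-mean-gap score are all hypothetical stand-ins, since the abstract does not name the four tools, the measurements, or the attribute selection method used.

```python
from itertools import product
from statistics import mean

# Hypothetical placeholders: the abstract does not list the real measures or file roles.
MEASURES = ["measure_a", "measure_b"]
FILE_PAIRS = [("candidate", "original"), ("candidate", "previous_version")]


def feature_names(measures, file_pairs):
    """Cross every measure with every file pairing to name the candidate features."""
    return [f"{m}({a},{b})" for m, (a, b) in product(measures, file_pairs)]


def score_feature(values, labels):
    """Toy relevance score: absolute gap between the two class means.

    This is an assumed stand-in for a standard attribute selection criterion.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return abs(mean(pos) - mean(neg))


def select_top(rows, labels, names, k):
    """Return the k feature names with the largest relevance score."""
    scores = {
        name: score_feature([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(names, key=lambda n: scores[n], reverse=True)[:k]


# Invented toy data: one row per file, one column per generated feature.
NAMES = feature_names(MEASURES, FILE_PAIRS)
ROWS = [
    [0.9, 0.5, 0.2, 0.4],
    [0.8, 0.5, 0.3, 0.6],
    [0.1, 0.5, 0.2, 0.5],
    [0.2, 0.5, 0.3, 0.5],
]
LABELS = [1, 1, 0, 0]

print(select_top(ROWS, LABELS, NAMES, k=1))
```

Crossing measures with file pairings is what makes the feature set grow quickly, which is why a selection step is needed before classifier training.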

Original language: English
Title of host publication: Res. and Dev. in Intelligent Syst. XXVII
Subtitle of host publication: Incorporating Applications and Innovations in Intel. Sys. XVIII - AI 2010, 30th SGAI Int. Conf. on Innovative Techniques and Applications of Artificial Intel.
Pages: 379-392
Number of pages: 14
DOIs: https://doi.org/10.1007/978-0-85729-130-1-29
Publication status: Published - 01 Dec 2011
Externally published: Yes
Event: 30th SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence, AI 2010 - Cambridge, United Kingdom
Duration: 14 Dec 2010 to 16 Dec 2010

Conference

Conference: 30th SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence, AI 2010
Country: United Kingdom
City: Cambridge
Period: 14/12/2010 to 16/12/2010

ASJC Scopus subject areas

  • Artificial Intelligence
  • Information Systems

Cite this

Green, P., Lane, P. C. R., Rainer, A., & Scholz, S. B. (2011). Selecting features in origin analysis. In Res. and Dev. in Intelligent Syst. XXVII: Incorporating Applications and Innovations in Intel. Sys. XVIII - AI 2010, 30th SGAI Int. Conf. on Innovative Techniques and Applications of Artificial Intel. (pp. 379-392). https://doi.org/10.1007/978-0-85729-130-1-29