Sampling open source projects from portals: Some preliminary investigations

Austen Rainer*, Stephen Gale

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Citations (Scopus)

Abstract

In this paper, we provide a preliminary evaluation of the quality and quantity of data on 50 000 open source (OS) projects hosted at the SourceForge.net portal. Using several indicators of project activity, we identify one sample from the entire dataset: the 'most-broadly-active' OS projects. Projects that are active across all of our main indicators of activity account for less than 1% of the projects on the portal. Of the projects currently hosted on the SourceForge.net portal, 75% are not, and have never really been, active on the portal. Furthermore, whilst there has been a substantial increase in the number of projects being added to SourceForge.net over time, the number of added projects that then go on to become most-broadly-active projects seems to be decreasing over time. Finally, we recognise that care needs to be taken in defining samples, such as the most-broadly-active projects, as these definitions have implications for the conclusions that one makes and the generalisations that one should draw.

Original language: English
Title of host publication: Proceedings - 11th IEEE International Software Metrics Symposium, METRICS 2005
Pages: 241-250
Number of pages: 10
Volume: 2005
Publication status: Published - 01 Dec 2005
Event: 11th IEEE International Software Metrics Symposium, METRICS 2005 - Como, Italy
Duration: 19 Sep 2005 to 22 Sep 2005

Conference

Conference: 11th IEEE International Software Metrics Symposium, METRICS 2005
Country/Territory: Italy
City: Como
Period: 19/09/2005 to 22/09/2005

ASJC Scopus subject areas

  • Engineering(all)
