Citations, research topics and active countries in software engineering: A bibliometrics study

Vahid Garousi*, Mika V. Mäntylä

*Corresponding author for this work

Research output: Contribution to journalReview article

39 Citations (Scopus)
165 Downloads (Pure)


Context: An enormous number of papers (more than 70,000) have been published in the area of Software Engineering (SE) since its inception in 1968. To better characterize and understand this massive research literature, there is a need for comprehensive bibliometrics assessments in this vibrant field.Objective: The objective of this study is to utilize automated citation and topic analysis to characterize the software engineering research literature over the years. While a few bibliometrics studies have appeared in the field of SE, this article aims to be the most comprehensive bibliometrics assessments in this vibrant field.Method: To achieve the above objective, we report in this paper a bibliometrics study with data collected from Scopus database consisting of over 70,000 articles. For thematic analysis, we used topic modeling to automatically generate the most probable topic distributions given the data.Results: We found that number of papers published per year has grown tremendously and currently 6000-7000 papers are published every year. At the same time, nearly half of the papers are not cited at all. Using text mining of articles titles, we found that currently the hot research topics in software engineering are: (1) web services, (2) mobile and cloud computing, (3) industrial (case) studies, (4) source code and (5) test generation. Finally, we found that a small share of large countries produce the majority of the papers in SE while small European countries are proportionally the most active in the area of SE, based on the number of papers.Conclusion: Due to large volumes of research in SE, we suggest using the automated analysis of bibliometrics as we have done in this paper. By picking out the most cited papers, we can present the land marks of SE and, with thematic analysis, we can characterize the entire field. This can be useful for students and other new comers to SE and for presenting our achievements to other disciplines. In particular, we see and report the value of such an analysis in situations where performing a full scale SLR is not feasible due to restrictions on time or to lack of exact research questions.

Original languageEnglish
Pages (from-to)56-77
Number of pages22
JournalComputer Science Review
Publication statusPublished - 01 Feb 2016
Externally publishedYes


  • Bibliometrics
  • Citation analysis
  • Research literature
  • Software engineering
  • Thematic and topic analysis

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Fingerprint Dive into the research topics of 'Citations, research topics and active countries in software engineering: A bibliometrics study'. Together they form a unique fingerprint.

  • Cite this