Data

New use cases for Sosiefication

In the context of the DIVERSIFY project, we investigate the automatic generation of diverse program variants that are all functionally similar. This work builds on Tailored Source Code Transformations to Synthesize Computationally Diverse Program Variants.
In our most recent work, Automatic Software Diversity in the Light of Test Suites, we performed experiments on the source code and test suites of 6 popular Java libraries.

  • Java
  • Location: GitHub
  • Content: 6 large Java programs with high quality JUnit test suites.

Software monoculture in WordPress and JavaScript

In the context of the DIVERSIFY project, we have collected a large quantity of data about WordPress plugins and JavaScript libraries. The data show that, despite the huge diversity of available software, websites currently use only a small set of plugins and libraries, creating a monoculture in web applications.
Check data

Java sources and test suites for Sosiefication

  • Java
  • Location: diversify-project.eu
  • Content: 9 large Java programs with high quality JUnit test suites.
  • Source: We have chosen projects that are widely used (such as JUnit) and that aim for good quality through extensive testing (such as the Apache Commons libraries).

Download data

We have used this data set to synthesize sosie programs. We have synthesized thousands of sosies, and 100 sosies of JUnit are available here. The full results and methodology are published in our paper Tailored Source Code Transformations to Synthesize Computationally Diverse Program Variants, which was presented at ISSTA’14.

Java sources with high usage diversity

  • Java
  • Location: gforge.inria.fr
  • Content: 3 418 Jar files, which include 382 774 different types (classes or interfaces).
  • Source: We have collected all Jar files present on a machine used for performing software mining experiments for 7 years.

Download data

We have used this data set for our analysis of API usage diversity. The results are published in our paper “Empirical Evidence of Large-Scale Diversity in API Usage of Object-Oriented Software” (Diego Mendez, Benoit Baudry, Martin Monperrus), which was presented at the SCAM’13 conference.
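
For readers who want to check figures such as the number of distinct types against their own copy of the Jar files, the following is a minimal sketch in Python. It is not the tooling used for the paper. It assumes that a type corresponds to one `.class` entry in a Jar, that nested and anonymous classes count as separate types, and that `package-info` and `module-info` entries are ignored. The function names `type_names_in_jar` and `count_distinct_types` are hypothetical.

```python
import zipfile


def type_names_in_jar(source):
    """Return the fully qualified type names found in one Jar.

    `source` is a file path or a binary file-like object. A Jar is a
    ZIP archive, so the standard zipfile module can list its entries.
    """
    names = set()
    with zipfile.ZipFile(source) as jar:
        for entry in jar.namelist():
            if not entry.endswith(".class"):
                continue  # resources, manifests, directories
            stem = entry[: -len(".class")]
            # Assumption: descriptor classes are not counted as types.
            if stem.split("/")[-1] in ("package-info", "module-info"):
                continue
            names.add(stem.replace("/", "."))
    return names


def count_distinct_types(sources):
    """Count distinct type names across several Jars.

    A type that appears in more than one Jar is counted once.
    """
    distinct = set()
    for source in sources:
        distinct |= type_names_in_jar(source)
    return len(distinct)
```

A typical call would pass the paths of the downloaded Jar files, for example `count_distinct_types(pathlib.Path("jars").glob("*.jar"))`, where the `jars` directory is a placeholder. Different counting rules (for instance, excluding nested classes) would give a different total.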

Metamodels and well-formedness rules

  • Ecore, OCL
  • Location: REMODD model repository
  • Content:
    • 14 metamodels. Five of these metamodels include between 3 and 13 packages, each of which can be considered as an independent metamodel.
    • 1262 well-formedness rules
  • Source: We have collected this data set from the OMG, our industrial partners and an open call to the community (through the planetmde mailing list). The original data was in various formats, but we have made it homogeneous (only Ecore and OCL) in this data set.

Download data

We have used this data set to analyze the interactions between two formalisms (Ecore and OCL) for metamodeling. The results are available in a technical report: Ten years of Meta-Object Facility: an Analysis of Metamodeling Practices (Juan Cadavid, Benoit Combemale, Benoit Baudry).