License Usage and Changes: A Large-Scale Study on GitHub - EMSE Special Issue Online Appendix
This web page is a companion to our EMSE paper entitled "License Usage and Changes: A Large-Scale Study on GitHub". This work extends our prior ICPC'15 paper.
1. Data
GitHub Projects
The list of Java projects and their corresponding GitHub URLs is available as a CSV file.
Commit and Issue Tracker Data
Sampled Commit Notes
The Java commit notes are contained in the following archive: Java Commits. Each file in the archive corresponds to a commit note.
The commit notes below are formatted as: [project],[commit_hash],[commit_note]
The Java commit notes corresponding to commits with atomic license changes are contained in the following csv: Java License Changes Commits
The C commit notes are in the following csv: C Commits
The C++ commit notes are in the following csv: C++ Commits
The C# commit notes are in the following csv: C# Commits
The Javascript commit notes are in the following csv: Javascript Commits
The Python commit notes are in the following csv: Python Commits
The Ruby commit notes are in the following csv: Ruby Commits
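The [project],[commit_hash],[commit_note] layout described above can be read with a few lines of Python. This is a minimal sketch, not part of the released scripts. It assumes that each record sits on a single line, that fields are not quoted, and that only the first two commas act as separators, so that commas inside the commit note are kept. The function name and the sample record are hypothetical.

```python
def parse_commit_line(line):
    """Split one record into (project, commit_hash, commit_note).

    Assumption: fields are unquoted and only the first two commas are
    separators, so any commas inside the commit note are preserved.
    """
    project, commit_hash, commit_note = line.rstrip("\n").split(",", 2)
    return project, commit_hash, commit_note


# Hypothetical record, used only to illustrate the layout.
sample = "demo/project,abc123,Change license, update headers\n"
project, commit_hash, commit_note = parse_commit_line(sample)
print(project)
print(commit_hash)
print(commit_note)
```

If the CSV files turn out to quote their fields or to contain multi-line notes, Python's standard csv module would be a safer choice than a plain split.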
Sampled Issue Tracker URLs
The C issue tracker URLs are in the following csv: C Issues
The C++ issue tracker URLs are in the following csv: C++ Issues
The C# issue tracker URLs are in the following csv: C# Issues
The Java issue tracker URLs are in the following csv: Java Issues
The Javascript issue tracker URLs are in the following csv: Javascript Issues
The Python issue tracker URLs are in the following csv: Python Issues
The Ruby issue tracker URLs are in the following csv: Ruby Issues
2. Scripts
Analysis Scripts
RQ1: The following script and text file were used to generate our license usage data: License Usage Script and License List (this list was extracted from our dataset)
Usage: >./lic_usage.sh [ProjectsList] [LicenseList] [PathToProjectResultsFolder]
Note: The script requires bash 4, so it cannot be run using "sh".
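Before running the license usage script, it can help to confirm which bash version is installed. The following is a minimal sketch, not part of the released scripts. It assumes the usual first line printed by `bash --version` (for example "GNU bash, version 5.1.16(1)-release") and treats any major version of 4 or higher as sufficient, which is an assumption beyond the note above. All function names are hypothetical.

```python
import re
import subprocess


def bash_major_version(version_output):
    """Extract the major version from the text printed by `bash --version`.

    Returns None when no version number can be found.
    """
    match = re.search(r"version (\d+)\.", version_output)
    return int(match.group(1)) if match else None


def meets_bash_requirement(version_output, minimum=4):
    """Return True when the reported major version is at least `minimum`."""
    major = bash_major_version(version_output)
    return major is not None and major >= minimum


def installed_bash_version_output():
    """Run `bash --version` and return its standard output as text."""
    result = subprocess.run(
        ["bash", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

A typical check would be `meets_bash_requirement(installed_bash_version_output())`, followed by invoking the script with bash explicitly rather than with "sh".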
RQ2: The following script was used to generate the atomic license changes: Generate Sequences
Usage: >python AtomicLicenseChanges.py [MARKOS_result_file] [output_file]
3. Results
MARKOS License Analyzer Results
The raw MARKOS License Analyzer results for our Java dataset can be found in the following archive: MARKOS Results (~4 GB).
Authors
- Christopher Vendome - The College of William and Mary, VA, USA.
E-mail: cvendome at cs dot wm dot edu
- Gabriele Bavota - Free University of Bolzano, Italy.
E-mail: gabriele.bavota at unibz dot it
- Massimiliano Di Penta - University of Sannio, Benevento, Italy.
E-mail: dipenta at unisannio dot it
- Mario Linares-Vásquez - The College of William and Mary, VA, USA.
E-mail: mlinarev at cs dot wm dot edu
- Daniel German - University of Victoria.
E-mail: dmg at cs dot uvic dot ca
- Denys Poshyvanyk - The College of William and Mary.
E-mail: denys at cs dot wm dot edu