This is the second part of a mini-series about our recently published paper “The Role of Open-Source Software in the Energy Sector”. In the first part, we discussed the general implications and properties of open-source software. In this part, we recap the findings of our analysis of open-source software in the energy sector.
We created a collection of all the open-source software we could find that relates to this sector. A few smaller collections already existed, but to our knowledge ours is the largest in this domain, with a total of 388 projects ranging from small hobbyist ones to big foundation-based platforms. If you are interested, this catalog and the other data relevant to this analysis can be found at: https://git.rwth-aachen.de/acs/public/publications/2023-oss-in-energy-data
The first questions we wanted to answer were: “Is open source really a commercial thing, or is it just done by universities?” and “Are there synergies between commercial and non-commercial actors?” To answer these questions, we needed to classify the projects into categories. We chose “Academia” for universities and non-commercial research institutions, “Commercial” for profit-oriented companies, and “Non-Profit” for non-profit organizations. It was not possible to clearly distinguish private contributions, so these fall into the “Other” category. Because we wanted to investigate synergies, classifying whole software projects would not have been enough; instead, we manually categorized the individual contributors to each project, identified by e-mail address. A total of 754 e-mail addresses and domain classifications were needed to classify a significant number of the contributors in almost all projects.
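To make the contributor classification more concrete, here is a minimal sketch of how e-mail addresses could be mapped to the four categories. It is an illustration, not the tooling used for the paper: the mapping `DOMAIN_CATEGORIES`, its example domains, and the function names are hypothetical, and the real classification was curated by hand.

```python
from collections import Counter

# Hypothetical, illustrative mapping. The actual analysis relied on 754
# manually classified addresses and domains, which are not reproduced here.
DOMAIN_CATEGORIES = {
    "rwth-aachen.de": "Academia",
    "example-utility.com": "Commercial",
    "example-foundation.org": "Non-Profit",
}


def classify_email(email: str) -> str:
    """Return the category for a contributor e-mail, or "Other" if unknown."""
    domain = email.rsplit("@", 1)[-1].lower()
    parts = domain.split(".")
    # Walk up the domain hierarchy so that a subdomain such as
    # "institute.rwth-aachen.de" still matches "rwth-aachen.de".
    for i in range(len(parts) - 1):
        candidate = ".".join(parts[i:])
        if candidate in DOMAIN_CATEGORIES:
            return DOMAIN_CATEGORIES[candidate]
    # Private or unknown addresses cannot be attributed reliably.
    return "Other"


def classify_contributors(emails):
    """Count how many contributors of one project fall into each category."""
    return Counter(classify_email(e) for e in emails)
```

For example, `classify_contributors(["a@institute.rwth-aachen.de", "b@gmail.com"])` counts one “Academia” and one “Other” contributor. Matching on the domain suffix keeps the mapping short, at the cost of misclassifying people who commit with a private address.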
The following table presents the results of this classification:
| Category | Nr. Projects | Percentage | Mixed Projects |
|---|---|---|---|
| Unclassified & Private | 40 | – | – |
Our analysis showed that the majority of the open-source software projects were indeed driven by academia. This was somewhat expected, as open-sourcing code is often seen as part of the dissemination and publication of research work. However, approximately 18% of the projects were, to a significant degree, commercially driven, which disproves the thesis that open source cannot be combined with commercial interests. With about 8% of the projects having contributors from different categories, synergies apparently do exist, but the vast majority of projects still keep development within their respective category. The share of mixed projects was significantly higher for non-profit and commercial projects than for academia.
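As a rough sketch of how such shares can be derived from per-contributor categories, the snippet below assigns each project a driving category and flags it as mixed. The exact criteria used in the paper are not spelled out in this post, so the rules here are assumptions: the most frequent category is taken as the driver, “Other” is ignored, and a project counts as mixed when more than one remaining category is present. The function names are hypothetical.

```python
from collections import Counter


def summarize_project(contributor_categories, ignore=("Other",)):
    """Return (driving_category, is_mixed) for one project.

    Assumption: the most frequent category drives the project, and the
    project is "mixed" if more than one non-ignored category contributes.
    """
    counts = Counter(c for c in contributor_categories if c not in ignore)
    if not counts:
        # No classifiable contributors at all.
        return None, False
    driving = counts.most_common(1)[0][0]
    return driving, len(counts) > 1


def mixed_share(projects):
    """Fraction of classifiable projects with contributors from several categories."""
    summaries = [summarize_project(cats) for cats in projects.values()]
    classified = [s for s in summaries if s[0] is not None]
    if not classified:
        return 0.0
    return sum(1 for _, mixed in classified if mixed) / len(classified)
```

A project with contributors `["Commercial", "Academia", "Commercial"]` would be summarized as commercially driven and mixed under these rules. A stricter definition, for instance requiring a minimum share of commits per category, would yield different percentages.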
As open-source software is largely defined by the license for the project, it is interesting to see which specific licenses are used. The following figure shows the situation for our catalog:
Permissive licenses dominate the field. It is notable that, although many licenses exist, most projects use the recommended subset of licenses, which makes it easier to (re)use the project code.
If you ask yourself, “Which language should I learn if I want to contribute to the open-source landscape in the energy domain?”, there is a clear answer: Python. The following graph depicts the programming-language distribution in our dataset, and you can clearly see the dominance of this language.
Our analysis of 388 energy-sector projects highlights the growing trend in open-source software. Most projects follow good licensing practices, and Python is the dominant programming language. However, since these projects are mainly driven by academia, more collaboration between industry and academia is needed. To bridge the gap, we recommend reinforcing open-source incentives and promoting industry engagement through success stories. Please refer to the paper if you are interested in the details and in further conclusions and explanations.