Client to generate GrimoireLab metrics for Project Health using the software analytics platform GrimoireLab.
Instances of GrimoireLab 2.x and OpenSearch must be running before launching this tool. Please check the GrimoireLab documentation in order to deploy the platform.
To get this tool running, we also recommend using poetry.
This package manager will install the tool and all its dependencies.
You can install poetry
by following this guide.
Once you have poetry
running, move to the next section.
-
Clone the repository:
git clone [email protected]:Bitergia/grimoirelab-metrics.git cd grimoirelab_metrics
-
Install dependencies and tool:
poetry update poetry install
This will install the tool inside of a virtual environment managed by poetry. To use the tool you will have to activate it first with the command
eval $(poetry env activate)
(poetry >= 2.x) orpoetry shell
(poetry < 2.x).For development mode, install the tool with:
poetry install --with dev
Given a SPDX SBOM file with git repositories as input, this tool will generate a set of Project Health metrics. These metrics are calculated using the data stored on GrimoireLab about those repositories. If any of the repositories is not available on GrimoireLab, the tool will add it to GrimoireLab to have it analyzed.
grimoirelab-metrics spdx.xml \
--grimoirelab-url http://localhost:8000 \
--grimoirelab-user user --grimoirelab-password password \
--opensearch-url https://127.0.0.1:9200 \
--opensearch-index events \
--opensearch-user 'admin' --opensearch-password 'admin' \
--verify-certs --opensearch-ca-certs /path/to/ca.pem \
--from-date 2024-01-01 --to-date 2025-01-01 \
--repository-timeout 3600 \
--code-file-pattern "\.py$|\.js$" \
--binary-file-pattern "\.exe$|\.tar$" \
--pony-threshold 0.5 \
--elephant-threshold 0.5 \
--dev-categories-thresholds 0.8 0.95 \
--output metrics.json
The parameters needed to run the tool are:
- A valid SPDX file
- GrimoireLab instance address
- OpenSearch instance address
- OpenSearch index name, where GrimoireLab events data are stored
- Output filename, where metrics will be written.
This is an example of a valid SPDX file:
<?xml version="1.0" encoding="utf-8"?>
<Document>
<SPDXID>SPDXRef-DOCUMENT</SPDXID>
<creationInfo>
<created>2025-02-07T00:00:01Z</created>
<creators>Organization: Bitergia</creators>
</creationInfo>
<dataLicense>CC0-1.0</dataLicense>
<name>GrimoireLab</name>
<spdxVersion>SPDX-2.3</spdxVersion>
<documentNamespace>mynamespace</documentNamespace>
<packages>
<SPDXID>SPDXRef-bootstrap-gnu-config.bst-0</SPDXID>
<comment>Product: gnu-config</comment>
<downloadLocation>https://github.com/chaoss/grimoirelab-perceval.git</downloadLocation>
<filesAnalyzed>false</filesAnalyzed>
<name>bootstrap/gnu-config.bst</name>
<sourceInfo>git</sourceInfo>
</packages>
<packages>
<SPDXID>SPDXRef-bootstrap-gnu-config.bst-0</SPDXID>
<comment>Product: gnu-config</comment>
<downloadLocation>https://github.com/chaoss/grimoirelab-core.git</downloadLocation>
<filesAnalyzed>false</filesAnalyzed>
<name>bootstrap/gnu-config.bst</name>
<sourceInfo>git</sourceInfo>
</packages>
</Document>
This is the list of the metrics generated by this tool:
The metric is defined as the number of individuals, who produce up to the first 50% of the total number of code contributions (in descending order) within a given time period.
A low Pony Factor implies a high dependency on these individuals, making the project vulnerable if they were to leave.
Also known as: Lottery Factor, Bus Factor, Contributor Absence Factor.
The metric is defined as the number of unique organizations producing up to the first 50% of the total number of code contributions (in descending order) within a given time period.
It was first defined by Bitergia, and it applies the concept of the Pony Factor metric and takes it to contributing Organizations.
Contributions are focused on Git commits, and the organization is determined by the email address of the commit author.
This metric quantifies the total number of distinct organizations whose members have made contributions to an open source project over a specified period.
Contributions are focused on Git commits, and the organization is determined by the email address of the commit author.
This metric quantifies the number of unique organizations whose members have actively made contributions to an open source project within the last 90 days.
Contributions are focused on Git commits, and the organization is determined by the email address of the commit author.
This metric quantifies the total count of unique individuals who contributed within the last 90 days.
This metric counts the total number of commits made to the project within the last 90 days.
This metric measures the growth rate of active contributors, defined as the number of people sending one or more code contributions (in this case, Git commits) in a given period.
To calculate it, the period is split into two halves, and the number of active contributors in each half is compared. The growth rate is the difference between the second half and the first half, divided by the number of active contributors in the first half.
This metric measures the growth of active contributors, defined as the number of people sending one or more code contributions (in this case, Git commits) in a given period.
To calculate it, the period is split into two halves, and the number of active contributors in each half is compared. Growth is the difference between the second half and the first half.
This metric refers to the count of branches within a project's version control repository (for this case, Git) that have seen recent development activity, usually indicated by new commits.
This metric shows the number of days since the last commit was submitted to the repository or the project.
This metric verifies the existence of a file containing the full text of the project's adopters.
Standard practice dictates that this file is named ADOPTERS
, ADOPTERS.md
,
or ADOPTERS.txt
, and is located in the root directory of the project's
source code repository.
This metric verifies the existence of a file containing the full text of the project's chosen open source license.
Standard practice dictates that this file is named LICENSE
, LICENSE.md
,
LICENSE.txt
, or COPYING
(a convention historically used by GNU projects)
and is located in the root directory of the project's source code repository.
This metric involves categorizing contributors based on the frequency, consistency, and intensity of their contributions over a defined period. It aims to distinguish between individuals who contribute episodically (infrequent contributors) and those who engage with the project consistently and often substantially (regular or core contributors).
The CHAOSS community, for instance, defines "Occasional Contributors" as "people who make contributions to a project on an irregular basis".
This metric, often referred to as "returning contributors" or as an indicator of "contributor retention," counts the number of unique individuals who were active contributors in the last 90 days and had also made contributions in one or more defined previous periods.
This metrics calculates the rate of commits between the last 90 days and the last year. The purpose of this metrics is having an estimation of how distributed contributions are, and the "momentum" of the project.
- Number of commits per repository
- Number of developers per repository
- File type metrics (code, binaries or other)
- Commit side metrics (added lines and removed lines)
- Message size metrics (total, mean and median)
- Frequency metrics for commits (week, month and year)
- Developer categories (core, regular and casual)