In need of a canonical way to add license texts for licenseRefs / Maybe bugged?

While implementing a custom "exporter" that converts a detailed, custom SBOM format to SPDX, I encountered some _really weird behaviour_ in the implementation of Chapter 10, "Other licensing information detected", which I believe corresponds to the `ExtractedLicenseInfo` class in the library.
Since the base SPDX license list is insufficient for our use, we use both licenses from the list and licenseRefs to maintain correctness and detail in all our license expressions.

Using the library, I noticed that after converting and adding artifacts/software components as SPDX packages (maybe also files at some point), the converter programmatically finds and adds license texts for each custom licenseRef.
However, I couldn't find an example or documentation about how this is supposed to be done.

And that is the main question: _**What's the supported / recommended way to add custom license texts**_ to a document with the java library?

Trying to figure this out myself, I encountered some counterintuitive behaviour that may be caused by bugs in the implementation:

In our use-case, we want to write the resulting SPDX document to JSON using the `spdx-java-jackson-store` with `MultiFormatStore`.
I think the behaviour stems from a combination of the store's behaviour, `ModelSet#add`, and the use of `ExtractedLicenseInfo#id` to hold the licenseRef.
My guess is that when I later add my `new ExtractedLicenseInfo` with the correct licenseRef in the id field, the store rejects this new, complete object and instead keeps a previously added `AnyLicenseInfo` with the same id, which does not have the text yet.
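To make the suspected mechanism concrete, here is a minimal, self-contained model of it. This is not the library's code: `IdKeyedStoreSketch` and `Entry` are hypothetical names, and the "ignore an add when the id already exists" rule is only my assumption about what happens.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified model of the suspected behaviour; NOT the library's actual code. */
class IdKeyedStoreSketch {

    /** Hypothetical stand-in for a stored license entry (id = licenseRef). */
    static final class Entry {
        final String id;
        final String text;

        Entry(String id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    private final Map<String, Entry> byId = new LinkedHashMap<>();

    /** Set-like add: an entry whose id is already present is silently ignored. */
    boolean add(Entry entry) {
        return byId.putIfAbsent(entry.id, entry) == null;
    }

    /** Returns the stored text for the id, or null if the id is unknown. */
    String textOf(String id) {
        Entry entry = byId.get(id);
        return entry == null ? null : entry.text;
    }

    public static void main(String[] args) {
        IdKeyedStoreSketch store = new IdKeyedStoreSketch();
        // Assumed step 1: parsing a license expression registers the licenseRef without text.
        store.add(new Entry("LicenseRef-custom", ""));
        // Step 2: the completed object is offered later under the same id.
        boolean accepted = store.add(new Entry("LicenseRef-custom", "Full license text"));
        // Prints: accepted=false, text=''
        System.out.println("accepted=" + accepted + ", text='" + store.textOf("LicenseRef-custom") + "'");
    }
}
```

If the real store behaves like this model, it would explain why the text I add later never shows up in the JSON.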

As a user, these unique ids and the order in which I add license expressions and corresponding license texts are an implementation detail that I do not want to worry about.

---

Again, since I don't know the correct way, I experimented, but every interface I could find was unsatisfactory. To demonstrate my issues with the current behaviour, I attached a class containing an example for each of them.
In the code, `// PROBLEM:` comments try to highlight what I feel the issue is.

Regrettably, the code ended up being messy. As a summary for each example:

#### Example 1

Simply using `document.addExtractedLicenseInfos(new ExtractedLicenseInfo(licenseRef, text))` doesn't make the correct `text` show up in the JSON.

- I think this is the way _it should work_, since info in this section would be somewhat independent of what licenseRefs have previously been added through other packages.

#### Example 2

Since [Example 1](#example-1) doesn't work, I tried to use an instance returned by `parseSPDXLicenseString`.

- this did not work, as the issue wasn't with the construction of `ExtractedLicenseInfo` but with the way objects are resolved.
- this makes for bad code, since we have to guess the real type of the returned `AnyLicenseInfo` (and cast it to `ExtractedLicenseInfo` to be able to call `SpdxDocument#addExtractedLicenseInfos`)

Through extensive browsing of source code, I later found out about `LicenseInfoFactory.parseSPDXLicenseString`'s overloading, leading to [Example 4](#example-4).

#### Example 3

This demonstrates a hack that immediately adds a license text just after first parsing the expression that contains it.

- demonstrates that an instance of `ExtractedLicenseInfo` is remembered in some obscure part of the store without explicitly adding it (unsure where from, probably somewhere under `SpdxDocument#createPackage`?)
  - later the instance is found present already by matching its `id`, breaking `hasExtractedLicensingInfos` when manually adding `ExtractedLicenseInfo`
- this solution is unintuitive and makes for hacky code
- unsuitable to my use case for technical reasons
  - I need to be able to add license texts in a _later processing stage_.

#### Example 4

This is similar to [Example 2](#example-2), abusing `LicenseInfoFactory#parseSPDXLicenseString`. It returns a writable instance to which I add the text in place.

- same disadvantages as [Example 2](#example-2), except it produces the desired result (text shows up in `hasExtractedLicensingInfos`)
- bad API: this terrible hack leads to bad code and bad stability (since the interface is buried in implementation details).
- it's the way I bodged it... because I found no other way to make my program work
- demonstrates that `addExtractedLicenseInfos` **doesn't add the given object** to `hasExtractedLicensingInfos` **at all**!
  Instead, it later resolves something by "id" in some non-trivial way, depending on what licenseRefs are / were found in previously present license expressions.

I think a user of the library:

- shouldn't have to call `parseSPDXLicenseString` with its store, etc. as parameters, like I did here
- shouldn't ever have to rely on `parseSPDXLicenseString` giving the exact same instance between calls to achieve functionality
- should instead be able to add and remove information as objects freely
  - without them having unexpected side-effects (like an earlier package addition leading to broken license texts in `hasExtractedLicensingInfos`)
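As a rough illustration of the order-independent behaviour I have in mind, adding a text could act as an "upsert" keyed by licenseRef. This is only a sketch with hypothetical names (`UpsertByLicenseRefSketch`, `reference`, `addExtractedText`), not a proposal for the library's concrete API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of the desired semantics; not part of the SPDX library. */
class UpsertByLicenseRefSketch {

    private final Map<String, String> textByLicenseRef = new LinkedHashMap<>();

    /** Called when a licenseRef is seen in an expression; never overwrites an existing text. */
    void reference(String licenseRef) {
        textByLicenseRef.putIfAbsent(licenseRef, "");
    }

    /** Adds the text, or replaces a placeholder created by an earlier reference. */
    void addExtractedText(String licenseRef, String text) {
        textByLicenseRef.put(licenseRef, text);
    }

    /** Returns the stored text for the licenseRef, or null if it is unknown. */
    String textOf(String licenseRef) {
        return textByLicenseRef.get(licenseRef);
    }

    public static void main(String[] args) {
        UpsertByLicenseRefSketch doc = new UpsertByLicenseRefSketch();
        // Order 1: the expression is parsed first, the text arrives in a later stage.
        doc.reference("LicenseRef-a");
        doc.addExtractedText("LicenseRef-a", "Text A");
        // Order 2: the text is added first, the expression is parsed afterwards.
        doc.addExtractedText("LicenseRef-b", "Text B");
        doc.reference("LicenseRef-b");
        System.out.println(doc.textOf("LicenseRef-a") + " / " + doc.textOf("LicenseRef-b"));
    }
}
```

Either order ends with the full text stored, which is the property I am missing today.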

---

To conclude, I believe a change in the library's behaviour is required.
Example 1 could be a first step towards a better implementation.

So: If needed, I'll try to debug further for a PR once I know the way things _should_ work.

