Description
Trying to implement a custom "exporter" that converts a detailed, custom SBOM format to SPDX, I encountered some really weird behaviour regarding the implementation of Chapter 10 "Other licensing information detected section", which I believe corresponds to the ExtractedLicenseInfo class in the library.
Since the base SPDX license list is insufficient for our use, we use both licenses from the list and licenseRefs to maintain correctness and detail in all our license expressions.
Using the library, I noticed that after converting and adding artifacts/software components as SPDX packages (maybe also files at some point), the converter programmatically finds and adds license texts for each custom licenseRef.
However, I couldn't find an example or documentation about how this is supposed to be done.
And that is the main question: What's the supported / recommended way to add custom license texts to a document with the Java library?
Trying to figure this out myself, I encountered some counterintuitive behaviour that may be caused by bugs in the implementation:
In our use-case, we want to write the resulting SPDX document to JSON using the spdx-java-jackson-store with MultiFormatStore.
I think it has to do with a combination of the store's behaviour, ModelSet#add and the use of ExtractedLicenseInfo#id to hold the licenseRef.
I think that when I try to add my new ExtractedLicenseInfo later, with the correct licenseRef in the id field, the store will reject this new, completed object, instead keeping a previously added AnyLicenseInfo unchanged, which didn't have the text yet.
As a user, these unique ids and the order in which I add license expressions and corresponding license texts are an implementation detail that I do not want to worry about.
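To make the suspected mechanism concrete, here is a minimal sketch that uses only the JDK and none of the SPDX classes. The `Entry` class and the `LicenseRef-custom` id are hypothetical stand-ins. It only assumes that identity is decided by the id alone; whether ModelSet#add really behaves this way is exactly what I'm unsure about.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class FirstInsertWins {
    /** Hypothetical stand-in for a license entry whose identity is its id only. */
    static final class Entry {
        final String id;
        final String text; // null models "text not known yet"

        Entry(String id, String text) {
            this.id = id;
            this.text = text;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Entry && ((Entry) o).id.equals(id);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id);
        }
    }

    /** Adds a text-less entry first, then a complete one with the same id. */
    static String addTwice() {
        Set<Entry> entries = new HashSet<>();
        // Models the entry created implicitly while parsing a license expression.
        entries.add(new Entry("LicenseRef-custom", null));
        // Models the later, explicit add of the completed entry.
        boolean added = entries.add(new Entry("LicenseRef-custom", "Full license text"));
        // The set keeps the element it already holds, so the text is lost.
        return added + ":" + entries.iterator().next().text;
    }

    public static void main(String[] args) {
        System.out.println(addTwice()); // prints "false:null"
    }
}
```

If the store works like this, the order of additions decides which object survives, which would match what I observe.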
Again, since I don't know the correct way, I experimented, but every interface I could find was unsatisfactory. To demonstrate my issues with the current behaviour, I attached a class containing an example for each of them.
In the code, // PROBLEM: comments try to highlight what I feel the issue is.
Regrettably, the code ended up being messy. As a summary for each example:
Example 1
Simply using document.addExtractedLicenseInfos(new ExtractedLicenseInfo(licenseRef, text)) doesn't make the correct text show up in the JSON.
- I think this is the way it should work, since info in this section would be somewhat independent of what licenseRefs have previously been added through other packages.
Example 2
Since Example 1 doesn't work, I tried to use an instance returned by parseSPDXLicenseString.
- this did not work, as the issue wasn't with the construction of ExtractedLicenseInfo but with the way objects are resolved.
- this makes for bad code, since we have to speculate on the real type of the given AnyLicenseInfo (and cast to ExtractedLicenseInfo to be able to call SpdxDocument#addExtractedLicenseInfos)
Through extensive browsing of source code, I later found out about LicenseInfoFactory.parseSPDXLicenseString's overloading, leading to Example 4.
Example 3
This demonstrates a hack that immediately adds a license text just after first parsing the expression that contains it.
- demonstrates that an instance of ExtractedLicenseInfo is remembered in some obscure part of the store without explicitly adding it (unsure where from, probably somewhere under SpdxDocument#createPackage?)
  - later the instance is found present already by matching its id, breaking hasExtractedLicensingInfos when manually adding ExtractedLicenseInfo
- this solution is unintuitive and makes for hacky code
- unsuitable to my use case for technical reasons
- I need to be able to add license texts in a later processing stage.
Example 4
This is similar to Example 2, abusing LicenseInfoFactory#parseSPDXLicenseString. It returns a writable instance, to which I add the text in place.
- same disadvantages as Example 2, except it produces the desired result (text shows up in hasExtractedLicensingInfos)
- bad API, this terrible hack leads to bad code and bad stability (since the interface is buried in implementation details).
- it's the way I bodged it... because I found no other way to make my program work
- demonstrates that addExtractedLicenseInfos doesn't add the given object to hasExtractedLicensingInfos at all!
Instead, it later resolves something by "id" in some non-trivial way, depending on what licenseRefs are / were found in previously present license expressions.
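My guess at why Example 4 works can be sketched with a toy registry that hands out one shared instance per id. This is plain JDK code under that assumption; `Entry`, `Registry` and `resolve` are hypothetical names and not the library's API.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedInstanceById {
    /** Hypothetical mutable license entry; not the library's ExtractedLicenseInfo. */
    static final class Entry {
        final String id;
        String text; // stays null until someone sets it

        Entry(String id) {
            this.id = id;
        }
    }

    /** Hypothetical registry that returns the same instance for the same id. */
    static final class Registry {
        private final Map<String, Entry> byId = new HashMap<>();

        Entry resolve(String id) {
            return byId.computeIfAbsent(id, Entry::new);
        }
    }

    /** Sets the text on a later lookup and reads it through the earlier reference. */
    static String textSeenByEarlierReference() {
        Registry registry = new Registry();
        // First reference, e.g. created while a package's license expression is parsed.
        Entry fromPackage = registry.resolve("LicenseRef-custom");
        // Later lookup by the same id yields the same object.
        Entry resolvedLater = registry.resolve("LicenseRef-custom");
        resolvedLater.text = "Full license text"; // in-place update
        return fromPackage.text;
    }

    public static void main(String[] args) {
        System.out.println(textSeenByEarlierReference()); // prints "Full license text"
    }
}
```

In this model only an in-place update of the resolved instance is visible everywhere, while a freshly constructed object with the same id never enters the picture.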
I think a user of the library:
- shouldn't have to call parseSPDXLicenseString with its store, etc as parameters like I did here
- shouldn't ever have to rely on parseSPDXLicenseString giving the exact same instance between calls to achieve functionality
- should instead be able to add and remove information as objects freely
  - without them having unexpected side-effects (like an earlier package addition leading to broken license texts in hasExtractedLicensingInfos)
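As a rough illustration of the order-independent behaviour I have in mind (a sketch of the desired semantics only, not a proposal for the actual implementation), the following JDK-only class gives the same result whether the reference or the text arrives first. The names `reference` and `addText` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OrderIndependentAdd {
    // An empty string models "referenced, but no text supplied yet".
    private final Map<String, String> textById = new LinkedHashMap<>();

    /** Records that a licenseRef is used, without touching an existing text. */
    void reference(String id) {
        textById.putIfAbsent(id, "");
    }

    /** Supplies the text, whether or not the id was referenced before. */
    void addText(String id, String text) {
        textById.put(id, text);
    }

    String textOf(String id) {
        return textById.get(id);
    }

    /** Runs both orderings and reports whether they end in the same state. */
    static boolean sameResultInBothOrders() {
        OrderIndependentAdd referenceFirst = new OrderIndependentAdd();
        referenceFirst.reference("LicenseRef-custom");
        referenceFirst.addText("LicenseRef-custom", "Full license text");

        OrderIndependentAdd textFirst = new OrderIndependentAdd();
        textFirst.addText("LicenseRef-custom", "Full license text");
        textFirst.reference("LicenseRef-custom");

        return referenceFirst.textOf("LicenseRef-custom")
                .equals(textFirst.textOf("LicenseRef-custom"));
    }

    public static void main(String[] args) {
        System.out.println(sameResultInBothOrders()); // prints "true"
    }
}
```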
To conclude, I believe a change in the library's behaviour is required.
Example 1 could be a first step towards a better implementation.
So: If needed, I'll try to debug further for a PR once I know the way things should work.