Rewrite JarResultBuildStep and enable parallel compression of jars #49585
Conversation
<!-- let's avoid having commons-io crawling into quarkus-core -->
<exclude>commons-io:commons-io</exclude>
I will ban it through ForbiddenAPIs instead, but we have some unrelated code depending on it, so I need to clean up that code before doing it.
Will open a follow-up PR once this one is in.
Actually, this is fine: we already have a local forbidden-apis rule for core/deployment banning usage of Commons IO.
We use it in a lot of test utils everywhere, so it's hard to fully get rid of it. We could get rid of most usages, but the IOUtils helpers for getting the content of a URL are handy.
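As an aside, the URL case is one of the few where plain JDK APIs are nearly as convenient; a minimal sketch of a hypothetical replacement helper (class and method names invented for illustration):

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class UrlContent {

    // Hypothetical helper covering the common IOUtils.toString(url, charset) use case
    static String read(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```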
};
}

private static ExecutorService initExecutorService() {
It feels quite wasteful to create a whole new Executor for each single jar?
What about reusing the build executor and wrapping it in an adaptor which would ignore the shutdown requests issued by commons-compress?
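A minimal sketch of such an adaptor (not code from the PR, just to illustrate the idea; the `awaitTermination` shortcut assumes the caller joins its submitted tasks before shutting down, which commons-compress appears to do):

```java
import java.util.List;
import java.util.concurrent.AbstractExecutorService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Delegates all work to a shared executor but turns shutdown requests into
// no-ops, so a library insisting on shutting down "its" executor cannot
// tear down the shared build pool.
final class ShutdownIgnoringExecutorService extends AbstractExecutorService {

    private final ExecutorService delegate;

    ShutdownIgnoringExecutorService(ExecutorService delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable command) {
        delegate.execute(command);
    }

    @Override
    public void shutdown() {
        // deliberately ignored: the shared pool outlives this wrapper
    }

    @Override
    public List<Runnable> shutdownNow() {
        return List.of(); // shutdown is a no-op, so there is nothing to drain
    }

    @Override
    public boolean isShutdown() {
        return false;
    }

    @Override
    public boolean isTerminated() {
        return false;
    }

    @Override
    public boolean awaitTermination(long timeout, TimeUnit unit) {
        // pretend immediate termination; only safe if the caller has already
        // waited for the futures of the tasks it submitted through this wrapper
        return true;
    }
}
```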
Oh, on that you are plain right. I will create only one. BUT I prefer to avoid reusing the build executor, as I really don't want to make assumptions about what Commons Compress is doing with the executor.
I know it's a bit suboptimal but I think it's safer.
I had a look and I think I will keep it as it is, even if suboptimal. The code in Commons Compress has several comments saying they absolutely want the executor to be shut down.
I know it's probably being a bit too safe, but I really prefer not to break their contract.
But that's a lot of threads? And they wouldn't coordinate well with other jobs running on the existing executors. It worries me that this isn't just "a little suboptimal": it might lead to serious problems like hard-to-reproduce issues and unmanaged spikes of memory.
Raaah, I hesitate...
> has several comments saying they absolutely want the executor to be shut down.
> I know it's probably being a bit too safe but I really prefer not breaking their contract.
Ok, I see - they probably do some dodgy things then - I guess you're right to be cautious. What about at least reusing the `ParallelScatterZipCreator` across your various needs, would that be an option?
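For reference, the usual `ParallelScatterZipCreator` pattern looks roughly like this (a minimal sketch, not the PR's actual code):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.zip.ZipEntry;

import org.apache.commons.compress.archivers.zip.ParallelScatterZipCreator;
import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipArchiveOutputStream;

public class ParallelJarSketch {

    public static void main(String[] args) throws IOException, InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        // the creator takes ownership of the pool and shuts it down in writeTo()
        ParallelScatterZipCreator creator = new ParallelScatterZipCreator(pool);

        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        ZipArchiveEntry entry = new ZipArchiveEntry("hello.txt");
        entry.setMethod(ZipEntry.DEFLATED); // the compression method must be set before submitting
        creator.addArchiveEntry(entry, () -> new ByteArrayInputStream(payload));

        try (OutputStream out = Files.newOutputStream(Path.of("target/example.jar"));
                ZipArchiveOutputStream zipOut = new ZipArchiveOutputStream(out)) {
            creator.writeTo(zipOut); // waits for all compression tasks, then writes the entries
        }
    }
}
```

Since `writeTo()` shuts the executor down, a creator instance (and its pool) cannot simply be reused for the next jar, which is what this whole thread is about.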
So I ended up doing it... and the result is underwhelming. It's quite a bit slower.
I thought it could be due to the fact that we end up starting too many threads, so I tried your patch here: #49575.
And basically:
- your patch there makes the global build faster at 2 * cores
- but the specific jar compression is faster at 4 * cores (though the global build is slower even with the compression being faster)

So I think we are looking at something that specifically requires its own thread pool with a specific configuration.
I'm going to try your patch + executor decoupling, but with only one ExecutorService for all the jars.
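A minimal sketch of that "one ExecutorService for all the jars" variant (class and method names are hypothetical; the 4 * cores sizing just mirrors the measurement above):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class JarCompressionPool {

    // one pool shared by all jar compression work instead of one per jar
    static ExecutorService create() {
        return Executors.newFixedThreadPool(4 * Runtime.getRuntime().availableProcessors());
    }
}
```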
Mkay, I can't actually reproduce my numbers...
I'll push the patch reusing the common thread pool...
It's interesting that you opted to parallelize the writing of a single jar - weren't there also opportunities to write different jars in parallel? Would that be a different optimisation that we could do as well? But all such tweaks would benefit the most from actually sharing executors.
We could at some point, but I needed to optimize the writing of a single jar anyway, as not all jars are created equal: some of them are very large, some of them very small. I'm not convinced that adding parallelization of the whole jar operation after this work would bring much benefit. Feel free to have fun with it though :).
It doesn't feel like a priority now that you went with the nuclear approach :) But I do wonder how efficient "parallel compression" in commons-compress is, compared to a "simple" zipstream for each jar.
FWIW, the failures are due to some Maven tests using very old versions of Commons IO as a dependency and not relying on our BOM. As for Gradle, the problem was that a test checks the exact list of files in
In some cases, we build only one jar (native image and legacy thin jar, but only the first one is relevant these days), so just parallelizing the build of each jar won't help. In the case where we build the most jars, it's like 3, and one of the three is very small. So just parallelizing building each jar won't bring you much.
OK, it should be fine now.
@geoand this is another for you when you're back from PTO. Sorry :).
This class had become completely unmanageable due to its size. Given I'm willing to invest some time to see if we can improve our zip build time, this is a necessary step in preparation for the upcoming improvements.
Including the native image source jar. It is based on commons-compress. I haven't implemented parallel jar build for uberjars for now; this is a known limitation.
I would have squashed it, but it's not easy to squash into the initial commit due to conflicts.
Very nice work!
Let's get it in so we can avoid conflicts.
@gsmet FYI, this looks like it caused the following issue in Quarkus LangChain4j
Not a problem, just wanted to raise awareness that there might be other extensions that could fail as well.
Note
This is part of my efforts on large monoliths but it should benefit all applications in the end.
This is a big patch, sorry, but I tried to have semantic commits, even if the first commit is quite large by itself.
My goal was to only implement parallel compression of jars using Commons Compress... but I ended up being unable to do it with `JarResultBuildStep` in its current state: the class was too big (1600+ lines) and it was too hard to actually comprehend the specificities of each format. That's why my first step was to rewrite this class with a proper hierarchy, with each format split out. FWIW, it's not the first time I struggled to adjust things there, so my personal opinion is that this rewrite was long overdue.
Then I extracted the creation of the archive to an interface and finally implemented parallel compression using Commons Compress.
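The interface below is a hypothetical sketch of that shape, with invented names, just to illustrate the split; the actual abstraction in the PR may differ:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Supplier;

// Each jar format writes through this; implementations can be sequential
// (ZipFileSystem-based) or parallel (Commons Compress-based).
interface ArchiveCreator extends AutoCloseable {

    void addFile(String pathInArchive, Supplier<InputStream> content) throws IOException;

    @Override
    void close() throws IOException; // finishes and flushes the archive
}
```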
The last commit is just some final cleanup that was too entangled to actually be squashed in one of the existing commits.
For my test app with 37k Java source classes (+ all the ones we generate), it went from `6 seconds` to `1.6 seconds`, so a `3.75x` speedup.

For now, I'm still using the `ZipFileSystem` approach for Uberjars. Uberjars are specific as the Manifest is potentially updated at the end to include the multi-release bits, and it comes with some subtleties as the Manifest has to be added first to the jar. Nothing insurmountable, but this work already consumed a lot of bandwidth. I could be convinced to put in the extra work and get rid of it at some point. Not in this PR though.
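For illustration, a minimal sketch of that `ZipFileSystem` approach with the Manifest written first (simplified, not the actual PR code):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class UberjarSketch {

    public static void main(String[] args) throws IOException {
        Path jar = Path.of("target/uber.jar");
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        manifest.getMainAttributes().putValue("Multi-Release", "true");

        // mount the (new) jar as a writable zip filesystem
        try (FileSystem zipFs = FileSystems.newFileSystem(
                URI.create("jar:" + jar.toUri()), Map.of("create", "true"))) {
            Path manifestPath = zipFs.getPath("META-INF", "MANIFEST.MF");
            Files.createDirectories(manifestPath.getParent());
            // the Manifest goes in first so it sits at the front of the jar
            try (OutputStream out = Files.newOutputStream(manifestPath)) {
                manifest.write(out);
            }
            // ...then copy application classes and dependencies into zipFs
        }
    }
}
```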