kocheick

Overview

Implements automatic XML sitemap generation for Kobweb applications to improve SEO and search engine discoverability.

Closes #701

Implementation Details

New Features

  • Automatic sitemap generation: Creates sitemap.xml at site root (/sitemap.xml)
  • Smart route discovery: Uses existing @Page annotation detection (includes markdown-generated pages)
  • Flexible configuration: DSL block under kobweb.app.sitemap { ... }
  • Dynamic route filtering: Excludes parameterized routes like /users/{id} by default
  • Custom filtering: Support for excludeRoutes, extraRoutes, and custom routeFilter lambdas
  • Localhost testing: Supports http://localhost:8080 for development testing
  • Standards compliant: Generates proper XML sitemaps following sitemaps.org specification
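
To make the filtering rules and output format listed above concrete, here is a minimal plain-Kotlin sketch that does not depend on Gradle. The names buildSitemapXml, isDynamicRoute, and escapeXml are hypothetical and are not taken from this PR. The sketch assumes that dynamic routes are recognizable by a {param} segment and that extraRoutes are added after dynamic-route filtering; the actual task may order these steps differently.

```kotlin
// Hypothetical helper names; an illustration, not the PR's implementation.

/** Treats a route as dynamic if it contains a `{param}` segment, e.g. `/users/{id}`. */
fun isDynamicRoute(route: String): Boolean = route.contains('{')

/** Escapes the five characters that the sitemaps.org protocol requires to be entity-escaped. */
fun escapeXml(text: String): String = buildString {
    for (c in text) {
        when (c) {
            '&' -> append("&amp;")
            '<' -> append("&lt;")
            '>' -> append("&gt;")
            '"' -> append("&quot;")
            '\'' -> append("&apos;")
            else -> append(c)
        }
    }
}

fun buildSitemapXml(
    baseUrl: String,
    discoveredRoutes: List<String>,
    extraRoutes: List<String> = emptyList(),
    excludeRoutes: Set<String> = emptySet(),
    routeFilter: (String) -> Boolean = { true },
): String {
    val base = baseUrl.trimEnd('/')
    // Assumption: user-supplied extraRoutes skip the dynamic-route check,
    // but excludeRoutes and routeFilter still apply to every route.
    val routes = (discoveredRoutes.filterNot(::isDynamicRoute) + extraRoutes)
        .filterNot { it in excludeRoutes }
        .filter(routeFilter)
        .distinct()
        .sorted()

    return buildString {
        appendLine("""<?xml version="1.0" encoding="UTF-8"?>""")
        appendLine("""<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">""")
        for (route in routes) {
            appendLine("  <url><loc>${escapeXml(base + route)}</loc></url>")
        }
        appendLine("</urlset>")
    }
}
```

Calling buildSitemapXml("https://mysite.com", listOf("/", "/users/{id}", "/about")) would emit <loc> entries for / and /about only, since the parameterized route is dropped.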

Task Dependencies

  • Runs after markdown processing (kobwebxMarkdownProcess) when present
  • Integrated into resource processing pipeline
  • Outputs directly to src/jsMain/resources/public/sitemap.xml

Configuration Examples

Basic usage:

kobweb {
    app {
        sitemap {
            baseUrl.set("https://mysite.com")
        }
    }
}

Advanced configuration:

kobweb {
    app {
        sitemap {
            baseUrl.set("https://mysite.com")
            extraRoutes.addAll("/blog/post-1", "/products/special")
            excludeRoutes.addAll("/admin", "/internal")
            routeFilter.set { !it.contains("/temp/") }
        }
    }
}

Localhost testing:

kobweb {
    app {
        sitemap {
            baseUrl.set("http://localhost:8080")
        }
    }
}

Key Benefits

  • Minimal configuration: Works out of the box for most sites once baseUrl is set
  • Markdown integration: Automatically includes markdown-generated pages
  • Development friendly: Easy to test locally with localhost URLs
  • Production ready: Handles large sites with size limit warnings
  • SEO optimized: Places sitemap at standard /sitemap.xml location

Files Added/Modified

  • New: KobwebGenerateSitemapTask.kt - Main sitemap generation task
  • Modified: AppBlock.kt - Added SitemapBlock configuration DSL
  • Modified: KobwebApplicationPlugin.kt - Task registration and dependency wiring

Testing

  • Supports http://localhost:8080 for local development testing
  • Generates sitemap at src/jsMain/resources/public/sitemap.xml
  • Accessible at /sitemap.xml when server is running
  • Works with both development (kobweb start) and production (kobweb export) workflows

Breaking Changes

None - this is purely additive functionality that only activates when baseUrl is configured.

@kocheick kocheick changed the base branch from main to dev August 22, 2025 02:08
Collaborator

@DennisTsar DennisTsar left a comment

Thanks for your work on this.

I took a look at the code relating to the task creation & configuration and left some comments/tips about how to make it more Gradle-idiomatic. I haven't looked at the code that pertains to the actual sitemap generation.

The biggest thing to address is the following (copied from one of my comments below):

It's worth considering how we actually want a user to enable/disable sitemap generation. While onlyIf { baseUrl.isPresent } is an option, it has a few drawbacks:

  • It's not obvious to a user that baseUrl is required for sitemap generation
  • Users that don't want sitemap generation still see the (skipped) task in their console

Instead, I would suggest having sitemap generation enabled with a function:

fun generateSitemap(baseUrl: String, /* other parameters with defaults */)
// or possibly
fun generateSitemap(baseUrl: String, config: SitemapConfig.() -> Unit = {})

The latter could be nicer to support Property values, which would allow creating a task to, for example, fetch entries from a database, and then use those values to populate extraRoutes for dynamic routes. We could also provide both. Calling the function would register the sitemap generation task (with some mechanism to disallow/warn about calling it multiple times) and configure it accordingly.
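
As a rough illustration of the second signature, the sketch below models the idea in plain Kotlin with no Gradle dependency. All names here (SitemapConfig's fields, SitemapRegistration, AppBlockModel) are hypothetical stand-ins; in a real plugin the body of generateSitemap would register and configure the task rather than store a value, and the config would likely expose Gradle Property types instead of plain fields.

```kotlin
// Hypothetical names throughout; a plain-Kotlin model of the proposed API shape.

class SitemapConfig {
    val extraRoutes = mutableListOf<String>()
    val excludeRoutes = mutableListOf<String>()
    var routeFilter: (String) -> Boolean = { true }
}

/** Stand-in for the settings a real implementation would hand to a registered Gradle task. */
data class SitemapRegistration(val baseUrl: String, val config: SitemapConfig)

class AppBlockModel {
    var sitemap: SitemapRegistration? = null
        private set

    /** Enables sitemap generation; the required [baseUrl] is explicit in the signature. */
    fun generateSitemap(baseUrl: String, config: SitemapConfig.() -> Unit = {}) {
        // Disallow a second call so there is never more than one registration.
        check(sitemap == null) { "generateSitemap(...) may only be called once" }
        require(baseUrl.startsWith("http://") || baseUrl.startsWith("https://")) {
            "baseUrl must be an absolute http(s) URL, got: $baseUrl"
        }
        sitemap = SitemapRegistration(baseUrl, SitemapConfig().apply(config))
    }
}
```

With this shape, a user who never calls generateSitemap gets no sitemap task at all, and a user who does call it cannot forget baseUrl, which addresses both drawbacks listed above.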

@DennisTsar
Collaborator

I'm not sure if you are waiting for a review on this (feel free to request a re-review from me if so), but I'll note that in the current state kobweb run does not work in the playground project (I think due to a misplaced }).

Also, in case it got buried, I described here how to deal with @get:Internal // Avoid serialization issues with lambdas.

@kocheick
Author

> I'm not sure if you are waiting for a review on this (feel free to request a re-review from me if so), but I'll note that in the current state kobweb run does not work in the playground project (I think due to a misplaced }).
>
> Also, in case it got buried, I described here how to deal with @get:Internal // Avoid serialization issues with lambdas.

There was indeed a misplaced }: in KobwebApplicationPlugin I accidentally removed a closing brace, which put the jvm block inside the js block. It's fixed now.
This has been a great learning experience, because the filterBlock now works consistently whenever I change something in the block. However, it led to updating kobwebSiteRoutes
from:

val Project.kobwebSiteRoutes: Provider<List<String>>
    get() = tasks.named<KobwebCacheAppFrontendDataTask>("kobwebCacheAppFrontendData").map { task ->
        val pageEntries =
            Json.decodeFromString<AppFrontendData>(task.appDataFile.get().asFile.readText()).frontendData.pages
        pageEntries
            .asSequence()
            .map { it.route }
            .sorted()
            .toList()
    }

to:

val Project.kobwebSiteRoutes: Provider<List<String>>
    get() = tasks.named<KobwebCacheAppFrontendDataTask>("kobwebCacheAppFrontendData").flatMap { task ->
        task.appDataFile.map { file ->
            val pageEntries =
                Json.decodeFromString<AppFrontendData>(file.asFile.readText()).frontendData.pages
            pageEntries
                .asSequence()
                .map { it.route }
                .sorted()
                .toList()
        }
    }

@kocheick
Author

regarding kobwebSiteRoutes

Fix: configuration cache compatibility for kobwebSiteRoutes

TL;DR

  • Stop reading task outputs during configuration.
  • Switch from map + get() to flatMap + map so file I/O happens at execution time.
  • Enables Gradle configuration cache and removes intermittent file-missing errors after clean.

Problem

We saw failures like:

.../build/kobweb/cache/kobwebCacheAppFrontendData/appData.json (No such file or directory)

This occurred while Gradle was serializing the configuration cache. Root cause: kobwebSiteRoutes was reading a task output file during configuration.

Before (problematic)

val Project.kobwebSiteRoutes: Provider<List<String>>
    get() = tasks.named<KobwebCacheAppFrontendDataTask>("kobwebCacheAppFrontendData").map { task ->
        val pageEntries =
            Json.decodeFromString<AppFrontendData>(task.appDataFile.get().asFile.readText()).frontendData.pages
        pageEntries.asSequence().map { it.route }.sorted().toList()
    }

  • task.appDataFile.get() forces eager resolution, which can happen at configuration time.
  • The file often doesn’t exist yet at that point; reading it there also breaks configuration cache constraints.

After (config-cache friendly)

val Project.kobwebSiteRoutes: Provider<List<String>>
    get() = tasks.named<KobwebCacheAppFrontendDataTask>("kobwebCacheAppFrontendData").flatMap { task ->
        task.appDataFile.map { file ->
            val pageEntries =
                Json.decodeFromString<AppFrontendData>(file.asFile.readText()).frontendData.pages
            pageEntries.asSequence().map { it.route }.sorted().toList()
        }
    }

  • flatMap defers resolving appDataFile until execution.
  • Inner map defers file I/O until the provider is realized by a task.
  • No .get() during configuration.
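
For context on how such a provider is meant to be consumed, here is a hedged build-script sketch. WriteRouteListTask and the writeRouteList task name are hypothetical and exist only for illustration. It assumes the kobwebSiteRoutes extension shown above is in scope in the build script.

```kotlin
import org.gradle.api.DefaultTask
import org.gradle.api.file.RegularFileProperty
import org.gradle.api.provider.ListProperty
import org.gradle.api.tasks.Input
import org.gradle.api.tasks.OutputFile
import org.gradle.api.tasks.TaskAction

// Hypothetical task, for illustration only.
abstract class WriteRouteListTask : DefaultTask() {
    @get:Input
    abstract val routes: ListProperty<String>

    @get:OutputFile
    abstract val outputFile: RegularFileProperty

    @TaskAction
    fun write() {
        // The provider chain is resolved here, at execution time,
        // after the task that produces appData.json has run.
        outputFile.get().asFile.writeText(routes.get().joinToString("\n"))
    }
}

tasks.register<WriteRouteListTask>("writeRouteList") {
    // No .get() here: the provider is handed over unresolved. Because it is
    // derived from a task output property, Gradle is expected to infer the
    // dependency on the producing task.
    routes.set(kobwebSiteRoutes) // assumption: the extension from this PR is importable here
    outputFile.set(layout.buildDirectory.file("routes.txt"))
}
```

The point of the sketch is that the only .get() calls sit inside the @TaskAction method, so nothing reads appData.json while the build is being configured.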

Why this change

  • Preserve API: still returns Provider<List<String>> with the same behavior.
  • Align with Gradle provider best practices (no eager .get() in configuration phase).
  • Unblock configuration cache for faster subsequent builds.
  • Remove flakiness after clean when the file hasn’t been produced yet.

Impact

  • Functional: no changes for consumers of kobwebSiteRoutes (sitemap, route tools, etc.).
  • Performance: configuration cache usable; faster repeat builds.
  • Stability: eliminates intermittent file-missing errors.

Verification

  • ./gradlew clean kobwebStart: passes.
  • Subsequent runs reuse configuration cache without errors.
  • Features depending on kobwebSiteRoutes (e.g., sitemap) behave as before.

Follow-ups / guidance

  • Avoid .get() on Property/Provider during configuration.
  • Prefer provider chains:
    • Avoid: provider.map { it.output.get().asFile.readText() }
    • Prefer: provider.flatMap { it.output.map { file -> file.asFile.readText() } }
  • For onlyIf guards, capture simple booleans at configuration time instead of touching task outputs or extensions at execution time.
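
The onlyIf guidance could look roughly like the build-script fragment below. The sitemapEnabled property and the sitemapGuardExample task name are hypothetical; the fragment only illustrates capturing a plain Boolean before the lambda is created.

```kotlin
// Hypothetical flag, set in gradle.properties or via -PsitemapEnabled=true.
// Resolved once at configuration time into a plain Boolean.
val sitemapEnabled: Boolean =
    providers.gradleProperty("sitemapEnabled").map { it.toBoolean() }.getOrElse(false)

tasks.register("sitemapGuardExample") {
    // The lambda closes over the Boolean only, not over the project,
    // an extension object, or another task's outputs.
    onlyIf { sitemapEnabled }
    doLast { logger.lifecycle("Generating sitemap (example)") }
}
```

Keeping the captured state this small is what makes the guard straightforward to store in the configuration cache.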

@kocheick kocheick requested a review from DennisTsar September 11, 2025 06:15

Development

Successfully merging this pull request may close these issues.

Add Kobweb support for creating a sitemap