Clarify handling of null bytes (U+0000) in UTF-8 text fields across BOLT specifications #1260

@erickcestari

The current BOLT specifications require UTF-8 encoding for text fields (e.g., the BOLT11 description field and BOLT12 offer descriptions) but do not explicitly address the handling of null bytes (U+0000). While null bytes are technically valid UTF-8, they cause significant interoperability problems, and possibly security issues, across Lightning implementations.

Current Specification Language

  • BOLT11: "MUST set d to a valid UTF-8 string" (BOLT11 spec)
  • BOLT01: "A writer MUST ensure an array of these is a valid UTF-8 string, a reader MAY reject any messages containing an array of these which is not a valid UTF-8 string" (BOLT01 spec)

The Problem

Null bytes (U+0000) are valid Unicode code points and valid UTF-8, but they cause severe implementation issues:

  1. C/C++ implementations: Treat null bytes as string terminators, causing truncation
  2. Inconsistent behavior: Different implementations handle them differently (truncate, reject, or pass through)
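
To make the truncation concrete, here is a minimal sketch using the description from the BOLT12 test vector in this issue. The helper `c_string_view` is hypothetical: it only mimics code that stops at the first 0x00 byte the way `strlen()` does, and is not taken from any Lightning implementation.

```python
# Description bytes from the BOLT12 test vector in this issue.
raw = b"Test\x00vectors"

# U+0000 encodes as the single byte 0x00, so strict UTF-8 decoding succeeds.
decoded = raw.decode("utf-8")


def c_string_view(data: bytes) -> bytes:
    """Hypothetical helper: keep only the bytes before the first 0x00,
    as strlen()-based code would see them."""
    return data.split(b"\x00", 1)[0]


truncated = c_string_view(raw)

print(len(decoded))  # 12: the null is a real character in the string
print(truncated)     # b'Test': everything after the null is lost
```

A reader that keeps all 12 characters and a reader that sees only `Test` would display different descriptions for the same invoice or offer.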

For example, the BOLT11 invoice and BOLT12 offer below cannot be decoded by CLN, whereas rust-lightning handles them.

Test Vectors:

BOLT11:

lnbc100n1p70xwfzpp5qqqsyqcyq5rqwzqfqqqsyqcyq5rqwzqfqqqsyqcyq5rqwzqfqypqdrv2pkx2ctnv5sxxmmwwd5kgetjypeh2ursdae8g6twvus8g6rfwvs8qun0dfjkxaqqqpmkjargyph82mrvyp38jar9wvqx2mtzv4jxgetyqqqqnp4q0n326hr8v9zprg8gsvezcch06gfaqqhde2aj730yg0durunfhv66sp5qszsvpcgpyqsyps8pqysqqgzqvyqjqqpqgpsgpgqqypqxpq9qcrs9qrsgq2srkxv0a8uu02qvtcvlt5ex354axardkn8z0t59twhsk7qn660gqw0l8ygtfvpdnt8u892qhmp85eueccvnmxm7frkk9mzscfajvgfqq00jpr3

Description: Please consider supporting this project\x00\x00with null bytes\x00embedded\x00\x00

BOLT12 offer:

lno1pgx9getnwsq8vetrw3hhyucs5ypjgef743p5fzqq9nqxh0ah7y87rzv3ud0eleps9kl2d5348hq2k8qzqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgqpqqqqqqqqqqqqqqqqqqqqqqqqqqqzqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqyqszqgpqqzq3zyg3zyg3zygs

Description: Test\x00vectors

Proposed Solution

Explicitly Prohibit Null Bytes
Text fields MUST contain valid UTF-8 and MUST NOT contain null bytes (U+0000).
Implementations MUST reject messages containing null bytes in text fields.
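
As a rough illustration of the proposed rule, a reader-side check could look like the sketch below. The function name `parse_text_field` and the choice of `ValueError` are assumptions made for this example; the BOLTs do not define such an API.

```python
def parse_text_field(data: bytes) -> str:
    """Hypothetical reader-side check for a UTF-8 text field.

    Rejects input that is not valid UTF-8 and input that contains U+0000.
    """
    try:
        # Strict decoding: malformed sequences raise UnicodeDecodeError.
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("text field is not valid UTF-8") from exc
    if "\x00" in text:
        raise ValueError("text field contains a null byte (U+0000)")
    return text


print(parse_text_field(b"Test vectors"))

try:
    parse_text_field(b"Test\x00vectors")
except ValueError as err:
    print("rejected:", err)
```

Checking for the null after decoding keeps the two failure reasons separate, which may help when reporting why a message was rejected.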
