Add parsing tests for string and bytes literals #489

hudlow · 2025-11-06T06:00:04Z

String and bytes literals, including escape sequences, triple-quoted multiline variants, and raw variants, are currently largely untested in the conformance suite. This PR is intended to close that gap.

Notably, cel-go fails tests that have an unescaped carriage return in a triple-quoted string — it seemingly canonicalizes carriage returns to line feeds. I don't think there should be any doubt that this is, in fact, a bug in cel-go.
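To make the disagreement concrete, here is a minimal Python sketch of the two outcomes for a triple-quoted literal whose source text contains an unescaped carriage return. It is not cel-go or any real CEL parser; `literal_body` and `normalize_cr` are hypothetical helpers that only model "keep the characters as written" versus "turn CR into LF".

```python
def literal_body(source: str) -> str:
    """Return the characters between the triple quotes, unchanged."""
    if not (source.startswith('"""') and source.endswith('"""')):
        raise ValueError("expected a triple-quoted literal")
    return source[3:-3]


def normalize_cr(body: str) -> str:
    """Hypothetical model of the observed behavior: CR becomes LF."""
    return body.replace("\r", "\n")


# Source text of the literal, with a real (unescaped) CR between 'a' and 'b'.
source = '"""a\rb"""'

preserved = literal_body(source)      # what the tests in this PR expect
normalized = normalize_cr(preserved)  # what the failing run appears to produce

print(repr(preserved))   # 'a\rb'
print(repr(normalized))  # 'a\nb'
```

The two results differ in exactly one character, which is why a test that compares against the preserved value fails under normalization.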

hudlow · 2025-11-06T16:56:09Z

@TristonianJones A quick review would be appreciated — I'd rather not merge my fixes to cel-es without tests.

TristonianJones · 2025-11-06T19:26:20Z

/gcbrun

TristonianJones · 2025-11-06T23:00:07Z

tests/simple/testdata/parse.textproto

+  }
+
+  test {
+    name: "single_quoted_escaped_carriage_return"


It looks like the CEL stacks (C++, Java, and Go) normalize carriage returns from \r to \n. I don't think this makes a huge difference, but it can simplify certain things. What are your feelings here?

In my view, there are many arguments to be made for many normalizations, and I don't think this one is so overwhelmingly compelling as to be the one exception. Also, doing the normalization with byte literals seems categorically wrong to me.

I'm surprised the behavior is consistent — I figured it was a fluke in Go but didn't test the others. There might be some angle I'm not considering, but absent that, it would be my preference that we follow the spec and add the conformance tests as they are.
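The concern about byte literals can be illustrated with a small Python sketch. It assumes, for illustration only, that a bytes literal's value is the UTF-8 encoding of the characters between its quotes; the variable names are hypothetical and no CEL implementation is being called.

```python
# Characters between the quotes of a hypothetical bytes literal,
# containing one unescaped carriage return (0x0D).
body = "a\rb"

# Following the literal as written keeps the 0x0D byte.
preserved_bytes = body.encode("utf-8")

# Normalizing first silently rewrites the payload to contain 0x0A instead.
normalized_bytes = body.replace("\r", "\n").encode("utf-8")

print(preserved_bytes.hex())   # 610d62
print(normalized_bytes.hex())  # 610a62
```

For text, CR versus LF is arguably a presentation detail; for bytes, the two values are simply different data.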

It's actually a decision that I can't quite pin down unless the same normalization is happening in GoogleSQL ... I'll have to check and get back to you. It shouldn't actually be a problem to remove, though it may result in some change-detector test failures.

Yep, this came from GoogleSQL:

https://github.com/google/zetasql/blob/c3f174111ccfbd6b5b0b112925ef962a3a5c8d47/zetasql/public/strings.cc#L198

Looks like we preserved this behavior for full literal parity between CEL and SQL.

If we remove support from CEL, GoogleSQL will probably still work just fine ... I'll try to test tomorrow to figure out whether we can relax the constraint.

Thanks for the quick follow-up! I went ahead and added tests for \r\n sequences as well, since I hadn't thought about those being special-cased. Let me know what you figure out and I can adjust as needed.

It certainly isn't the end of the world if we need to codify the line-ending normalization into the CEL spec.
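A short Python sketch shows why \r\n deserves its own tests rather than being covered by the lone-\r cases. The three policies below are hypothetical models; the thread does not say which one each implementation actually follows.

```python
def preserve(body: str) -> str:
    """No normalization: the literal is taken as written."""
    return body


def replace_each_cr(body: str) -> str:
    """Replace every CR with LF, so CR LF becomes two line feeds."""
    return body.replace("\r", "\n")


def collapse_crlf(body: str) -> str:
    """Treat CR LF as one line ending, then map any remaining CR to LF."""
    return body.replace("\r\n", "\n").replace("\r", "\n")


# One Windows-style line ending followed by one lone carriage return.
body = "line1\r\nline2\rline3"

results = {f.__name__: f(body) for f in (preserve, replace_each_cr, collapse_crlf)}
for name, value in results.items():
    print(f"{name}: {value!r}")
```

All three policies disagree on the \r\n sequence, while the two normalizing ones agree on the lone \r, so a suite with only lone-\r tests could not tell them apart.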

Let's update the tests to match the current behavior for the moment to get you unblocked. My GoogleSQL translation checks don't seem to surface any issues, but I'll need to do some broader tests to validate that I can drop the normalization in a timely fashion.

I opened #490 and commented out the offending tests so we don't create an explicit gap between the spec and the conformance suite.

TristonianJones

Some of these cases overlap with the basic.textproto test cases, but I don't mind that they do. I just had one question regarding carriage return handling.

hudlow · 2025-11-07T06:08:11Z

@TristonianJones Thanks for the quick review! I responded and also added one more (unrelated) commit for some tests I forgot to add (unassigned code points in string literals).

TristonianJones · 2025-11-08T02:48:11Z

/gcbrun

TristonianJones · 2025-11-08T20:45:31Z

/gcbrun

Add parsing tests for string and bytes literals

8d1d9c4

hudlow mentioned this pull request Nov 6, 2025

Fix for parser bug processing multiline strings bufbuild/cel-es#231

Merged

TristonianJones reviewed Nov 6, 2025

Add tests for unassigned code points in string literals

c425788

hudlow force-pushed the string-parsing-tests branch from 77d9616 to c425788 · November 7, 2025 06:00

Add tests for windows-line-end sequences in string literals

10a0643

hudlow mentioned this pull request Nov 8, 2025

Codify the lack of line-end normalization for triple-quoted strings #490

Open

defer line-ending normalization tests per google#490

1c6bb7f

TristonianJones approved these changes Nov 8, 2025

TristonianJones merged commit 7f3c4c5 into google:master Nov 8, 2025
2 checks passed
