Skip to content

Provides an abstraction for creating the JavaScriptEncoder used in SystemTextConfigurationEditorJsonSerializer #19849

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

AndyButland
Copy link
Contributor

Prerequisites

  • I have added steps to test this contribution in the description below

Provides a solution to #19812

Description

When serializing the data type configuration to the database we use SystemTextConfigurationEditorJsonSerializer which uses a default JavaScriptEncoder. The linked issue shows how a project may want to customize this, but controlling how particular UnicodeRanges are encoded.

It doesn't seem we can just change the default behaviour here - as we don't know what ranges are appropriate to consider for a given project. We could provide UnicodeRanges.All, but at minimum that would be a behaviourally breaking change and at worst could lead to errors where certain strings aren't encoded with escaping as we'd expect.

So I've added an interface with a default implementation of the current behaviour to allow this to be overridden.

Testing

Verify that for example a check box list with values containing non-Latin characters can be saved with configuration and used in content.
Verify that the non-Latin values continue to be escaped in the database, via:

select * from umbracoPropertyData where PropertyTypeId = {id}
select * from umbracoDataType where nodeid = {id}

Add a custom implementation such as the following and verify that the expect changes to the stored values in the database are made, that the data type configuration can be saved and that content using the data type can be updated.

using System.Text.Encodings.Web;
using System.Text.Unicode;
using Umbraco.Cms.Core.Composing;
using Umbraco.Cms.Core.Serialization;

namespace Umbraco.Cms.Web.UI.Custom.SystemTextConfigurationEditor;

public class SystemTextConfigurationEditorComposer : IComposer
{
    public void Compose(IUmbracoBuilder builder)
    {
        builder.Services.AddUnique<IConfigurationEditorJsonSerializerEncoderFactory, MyConfigurationEditorJsonSerializerEncoderFactory>();
    }
}

internal class MyConfigurationEditorJsonSerializerEncoderFactory : IConfigurationEditorJsonSerializerEncoderFactory
{
    public JavaScriptEncoder CreateEncoder() => JavaScriptEncoder.Create(UnicodeRanges.BasicLatin, UnicodeRanges.Cyrillic);
}

@Copilot Copilot AI review requested due to automatic review settings August 4, 2025 09:25
Copilot

This comment was marked as outdated.

@AndyButland AndyButland requested a review from Copilot August 4, 2025 09:47
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces an abstraction for creating the JavaScriptEncoder used in SystemTextConfigurationEditorJsonSerializer to address requirements for customizing Unicode character encoding in data type configurations. The solution allows projects to control how specific UnicodeRanges are encoded while maintaining backward compatibility.

  • Adds an interface and factory pattern for encoder creation
  • Updates SystemTextConfigurationEditorJsonSerializer to accept the encoder factory via dependency injection
  • Updates all test files to use the new constructor with the default encoder factory

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/Umbraco.Core/Serialization/IConfigurationEditorJsonSerializerEncoderFactory.cs Defines interface for encoder factory
src/Umbraco.Infrastructure/Serialization/DefaultConfigurationEditorJsonSerializerEncoderFactory.cs Provides default implementation using BasicLatin Unicode range
src/Umbraco.Infrastructure/Serialization/SystemTextConfigurationEditorJsonSerializer.cs Updates serializer to use encoder factory and adds obsolete constructor
src/Umbraco.Infrastructure/DependencyInjection/UmbracoBuilder.CoreServices.cs Registers default encoder factory implementation
tests/* Updates all test instantiations to use new constructor

@nikolajlauridsen
Copy link
Contributor

Looks good, and works for configurations; however, the values themselves are escaped in the property data table, for instance:

Configuration:
image

Saved values:
image

This tracks with the changes of this PR, but I'm wondering if the issue reported didn't relate to both of these options? Both the stored value and the configuration? 🤔

@AndyButland
Copy link
Contributor Author

This tracks with the changes of this PR, but I'm wondering if the issue reported didn't relate to both of these options? Both the stored value and the configuration? 🤔

Yes, I thought that was what I was seeing when I tested - i.e. the configuration and the property data both encoded as per the JavaScriptEncoder settings. I'll check again, but first just want to check you resaved the content after updating the code? As you would need to do that. Just resaving the data type wouldn't be enough.

@nikolajlauridsen
Copy link
Contributor

Yes, I resaved 😄 also threw a debugger on the value editor and could see that no custom encoder was configured.

I'm pretty sure this is because the configurations have their specific JSON serializer, which tracks with the IConfigurationEditorJsonSerializerEncoderFactory name as well 😄

An alternative solution could be to just introduce an IJsonSerializerEncoderFactory, and use that in both SystemTextConfigurationEditorJsonSerializer and SystemTextJsonSerializer potentially also SystemTextWebhookJsonSerializer it really depends on how granular we want people to control this,

@AndyButland
Copy link
Contributor Author

Thanks, I'll take a further look at this. On the face of it feels like this should be consistent, so one factory for all would make sense. Likely a project needing to use non Latin characters would want them serialized the same in all three of these places.

@nikolajlauridsen
Copy link
Contributor

Yeah, I completely agree, that seems to make the most sense to me too. Additionally, if the need for separate encodings arises we can extend the factory with a method to handle this, that is the benefit of the factory pattern after all 😄

….Tex.Json serializers.

Added the serializer's name as a parameter to allow for different encodings per serializer if required.
@AndyButland
Copy link
Contributor Author

I've pushed an update that should resolve the issue raised in the way we discussed.

Also just addressing the theoretical requirement for different encoders for different serializers, I've added a parameter to the factory method where the name of the serializer class can be passed, and used to customize the encoder returned. Seems reasonable, but maybe you think there's something better to provide here - I wondered about introducing an Alias property on the IJsonSerializer?

@nikolajlauridsen
Copy link
Contributor

nikolajlauridsen commented Aug 6, 2025

Tests out good now 👍

I think it is reasonable with the name as you did, however, I think a potentially more elegant solution could be to use generics, and pass the class through that way, I.E. JavaScriptEncoder.CreateEncoder<TSerializer>() that way you don't have to add an alias, and it gives more freedom to the implementation of IJsonSerializerEncoderFactory in how it decides the correct encoding, it could be class name, inherited class name, interface, all kinds of things 😄

@AndyButland
Copy link
Contributor Author

That's a much nicer idea, thanks. I've implemented that now.

Copy link
Contributor

@nikolajlauridsen nikolajlauridsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great now 🤩

@nikolajlauridsen nikolajlauridsen merged commit 44aa5dc into main Aug 7, 2025
24 of 25 checks passed
@nikolajlauridsen nikolajlauridsen deleted the v16/improvement/allow-encoder-customization-in-system-text-configuration-editor-json-serializer branch August 7, 2025 07:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants