TextDecoder's handling of unassigned code points in Shift_JIS is incompatible with the WHATWG spec

### Version

v18.5.0

### Platform

_No response_

### Subsystem

_No response_

### What steps will reproduce the bug?

```js
const decoder = new TextDecoder('Shift_JIS');
const s = decoder.decode(new Uint8Array([255]));
```

### How often does it reproduce? Is there a required condition?

Always

### What is the expected behavior?

```js
const decoder = new TextDecoder('Shift_JIS');
const s = decoder.decode(new Uint8Array([255]));
console.log(s) // '�' === '\ufffd'
```

According to the [WHATWG spec](https://encoding.spec.whatwg.org/#:~:text=the%0A%20%20associated%20steps%3A-,%22replacement%22,-Push%20U%2BFFFD), a decoder in the default "replacement" error mode should emit `�` (U+FFFD) when it encounters an unassigned code point during decoding.
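For reference, here is a rough sketch of how I read the first-byte handling of the Shift_JIS decoder in the Encoding Standard, for the case where no lead byte is pending. The function name and the returned labels are made up for illustration and are not part of any API. It shows why the byte `0xFF` ends up in the error branch, which the "replacement" error mode turns into U+FFFD.

```js
// Illustrative sketch only: classifyFirstByte and its labels are hypothetical,
// based on my reading of the Shift_JIS decoder steps in the Encoding Standard.
function classifyFirstByte(byte) {
  // ASCII bytes (0x00-0x7F) and 0x80 decode to the code point with the same value.
  if (byte <= 0x80) return 'single';
  // 0xA1-0xDF decode to half-width katakana: U+FF61 + (byte - 0xA1).
  if (byte >= 0xA1 && byte <= 0xDF) return 'katakana';
  // 0x81-0x9F and 0xE0-0xFC start a two-byte sequence.
  if ((byte >= 0x81 && byte <= 0x9F) || (byte >= 0xE0 && byte <= 0xFC)) return 'lead';
  // Everything else (0xA0, 0xFD, 0xFE, 0xFF) is an error.
  return 'error';
}

console.log(classifyFirstByte(0x41)); // 'single'
console.log(classifyFirstByte(0xFF)); // 'error'
```

With `fatal: true` the same error should surface as a thrown `TypeError` rather than a replacement character.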

### What do you see instead?

```js
const decoder = new TextDecoder('Shift_JIS');
const s = decoder.decode(new Uint8Array([255]));
console.log(s) // '\x1A'
```
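Since `\x1A` is a control character and `�` can be hard to tell apart from other glyphs in a terminal, it may be easier to compare the results as code points. The helper below is a minimal sketch; `toCodePoints` is a hypothetical name, not something Node.js provides.

```js
// Hypothetical helper: render each code point of a string as U+XXXX.
function toCodePoints(str) {
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

console.log(toCodePoints('\x1A'));   // [ 'U+001A' ]  (what I see)
console.log(toCodePoints('\ufffd')); // [ 'U+FFFD' ]  (what the spec calls for)
```

Passing the decoder output from the snippet above to `toCodePoints` makes the difference visible at a glance.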

From my investigation, ICU intentionally substitutes `\x1A` for unassigned code points in the Shift_JIS encoding, and Node.js passes that output through unchanged.

- [Conversion Data - ICU Documentation](https://unicode-org.github.io/icu/userguide/conversion/data.html#:~:text=conversion%20from%20a%20codepage%20to%20unicode%20occurs%20and%20an%20unassigned%20codepoint%20is%20found)
- [Which substitution character is used if a character cannot be converted?](https://documentation.softwareag.com/natural/nat914unx/unicode/uni-faq.htm#:~:text=This%20depends%20on,page%20is%20used.)

### Additional information

ICU provides the utility `ucnv_setSubstChars` to specify the substitution characters for any encoding, and it is already available in the ICU library that Node.js uses. I'm working on a fix.
