Fix UTF-8 BOM file type detection for first-line syntax patterns #3315

krikera · 2025-05-30T21:44:28Z

This fix strips the UTF-8 BOM (U+FEFF) from the first line with strip_prefix('\u{feff}') before first-line syntax detection runs, so that files starting with a BOM are detected correctly.

Changes

Modified get_first_line_syntax in src/assets.rs to strip BOM before pattern matching
Added comprehensive test coverage for XML, shell scripts, and PHP files with BOM
Updated CHANGELOG.md
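
The BOM stripping described above can be illustrated with a minimal, standalone sketch. This is not the code from src/assets.rs: strip_bom and looks_like_xml are hypothetical names, and looks_like_xml is only a toy stand-in for the real first-line pattern matching. It shows why a leading U+FEFF defeats a first-line pattern and how str::strip_prefix removes it.

```rust
/// Hypothetical helper (not the actual code in src/assets.rs):
/// return the line without a leading UTF-8 BOM, or unchanged if none is present.
fn strip_bom(line: &str) -> &str {
    line.strip_prefix('\u{feff}').unwrap_or(line)
}

/// Toy stand-in for first-line pattern matching (assumption for illustration only).
fn looks_like_xml(first_line: &str) -> bool {
    first_line.starts_with("<?xml")
}

fn main() {
    let with_bom = "\u{feff}<?xml version=\"1.0\" encoding=\"UTF-8\"?>";

    // With the BOM still in place, the pattern does not match at the start of the line.
    assert!(!looks_like_xml(with_bom));

    // After stripping the BOM, the same line matches.
    assert!(looks_like_xml(strip_bom(with_bom)));

    // Lines without a BOM pass through unchanged.
    assert_eq!(strip_bom("#!/bin/sh"), "#!/bin/sh");
}
```

Using strip_prefix rather than trimming all leading U+FEFF characters removes at most one BOM and leaves the rest of the line untouched.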

Fixes #3314

keith-hall

Nice, thanks

Fix UTF-8 BOM file type detection for first-line syntax patterns - Fixes sharkdp#3314 (17e6952)

keith-hall approved these changes May 31, 2025

keith-hall merged commit 6886cda into sharkdp:master May 31, 2025
23 of 24 checks passed

krikera deleted the fix-utf8-bom-syntax-detection branch June 1, 2025 14:28
