-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Preserve UTF-8 BOM state in file-based app source files #52055
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Modified SourceFile.Load() to detect UTF-8 BOM in files - Modified SourceFile.Save() to preserve original BOM state - Added HasUtf8Bom property to SourceFile struct - Updated InternalAPI.Unshipped.txt with new API members - Added tests to verify BOM preservation (no-BOM and with-BOM cases) Fixes issue where dotnet add package command was adding BOM to files that originally didn't have one, breaking shebang scripts. Co-authored-by: DamianEdwards <249088+DamianEdwards@users.noreply.github.com>
src/Cli/Microsoft.DotNet.FileBasedPrograms/FileLevelDirectiveHelpers.cs
Outdated
Show resolved
Hide resolved
jjonescz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/Cli/Microsoft.DotNet.FileBasedPrograms/FileLevelDirectiveHelpers.cs
Outdated
Show resolved
Hide resolved
test/dotnet.Tests/CommandTests/Run/FileBasedAppSourceEditorTests.cs
Outdated
Show resolved
Hide resolved
test/dotnet.Tests/CommandTests/Run/FileBasedAppSourceEditorTests.cs
Outdated
Show resolved
Hide resolved
- Use SourceText's Encoding property instead of custom HasUtf8Bom tracking - Use stackalloc for 3-byte buffer on .NET Core (with fallback for netstandard2.0) - Update tests to use _testAssetsManager.CreateTestDirectory() infrastructure - Add issue reference (#52054) to test docs - Remove unnecessary API surface from InternalAPI.Unshipped.txt Co-authored-by: jjonescz <3669664+jjonescz@users.noreply.github.com>
jjonescz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/Cli/Microsoft.DotNet.FileBasedPrograms/FileLevelDirectiveHelpers.cs
Outdated
Show resolved
Hide resolved
- Remove custom DetectEncoding() method and use SourceText.From(stream, encoding: null) - SourceText automatically detects encoding including BOM when encoding is null - Move <see href/> tags inside <summary> elements for better documentation - Simplifies implementation while maintaining full functionality Co-authored-by: jjonescz <3669664+jjonescz@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a critical bug where dotnet add file.cs package <package> was inadvertently adding a UTF-8 BOM to file-based app source files, breaking shebang (#!) scripts on Unix-like systems. The fix preserves the original encoding state (with or without BOM) by leveraging Roslyn's SourceText encoding detection and preservation capabilities.
Key Changes:
- Modified
SourceFile.Load()to auto-detect encoding by passingencoding: nulltoSourceText.From() - Updated
SourceFile.Save()to use the detected encoding fromSourceText.Encodingproperty - Added comprehensive tests verifying BOM preservation for both cases (with and without BOM)
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/Cli/Microsoft.DotNet.FileBasedPrograms/FileLevelDirectiveHelpers.cs | Updated SourceFile.Load() to auto-detect encoding and SourceFile.Save() to preserve it using SourceText.Encoding |
| test/dotnet.Tests/CommandTests/Run/FileBasedAppSourceEditorTests.cs | Added two test cases: PreservesNoBomEncoding() for files without BOM and PreservesBomEncoding() for files with BOM |
| var bytes = File.ReadAllBytes(tempFile); | ||
| Assert.True(bytes.Length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF, | ||
| "File should have UTF-8 BOM"); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to verify the behavior when original file uses some other encoding besides UTF-8.
SourceFile.Save()usesEncoding.UTF8which adds BOM, breaking shebang linesFileLevelDirectiveHelpers.csFileBasedAppSourceEditorTests.csSourceFilestruct to preserve encoding using SourceText.Encoding propertystackallocfor 3-byte BOM detection buffer (with netstandard2.0 fallback)_testAssetsManager.CreateTestDirectory()infrastructure<see href/>inside<summary>elementsencoding: null)Original prompt
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.