Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Dec 6, 2025

  • Understand the issue: SourceFile.Save() uses Encoding.UTF8 which adds BOM, breaking shebang lines
  • Locate the problematic code in FileLevelDirectiveHelpers.cs
  • Identify test infrastructure in FileBasedAppSourceEditorTests.cs
  • Fix SourceFile struct to preserve encoding using SourceText.Encoding property
  • Add test case to verify BOM is not added when file doesn't have one
  • Add test case to verify BOM is preserved when file has one
  • Build and run new tests to validate the fix (tests pass)
  • Address PR feedback (round 1):
    • Use stackalloc for 3-byte BOM detection buffer (with netstandard2.0 fallback)
    • Leverage SourceText's Encoding property instead of custom tracking
    • Update tests to use _testAssetsManager.CreateTestDirectory() infrastructure
    • Add issue reference to test documentation
  • Address PR feedback (round 2):
    • Move <see href/> inside <summary> elements
    • Simplify to use SourceText's built-in encoding auto-detection (pass encoding: null)
    • Remove custom BOM detection logic entirely
  • Manually verify fix works end-to-end with shebang scripts
  • Reply to all PR comments
Original prompt

This section details on the original issue you should resolve

<issue_title>file-based apps: dotnet add file.cs package changes encoding of file.cs</issue_title>
<issue_description>### Describe the bug
The SDK changes encoding of file-based app files when adding packages.

To Reproduce

1 - Create a a file.cs (UTF-8 encoded)
2 - Add a shebang : #!/usr/bin/env dotnet run and chmod +x file.cs
3- Run it: ./file.cs
4 - Add any package dotnet add file.cs AnyNugetPackageIdHere => file.cs is converted to UTF-8 with BOM
5 - Running ./file.cs again after the package is added results in failure:

> ./file.cs
./file.cs: line 1: #!/usr/bin/env: No such file or directory

Further technical details

.NET 10 SDK, 1.0.100</issue_description>

<agent_instructions>Look at the root cause analysis and suggested fix in the issue comments</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@DamianEdwards Analysis from Copilot:

Root Cause Analysis

The issue is in the SourceFile struct in [src/Cli/Microsoft.DotNet.FileBasedPrograms/FileLevelDirectiveHelpers. cs](https://github.com/dotnet/sdk/blob/e300dea82e0248b0801f706914c96073c1bd5350/src/Cli/Microsoft.DotNet. FileBasedPrograms/FileLevelDirectiveHelpers.cs#L256-L274):

public void Save()
{
    using var stream = File.Open(Path, FileMode.Create, FileAccess. Write);
    using var writer = new StreamWriter(stream, Encoding.UTF8);
    Text.Write(writer);
}

Problem: Encoding. UTF8 (the static property) emits a UTF-8 BOM by default. When dotnet add file. cs package <package> modifies the file and saves it, the BOM (0xEF 0xBB 0xBF) is prepended to the file content.

On Unix-like systems, the shebang #!/usr/bin/env dotnet run becomes <BOM>#!/usr/bin/env dotnet run, which the kernel doesn't recognize as a valid interpreter directive, causing:

./file.cs: line 1: #!/usr/bin/env: No such file or directory

Fix: The Save() method should preserve the original file's encoding/BOM characteristics. Specifically, for files without a BOM, it should use:

new UTF8Encoding(encoderShouldEmitUTF8Identifier: false)

The Load() method should also detect and store whether the original file had a BOM so it can be preserved on save. </comment_new>


💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

- Modified SourceFile.Load() to detect UTF-8 BOM in files
- Modified SourceFile.Save() to preserve original BOM state
- Added HasUtf8Bom property to SourceFile struct
- Updated InternalAPI.Unshipped.txt with new API members
- Added tests to verify BOM preservation (no-BOM and with-BOM cases)

Fixes issue where dotnet add package command was adding BOM to files
that originally didn't have one, breaking shebang scripts.

Co-authored-by: DamianEdwards <249088+DamianEdwards@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix SDK encoding issue when adding package to file-based apps Preserve UTF-8 BOM state in file-based app source files Dec 6, 2025
Copilot AI requested a review from DamianEdwards December 6, 2025 20:19
Copilot finished work on behalf of DamianEdwards December 6, 2025 20:19
Copy link
Member

@jjonescz jjonescz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjonescz jjonescz added the Area-run-file Items related to the "dotnet run <file>" effort label Dec 7, 2025
- Use SourceText's Encoding property instead of custom HasUtf8Bom tracking
- Use stackalloc for 3-byte buffer on .NET Core (with fallback for netstandard2.0)
- Update tests to use _testAssetsManager.CreateTestDirectory() infrastructure
- Add issue reference (#52054) to test docs
- Remove unnecessary API surface from InternalAPI.Unshipped.txt

Co-authored-by: jjonescz <3669664+jjonescz@users.noreply.github.com>
Copilot AI requested a review from jjonescz December 7, 2025 20:58
Copilot finished work on behalf of jjonescz December 7, 2025 20:58
Copy link
Member

@jjonescz jjonescz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- Remove custom DetectEncoding() method and use SourceText.From(stream, encoding: null)
- SourceText automatically detects encoding including BOM when encoding is null
- Move <see href/> tags inside <summary> elements for better documentation
- Simplifies implementation while maintaining full functionality

Co-authored-by: jjonescz <3669664+jjonescz@users.noreply.github.com>
Copilot AI requested a review from jjonescz December 7, 2025 21:27
Copilot finished work on behalf of jjonescz December 7, 2025 21:27
@jjonescz jjonescz requested a review from a team December 8, 2025 09:11
@jjonescz jjonescz marked this pull request as ready for review December 8, 2025 09:12
Copilot AI review requested due to automatic review settings December 8, 2025 09:12
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a critical bug where dotnet add file.cs package <package> was inadvertently adding a UTF-8 BOM to file-based app source files, breaking shebang (#!) scripts on Unix-like systems. The fix preserves the original encoding state (with or without BOM) by leveraging Roslyn's SourceText encoding detection and preservation capabilities.

Key Changes:

  • Modified SourceFile.Load() to auto-detect encoding by passing encoding: null to SourceText.From()
  • Updated SourceFile.Save() to use the detected encoding from SourceText.Encoding property
  • Added comprehensive tests verifying BOM preservation for both cases (with and without BOM)

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
src/Cli/Microsoft.DotNet.FileBasedPrograms/FileLevelDirectiveHelpers.cs Updated SourceFile.Load() to auto-detect encoding and SourceFile.Save() to preserve it using SourceText.Encoding
test/dotnet.Tests/CommandTests/Run/FileBasedAppSourceEditorTests.cs Added two test cases: PreservesNoBomEncoding() for files without BOM and PreservesBomEncoding() for files with BOM

var bytes = File.ReadAllBytes(tempFile);
Assert.True(bytes.Length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF,
"File should have UTF-8 BOM");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to verify the behavior when original file uses some other encoding besides UTF-8.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area-run-file Items related to the "dotnet run <file>" effort

Projects

None yet

Development

Successfully merging this pull request may close these issues.

file-based apps: dotnet add file.cs package <package> changes encoding of file.cs

4 participants