A string in C# is a reference type that represents an immutable sequence of char values, each of which is a UTF-16 code unit. The string keyword is a direct alias for the .NET System.String class, so the two names denote the same type.
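The alias relationship can be checked directly. The following is a minimal sketch, with illustrative variable names, assuming a project that supports top-level statements (C# 9 or later):

```csharp
using System;

string keywordForm = "A";
String classForm = keywordForm;      // no conversion needed: both names denote one type

bool sameType = typeof(string) == typeof(String);
int firstCodeUnit = keywordForm[0];  // char converts implicitly to int

Console.WriteLine(sameType);         // True
Console.WriteLine(firstCodeUnit);    // 65 (U+0041)
Console.WriteLine(sizeof(char));     // 2 (bytes per UTF-16 code unit)
```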

Immutability and StringBuilder

Strings are strictly immutable. Once a string object is instantiated, its state cannot be modified. Any operation that appears to mutate a string (such as concatenation, replacement, or trimming) actually returns a new System.String instance allocated on the managed heap whenever the result differs from the original, leaving the original object unchanged until it is garbage collected. Concatenating in a loop therefore copies the growing contents into a fresh string on every iteration, so the total work grows roughly quadratically with the final length. For scenarios requiring extensive string manipulation, such as loops, .NET provides the System.Text.StringBuilder class. StringBuilder maintains a mutable internal buffer that grows as needed, so appends do not create an intermediate string each time, and a string is allocated only when ToString() is called.
// Inefficient: Allocates a new string on the heap in every iteration
string result = "";
for (int i = 0; i < 100; i++)
{
    result += i.ToString(); 
}

// Efficient: Mutates a single buffer, allocating a string only at the end
// (StringBuilder lives in System.Text, so the file needs: using System.Text;)
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 100; i++)
{
    sb.Append(i);
}
string optimizedResult = sb.ToString();
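Immutability itself can be observed without a loop. The sketch below uses illustrative variable names and shows that Trim and ToUpperInvariant hand back new objects while the original keeps its contents:

```csharp
using System;

string original = "  padded  ";
string trimmed = original.Trim();           // returns a new string
string upper = trimmed.ToUpperInvariant();  // returns another new string

Console.WriteLine($"[{original}]");  // [  padded  ]
Console.WriteLine($"[{trimmed}]");   // [padded]
Console.WriteLine($"[{upper}]");     // [PADDED]

// The original variable still refers to the untouched object.
bool sameObject = object.ReferenceEquals(original, trimmed);
Console.WriteLine(sameObject);       // False
```

Because every such call can allocate, long chains of transformations inside hot loops are a common place where StringBuilder or span-based APIs pay off.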

Null, Empty, and Validation

As a reference type, a string variable can be null, meaning it refers to no object. This is distinct from an empty string, which is a valid, allocated object with a length of zero. In current .NET runtimes, the empty string literal "" and the string.Empty field refer to the same interned instance, so there is no allocation difference between the two. The primary distinction is that "" is a compile-time constant (usable in attributes, switch cases, and default parameter values), whereas string.Empty is a static readonly field and cannot appear in those positions. The string class provides static helper methods to test for these states safely.
string nullString = null;
string emptyString = string.Empty; // static readonly field
string literalEmpty = "";          // Compile-time constant, identical in memory to string.Empty
string whitespaceString = "   ";

// Validation methods
bool isNullOrEmpty = string.IsNullOrEmpty(emptyString);           // True
bool isNullOrWhiteSpace = string.IsNullOrWhiteSpace(whitespaceString); // True
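The difference between the constant "" and a possibly-null reference shows up in everyday code. The following sketch is illustrative only: the Describe helper and the NoValue constant are hypothetical names, and nullable annotations are enabled explicitly so that string? compiles without warnings.

```csharp
using System;
#nullable enable

string? missing = null;
string empty = "";

// Null-conditional and null-coalescing operators avoid a NullReferenceException.
int lengthOrZero = missing?.Length ?? 0;
string display = missing ?? "(none)";

// "" is a constant, so it is legal in a const declaration and as a pattern;
// string.Empty would not compile in either position.
const string NoValue = "";

string Describe(string? s) => s switch
{
    null => "null",
    NoValue => "empty",
    _ => string.IsNullOrWhiteSpace(s) ? "whitespace" : "text"
};

Console.WriteLine(lengthOrZero);       // 0
Console.WriteLine(display);            // (none)
Console.WriteLine(Describe(missing));  // null
Console.WriteLine(Describe(empty));    // empty
Console.WriteLine(Describe("   "));    // whitespace
Console.WriteLine(Describe("abc"));    // text
```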

Memory Architecture and Interning

To optimize memory footprint, the .NET runtime implements string interning. During compilation, the C# compiler deduplicates identical string literals and stores them within the assembly’s metadata (specifically, the #US or User Strings stream). When the assembly is loaded and executed, the .NET runtime (CLR) reads this metadata and places these literals into an internal hash table called the intern pool. This ensures only one instance of each unique literal exists in memory.
string literal1 = "architecture";
string literal2 = "architecture";

// True: Both point to the exact same memory address in the intern pool
bool isSameReference = object.ReferenceEquals(literal1, literal2); 

Dynamically constructed strings (e.g., via concatenation at runtime) are allocated in standard heap memory and are not interned by default, though they can be manually interned using string.Intern().
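A minimal sketch of manual interning follows. The variable names are illustrative, and the strings are built from non-constant locals so the compiler cannot fold them into a single literal:

```csharp
using System;

string prefix = "archi";
string suffix = "tecture";

// Built at run time from non-constant operands, so each call allocates.
string first = string.Concat(prefix, suffix);
string second = string.Concat(prefix, suffix);

bool sameValue = first == second;                           // value comparison
bool sameObject = object.ReferenceEquals(first, second);    // identity comparison

// string.Intern returns the pool's canonical instance for a given value.
string internedFirst = string.Intern(first);
string internedSecond = string.Intern(second);
bool sameInterned = object.ReferenceEquals(internedFirst, internedSecond);

Console.WriteLine(sameValue);     // True
Console.WriteLine(sameObject);    // False
Console.WriteLine(sameInterned);  // True
```

Interned strings generally stay alive for the lifetime of the process, so interning is usually worthwhile only for a bounded set of values that repeat often.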

Syntax and Literals

C# provides multiple literal formats to construct strings, dictating how the compiler parses escape sequences and formatting:
// 1. Regular String Literal (processes standard escape sequences)
string regular = "Line 1\nLine 2\tTabbed";

// 2. Verbatim String Literal (backslashes are literal, "" escapes a double quote, allows multiline)
string verbatim = @"C:\Program Files\App\config.json";

// 3. Interpolated String (evaluates expressions at runtime)
int capacity = 256;
string interpolated = $"Buffer size: {capacity} bytes";

// 4. Raw String Literal (C# 11+) (no escape processing; indentation up to the closing """ is stripped)
// Note: The doubled backslashes are required by JSON, not by C#; both characters are kept in the string.
string raw = """
    {
        "compiler": "Roslyn",
        "path": "C:\\data"
    }
    """;

// 5. UTF-8 String Literal (C# 11+) (Returns ReadOnlySpan<byte>, not System.String)
ReadOnlySpan<byte> utf8String = "UTF-8 Encoded"u8;
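Several of these forms can be combined or carry extra syntax. The sketch below is illustrative, assumes C# 11 or later for the raw string, and shows quote escaping in a verbatim literal, alignment and format specifiers in an interpolated string, and an interpolated raw string:

```csharp
using System;

// Verbatim literal: backslashes are literal, a double quote is written as "".
string quoted = @"say ""hi"" \n";

// Interpolation with a format specifier and with an alignment width.
int capacity = 256;
string hex = $"0x{capacity:X4}";
string padded = $"[{capacity,6}]";

// Interpolated raw string: with $$, an expression is delimited by {{ }},
// so single braces are ordinary text.
string json = $$"""{"size": {{capacity}}}""";

Console.WriteLine(quoted);  // say "hi" \n
Console.WriteLine(hex);     // 0x0100
Console.WriteLine(padded);  // [   256]
Console.WriteLine(json);    // {"size": 256}
```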

Equality Semantics

Despite being a reference type, the string class overloads the equality (==) and inequality (!=) operators to perform value equality. When comparing two strings, the runtime first checks for reference equality and length equality as an optimization, then performs a character-by-character (ordinal) comparison of the UTF-16 code units.
string a = "data";
string b = new string(new char[] { 'd', 'a', 't', 'a' });

bool isValueEqual = (a == b);                   // True: character sequences match
bool isRefEqual = object.ReferenceEquals(a, b); // False: distinct heap allocations
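The overloaded operator applies only when both operands are statically typed as string, and it is always case-sensitive. The following sketch, with illustrative variable names, contrasts it with comparison through object references and with an explicit StringComparison:

```csharp
using System;

string lower = "data";
string upper = "DATA";
string copy = new string(new[] { 'd', 'a', 't', 'a' });

// Through object-typed references, == falls back to reference comparison.
object boxedLower = lower;
object boxedCopy = copy;
bool viaObject = boxedLower == boxedCopy;
bool viaEquals = boxedLower.Equals(boxedCopy);  // virtual Equals still compares values

// Explicit comparison rules make the intent visible.
bool ordinal = string.Equals(lower, upper, StringComparison.Ordinal);
bool ignoreCase = string.Equals(lower, upper, StringComparison.OrdinalIgnoreCase);

// The static Equals tolerates null arguments.
bool withNull = string.Equals(null, lower, StringComparison.Ordinal);

Console.WriteLine(viaObject);   // False
Console.WriteLine(viaEquals);   // True
Console.WriteLine(ordinal);     // False
Console.WriteLine(ignoreCase);  // True
Console.WriteLine(withNull);    // False
```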

Internal Structure and Indexing

A string implements IEnumerable<char> and IEquatable<string>, among other interfaces. It provides a read-only indexer to access individual char elements, each a 16-bit UTF-16 code unit. Under the hood, strings are not backed by char[] arrays. Instead, the character data is allocated inline within the string object’s own memory, alongside the object header and length. Because strings are sequences of UTF-16 code units, the Length property returns the number of char values, which does not always equal the number of Unicode code points or of user-perceived characters (grapheme clusters). A code point outside the Basic Multilingual Plane (BMP) requires two char values (a surrogate pair), and a single grapheme cluster may in turn span several code points.
string text = "A";
char firstCodeUnit = text[0];    // 'A'
int codeUnitCount = text.Length; // 1

// Surrogate pair example (Emoji)
string emoji = "🚀";
int emojiLength = emoji.Length;  // 2 (High and Low surrogate code units)
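
The three ways of counting can be compared side by side. This sketch assumes .NET Core 3.0 or later (for EnumerateRunes) and uses escape sequences so that the source file's encoding does not matter:

```csharp
using System;
using System.Globalization;
using System.Linq;

string rocket = "\U0001F680";  // the rocket emoji: one code point outside the BMP
string accented = "e\u0301";   // 'e' followed by U+0301 COMBINING ACUTE ACCENT

int rocketUnits = rocket.Length;                                   // UTF-16 code units
int rocketRunes = rocket.EnumerateRunes().Count();                 // Unicode code points
int rocketClusters = new StringInfo(rocket).LengthInTextElements;  // grapheme clusters

int accentedUnits = accented.Length;
int accentedRunes = accented.EnumerateRunes().Count();
int accentedClusters = new StringInfo(accented).LengthInTextElements;

bool isPair = char.IsSurrogatePair(rocket[0], rocket[1]);

Console.WriteLine($"{rocketUnits} {rocketRunes} {rocketClusters}");        // 2 1 1
Console.WriteLine($"{accentedUnits} {accentedRunes} {accentedClusters}");  // 2 2 1
Console.WriteLine(isPair);                                                 // True
```

Code that truncates or reverses text for display is usually safer iterating by text element than by char, since splitting a surrogate pair or a combining sequence produces malformed output.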