@jaredpar
Created May 29, 2026 01:12
x86 Test Host OOM Analysis

Summary

Two testhost.net472.x86.exe processes crashed with OOM on a Helix CI agent. Root cause: Roslyn's TemporaryStorageService accumulated 525+ unique 8 MB memory-mapped file (MMF) sections, which would need ~4,200 MB of virtual address space if all were mapped in full. That is more than the 4 GB (4,096 MB) address space available to an x86 process.

Environment

| Property | Value |
|---|---|
| Process | testhost.net472.x86.exe |
| CLR | .NET Framework 4.8.9325.0 |
| SDK | 10.0.108 |
| LARGEADDRESSAWARE | Yes (4 GB VA limit) |
| Architecture | x86 on x64 (WoW64) |

Dump Files

| Dump | Size | Captured | Uptime |
|---|---|---|---|
| testhost.net472.x86_1228_20260527T193328_hangdump.dmp | 3.01 GB | May 27, 2026 12:33 | 21 min |
| testhost.net472.x86_6368_20260528T171209_hangdump.dmp | 3.30 GB | May 28, 2026 10:12 | 28 min |

VA Space Breakdown (Dump 1)

| Category | Committed | Est. VA Cost |
|---|---|---|
| Private RW (GC heap + native) | 1,363 MB | 1,363 MB |
| Mapped R (memory-mapped files) | 1,343 MB | ~1,733 MB |
| Image (DLL code/data) | 346 MB | 346 MB |
| Other committed | 23 MB | 23 MB |
| Reserved | 135 MB | 135 MB |
| Total | 3,210 MB | ~3,600 MB |
| Remaining of 4 GB (4,096 MB) | | ~500 MB |

Key Observations

  • 57 × 16 MB Private RW regions = GC heap segments (~912 MB)
  • 27,730 Mapped R regions (mostly 16–64 KB each), consuming 1,343 MB committed but ~1,733 MB VA due to 64 KB allocation granularity
  • Native heaps: 8 heaps, ~48 MB committed — not the cause
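
The gap between committed size and VA cost for the mapped regions comes from rounding each view up to the 64 KB allocation granularity. A minimal Python sketch of that estimate follows. It assumes every region starts on its own 64 KB boundary; the function name and the uniform 48 KB region size are illustrative and are not taken from the dumps.

```python
ALLOCATION_GRANULARITY = 64 * 1024  # Windows hands out address space in 64 KB units
MB = 1024 * 1024


def va_cost(region_sizes, granularity=ALLOCATION_GRANULARITY):
    """Sum region sizes after rounding each one up to the allocation granularity."""
    return sum(-(-size // granularity) * granularity for size in region_sizes)


# Hypothetical input: 27,730 mapped regions of 48 KB each (the dump reports
# "mostly 16-64 KB"; the real size distribution is not reproduced here).
regions = [48 * 1024] * 27_730
committed_mb = sum(regions) / MB
va_mb = va_cost(regions) / MB
print(f"committed ~ {committed_mb:.0f} MB, VA cost ~ {va_mb:.0f} MB")
```

Any set of 27,730 regions that are each at most 64 KB costs 27,730 × 64 KB ≈ 1,733 MB of address space under this model, which is where the estimate above comes from.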

Root Cause: Roslyn Shared File Sections

Handle Analysis

Parsing the HandleDataStream from the minidump revealed:

| Metric | Dump 1 | Dump 2 |
|---|---|---|
| Total section handles | 1,055 | 1,008 |
| Unique 8 MB section IDs | 525 | 296 |
| Unique small (425 KB) section IDs | 199 | |
| Avg handles per 8 MB file | ~1.6 | |
| Max VA from 8 MB sections | 4,200 MB | 2,368 MB |
| Unnamed section handles | 145 | |

Handle name pattern:

\Sessions\1\BaseNamedObjects\Roslyn Shared File: Size=8388608 Id=<guid>
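
Counting unique section IDs from handle names like the one above can be sketched in Python as follows. This is an illustration rather than the script used for the analysis: the function name and the sample handle names are made up, and the 32-hex-digit ID pattern is an assumption based on the `Guid.NewGuid():N` format used by CreateUniqueName.

```python
import re
from collections import defaultdict

# Assumes the Id is a GUID in "N" format: 32 hex digits, no hyphens.
HANDLE_NAME = re.compile(r"Roslyn Shared File: Size=(\d+) Id=([0-9a-fA-F]{32})")
MB = 1024 * 1024


def summarize_sections(handle_names):
    """Group Roslyn section handles by size and count distinct section IDs."""
    ids_by_size = defaultdict(set)
    for name in handle_names:
        match = HANDLE_NAME.search(name)
        if match:
            ids_by_size[int(match.group(1))].add(match.group(2).lower())
    return {size: len(ids) for size, ids in ids_by_size.items()}


# Hypothetical handle names; the IDs are invented for illustration.
prefix = r"\Sessions\1\BaseNamedObjects\Roslyn Shared File: "
sample = [
    prefix + "Size=8388608 Id=" + "a" * 32,
    prefix + "Size=8388608 Id=" + "a" * 32,  # second handle to the same section
    prefix + "Size=8388608 Id=" + "b" * 32,
    r"\Sessions\1\BaseNamedObjects\SomethingElse",
]
counts = summarize_sections(sample)
max_va_mb = counts[8388608] * 8388608 // MB
print(counts, max_va_mb)
```

Applied to the Dump 1 figures, 525 unique IDs × 8 MB gives the 4,200 MB maximum shown in the table.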

Mapped Region Content

Scanning the first bytes of all 27,725 mapped R regions:

| Content Type | Regions | Size |
|---|---|---|
| BSJB (.NET metadata blobs) | 1,231 | 157.4 MB |
| MZ (PE files) | 11 | 2.6 MB |
| Metadata streams (#Strings, #US, #Blob, etc.) | 26,483 | 1,182.5 MB |

The vast majority are .NET assembly metadata sections — individual streams from the CLI metadata tables mapped as separate views.
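
A signature check of this kind can be sketched as below. Only the two magic values named in the table are handled: `BSJB` marks a CLI metadata root and `MZ` marks a PE image. How the analysis recognized individual metadata streams is not described here, so this sketch leaves everything else as "other"; the function name and labels are hypothetical.

```python
def classify_region(first_bytes):
    """Classify a mapped region by its leading signature bytes."""
    if first_bytes[:4] == b"BSJB":
        return "cli-metadata-root"
    if first_bytes[:2] == b"MZ":
        return "pe-image"
    return "other"  # includes regions that need further inspection


# Made-up leading bytes for three regions.
labels = [
    classify_region(b"BSJB\x01\x00\x01\x00"),
    classify_region(b"MZ\x90\x00"),
    classify_region(b"\x00\x00\x00\x00"),
]
print(labels)
```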

Source Code Analysis

All paths lead to TemporaryStorageService in Microsoft.CodeAnalysis.Workspaces. This service stores source text and binary data in named Windows memory-mapped files so that data can be dropped from the GC heap and recovered on demand — or shared with the OOP (out-of-process) Roslyn service via the named MMF handle.

Name Creation

TemporaryStorageService.cs line 241:

public static string? CreateUniqueName(long size)
{
    return PlatformInformation.IsWindows || PlatformInformation.IsRunningOnMono
        ? $"Roslyn Shared File: Size={size} Id={Guid.NewGuid():N}"
        : null;
}

Every call gets a fresh GUID — there is no reuse of names.

Allocation Strategy

CreateTemporaryStorage() (line 203) uses a two-tier scheme:

  • Items < 256 KB (SingleFileThreshold) → bump-pointer allocated into shared 8 MB MMFs (MultiFileBlockSize = 256 KB × 32)
  • Items ≥ 256 KB → dedicated MMF per item (each gets its own Roslyn Shared File section)

The service tracks only one shared block at a time: _fileReference points to the current 8 MB MMF being filled, with _name, _fileSize, and _offset describing it. Once the current file can't fit the next allocation (_offset + size > _fileSize), a brand-new 8 MB MMF is created:

// Simplified from CreateTemporaryStorage()
lock (_gate)
{
    if (_fileReference == null || _offset + size > _fileSize)
    {
        var mapName = CreateUniqueName(MultiFileBlockSize);
        _fileReference = MemoryMappedFile.CreateNew(mapName, MultiFileBlockSize);
        _name = mapName;
        _fileSize = MultiFileBlockSize;
        _offset = size;
        return new MemoryMappedInfo(_fileReference, _name, offset: 0, size: size);
    }
    // ... bump pointer in existing file
}

The old _fileReference value is simply overwritten. The previous MemoryMappedFile object is not disposed — it stays alive because the MemoryMappedInfo objects returned to callers hold a direct reference to it. There is no tracking collection, no cap, no pool, and no eviction.
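
To see how quickly blocks pile up under this scheme, here is a toy Python model of the two-tier allocation. It is not Roslyn's code: the class and attribute names are hypothetical, it only counts sections without mapping anything, and it ignores any alignment padding the real allocator may apply.

```python
SINGLE_FILE_THRESHOLD = 256 * 1024                  # items at or above this get their own section
MULTI_FILE_BLOCK_SIZE = SINGLE_FILE_THRESHOLD * 32  # 8 MB shared block


class BlockAllocatorModel:
    """Toy model of the two-tier scheme; it counts sections and maps nothing."""

    def __init__(self):
        self.shared_blocks = 0       # 8 MB sections created so far
        self.dedicated_sections = 0  # one per large item
        self._offset = 0
        self._file_size = 0          # 0 means "no current block yet"

    def allocate(self, size):
        if size >= SINGLE_FILE_THRESHOLD:
            self.dedicated_sections += 1
            return
        if self._file_size == 0 or self._offset + size > self._file_size:
            # The current block cannot fit the item, so a new one is started.
            # Nothing here releases the previous block, as described above.
            self.shared_blocks += 1
            self._file_size = MULTI_FILE_BLOCK_SIZE
            self._offset = size
        else:
            self._offset += size


model = BlockAllocatorModel()
for _ in range(100):
    model.allocate(100 * 1024)  # 100 small items of 100 KB each
model.allocate(300 * 1024)      # one large item
print(model.shared_blocks, model.dedicated_sections)
```

In this model the counters only ever increase, which reflects the absence of a cap, pool, or eviction policy in the service.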

Write Sites (Who Creates These MMFs)

There are four write sites; the first three are relevant to the test host scenario:

1. RecoverableTextAndVersion (RecoverableTextAndVersion.cs:195)

Every document's source text gets saved to an MMF so the text can be dropped from the managed heap. The text is stored as UTF-16 (2 bytes/char), so a typical source file of 10–50 KB becomes 20–100 KB in the MMF. With hundreds of source files per workspace snapshot, this is likely the dominant contributor to MMF count — many small items packing into 8 MB blocks.

2. SerializerService (SerializerService_Reference.cs:453)

When metadata references (assembly PE data) are deserialized from the OOP service, each module's metadata is written into an MMF. A test host referencing dozens of assemblies will create one MMF entry per module. Since PE metadata for a single assembly can be hundreds of KB to several MB, many of these exceed the 256 KB threshold and get their own dedicated MMF section.

3. SkeletonReferenceCache (SolutionCompilationState.SkeletonReferenceCache.cs:258)

Cross-language project references (e.g., a VB project referencing a C# project) produce skeleton assemblies via metadata-only emit. Each skeleton is dumped to an MMF. This is less likely to be the dominant factor in a single-language test host, but contributes in mixed-language solutions.

4. VisualStudioMetadataReferenceManager (VisualStudioMetadataReferenceManager.cs:243): VS-only, not applicable to this test host

Why 525 Sections Accumulate in the Test Host

The core issue is that TemporaryStorageService has no cap, no pool, and no eviction:

  1. Single-slot tracker: _fileReference only points to the latest 8 MB file. Once it fills up, a new one is created and the old reference is simply overwritten, neither disposed nor tracked.

  2. Lifetime tied to callers: Each MemoryMappedInfo returned from CreateTemporaryStorage holds a direct reference to its MemoryMappedFile. The MMF stays alive (and its VA reservation persists) as long as any TemporaryStorageTextHandle or TemporaryStorageStreamHandle exists. These handles are held by document states and metadata reference caches that live for the duration of their workspace snapshot.

  3. Weak refs don't help: MemoryMappedInfo._weakReadAccessor allows the view to be unmapped when no one is reading. But the underlying MemoryMappedFile object (and its kernel section, which reserves VA space) persists independently of whether any view is currently mapped.

  4. Test host = worst case: A Roslyn test host running xUnit tests typically:

    • Creates many workspace instances across different tests
    • Each workspace loads hundreds of source files → RecoverableTextAndVersion fills MMFs
    • Each workspace resolves dozens of metadata references → SerializerService fills more MMFs
    • Tests run for 20+ minutes in a single process, accumulating without bound
    • Result: 525 × 8 MB = 4,200 MB of VA space — exceeds the x86 4 GB limit
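
The budget behind that result can be checked with a few lines of Python. This is a back-of-the-envelope sketch that assumes every section is mapped in full; the function name is hypothetical, and the non-MMF figure is the sum of the private RW, image, other, and reserved rows from the Dump 1 breakdown.

```python
VA_LIMIT_MB = 4096  # 4 GB for a LARGEADDRESSAWARE x86 process on x64 Windows
SECTION_MB = 8      # MultiFileBlockSize


def sections_that_fit(other_va_mb, limit_mb=VA_LIMIT_MB, section_mb=SECTION_MB):
    """Number of fully mapped sections that fit next to other VA usage."""
    return max(0, (limit_mb - other_va_mb) // section_mb)


non_mmf_mb = 1363 + 346 + 23 + 135       # Dump 1 rows other than "Mapped R"
fits_empty = sections_that_fit(0)         # in an otherwise empty address space
fits_dump1 = sections_that_fit(non_mmf_mb)
needed_mb = 525 * SECTION_MB              # VA to map all 525 sections in full
print(fits_empty, fits_dump1, needed_mb)
```

Even an otherwise empty 4 GB address space holds only 512 fully mapped 8 MB sections, so 525 cannot all be mapped at once regardless of what else the process is doing.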

Recommendations

Short Term

  1. Move to x64 — eliminates the 4 GB VA ceiling entirely
  2. Split test execution — run fewer tests per test host process to limit accumulation
  3. Reduce metadata references — fewer referenced assemblies = fewer MMF entries

Longer Term

  1. Add MMF count/size cap to TemporaryStorageService with LRU eviction
  2. Dispose idle handles — implement a mechanism to release MemoryMappedFile objects when their associated workspace snapshots are no longer reachable
  3. Consider alternatives for x86 — use file-backed storage instead of named MMFs, or skip MMF-based recovery entirely in test hosts

Debugging Notes

These dumps were captured by a 64-bit tool from a WoW64 process, creating an architecture mismatch that prevented all standard managed debugging tools from loading the x86 DAC:

  • dotnet-dump: BadImageFormatException (can't load 32-bit mscordacwks.dll)
  • x86 dotnet-dump: "SOS does not support the current target architecture 'x64' (0x8664)"
  • PerfView HeapDump: Same BadImageFormatException
  • WinDbg .loadby sos clr: "not a valid Win32 application"

Workarounds used:

  • Native debugging via copied WinDbg amd64 binaries with .effmach x86
  • Direct CLR symbol reading (.reload /f clr.dll under x86 effective machine)
  • Python minidump parsing for MemoryInfoListStream and HandleDataStream
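
For reference, locating those two streams in a minidump can be sketched with the standard `struct` module. This is a minimal illustration and not the script used for this analysis: it reads only the header and stream directory (layouts and stream type values as documented for MINIDUMP_HEADER, MINIDUMP_DIRECTORY, and MINIDUMP_STREAM_TYPE), and it runs against a synthetic buffer rather than a real dump.

```python
import struct

HEADER = struct.Struct("<4sIIIIIQ")  # MINIDUMP_HEADER, 32 bytes
DIRECTORY = struct.Struct("<III")    # MINIDUMP_DIRECTORY: type, data size, RVA
HANDLE_DATA_STREAM = 12
MEMORY_INFO_LIST_STREAM = 16


def list_streams(data):
    """Return {stream_type: (rva, size)} from a minidump's stream directory."""
    signature, _version, count, dir_rva, _checksum, _timestamp, _flags = (
        HEADER.unpack_from(data, 0)
    )
    if signature != b"MDMP":
        raise ValueError("not a minidump")
    streams = {}
    for i in range(count):
        stream_type, size, rva = DIRECTORY.unpack_from(data, dir_rva + i * DIRECTORY.size)
        streams[stream_type] = (rva, size)
    return streams


# Synthetic two-stream header for illustration; a real dump would be read
# from disk (ideally via mmap, since these dumps are several GB).
fake = HEADER.pack(b"MDMP", 0xA793, 2, HEADER.size, 0, 0, 0)
fake += DIRECTORY.pack(HANDLE_DATA_STREAM, 100, 56)
fake += DIRECTORY.pack(MEMORY_INFO_LIST_STREAM, 200, 156)
streams = list_streams(fake)
print(streams)
```

Each returned RVA is a file offset to the stream's own header, which then has to be decoded according to its stream type.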

For future x86 dumps: Use procdump -ma -32 to capture a proper 32-bit dump.
