Levi Broderick GrabYourPitchforks

@GrabYourPitchforks
GrabYourPitchforks / memory_docs_samples.md
Last active December 13, 2024 10:23
Memory<T> API documentation and samples

This document describes the APIs of Memory<T>, IMemoryOwner<T>, and MemoryManager<T> and their relationships to each other.

See also the Memory<T> usage guidelines document for background information.

First, a brief summary of the basic types

  • Memory<T> is the basic type that represents a contiguous buffer. This type is a struct, which means that developers cannot subclass it and override the implementation. The basic implementation of the type is aware of contiguous memory buffers backed by T[] and System.String (in the case of ReadOnlyMemory<char>).
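
As a rough illustration of the two backing stores named above, the following sketch wraps a T[] in Memory<T> and a System.String in ReadOnlyMemory<char>. The class and method names (MemoryBackingDemo, WriteThroughSlice, TailOfString) are made up for this example; only AsMemory, Span, and ToString are real framework APIs.

```csharp
using System;

static class MemoryBackingDemo
{
    // Wraps part of an array in Memory<int>, writes through the slice,
    // and returns the array element that the write landed on.
    public static int WriteThroughSlice()
    {
        int[] array = { 1, 2, 3, 4, 5 };
        Memory<int> slice = array.AsMemory(1, 3); // elements 1 through 3
        slice.Span[0] = 42;                       // same storage as array[1]
        return array[1];
    }

    // Views the tail of a string as ReadOnlyMemory<char> without copying,
    // then materializes just that range as a new string.
    public static string TailOfString()
    {
        ReadOnlyMemory<char> tail = "hello world".AsMemory(6);
        return tail.ToString();
    }

    static void Main()
    {
        Console.WriteLine(WriteThroughSlice()); // 42
        Console.WriteLine(TailOfString());      // world
    }
}
```

The write through slice.Span is visible in the original array because Memory<T> refers to the array's storage rather than copying it.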
@GrabYourPitchforks
GrabYourPitchforks / validating_pool.cs
Created April 2, 2018 20:21
Validating MemoryPool<T>
/*
* !! WARNING !!
*
* COMPLETELY UNTESTED CODE
*/
using Microsoft.Win32.SafeHandles;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Runtime.ConstrainedExecution;
@GrabYourPitchforks
GrabYourPitchforks / string_comp.md
Last active August 15, 2018 01:01
String performance optimizations

This tests the performance of MemoryExtensions.ToUpperInvariant(this ReadOnlySpan<char>, Span<char>), String.GetHashCode(), and String.GetHashCode(StringComparison.OrdinalIgnoreCase).

In the table below:

  • baseline coreclr = 3.0.0-preview1-26808-05
  • local build (6) = local build from private dev Utf8String branch, 6th rev.
  • local build (7) = local build from private dev Utf8String branch, 7th rev.
| Method | Toolchain | StringLength | Mean | Error | StdDev | Scaled | ScaledSD |
|---|---|---|---|---|---|---|---|
| ToUpperInvariant | baseline coreclr | 0 | 27.112 ns | 0.7416 ns | 1.1763 ns | 1.00 | 0.00 |
@GrabYourPitchforks
GrabYourPitchforks / utf8char_ecosystem.md
Created December 13, 2018 02:31
Utf8Char and the .NET ecosystem

Motivations and driving principles behind the Utf8Char proposal

Utf8Char is analogous to Char: they represent a single UTF-8 code unit and a single UTF-16 code unit, respectively. They are distinct from the integral types Byte and UInt16 in that sequences of the UTF-* code unit types are meant to represent textual data, while sequences of the integral types are meant to represent binary data.
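
To make the notion of a code unit concrete, here is a small sketch that counts the UTF-16 and UTF-8 code units of the same text. It uses only shipped APIs (String.Length and Encoding.UTF8); Utf8Char itself is a proposal and does not appear here, and the class and method names are made up for the example.

```csharp
using System;
using System.Text;

static class CodeUnitDemo
{
    // Each Char in a String is one UTF-16 code unit.
    public static int Utf16CodeUnits(string text) => text.Length;

    // Each byte of the UTF-8 encoding is one UTF-8 code unit.
    public static int Utf8CodeUnits(string text) => Encoding.UTF8.GetByteCount(text);

    static void Main()
    {
        // U+00E9 (e with acute accent) followed by U+20AC (euro sign).
        string text = "\u00E9\u20AC";

        Console.WriteLine(Utf16CodeUnits(text)); // 2
        Console.WriteLine(Utf8CodeUnits(text));  // 5 (2 bytes + 3 bytes)
    }
}
```

The same two characters occupy a different number of code units in each encoding, which is why a sequence of code units only makes sense as text when its encoding is known.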

Drawing this distinction is important. With UTF-16 data (String, Char[]), this distinction historically hasn't been a source of confusion. Developers are generally cognizant of the fact that aside from RPC, most i/o involves some kind of transcoding mechanism. Binary data doesn't come in from disk or the network in a format that can be trivially projected as a textual string; it must go through validation, recombining, and substitution. Similarly, when writing a string to disk or the network, a trivial projection is again impossible. The transcoding step must run in reverse to get the text data int

// In a loop, try reading a natural word at a time.
const int CharsPerNuint = sizeof(nuint) / sizeof(char);
for (; inputLength >= CharsPerNuint; pInputBuffer += CharsPerNuint, inputLength -= CharsPerNuint)
{
    nuint utf16Data = Unsafe.ReadUnaligned<nuint>(pInputBuffer);
    utf16Data &= unchecked((nuint)0xFF80_FF80_FF80_FF80ul);
    if (utf16Data == 0)
    {
@GrabYourPitchforks
GrabYourPitchforks / utf8_ldm_design.md
Last active September 14, 2019 17:38
UTF8 design for LDM

Utf8String design overview

Audience and scenarios

Utf8String and related concepts are meant for modern internet-facing applications that need to speak "the language of the web" (or i/o in general, really). Currently applications spend some amount of time transcoding into formats that aren't particularly useful, which wastes CPU cycles and memory.

A naive way to accomplish this would be to represent UTF-8 data as byte[] / Span<byte>, but this leads to a usability pit of failure. Developers would then become dependent on situational awareness and code hygiene to be able to know whether a particular byte[] instance is meant to represent binary data or UTF-8 textual data, leading to situations where it's very easy to write code like byte[] imageData = ...; imageData.ToUpperInvariant();. This defeats the purpose of using a typed language.
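
The difference can be sketched with a toy wrapper type. Utf8Text below is hypothetical and is not the proposed Utf8String API; it only shows how a distinct type lets the compiler separate textual data from binary data.

```csharp
using System;
using System.Text;

// Hypothetical wrapper used only for illustration.
readonly struct Utf8Text
{
    private readonly byte[] _bytes;

    public Utf8Text(string value) => _bytes = Encoding.UTF8.GetBytes(value);

    public int ByteLength => _bytes?.Length ?? 0;

    // Transcodes back to UTF-16; a default(Utf8Text) yields an empty string.
    public override string ToString() =>
        _bytes is null ? string.Empty : Encoding.UTF8.GetString(_bytes);

    // Text operations live on the text type, not on byte[].
    // This toy version round-trips through String for simplicity.
    public Utf8Text ToUpperInvariant() => new Utf8Text(ToString().ToUpperInvariant());
}

static class TypedTextDemo
{
    static void Main()
    {
        byte[] imageData = { 0x89, 0x50, 0x4E, 0x47 }; // binary data
        Utf8Text greeting = new Utf8Text("hello");     // textual data

        Console.WriteLine(greeting.ToUpperInvariant()); // HELLO
        Console.WriteLine(imageData.Length);            // 4

        // imageData.ToUpperInvariant(); // does not compile: byte[] has no such method
    }
}
```

Because the text operation exists only on the wrapper, the mistake of upper-casing image bytes is rejected at compile time instead of relying on code hygiene.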

We want to expose enough functionality to make the Utf8String type usable and desirable by our developer audience, but it's not intended to serve as a

@GrabYourPitchforks
GrabYourPitchforks / mwi_module_initializer.csproj
Created January 24, 2020 20:08
Module initializer sample
<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <!-- ... -->
  <Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" />
  <!-- This task adds a module initializer to {IL}.txt. -->
  <UsingTask TaskName="InjectModuleInitializer" TaskFactory="CodeTaskFactory" AssemblyFile="$(MSBuildToolsPath)\Microsoft.Build.Tasks.v4.0.dll">
    <ParameterGroup>
      <Path ParameterType="System.String" Required="true" />
      <InitializerMethod ParameterType="System.String" Required="true" />
    </ParameterGroup>
@GrabYourPitchforks
GrabYourPitchforks / memorymarshal_cast_utf8.cs
Last active December 10, 2022 15:14
MemoryMarshal.Cast challenge
using System;
using System.Runtime.InteropServices;
using System.Text;
class Program
{
    static void Main(string[] args)
    {
        {
            // the text below is meaningless
@GrabYourPitchforks
GrabYourPitchforks / binder_sample.cs
Created August 14, 2020 01:01
BinaryFormatter binder sample
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;
class Program
{
    static void Main(string[] args)
    {
        Stream inputStream = GetInputStream();