neon-sunset

So ARM64 has FJCVTZS for JS but nothing to count UTF-8 code point length :(
namespace System.Linq;
public static class ParallelExtensions
{
public static ParallelQuery<TResult> BatchSelect<T, TResult>(
this T[] source,
Func<Memory<T>, TResult> selector,
int degreeOfParallelism = 0)
{
return source.AsMemory().BatchSelect(selector, degreeOfParallelism);
@neon-sunset
neon-sunset / BSeachVectorized.cs
Last active August 10, 2023 17:07
SIMD B-Search
// This seems to be actually broken huh
private static bool BSearchContainsCore(Span<int> haystack, int target)
{
// We could do better than to use divrem but stupid leetcode does not care for base latency
// so we won't bother with it either, which is wrong but no one will thank you
// for doing things properly anyway.
var rem = haystack.Length % Vector<int>.Count;
var vectors = MemoryMarshal.Cast<int, Vector<int>>(haystack[..^rem]);
var scalars = haystack[^rem..];
var needle = new Vector<int>(target);
using System.Runtime.CompilerServices;
var repro = new Example(new object(), 1234, 5678);
PrintAsMinOpts(repro);
PrintAs(repro);
PrintBitcast(repro);
[MethodImpl(MethodImplOptions.AggressiveOptimization)]
static void PrintAsMinOpts(Example example)
{
@neon-sunset
neon-sunset / TwitchLibComparison.md
Created July 4, 2023 10:29
TwitchLib.Client message handling performance of master vs dev + fixes
BenchmarkDotNet=v0.13.5, OS=macOS 14.0 (23A5276g) [Darwin 23.0.0]
Apple M1 Pro, 1 CPU, 8 logical and 8 physical cores
.NET SDK=8.0.100-preview.7.23327.3
  [Host]        : .NET 8.0.0 (8.0.23.32502), Arm64 RyuJIT AdvSIMD
  DefaultJob    : .NET 8.0.0 (8.0.23.32502), Arm64 RyuJIT AdvSIMD
  NativeAOT 8.0 : .NET 8.0.0-preview.7.23325.2, Arm64 NativeAOT AdvSIMD

Underlying IClient.ReceiveMessage -> TwitchClient message handler -> Registered event handler invocation (if any)

@neon-sunset
neon-sunset / trace
Created June 17, 2023 12:07
Quick .NET trace
dotnet trace collect --format speedscope -- ./{path-to-executable}
@neon-sunset
neon-sunset / ARM64.cs
Last active September 22, 2023 16:05
ARM64 snippets for C#
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics.Arm;
namespace System.Runtime.Instrinsics;
public static class VectorExtensions
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal static int CountMatches<T>(this Vector256<T> mask)
{
using Microsoft.AspNetCore.StaticFiles.Infrastructure;
using Microsoft.Extensions.FileProviders;
var builder = WebApplication.CreateBuilder(args);
// builder.Logging.SetMinimumLevel(LogLevel.Warning);
builder.Services
.AddAntiforgery()
.AddResponseCompression();
@neon-sunset
neon-sunset / SplitExtensions.cs
Last active June 20, 2023 18:14
SplitFirst and SplitLast convenience methods to handle the most common pair slicing pattern. Use .AsSpan() or .AsMemory() for an allocation-free version.
// Proposal: https://github.com/dotnet/runtime/issues/75317
// This code is licensed under MIT license
// (c) 2022 neon-sunset
using System.Runtime.CompilerServices;
namespace System;
public static class SplitExtensions
{
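The preview above is cut off before any method body, so the following is only a minimal sketch of the pair slicing pattern the description mentions, not the gist's actual implementation. The class name `SplitSketch`, the string-based signatures, and the behavior when the separator is missing (whole input as the head, empty tail) are all assumptions made for illustration.

```csharp
using System;

// Hypothetical illustration; names and semantics are assumptions, not the gist's API.
public static class SplitSketch
{
    // Splits at the first occurrence of the separator.
    // Assumed behavior: if the separator is absent, the head is the whole input and the tail is empty.
    public static (string Head, string Tail) SplitFirst(string source, char separator)
    {
        int index = source.IndexOf(separator);
        return index < 0
            ? (source, string.Empty)
            : (source[..index], source[(index + 1)..]);
    }

    // Splits at the last occurrence of the separator, with the same assumed fallback.
    public static (string Head, string Tail) SplitLast(string source, char separator)
    {
        int index = source.LastIndexOf(separator);
        return index < 0
            ? (source, string.Empty)
            : (source[..index], source[(index + 1)..]);
    }
}
```

With these assumptions, `SplitSketch.SplitFirst("key=a=b", '=')` yields the pair `("key", "a=b")`, while `SplitSketch.SplitLast("key=a=b", '=')` yields `("key=a", "b")`. A span-based variant would avoid the two substring allocations, which is presumably what the description's .AsSpan() hint refers to.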
@neon-sunset
neon-sunset / Buffer.cs
Last active February 13, 2023 19:26
A small helper to wrap a common pattern of stackalloc with fallback to array pool into a single data structure.
// This code is licensed under MIT license
// (c) 2022-2023 neon-sunset
using System.Buffers;
using System.Diagnostics.CodeAnalysis;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Text;
[StructLayout(LayoutKind.Auto)]
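The preview stops before the data structure itself, so here is a rough sketch of the underlying pattern the description names: use `stackalloc` for small sizes and fall back to `ArrayPool<T>` for larger ones. It is written as a plain method, not as the gist's wrapper type. The class name `PooledBufferSketch`, the method `ToUpper`, and the 256-element threshold are assumptions chosen for the example.

```csharp
using System;
using System.Buffers;

// Hypothetical illustration of stackalloc with an ArrayPool fallback.
public static class PooledBufferSketch
{
    // Assumed threshold; the right value depends on element size and call depth.
    private const int StackLimit = 256;

    public static string ToUpper(ReadOnlySpan<char> input)
    {
        char[]? rented = null;

        // Small inputs use the stack; larger ones rent from the shared pool.
        Span<char> buffer = input.Length <= StackLimit
            ? stackalloc char[StackLimit]
            : (rented = ArrayPool<char>.Shared.Rent(input.Length));

        try
        {
            // Rent may return a larger array than requested, so slice to the input length.
            Span<char> destination = buffer[..input.Length];
            input.ToUpperInvariant(destination);
            return new string(destination);
        }
        finally
        {
            // Only pooled arrays need to be returned; stack memory is released automatically.
            if (rented is not null)
            {
                ArrayPool<char>.Shared.Return(rented);
            }
        }
    }
}
```

Wrapping the rented array and the try/finally into a single disposable struct, as the gist description suggests, would remove this boilerplate from every call site.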
#pragma warning disable IL2026
using FastCache;
using Markdig;
using Microsoft.AspNetCore.Mvc;
using NonBlocking;
var builder = WebApplication.CreateBuilder();
builder.Services.AddHttpClient();
var app = builder.Build();