@certik
Created March 20, 2026 15:31
Switching to Byte-Based Strides in LFortran's Array Descriptor

Background

LFortran's internal array descriptor is designed to be compatible with CFI_cdesc_t (the ISO_Fortran_binding descriptor), but it currently stores strides in element units rather than byte units. This mismatch requires conversion at every bind(c) boundary.

This document summarizes why switching to byte-based strides is the right move.

How Other Compilers Handle Strides

| Compiler | Internal stride | Consequence |
|----------|-----------------|-------------|
| Flang | Byte-based | Descriptor is CFI_cdesc_t; zero conversion at bind(c) boundaries |
| GFortran | Element-based | Requires conversion; developers have called it a "kludge" (GCC Bug 37577) but can't change due to ABI stability |
| LFortran | Element-based | Requires copy + convert at every bind(c) call; not constrained by ABI stability |

Flang made the pragmatic choice of using byte-based strides everywhere, so its internal descriptor is a superset of CFI_cdesc_t with no conversion layer. GFortran carries the legacy of element-based strides and can't change without breaking ABI. LFortran has no such constraint.

Current Cost of Element-Based Strides

The stride mismatch forces substantial conversion machinery in asr_to_llvm.cpp:

  1. On entry to a bind(c) function (~line 7140): incoming CFI byte strides are divided by element size to produce internal element strides. For allocatable intent(in), a full descriptor copy is made first.

  2. On exit from a bind(c) function (~line 7262): element strides are multiplied back to byte strides via convert_bindc_strides_to_byte(), tracked through BindCStrideExit bookkeeping structs.

  3. When calling a bind(c) function (~line 18956): the descriptor is copied to a temporary, strides are multiplied by element size, and CFI metadata fields are explicitly set via set_cfi_descriptor_fields().

  4. Special cases construct entirely new rank-0 CFI descriptors for scalars, character allocatables, and scalar allocatable/pointer arguments (~lines 18820–19177).

  5. Post-call fixup (~line 20654) reads back modified base_addr from CFI descriptors for allocatable arguments.

This is real runtime overhead — loops over dimensions, memory copies, and divide/multiply operations — at every interop boundary.
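
To make the per-call work concrete, here is a minimal sketch of the two stride conversions described in the list above. The `Dim` struct, its field names, and both function names are hypothetical and chosen for illustration; they are not LFortran's actual descriptor layout or the code in asr_to_llvm.cpp. The sketch also assumes every byte stride is a whole multiple of the element size.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-dimension record (illustrative names, not LFortran's layout).
struct Dim {
    std::int64_t lower_bound;
    std::int64_t extent;
    std::int64_t stride;  // element units or byte units, depending on context
};

// Element strides -> byte strides: the kind of loop a caller runs on a
// descriptor copy before invoking a bind(c) procedure.
std::vector<Dim> to_byte_strides(std::vector<Dim> dims, std::int64_t elem_size) {
    assert(elem_size > 0);
    for (Dim &d : dims) {
        d.stride *= elem_size;
    }
    return dims;  // dims was taken by value, so the caller's descriptor is untouched
}

// Byte strides -> element strides: the kind of loop a bind(c) callee runs on entry.
std::vector<Dim> to_element_strides(std::vector<Dim> dims, std::int64_t elem_size) {
    assert(elem_size > 0);
    for (Dim &d : dims) {
        // Assumption of this sketch: strides divide evenly by the element size.
        assert(d.stride % elem_size == 0);
        d.stride /= elem_size;
    }
    return dims;
}
```

For a 3×4 array of 8-byte elements in column-major order, the element strides {1, 3} become byte strides {8, 24} and convert back again. With byte strides stored internally, neither loop nor the copy would be needed.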

No Performance Cost for Normal Operations

Indexing (address calculation)

The arithmetic work is essentially the same, just distributed differently:

  • Element-based: addr = base + (Σ (i_k - lb_k) × stride_k) × elem_size — N multiplies by stride, plus one final multiply by elem_size.
  • Byte-based: addr = base + Σ (i_k - lb_k) × stride_k — N multiplies by stride (elem_size is already baked into each stride).

Byte-based indexing never needs more multiplications than element-based indexing: N versus N + 1 in the general case. For contiguous arrays where stride₁ = 1, element-based lets you skip one multiply, but byte-based has stride₁ = elem_size, which is a compile-time constant that LLVM folds anyway.
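
The two formulas can be sketched side by side as follows. The `Dim` struct and both function names are hypothetical illustrations rather than LFortran code; each function returns the byte offset that would be added to the base address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-dimension record (illustrative names only).
struct Dim {
    std::int64_t lower_bound;
    std::int64_t extent;
    std::int64_t stride;
};

// Element-based strides: accumulate an element offset, then scale once by elem_size.
std::int64_t offset_element_based(const Dim *dims, const std::int64_t *idx,
                                  int rank, std::int64_t elem_size) {
    assert(rank >= 0);
    std::int64_t elem_offset = 0;
    for (int k = 0; k < rank; ++k) {
        elem_offset += (idx[k] - dims[k].lower_bound) * dims[k].stride;
    }
    return elem_offset * elem_size;  // the one extra multiply
}

// Byte-based strides: elem_size is already folded into every stride.
std::int64_t offset_byte_based(const Dim *dims, const std::int64_t *idx, int rank) {
    assert(rank >= 0);
    std::int64_t byte_offset = 0;
    for (int k = 0; k < rank; ++k) {
        byte_offset += (idx[k] - dims[k].lower_bound) * dims[k].stride;
    }
    return byte_offset;
}
```

For a 3×4 column-major array of 8-byte elements with lower bounds of 1, element (2, 3) gives (1×1 + 2×3) × 8 = 56 with element strides {1, 3}, and 1×8 + 2×24 = 56 with byte strides {8, 24}.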

ubound, lbound, size, extent

These read lower_bound and extent directly from the descriptor. Strides are not involved. No change.
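
As a rough illustration, these inquiries can be written without ever reading the stride field. The `Dim` struct and function names below are hypothetical, and the sketch ignores the special cases Fortran defines for some of these intrinsics.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-dimension record (illustrative names only).
struct Dim {
    std::int64_t lower_bound;
    std::int64_t extent;
    std::int64_t stride;  // never read by the functions below
};

std::int64_t lbound_of(const Dim &d) { return d.lower_bound; }

std::int64_t ubound_of(const Dim &d) { return d.lower_bound + d.extent - 1; }

// Total number of elements: the product of the extents.
std::int64_t size_of_array(const Dim *dims, int rank) {
    assert(rank >= 0);
    std::int64_t n = 1;
    for (int k = 0; k < rank; ++k) {
        n *= dims[k].extent;
    }
    return n;
}
```

Because `stride` is never read, the results are the same whether the descriptor holds element strides or byte strides.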

LLVM IR

With opaque pointers (modern LLVM), byte-offset arithmetic (ptr_as_i8 + byte_offset) optimizes identically to element-indexed GEP. For contiguous arrays, LFortran's "array to data" pass already eliminates descriptors entirely, so the stride representation is irrelevant in the hot path.
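
A C++ analogue of the two addressing styles, offered only as an illustration: this is not LLVM IR and not what LFortran emits, and the function names are hypothetical. It shows that an element-indexed load and a byte-offset load reach the same storage; whether a given compiler produces identical machine code for both is not demonstrated here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Element-indexed access: the compiler scales the index by sizeof(double).
double load_element_indexed(const double *base, std::ptrdiff_t elem_offset) {
    assert(base != nullptr);
    return base[elem_offset];
}

// Byte-offset access: the offset is applied to a byte pointer, then the value is read.
double load_byte_offset(const double *base, std::ptrdiff_t byte_offset) {
    assert(base != nullptr);
    const unsigned char *p =
        reinterpret_cast<const unsigned char *>(base) + byte_offset;
    double value;
    std::memcpy(&value, p, sizeof value);  // well-defined read from raw bytes
    return value;
}
```

Reading element 4 of an array of doubles works either as `load_element_indexed(a, 4)` or as `load_byte_offset(a, 4 * sizeof(double))`.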

Summary

  • Net effect: eliminates conversion overhead at bind(c) boundaries with no cost to normal array operations.
  • Flang precedent: already uses byte-based strides successfully.
  • No ABI constraint: unlike GFortran, LFortran can make this change freely.
  • Simplifies codegen: removes ~500 lines of stride conversion, descriptor copying, and bookkeeping in asr_to_llvm.cpp.