@certik
Last active March 20, 2026 20:15
Design: Explicit Self Argument in Call Args

Problem Statement

In the current LFortran ASR, type-bound procedure calls with PASS semantics store the self/pass argument separately from the explicit call arguments:

SubroutineCall(name, original_name, args, dt, strict_bounds_checking)
FunctionCall(name, original_name, args, type, value, dt)

The dt field holds the implicit self argument (the struct instance), while args holds only the explicitly-written arguments. The actual function, however, has the self parameter as its first formal parameter. This creates a persistent index mismatch: function parameter index i corresponds to call argument index i - 1 when dt is present and the procedure uses PASS semantics.

Every ASR pass and every codegen backend must independently compute this offset. This is error-prone and has already caused bugs (e.g., pass_array_by_data failing to split array arguments for procedure pointer calls through struct members, because it did not account for the offset).
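
To make the mismatch concrete, here is a minimal Python model of the current encoding. It is only an illustration, not LFortran code: `OldCall` and `arg_for_formal` are hypothetical names, expressions are plain strings, and only default PASS (self as the first formal) is modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OldCall:
    """Toy model of the current node: self lives in `dt`, not in `args`."""
    formals: List[str]           # callee's formal parameter names
    args: List[str]              # explicitly written arguments only
    dt: Optional[str] = None     # implicit self expression, if any
    is_nopass: bool = False


def arg_for_formal(call: OldCall, i: int) -> str:
    """Return the actual argument bound to formal i (default PASS only)."""
    has_implicit_self = call.dt is not None and not call.is_nopass
    if not has_implicit_self:
        return call.args[i]
    if i == 0:
        return call.dt
    return call.args[i - 1]      # the offset every consumer must remember


# call obj%method(x, y) where method is declared as sub(self, a, b)
call = OldCall(formals=["self", "a", "b"], args=["x", "y"], dt="obj")
bound = [arg_for_formal(call, i) for i in range(len(call.formals))]
```

Every consumer that forgets the `has_implicit_self` test, or the `i - 1`, silently binds the wrong argument.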

Current offset locations

The following code locations all independently compute the implicit-self offset:

File                                           | Function/Context          | Offset Pattern
asr_utils.h                                    | Call_t_body               | i + is_method
asr_verify.cpp                                 | visit_SubroutineCall      | formal_offset = 1
pass/pass_array_by_data.cpp                    | construct_new_args        | i + dt_implicitPass
pass/transform_optional_argument_functions.cpp | optional arg handling     | i - is_method
pass/array_passed_in_function_call.cpp         | arg offset                | arg_offset = 1
pass/array_struct_temporary.cpp                | array detection           | checks m_dt & nopass
codegen/asr_to_llvm.cpp                        | convert_call_args         | i + is_method
codegen/asr_to_llvm.cpp                        | runtime polymorphic calls | prepend m_dt to args

Each of these must correctly determine whether the call has PASS semantics (checking m_dt != nullptr AND !is_nopass), and then shift indices accordingly. Getting any one of these wrong produces silent miscompilation or LLVM verification errors.

Background: Fortran PASS Semantics

Fortran allows specifying which parameter receives the instance pointer via PASS(arg_name). The passed-object dummy argument can be at any position in the parameter list, not necessarily the first:

type :: t
contains
   ! Default PASS: self is 1st param
   procedure :: method1              ! sub(self, a, b)

   ! Explicit PASS(self): self is 1st param (same as default)
   procedure, pass(self) :: method2  ! sub(self, a, b)

   ! PASS(rhs): self is 2nd param (common for assignment operators)
   procedure, pass(rhs) :: assign    ! sub(lhs, rhs)

   ! NOPASS: no self at all
   procedure, nopass :: helper       ! sub(a, b)
end type

Current encoding

StructMethodDeclaration stores:

  • identifier? self_argument — the name of the passed-object dummy (e.g., "rhs"), or null for default PASS (meaning first param).
  • bool is_nopass — true when no self is passed at all.

The LLVM backend currently handles non-first PASS by saving the self value in a pass_arg variable and appending it after all explicit call args. It then resolves the correct formal parameter by name lookup in the function's symtab. This works but is fragile — it relies on the codegen knowing the name.
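
The append-then-resolve-by-name pattern can be sketched roughly as follows. This is a loose Python model of the description above, not the backend's actual code; `bind_with_appended_self` is a hypothetical name and expressions are plain strings.

```python
from typing import Dict, List


def bind_with_appended_self(formals: List[str], explicit_args: List[str],
                            pass_arg: str, self_argument: str) -> Dict[str, str]:
    """Bind explicit args in order, then place the appended self by name."""
    lowered = list(explicit_args) + [pass_arg]   # self appended after explicit args
    # Name lookup: raises ValueError if the PASS name is not a formal parameter.
    self_slot = formals.index(self_argument)
    other_slots = [i for i in range(len(formals)) if i != self_slot]
    binding = {}
    for slot, value in zip(other_slots, lowered[:-1]):
        binding[formals[slot]] = value
    binding[formals[self_slot]] = lowered[-1]
    return binding


# call t_instance%assign(str) with subroutine assign(lhs, rhs), PASS(rhs)
binding = bind_with_appended_self(["lhs", "rhs"], ["str"], "t_instance", "rhs")
```

The order of the lowered list no longer matches the formal parameter list, so correctness depends entirely on the name lookup succeeding.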

Proposed Design

Core change

Move the self/pass argument from the separate dt field into args at the correct position matching the function's formal parameter list. Add a boolean is_method flag to preserve the "this is a method call" information.

ASDL changes

-- Before:
| SubroutineCall(symbol name, symbol? original_name, call_arg* args,
                 expr? dt, bool strict_bounds_checking)
| FunctionCall(symbol name, symbol? original_name, call_arg* args,
               ttype type, expr? value, expr? dt)

-- After:
| SubroutineCall(symbol name, symbol? original_name, call_arg* args,
                 bool is_method, bool strict_bounds_checking)
| FunctionCall(symbol name, symbol? original_name, call_arg* args,
               ttype type, expr? value, bool is_method)

The dt field (type expr?) is removed and replaced by is_method (type bool).

Invariant

args[i] always corresponds to func->m_args[i]. No offset computation is ever needed, regardless of PASS position.

When is_method is true:

  • The self/pass expression is in args at the position matching its formal parameter. For default PASS, that is args[0]. For PASS(rhs) where rhs is the 2nd parameter, it is args[1].
  • n_args == func->n_args — exact match.

When is_method is false:

  • All of args are explicit user-written arguments.
  • This covers both non-method calls and NOPASS method calls.
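
Under this invariant, binding arguments to formals reduces to a plain positional zip. The following Python sketch illustrates that; `bind_args` is a hypothetical helper and expressions are modeled as strings.

```python
from typing import List, Tuple


def bind_args(formals: List[str], args: List[str]) -> List[Tuple[str, str]]:
    """Pair each formal with its actual argument; no offset, no PASS lookup."""
    if len(args) != len(formals):
        raise ValueError(f"expected {len(formals)} arguments, got {len(args)}")
    return list(zip(formals, args))


# Default PASS: call x%method(a) with subroutine method(self, a)
default_pass = bind_args(["self", "a"], ["x", "a"])
# PASS(rhs): call t_instance%assign(str) with subroutine assign(lhs, rhs)
pass_rhs = bind_args(["lhs", "rhs"], ["str", "t_instance"])
```

The same function serves method and non-method calls, which is the point of the invariant.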

PASS(arg_name) handling

During AST→ASR (semantics), the self expression is inserted into args at the position matching the named parameter:

Fortran source:     call t_instance%assign(str)
Function signature: subroutine assign(lhs, rhs)  ! PASS(rhs), rhs is index 1
ASR args:           [str, t_instance]             ! self inserted at index 1

The semantics phase already knows the PASS argument name from StructMethodDeclaration.m_self_argument and can resolve its index in the function's formal parameter list. This is a one-time lookup during ASR construction: downstream code never needs the PASS position to align arguments with formal parameters, only to single out the self argument (for pretty-printing, vtable dispatch, and verification).

For default PASS (no explicit arg name), self goes at index 0:

Fortran source:     call x%method(a)
Function signature: subroutine method(self, a)    ! default PASS, self is index 0
ASR args:           [x, a]                        ! self inserted at index 0
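
The position lookup itself might look like the sketch below. It is a Python illustration only: `pass_index` is a hypothetical name, and the case-insensitive comparison is an assumption made here because Fortran identifiers are case-insensitive (the real implementation may already normalize names).

```python
from typing import List, Optional


def pass_index(formals: List[str], self_argument: Optional[str]) -> int:
    """Index of the passed-object dummy; None means default PASS (index 0)."""
    if self_argument is None:
        return 0
    wanted = self_argument.lower()
    for i, name in enumerate(formals):
        if name.lower() == wanted:
            return i
    raise ValueError(f"no formal parameter named {self_argument!r}")


# call t_instance%assign(str) with subroutine assign(lhs, rhs), PASS(rhs)
args = ["str"]
args.insert(pass_index(["lhs", "rhs"], "RHS"), "t_instance")
```

After the insertion, `args` lines up with `(lhs, rhs)` position by position.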

NOPASS handling

For NOPASS procedure calls (call obj%method(a) where method has NOPASS):

  • The call is semantically a non-method call — obj is only used to locate the procedure pointer, not passed as an argument.
  • is_method = false.
  • args contains only the explicit arguments [a].
  • The obj expression for locating the procedure pointer is already encoded in m_name (which points to the procedure variable in the struct). No separate field is needed; the symbol resolution path provides access to the struct.

What is_method is used for

The is_method flag is not used for index offset calculations (there are none). It serves these purposes:

  1. ASR-to-Fortran pretty-printing (asr_to_fortran.cpp): When printing the call, the self argument should be extracted from args (at the PASS position) and rendered as obj%method(rest...) rather than method(obj, rest...). The PASS position is determined from the StructMethodDeclaration.m_self_argument field on the callee symbol.

  2. Codegen polymorphic dispatch (asr_to_llvm.cpp): The backend needs to know which argument is the struct instance to perform vtable lookup for runtime polymorphism. With is_method, it looks up the PASS position from the callee's StructMethodDeclaration and uses that args[pass_idx] for vtable dispatch. No special prepend/append logic is needed.

  3. Verification (asr_verify.cpp): The verifier can check that when is_method is true, the argument at the PASS position has a type compatible with the corresponding formal parameter (a class/struct type).

  4. Debug/diagnostic messages: Error messages can distinguish "method call on X" from "call to X" for better user-facing diagnostics.
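
As an illustration of the dispatch use (item 2), the sketch below picks a call target from a toy vtable. Everything here is hypothetical: `select_target`, the `dynamic_type` map standing in for runtime type information, and the nested-dict vtable layout are modeling devices, not the LLVM backend's data structures.

```python
from typing import Dict, List


def select_target(method: str, args: List[str], is_method: bool, pass_idx: int,
                  dynamic_type: Dict[str, str],
                  vtables: Dict[str, Dict[str, str]]) -> str:
    """Return the implementation to call for this call site."""
    if not is_method:
        return method                        # static call, no vtable lookup
    self_expr = args[pass_idx]               # self is already in args
    return vtables[dynamic_type[self_expr]][method]


vtables = {
    "base_t": {"area": "base_area"},
    "circle_t": {"area": "circle_area"},
}
dynamic_type = {"shape": "circle_t"}
target = select_target("area", ["shape", "scale"], True, 0, dynamic_type, vtables)
```

No argument is prepended or appended here; the flag only decides whether `args[pass_idx]` is consulted for dispatch.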

Impact on StructMethodDeclaration

StructMethodDeclaration is unchanged. It retains self_argument and is_nopass, which are properties of the method binding, not the call site. The semantics phase uses self_argument to determine where to insert self into args. After that, downstream code consults self_argument only to identify which argument is self (pretty-printing, vtable dispatch, verification), never to compute an index offset.

Changes Required

Phase 1: ASDL and ASR node construction

File: src/libasr/ASR.asdl

Replace expr? dt with bool is_method on both SubroutineCall and FunctionCall.

File: src/libasr/asr_utils.h

  • make_SubroutineCall_t_util: Instead of accepting a_dt and passing it as a separate field, accept bool is_method. When is_method is true, expect that the caller has already inserted self into a_args at the PASS position. Remove the code that wraps a_dt in a StructInstanceMember_t.
  • make_FunctionCall_t_util: Same change.
  • Call_t_body: Remove the is_method / nopass offset logic. The function already receives the full argument list; just iterate args[i] against func->m_args[i] directly.
  • get_class_proc_nopass_val: Can be removed or simplified — it is only used to compute the offset that no longer exists.

Phase 2: Semantics (AST → ASR)

Files: src/lfortran/semantics/ast_body_visitor.cpp, src/lfortran/semantics/ast_common_visitor.h

When creating a SubroutineCall or FunctionCall for a type-bound procedure:

  • If the procedure has PASS semantics (!is_nopass):
    • Determine the PASS position: look up self_argument in the function's formal parameter list. If self_argument is null (default PASS), position is 0. Otherwise, find the index of the named parameter.
    • Insert the self expression (v_expr) into args at that position.
    • Set is_method = true.
  • If the procedure has NOPASS:
    • Do not insert self.
    • Set is_method = false.

Example for PASS(rhs) where rhs is the 2nd formal parameter:

Source:    call t_instance%assign(str)
Function: subroutine assign(lhs, rhs)  ! rhs at index 1
Action:   args = [str]  →  insert t_instance at index 1  →  args = [str, t_instance]
Result:   SubroutineCall(assign, args=[str, t_instance], is_method=true)

The current code already prepends self in some paths (e.g., ast_common_visitor.h line ~10250: args.push_front(al, self_arg)). These paths set is_method = true and pass nullptr for dt.

Other paths currently pass v_expr as dt without prepending. These must be changed to insert v_expr into args at the correct PASS position and set is_method = true.
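
Put together, the call-construction rule for type-bound procedures could be sketched as below. This is a Python model under stated assumptions: `MethodBinding` stands in for StructMethodDeclaration, `CallNode` for the new SubroutineCall/FunctionCall shape, and `make_method_call` is a hypothetical helper; none of these names come from the LFortran sources.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MethodBinding:
    """Hypothetical stand-in for StructMethodDeclaration."""
    name: str
    formals: List[str]
    self_argument: Optional[str] = None   # None means default PASS
    is_nopass: bool = False


@dataclass
class CallNode:
    """Hypothetical stand-in for the proposed call node."""
    name: str
    args: List[str]
    is_method: bool


def make_method_call(binding: MethodBinding, v_expr: str,
                     explicit_args: List[str]) -> CallNode:
    args = list(explicit_args)
    if binding.is_nopass:
        # v_expr only locates the procedure; it is not passed.
        return CallNode(binding.name, args, is_method=False)
    if binding.self_argument is None:
        pass_idx = 0
    else:
        pass_idx = binding.formals.index(binding.self_argument)
    args.insert(pass_idx, v_expr)
    return CallNode(binding.name, args, is_method=True)


assign = MethodBinding("assign", ["lhs", "rhs"], self_argument="rhs")
helper = MethodBinding("helper", ["a", "b"], is_nopass=True)
```

Routing every construction path through one such helper would keep the PASS/NOPASS decision in a single place.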

Phase 3: ASR passes

All passes that currently compute the implicit-self offset can be simplified:

pass/pass_array_by_data.cpp:

  • Remove dt_implicitPass parameter from construct_new_args.
  • Remove call_with_implicit_dt_passed and is_structMethodDeclaration_with_pass helper functions.
  • construct_new_args iterates over args directly: index i in args always corresponds to index i in the function's formal parameters.

pass/transform_optional_argument_functions.cpp:

  • Remove the is_method offset variable and all i - is_method expressions.
  • Iterate directly: args[i] corresponds to func->m_args[i].

pass/array_passed_in_function_call.cpp:

  • Remove arg_offset = 1 logic.

pass/array_struct_temporary.cpp:

  • Remove special m_dt / nopass checks for determining array offsets.

pass/nested_vars.cpp (14 references to .m_dt):

  • Replace visit_expr(*x.m_dt) with normal iteration over x.m_args (self is now in args at the PASS position and handled uniformly).
  • Remove special-case code that separately processes m_dt.

pass/array_op.cpp (5 references):

  • For elemental operations on arrays: if m_dt was an array being processed, it is now in args at the PASS position and handled uniformly.

pass/unused_functions.cpp (4 references):

  • Symbol usage tracking that walks m_dt now walks all args uniformly.

pass/replace_symbolic.cpp, pass/promote_allocatable_to_nonallocatable.cpp:

  • Minor: replace m_dt references with access to the self argument in args (look up PASS position from callee's StructMethodDeclaration when is_method is true).
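
To illustrate what a simplified pass looks like, here is a Python sketch of an argument-splitting transform in the spirit of pass_array_by_data. The lowering shown (`data(...)` and `size(...)` strings) is a placeholder invented for this example, and `split_array_args` is a hypothetical name; the point is only that the loop has no self-related special case.

```python
from typing import List


def split_array_args(args: List[str], formal_is_array: List[bool]) -> List[str]:
    """Replace each array argument with a (data, size) pair of arguments."""
    if len(args) != len(formal_is_array):
        raise ValueError("args must align one-to-one with formal parameters")
    new_args = []
    for arg, is_array in zip(args, formal_is_array):
        if is_array:
            # Placeholder lowering, for illustration only.
            new_args.extend([f"data({arg})", f"size({arg})"])
        else:
            new_args.append(arg)
    return new_args


# subroutine method(self, v) where v is an array and self is a scalar
lowered = split_array_args(["obj", "v"], [False, True])
```

Because self is an ordinary entry in `args`, an array-valued self would be split by the same loop with no extra code.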

Phase 4: Codegen backends

codegen/asr_to_llvm.cpp (24 references to .m_dt):

  • convert_call_args: Remove the is_method parameter. The function iterates args[0..n-1] and maps each to func->m_args[i] directly.
  • Runtime polymorphic calls (visit_RuntimePolymorphicSubroutineCall, visit_RuntimePolymorphicFunctionCall): Instead of separately evaluating x.m_dt and prepending/appending it to the LLVM args vector, these paths find the self argument at x.m_args[pass_idx] (where pass_idx is determined from the callee's StructMethodDeclaration.m_self_argument). Use the is_method flag to know when to perform vtable dispatch. The current pass_arg / append-after-explicit-args pattern for PASS(rhs) is eliminated — the argument is already in the right position.
  • bounds_check_call: Remove is_method = x.m_dt && !is_nopass — just iterate args directly.

codegen/asr_to_fortran.cpp (4 references):

  • When printing calls with is_method = true, look up the PASS position from the callee's StructMethodDeclaration, extract that argument as the object, and print as obj%method(remaining args...).
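
A possible shape for the printing logic, as a Python sketch: `print_call` is a hypothetical name, expressions are already-rendered strings, and `pass_idx` is assumed to have been looked up from the callee as described above.

```python
from typing import List


def print_call(name: str, args: List[str], is_method: bool, pass_idx: int = 0) -> str:
    """Render a subroutine call, restoring obj%method(...) syntax for methods."""
    if not is_method:
        return f"call {name}({', '.join(args)})"
    obj = args[pass_idx]
    rest = args[:pass_idx] + args[pass_idx + 1:]   # everything except self
    return f"call {obj}%{name}({', '.join(rest)})"


printed = print_call("assign", ["str", "t_instance"], is_method=True, pass_idx=1)
```

This round-trips the PASS(rhs) example: the ASR args `[str, t_instance]` print back as the original source form.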

codegen/asr_to_c_cpp.h:

  • No significant changes needed (only 1 reference to m_dt_sym, which is on StructType, not on calls).

Phase 5: Verification

asr_verify.cpp:

  • Remove formal_offset = 1 logic.
  • Verify the following invariants on every SubroutineCall and FunctionCall:
1. n_args == func->n_args
   Call argument count must exactly match formal parameter count.
   No offset. No exceptions.

2. For each i in 0..n_args-1:
     type_of(args[i]) is compatible with type_of(func->m_args[i])
   Every call argument must type-match its corresponding formal parameter.
   This catches misaligned self arguments immediately.

3. When is_method is true:
   a. The callee symbol (past ExternalSymbol) is a StructMethodDeclaration
      with is_nopass == false, OR is a Variable with a procedure pointer
      type whose interface has PASS semantics.
   b. Let pass_idx = index of self_argument in func->m_args
      (0 if self_argument is null, otherwise the named parameter's index).
      args[pass_idx] must have a class/struct type compatible with the
      derived type that owns the method.

4. When is_method is false:
   a. If the callee is a StructMethodDeclaration, it must have
      is_nopass == true.
   b. No argument should be a class/struct type matching the owning
      derived type at the PASS position (prevents accidental inclusion
      of self without setting is_method).

These checks make argument misalignment a verifier error caught during development rather than a silent miscompilation or LLVM crash at codegen time.
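
A reduced version of these checks is sketched below in Python. It covers invariants 1, 2, 3 (for StructMethodDeclaration callees only) and 4a; the procedure-pointer case and check 4b are omitted. `Callee` and `verify_call` are hypothetical names, types are plain strings, and string equality stands in for real type compatibility (which would have to account for type extension).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Callee:
    """Hypothetical stand-in for the resolved callee of a call node."""
    formals: List[str]                   # formal parameter names
    formal_types: List[str]              # one type name per formal
    owner_type: Optional[str] = None     # owning derived type; None if not type-bound
    self_argument: Optional[str] = None  # None means default PASS (index 0)
    is_nopass: bool = False


def verify_call(callee: Callee, arg_types: List[str], is_method: bool) -> List[str]:
    """Return the list of invariant violations (empty if the call is well-formed)."""
    if len(arg_types) != len(callee.formals):
        return ["argument count does not match formal parameter count"]
    errors = []
    for i, (actual, formal) in enumerate(zip(arg_types, callee.formal_types)):
        if actual != formal:             # equality stands in for compatibility
            errors.append(f"type mismatch at index {i}")
    has_pass = callee.owner_type is not None and not callee.is_nopass
    if is_method and not has_pass:
        errors.append("is_method is true but the callee has no PASS semantics")
    elif is_method:
        name = callee.self_argument
        pass_idx = 0 if name is None else callee.formals.index(name)
        if arg_types[pass_idx] != callee.owner_type:
            errors.append("argument at the PASS position is not the owning type")
    elif has_pass:
        errors.append("PASS binding called with is_method false")
    return errors


assign = Callee(["lhs", "rhs"], ["string", "t"], owner_type="t", self_argument="rhs")
ok = verify_call(assign, ["string", "t"], is_method=True)
misaligned = verify_call(assign, ["t", "string"], is_method=True)
```

In the misaligned case the per-index type check already fires, which is how a wrongly placed self would surface during development.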

Phase 6: Other references

asr_lookup_name.h (4 references):

  • Replace m_dt field access with is_method flag check and self argument access at the PASS position.

Migration Strategy

This is a pervasive change touching the ASDL definition, semantics, ~10 pass files, all codegen backends, and the verifier. Recommended approach:

  1. Add is_method field first alongside dt, keeping dt temporarily. Set is_method = (dt != nullptr && !nopass) at all call construction sites. This is a non-breaking additive change.

  2. Insert self into args at all call construction sites where dt is set and !is_nopass. Use the PASS position (from self_argument) to determine the insertion index. Keep dt populated for now. Add a verifier check that when is_method is true, the argument at the PASS position matches dt.

  3. Migrate consumers one by one: Change each pass and codegen file to use args directly instead of dt, and remove offset calculations. After each file, run the full test suite. The verifier ensures consistency during migration.

  4. Remove dt field: Once all consumers use args and is_method, remove the dt field from the ASDL and all construction sites.
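
The transitional verifier check from step 2 could look like the following Python sketch. `check_transitional` is a hypothetical name, expressions are compared as strings (a real check would compare ASR expressions structurally), and `pass_idx` is assumed to be a non-negative index already resolved from the callee.

```python
from typing import List, Optional


def check_transitional(args: List[str], dt: Optional[str], is_method: bool,
                       is_nopass: bool, pass_idx: int) -> bool:
    """While dt and is_method coexist, they must tell the same story."""
    if is_method != (dt is not None and not is_nopass):
        return False                     # flag disagrees with the legacy field
    if not is_method:
        return True                      # nothing was inserted into args
    # Self must already sit at the PASS position and equal the legacy dt.
    return pass_idx < len(args) and args[pass_idx] == dt


migrated = check_transitional(["str", "t_instance"], "t_instance", True, False, 1)
not_yet = check_transitional(["str"], "t_instance", True, False, 1)
```

A construction site that sets the flag but has not yet inserted self fails the check, which is what lets consumers be migrated one file at a time.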

Benefits

  • Eliminates an entire class of bugs: No code ever needs to compute the implicit-self offset. Function parameter index i always equals call argument index i.
  • Simplifies every pass: ~8 pass files lose offset logic. New passes cannot introduce offset bugs because there is no offset.
  • Uniform argument processing: The self argument is just another argument. Passes that transform arguments (array splitting, optional arg insertion, type casting) automatically handle self correctly.
  • Cleaner codegen: Backends iterate args directly without prepending or appending logic. The current pass_arg pattern for PASS(rhs) (save self, append after explicit args) is eliminated entirely.
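  • Correct PASS(arg) by construction: Non-first-position PASS arguments are placed at the right index during ASR construction. Downstream code never needs to resolve the PASS argument name to match arguments with formal parameters; only consumers that must identify self (pretty-printing, vtable dispatch, verification) look up its position.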

Risks and Mitigations

  • Large diff: The change touches many files. The phased migration strategy (keeping dt temporarily) allows incremental validation.
  • Fortran pretty-printing: Must not print self as a regular argument. The is_method flag enables this distinction.
  • NOPASS edge case: NOPASS calls must NOT insert self into args. The semantics phase already distinguishes PASS from NOPASS; the new design simply encodes the outcome directly in args (self present or absent) rather than storing a flag that consumers must check.
  • Modfile compatibility: Serialized ASR in .mod files will change format. Bump the modfile version number.