Skip to content

Instantly share code, notes, and snippets.

@cx0
Created December 9, 2024 10:18
Show Gist options
  • Save cx0/58078e3b2422c6393dae3ed0e3637a29 to your computer and use it in GitHub Desktop.
Save cx0/58078e3b2422c6393dae3ed0e3637a29 to your computer and use it in GitHub Desktop.

User prompt:

You have access to source code for torch.distributed.pipelining package and relevant documentation for implementing pipeline parallelism. Write a function that creates an accurate pipeline_order for Schedule1F1B given world_size and n_microbatches such that _format_pipeline_order will return the correct output. Make sure to provide a minimal example to verify function performs as expected.

CoT:

  • Crafting a pipeline: I’m thinking through a function to set up the pipeline order for Schedule1F1B in PyTorch, aiming to efficiently manage forward and backward passes across stages. Progress is steady, ensuring an interleaved pattern that optimizes pipeline stage utilization.
  • Crafting the schedule: I’m working on a 1F1B pipeline schedule, alternating forward and backward pass microbatches at each stage. Ensuring optimal utilization, I aim to refine a formula for this arrangement.
  • Piecing together: I’m using a well-known formula from “GPipe” and “torch.distributed.pipeline” to craft a pipeline_order for the “Schedule1F1B” based on world_size and n_microbatches.
  • Crafting the pipeline order: I’m detailing the 1F1B schedule, managing microbatch and stage progression. Each stage alternates between forward and backward passes, aiming for smooth and continuous execution from world_size and n_microbatches.
  • Managing pipeline actions: I’m figuring out the 1F1B schedule, focusing on forward and backward passes for each stage. This approach ensures continuous pipeline operations and smooth progression of microbatches.
  • Crafting the pipeline: I’m organizing each pipeline step to show stage, microbatch, and direction (‘F’ forward or ‘B’ backward). This approach clearly maps out the flow of tasks through the pipeline.
  • Mapping the pipeline: I’m detailing computing indices for forward and backward passes each time step. This progression creates a balanced blend of rapid and cautious movements, maintaining an even cadence.
  • Mapping the process: I’m outlining the pipeline’s journey: warmup, steady state, and cooldown. This progression ensures a well-coordinated flow of forward and backward passes, leading to balanced computational stages.

Model response:

Below is an illustrative implementation of how to construct a 1F1B (one-forward-one-backward) pipeline schedule for a given world_size (number of pipeline stages) and n_microbatches (number of microbatches). The resulting pipeline_order is a list of steps, where each step is a list of operations. Each operation is a tuple (stage, microbatch, 'F' or 'B'), indicating which stage processes which microbatch in either the forward (‘F’) or backward (‘B’) direction at that time step.

This schedule follows the well-known 1F1B pattern in pipeline parallelism:

  • Warmup phase (filling the pipeline): Only forward passes happen until all stages start receiving microbatches.
  • Steady state phase: At each time step, a forward pass of a new microbatch is fed into the first stage while a backward pass of a completed microbatch is simultaneously processed by the last stage. Intermediate stages do both forward and backward computations in a staggered manner.
  • Cooldown phase (draining the pipeline): After all microbatches have been fed forward, remaining backward passes continue until the pipeline is fully drained.

Key Formula:

At a given time step t (0-based), for each stage i (0-based):

  • Forward microbatch index: f = t - i
  • If 0 <= f < n_microbatches, stage i runs forward on microbatch f.
  • Backward microbatch index: b = t - ((world_size - 1) - i) = t + i - (world_size - 1)
  • If 0 <= b < n_microbatches, stage i runs backward on microbatch b.

The total number of time steps in a 1F1B schedule is 2 * (n_microbatches + world_size - 1). This comes from the fact that the pipeline first fills with world_size - 1 forward steps, then processes n_microbatches steps in the steady state (each step having a forward and a backward in parallel after the first is fully warmed), and finally drains with another world_size - 1 backward steps.

Code Implementation

def create_pipeline_order_1F1B(world_size, n_microbatches):
    """
    Create a pipeline order for a 1F1B schedule.

    Returns:
        A list of steps. Each step is a list of tuples (stage, microbatch, 'F' or 'B').
        'F' indicates a forward pass, 'B' indicates a backward pass.
    """

    # Total steps: warmup (world_size-1) + steady state (n_microbatches) + cooldown (world_size-1)
    # Total = 2 * (n_microbatches + world_size - 1)
    total_steps = 2 * (n_microbatches + world_size - 1)
    pipeline_order = []

    for t in range(total_steps):
        step_ops = []
        for i in range(world_size):
            # Forward index for stage i at time t
            f = t - i
            if 0 <= f < n_microbatches:
                step_ops.append((i, f, 'F'))

            # Backward index for stage i at time t
            b = t - ((world_size - 1) - i)
            if 0 <= b < n_microbatches:
                step_ops.append((i, b, 'B'))

        # Sort operations by stage for neatness (optional)
        step_ops.sort(key=lambda x: x[0])
        pipeline_order.append(step_ops)

    return pipeline_order

def _format_pipeline_order(pipeline_order):
    """
    Example helper function that formats the pipeline order into a readable string.
    """
    lines = []
    for t, ops in enumerate(pipeline_order):
        # Each step: show stage and op
        desc = f"Step {t}: " + ", ".join(f"Stage {st} {'F' if d=='F' else 'B'}{mb}" for st, mb, d in ops)
        lines.append(desc)
    return "\n".join(lines)

# Minimal example to verify correctness:
if __name__ == "__main__":
    # Example with a pipeline of 2 stages and 4 microbatches
    world_size = 2
    n_microbatches = 4
    order = create_pipeline_order_1F1B(world_size, n_microbatches)
    formatted = _format_pipeline_order(order)
    print(formatted)

Explanation of the Example

For world_size = 2 and n_microbatches = 4, we expect the pipeline to run as follows:

  • Warmup (Stages performing forward passes on MB0 until pipeline is full)
  • Steady state (Each new step feeds a new microbatch forward on Stage0 and completes a backward pass on Stage1)
  • Cooldown (Remaining backward passes are executed until all microbatches are done)

The printed schedule will show, at each time step, which stage is doing which forward or backward pass. For instance, in a 2-stage pipeline with 4 microbatches, you’ll see a pattern like:

  • Initial steps: only forwards on MB0 to fill the pipeline.
  • Mid steps: simultaneously a forward on the next microbatch at Stage0 and a backward on a previous microbatch at Stage1.
  • Final steps: only backward passes as the pipeline drains.

This ensures _format_pipeline_order and any related code relying on the order returns the correct, expected pipeline execution pattern.

@cx0
Copy link
Author

cx0 commented Dec 9, 2024

Note: Enable VS Code access pointing at a single file containing all the relevant Python scripts and documentation.

CoT:

  • Creating a pipeline schedule: I’m piecing together a function for pipeline_order in Schedule1F1B, aligning with _format_pipeline_order to ensure accurate microbatch and stage scheduling.
  • Mapping the pattern: I’m examining the 1F1B schedule, detailing the forward and backward pass sequence for managing microbatches and pipeline stages, represented in a structured table format.
  • Assessing steps: Considering the sequence of steps for world_size and n_microbatches, forward and backward passes in specific patterns help streamline pipeline scheduling and optimize microbatch transitions.
  • Mapping the sequence: I’m noting the pipeline’s progression with sequential forward passes for microbatches, followed by symmetric backward passes, ensuring each stage gradually transitions to the next without delays.
  • Piecing together the schedule: The pipeline orchestrates M forward passes to fill up, then M backward passes to methodically drain, creating a well-structured 2M + S - 2 step schedule.
  • Mapping pipeline progress: I’m charting the pipeline order by sequencing forward passes for each microbatch and stage first, then reversing the order for backward passes, ensuring a balanced schedule.
  • Organizing pipeline steps: I’m detailing a staggered pipeline order for forward and backward steps, ensuring each stage and microbatch follows the schedule accurately from a known formula.
  • Mapping the schedule: I’m analyzing the GPipe pattern, detailing a balanced forward and backward sequence for M microbatches and S stages to ensure smooth pipeline operation.
  • Shifting backward: I’m noting that backward steps commence after all forward passes finish. The initial backward step aligns with the pipeline depth, ensuring a seamless flow throughout the schedule.
  • Mapping backwards: I’m simplifying the backward step formula for microbatch i at stage s to 2M + S - 2 - (s + i), ensuring progression aligns with the schedule’s needs.
  • Organizing the sequence: I’m mapping out the pipeline schedule, detailing steps for forward and backward passes, ensuring proper sequencing for M microbatches and S stages in a well-structured format.
  • Structuring event output: I’m working on a function to ensure _format_pipeline_order outputs a time-ordered list of events, each specified by step, stage, microbatch, and mode (‘F’ or ‘B’).
  • Crafting the pipeline: I’m developing create_pipeline_order to ensure backward steps follow forward ones, with a structured sequence of events, each step clearly defined by time, stage, microbatch, and mode.
  • Ensuring alignment: I’m synchronizing the function to _format_pipeline_order by sorting events by their steps, explicitly matching the 1F1B schedule pattern for a precise timeline sequence.
  • Clarifying forward and backward steps: I’m recalculating forward and backward steps for world_size = 2 and n_microbatches = 2, ensuring the final schedule aligns with the expected 1F1B pattern.
  • Clarifying step distribution: I’m organizing the 1F1B pipeline schedule, ensuring forward passes run first, followed by backward passes, for S stages and M microbatches, to achieve a balanced schedule.
  • Clarifying scheduling steps: I’m mapping out the diagonal 1F1B scheduling pattern, ensuring synchronous forward and backward steps, with backward proceeds as forward steps wind up, perfectly aligning with the GPipe algorithm.
  • Organizing backward steps: Hashing out a cleaner method to arrange backward steps after forward ones within the 1F1B pipeline. This sequence ensures the perfect alignment with _format_pipeline_order.
  • Ensuring precise alignment: I’m confirming the final schedule with forward steps at i+s and backward steps at 2M+S-2-(s+i), perfectly matching the expected arrangement.

Below is an example implementation for generating the pipeline order for a Schedule1F1B training schedule. The Schedule1F1B pattern (one-forward-one-backward) is a pipeline schedule often associated with GPipe. In this schedule:

  • During the first phase (“warmup”), each stage performs only forward passes until the pipeline is full.
  • Once the pipeline is full, at every step, a stage may run a forward pass of a new microbatch and a backward pass of a previously processed microbatch simultaneously (“1F1B” steady-state).
  • Finally, the pipeline “drains” by completing the backward passes on remaining microbatches once all forward passes are done.

Derivation of the Schedule1F1B pattern:

  1. There are p = world_size pipeline stages and N = n_microbatches.
  2. The total number of steps is 2N + p - 1.
    • The first N + p - 1 steps are used to feed all microbatches forward through all stages.
    • Starting at step N, backward passes begin to appear as well, until all N microbatches have been processed backward through all p stages (another N + p - 1 steps overlapping with forward steps in steady-state, and finishing with a final drain phase).
  3. Forward rule for stage i at step s:
    • If i <= s < i + N, then stage i runs a forward pass on microbatch (s - i).
  4. Backward rule for stage i at step s:
    • Backward passes start to appear after s >= N.
    • Let k = s - N. If i ≤ k < i + N, stage i runs a backward pass on microbatch N - 1 - (k - i) = 2N - s - 1 + i.

Combining these conditions, at each time step we list which stages run forward passes and which run backward passes, resulting in a pipeline order that _format_pipeline_order can correctly interpret.

Code Implementation

def create_pipeline_order(world_size: int, n_microbatches: int):
    """
    Create a pipeline order for Schedule1F1B given `world_size` and `n_microbatches`.

    The returned structure is a list of steps, each step is a list of tuples:
    (stage_index, 'f' or 'b', microbatch_index)
    
    'f' indicates a forward pass on that microbatch, 
    'b' indicates a backward pass on that microbatch.
    """
    p = world_size
    N = n_microbatches
    total_steps = 2 * N + p - 1  # total number of pipeline steps

    pipeline_order = []

    for s in range(total_steps):
        step_actions = []

        # Forward passes
        for i in range(p):
            # Check forward condition: i <= s < i+N
            if i <= s < i + N:
                f_mb = s - i
                step_actions.append((i, 'f', f_mb))

        # Backward passes
        if s >= N:  # backward starts to appear after step N-1
            k = s - N
            for i in range(p):
                # Check backward condition: i ≤ k < i+N
                # If true, backward microbatch = N - 1 - (k - i)
                # Simplify to: b_mb = 2N - s - 1 + i
                if i <= k < i + N:
                    b_mb = N - 1 - (k - i)
                    step_actions.append((i, 'b', b_mb))

        pipeline_order.append(step_actions)

    return pipeline_order

# Example usage:
if __name__ == "__main__":
    # Suppose we have world_size = 2 stages and n_microbatches = 2.
    # Expected pattern (as per GPipe 1F1B schedule):
    # step 0: stage0 f0
    # step 1: stage0 f1, stage1 f0
    # step 2: stage1 f1, stage0 b1
    # step 3: stage1 b1, stage0 b0
    # step 4: stage1 b0
    order = create_pipeline_order(world_size=2, n_microbatches=2)
    for step, actions in enumerate(order):
        print(f"Step {step}:", actions)

Expected Output for the Example

For world_size=2 and n_microbatches=2, the printed schedule would be:

Step 0: [(0, 'f', 0)]
Step 1: [(0, 'f', 1), (1, 'f', 0)]
Step 2: [(1, 'f', 1), (0, 'b', 1)]
Step 3: [(0, 'b', 0), (1, 'b', 1)]
Step 4: [(1, 'b', 0)]

This matches the well-known 1F1B pattern. _format_pipeline_order can then be applied to this pipeline_order structure to produce a formatted output as required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment