@MrSteve2
Created October 3, 2025 18:27
Initial Claude.md file for analytics

Project Context: Data Analytics Dashboards

This project builds data analytics dashboards using Python and Plotly Dash.


1. Architecture Principles

Core Design Philosophy

  • Loose Coupling: Components interact through well-defined interfaces
  • High Cohesion: Related functionality stays together; each function has a single clear purpose
  • Composability: Basic visualizations combine to create complex dashboards
  • ACAP-ADAN: As Common As Possible, As Differentiated As Necessary

Design Evolution

  • Start simple, let solutions evolve
  • Add complexity only when real needs emerge
  • Postpone decisions to the last responsible moment

2. Component Hierarchy

Organized from first principles up the ladder of abstraction:

Level 1: Basic Visualizations

Individual chart functions representing atomic visualization types:

  • create_bar_chart()
  • create_line_chart()
  • create_scatter_chart()
  • etc.

Each basic visualization is its own function with a single clear purpose.

Level 2: Composite Visualizations

Combinations of basic visualizations:

  • KPI cards with sparklines
  • Dashboard panels combining multiple chart types
  • Complex analytical displays

Composite visualizations accept basic visualizations as parameters or combine their outputs.

Level 3: Layout and Navigation

Higher-level abstractions (future development):

  • Page structure and routing
  • Callbacks for interactivity
  • Application-level orchestration

3. Code Style

Standards

  • Follow PEP 8, with the line-length exception noted below
  • Use type hints for all function parameters and return values
  • Maximum line length: 88 characters (Black formatter default; PEP 8's own default is 79)

Function Organization

  • Each visualization type is its own function
  • Keep functions focused on a single responsibility
  • Group related utilities in cohesive modules

4. Data and Interface Specification

Data Format Requirements

Required: All data must be in Tidy (long-form) format as Pandas DataFrames

  • One observation per row
  • Variables in columns
  • Each value in its own cell

If wide-form data is passed, raise a clear error with a link to the documentation.
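
Wide-form data can usually be reshaped before it reaches a visualization function. A minimal sketch using pandas' melt, with illustrative column names (month, region, sales) that are not taken from the project:

```python
import pandas as pd

# Wide form: one column per region, so the "region" variable hides in the headers.
wide = pd.DataFrame(
    {"month": ["Jan", "Feb"], "north": [100, 120], "south": [80, 95]}
)

# Tidy (long) form: one row per (month, region) observation.
tidy = wide.melt(id_vars="month", var_name="region", value_name="sales")
```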

Visualization Function Interface

Standard Function Signature Pattern

def create_bar_chart(
    data: pd.DataFrame,
    x: str,
    y: Union[str, List[str]],
    return_type: str = 'figure',
    color_scheme: Optional[str] = None,
    height: Optional[Union[int, str]] = None,
    width: Optional[Union[int, str]] = None,
    **kwargs
) -> Union[go.Figure, dcc.Graph]:
    """Create a bar chart visualization.

    Args:
        data: Tidy format dataframe with one observation per row.
        x: Column name for x-axis.
        y: Column name(s) for series to plot.
        return_type: Return 'figure' (Plotly object) or 'component'
            (Dash dcc.Graph). Defaults to 'figure'.
        color_scheme: Named color scheme from config.py.
        height: Chart height in pixels or percentage.
        width: Chart width in pixels or percentage.

    Returns:
        go.Figure or dcc.Graph, depending on return_type.
    """

Required Parameters

Every visualization function must accept:

  • data: Pandas DataFrame in tidy format
  • x: Single column name for x-axis (or equivalent primary dimension)
  • y: Single column name or list of column names for series

Optional Parameters (with sensible defaults)

  • return_type: 'figure' (Plotly figure object) or 'component' (Dash dcc.Graph)
    • Default: 'figure'
    • Allows use in notebooks (figure) or production dashboards (component)
  • color_scheme: Named color scheme from config.py
    • Default: Auto-selected based on data type (sequential/divergent/qualitative)
  • height: Chart height in pixels or percentage
  • width: Chart width in pixels or percentage
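
The auto-selection of a color scheme type could look something like the sketch below. The function name infer_scheme_type and the rule itself (non-numeric means qualitative, numeric values on both sides of zero mean divergent, otherwise sequential) are assumptions for illustration; the project may settle on a different heuristic.

```python
import pandas as pd


def infer_scheme_type(values: pd.Series) -> str:
    """Return 'qual', 'div', or 'seq' for a column (hypothetical heuristic)."""
    if not pd.api.types.is_numeric_dtype(values):
        return "qual"  # categories with no inherent order
    if (values < 0).any() and (values > 0).any():
        return "div"  # values fall on both sides of zero
    return "seq"  # ordered values on one side of zero
```

Treating zero as the meaningful center is itself an assumption; data such as temperature anomalies or profit/loss fit it, but a custom midpoint would need an extra parameter.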

Composability Implementation

  • Small multiples: Implemented as parameters within individual graph functions
    • small_multiples: bool = False
    • columns: int = 2 (for layout)
    • shared_axes: bool = True
  • Return values: Basic visualizations return objects (figures or components) that composite functions can accept and combine
  • Parameter passing: Composite visualizations accept basic visualizations as parameters or combine their outputs programmatically

5. Example Code

import pandas as pd
import plotly.graph_objects as go
from dash import dcc
from typing import List, Optional, Union
from config import color_scheme_qual_category_8

def create_bar_chart(
    data: pd.DataFrame,
    x: str,
    y: Union[str, List[str]],
    return_type: str = 'figure',
    color_scheme: Optional[str] = None,
    height: Optional[Union[int, str]] = None,
    width: Optional[Union[int, str]] = None,
) -> Union[go.Figure, dcc.Graph]:
    """Create a bar chart from tidy data."""
    
    # Validate data format
    if not _is_tidy_format(data):
        raise ValueError(
            "Data must be in tidy format. "
            "See: docs/data_format_guide.md"
        )
    
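    # Validate required columns exist (x and every y column)
    y_cols = [y] if isinstance(y, str) else list(y)
    missing = [col for col in [x, *y_cols] if col not in data.columns]
    if missing:
        raise ValueError(
            f"Column(s) {missing} not found in data. "
            f"Available columns: {list(data.columns)}"
        )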
    
    # Auto-select color scheme if not provided
    if color_scheme is None:
        color_scheme = _infer_color_scheme(data, y)
    
    # Create the figure
    fig = go.Figure()
    
    # Add traces (simplified example)
    y_cols = [y] if isinstance(y, str) else y
    for col in y_cols:
        fig.add_trace(go.Bar(x=data[x], y=data[col], name=col))
    
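    # Apply styling. Plotly layout sizes must be numeric pixels, so only int
    # values go to the layout; string sizes (e.g. '50%') are applied as CSS
    # on the Dash component below.
    fig.update_layout(
        height=height if isinstance(height, int) else None,
        width=width if isinstance(width, int) else None,
        # Additional layout configuration
    )
    
    # Return appropriate type
    if return_type == 'figure':
        return fig
    elif return_type == 'component':
        css_size = {
            key: value
            for key, value in (('height', height), ('width', width))
            if isinstance(value, str)
        }
        return dcc.Graph(figure=fig, style=css_size)
    else:
        raise ValueError(
            f"return_type must be 'figure' or 'component', got '{return_type}'"
        )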

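def _is_tidy_format(data: pd.DataFrame) -> bool:
    """Helper to validate tidy format (implementation details)."""
    # Placeholder: a bare `pass` would return None and make every call to
    # create_bar_chart raise. Replace with a real check.
    return True

def _infer_color_scheme(data: pd.DataFrame, y: Union[str, List[str]]) -> str:
    """Helper to infer appropriate color scheme from data."""
    # Placeholder: always returns the qualitative scheme name until real
    # inference logic is added.
    return 'color_scheme_qual_category_8'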

6. Configuration Management

config.py Structure

Create config.py for shared settings that multiple visualization functions need.

Currently includes:

  • Color scheme definitions

Evolution approach: Keep configuration minimal initially. Add settings only when:

  • Multiple functions need the same value
  • The value should be consistent across the project
  • Changing it in one place should affect all uses

Color Scheme Naming Convention

Format: color_scheme_{type}_{name}_{max_colors}

Type abbreviations:

  • seq: Sequential (for ordered data, e.g., low to high)
  • cont: Continuous (for continuous numerical scales)
  • div: Divergent (for data with meaningful center, e.g., profit/loss)
  • qual: Qualitative (for categorical data with no inherent order)

Examples:

# In config.py
color_scheme_seq_blues_9 = ['#f7fbff', '#deebf7', '#c6dbef', ...]
color_scheme_div_redblue_11 = ['#67001f', '#b2182b', ..., '#053061']
color_scheme_qual_category_8 = ['#1f77b4', '#ff7f0e', '#2ca02c', ...]
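
Because the convention is regular, a scheme's type and capacity can be recovered from its name alone. The sketch below shows one way to do that; parse_scheme_name is a hypothetical helper, and it assumes the {name} part contains only lowercase letters and digits (no underscores).

```python
import re
from typing import NamedTuple

# color_scheme_{type}_{name}_{max_colors}
SCHEME_NAME = re.compile(
    r"^color_scheme_(?P<type>seq|cont|div|qual)"
    r"_(?P<name>[a-z0-9]+)_(?P<max_colors>\d+)$"
)


class SchemeName(NamedTuple):
    type: str
    name: str
    max_colors: int


def parse_scheme_name(identifier: str) -> SchemeName:
    """Split a scheme identifier into its three parts (hypothetical helper)."""
    match = SCHEME_NAME.match(identifier)
    if match is None:
        raise ValueError(
            f"'{identifier}' does not match "
            "color_scheme_{type}_{name}_{max_colors}"
        )
    return SchemeName(match["type"], match["name"], int(match["max_colors"]))
```

For example, parse_scheme_name("color_scheme_div_redblue_11") yields the type "div", the name "redblue", and a capacity of 11, which a capacity check can compare against the number of categories in the data.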

Defaults Location

Place defaults as close to the code using them as possible:

  • Shared across functions: In config.py (e.g., color schemes)
  • Function-specific: In function signature defaults (e.g., return_type='figure')

This maintains high cohesion while allowing shared configuration where needed.
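
Since visualization functions receive color_scheme as a name, something has to turn that name into the list defined in config.py. A minimal sketch of such a lookup follows; resolve_color_scheme is a hypothetical helper, and a SimpleNamespace stands in for the real config module so the example runs on its own.

```python
from types import SimpleNamespace
from typing import List

# Stand-in for `import config`. Only the first three colors are listed here,
# so this is not a complete 8-color scheme.
config = SimpleNamespace(
    color_scheme_qual_category_8=["#1f77b4", "#ff7f0e", "#2ca02c"],
)


def resolve_color_scheme(name: str) -> List[str]:
    """Look up a named scheme on the config module (hypothetical helper)."""
    colors = getattr(config, name, None)
    if not name.startswith("color_scheme_") or colors is None:
        raise ValueError(
            f"Unknown color scheme '{name}'. "
            "See: docs/color_schemes.md"
        )
    return list(colors)


palette = resolve_color_scheme("color_scheme_qual_category_8")
```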


7. Error Handling

Error Handling Principles

All exceptions must follow this three-part pattern:

  1. Raise an appropriate error type (ValueError, TypeError, etc.)
  2. Provide a clear, short description of what went wrong
  3. Include a URL/path to detailed documentation for resolution
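
If the pattern starts to repeat, a small helper could keep the three parts together. This is a sketch only; docs_error is a hypothetical name and the project may prefer to write each message inline.

```python
from typing import Type


def docs_error(error_type: Type[Exception], message: str, doc_ref: str) -> Exception:
    """Build an exception whose text ends with a documentation pointer."""
    # Part 1: error type, part 2: short description, part 3: where to read more.
    return error_type(f"{message} See: {doc_ref}")


err = docs_error(
    ValueError, "Data must be in tidy format.", "docs/data_format_guide.md"
)
```

A caller would then write `raise docs_error(...)`, so the traceback still points at the line where the problem was detected.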

Examples

Invalid data format:

if not _is_tidy_format(data):
    raise ValueError(
        "Data must be in tidy format. "
        "See: docs/data_format_guide.md"
    )

Color scheme capacity exceeded:

if n_categories > max_colors:
    raise ValueError(
        f"Color scheme '{color_scheme}' supports max {max_colors} colors, "
        f"but data has {n_categories} categories. "
        f"See: docs/color_schemes.md for alternatives"
    )

Missing required column:

if x not in data.columns:
    raise ValueError(
        f"Column '{x}' not found in data. "
        f"Available columns: {list(data.columns)}"
    )

Invalid parameter value:

if return_type not in ['figure', 'component']:
    raise ValueError(
        f"return_type must be 'figure' or 'component', got '{return_type}'"
    )

8. Documentation Standards

What to Document

Document only what cannot be intuited from reading the code:

  • Why a design decision was made (not what the code does)
  • Non-obvious parameter constraints or relationships
  • Expected data structures when not clear from type hints
  • Links to external resources for complex concepts
  • Business logic reasoning that isn't self-evident

What NOT to Document

Avoid documenting:

  • What is already clear from function/variable names
  • What type hints already express
  • Simple implementations that speak for themselves
  • Obvious parameter descriptions

Docstring Format

Use Google style docstrings consistently across the project.

Example (Google style):

def create_bar_chart(
    data: pd.DataFrame,
    x: str,
    y: Union[str, List[str]],
    return_type: str = 'figure',
) -> Union[go.Figure, dcc.Graph]:
    """Create a bar chart visualization from tidy data.
    
    Uses ACAP-ADAN principle: common bar chart logic with differentiation
    through parameters. Automatically selects appropriate color scheme
    based on data type if not specified.
    
    Args:
        data: Tidy format dataframe with one observation per row.
        x: Column name for categorical x-axis.
        y: Column name(s) for numeric values to plot.
        return_type: Whether to return 'figure' (Plotly object) or 
            'component' (Dash dcc.Graph). Defaults to 'figure'.
    
    Returns:
        Plotly figure object or Dash component based on return_type.
        
    Raises:
        ValueError: If data is not in tidy format or required columns missing.
    
    See Also:
        docs/data_format_guide.md for explanation of tidy data format.
    """

Development Workflow

When creating new visualizations:

  1. Start with the simplest working version
  2. Add parameters only when needed
  3. Ensure error messages are helpful and include documentation links
  4. Test with edge cases (empty data, single row, max colors exceeded)
  5. Update this Claude.md if new patterns emerge

Future Considerations

Areas to develop as needs become clear:

  • Testing standards and patterns
  • Callback patterns for interactivity
  • File and folder structure conventions
  • Performance optimization guidelines

This document evolves with the project. Update it as patterns emerge and solidify.
