This project builds data analytics dashboards using Python and Plotly Dash.
- Loose Coupling: Components interact through well-defined interfaces
- High Cohesion: Related functionality stays together; each function has a single clear purpose
- Composability: Basic visualizations combine to create complex dashboards
- ACAP-ADAN: As Common As Possible, As Differentiated As Necessary
- Start simple, let solutions evolve
- Add complexity only when real needs emerge
- Postpone decisions to the last responsible moment
Organized from first principles up the ladder of abstraction:
Individual chart functions representing atomic visualization types:
- create_bar_chart()
- create_line_chart()
- create_scatter_chart()
- etc.
Each basic visualization is its own function with a single clear purpose.
Combinations of basic visualizations:
- KPI cards with sparklines
- Dashboard panels combining multiple chart types
- Complex analytical displays
Composite visualizations accept basic visualizations as parameters or combine their outputs.
Higher-level abstractions (future development):
- Page structure and routing
- Callbacks for interactivity
- Application-level orchestration
- Follow PEP 8 strictly
- Use type hints for all function parameters and returns
- Maximum line length: 88 characters (Black formatter default)
- Each visualization type is its own function
- Keep functions focused on a single responsibility
- Group related utilities in cohesive modules
Required: All data must be in Tidy (long-form) format as Pandas DataFrames
- One observation per row
- Variables in columns
- Each value in its own cell
If wide-form data is passed, raise a clear error with a link to the documentation.
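To make the wide-versus-tidy distinction concrete, the sketch below reshapes a small wide-form table into tidy form with pandas melt. The column names (month, region, sales) and the sample values are invented for the illustration and are not part of the project.

```python
import pandas as pd

# Wide form: one column per region, so each row holds several observations.
wide = pd.DataFrame(
    {
        "month": ["Jan", "Feb"],
        "north": [100, 120],
        "south": [80, 95],
    }
)

# Tidy (long) form: one row per (month, region) observation,
# with the variables "month", "region", and "sales" in columns.
tidy = wide.melt(id_vars="month", var_name="region", value_name="sales")
```

After the reshape, the region names become values in a single column, which is the shape the visualization functions expect.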
def create_bar_chart(
    data: pd.DataFrame,
    x: str,
    y: Union[str, List[str]],
    return_type: str = 'figure',
    color_scheme: Optional[str] = None,
    height: Optional[Union[int, str]] = None,
    width: Optional[Union[int, str]] = None,
    **kwargs
) -> Union[go.Figure, dcc.Graph]:
    """Create a bar chart visualization.

    Args:
        data: Tidy format dataframe with one observation per row.
        x: Column name for x-axis.
        y: Column name(s) for series to plot.
        return_type: Return 'figure' (Plotly object) or 'component'
            (Dash dcc.Graph). Defaults to 'figure'.
        color_scheme: Named color scheme from config.py.
        height: Chart height in pixels or percentage.
        width: Chart width in pixels or percentage.

    Returns:
        go.Figure or dcc.Graph, depending on the return_type parameter.
    """
Every visualization function must accept:
- data: Pandas DataFrame in tidy format
- x: Single column name for x-axis (or equivalent primary dimension)
- y: Single column name or list of column names for series
- return_type: 'figure' (Plotly figure object) or 'component' (Dash dcc.Graph)
  - Default: 'figure'
  - Allows use in notebooks (figure) or production dashboards (component)
- color_scheme: Named color scheme from config.py
  - Default: Auto-selected based on data type (sequential/divergent/qualitative)
- height: Chart height in pixels or percentage
- width: Chart width in pixels or percentage
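The automatic color scheme selection could work along the following lines. This is only a sketch of one possible heuristic: the function name infer_scheme_type and the rule that values on both sides of zero indicate a divergent scale are assumptions made for the illustration, and the project's own helper may decide differently.

```python
import pandas as pd


def infer_scheme_type(series: pd.Series) -> str:
    """Return 'seq', 'div', or 'qual' for a column (hypothetical heuristic)."""
    if not pd.api.types.is_numeric_dtype(series):
        # Text or categorical data has no inherent order.
        return "qual"
    if (series < 0).any() and (series > 0).any():
        # Values on both sides of zero suggest a meaningful center.
        return "div"
    # Remaining numeric data is treated as ordered from low to high.
    return "seq"
```

The returned abbreviation matches the type segment used in the color scheme names in config.py, so it could be used to pick a default scheme of the right kind.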
- Small multiples: Implemented as parameters within individual graph functions
  - small_multiples: bool = False
  - columns: int = 2 (for layout)
  - shared_axes: bool = True
- Return values: Basic visualizations return objects (figures or components) that composite functions can accept and combine
- Parameter passing: Composite visualizations accept basic visualizations as parameters or combine their outputs programmatically
import pandas as pd
import plotly.graph_objects as go
from dash import dcc
from typing import List, Optional, Union
from config import color_scheme_qual_category_8


def create_bar_chart(
    data: pd.DataFrame,
    x: str,
    y: Union[str, List[str]],
    return_type: str = 'figure',
    color_scheme: Optional[str] = None,
    height: Optional[Union[int, str]] = None,
    width: Optional[Union[int, str]] = None,
) -> Union[go.Figure, dcc.Graph]:
    """Create a bar chart from tidy data."""
    # Validate data format
    if not _is_tidy_format(data):
        raise ValueError(
            "Data must be in tidy format. "
            "See: docs/data_format_guide.md"
        )
    # Validate required columns exist
    if x not in data.columns:
        raise ValueError(f"Column '{x}' not found in data")
    # Auto-select color scheme if not provided
    if color_scheme is None:
        color_scheme = _infer_color_scheme(data, y)
    # Create the figure
    fig = go.Figure()
    # Add traces (simplified example)
    y_cols = [y] if isinstance(y, str) else y
    for col in y_cols:
        fig.add_trace(go.Bar(x=data[x], y=data[col], name=col))
    # Apply styling
    fig.update_layout(
        height=height,
        width=width,
        # Additional layout configuration
    )
    # Return appropriate type
    if return_type == 'figure':
        return fig
    elif return_type == 'component':
        return dcc.Graph(figure=fig)
    else:
        raise ValueError(
            f"return_type must be 'figure' or 'component', got '{return_type}'"
        )


def _is_tidy_format(data: pd.DataFrame) -> bool:
    """Helper to validate tidy format (implementation details)."""
    # Implementation logic here
    raise NotImplementedError


def _infer_color_scheme(data: pd.DataFrame, y: Union[str, List[str]]) -> str:
    """Helper to infer appropriate color scheme from data."""
    # Implementation logic here
    raise NotImplementedError
Create config.py for shared settings that multiple visualization functions need.
Currently includes:
- Color scheme definitions
Evolution approach: Keep configuration minimal initially. Add settings only when:
- Multiple functions need the same value
- The value should be consistent across the project
- Changing it in one place should affect all uses
Format: color_scheme_{type}_{name}_{max_colors}
Type abbreviations:
- seq: Sequential (for ordered data, e.g., low to high)
- cont: Continuous (for continuous numerical scales)
- div: Divergent (for data with meaningful center, e.g., profit/loss)
- qual: Qualitative (for categorical data with no inherent order)
Examples:
# In config.py
color_scheme_seq_blues_9 = ['#f7fbff', '#deebf7', '#c6dbef', ...]
color_scheme_div_redblue_11 = ['#67001f', '#b2182b', ..., '#053061']
color_scheme_qual_category_8 = ['#1f77b4', '#ff7f0e', '#2ca02c', ...]
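Because the naming format is regular, a variable name can be split back into its parts. The sketch below shows one way to do that with a regular expression; parse_scheme_name and SchemeName are hypothetical names introduced for the illustration, not existing project code.

```python
import re
from typing import NamedTuple

# Matches color_scheme_{type}_{name}_{max_colors}.
_NAME_PATTERN = re.compile(
    r"^color_scheme_(?P<scheme_type>seq|cont|div|qual)"
    r"_(?P<name>[a-z0-9_]+)_(?P<max_colors>\d+)$"
)


class SchemeName(NamedTuple):
    scheme_type: str
    name: str
    max_colors: int


def parse_scheme_name(variable_name: str) -> SchemeName:
    """Split a color scheme variable name into type, name, and max colors."""
    match = _NAME_PATTERN.match(variable_name)
    if match is None:
        raise ValueError(
            f"'{variable_name}' does not follow "
            "color_scheme_{type}_{name}_{max_colors}. "
            "See: docs/color_schemes.md"
        )
    return SchemeName(
        match["scheme_type"], match["name"], int(match["max_colors"])
    )
```

For example, parse_scheme_name("color_scheme_seq_blues_9") yields the type seq, the name blues, and a maximum of 9 colors.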
Place defaults as close to the code using them as possible:
- Shared across functions: In config.py (e.g., color schemes)
- Function-specific: In function signature defaults (e.g., return_type='figure')
This maintains high cohesion while allowing shared configuration where needed.
All exceptions must follow this three-part pattern:
- Raise an appropriate error type (ValueError, TypeError, etc.)
- Provide a clear, short description of what went wrong
- Include a URL/path to detailed documentation for resolution
Invalid data format:
if not _is_tidy_format(data):
    raise ValueError(
        "Data must be in tidy format. "
        "See: docs/data_format_guide.md"
    )
Color scheme capacity exceeded:
if n_categories > max_colors:
    raise ValueError(
        f"Color scheme '{color_scheme}' supports max {max_colors} colors, "
        f"but data has {n_categories} categories. "
        f"See: docs/color_schemes.md for alternatives"
    )
Missing required column:
if x not in data.columns:
    raise ValueError(
        f"Column '{x}' not found in data. "
        f"Available columns: {list(data.columns)}"
    )
Invalid parameter value:
if return_type not in ['figure', 'component']:
    raise ValueError(
        f"return_type must be 'figure' or 'component', got '{return_type}'"
    )
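The checks above are fragments, so the sketch below wraps one of them in a small function that can be run on its own. The helper name check_color_capacity and the three-color sample palette are assumptions made for the illustration; the message text and the documentation path come from the capacity example.

```python
from typing import List


def check_color_capacity(
    scheme_name: str, colors: List[str], n_categories: int
) -> None:
    """Raise if the data has more categories than the scheme has colors."""
    max_colors = len(colors)
    if n_categories > max_colors:
        # Three parts: error type, short description, documentation link.
        raise ValueError(
            f"Color scheme '{scheme_name}' supports max {max_colors} colors, "
            f"but data has {n_categories} categories. "
            "See: docs/color_schemes.md for alternatives"
        )


# Hypothetical palette used only to exercise the check.
palette = ["#1f77b4", "#ff7f0e", "#2ca02c"]
check_color_capacity("demo", palette, n_categories=3)  # within capacity
```

Calling the helper with more categories than colors raises a ValueError whose message names the scheme, states both counts, and points to the documentation.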
Document only what cannot be intuited from reading the code:
- Why a design decision was made (not what the code does)
- Non-obvious parameter constraints or relationships
- Expected data structures when not clear from type hints
- Links to external resources for complex concepts
- Business logic reasoning that isn't self-evident
Avoid documenting:
- What is already clear from function/variable names
- What type hints already express
- Simple implementations that speak for themselves
- Obvious parameter descriptions
Use Google style docstrings consistently across the project.
Example (Google style):
def create_bar_chart(data, x, y, return_type='figure'):
    """Create a bar chart visualization from tidy data.

    Uses ACAP-ADAN principle: common bar chart logic with differentiation
    through parameters. Automatically selects appropriate color scheme
    based on data type if not specified.

    Args:
        data: Tidy format dataframe with one observation per row.
        x: Column name for categorical x-axis.
        y: Column name(s) for numeric values to plot.
        return_type: Whether to return 'figure' (Plotly object) or
            'component' (Dash dcc.Graph). Defaults to 'figure'.

    Returns:
        Plotly figure object or Dash component based on return_type.

    Raises:
        ValueError: If data is not in tidy format or required columns missing.

    See Also:
        docs/data_format_guide.md for explanation of tidy data format.
    """
When creating new visualizations:
- Start with the simplest working version
- Add parameters only when needed
- Ensure error messages are helpful with documentation links
- Test with edge cases (empty data, single row, max colors exceeded)
- Update this Claude.md if new patterns emerge
Areas to develop as needs become clear:
- Testing standards and patterns
- Callback patterns for interactivity
- File and folder structure conventions
- Performance optimization guidelines
This document evolves with the project. Update it as patterns emerge and solidify.