Dynamo

Executive Summary

This report provides a comprehensive comparison of NVIDIA Dynamo with other frameworks, including vLLM and NVIDIA Triton Server, for large language model (LLM) inference workloads. The key findings highlight Dynamo's superior performance in multi-GPU setups, achieving higher throughput and lower latency compared to vLLM. Additionally, Dynamo's disaggregated serving approach offers more flexible performance tuning, especially for large models and variable workload conditions. In comparison to NVIDIA Triton Server, Dynamo is optimized for low-latency generative AI/LLM workloads, while Triton Server excels in multi-model inference serving. The report also delves into Dynamo's technical architecture, which is designed to accelerate inference workloads through components such as disaggregated serving, smart routing, and distributed KV cache management. Benchmarking methodologies and key performance metrics are also discussed, emphasizing the importance of standardized evaluation for

@slopp
slopp / README.md
Created January 2, 2025 22:37
LangGraph Exploration

A modification of the LangChain SQL Q&A tutorial https://python.langchain.com/docs/tutorials/sql_qa/.

The changes are:

  • uses Pydantic to type the state and the inputs/outputs
  • uses DuckDB on the palmerpenguins dataset
  • uses an NVIDIA NIM for the LLM
  • instead of the fixed sequence write_query -> run_query -> gen_answer, this graph adds an LLM step that checks whether the write_query output is valid and whether it answers the question, leading to a more dynamic graph that looks like this:

[Image: diagram of the graph]
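
The control flow described above can be sketched in plain Python. This is a minimal illustration only, not the gist's code: it uses a standard-library dataclass in place of Pydantic, stubbed functions in place of the LLM and DuckDB calls, and a hand-written loop in place of LangGraph. The `check_query` node, the `MAX_ATTEMPTS` cap, the canned query strings, and the placeholder result are all assumptions made for the example.

```python
from dataclasses import dataclass

MAX_ATTEMPTS = 3  # hypothetical retry cap, not taken from the gist


@dataclass
class State:
    question: str
    query: str = ""
    query_ok: bool = False
    result: str = ""
    answer: str = ""
    attempts: int = 0


def write_query(state: State) -> State:
    # Stub for the query-writing LLM: the first attempt is deliberately wrong.
    state.attempts += 1
    if state.attempts == 1:
        state.query = "SELECT COUNT(*) FROM penguin"
    else:
        state.query = "SELECT COUNT(*) FROM penguins"
    return state


def check_query(state: State) -> State:
    # Stub for the checker LLM: accept only queries against the right table.
    state.query_ok = state.query.endswith("FROM penguins")
    return state


def run_query(state: State) -> State:
    # Stub for the database call; the result is a placeholder string.
    state.result = "<rows returned by the database>"
    return state


def gen_answer(state: State) -> State:
    if state.result:
        state.answer = f"Answer based on: {state.result}"
    else:
        state.answer = "Could not produce a valid query."
    return state


def next_node(node: str, state: State) -> str:
    # Only the edge leaving check_query is conditional.
    if node == "write_query":
        return "check_query"
    if node == "check_query":
        if state.query_ok:
            return "run_query"
        return "gen_answer" if state.attempts >= MAX_ATTEMPTS else "write_query"
    if node == "run_query":
        return "gen_answer"
    return "END"


NODES = {
    "write_query": write_query,
    "check_query": check_query,
    "run_query": run_query,
    "gen_answer": gen_answer,
}


def run_graph(question: str):
    state, node, visited = State(question=question), "write_query", []
    while node != "END":
        visited.append(node)
        state = NODES[node](state)
        node = next_node(node, state)
    return state, visited
```

Calling `run_graph("How many penguins are there?")` visits write_query and check_query twice before reaching run_query, which is the loop that the linear tutorial sequence lacks.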

@slopp
slopp / README.md
Created December 31, 2024 00:10
AI Coffee Shop Streamlit App

This simple Streamlit app uses the Google Maps and Places APIs, along with a hosted NVIDIA NIM wrapper of the Llama model, to help you find coffee shops near an address.

To run

  1. Install dependencies
@slopp
slopp / README.md
Last active January 2, 2025 22:32
Simple LLM agent to recommend fake coffee shops via tool calling
@slopp
slopp / README.md
Last active January 2, 2025 22:32
Code for creating a RAG chatbot based on theradavist
@slopp
slopp / ReadMe.md
Created December 23, 2024 20:17
Torch Experiment

Goal

  • First attempt at using Torch for some type of "deep" learning
  • Take advantage of Modal to access serverless Python compute, including GPUs

Approach

@slopp
slopp / example.py
Created September 24, 2024 21:42
IO Manager that Depends on Resource
from typing import Any
from dagster import ConfigurableResource, ConfigurableIOManager, InputContext, OutputContext, asset, Definitions, ResourceDependency, EnvVar
from pydantic import Field
# https://docs.dagster.io/concepts/resources#resources-that-depend-on-other-resources
class myResource(ConfigurableResource):
    username: str = Field(description="the username")
    password: str = Field(description="the password")
@slopp
slopp / job.py
Created September 23, 2024 19:31
Canary Ping Dagster
import os
from dagster import define_asset_job, load_assets_from_package_module, repository, with_resources, op, job, ScheduleDefinition
from my_dagster_project import assets
from datadog_api_client import ApiClient, Configuration
from datadog_api_client.v2.api.metrics_api import MetricsApi
from datadog_api_client.v2.model.metric_intake_type import MetricIntakeType
from datadog_api_client.v2.model.metric_payload import MetricPayload
from datadog_api_client.v2.model.metric_point import MetricPoint
from datetime import datetime
@slopp
slopp / palmer.py
Last active August 28, 2024 13:03
Palmer ML Workflow with Dagster
import datetime
import pins
import os
import seaborn as sns
from dagster import asset, asset_check, AssetCheckResult
from posit import connect # install as uv pip install posit-sdk
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
@slopp
slopp / assets.py
Created June 27, 2024 18:04
Custom dagster asset decorator
from dagster import asset
# add an attribute to all assets using this decorator without users having to adjust it
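def bi_team_asset(**asset_decorator_kwargs):
    def _wrapper(f):
        # forward the caller's kwargs to @asset and pin the owners and asset name
        @asset(**asset_decorator_kwargs, owners=["[email protected]"], name=f.__name__)
        def _impl(**kwargs):
            return f(**kwargs)
        return _impl
    # return the inner decorator so that @bi_team_asset(...) can be applied to a function
    return _wrapper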
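
The same pattern, a decorator factory that pre-fills arguments for another decorator, can be tried without dagster. The sketch below is an illustration under stated assumptions: `fake_asset`, `REGISTRY`, `team_asset`, and the `team@example.com` address are hypothetical stand-ins, and `fake_asset` only records metadata rather than imitating what dagster's `@asset` does.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry that stands in for whatever the real decorator tracks.
REGISTRY: Dict[str, Dict[str, Any]] = {}


def fake_asset(*, name: str, owners: List[str], **metadata: Any) -> Callable:
    """Stand-in registration decorator; it is NOT dagster's @asset."""
    def _decorator(fn: Callable) -> Callable:
        REGISTRY[name] = {"owners": owners, "fn": fn, **metadata}
        return fn
    return _decorator


def team_asset(**decorator_kwargs: Any) -> Callable:
    """Decorator factory that always injects the owners and the function name."""
    def _wrapper(f: Callable) -> Callable:
        @fake_asset(**decorator_kwargs, owners=["team@example.com"], name=f.__name__)
        def _impl(**kwargs: Any) -> Any:
            return f(**kwargs)
        return _impl
    return _wrapper


@team_asset(group="reports")
def daily_sales(region: str = "all") -> str:
    return f"sales for {region}"
```

After the decorator runs, `REGISTRY["daily_sales"]["owners"]` holds the injected owners and `daily_sales(region="west")` still calls the original function. One consequence of this design is that a caller who passes `owners=` explicitly gets a `TypeError` for a duplicate keyword argument, which keeps the team default from being overridden by accident.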