Switch to the instavote namespace

kubectl config set-context --current --namespace=instavote
helm uninstall -n dev instavote 
kubectl delete deploy vote redis db result worker  -n instavote 
kubectl delete svc vote redis db result -n instavote 
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: vote
  namespace: instavote
spec:
  ingressClassName: nginx
  rules:
  - host: vote.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: vote
            port:
              number: 80   # assumed; match the vote service's port
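
To resolve vote.example.com locally, a typical step (a sketch; assumes the ingress is reachable on 127.0.0.1, e.g. via a kind port mapping) is adding a hosts entry:

echo "127.0.0.1 vote.example.com" | sudo tee -a /etc/hosts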

Event Driven Auto Scaling with KEDA

Configure Prometheus

Install Prometheus and Grafana with Helm

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
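
Then update the repo and install the kube-prometheus-stack chart, which bundles Prometheus and Grafana (the release name prometheus and the monitoring namespace below are assumptions):

helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace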

Create a namespace

kubectl create namespace instavote 
kubectl config set-context --current --namespace=instavote
git clone https://github.com/schoolofdevops/instavote-kustomize.git
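
The kustomize config can then be applied; the path below is an assumption about the repo layout, so adjust it to the actual overlay directory:

kubectl apply -k instavote-kustomize/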
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vote
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: vote
  template:
    metadata:
      labels:
        app: vote
    spec:
      containers:
      - name: vote
        image: schoolofdevops/vote:v1   # image assumed; use your vote image
        ports:
        - containerPort: 80
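
Given the section title, a minimal KEDA ScaledObject wired to Prometheus could look like the sketch below; the Prometheus address, query, and thresholds are assumptions, not values from the original gist:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: vote
  namespace: instavote
spec:
  scaleTargetRef:
    name: vote
  minReplicaCount: 2
  maxReplicaCount: 10
  triggers:
  - type: prometheus
    metadata:
      # serverAddress and query are assumptions; point these at your Prometheus.
      serverAddress: http://prometheus-kube-prometheus-prometheus.monitoring.svc:9090
      query: sum(rate(nginx_ingress_controller_requests{exported_service="vote"}[2m]))
      threshold: "100"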

rag/build_index.py

# rag/build_index.py
import argparse, json
from pathlib import Path
from typing import Dict, Any, Iterable, Tuple, List

# ---------- common snippet renderers ----------

What’s happening

That log line:

Overriding ... dispatch key: AutocastCPU ... new kernel: ... ipex-cpu ...
INFO ... Automatically detected platform cpu.

means IPEX’s autocast kernels replaced the default ones. With --dtype=float16 on CPU, PyTorch/ipex either upcasts or hits slow/non-vectorized code paths and can “hang” at model load/compile.
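
A common workaround on CPU is to avoid float16 entirely (a sketch; the model name is a placeholder):

# bfloat16 avoids the slow float16 autocast paths on CPU; use --dtype float32 if bfloat16 is unsupported.
vllm serve facebook/opt-125m --dtype bfloat16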

{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": {
          "type": "grafana",
          "uid": "-- Grafana --"
        },
        "enable": true,
Dockerfile to build a vLLM image with CPU on Mac
FROM openeuler/vllm-cpu:0.9.1-oe2403lts

# Patch cpu_worker.py to handle zero NUMA nodes
RUN sed -i 's/cpu_count_per_numa = cpu_count \/\/ numa_size/cpu_count_per_numa = cpu_count \/\/ numa_size if numa_size > 0 else cpu_count/g' \
    /workspace/vllm/vllm/worker/cpu_worker.py

ENV VLLM_TARGET_DEVICE=cpu \
    VLLM_CPU_KVCACHE_SPACE=1 \
    OMP_NUM_THREADS=2 \
    OPENBLAS_NUM_THREADS=1
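
Build and run it (a sketch; the tag is a placeholder, and the image's default entrypoint and port 8000 are assumptions):

docker build -t vllm-cpu-mac .
docker run --rm -p 8000:8000 vllm-cpu-mac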
  1. Local registry for KIND

We’ll run a registry container named kind-registry on port 5001 and attach it to the kind network so nodes can pull via kind-registry:5001/....

scripts/start_local_registry.sh

#!/usr/bin/env bash
set -euo pipefail
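
A minimal sketch of the rest of the script, following the description above; the registry:2 image and the idempotency checks are assumptions:

reg_name='kind-registry'
reg_port='5001'

# Start the registry container if it isn't already running (registry:2 image assumed).
if [ "$(docker inspect -f '{{.State.Running}}' "${reg_name}" 2>/dev/null || true)" != 'true' ]; then
  docker run -d --restart=always -p "127.0.0.1:${reg_port}:5000" --name "${reg_name}" registry:2
fi

# Attach the registry to the kind network so cluster nodes can pull via kind-registry:5001.
docker network connect kind "${reg_name}" 2>/dev/null || true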