ehzawad ehzawad

The team needs an internal workflow that ingests messy legal-style documents, pulls usable information out of them, and turns that information into grounded draft outputs an operator can edit.

The inputs will not be clean. Expect scanned pages, low-resolution PDFs, handwritten notes, partially illegible records, and inconsistently formatted files. Your system has to cope with that.

At a high level, the system you build should:

  • Ingest and process the source documents.
  • Extract usable text and structured fields.
  • Retrieve relevant evidence from those documents.
  • Generate grounded draft responses or legal-style drafts.
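
The four stages above can be sketched end to end. This is only a minimal illustration, not a proposed design: `Chunk`, `extract_chunks`, `retrieve`, and `draft` are hypothetical names, the extraction step assumes page text has already been produced by some OCR stage, and keyword overlap stands in for whatever retrieval method the real system would use.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str
    page: int
    text: str


def extract_chunks(doc_id: str, pages: list[str]) -> list[Chunk]:
    """Stand-in for OCR/text extraction: one chunk per non-empty page."""
    return [
        Chunk(doc_id, i + 1, page.strip())
        for i, page in enumerate(pages)
        if page.strip()
    ]


def retrieve(query: str, chunks: list[Chunk], k: int = 2) -> list[Chunk]:
    """Rank chunks by naive keyword overlap (placeholder for real retrieval)."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(c.text.lower().split())), c) for c in chunks]
    scored = [(score, c) for score, c in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


def draft(query: str, evidence: list[Chunk]) -> str:
    """Build an editable draft in which every line cites its source chunk."""
    if not evidence:
        return f"No supporting evidence found for: {query}"
    lines = [f"Draft response to: {query}"]
    lines += [f"- {c.text} [{c.doc_id}, p.{c.page}]" for c in evidence]
    return "\n".join(lines)
```

Keeping the document ID and page number on every chunk is what lets an operator trace each drafted line back to its source, which is one plausible reading of "grounded" here.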

I want to add a logging middleware to my FastAPI app. Log every request and
response with: method, path, status, latency, a correlation ID, and redact
auth headers. Structured JSON output. Before you write any code, ask codex
for a second opinion on the design — specifically: async vs sync middleware,
how to propagate the correlation ID through downstream async tasks, and
what the cleanest redaction point is. Then reconcile and implement.

https://github.com/ehzawad/codex-opinion
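
As a rough illustration of two of the design questions in the prompt above (propagating the correlation ID into downstream async tasks, and where to redact), here is a standard-library sketch. It deliberately leaves out FastAPI itself; `correlation_id`, `REDACTED_HEADERS`, `redact_headers`, `log_record`, and `handle_request` are hypothetical names, and the list of redacted headers is an assumption.

```python
import asyncio
import contextvars
import json
import uuid

# Context variable holding the current request's correlation ID (hypothetical name).
correlation_id = contextvars.ContextVar("correlation_id", default="-")

# Assumed set of sensitive headers; adjust to the application's needs.
REDACTED_HEADERS = {"authorization", "cookie", "x-api-key"}


def redact_headers(headers: dict[str, str]) -> dict[str, str]:
    """Replace sensitive header values before they reach the log record."""
    return {
        name: ("[REDACTED]" if name.lower() in REDACTED_HEADERS else value)
        for name, value in headers.items()
    }


def log_record(method: str, path: str, status: int,
               latency_ms: float, headers: dict[str, str]) -> str:
    """Serialize one structured JSON log line."""
    return json.dumps({
        "method": method,
        "path": path,
        "status": status,
        "latency_ms": round(latency_ms, 2),
        "correlation_id": correlation_id.get(),
        "headers": redact_headers(headers),
    }, sort_keys=True)


async def downstream_task() -> str:
    # asyncio.create_task copies the caller's context, so the ID set
    # by the request handler is visible here without passing it along.
    return correlation_id.get()


async def handle_request(headers: dict[str, str]) -> tuple[str, str]:
    correlation_id.set(headers.get("X-Correlation-ID") or uuid.uuid4().hex)
    seen = await asyncio.create_task(downstream_task())
    return seen, log_record("GET", "/items", 200, 1.234, headers)
```

Redacting inside the function that builds the log record, rather than at each call site, means an unredacted header cannot reach the output by accident; whether that is the "cleanest" point is exactly the judgment the prompt asks to have reviewed.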

User question:
{truncated user question}

Candidates:
Candidate 1:
Tag: {tag}
Matched question: {truncated question}
Answer: {truncated answer}

Retrieval score: {score:.6f}
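
A template like the one above could be filled in roughly as follows. This is a sketch under assumptions: `truncate` and `render_prompt` are hypothetical helpers, the 200-character limit is invented, and the candidate dictionary keys are illustrative. Only the field labels and the `{score:.6f}` format come from the template itself.

```python
def truncate(text: str, limit: int = 200) -> str:
    """Collapse whitespace and cut to at most `limit` characters."""
    text = " ".join(text.split())
    return text if len(text) <= limit else text[: limit - 3] + "..."


def render_prompt(question: str, candidates: list[dict], limit: int = 200) -> str:
    """Render the user question followed by one block per candidate."""
    parts = ["User question:", truncate(question, limit), "", "Candidates:"]
    for i, cand in enumerate(candidates, start=1):
        parts += [
            f"Candidate {i}:",
            f"Tag: {cand['tag']}",
            f"Matched question: {truncate(cand['question'], limit)}",
            f"Answer: {truncate(cand['answer'], limit)}",
            "",
            f"Retrieval score: {cand['score']:.6f}",
        ]
    return "\n".join(parts)
```

The `.6f` format spec prints the score with exactly six digits after the decimal point, which keeps candidate blocks aligned and comparable across prompts.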

Scaffolding a React + Vite SPA portfolio site in an empty directory

Session: b5538e14 | Date: 2026-04-01 05:59:40 | Branch: HEAD | Turns: 25 | Project: /home/synesis/bada

Summary

Developer asked for a single page application. Claude chose React + Vite over Next.js (lighter for a true SPA), scaffolded via create-vite, then rewrote the default template into a dark-themed portfolio with nav, hero, projects grid, about, and contact sections. Build succeeded in 91ms with no errors.

What happened

#!/usr/bin/env python3
"""FastAPI batch ASR server: Bengali Whisper (faster-whisper / CTranslate2).

Usage:
    python serve.py
    python serve.py --port 8001 --host 0.0.0.0
"""
import base64
import io

Findings: Vibe Engineer Codebase

What this repo is

This repository is a Python CLI tool called ve for documentation-driven development. It turns the prompts you'd normally give an AI agent into persistent, discoverable documentation — creating a self-building institutional memory.

  • The installed ve command is defined in pyproject.toml (ve = "ve:cli").
  • The entrypoint is src/ve.py.
  • CLI command groups are assembled in src/cli/__init__.py (Click framework).
  • Data models use Pydantic for YAML frontmatter validation.

Research Summary

The baseline was the temporal PARSeq-Small setup (e.g., embed_dim=384, enc_depth=12, a model family of roughly 23.8M parameters).

Two approaches are proposed:

  1. Approach 1: PARSeq-Tiny transfer learning
  • Dimitri's temporal PARSeq modifications were reused from the existing pipeline, but the model was switched from PARSeq-Small to PARSeq-Tiny and retrained from the Hugging Face/PARSeq-Tiny checkpoint.
  • PARSeq-Tiny should be sufficient because the target vocabulary is only digits (0-9) plus control token(s), unlike broader OCR character sets.
  • Unseen-number handling via digit-level recognition: supervision remained token-level (0-9 + EOS), not 100 jersey classes. The model learned digit identities and sequence order, so unseen combinations were compositional.
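
The digit-level supervision described above can be illustrated with a small encoding sketch. The token indices, the `<eos>` string, and the `encode`/`decode` names are assumptions for illustration and are not taken from the PARSeq codebase.

```python
# Assumed vocabulary: digits 0-9 followed by one end-of-sequence token.
VOCAB = [str(d) for d in range(10)] + ["<eos>"]
EOS = 10  # assumed index of the end-of-sequence token


def encode(number: str) -> list[int]:
    """Turn a jersey number such as '23' into digit tokens plus EOS."""
    if not (number.isascii() and number.isdigit()):
        raise ValueError(f"not a digit string: {number!r}")
    return [int(ch) for ch in number] + [EOS]


def decode(tokens: list[int]) -> str:
    """Read digit tokens up to (not including) the first EOS."""
    digits = []
    for tok in tokens:
        if tok == EOS:
            break
        digits.append(VOCAB[tok])
    return "".join(digits)
```

Because targets are sequences over 11 tokens rather than one of 100 jersey classes, a number never seen in training (say "47") is still representable as long as the digits 4 and 7 each appeared somewhere.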
@ehzawad
ehzawad / specs.md
Created October 10, 2025 20:56
Specifications

Bangladesh Election Commission NID Chatbot - Complete Specification

System Overview

The Bangladesh Election Commission NID chatbot is a conversational AI system that helps citizens with National Identity Card (NID) and voter registration queries. The system operates in two distinct modes:

  1. Form Mode (9 specialized tags) - Multi-turn conversations requiring country information
  2. FAQ Mode (201 other tags) - Simple question-answer responses
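
A minimal sketch of how a predicted tag might select between the two modes. The tag names, reply strings, and the `respond` helper are invented for illustration; the specification's actual nine form tags are not listed here, so `FORM_TAGS` holds placeholders only.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder tag names; the real set would contain the 9 specialized tags.
FORM_TAGS = frozenset({"hypothetical_form_tag_a", "hypothetical_form_tag_b"})


@dataclass
class Turn:
    mode: str
    reply: str


def respond(tag: str, country: Optional[str], faq_answers: dict[str, str]) -> Turn:
    """Route a classified tag to Form Mode or FAQ Mode."""
    if tag in FORM_TAGS:
        # Form Mode is multi-turn: ask for the country until it is known.
        if country is None:
            return Turn("form", "Which country are you applying from?")
        return Turn("form", f"Continuing '{tag}' for country '{country}'.")
    # FAQ Mode is a single question-answer lookup.
    fallback = "Sorry, I do not have an answer for that."
    return Turn("faq", faq_answers.get(tag, fallback))
```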

Problem Decomposition: Bengali RAG-Based Chatbot for Election Commission

Project Overview

A production-ready Bengali conversational AI system for National Identity Card (NID) and voter registration queries. The system uses semantic search (RAG) with multi-turn form conversations, interruption handling, and state management.

Core Challenge: Build an intelligent chatbot that understands Bengali queries about NID/voter registration, retrieves relevant answers from a knowledge base, and handles complex multi-turn conversations like form filling.

The Tentative Approach

import os
import sys
import numpy as np
import psutil
import torch


def get_memory_info():
    """Get current memory usage of the process."""