@Ghost---Shadow
Created February 18, 2026 06:07
Step-by-step guide for downloading West Bengal electoral roll PDFs from India's ECI portal (voters.eci.gov.in) and the CEO West Bengal historical portal — including CAPTCHA handling, XHR interception to capture the base64 PDF payload, and Bengali OCR for finding names in scanned rolls.

Download India ECI Electoral Roll PDFs (West Bengal)

Portals

Year                 | URL
2025 (recent rolls)  | https://voters.eci.gov.in/download-eroll
2002 (historical WB) | https://ceowestbengal.wb.gov.in/Roll_ps/<AC_NUMBER>

2025 Roll — voters.eci.gov.in

Form Fields (fill in this order)

  1. State → West Bengal (code S25)
  2. Year of Revision → e.g. 2025
  3. Roll Type → Final Roll 2025
  4. District → e.g. DARJEELING
  5. Assembly Constituency → e.g. 25 - Matigara-Naxalbari
  6. Language → BENGALI

Part Selection

  • Use the Search box to filter by part number (e.g. 259)
  • Check the checkbox next to the matching part
  • ⚠️ The checkbox unchecks itself after every CAPTCHA attempt — always re-check it before clicking Download

CAPTCHA Tips

  • Always zoom in on the CAPTCHA image before reading it
  • Click the ↻ refresh icon to get a new CAPTCHA if unsure
  • A 400 response = wrong CAPTCHA
  • A 200 response = accepted ✓

How the Download Actually Works (API Flow)

1. POST → gateway-voters.eci.gov.in/api/v1/printing-publish/generate-published-pdfs
         ← { "payload": ["<fileId>"], "status": "Success" }

2. XHR GET → gateway-vpd.eci.gov.in/api/v1/ext-printing-publish/get-published-file?fileId=<fileId>
            ← { "payload": "<base64 PDF>", "refId": "<filename>.pdf" }

⚠️ The fileId expires very quickly — capture the XHR response immediately after clicking Download. The two gateways are different: gateway-voters for POST, gateway-vpd for GET.
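
If the JSON body of the second response has been saved (for example, copied from the browser's Network tab), the base64 payload can also be decoded outside the browser. The following is a minimal Python sketch, assuming the response carries exactly the `payload` and `refId` fields shown above. The function name `save_published_pdf` and the `%PDF` sanity check are illustrative additions, not part of the portal's API.

```python
import base64
import json
from pathlib import Path


def save_published_pdf(response_text: str, out_dir: str = ".") -> Path:
    """Decode a saved get-published-file response body and write the PDF.

    Hypothetical helper: assumes the JSON has a base64 string in "payload"
    and a filename in "refId", as in the flow described above.
    """
    data = json.loads(response_text)
    pdf_bytes = base64.b64decode(data["payload"])

    # PDF files begin with the "%PDF" magic bytes; anything else suggests
    # a truncated payload or an error body.
    if not pdf_bytes.startswith(b"%PDF"):
        raise ValueError("payload does not decode to a PDF")

    # Use only the base name of refId so the file stays inside out_dir.
    name = Path(data.get("refId") or "voter_roll.pdf").name
    out_path = Path(out_dir) / name
    out_path.write_bytes(pdf_bytes)
    return out_path
```

Usage: `save_published_pdf(open("response.json", encoding="utf-8").read(), "downloads")`, where `response.json` is whatever file the response body was saved to and `downloads` is an existing directory.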

Step 1 — Set Up XHR Interceptor (run BEFORE clicking Download)

window._fullPdfPayload = null;
window._fullPdfFilename = null;

const origOpen = XMLHttpRequest.prototype.open;
const origSend = XMLHttpRequest.prototype.send;

XMLHttpRequest.prototype.open = function(m, url, ...r) {
  this._url2 = url;
  return origOpen.apply(this, [m, url, ...r]);
};

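XMLHttpRequest.prototype.send = function(b) {
  if ((this._url2 || '').includes('get-published-file')) {
    this.addEventListener('load', function() {
      try {
        // responseText throws unless responseType is '' or 'text', so read response instead
        const data = typeof this.response === 'string' ? JSON.parse(this.response) : this.response;
        // A real PDF payload is a long base64 string; ignore short or error bodies
        if (data && typeof data.payload === 'string' && data.payload.length > 100) {
          window._fullPdfPayload = data.payload;
          window._fullPdfFilename = data.refId || 'voter_roll.pdf';
        }
      } catch (e) {
        console.warn('Could not parse get-published-file response', e);
      }
    });
  }
  return origSend.apply(this, [b]);
};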

Step 2 — Trigger the Save (run after ~5s)

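// Without this guard, atob(null) decodes the string "null" and saves a corrupt file
if (!window._fullPdfPayload) {
  throw new Error('No payload captured yet: re-run Step 1, then click Download again');
}
// Decode the base64 string into raw bytes
const bytes = Uint8Array.from(atob(window._fullPdfPayload), c => c.charCodeAt(0));
const blob = new Blob([bytes], { type: 'application/pdf' });
const a = document.createElement('a');
a.href = URL.createObjectURL(blob);
a.download = window._fullPdfFilename;
document.body.appendChild(a);
a.click();
document.body.removeChild(a);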

2002 Roll — ceowestbengal.wb.gov.in

Navigation

  1. Go to https://ceowestbengal.wb.gov.in/Roll_ps/<AC_NUMBER> (e.g. /Roll_ps/25 for AC 25)
  2. Find the Part row and click the Final Roll link
  3. Solve the CAPTCHA in the dialog that appears

Step 1 — Intercept window.open (run BEFORE solving CAPTCHA)

window.open = function(url) {
  window._capturedPdfUrl = url;
  return null;
};

Step 2 — Fetch and Save

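fetch(window._capturedPdfUrl, { credentials: 'include' })
  .then(r => {
    // Fail loudly instead of saving an error page as a PDF
    if (!r.ok) throw new Error(`Download failed: HTTP ${r.status}`);
    return r.blob();
  })
  .then(blob => {
    const a = document.createElement('a');
    a.href = URL.createObjectURL(blob);
    a.download = '2002_WB_AC025_Part042.pdf'; // rename for your AC and part
    document.body.appendChild(a);
    a.click();
    document.body.removeChild(a);
  });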
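
When fetching several parts, it helps to build the portal URL and the saved filename consistently. This is a small Python sketch that follows the URL pattern and the `2002_WB_AC025_Part042.pdf` naming used above. The helper names `roll_page_url` and `roll_filename` are hypothetical, and the zero-padded naming scheme is only this guide's convention, not something the portal requires.

```python
BASE_URL = "https://ceowestbengal.wb.gov.in/Roll_ps/"


def roll_page_url(ac_number: int) -> str:
    """Return the historical roll listing page for an Assembly Constituency."""
    if ac_number <= 0:
        raise ValueError("AC number must be positive")
    return f"{BASE_URL}{ac_number}"


def roll_filename(ac_number: int, part_number: int, year: int = 2002) -> str:
    """Return a local filename such as 2002_WB_AC025_Part042.pdf."""
    if ac_number <= 0 or part_number <= 0:
        raise ValueError("AC and part numbers must be positive")
    # Zero-pad to three digits so files sort in numeric order.
    return f"{year}_WB_AC{ac_number:03d}_Part{part_number:03d}.pdf"


print(roll_page_url(25))
print(roll_filename(25, 42))
```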

Decoding the 2025 PDF Filename

2025-EROLLGEN-S25-25-FinalRoll-Revision1-BEN-259-WI.pdf
                │   │                        │
                │   │                        └─ Part number (259 = North Bengal University Primary School)
                │   └─ Assembly Constituency number (25 = Matigara-Naxalbari)
                └─ State code (S25 = West Bengal)
Segment   | Meaning
S25       | West Bengal state code
25        | Assembly Constituency number
FinalRoll | Roll type
BEN       | Bengali language
259       | Part number
WI        | Suffix; meaning unconfirmed (possibly Women Inclusive or Ward Index)
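
The segment layout above can be turned into a small parser for sorting or renaming downloaded files. This is a Python sketch whose pattern is inferred from the single sample filename shown here, so it is an assumption that other states, roll types, or languages follow the same shape. `RollFilename` and `parse_roll_filename` are illustrative names.

```python
import re
from typing import NamedTuple


class RollFilename(NamedTuple):
    year: str
    state_code: str
    ac_number: int
    roll_type: str
    revision: str
    language: str
    part_number: int
    suffix: str


# Assumption: pattern derived from one example filename only.
_PATTERN = re.compile(
    r"^(?P<year>\d{4})-EROLLGEN-(?P<state>S\d+)-(?P<ac>\d+)"
    r"-(?P<roll>[A-Za-z]+)-(?P<rev>Revision\d+)"
    r"-(?P<lang>[A-Z]+)-(?P<part>\d+)-(?P<suffix>[A-Z]+)\.pdf$"
)


def parse_roll_filename(name: str) -> RollFilename:
    """Split a 2025-style roll filename into its labelled segments."""
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"unrecognised roll filename: {name!r}")
    return RollFilename(
        year=m["year"],
        state_code=m["state"],
        ac_number=int(m["ac"]),
        roll_type=m["roll"],
        revision=m["rev"],
        language=m["lang"],
        part_number=int(m["part"]),
        suffix=m["suffix"],
    )


info = parse_roll_filename("2025-EROLLGEN-S25-25-FinalRoll-Revision1-BEN-259-WI.pdf")
print(info.state_code, info.ac_number, info.part_number)
```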

Finding Names in Downloaded PDFs (Bengali OCR)

Neither PDF yields usable Bengali text through a text extractor, so pdfplumber and PyPDF2 won't find names in them. Render each page to an image and run EasyOCR with the Bengali model instead.

import fitz  # PyMuPDF (pip install pymupdf)
import easyocr
import numpy as np

reader = easyocr.Reader(['bn'], gpu=False)
doc = fitz.open("voter_roll.pdf")

for i, page in enumerate(doc):
    # Render at 3x zoom for better OCR accuracy
    pix = page.get_pixmap(matrix=fitz.Matrix(3, 3))
    img = np.frombuffer(pix.samples, dtype=np.uint8).reshape(pix.h, pix.w, pix.n)
    results = reader.readtext(img, detail=0)
    for text in results:
        if "নন্দ" in text or "আশীষ" in text or "সৌরদীপ" in text:
            print(f"Page {i+1}: {text}")

⚠️ The 2002 PDF uses a non-Unicode legacy Bengali font — the text layer is garbage. OCR is the only reliable method for both PDFs.
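
A plain substring test can miss a name when the OCR output and the typed search term encode the same Bengali letters differently (for example, a vowel sign stored as one code point versus two). Whether EasyOCR actually emits such variants is not verified here. The sketch below simply normalizes both sides to NFC before comparing, which is a cheap precaution. `normalize_bn` and `find_names` are hypothetical helper names.

```python
import unicodedata
from typing import Iterable, List, Tuple


def normalize_bn(text: str) -> str:
    """Return NFC-normalized text with runs of whitespace collapsed."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def find_names(lines: Iterable[str], needles: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (index, original line) for every line containing a search term."""
    # Drop empty terms: an empty string is a substring of every line.
    wanted = [n for n in (normalize_bn(x) for x in needles) if n]
    hits = []
    for idx, line in enumerate(lines):
        clean = normalize_bn(line)
        if any(n in clean for n in wanted):
            hits.append((idx, line))
    return hits
```

Usage: pass the list returned by `reader.readtext(img, detail=0)` as `lines`, and the names to look for as `needles`.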


AC / Part Number Reference (Darjeeling District)

AC No. | Name               | Notes
23     | Darjeeling         |
24     | Kurseong           |
25     | Matigara-Naxalbari | North Bengal University area
26     | Siliguri           | Called "Siliguri" in 2002 rolls
27     | Phansidewa         |

Part 259 of AC 25 = North Bengal University Primary School
Part 42 of AC 25 = same area in the 2002 roll
