@cpfiffer
Created July 14, 2025 21:02
Letta tool to retrieve llms.txt files
def fetch_llms_txt(url: str, timeout: int = 10):
    """
    Fetch content from URLs ending in 'llms.txt'.

    Args:
        url (str): The URL to fetch (must end with 'llms.txt')
        timeout (int): Request timeout in seconds (default: 10)

    Returns:
        str: Text content of the llms.txt file

    Raises:
        ValueError: If URL doesn't end with 'llms.txt' or is invalid
        Exception: For network errors, HTTP errors, or other failures
    """
    import requests
    import urllib.parse

    # Validate URL ends with llms.txt
    if not url.lower().endswith('llms.txt'):
        raise ValueError("URL must end with 'llms.txt'")

    # Validate URL format
    parsed_url = urllib.parse.urlparse(url)
    if not parsed_url.scheme or not parsed_url.netloc:
        raise ValueError("Invalid URL format")

    # Set default headers
    default_headers = {
        'User-Agent': 'llms.txt-fetcher/1.0',
        'Accept': 'text/plain, text/*, */*'
    }

    try:
        # Make the request
        response = requests.get(
            url,
            timeout=timeout,
            headers=default_headers,
            allow_redirects=True
        )

        # Check if request was successful
        if response.status_code == 200:
            return response.text
        else:
            raise Exception(f"HTTP {response.status_code}: {response.reason}. The llms.txt file may not exist at this URL or the server is returning an error.")

    except requests.exceptions.Timeout:
        raise Exception(f"Request timed out after {timeout} seconds. Try increasing the `timeout` argument for slower servers.")
    except requests.exceptions.ConnectionError:
        raise Exception("Connection error - could not reach the server. Check the URL and your internet connection.")
    except requests.exceptions.TooManyRedirects:
        raise Exception("Too many redirects. The server may be misconfigured or the URL may be invalid.")
    except requests.exceptions.RequestException as e:
        raise Exception(f"Request failed: {str(e)}. Check the URL format and network connectivity.")
    except Exception as e:
        # Re-raise if it's already our custom exception
        if any(phrase in str(e) for phrase in ['Request timed out', 'Connection error', 'Too many redirects', 'Request failed', 'HTTP ']):
            raise
        else:
            raise Exception(f"Unexpected error: {str(e)}. This may be due to network issues or server problems.")

llms.txt fetcher tool for Letta agents

Description

The fetch_llms_txt function is a specialized HTTP client for fetching llms.txt files from web servers. It's designed to be used as a Letta tool.

The function raises a ValueError, before making any request, if the model passes a URL that does not end in llms.txt (the check is case-insensitive).
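The URL check can be tried without any network access. The following is a minimal sketch that copies the two validation steps from fetch_llms_txt into a hypothetical helper, is_llms_txt_url (not part of the gist), which returns a boolean instead of raising:

```python
import urllib.parse


def is_llms_txt_url(url: str) -> bool:
    """Hypothetical helper mirroring the checks fetch_llms_txt runs before any request."""
    # Suffix check, case-insensitive, applied to the whole URL string
    if not url.lower().endswith("llms.txt"):
        return False
    # Format check: the URL needs both a scheme and a host
    parsed = urllib.parse.urlparse(url)
    return bool(parsed.scheme and parsed.netloc)


print(is_llms_txt_url("https://example.com/llms.txt"))       # True
print(is_llms_txt_url("https://example.com/docs/LLMS.TXT"))  # True
print(is_llms_txt_url("https://example.com/index.html"))     # False
print(is_llms_txt_url("https://example.com/llms.txt?v=2"))   # False
print(is_llms_txt_url("example.com/llms.txt"))               # False
```

Because the suffix test applies to the whole string, a query string or fragment after llms.txt is rejected, while any file whose name merely ends in llms.txt is accepted.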

How It Works

  1. URL Validation: Ensures the provided URL ends with 'llms.txt' and has a valid format
  2. HTTP Request: Makes a GET request with appropriate headers and timeout handling
  3. Response Processing: Returns the raw text content on success (HTTP 200)
  4. Error Handling: Provides detailed error messages for various failure scenarios with actionable suggestions

The function uses standard HTTP headers including a custom User-Agent and accepts text content types. It handles redirects automatically and provides specific error messages for timeouts, connection issues, HTTP errors, and other network problems.
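The response-processing step can be illustrated in isolation. This is only a sketch: handle_response is a hypothetical helper that repeats the status check from fetch_llms_txt, and the SimpleNamespace objects are stand-ins for requests.Response so that the example runs without a network or the requests package.

```python
from types import SimpleNamespace


def handle_response(response) -> str:
    """Hypothetical helper: the status check from fetch_llms_txt, isolated."""
    if response.status_code == 200:
        return response.text
    raise Exception(
        f"HTTP {response.status_code}: {response.reason}. "
        "The llms.txt file may not exist at this URL or the server is returning an error."
    )


# Stand-ins for requests.Response, carrying only the attributes used above
ok = SimpleNamespace(status_code=200, reason="OK", text="# Example\n")
missing = SimpleNamespace(status_code=404, reason="Not Found", text="")

print(repr(handle_response(ok)))
try:
    handle_response(missing)
except Exception as e:
    print(e)
```

Only a final status of exactly 200 counts as success here; any other status, including other 2xx codes, ends up as an exception.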

Key Features

  • Validates that URLs end with 'llms.txt'
  • Configurable timeout with helpful timeout error messages
  • Comprehensive error handling with actionable feedback for your agent
  • Returns plain text content or raises exceptions
  • Handles redirects automatically

JSON Schema

If you add this tool to Letta through the ADE, you can paste the following into the tool's schema:

{
  "type": "object",
  "description": "Arguments for fetching llms.txt files from URLs",
  "properties": {
    "url": {
      "type": "string",
      "description": "The URL to fetch (must end with 'llms.txt')"
    },
    "timeout": {
      "type": "integer",
      "description": "Request timeout in seconds (default: 10)"
    }
  },
  "required": ["url"]
}
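To show how the schema lines up with the function signature, here is a rough sketch that parses a JSON arguments payload and checks it by hand against the two properties. The check_args helper and the sample payload are hypothetical, and this is not a description of how Letta validates tool calls; it is a hand-rolled check using only the standard library, not a full JSON Schema validator.

```python
import json

# The schema above, trimmed to the fields this sketch inspects
SCHEMA = {
    "properties": {
        "url": {"type": "string"},
        "timeout": {"type": "integer"},
    },
    "required": ["url"],
}

# Map the JSON Schema type names used here to Python types
PY_TYPES = {"string": str, "integer": int}


def check_args(raw: str) -> dict:
    """Hypothetical helper: checks required keys and basic types only."""
    args = json.loads(raw)
    for name in SCHEMA["required"]:
        if name not in args:
            raise ValueError(f"missing required argument: {name}")
    for name, spec in SCHEMA["properties"].items():
        if name not in args:
            continue  # optional argument left out; the function default applies
        value = args[name]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, PY_TYPES[spec["type"]]):
            raise ValueError(f"{name} must be of type {spec['type']}")
    return args


args = check_args('{"url": "https://example.com/llms.txt", "timeout": 30}')
print(args)
# The checked arguments match the function's parameters:
# fetch_llms_txt(**args)
```

Since only url is required, a payload without timeout passes the check and the function falls back to its default of 10 seconds.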

Usage Examples

# Basic usage
content = fetch_llms_txt("https://example.com/llms.txt")

# With custom timeout
content = fetch_llms_txt("https://slow-server.com/llms.txt", timeout=30)

# Error handling
try:
    content = fetch_llms_txt("https://example.com/llms.txt")
    print(f"Fetched {len(content)} characters")
except Exception as e:
    print(f"Failed to fetch: {e}")
