This problem set guides you through building a basic HTTP server from scratch using Python's standard library (primarily the socket module). The goal is to create a server that can handle simple GET requests, parse incoming HTTP requests, and send valid HTTP responses. We'll break it down into "tickets" – small, incremental tasks that build upon each other, simulating a real-world project workflow.
Prerequisites:
- Basic Python knowledge (sockets, strings, file I/O).
- No external libraries are allowed; stick to Python's built-in modules.
- Test your server using tools like `curl` (e.g., `curl http://localhost:8080/`) or a web browser.
Project Structure:
- Create a single Python file, e.g., `simple_http_server.py`.
- The server should run on `localhost` port 8080 by default.
- Handle only GET requests for this project; reject all other methods with a 405 Method Not Allowed response.
- Serve static files from a directory named `www` in the same folder as your script (create it and add some HTML files for testing).
Running the Server:
- Once complete, run your script: `python simple_http_server.py`.
- It should listen indefinitely until interrupted (e.g., Ctrl+C).
Now, let's dive into the tickets. Complete them in order, testing each step before moving on.
Description: Create the foundation for your HTTP server by setting up a TCP socket that listens for incoming connections.
Tasks:
- Import the `socket` module.
- Create a socket object using `socket.socket(socket.AF_INET, socket.SOCK_STREAM)`.
- Bind it to `('localhost', 8080)`.
- Set it to listen with a backlog of at least 5 connections.
- In a loop, accept incoming connections and immediately close them (for now – we'll handle them later).
- Print a message like "Server is listening on port 8080..." when starting.
Testing:
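- Run the script and use `telnet localhost 8080` or `curl http://localhost:8080` in another terminal. The server should accept and close the connection without errors on its side (curl itself may report an empty reply or a reset connection, which is expected at this stage).
- Ensure the server doesn't crash on multiple connections.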
Hints:
- Use `try-except` to handle keyboard interrupts gracefully.
- Remember to call `sock.listen()`.
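The tasks above can be sketched roughly as follows. This is one possible layout, not the required one: the function names `create_listener` and `serve_forever` are hypothetical, and setting `SO_REUSEADDR` is an extra assumption (not asked for by the ticket) that makes quick restarts less likely to fail with "address already in use".

```python
import socket

HOST, PORT = "localhost", 8080  # defaults taken from the project description


def create_listener(host=HOST, port=PORT, backlog=5):
    """Return a TCP socket bound to (host, port) and listening."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Assumption: allow rebinding the port right after a restart.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(backlog)
    return sock


def serve_forever(sock):
    """Accept connections and close them immediately until Ctrl+C."""
    print(f"Server is listening on port {sock.getsockname()[1]}...")
    try:
        while True:
            conn, addr = sock.accept()
            conn.close()  # nothing to do with the connection yet
    except KeyboardInterrupt:
        print("\nShutting down.")
    finally:
        sock.close()
```

Calling `serve_forever(create_listener())` at the bottom of the script would start the server; keeping the two steps separate makes it easier to swap in real connection handling later.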
Description: Extend the server to read data from accepted connections instead of just closing them.
Tasks:
- In the accept loop, use `conn.recv(1024)` to read up to 1024 bytes of data from the client.
- Decode the received bytes as UTF-8 and print the raw request to the console.
- Send a dummy response back: just the bytes `"HTTP/1.1 200 OK\r\n\r\nHello, World!"`.
- Close the connection after sending.
Testing:
- Use `curl http://localhost:8080` and check if the server prints the request (e.g., "GET / HTTP/1.1...") and curl receives "Hello, World!".
- Try accessing from a browser; you should see "Hello, World!".
Hints:
- The header section of an HTTP request ends with `\r\n\r\n`, but for now, just read once.
- Ensure your response has the correct line endings: `\r\n`.
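A minimal sketch of the per-connection logic for this ticket might look like the following. The helper name `handle_connection` and the constant `DUMMY_RESPONSE` are hypothetical, and decoding with `errors="replace"` is an added assumption so that non-UTF-8 bytes do not crash the server.

```python
DUMMY_RESPONSE = b"HTTP/1.1 200 OK\r\n\r\nHello, World!"


def handle_connection(conn):
    """Read one request, print it, send the dummy response, and close."""
    try:
        data = conn.recv(1024)  # a single read, as the ticket suggests
        # Assumption: replace undecodable bytes instead of raising.
        print(data.decode("utf-8", errors="replace"))
        conn.sendall(DUMMY_RESPONSE)
    finally:
        conn.close()
```

In the accept loop, this would replace the immediate `conn.close()` from the previous ticket. `sendall` is used rather than `send` so the whole response is written even if the socket accepts it in pieces.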
Description: Start parsing the incoming HTTP request by extracting the method, path, and HTTP version from the first line.
Tasks:
- Split the received data by `\r\n` to get lines.
- Take the first line (the request line) and split it by spaces, e.g., `["GET", "/", "HTTP/1.1"]`.
- Store these in variables: `method`, `path`, `version`.
- If the method is not "GET", send a response: `"HTTP/1.1 405 Method Not Allowed\r\n\r\n"`.
- For GET, continue sending the dummy "Hello, World!" response.
- Handle cases where the request is malformed (e.g., fewer than 3 parts) by sending `"HTTP/1.1 400 Bad Request\r\n\r\n"`.
Testing:
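- `curl -X GET http://localhost:8080/` should work.
- `curl -X POST http://localhost:8080/` should return 405.
- Send invalid requests via telnet (e.g., a request line with fewer than three parts, such as "GET /") and check for 400. A line like "INVALID / HTTP/1.1" has three parts and an unsupported method, so it should return 405 instead.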
Hints:
- Use `str.split()` carefully; strip any extra whitespace.
- Assume paths start with `/`.
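One way to structure the request-line parsing is sketched below. The names `parse_request_line` and `choose_response` are hypothetical, and treating any request line that does not have exactly three parts as malformed is a slightly stricter choice than the "fewer than 3 parts" rule in the tasks.

```python
def parse_request_line(raw_request):
    """Return (method, path, version) from the first line, or raise ValueError."""
    request_line = raw_request.split("\r\n")[0]
    parts = request_line.strip().split()
    # Assumption: anything other than exactly three parts is malformed.
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {request_line!r}")
    method, path, version = parts
    return method, path, version


def choose_response(raw_request):
    """Map a raw request string to the response bytes described in this ticket."""
    try:
        method, path, version = parse_request_line(raw_request)
    except ValueError:
        return b"HTTP/1.1 400 Bad Request\r\n\r\n"
    if method != "GET":
        return b"HTTP/1.1 405 Method Not Allowed\r\n\r\n"
    return b"HTTP/1.1 200 OK\r\n\r\nHello, World!"
```

Raising `ValueError` from the parser and translating it to a 400 in one place keeps the status-code decisions together, which helps when more checks are added later.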
Description: Parse the HTTP headers from the request to extract useful information like Host, User-Agent, etc.
Tasks:
- After splitting lines, collect headers starting from the second line (index 1, right after the request line) until an empty line.
- Parse each header line as `key: value` (split on the first `:`).
- Store headers in a dictionary (keys lowercase for consistency, e.g., `headers['host']`).
- For now, just print the headers dictionary to the console.
- If there are no headers or a header is malformed (e.g., no colon), respond with 400 Bad Request.
- Continue with the dummy response for valid GET requests.
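The header-parsing tasks above could be sketched like this. The function name `parse_headers` is hypothetical; it raises `ValueError` for the cases the tasks map to 400 Bad Request, leaving the caller to send the actual response.

```python
def parse_headers(raw_request):
    """Return a dict of lowercased header names to values, or raise ValueError."""
    lines = raw_request.split("\r\n")
    headers = {}
    for line in lines[1:]:  # skip the request line at index 0
        if line == "":      # the first empty line ends the header section
            break
        key, sep, value = line.partition(":")  # split on the first colon only
        if not sep or not key.strip():
            raise ValueError(f"malformed header line: {line!r}")
        headers[key.strip().lower()] = value.strip()
    if not headers:
        raise ValueError("request has no headers")
    return headers
```

Using `str.partition` rather than `str.split(":")` keeps values that themselves contain colons intact, such as `Host: localhost:8080`.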
Testing:
- Use `curl -v http://localhost:8080/` to see headers in the request; check if your server prints them correctly.
- Test with custom headers: `curl -H "X-Test: value" http://localhost:8080/`.
Hints:
- Headers end at the first empty line (`""` after splitting).
- Strip whitespace from keys and values.
Description: Implement basic routing for GET requests and construct proper HTTP responses.
Tasks:
- For path `/`, respond with `"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 13\r\n\r\nHello, World!"`.
- For path `/echo`, respond with the raw request as the body (calculate Content-Length accordingly).
- For any other path, respond with `"HTTP/1.1 404 Not Found\r\nContent-Type: text/plain\r\nContent-Length: 9\r\n\r\nNot Found"`.
- Ensure responses include at least a status line, Content-Type, Content-Length, and a blank line before the body.
Testing:
- `curl http://localhost:8080/` → "Hello, World!"
- `curl http://localhost:8080/echo` → The full request echoed back.
- `curl http://localhost:8080/invalid` → "Not Found" with 404.
Hints:
- Calculate Content-Length as `len(body.encode('utf-8'))`.
- Use `\r\n` for all line endings.
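A small response builder plus a routing function is one way to cover these tasks. The names `build_response` and `route` are hypothetical; the status lines, headers, and bodies are the ones listed in the tasks.

```python
def build_response(status, body, content_type="text/plain"):
    """Assemble status line, headers, blank line, and body into bytes."""
    body_bytes = body.encode("utf-8")
    head = (
        f"HTTP/1.1 {status}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body_bytes)}\r\n"  # length in bytes, not characters
        "\r\n"
    )
    return head.encode("utf-8") + body_bytes


def route(path, raw_request):
    """Pick a response for a GET request based on its path."""
    if path == "/":
        return build_response("200 OK", "Hello, World!")
    if path == "/echo":
        return build_response("200 OK", raw_request)
    return build_response("404 Not Found", "Not Found")
```

Computing Content-Length from the encoded body matters once a body contains non-ASCII text, where the byte count and the character count differ.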
Description: Enhance the server to serve files from a www directory for GET requests.
Tasks:
- Create a `www` folder with an `index.html` containing `"<h1>Welcome!</h1>"`.
- If the path is `/`, serve `www/index.html`.
- For other paths like `/style.css`, serve `www/style.css` if it exists.
- Determine Content-Type based on file extension: `.html` → text/html, `.css` → text/css, `.txt` → text/plain, default to application/octet-stream.
- Read the file in binary mode, calculate Content-Length, and send in the body.
- If the file is not found, send 404 as before.
- Handle directory traversal attempts (e.g., `/../`) by normalizing the path and checking that it is within `www`.
Testing:
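- Add files to `www`: index.html, test.txt.
- `curl http://localhost:8080/` → HTML content.
- `curl http://localhost:8080/test.txt` → Text content.
- `curl http://localhost:8080/missing` → 404.
- Try `curl --path-as-is http://localhost:8080/../secret` → Should 404 or 400. (Without `--path-as-is`, curl collapses the `/../` itself and the server never sees it.)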
Hints:
- Use `os.path` for safe path handling: `os.path.normpath`, `os.path.join`.
- Import `os` and check `os.path.exists`.
- For Content-Type, use a simple dict mapping extensions.
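The file lookup and traversal check might be sketched as follows. The helper names (`content_type_for`, `resolve_path`, `load_file`) are hypothetical, and the `root` parameter defaults to a `www` directory relative to the current working directory, which is an assumption; a real script may prefer to anchor it to the script's own folder.

```python
import os

CONTENT_TYPES = {".html": "text/html", ".css": "text/css", ".txt": "text/plain"}


def content_type_for(file_path):
    """Look up the Content-Type by extension, with a binary default."""
    ext = os.path.splitext(file_path)[1].lower()
    return CONTENT_TYPES.get(ext, "application/octet-stream")


def resolve_path(url_path, root="www"):
    """Return the absolute file path for url_path, or None if it escapes root."""
    root = os.path.abspath(root)
    relative = "index.html" if url_path == "/" else url_path.lstrip("/")
    candidate = os.path.normpath(os.path.join(root, relative))
    # After normalization, the path must still sit inside the root directory.
    if candidate != root and not candidate.startswith(root + os.sep):
        return None
    return candidate


def load_file(url_path, root="www"):
    """Return (body_bytes, content_type), or None if missing or outside root."""
    file_path = resolve_path(url_path, root)
    if file_path is None or not os.path.isfile(file_path):
        return None
    with open(file_path, "rb") as f:
        return f.read(), content_type_for(file_path)
```

A `None` result from `load_file` can be mapped to the 404 response from the previous ticket. `os.path.isfile` is used instead of `os.path.exists` here so that a request for a directory is also treated as not found.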
Description: Make the server more robust with better error handling and logging.
Tasks:
- Wrap the connection handling in a try-except to catch exceptions (e.g., socket errors) and log them.
- Log each request: e.g., print(f"{method} {path} - {status_code}").
- Ignore request bodies for now (GET requests should not carry one).
- Limit recv size and handle partial reads if needed (but for simple requests, 1024 should suffice).
- Add a Server header to responses: e.g., `Server: SimplePythonServer`.
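If a single `recv(1024)` turns out to be too small, partial reads can be handled with a loop along these lines. This is a sketch under stated assumptions: the name `read_request` and the 8192-byte cap are hypothetical choices, not values from the ticket.

```python
def read_request(conn, max_bytes=8192, chunk_size=1024):
    """Read until the blank line that ends the headers, EOF, or max_bytes."""
    data = b""
    while b"\r\n\r\n" not in data and len(data) < max_bytes:
        chunk = conn.recv(chunk_size)
        if not chunk:  # the client closed the connection early
            break
        data += chunk
    return data
```

The cap keeps a misbehaving client from making the server buffer an unbounded amount of data; a request that reaches the cap without a blank line can then be answered with 400.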
Testing:
- Simulate errors: e.g., close connection mid-request via telnet.
- Check console logs for each request.
Hints:
- Use Python's `logging` module for better logs if you want, but print is fine.
- Ensure the server doesn't crash on bad inputs.
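The error handling, logging, and Server header could be combined roughly as below. All names here (`log`, `add_server_header`, `safe_handle`) are hypothetical, and the sketch assumes the connection handler returns a `(method, path, status_code)` tuple for logging.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("simple_http_server")  # hypothetical logger name


def add_server_header(response, server_name="SimplePythonServer"):
    """Insert a Server header right after the status line of a raw response."""
    status_line, sep, rest = response.partition(b"\r\n")
    return status_line + sep + f"Server: {server_name}\r\n".encode("ascii") + rest


def safe_handle(conn, handler):
    """Run handler(conn), log the outcome, and always close the connection."""
    try:
        # Assumption: handler returns (method, path, status_code).
        method, path, status_code = handler(conn)
        log.info("%s %s - %s", method, path, status_code)
    except Exception:
        # Log the traceback but keep the accept loop alive.
        log.exception("error while handling connection")
    finally:
        conn.close()
```

Catching `Exception` rather than a bare `except` leaves `KeyboardInterrupt` free to reach the outer loop, so Ctrl+C still stops the server.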
Description: (Optional) Parse and use query parameters in requests.
Tasks:
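- Split the path by `?` to get the base path and query string.
- Parse the query string into a dict (e.g., `name=Alice&age=30` → `{'name': 'Alice', 'age': '30'}`).
- For `/echo?param=value`, include the params in the echoed response.
- Use `urllib.parse.parse_qs` (import `urllib.parse`). Note that it returns a list of values per key (e.g., `{'name': ['Alice']}`), so take the first element of each list to get the flat dict above.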
Testing:
- `curl "http://localhost:8080/echo?name=test"` → Response includes "name: test". (Quote the URL so the shell does not treat `?` as a wildcard.)
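The query handling can be sketched in a few lines. The function name `split_query` is hypothetical, and keeping only the first value for a repeated key is an assumption made to match the flat dict in the tasks.

```python
from urllib.parse import parse_qs


def split_query(path):
    """Split a request path into (base_path, params) with single-valued params."""
    base_path, _, query_string = path.partition("?")
    # parse_qs maps each key to a list of values; keep the first one only.
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    return base_path, params
```

For example, `split_query("/echo?name=Alice&age=30")` gives `("/echo", {"name": "Alice", "age": "30"})`, and the base path can then be passed to the routing logic unchanged.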
Once all tickets are complete, you should have a functional HTTP server!