Python save cookies with Selenium, Playwright and Requests in Netscape Format
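For orientation, each data line in a Netscape cookie file carries seven tab-separated fields: domain, include-subdomains flag, path, secure flag, expiry (Unix timestamp, `0` for a session cookie), name and value. The sketch below splits one such line into named fields. `NetscapeCookie` and `parse_netscape_line` are hypothetical names used only for illustration, and the sample line is the one from the notebook's docstring.

```python
from typing import NamedTuple


class NetscapeCookie(NamedTuple):
    """Hypothetical container for the seven fields of one cookie line."""
    domain: str
    include_subdomains: bool
    path: str
    secure: bool
    expires: int
    name: str
    value: str


def parse_netscape_line(line: str) -> NetscapeCookie:
    """Split one data line of a Netscape cookie file into named fields."""
    domain, subdomains, path, secure, expires, name, value = (
        line.rstrip("\n").split("\t")
    )
    return NetscapeCookie(
        domain=domain,
        include_subdomains=subdomains == "TRUE",
        path=path,
        secure=secure == "TRUE",
        expires=int(expires),
        name=name,
        value=value,
    )


line = ".example.com\tTRUE\t/\tTRUE\t0\tCloudFront-Key-Pair-Id\tAPKAIAHLS7PK3GAUR2RQ"
cookie = parse_netscape_line(line)
print(cookie.name, cookie.expires)
```

The sketch assumes a well-formed data line. Comment lines (starting with `#`) and blank lines would need to be skipped before calling it.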
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Python save cookies with Selenium, Playwright and Requests in Netscape Format\n",
    "\n",
    "- How to save cookies to a txt file?\n",
    "- How to save cookies in Netscape format?\n",
    "- How to get cookies with Selenium, Playwright and Requests?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Cookies are a list of dicts (`list[dict]`), something like this:\n",
    "\n",
    "```json\n",
    "[\n",
    "  {\n",
    "    \"domain\": \"codigofacilito.com\",\n",
    "    \"expirationDate\": 1721236171.746433,\n",
    "    \"hostOnly\": true,\n",
    "    \"httpOnly\": false,\n",
    "    \"name\": \"ahoy_visitor\",\n",
    "    \"path\": \"/\",\n",
    "    \"sameSite\": \"unspecified\",\n",
    "    \"secure\": true,\n",
    "    \"session\": false,\n",
    "    \"storeId\": \"0\",\n",
    "    \"value\": \"18a3f5d0-fb23-442b-a7b9-de1f08a14e4d\"\n",
    "  },\n",
    "  {\n",
    "    \"domain\": \".codigofacilito.com\",\n",
    "    \"expirationDate\": 1707439819.231845,\n",
    "    \"hostOnly\": false,\n",
    "    \"httpOnly\": false,\n",
    "    \"name\": \"__stripe_mid\",\n",
    "    \"path\": \"/\",\n",
    "    \"sameSite\": \"strict\",\n",
    "    \"secure\": true,\n",
    "    \"session\": false,\n",
    "    \"storeId\": \"0\",\n",
    "    \"value\": \"8f3b3562-a465-4b58-a350-27c8a87dea7465ae39\"\n",
    "  },\n",
    "  {\n",
    "    \"domain\": \".codigofacilito.com\",\n",
    "    \"hostOnly\": false,\n",
    "    \"httpOnly\": false,\n",
    "    \"name\": \"CloudFront-Key-Pair-Id\",\n",
    "    \"path\": \"/\",\n",
    "    \"sameSite\": \"unspecified\",\n",
    "    \"secure\": true,\n",
    "    \"session\": true,\n",
    "    \"storeId\": \"0\",\n",
    "    \"value\": \"APKAIAHLS7PK3GEUR2RQ\"\n",
    "  }\n",
    "]\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def to_netscape_string(cookie_data: list[dict]) -> str:\n",
    "    \"\"\"\n",
    "    Convert cookies to Netscape cookie format.\n",
    "\n",
    "    This function takes a list of cookie dictionaries and transforms them into\n",
    "    a single string in Netscape cookie file format, which is commonly used by\n",
    "    web browsers and other HTTP clients for cookie storage. The Netscape string\n",
    "    can be used to programmatically interact with websites by simulating the\n",
    "    presence of cookies that might be set during normal web browsing.\n",
    "\n",
    "    Args:\n",
    "        cookie_data (list of dict): A list of dictionaries where each dictionary\n",
    "        represents a cookie. Each dictionary should have the following keys:\n",
    "            - 'domain': The domain of the cookie.\n",
    "            - 'expires', 'expiry' or 'expirationDate': The expiration date of\n",
    "              the cookie as a Unix timestamp. A missing or non-positive value\n",
    "              marks a session cookie and is written as 0.\n",
    "            - 'path': The path for which the cookie is valid.\n",
    "            - 'secure': A boolean indicating if the cookie is secure.\n",
    "            - 'name': The name of the cookie.\n",
    "            - 'value': The value of the cookie.\n",
    "\n",
    "    Returns:\n",
    "        str: A string representing the cookie data in Netscape cookie file format.\n",
    "\n",
    "    Example of Netscape cookie file format:\n",
    "        .example.com\tTRUE\t/\tTRUE\t0\tCloudFront-Key-Pair-Id\tAPKAIAHLS7PK3GAUR2RQ\n",
    "    \"\"\"\n",
    "    result = []\n",
    "    for cookie in cookie_data:\n",
    "        domain = cookie.get(\"domain\", \"\")\n",
    "        # The key differs by source: Playwright uses \"expires\", Selenium\n",
    "        # \"expiry\", and browser exports like the sample above \"expirationDate\".\n",
    "        expiration_date = (\n",
    "            cookie.get(\"expires\")\n",
    "            or cookie.get(\"expiry\")\n",
    "            or cookie.get(\"expirationDate\")\n",
    "            or 0\n",
    "        )\n",
    "        path = cookie.get(\"path\", \"\")\n",
    "        secure = cookie.get(\"secure\", False)\n",
    "        name = cookie.get(\"name\", \"\")\n",
    "        value = cookie.get(\"value\", \"\")\n",
    "\n",
    "        # A leading dot means the cookie also applies to subdomains.\n",
    "        include_sub_domain = domain.startswith(\".\")\n",
    "        # Session cookies (no expiry, or -1 from Playwright) are written as 0.\n",
    "        expiry = str(int(expiration_date)) if expiration_date > 0 else \"0\"\n",
    "        result.append(\n",
    "            [\n",
    "                domain,\n",
    "                str(include_sub_domain).upper(),\n",
    "                path,\n",
    "                str(secure).upper(),\n",
    "                expiry,\n",
    "                name,\n",
    "                value,\n",
    "            ]\n",
    "        )\n",
    "    return \"\\n\".join(\"\\t\".join(cookie_parts) for cookie_parts in result)\n",
    "\n",
    "\n",
    "def save_cookies_to_file(\n",
    "    cookie_data: list[dict], file_path: str = 'cookies.txt'\n",
    ") -> None:\n",
    "    \"\"\"\n",
    "    Save cookies to a txt file in Netscape format.\n",
    "\n",
    "    Pass the list of cookie dicts, not an already converted string.\n",
    "    \"\"\"\n",
    "    netscape_string = to_netscape_string(cookie_data)\n",
    "    # Header lines must start at column 0: parsers such as\n",
    "    # http.cookiejar.MozillaCookieJar check the first line of the file.\n",
    "    header = (\n",
    "        \"# Netscape HTTP Cookie File\\n\"\n",
    "        \"# http://www.netscape.com/newsref/std/cookie_spec.html\\n\"\n",
    "        \"# This is a generated file! Do not edit.\\n\"\n",
    "        \"\\n\"\n",
    "    )\n",
    "    with open(file_path, \"w\", encoding=\"utf-8\") as file:\n",
    "        file.write(header)\n",
    "        file.write(netscape_string + \"\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Get cookies with Requests"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "\n",
    "url = \"https://example.com\"\n",
    "response = requests.get(url)\n",
    "# response.cookies holds Cookie objects, not dicts, so convert each one.\n",
    "# Session cookies have expires=None, which is mapped to 0.\n",
    "cookies = [\n",
    "    dict(domain=c.domain, expires=c.expires or 0, path=c.path,\n",
    "         secure=c.secure, name=c.name, value=c.value)\n",
    "    for c in response.cookies\n",
    "]\n",
    "# save cookies to file (pass the list of dicts, not a Netscape string)\n",
    "save_cookies_to_file(cookies, 'cookies.txt')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Get cookies with Playwright"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from playwright.sync_api import sync_playwright\n",
    "\n",
    "with sync_playwright() as p:\n",
    "    browser = p.chromium.launch()\n",
    "    context = browser.new_context()\n",
    "    page = context.new_page()\n",
    "    url = \"https://example.com\"\n",
    "    page.goto(url)\n",
    "    # context.cookies() returns a list of dicts with an \"expires\" key\n",
    "    cookies = context.cookies()\n",
    "    browser.close()\n",
    "\n",
    "# save cookies to file (pass the list of dicts, not a Netscape string)\n",
    "save_cookies_to_file(cookies, 'cookies.txt')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Get cookies with Selenium"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from selenium import webdriver\n",
    "\n",
    "url = 'https://example.com'\n",
    "driver = webdriver.Chrome()\n",
    "driver.get(url)\n",
    "# Selenium reports the expiration under \"expiry\"; copy it to \"expires\",\n",
    "# the key to_netscape_string reads (0 for session cookies).\n",
    "cookies = [{**c, \"expires\": c.get(\"expiry\", 0)} for c in driver.get_cookies()]\n",
    "driver.quit()\n",
    "# save cookies to file (pass the list of dicts, not a Netscape string)\n",
    "save_cookies_to_file(cookies, 'cookies.txt')"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
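One way to check that a saved file really is in Netscape format is to load it back with the standard library's `http.cookiejar.MozillaCookieJar`. The sketch below is a minimal round-trip check under stated assumptions: `NETSCAPE_TEXT` is hypothetical sample content in the layout the notebook writes, and `load_netscape_cookies` is an illustrative helper name, not part of the notebook.

```python
import os
import tempfile
from http.cookiejar import MozillaCookieJar

# Hypothetical sample content: header, blank line, then one cookie per line.
NETSCAPE_TEXT = (
    "# Netscape HTTP Cookie File\n"
    "# This is a generated file! Do not edit.\n"
    "\n"
    ".example.com\tTRUE\t/\tTRUE\t0\tsession_id\tabc123\n"
    "example.com\tFALSE\t/\tFALSE\t4102444800\ttheme\tdark\n"
)


def load_netscape_cookies(file_path: str) -> MozillaCookieJar:
    """Load a Netscape cookie file, keeping session and expired cookies."""
    jar = MozillaCookieJar(file_path)
    # An expiry of 0 marks a session cookie in this file; without
    # ignore_expires the loader would treat it as expired and skip it.
    jar.load(ignore_discard=True, ignore_expires=True)
    return jar


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "cookies.txt")
    with open(path, "w", encoding="utf-8") as file:
        file.write(NETSCAPE_TEXT)
    jar = load_netscape_cookies(path)

# Collect the loaded cookies by name for a quick inspection.
loaded = {cookie.name: cookie.value for cookie in jar}
print(loaded)
```

`MozillaCookieJar` raises `http.cookiejar.LoadError` when the first line is not the expected header, so a successful load is a reasonable sanity check on the file. The resulting jar can also be passed to Requests through the `cookies=` argument of `requests.get`.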