# =============================================================================
# Title: Twitter Users Tweets Scraper
# Language: Python
# Description: This script scrapes the first 100 tweets
# of any Twitter user.
# Author: Sasha Bouloudnine
# Date: 2023-08-08
#
# Usage:
# - Make sure you have the required libraries installed by running:
import os
import requests
import html2text
import re
import argparse
OPENAI_API_KEY = 'YOUR_OPEN_AI_API_KEY'
COMPLETION_URL = 'https://api.openai.com/v1/chat/completions'
PROMPT = """Find the main article from this product page, and return from this text content, as JSON format:
import requests
import re
import json
from lxml import html
import time
from retry import retry
import csv
URL = 'https://www.cdiscount.com/search/10/barbecue.html'
from curl_cffi import requests
from lxml import html
import json
import csv
import time
import argparse
HEADERS = {
    'authority': 'www.doctolib.fr',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
import requests
import csv
from lxml import html
import argparse
import time
class YelpSearchScraper:
    def iter_listings(self, url):
        response = requests.get(url)
        if response.status_code != 200:
"""
GrowthHacking.fr Forum Scraper
This script is used to scrape data from the GrowthHacking.fr forum, specifically from the "Scraping" category.
It retrieves information about forum topics and saves it as CSV data.
Usage:
1. Install the required library using the following command:
$ pip install requests