Oleg Kulyk (kami4ka)
🚀 To the Cloud!

kami4ka / amazon_batch_error_handling.js
Last active December 21, 2021 18:08
Amazon batch scraper with error handling. ScrapingAnt API used to get data.
/**
 * Amazon Batch Scraper - create a file with a list of keywords, and all products will be scraped into one CSV file
 *
 * Installation instructions:
 * npm install "@scrapingant/amazon-proxy-scraper"
 * npm install json2csv
 *
 */
const fs = require('fs');
const ProductsScraper = require("@scrapingant/amazon-proxy-scraper");
const { Parser } = require('json2csv');

const API_KEY = '<YOUR_SCRAPINGANT_API_KEY>';
const KEYWORDS_FILE = 'keywords.txt'; // one Amazon search keyword per line

async function main() {
    const keywords = fs.readFileSync(KEYWORDS_FILE, 'utf8').split('\n').filter(Boolean);
    const allProducts = [];
    for (const keyword of keywords) {
        try {
            // NOTE: the scraper's exact constructor/method names are assumed here;
            // check the package README for the installed version
            const scraper = new ProductsScraper({ apiKey: API_KEY, keyword });
            allProducts.push(...await scraper.startScraping());
        } catch (error) {
            // Error handling: log the failed keyword and keep going with the rest of the batch
            console.error(`Failed to scrape "${keyword}":`, error.message);
        }
    }
    fs.writeFileSync('products.csv', new Parser().parse(allProducts));
}

main();
kami4ka / parser.py
Last active October 7, 2022 13:29
Medindia ScrapingAnt parser
import requests
from bs4 import BeautifulSoup

YOUR_API_KEY = '<YOUR_SCRAPINGANT_API_KEY>'

def get_page(page_url):
    # Fetch the page through ScrapingAnt with an Indian proxy and no browser rendering;
    # booleans go as strings, since requests would serialize False as 'False'
    response = requests.get(url='https://api.scrapingant.com/v2/general',
                            params={'browser': 'false', 'url': page_url,
                                    'x-api-key': YOUR_API_KEY, 'proxy_country': 'IN'})
    content = response.content.decode('windows-1252')  # Medindia serves Windows-1252
    return BeautifulSoup(content, 'html.parser')
kami4ka / tech-task.md
Last active January 15, 2025 18:07
Backend Position Technical Task

Backend Position Technical Task

Implement a REST API that allows users to:

  • Look up info for a particular IP address via https://ipwhois.io/ and store it in the DB
  • Respond with the stored lookup info from the DB when the specific IP has already been searched (DB caching)
  • Remove a cached result by IP
  • Auto-expire cache entries after a TTL of 60 seconds, so the cached result for a particular IP address is refreshed at most once every 60 seconds (a minimal sketch of these endpoints follows the requirements list)

Required parts

  • SQL or noSQL database or file
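
A minimal sketch of the caching endpoints, assuming Flask and SQLite as the storage choice; the ipwhois.app JSON endpoint URL and the /lookup route names are illustrative assumptions, not part of the task:

import json
import sqlite3
import time

import requests
from flask import Flask, jsonify

TTL_SECONDS = 60
app = Flask(__name__)
db = sqlite3.connect('lookups.db', check_same_thread=False)
db.execute('CREATE TABLE IF NOT EXISTS lookups (ip TEXT PRIMARY KEY, data TEXT, created_at REAL)')

@app.get('/lookup/<ip>')
def lookup(ip):
    # Serve from the DB cache if the entry is younger than the TTL
    row = db.execute('SELECT data, created_at FROM lookups WHERE ip = ?', (ip,)).fetchone()
    if row and time.time() - row[1] < TTL_SECONDS:
        return jsonify(json.loads(row[0]))
    # Cache miss or stale entry: query the upstream service and refresh the cache
    data = requests.get(f'https://ipwhois.app/json/{ip}').json()
    db.execute('INSERT OR REPLACE INTO lookups VALUES (?, ?, ?)', (ip, json.dumps(data), time.time()))
    db.commit()
    return jsonify(data)

@app.delete('/lookup/<ip>')
def remove(ip):
    # Remove the cached result for an IP on demand
    db.execute('DELETE FROM lookups WHERE ip = ?', (ip,))
    db.commit()
    return '', 204

Note that this sketch enforces the TTL lazily on read; a background sweeper, or a store with native expiry such as Redis EXPIRE, would satisfy the auto-removal requirement literally.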
kami4ka / qa_test_task.md
Last active May 8, 2023 22:12
Systima QA test task
kami4ka / test_task.md
Last active February 19, 2025 09:17
ScrapingAnt Backend Test Task

ScrapingAnt's Backend Test Task

Task Description

You need to implement a web scraping script that extracts product data from the e-commerce website books.toscrape.com and stores the extracted data in a database.

The script should:

  • Scrape product details (title, price, availability, quantity in stock) from multiple pages of a given website.
  • Use asyncio to efficiently fetch data from multiple pages concurrently (see the sketch after this list).
  • Store the extracted data in a database (your choice) deployed as a separate Docker container.
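
A minimal sketch of the concurrent fetching step, assuming aiohttp alongside asyncio and BeautifulSoup for parsing; the page range is illustrative and the database write is stubbed with a print:

import asyncio

import aiohttp
from bs4 import BeautifulSoup

BASE_URL = 'https://books.toscrape.com/catalogue/page-{}.html'

async def scrape_page(session, page_number):
    # Download one catalogue page and extract the listed product fields;
    # quantity in stock ("22 available") lives on each product's detail page,
    # which would be fetched the same way
    async with session.get(BASE_URL.format(page_number)) as response:
        html = await response.text()
    soup = BeautifulSoup(html, 'html.parser')
    products = []
    for article in soup.select('article.product_pod'):
        products.append({
            'title': article.h3.a['title'],
            'price': article.select_one('p.price_color').text,
            'availability': article.select_one('p.instock.availability').text.strip(),
        })
    return products

async def main():
    async with aiohttp.ClientSession() as session:
        # Fetch the first five catalogue pages concurrently
        results = await asyncio.gather(*(scrape_page(session, n) for n in range(1, 6)))
    for page in results:
        for product in page:
            print(product)  # replace with an insert into your chosen database

asyncio.run(main())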