job_title,url,company,location,remote,salary_string,min_annual_salary_usd,max_annual_salary_usd,salary_currency,countries,country,cities,continents,technology_names,date_posted,seniority,country_codes,created_at,final_url,normalized_title,manager_roles,matching_phrases,matching_words,company_url,company_linkedin,industry,company_size,probability_actual_domain_found,company_url_source,logo
Pleno Engenheiro de Software Java | Kotlin,https://indeed.com/viewjob?jk=3326b0400a6a9485,Impulso,"Remoto, Brazil",false,R$ 13.600 - R$ 14.400 por mês,32979.02,34918.97,BRL,,Brazil,,,"Cloud, Java, AWS, Amazon Web Services, Docker, Spring, Kubernetes, BEM, PostgreSQL, Spring Boot, Kotlin, Postgres",2023-09-11,Mid-Level,BR,2023-09-12T06:32:18.600117+00:00,https://impulso.team/en/profissionais/oportunidade/1938,engenheiro de software,,"É super importante ter conhecimento no uso de feature flags / feature toggles, DDD, Continuous integration e Continuous delivery, qualidade de código e testes automatizados",feature flags,https:
"113375236","https://indeed.com/viewjob?jk=9b8f0538f79216b3","Asistente- Gestión De Documentos","Banco General, S.A.","PROPOSITO DEL CARGO Gestionar, clasificar, digitalizar y ordenar un conjunto de documentos que son necesarios para conservar una base de datos de los clientes; cumpliendo las normas y procedimientos del Banco, el Código de Ética y Valores, las normas y procedimientos de uniforme y de imagen del Banco; y así mantener un control de la producción de información, como de su manejo y edición. FUNCIONES GENERALES Archivo de carpeta o documentos de portafolio de inversión Preparar documentos de cuentas de inversión local o internacional de persona natural o jurídica para el proceso de digitalización Preparar documentos de cuentas bancarias de persona natural o jurídica para el proceso de digitalización Digitalizar cuentas bancarias y cuentas de inversión local o internacional de persona natural o jurídica Rearmado de carpeta de cuentas de inversión local o internacional de persona natural o jurídica
name_text,newsletter_tags_list_option_category_keywords,profile_image_image,starting_price_number,subscribers_number,Slug,click_rate_number,newsletter_freq_option_newsletter_frequency,open_rate_number,status_option_newsletter_status,url_text,newsletter_other_category_text,newsletter_crosspromotion_option_cross_promotions,long_summary_text,owner1_list_user,verified_date,Modified Date,short_summary_text,facebook_url_text,twitter_url_text,newletter_category_option_category,flytier_url_text
Codementor's Newsletter,"['tech', 'productivity']",//s3.amazonaws.com/appforest_uf/f1658204152516x482178286177161700/NewLogo_circle_white%20%281%29.png,3000,471771,codementors-newsletter,0.12,week,0.21,approved,https://www.codementor.io/,Education,i_m_open_to_cross_newsletter_sponsorships0,"🚩 Codementor is a newsletter that helps developers learn and grow in programming knowledge through curated events and community articles.
⌛️ Every week, Codementor sends a newsletter with upcoming virtual events and featured articles by o
xoelop / tinybird-requests.json
Created December 22, 2022 12:31
Retool dashboard to explore queries to any pipe of your Tinybird account
{
"uuid": "60799e3a-81e6-11ed-8e5b-1377772fbf58",
"page": {
"id": 115259211,
"data": {
"appState": "[\"~#iR\",[\"^ \",\"n\",\"appTemplate\",\"v\",[\"^ \",\"isFetching\",false,\"plugins\",[\"~#iOM\",[\"textInputToken\",[\"^0\",[\"^ \",\"n\",\"pluginTemplate\",\"v\",[\"^ \",\"id\",\"textInputToken\",\"type\",\"widget\",\"subtype\",\"TextInputWidget2\",\"namespace\",null,\"resourceName\",null,\"resourceDisplayName\",null,\"template\",[\"^3\",[\"spellCheck\",false,\"readOnly\",false,\"iconAfter\",\"\",\"showCharacterCount\",false,\"autoComplete\",false,\"maxLength\",null,\"hidden\",false,\"customValidation\",\"\",\"patternType\",\"\",\"hideValidationMessage\",false,\"textBefore\",\"\",\"validationMessage\",\"\",\"textAfter\",\"\",\"showInEditor\",false,\"showClear\",false,\"pattern\",\"\",\"tooltipText\",\"\",\"labelAlign\",\"left\",\"formDataKey\",\"{{ self.id }}\",\"value\",\"\",\"labelCaption\",\"\",\"labelWidth\",\"33\",\"autoFill\",\"\",\"placeholder\",\"Enter value\",\"la
xoelop / create_heroku_jobs.py
Created November 8, 2022 17:06
Script to create jobs in the Heroku Scheduler programmatically
import json
import requests
from dotenv import load_dotenv
load_dotenv(override=True)
import os
for i in range(3, 52):
xoelop / download_stripe_invoices.sh
Last active January 25, 2023 15:05
Downloads, as PDFs, all paid Stripe invoices that were collected automatically and created between two dates
# if .env, source .env
if test -f .env; then
source .env &&
echo `date`: sourcing .env
fi
# mkdir if not exists
mkdir -p data/script_invoices
# download invoices created on or after this date (yyyy-mm-dd)
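A minimal Python sketch of the date handling a script like this needs, assuming the invoices are filtered by creation time through Stripe's `created[gte]` / `created[lt]` list parameters, which take Unix timestamps. The function names and the choice of UTC midnight as the day boundary are assumptions for illustration and are not taken from the script.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def date_to_unix(date_str: str) -> int:
    """Convert a yyyy-mm-dd date to a Unix timestamp at 00:00:00 UTC."""
    day = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(day.timestamp())


def created_range(start: str, end: str) -> dict:
    """Build query parameters covering start..end, both days inclusive.

    Hypothetical helper: the end bound is exclusive and set to the
    midnight after `end`, so invoices created during `end` are included.
    """
    start_ts = date_to_unix(start)
    end_ts = date_to_unix(end) + SECONDS_PER_DAY
    if start_ts >= end_ts:
        raise ValueError("start date must not be after end date")
    return {"created[gte]": start_ts, "created[lt]": end_ts}


if __name__ == "__main__":
    print(created_range("2023-01-01", "2023-01-25"))
```

Using an exclusive upper bound avoids having to decide what the last second of the end day is.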
xoelop / Explain SQLAlchemy Postgres queries
Last active August 10, 2022 10:20
Some functions to see the execution plan of a Postgres query emitted by SQLAlchemy
# Source: https://github.com/sqlalchemy/sqlalchemy/wiki/Query-Plan-SQL-construct
# This adds the last function, to print the query plan
# Caveats: stmt has to be built using sqlalchemy.select(...). If you use session.query(...) it'll fail.
# This is Postgres-only
# Guide to migrate to SQLAlchemy 2.0-style (from session.query() to select(...) ): https://docs.sqlalchemy.org/en/14/changelog/migration_20.html#migration-orm-usage
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql.expression import ClauseElement
from sqlalchemy.sql.expression import Executable
xoelop / twitter_theirstack_enrich_result.json
Last active June 23, 2022 16:14
This is the result of enriching 'twitter.com' with TheirStack.com
[
{
"name": "Twitter",
"url": "twitter.com",
"industry": "internet",
"country": "United States",
"employee_count": 8200,
"linkedin_url": "http://www.linkedin.com/company/twitter",
"technology_names": [
"4D",
-- s1 is not used
WITH s1 AS (
SELECT
linkedin_slug
, count(*) OVER (PARTITION BY 1)
, max(updated_at) max_updated_at
FROM person
GROUP BY linkedin_slug
HAVING count(*) > 1
)
xoelop / ingest_data_to_tb_deduplicate.sh
Last active April 23, 2023 19:43
Deduplicating rows on Tinybird almost on ingestion time
# Full process: ingests data from Postgres into Tinybird, calculates duplicates, inserts them into a new datasource, and removes from that datasource the rows that appear in the original one.
source $(pwd)/.env
# related, to ingest data from postgres: https://blog.tinybird.co/2019/10/14/the-one-cron-job-that-will-speed-up-your-analytical-queries-in-postgres-a-hundred-fold/
echo 'Ingesting most recently updated jobs'
psql $HEROKU_POSTGRES_URL -c "COPY (SELECT id, url, job_title, company, description, description_cleaned, date_posted, now() FROM job WHERE COALESCE(description, '') <> '' AND updated_at > now() - interval '70 minutes') TO STDOUT WITH (FORMAT CSV)" | curl -F csv=@- "https://api.tinybird.co/v0/datasources?name=jobs&mode=append&token=$TINYBIRD_ADMIN_TOKEN";
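The "calculates duplicates" step can be illustrated in memory with a short Python sketch. It assumes each ingested row carries an `id` and the ingestion timestamp produced by `now()` in the query above, here called `ingested_at` (a hypothetical field name). It only shows the idea of keeping the newest copy per `id`; it does not call Tinybird and is not the script's actual implementation.

```python
from collections import defaultdict


def split_duplicates(rows):
    """Split rows into (latest, stale).

    latest: the most recently ingested copy of each id.
    stale:  every older copy that a newer row supersedes.
    """
    by_id = defaultdict(list)
    for row in rows:
        by_id[row["id"]].append(row)

    latest, stale = [], []
    for copies in by_id.values():
        # ISO 8601 timestamps in the same timezone sort chronologically as strings.
        copies.sort(key=lambda r: r["ingested_at"])
        latest.append(copies[-1])
        stale.extend(copies[:-1])
    return latest, stale


if __name__ == "__main__":
    rows = [
        {"id": 1, "job_title": "Engineer", "ingested_at": "2023-04-23T10:00:00"},
        {"id": 1, "job_title": "Senior Engineer", "ingested_at": "2023-04-23T11:00:00"},
        {"id": 2, "job_title": "Analyst", "ingested_at": "2023-04-23T10:30:00"},
    ]
    latest, stale = split_duplicates(rows)
    print(len(latest), "current rows,", len(stale), "stale rows")
```

Because the query re-sends every job updated in the last 70 minutes, the same `id` can arrive more than once, which is why the stale copies need to be identified after ingestion.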