# Naive approach: strips every '&' without treating it as a special XML character
with open('meu_arquivo.xml', 'r') as f:
    content = f.read()
with open('meu_novo_arquivo.xml', 'w') as f:
    f.write(content.replace('&', ''))
# An approach that does handle it: https://www.tjohearn.com/2018/01/24/safe-ampersand-parsing-in-xml-files/
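The snippet above removes every ampersand, including those that belong to valid entity references such as `&amp;`. As a rough alternative, the sketch below escapes only the ampersands that do not start an entity reference. The function name `escape_bare_ampersands` and the regular expression are assumptions made for illustration; this is not necessarily the method described in the linked article.

```python
import re

# Matches an '&' that does NOT begin a named, decimal, or hexadecimal entity
# reference. Assumption: any "&name;" is treated as an entity, even if the
# name is not one of XML's five predefined entities.
_BARE_AMP = re.compile(r'&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)')


def escape_bare_ampersands(text: str) -> str:
    """Replace bare '&' with '&amp;' and leave existing entities untouched."""
    return _BARE_AMP.sub('&amp;', text)


if __name__ == '__main__':
    # Hypothetical sample input, not taken from the original file.
    sample = '<a>Tom & Jerry &amp; friends &#38; co</a>'
    print(escape_bare_ampersands(sample))
```

Unlike `content.replace('&', '')`, this keeps the text content intact and only makes bare ampersands well-formed.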
CREATE EXTENSION fuzzystrmatch;
create table nomes(nome) as
(values
('abelardo'),
('aberlado'),
('aberlardo'),
('jurandir'),
('jurandr'),
('abraão')
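The `fuzzystrmatch` extension provides, among other functions, `levenshtein(source, target)`. To show what that distance means for the sample names above, here is a plain-Python sketch of the classic dynamic-programming edit distance. It is an illustration only, not the extension's implementation, and comparing every name against `'abelardo'` is an assumption about how the table might be queried.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insertion, deletion, and substitution."""
    # prev[j] holds the distance between the first i-1 chars of a and first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


if __name__ == '__main__':
    # Same values as the truncated SQL snippet above.
    nomes = ['abelardo', 'aberlado', 'aberlardo', 'jurandir', 'jurandr', 'abraão']
    for nome in nomes:
        print(nome, levenshtein('abelardo', nome))
```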
create table censo_new (id int, nome text);
create table censo_old (id int, nome text);
insert into censo_new values (1, 'abelardo');
insert into censo_new values (2, 'fulano');
insert into censo_old values (1, 'abelardo vieira mota');
insert into censo_old values (2, 'fulano');
with row_1 as (
    select a.id, j.*
--
CREATE TABLE measurement_np (
    city_id   int not null,
    logdate   date not null,
    peaktemp  int,
    unitsales int
);
-- partitioned
import math
import os

buckets_dir = './all_texts_buckets'
if not os.path.isdir(buckets_dir):
    os.mkdir(buckets_dir)
n_characters = 30000
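The preview above stops before showing how `math` and `n_characters` are used. One plausible reading, and it is only an assumption, is that a long text is cut into buckets of at most `n_characters` characters, with `math.ceil` giving the number of buckets. The helper name `split_into_buckets` is hypothetical.

```python
import math
from typing import List


def split_into_buckets(text: str, n_characters: int = 30000) -> List[str]:
    """Cut text into consecutive chunks of at most n_characters characters."""
    # Assumed use of math: the number of buckets is the length rounded up.
    n_buckets = math.ceil(len(text) / n_characters)
    return [text[i * n_characters:(i + 1) * n_characters]
            for i in range(n_buckets)]


if __name__ == '__main__':
    # Hypothetical input: 70,000 characters split with the snippet's bucket size.
    buckets = split_into_buckets('a' * 70000, 30000)
    print([len(b) for b in buckets])
```

Each bucket could then be written to its own file under `buckets_dir`; how the original does this is not shown.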
import pandas as pd

df = pd.DataFrame({'category': {2829: 'Building',
                                7313: 'Airport',
                                2534: 'SportsTeam',
                                4146: 'Building',
                                2125: 'Food',
                                7977: 'City',
                                8312: 'City',
                                4801: 'Food',
                                723: 'Building',
                                628: 'ComicsCharacter'},
-- since when statistics have been collected, among other things
-- https://www.postgresql.org/docs/9.2/static/monitoring-stats.html#PG-STAT-DATABASE-VIEW
-- nullif avoids a division-by-zero error for databases with no tuple activity yet
select stats_reset, datname, to_char(tup_returned*100.0 / nullif(tup_fetched + tup_returned, 0), '90.00%') as pct_tup_returned,
       tup_fetched, tup_returned, temp_bytes, tup_inserted, tup_updated, tup_deleted,
       xact_commit + xact_rollback as total_transaction, xact_commit, xact_rollback, deadlocks
from pg_stat_database
where datname not like 'template%';
-- statistics about columns
"hot",hot
Harcourt_(publisher),Harcourt (publishers)
African_Americans,D.C. African Americans
Philippines,The country's
12.0,12
Olympic_Stadium_(Athens),The Olympic Stadium (in Athens)
HAL_Light_Combat_Helicopter,HAL light combat helicopters
California_State_Assembly,the California State Assembly
A.S._Gubbio_1910,the club
"Deceased",The
{'Asterix_(comicsCharacter)': {'asterix',
                               'asterix (comics character)',
                               'the comic book character asterix',
                               'the comic character asterix',
                               'the comic character, asterix',
                               'the comic strip character asterix'},
 'Auron_(comicsCharacter)': {'auron',
                             'auron (comics character)',
                             'the comic book character auron',
                             'the comic character auron',
import pandas as pd

ciclo = pd.read_csv('exempo_ciclo.txt', encoding='iso-8859-1', sep='\t')
gps = pd.read_csv('exempolo_GPS.txt', encoding='iso-8859-1', sep='\t')
ciclo['dataCarregamento'] = pd.to_datetime(ciclo['dataCarregamento'], format='%d/%m/%y %H:%M:%S')
ciclo['dataBasculamento'] = pd.to_datetime(ciclo['dataBasculamento'], format='%d/%m/%y %H:%M:%S')
gps['Data-Hora'] = pd.to_datetime(gps['Data-Hora'], format='%d/%m/%y %H:%M:%S')
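The format string `'%d/%m/%y %H:%M:%S'` used above is day-first with a two-digit year, which is easy to misread as month-first. The sketch below checks that reading with the standard library alone. The sample timestamp and the names `FORMATO` and `parse_timestamp` are made up for illustration and do not come from the input files.

```python
from datetime import datetime

# Same format string as in the pandas calls above: day/month/two-digit year.
FORMATO = '%d/%m/%y %H:%M:%S'


def parse_timestamp(value: str) -> datetime:
    """Parse a single timestamp written in the day-first format."""
    return datetime.strptime(value, FORMATO)


if __name__ == '__main__':
    # Hypothetical value: 5 March 2019, not 3 May.
    print(parse_timestamp('05/03/19 14:07:09'))
```

Passing an explicit `format` to `pd.to_datetime`, as the snippet does, avoids relying on pandas to guess between day-first and month-first.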