Jessica Temporal jtemporal

jtemporal / memory-error-8GB-machine.txt
Created January 30, 2017 16:02
memory-error-8GB-machine
root@rosie-staging-8gb:~/rosie# docker run --rm -v /tmp/serenata-data:/tmp/serenata-data rosie
2017-01-30 14:57:15 Creating the CSV file
2017-01-30 14:57:15 Reading the XML file
2017-01-30 14:57:17 Writing record #3,200 to the CSV
2017-01-30 14:57:17 Done!
2017-01-30 14:57:17 Creating the CSV file
2017-01-30 14:57:17 Reading the XML file
2017-01-30 14:59:22 Writing record #342,225 to the CSV
2017-01-30 14:59:22 Done!
2017-01-30 14:59:22 Creating the CSV file
jtemporal / unicode-error-rosie-DO
Created January 30, 2017 16:31
Got an encode error while running Rosie on a DO 16GB machine
(serenata_rosie) root@rosie-staging-16gb:~/rosie-conda/rosie# python rosie.py run
2017-01-30 16:27:27 Creating the CSV file
2017-01-30 16:27:27 Reading the XML file
Traceback (most recent call last):
  File "rosie.py", line 36, in <module>
    command()
  File "rosie.py", line 23, in run
    rosie.main(target_directory)
  File "/root/rosie-conda/rosie/rosie/__init__.py", line 64, in main
    dataset = Dataset(target_directory).get()
root@rosie-staging-16gb:~/rosie# docker run --rm -v /tmp/serenata-data:/tmp/serenata-data rosie
2017-01-30 16:16:03 Creating the CSV file
2017-01-30 16:16:03 Reading the XML file
2017-01-30 16:16:04 Writing record #3,200 to the CSV
2017-01-30 16:16:04 Done!
2017-01-30 16:16:04 Creating the CSV file
2017-01-30 16:16:04 Reading the XML file
2017-01-30 16:18:07 Writing record #342,225 to the CSV
2017-01-30 16:18:07 Done!
2017-01-30 16:18:07 Creating the CSV file
jtemporal / ipython tests.txt
Created January 30, 2017 17:37
Investigate parser
In [6]: from serenata_toolbox.xml2csv import convert_xml_to_csv
In [7]: convert_xml_to_csv('data/AnoAtual.xml', 'data/AnoAtual.csv')
2017-01-30 17:28:26 Creating the CSV file
2017-01-30 17:28:26 Reading the XML file
---------------------------------------------------------------------------
UnicodeEncodeError Traceback (most recent call last)
<ipython-input-7-87ccd4d5ef66> in <module>()
----> 1 convert_xml_to_csv('data/AnoAtual.xml', 'data/AnoAtual.csv')
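The traceback above is cut off, so the actual cause of the UnicodeEncodeError is not shown. One common cause when writing a CSV is a file handle that falls back to an ASCII encoding (for example under a C/POSIX locale). The sketch below only illustrates that general failure mode and the usual fix of passing an explicit encoding to open(); it is an assumption, not a confirmed diagnosis of convert_xml_to_csv. The row contents, the write_row helper and the file name are hypothetical.

```python
import csv
import os
import tempfile

# Hypothetical record containing non-ASCII characters
row = ["José", "Brasília"]


def write_row(path, encoding):
    """Write a single CSV row to `path` using the given text encoding."""
    # newline="" is what the csv module documentation recommends for writers
    with open(path, "w", newline="", encoding=encoding) as handle:
        csv.writer(handle).writerow(row)


path = os.path.join(tempfile.mkdtemp(), "example.csv")

# An ASCII-encoded handle cannot represent "é" or "í"
try:
    write_row(path, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# An explicit UTF-8 encoding writes the same row without error
write_row(path, "utf-8")
with open(path, newline="", encoding="utf-8") as handle:
    written = next(csv.reader(handle))
```

If the real failure has a different origin (for instance printing to a terminal with an ASCII locale), the fix would be at that layer instead.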
jtemporal / iterparse-ipdb.txt
Created January 30, 2017 19:38
change in iterparse with ipdb
In [8]: convert_xml_to_csv('data/AnoAnterior.xml', 'data/AnoAnterior.csv')
2017-01-30 18:16:40 Creating the CSV file
> /root/anaconda3/envs/serenata_rosie/lib/python3.6/site-packages/serenata_toolbox/xml2csv.py(67)convert_xml_to_csv()
66 import ipdb; ipdb.set_trace()
---> 67 create_csv(csv_file_path, headers)
68
ipdb> n
> /root/anaconda3/envs/serenata_rosie/lib/python3.6/site-packages/serenata_toolbox/xml2csv.py(69)convert_xml_to_csv()
68
jtemporal / compose-up
Created February 10, 2017 16:15
docker compose up --build output
docker-compose up --build
Creating network "docker_default" with the default driver
Building research
Step 1 : FROM jupyter/datascience-notebook:latest
---> e2bd4f58f5e4
Step 2 : LABEL maintainer "Data Science Brigade <[email protected]>"
---> Using cache
---> 955e4715f489
Step 3 : USER root
---> Using cache
jtemporal / .gitignore_global
Created February 11, 2017 13:28
Global gitignore settings
# https://help.github.com/articles/ignoring-files/#create-a-global-gitignore
# Python
*.pyc
# VIM
*.swp
# Ruby
Gemfile.lock
jtemporal / gist:f945f4a188dbeba382eb20ef7670a7b7
Created March 14, 2017 21:36
output from spark first try
$ spark-submit \
> --class "ChamberOfDeputies" \
> --packages com.databricks:spark-xml_2.11:0.4.1 \
> target/scala-2.11/chamber-of-deputies_2.11-1.0.jar \
> ~/Code/serenata/serenata-de-amor/data/AnoAtual.xml
Ivy Default Cache set to: /home/temporal/.ivy2/cache
The jars for the packages stored in: /home/temporal/.ivy2/jars
:: loading settings :: url = jar:file:/usr/local/spark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.databricks#spark-xml_2.11 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
$ python manage.py migrate
Traceback (most recent call last):
  File "/Users/temporal/anaconda3/envs/jarbas/lib/python3.5/site-packages/django/db/backends/base/base.py", line 213, in ensure_connection
    self.connect()
  File "/Users/temporal/anaconda3/envs/jarbas/lib/python3.5/site-packages/django/db/backends/base/base.py", line 189, in connect
    self.connection = self.get_new_connection(conn_params)
  File "/Users/temporal/anaconda3/envs/jarbas/lib/python3.5/site-packages/django/db/backends/postgresql/base.py", line 176, in get_new_connection
    connection = Database.connect(**conn_params)
  File "/Users/temporal/anaconda3/envs/jarbas/lib/python3.5/site-packages/psycopg2/__init__.py", line 130, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
$ python manage.py test
Creating test database for alias 'default'...
Got an error creating the test database: permission denied to create database
Type 'yes' if you would like to try deleting the test database 'test_jarbas', or 'no' to cancel: no
Tests cancelled.