Talk by the two Olas who created the Django Girls group, and an intro to how they felt through the process of becoming Django devs.
github.com/birdsarah/gtimei or something like that
github pandas tutorial recommended.
Glyphs are the shapes in the plots, the shapes of the parts.
bokeh.models is the lowest level.
The Chart class takes data and makes the graph.
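A minimal sketch of the glyph idea using the mid-level bokeh.plotting API (rather than the Chart class; data values made up):

    from bokeh.plotting import figure, output_file, show

    output_file("glyphs.html")
    p = figure(title="glyph demo")
    p.line([1, 2, 3], [4, 6, 5])              # line glyph
    p.circle([1, 2, 3], [4, 6, 5], size=10)   # circle glyphs on top
    show(p)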
Models for parallelization usually use graphs. This one inspects Python code, or something like that.
The idea is to write the dependencies between tasks and then let flowy manage the parallelism.
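flowy's actual API isn't captured in these notes; as a generic illustration of the idea (declare task dependencies, let the runner parallelize the independent ones), using only the standard library:

    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical tasks: b and c both depend on a; d joins them.
    def a(): return 1
    def b(x): return x + 1
    def c(x): return x * 2
    def d(y, z): return y + z

    with ThreadPoolExecutor() as pool:
        ra = pool.submit(a).result()          # a must finish first
        fb = pool.submit(b, ra)               # b and c don't depend on each
        fc = pool.submit(c, ra)               # other, so they run in parallel
        print(d(fb.result(), fc.result()))    # join both branches -> 4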
Presenter @ionelmc
Use check-manifest to check the manifest.
Use graft and exclude directives. Dirty is better than not working.
setuptools > distutils
For bdist, use MANIFEST.in + include_package_data=True. Use of package_data is discouraged.
data_files is a horrible thing to use; the files will get spread all over the place.
Python packaging tools are for libraries, not for applications (configs, pre/post-install actions).
pex can be used to bundle/vendor/embed stuff.
setuptools_scm handles version specification for you
If you import from __main__.py, things will get executed twice because of python -m mypackage.
Always use console_scripts; avoid the setup(scripts=[...]) thing.
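A minimal setup.py pulling these tips together (package and script names hypothetical):

    from setuptools import setup, find_packages

    setup(
        name="mypackage",
        packages=find_packages("src"),
        package_dir={"": "src"},
        include_package_data=True,  # ship what MANIFEST.in grafts
        entry_points={
            "console_scripts": [
                "mycli = mypackage.cli:main",  # instead of setup(scripts=...)
            ],
        },
    )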
extras_require is for things like pip install 'mypackage[pdf]', e.g. extras_require={'pdf': ['reportlab']}. Not recommended for dev dependencies; use tox for those.
Use tox. For managing virtualenvs, you can also use vex and pew; pyenv manages complete interpreters.
extras_require also supports declarative conditional dependencies:

    extras_require={
        ':python_version=="2.6"': ['argparse'],
        ':sys_platform=="win32"': ['colorama'],
    }
can build wheels with dependencies
Coverage for C extensions: export CFLAGS=-coverage
To upload, use twine (twine upload dist/*); it handles uploading metadata and that sort of stuff. PyPI doesn't allow re-uploading distributions anymore.
Versioning has become normalized; note that PEP 440 is not compatible with semver.org.
There is a cookiecutter template with these ideas: cookiecutter-pylibrary.
Explanation about how to avoid memory leaks with circular references and all that. PyPy's garbage collection implementation (two levels of GC), with some probability of getting halted.
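A minimal sketch of the kind of reference cycle in question; refcounting alone never frees it, CPython's cycle collector does:

    import gc

    class Node:
        def __init__(self):
            self.other = None

    a, b = Node(), Node()
    a.other, b.other = b, a   # reference cycle
    del a, b                  # refcounts stay > 0, so no immediate free
    print(gc.collect())       # the cycle collector reclaims them (nonzero count)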
Presented several examples, working on maps. It can be hacked to use websockets.
dask (presumably; noted as "dusk") is for defining stuff as task graphs, making it agnostic of the data, so new computing resources can be used for optimization.
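Assuming this was dask: you declare the computation, dask builds a task graph, and nothing runs until .compute(), which lets the scheduler parallelize it:

    import dask.array as da

    x = da.ones((10000, 10000), chunks=(1000, 1000))  # 100 blocks, nothing computed yet
    total = (x + x.T).sum()                           # only builds the task graph
    print(total.compute())                            # the scheduler runs the graph in parallel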
Lightning talks:
- Cosmic Ray: mutation testing using ast, multiprocessing, etc.
- Instructions: utilities for iterating over sets of data
- Fernando: how to meet the internals in Python; get the source, compile with gdb support and symbols, execute and set breakpoints to trace calls
- Smartfeedz: social media aggregation using tons of current-gen tech (nltk, etc.); going to be open source
- Massage: massage for donations, Tuesday 16:45
- exxtreme: storytelling RPG creator; use Esperanto for your API, that way you know you don't mix different languages
- pygame zero: Pygame with a syntax parser and so on
Guido van Rossum's talk, about how the creation of Django Girls and the creation of Python are similar. The talk is assembled as a Q&A session.
Because it's actually better: easier to teach, new features. Python 2 won't have any updates but security updates.
New in 3.5: faster directory listings (os.scandir), multiply-operator overriding for numpy etc. (@), PEP 492: async def, async for, async with, etc.
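A minimal PEP 492 sketch of the new syntax:

    import asyncio

    async def fetch(delay):          # 'async def' coroutine (PEP 492)
        await asyncio.sleep(delay)   # 'await' replaces 'yield from'
        return delay

    async def main():
        print(await asyncio.gather(fetch(0.1), fetch(0.2)))

    asyncio.get_event_loop().run_until_complete(main())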
Shit happens and you don't have time for everything. There are also non-useful functionalities, backwards-compatibility breakage, obscure edge cases.
New stuff, no idea
PyPy may have hidden incompatibilities.
neo4j, orientdb, arangodb, titandb for graph data.
We usually think of code as text, but graphs are used to optimize it, etc.
https://quantifiedcode.github.io/code-is-beautiful
Cyclomatic complexity measures how complex your code is. It can be used to know how many unit tests you should write.
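A rough sketch of the metric, counting branch points with ast (real tools such as radon do this properly):

    import ast

    BRANCHES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp)

    def cyclomatic(source):
        # 1 + number of decision points: the classic approximation
        return 1 + sum(isinstance(n, BRANCHES) for n in ast.walk(ast.parse(source)))

    print(cyclomatic("def f(x):\n    if x > 0:\n        return 1\n    return -1"))  # 2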
There are city-style graphs showing how complex your code is. You can make complex checks on code based not on how it looks as text but on its structure.
Juju talk. It seems interesting because you can define how services upgrade between versions, apart from what you do in puppet/salt, etc.
jujucharms.com
Talk about tools, already on the internet: https://ox.cx/B. Talks about statsd, graphite, etc.
Explanation of how to recognise objects, and they explain all the internal algorithms.
OpenCV has functions to:
- calibrate the camera (make lines more linear); sketch below
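A hedged sketch of that calibration step with OpenCV's chessboard routines (file name and board size hypothetical; assumes the corners are found):

    import cv2
    import numpy as np

    gray = cv2.imread("board.png", cv2.IMREAD_GRAYSCALE)      # calibration photo
    found, corners = cv2.findChessboardCorners(gray, (9, 6))  # 9x6 inner corners

    # World coordinates of the corners on the z=0 plane.
    objp = np.zeros((9 * 6, 3), np.float32)
    objp[:, :2] = np.mgrid[0:9, 0:6].T.reshape(-1, 2)

    _, mtx, dist, _, _ = cv2.calibrateCamera([objp], [corners], gray.shape[::-1], None, None)
    straight = cv2.undistort(gray, mtx, dist)  # distortion removed: curved lines become linear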
Specify config by dict.
Super basic intro. Example of blocking I/O.
The idea is to create connected services that see each other and form a complete set of services by having them communicate automatically.
Presentation of the MongoDB query framework, finally giving tips on how to limit queries in size.
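A small pymongo sketch of those size-limiting tips (database, collection and field names hypothetical):

    from pymongo import MongoClient

    posts = MongoClient().blog.posts
    cursor = (posts
              .find({"tags": "python"}, {"title": 1, "_id": 0})  # projection trims each document
              .limit(10))                                        # cap the number of documents
    for doc in cursor:
        print(doc["title"])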
Optional typing, using PEP 484 to introduce the functionality. You can put the annotations in a stub file instead of in place.
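For example, the same signature annotated inline versus in a stub (function name hypothetical):

    # Inline PEP 484 annotations:
    def greet(name: str, times: int = 1) -> str:
        return ("Hello %s! " % name) * times

    # Or leave greet.py unannotated and ship an equivalent greet.pyi stub:
    # def greet(name: str, times: int = ...) -> str: ...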
They make a workflow where each team has a user, uploads its packages there, and then creates specific indices mixing them.
devpi can be used as a PyPI for several namespaces, as a mirror, etc. It supports some auth, mainly over the HTTP protocol.
How pylint internals and astroid work.
Sample application. Needs for a distributed click counter:
- Multidatacenter
- Unique counter
- Automatic resize
- Distributed configuration
Tips:
- Rely on the stack
- Use easy to understand tools
- Simple and small tools
- Isolated components
Ideas:
- Microservices: isolated components, really tiny. But if you go really crazy, you will need to communicate everything
Stack:
- nginx + uWSGI
- collector: gathers the HTTP GET and returns the total
- consumer: increments the counter
- code with flask and a gevent loop (toy sketch after this list)
- queues with beanstalkd
- zookeeper / consul / etcd -> consul, because of its k/v store and multi-datacenter support
- there is a uWSGI consul plugin
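A toy sketch of the collector endpoint with Flask (in the talk the increment goes through a beanstalkd queue to the consumer, not a global in memory):

    from flask import Flask

    app = Flask(__name__)
    clicks = 0  # stand-in for the distributed counter

    @app.route("/click")
    def click():
        global clicks
        clicks += 1          # the consumer's job, done inline here
        return str(clicks)   # the collector returns the running total

    if __name__ == "__main__":
        app.run()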
Spark: the resilient distributed datasets (RDD) API; on top of the Spark core you can find Spark SQL, for example.
Can run on top of Mesos.
Use it for all big data.
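A classic RDD word count in PySpark (input path hypothetical):

    from pyspark import SparkContext

    sc = SparkContext("local", "wordcount")          # local mode; could also run on Mesos
    counts = (sc.textFile("input.txt")
                .flatMap(lambda line: line.split())
                .map(lambda w: (w, 1))
                .reduceByKey(lambda a, b: a + b))    # RDD transformations are lazy
    print(counts.take(5))                            # the action triggers execution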
- Used scrapy, Google Maps and IPython to create a map of the flats they wanted
- ramlfications: API creator for RAML. rogue.ly/ramlfications
- attrs: tuples without tuples, overcoming the problem that namedtuples don't actually take the class name into account as a value
- FUD: do not judge projects without trying them
- asyncio fast tutorial: zefciu/lolcats-asyncio
- py3c.readthedocs.org: porting C bindings to Python 3
- Python unconference, Hamburg, September 4-6
- Python tips & tricks: judy2k/stupid-python-tricks
- Marge Simpson is valid Python code
- Do not modify your caller's env, although you can
- an -ish function to check against 'False' and stuff like that
- Python subprocesses do not give back memory (it does, but fragments memory)
- RinohType: a LaTeX-like text processor
- qutebrowser: webkit-qt browser with vim-style shortcuts for clicking links
- vimium, Vimperator: working alternatives
Education and Python, and how well it adapts to showing to students. @MissPhilbin talks basically about the strengths and flaws of Python.
Paxos, BigTable, GFS.
- boost factors on filter facilities
- gauss decay, for example for location
- field_value_factor for taking a real score into account
- random_score to boost new stuff
- these can really boost searches (sketch below)
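A hedged sketch of an Elasticsearch function_score query combining those boosts (index and field names hypothetical), written as a Python dict:

    query = {
        "function_score": {
            "query": {"match": {"title": "python"}},
            "functions": [
                {"gauss": {"location": {"origin": "52.52,13.40", "scale": "10km"}}},
                {"field_value_factor": {"field": "likes"}},
                {"random_score": {}},
            ],
        }
    }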
MapReduce streaming is done through binaries.
Don't use Dumbo or hadoopy. Use Pydoop < Luigi < MRJob.
MRJob has (word-count sketch after this list):
- Super docs
- Integration with Amazon EMR
- Local testing without Hadoop
- Automatic upload to cluster
- Multistep jobs
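The canonical MRJob word count, runnable locally or on EMR:

    from mrjob.job import MRJob

    class MRWordCount(MRJob):
        def mapper(self, _, line):
            for word in line.split():
                yield word, 1

        def reducer(self, word, counts):
            yield word, sum(counts)

    if __name__ == "__main__":
        MRWordCount.run()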
Luigi (task sketch after this list):
- Framework with real workflow
- central scheduler
- task history
- automatic upload to cluster
- integration with snakebite
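A minimal Luigi workflow sketch (file names hypothetical); requires() is what gives you the real workflow graph:

    import luigi

    class Extract(luigi.Task):
        def output(self):
            return luigi.LocalTarget("raw.txt")
        def run(self):
            with self.output().open("w") as f:
                f.write("hello hadoop world\n")

    class Count(luigi.Task):
        def requires(self):
            return Extract()  # the dependency edge in the workflow graph
        def output(self):
            return luigi.LocalTarget("count.txt")
        def run(self):
            with self.input().open() as src, self.output().open("w") as dst:
                dst.write(str(len(src.read().split())))

    if __name__ == "__main__":
        luigi.run(["Count", "--local-scheduler"])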
Pydoop:
- fast but slower than snakebite
- hdfs api based on libhdfs
- implement record reader/writer in python
- implement partitioner in python
- difficult to install, small community, and doesn't upload itself to the cluster
Pig MapReduce:
- faster than plain Python and can be extended with Python
- uses its own DSL
- Pig UDFs can be written in Pig or Jython
For complex workflow organization, job chaining and HDFS manipulation: use Luigi + Snakebite.
For lightning-fast MR jobs (with some difficulties at the beginning): use Pydoop + Pig.
For Amazon EMR and local testing: use MRJob.
https://github.com/maxtepkeev/talks
Documentation is important, and it's important that it be portable, adoptable into the workflow, scalable, and adaptable per project.
First, who is going to be my reader?
Second, what do my readers want to know?
Third, when do my readers need this content? Look at the lifecycle of the software (at the beginning, installation or orientation on how to use it; later, configuration and problem solving).
Fourth, where do my readers consume this content? From terminal? Mobile, on the go?
Fifth, why do my readers even need this content? BAD docs are worse than no docs.
Example about GNOME: missing categories, but ordered by category.
Example: the Arch Linux wiki. What matters is the search functionality; the wiki itself is not really needed.
Example: RHEL OpenStack. Really messy.
DevOps for docs:
- Unified toolchain: use the devs' own toolchain to generate the docs
- Continuous integration for docs
- Iterative authoring
- Curate the content, for example splitting the docs from one big file to many little ones
- Automation: continuous deployment (not really important); automated testing, not only of code examples, but also with tools like hemingway and style plugins
Contribution guidelines: provide templates.
pip install pyladies, the organizer's book, etc. They are full of resources and tips.
You need to have an objective.
How to know if the event was successful?
- First conference: There IS someone else
- Do not overcommit to expectations.
People and resources:
- You are the most important resource because you can make it happen. So you need to have commitment
- Other people to help, organizers etc.
They create a multidimensional space in which to place users and products, and they move users closer to a product if they click and farther away if they don't.
LSHForest ships with scikit-learn, but it's slow.
FLANN is a pain to deploy.
annoy is great, but you can't add points to an existing index.
So they created rpforest, overcoming these limitations.
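For contrast, the frozen-index limitation with annoy (dimensions and data made up):

    from annoy import AnnoyIndex
    import random

    dim = 10
    index = AnnoyIndex(dim, "angular")  # cosine-style distance
    for i in range(1000):
        index.add_item(i, [random.random() for _ in range(dim)])
    index.build(10)  # 10 trees; the index is frozen after this,
                     # which is exactly the "can't add points" limitation
    print(index.get_nns_by_item(0, 5))  # 5 approximate nearest neighbours of item 0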
- multiplayer game dragondemo.net
- deep neural networks on emotions, 3 GPU processing hours
- masterkey: override base-class stuff. jespino/python-master-key
- Codecombat