ChillarAnand · May 9, 2016 13:42
diff --git a/pyvideo-log.txt b/pyvideo-log.txt
 WARNING:root:conference: chipy
 WARNING:root:[{'file_path': 'data/chipy/videos/post-djangocon-an-overview-of-edx.json', 'summary': "| edx is a major django application serving huge numbers of students for\nMIT, Harvard, Stanford, Berkely, and more.\n| - A brief history of Computer-Based Instruction (python has a role); -\nincomplete survey of current open-source CBI; - edX: how's it different\n/ what's it's rough structure, what (besides django/python) is involved;\n- edX: hacking the platform (django development); - edX: hacking\ncourses; a deployment-level VM, and how to get started there; - finally:\nfuture topics: deployment; what this can't do (maybe) and why; - wrapup:\ncall for interest & edx project night(s);\n\nI'll try to have some USBs for anyone who want to try one of the edX VMs\nduring the talk\n", 'error': 'Line block ends without a blank line.'}]
 WARNING:root:conference: djangocon-2012
 WARNING:root:[{'file_path': 'data/djangocon-2012/videos/using-celery-with-social-networks.json', 'description': 'Twitter conditionally rate limits based on IP address rather than access\ntoken even when one is provided for some of its API calls. Facebook has\nat least 10 unique error messages to indicate a bad or expired access\ntoken (that I\'ve found so far). LinkedIn\'s pagination has an occasional\noff-by-one bug resulting in an endless list of 1-user pages. Let\'s face\nit: interfacing with social networks is tricky. Celery helps, but to\nprovide stable, reliable, and fast social features for your website,\nyou\'ll need an arsenal of strategies and tools to get you the rest of\nthe way there.\n\nBy the end of this talk, you\'ll understand how to set up tasks to\nquickly serve users with massive networks by employing intelligent\ndistribution. You\'ll be able to design robust processes to handle\ninconsistencies or instabilities in 3rd party APIs. And you\'ll know how\nto have confidence that the work you intend to do gets done, regardless\nof external rate limits, pagination design, or API call dependency\nchains.\n\nThis talk is intended for people who have basic familiarity with celery\nand would like to learn more about how to take advantage of it for\nlarge, distributed task loads.\n\nOutline\n-------\n\nI. Intro\n\nA. 3rd party interfaces are hard\n\n::\n\n      * Speed\n\n        * Much slower than local data\n        * Users may still expect near-immediate results\n\n      * Rate limits\n\n        * Different rules for every service\n        * Need to handle reactive & proactive as some don\'t publish rates\n\n      * Instability\n\n        * Outages (yes, Facebook does go down)\n        * Random failures\n\nB. Why Celery?\n\n::\n\n      * Asynchronous\n      * Distributed\n      * Fault tolerant\n\nII.  Task Organization\n\n     A. Small, atomic tasks (1 API call per task) B. 
Minimal message\n     state\n\n     -  Primitive types only (no model instances!)\n     -  Defer as much data access to the task itself as possible\n\n     C. Create Task subclasses for common patterns D. Whenever possible,\n     make tasks idempotent\n\nIII. Task Distribution\n\n     A. Managing pagination\n\n     ::\n\n         * For a known set size\n\n           * Where limit/offset is supported, launch all page tasks simlutaneously\n           * Otherwise, 1 page launches the next as soon as the next cursor is obtained\n\n         * For an unknown set size\n\n           * Set max simultaneous pages\n           * Task is terminal if blank, otherwise launches page w/ offset + max pages\n\n         * Setting page size is an art, not a science\n\n           * Minimize the number of api calls when possible\n           * Avoid long-running tasks by setting a timeout ceiling\n           * Avoid the temptation to pass API data to dependent tasks\n\n     B. Tracking task dependencies ("Done?" is difficult for distributed\n     systems)\n\n     ::\n\n         * Use an external backend to store a dependency tree\n         * Subclass ResultSet to evaluate the task state of the tree\n         * Requires ignore_result=False\n\nIV.  Rate Limiting\n\n     A. Problems\n\n     -  Celery\'s rate limiting doesn\'t do what you think it does\n     -  3rd party rate limits depend on many factors\n\n     B. Solution\n\n     -  For services with known rate limits:\n\n        -  Use an external backend to store rate limit counters\n        -  Increment counters based on rate limit factors per api call\n\n     -  For services with unknown rate limits:\n\n        -  Use an external backend to store rate limit backoff counters\n        -  Ramp up / ratchet down call rate by power law as api calls\n           fail/succeed\n\nV.   Failover\n\nA. Problems\n\n::\n\n      * Celery\'s countdown doesn\'t do what you think it does\n      * 3rd parties can fail in lots of "interesting" ways\n\nB. 
Solution\n\n::\n\n      * Implement native RabbitMQ alternative to countdown\n      * Create task base classes per social network to handle error conditions\n\nVI.   Multiple queues\n\n      A. Better control over task priority management & resource\n      distribution B. Not all social accounts are created equal\n      (handling whales & spikes) C. When you can\'t stream updates, use a\n      trickle queue\n\nVII.  Celerybeat considered harmful\n\n      A. Periodic task persistence gets out of sync with code B. Just 1\n      more process to manage C. Cron: it\'s just. not. that. hard.\n\nVIII. Debugging\n\n      A. Don\'t use "always eager" B. Logging, logging, logging C. Unit\n      tests are good, but integration tests save lives\n\nIXIV. Gotchas\n\n      A. Open socket prevents Celery soft timeout B. Celery soft timeout\n      doesn\'t retry the task C. If result state is not known, Celery\n      reports "PENDING"\n\n\n', 'error': 'Enumerated list start value not ordinal-1: "B" (ordinal 2)'}]
 WARNING:root:conference: djangocon-2012
 WARNING:root:[{'file_path': 'data/djangocon-2012/videos/cryptography-for-django-applications.json', 'description': "Introduction\n============\n\nThe web is a hostile place, and isn't showing any signs of becoming less\nso. In order to mitigate this, many developers turn to cryptography.\nUnfortunately, cryptography can be complicated, and is easily\ncircumvented if not properly handled. This presentation will provide an\nintroduction to cryptographic tools available to Python/Django\napplications, appropriate use cases for each, proper usage, and\noperational concerns necessary to operate in a certified environment.\nFinally, we will also demonstrate a reusable application that wraps this\nall up, providing secure key-management capabilities to a running Django\nenvironment via the Django admin.\n\nWhy Encrypt?\n============\n\nRules of Encryption\n===================\n\n-  Don't do it if you don't need it.\n-  Don't write your own.\n-  Understand what you're doing if you do.\n\nWhen to encrypt?\n================\n\nUnderstand what you're protecting\n---------------------------------\n\n-  Data\n-  User records\n-  Code\n-  Systems\n\nUnderstand your attack vectors\n------------------------------\n\n-  Data (backups, revision control)\n-  Systems\n-  Application\n-  Transport\n-  Client\n\nUnderstand the types of encryption you might use:\n-------------------------------------------------\n\n-  Hashing\n\nPasswords are a special case. Use a key derivation function\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n-  PBKDF2 – Upgrade to Django 1.4!\n\nAlgorithms\n~~~~~~~~~~\n\n-  MD5 - fine as a checksum. not fine as a cryptographic hash.\n-  SHA1 - fine as a checksum. becoming less fine as a cryptographic hash\n   every day\n-  SHA2 - so far so good. 
use as many bits as you can handle.\n\nSymmetric Encryption\n--------------------\n\n-  Fast\n-  Reversible\n\nAlgorithms\n~~~~~~~~~~\n\n-  Caesar Cipher (for fun puzzles)\n-  DES (don't use)\n-  AES (certified)\n-  Blowfish\n\nAsymmetric Encryption\n---------------------\n\n-  Slow\n-  One-way\n\nAlgorithms\n~~~~~~~~~~\n\n-  RSA\n-  DSA\n\nUses\n^^^^\n\nSigning\n'''''''\n\nWeb of Trust\n            \n\n-  PGP\n\nPKI\n   \n\nEncryption\n''''''''''\n\n-  PGP\n-  SSL\n-  TLS\n\nDoing it right\n==============\n\nUse known-good algorithms\n-------------------------\n\n-  AES-256\n-  SHA2\n-  RSA\n-  DSA\n\nUse known-good implementations\n------------------------------\n\n-  Open Source is good\n\nExtra Credit\n------------\n\n-  FIPS 140 certified implementations\n-  FIPS 140 / NIST configurations\n\nTransport (always use HTTPS)\n----------------------------\n\n-  Use good algorithms AES-256\n\nAt Rest (insecure servers or backups)\n-------------------------------------\n\n-  Understand the ramifications of key management\n\nExamples\n========\n\nHashing\n-------\n\n-  Use a key-derivation function\n\nDon't be linked-in\n~~~~~~~~~~~~~~~~~~\n\n-  Salt your hashes (with a secret).\n-  Salt and pepper your hashes if possible (with a known unique value)\n\nSSL\n---\n\n-  Forced connections\n-  Making the application aware\n-  Hardened cipher selection\n\nRobust PKI\n~~~~~~~~~~\n\n-  Client authentication\n-  SSL Test Page\n\nAsymmetric Encryption\n---------------------\n\nKey Management\n~~~~~~~~~~~~~~\n\n-  Using GPG Agent\n-  GPG Manager App\n\nPGP Files\n~~~~~~~~~\n\nSymmetric Encryption\n--------------------\n\nKey Management\n~~~~~~~~~~~~~~\n\n-  Use Asymmetric Encryption\n\nUse a unique Initialization Vector if possible\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n-  LoopBack Devices\n\n", 'error': 'Duplicate implicit target name: "algorithms".'}]
 WARNING:root:conference: djangocon-2013
 WARNING:root:[{'file_path': 'data/djangocon-2013/videos/database-sharding-with-django.json', 'summary': "Database Sharding with Django Ash Christopher Wednesday 4 p.m.--4:45\np.m.\n\nAudience level: Intermediate\n\nDescription\n\nDjango works great for single database webapps, but did you know they\ngive us the tools to create webapps with sharded data right out of the\nbox? This talk will go over how we leveraged Django's features to add\nsharding to our webapp infrastructure. Abstract\n\nTopics to cover: 1. Why do you need to shard your data? lots of data\nlots of writes\n\n2.  What should you do before sharding? Scale up - throw money at it.\n    Feature partitioning - split feature data on different databases\n    Sharded databases.\n\n3.  How to set up multiple databases in Django tools that we used at\n    Wave. talk about what Django gives us right out of the box (DB\n    Routers) How database routers work. Gotchas when using South for DB\n    migrations.\n\n4.  The downsides of multi-database systems. what we lose (ForeignKeys,\n    Transactions management, select\\_related(), prefetch\\_related())\n\n5.  Scaling the database via sharding. what does this mean? how do I\n    pick a key to shard on? what makes a good and bad sharding key?\n\n6.  Database routers for sharding.\n7.  New tooling that needs to be written to deal with database\n    migrations. deal with database migrations\n\n8.  How to interact with your data through the ORM when it's sharded.\n9.  Need to start using Globally Unique PK's go over a few different\n    strategies used by other companies. strategy for unique ID\n    generation in Python.\n\n10. Transactions + Sharded data. transactions are only useful on\n    same-shard\n\n11. Balancing sharded data across servers we chose multiple DB's per\n    node. make db migration a DevOps task. downsides (limited db\n    connections).\n\n\n", 'error': 'Enumerated list start value not ordinal-1: "2" (ordinal 2)'}]
 WARNING:root:conference: europython-2011
 WARNING:root:[{'file_path': 'data/europython-2011/videos/objects-and-classes-in-python-and-javascript.json', 'summary': '[EuroPython 2011] Jonathan Fine - 23 June 2011 in "Training Pizza Napoli\n"\n', 'error': "Possible title underline, too short for the title.\nTreating it as ordinary text because it's so short."}]
 WARNING:root:conference: europython-2011
 WARNING:root:[{'file_path': 'data/europython-2011/videos/building-a-website-with-pyhp-and-liwe.json', 'summary': '[EuroPython 2011] Fabio Rotondo - 22 June 2011 in "Training Pizza Napoli\n"\n', 'error': "Possible title underline, too short for the title.\nTreating it as ordinary text because it's so short."}]
 WARNING:root:conference: europython-2011
 WARNING:root:[{'file_path': 'data/europython-2011/videos/python-oracle-prosperity-performance.json', 'summary': '[EuroPython 2011] Todd Trichler - 23 June 2011 in "Training Pizza Napoli\n"\n', 'error': "Possible title underline, too short for the title.\nTreating it as ordinary text because it's so short."}]
 WARNING:root:conference: europython-2011
 WARNING:root:[{'file_path': 'data/europython-2011/videos/pypy-hands-on.json', 'description': 'The session is divided into two parts, of roughly 2 hours each. People\nwho are interested only in the first part, can leave the session after\nit. However, the first part is a prerequisite for the second one, thus\npeople are not advised to join in the middle of the session.\n\nThe session is meant to be highly interactive. People are invited to\nbring their own laptop and try things by themselves.\n\nPart 1: Run your application under PyPy\n---------------------------------------\n\nThis tutorial is targeted to Python users who want to run their favorite\nPython application under PyPy, and exploit the most of it. The following\ntopics will be covered:\n\n::\n\n    - how to fix/avoid CPython implementation details (e.g., refcounting)\n\n    - general overview of how the PyPy JIT works\n\n    - how to optimize your program for the PyPy JIT\n\n    - how to view and interpret the traces produced by the JIT\n\n    - how to tweak the parameters of the JIT and the GC\n\n    - how to use existing CPython C extensions on PyPy, and fix them if necessary\n\nPart 2: Write your own interpreter with PyPy\n--------------------------------------------\n\nPyPy is not only a Python interpreter, but also a toolchain to implement\ndynamic languages. This tutorial is targeted to people who want to\nimplement their own programming languages, or who simply want to know\nmore about how the PyPy JIT works internally.\n\nThe students will be given the source code for a toy language\nimplemented in RPython. 
They will learn:\n\n::\n\n    - how to translate it to C using the PyPy translation toolchain\n\n    - what are the "hints" needed by the JIT generator, and how to place them\n\nThen, they will be challenged to add the proper hints to the toy\ninterpreter, to get the best result with the JIT.\n\n**THINGS TO DO BEFORE THE TRAINING**\n\nYou are encouraged to bring your laptop to the training session.\n\nMake sure that the following prerequisites are met:\n\n-  Install PyPy 1.5:\n\n   -  http://pypy.org/download.html\n\n   -  http://doc.pypy.org/en/latest/getting-started.html#installing-pypy\n\n-  Make sure that ``setuptools`` or ``distribute`` are installed (look\n   at the URL above for instructions)\n\n-  Clone the pypy repository, and update to the 1.5 version::\n\n$ hg clone http://bitbucket.org/pypy/pypy\n\n$ cd pypy\n\n$ hg up -r release-1.5\n\n-  Clone the jitviewer repository and install it on pypy::\n\n$ hg clone http://bitbucket.org/pypy/jitviewer\n\n$ cd jitviewer\n\n$ /path/to/pypy-1.5/bin/pypy setup.py develop\n\nIf you intend to follow also the second part ("Write your own\ninterpreter with PyPy"), you need to make sure you have a working\n`developing\nenvironment <http://doc.pypy.org/en/latest/getting-started-python.html%20#translating-the-pypy-python-interpreter>`__\n', 'error': 'Literal block expected; none found.'}]
 WARNING:root:conference: europython-2011
 WARNING:root:[{'file_path': 'data/europython-2011/videos/rubrica-indirizzi-allennesima-potenza.json', 'summary': '[EuroPython 2011] Davide Corio - 23 June 2011 in "Track Italiana Big Mac\n"\n', 'error': "Possible title underline, too short for the title.\nTreating it as ordinary text because it's so short."}]
 WARNING:root:conference: europython-2011
 WARNING:root:[{'file_path': 'data/europython-2011/videos/man-page-of-the-warrior-of-light.json', 'summary': '[EuroPython 2011] Semen Trygubenko - 22 June 2011 in "Track Tagliatelle\n"\n', 'error': "Possible title underline, too short for the title.\nTreating it as ordinary text because it's so short."}]
 WARNING:root:conference: europython-2011
 WARNING:root:[{'file_path': 'data/europython-2011/videos/python-oracle-prosperity-performance-0.json', 'summary': '[EuroPython 2011] Todd Trichler - 23 June 2011 in "Training Pizza Napoli\n"\n', 'error': "Possible title underline, too short for the title.\nTreating it as ordinary text because it's so short."}]
 WARNING:root:conference: europython-2011
 WARNING:root:[{'file_path': 'data/europython-2011/videos/objects-and-classes-in-python-and-javascript-0.json', 'summary': '[EuroPython 2011] Jonathan Fine - 23 June 2011 in "Training Pizza Napoli\n"\n', 'error': "Possible title underline, too short for the title.\nTreating it as ordinary text because it's so short."}]
 WARNING:root:conference: europython-2011
 WARNING:root:[{'file_path': 'data/europython-2011/videos/pypy-hands-on-0.json', 'description': 'The session is divided into two parts, of roughly 2 hours each. People\nwho are interested only in the first part, can leave the session after\nit. However, the first part is a prerequisite for the second one, thus\npeople are not advised to join in the middle of the session.\n\nThe session is meant to be highly interactive. People are invited to\nbring their own laptop and try things by themselves.\n\nPart 1: Run your application under PyPy\n---------------------------------------\n\nThis tutorial is targeted to Python users who want to run their favorite\nPython application under PyPy, and exploit the most of it. The following\ntopics will be covered:\n\n::\n\n    - how to fix/avoid CPython implementation details (e.g., refcounting)\n\n    - general overview of how the PyPy JIT works\n\n    - how to optimize your program for the PyPy JIT\n\n    - how to view and interpret the traces produced by the JIT\n\n    - how to tweak the parameters of the JIT and the GC\n\n    - how to use existing CPython C extensions on PyPy, and fix them if necessary\n\nPart 2: Write your own interpreter with PyPy\n--------------------------------------------\n\nPyPy is not only a Python interpreter, but also a toolchain to implement\ndynamic languages. This tutorial is targeted to people who want to\nimplement their own programming languages, or who simply want to know\nmore about how the PyPy JIT works internally.\n\nThe students will be given the source code for a toy language\nimplemented in RPython. 
They will learn:\n\n::\n\n    - how to translate it to C using the PyPy translation toolchain\n\n    - what are the "hints" needed by the JIT generator, and how to place them\n\nThen, they will be challenged to add the proper hints to the toy\ninterpreter, to get the best result with the JIT.\n\n**THINGS TO DO BEFORE THE TRAINING**\n\nYou are encouraged to bring your laptop to the training session.\n\nMake sure that the following prerequisites are met:\n\n-  Install PyPy 1.5:\n\n   -  http://pypy.org/download.html\n\n   -  http://doc.pypy.org/en/latest/getting-started.html#installing-pypy\n\n-  Make sure that ``setuptools`` or ``distribute`` are installed (look\n   at the URL above for instructions)\n\n-  Clone the pypy repository, and update to the 1.5 version::\n\n$ hg clone http://bitbucket.org/pypy/pypy\n\n$ cd pypy\n\n$ hg up -r release-1.5\n\n-  Clone the jitviewer repository and install it on pypy::\n\n$ hg clone http://bitbucket.org/pypy/jitviewer\n\n$ cd jitviewer\n\n$ /path/to/pypy-1.5/bin/pypy setup.py develop\n\nIf you intend to follow also the second part ("Write your own\ninterpreter with PyPy"), you need to make sure you have a working\n`developing\nenvironment <http://doc.pypy.org/en/latest/getting-started-python.html%20#translating-the-pypy-python-interpreter>`__\n', 'error': 'Literal block expected; none found.'}]
 WARNING:root:conference: europython-2014
 WARNING:root:[{'file_path': 'data/europython-2014/videos/stackless-recent-advancements-and-future-goals.json', 'description': "Stackless: Recent advancements and future goals\n-----------------------------------------------\n\nSince Python release 1.5 Stackless Python is an enhanced variant of\nC-Python. Stackless is best known for its addition of lightweight\nmicrothreads (tasklets) and channels.\n\nLess known are the recent enhancements that became available with\nStackless 2.7.6. In this talk core Stackless developers demonstrate\n\n-  The improved multi-threading support\n-  How to build custom scheduling primitives based on atomic tasklet\n   operations\n-  The much improved debugger support\n-  ...\n\nStackless recently switched the new master repository from\nhg.python.org/stackless to bitbucket to allow for a more open\ndevelopment process. We'll summarise our experience and discuss our\nplans for the future development of Stackless.\n\nThe talk will be help by Anselm Kruis and Christian Tismer. If we are\nlucky, we will also welcome Kristján Valur Jónsson from Iceland.\n", 'error': "Unexpected possible title overline or transition.\nTreating it as ordinary text because it's so short."}]
 WARNING:root:conference: europython-2014
 WARNING:root:[{'file_path': 'data/europython-2014/videos/conversing-with-people-living-in-poverty.json', 'description': '43% of the world\'s population live on less than €1.5 per day.\n\nThe United Nations defines poverty as a "lack of basic capacity to\nparticipate effectively in society". While we often think of the poor as\nlacking primarily food and shelter, the UN definition highlights their\nisolation. They have the least access to society\'s knowledge and\nservices and the most difficulty making themselves and their needs heard\nin our democracies.\n\nWhile smart phones and an exploding ability to collect and process\ninformation are transforming our access to knowledge and the way we\norganize and participate in our societies, those living in poverty have\nlargely been left out. This has to change.\n\nBasic mobile phones present an opportunity to effect this change\n`3 <http://www.youtube.com/watch?v=0bXjgx4J0C4#t=20>`__. Only three\ncountries in the world have fewer than 65 mobile phones per 100 people\n`4 <http://en.wikipedia.org/wiki/List_of_countries_by_number_of_mobile_phones_in_use>`__.\nThe majority of these phones are not Android or iPhones, but they do\nnevertheless provide a means of communication -- via voice calls, SMSes\n`6 <http://en.wikipedia.org/wiki/Short_Message_Service>`__, USSD\n`7 <http://en.wikipedia.org/wiki/Unstructured_Supplementary_Service_Data%20%20%20[8]:%20<http://blog.praekeltfoundation.org/post/65981723628/wikipedia-zero-over-text-with-praekelt-foundation>`__\nand instant messaging.\n\nBy comparison, 25 countries have less than 5% internet penetration\n`5 <http://en.wikipedia.org/wiki/List_of_countries_by_number_of_Internet_users>`__.\n\nVumi `1 <http://vumi.org/>`__ is an open source text messaging system\ndesigned to reach out to those in poverty on a massive scale via their\nmobile phones. 
It\'s written in Python using Twisted.\n\nVumi is already used to:\n\n-  provide Wikipedia access over USSD and SMS in Kenya [8].\n-  register a million voters in Libya\n   `10 <http://www.libyaherald.com/2014/01/01/over-one-million-register-for-constitutional-elections-on-final-sms-registration-day/#axzz2sroHcg00>`__.\n-  deliver health information to mothers in South Africa\n   `9 <http://blog.praekeltfoundation.org/post/65042080515/mama-launches-healthy-family-nutrition-programme>`__.\n-  prevent election violence in Kenya\n   `11 <http://blog.praekeltfoundation.org/post/51210616848/the-texting-will-never-be-done-peace-messages-in-kenya>`__.\n\nThis talk will cover:\n\n-  a brief overview of mobile networking and cellphone use in Africa\n-  why we built Vumi\n-  the challenges of operating in unreliable environments\n-  an overview of Vumi\'s features and architecture\n-  how you can help!\n\nVumi features some cutting edge design choices:\n\n-  horizontally scalable Twisted processes communicating using RabbitMQ.\n-  declarative data models backed by Riak.\n-  sharing common data models between Django and Twisted.\n-  sandboxing hosted Javascript code from Python.\n\nOverview of challenges Vumi addresses:\n\n*Scalability*: Vumi needs to support both small scale applications\n(demos, pilot projects, applications tailored for a particular\ncommunity) and large ones (things that everyone within a country might\nuse). We address this using Twisted workers that exchange messages via\nRabbitMQ and store data in Riak. Having projects share RabbitMQ and Riak\ninstances significantly reduces the overhead for small projects (e.g.\nits not cost effective to launch the recommended minimum of 5 Riak\nservers for a small project).\n\n*Barriers to entry*: Often the people with good ideas don\'t have access\nto one of many things needed to run a production system themselves, e.g.\ncapital, time, stable infrastructure. 
We address this by providing a\nhosted Vumi instance that runs sandboxed Javascript applications. All\nthe application author needs is their idea, the ability to write\nJavascript and upload it to our servers. The target audience here is\nAfrican entrepreneurs at incubator spaces like iHub (Nairobi), kLab\n(Kigali), BongoHive (Lusaka) and JoziHub (Johannesburg).\n\n*Unreliable third-party systems*: It\'s one thing for parts of ones own\nsystem to go down, it\'s another for crucial third-party systems to go\ndown. Vumi takes an SMTP-like approach to solving this and uses\npersistent queues so that messages can back up in the queue while\nthird-party systems are down and be processed when they become available\nagain. We also feedback information on whether third-party messaging\nsystems have accepted or reject messages to the application that\ninitiated them.\n\nVumi is developed by the Praekelt Foundation\n`2 <http://praekeltfoundation.org/>`__ (and individual contributors!).\n', 'error': 'Anonymous hyperlink mismatch: 1 references but 0 targets.\nSee "backrefs" attribute for IDs.'}]
 WARNING:root:conference: kiwi-pycon-2014
 WARNING:root:[{'file_path': 'data/kiwi-pycon-2014/videos/multimedia-programming-using-gstreamer-and-of-c.json', 'description': '= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =\nDouglas Bagnall: Multimedia programming using Gstreamer (and, of course,\nPython) = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =\n= @ Kiwi PyCon 2014 - Sunday, 14 Sep 2014 - Track 1\nhttp://kiwi.pycon.org/\n\n**Audience level**\n\nIntermediate\n\n**Description**\n\nGstreamer is a multimedia framework consisting of hundreds of\ninterchangeable elements that can be plugged together like toy train\ntracks to create efficient processing pipelines. This talk will show how\neasy it is to put together reasonably complex Gstreamer pipelines with\nPython, and indicate why you might want to. It covers Gstreamer 1.x and\nboth versions of CPython.\n\n**Abstract**\n\nGstreamer [1] is a mature framework for developing applications that\ndecode, encode, play, analyse, broadcast, or otherwise manipulate audio\nand video streams. It is used on phones and embedded devices, all over\nthe linux desktop [2], and in giant multimedia displays.\n\nIt is easy to create simple Gstreamer pipelines using shell one-liners,\nbut with Python it is possible to go beyond that, dealing with multiple\nparallel streams and reconfiguring pipelines on the fly. Examples drawn\nfrom real life problems include interleaving several hundred audio\nstreams into one multi-channel stream and splitting a video stream into\n4 parts and sending each to a different projector for perfect\nsyncronisation.\n\nIt is also possible to write a Gstreamer plugin in Python, which will be\nbriefly touched on.\n\n[1] http://gstreamer.freedesktop.org/ [2]\nhttp://gstreamer.freedesktop.org/apps/\n\n**Slides**\n\nhttps://speakerdeck.com/nzpug/douglas-bagnall-multimedia-programming-using-gstreamer-and-of-course-python\n', 'error': 'Malformed table.\nNo bottom table border found.'}]
 WARNING:root:conference: kiwi-pycon-2014
 WARNING:root:[{'file_path': 'data/kiwi-pycon-2014/videos/francesco-biscani-a-snake-in-space-keynote.json', 'description': '= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =\nFrancesco Biscani: A Snake in Space - The rise of scientific Python in\nAstrodynamics and Astronomy = = = = = = = = = = = = = = = = = = = = = =\n= = = = = = = = = = = @ Kiwi PyCon 2014 - Sunday, 14 Sep 2014 - Keynote\nhttp://kiwi.pycon.org/\n\n**Slides**\n\nhttps://speakerdeck.com/nzpug/francesco-biscani-a-snake-in-space-the-rise-of-scientific-python-in-astrodynamics-and-astronomy-keynote\n', 'error': 'Malformed table.\nNo bottom table border found.'}]
 WARNING:root:conference: kiwi-pycon-2014
 WARNING:root:[{'file_path': 'data/kiwi-pycon-2014/videos/java-for-python-developers.json', 'description': "= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =\nChristopher Neugebauer: Java for Python Developers = = = = = = = = = = =\n= = = = = = = = = = = = = = = = = = = = = = @ Kiwi PyCon 2014 -\nSaturday, 13 Sep 2014 - Track 1 http://kiwi.pycon.org/\n\n**Audience level**\n\nIntermediate\n\n**Description**\n\nStop looking at me like that.\n\nNo really. Stop it. I'm serious.\n\nCalling Java APIs from CPython is one of the more interesting challenges\nfacing developers who want to get Python working as a first-class\nlanguage for developing apps for Android.\n\nThis talk looks at solutions, past and present, for making the world of\nJava accessible from Python.\n\n**Abstract**\n\nStop looking at me like that.\n\nNo really. Stop it. I'm serious. Why are you looking so confused?\n\nYes. I'm talking about Java at a Python conference. What of it?\n\nOK, well, I'm actually talking about avoiding having to code in Java,\nwhen circumstances almost certainly require you to code in Java… or at\nleast require your applications to run in a Java environment.\n\nOne of the more interesting challenges for Python developers targetting\nAndroid is being able to call Java APIs from CPython. Environments like\nAndroid require developers to use Java to get access to Android's user\ninterface libraries. Perhaps more importantly, Android has APIs for\naccessing hardware features like accelerometers and geolocation, and\nsoftware features like notifications, but all of these have a Java\ninterface.\n\nThis talk looks at how these problems have been solved, and where they\nhaven't, approaches to solutions that might exist.\n\n**Slides**\n\nhttps://speakerdeck.com/nzpug/christopher-neugebauer-java-for-python-developers\n", 'error': 'Malformed table.\nNo bottom table border found.'}]
 WARNING:root:conference: kiwi-pycon-2014
 WARNING:root:[{'file_path': 'data/kiwi-pycon-2014/videos/deploying-test-and-production-systems-with-ansibl.json', 'description': '= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =\nJuergen Brendel: Deploying test and production systems with Ansible = =\n= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = @ Kiwi\nPyCon 2014 - Sunday, 14 Sep 2014 - Track 1 http://kiwi.pycon.org/\n\n**Audience level**\n\nIntermediate\n\n**Description**\n\nThis talk presents Ansible as a light-weight, simple and effective\nconfiguration management tool. We will see why configuration management\nis important and how we can use Ansible to automatically deploy and test\nfull production clusters - showcased with a sample Django application -\nwith just a single command and in a number of different environments.\n\n**Abstract**\n\nWith modern configuration, deployment and orchestration systems,\nsoftware developers can easily maintain, bring up and tear down local\ntest systems, staging, demonstration and production systems, which are\n100% identical to each other.\n\nThis presentation will give an introduction to Ansible, a modern, simple\nand efficient configuration management system. We will see how we can\ndeploy complete clusters (with load balancers, database and some Django\napplication servers), reliably and fully automated, either on dedicated\nservers, on local virtual machines or in the cloud.\n\nIf all goes well, there will even be a live demo in which we will bring\nup a complete cluster or two in different environments, to illustrate\nthe power of this approach.\n\n**Slides**\n\nhttps://speakerdeck.com/nzpug/juergen-brendel-deploying-test-and-production-systems-with-ansible\n', 'error': 'Malformed table.\nNo bottom table border found.'}]
 WARNING:root:conference: kiwi-pycon-2014
 WARNING:root:[{'file_path': 'data/kiwi-pycon-2014/videos/external-dependencies-in-web-apps-system-libs-ar.json', 'description': '= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =\nFrancois Marier: External dependencies in web apps: system libs are not\nthat scary = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =\n= = @ Kiwi PyCon 2014 - Saturday, 13 Sep 2014 - Track 2\nhttp://kiwi.pycon.org/\n\n**Audience level**\n\nIntermediate\n\n**Description**\n\nGoing to production is very different from setting up a development\nenvironment. We have great tools like pip and virtualenv for the latter\nbut they lead to maintenance anti-patterns when they are used for\nproduction deployments. We don\'t "pip install apache" and instead\nleverage the QA and integration work of distro developers, so why not\nrely on distros for more than just the base system?\n\n**Abstract**\n\nToday\'s web applications often have a lot of external dependencies.\nStart off with a basic framework, sprinkle a couple of handy plugins or\npython modules and finish with a generous serving of JavaScript\nfront-end libraries.\n\nWhat you end up is a gigantic mess of code from different sources which\nfollow very different release schedules and policies. Language-specific\npackage managers can automate much of the dependency resolution and\npackage installation, but you\'re on your own in terms of integration and\nquality assurance. Also, the minute you start distributing someone\nelse\'s code with your project, you become responsible for the security\nof that external code.\n\nYou could of course decide to avoid the problem by writing it all\nyourself from scratch. 
Realistically though, you\'re not likely to go\npast the stage of a toy WSGI app using that approach because it\'s a lot\nof work.\n\nThis talk will examine the decision that the Libravatar [1] project made\nto outsource much of its maintenance burden to the distro by using\nsystem packages for almost everything.\n\n[1] https://www.libravatar.org/\n\n**Slides**\n\nhttps://speakerdeck.com/nzpug/francois-marier-external-dependencies-in-web-apps-system-libs-are-not-that-scary\n', 'error': 'Malformed table.\nNo bottom table border found.'}]
 WARNING:root:conference: kiwi-pycon-2014
 WARNING:root:[{'file_path': 'data/kiwi-pycon-2014/videos/lightning-talks-12.json', 'description': '= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =\nLightning Talks = = = = = = = = = = = = = = = = = = = = = = = = = = = =\n= = = = = @ Kiwi PyCon 2014 - Sunday, 14 Sep 2014 - Track 1\nhttp://kiwi.pycon.org/\n\n01 - Jen Zajac - PRIZE DRAW - 1:44 02 - Steven Ellis - LCA Auckland 2015\n- 5:43 03 - Jay Gattuso - Heritage Preserve; #WeWantJam - 8:54 04 -\nJames Polly - Fewer Boilerplate! - 12:07 05 - Aurynn Shaw - Let me show\nyou the world - 16:19 06 - Ronen Baram - MySQL Performance Schema -\n19:41 07 - Douglas Bagnall - Julia - 22:14 08 - Grant Paton-Simpson /\nRichard Shea - NZ Python Promotion Pamhplet 24:50 09 - Elliot\nPaton-Simpson - Blackbox - 27:23 10 - Danny Adair / Thomi Richards - New\nZealand Python User Group - 30:45 11 - Ben Olsen - Docker & Python -\n35:37 12 - Tim Penhey - Object Factories for TDD - 40:46 13 - Hamish\nCampbell - Horrible Python Obfuscation Tricks - 43:20 14 - Nick Wareing\n- Arduino Pull Request Notifier - 48:35 15 - Giovanni Moretti -\nDynamically creating Python tutorials and presentations - 53:12 16 -\nRobert Coup - Bad-ass Postgres Tricks - 58:36 17 - Juergen Brendel -\n"One button" test and deploy on AWS - 1:03:59 18 - Tobi Wulff - Python\nAnti-Patterns - 1:09:14 19 - Sanjay Wadhwa - Map Production using Python\nscripts - 1:14:32 20 - Chris McDowall - Making weird maps with Python -\n1:19:34\n', 'error': 'Malformed table.\nNo bottom table border found.'}]
 WARNING:root:conference: pycon-au-2013
 WARNING:root:[{'file_path': 'data/pycon-au-2013/videos/build-your-infrastructure-with-python.json', 'summary': "Cloud computing is changing the way that businesses think about their\ncomputing requirements. Instead of ordering hardware, waiting for\ndelivery, allocating space in a data center, installing and wiring it\nup, and then configuring each piece of the system, you can now do the\nequivalent with a few clicks on a control panel, but that gets tedious.\nWhat's much more interesting is to do all of this programmatically,\nusing our favorite language: Python!\n\nThis session will deep-dive into this topic by using pyrax, the Python\nSDK for the OpenStack and Rackspace Clouds. It will cover the following:\n\n• Getting your cloud account and credentials • Installing pyrax •\nCreating servers • Saving a customized server as a template image •\nCreating more servers on demand from your template images. • Creating,\nattaching, and imaging Block Storage volumes. •\xa0Using private networks\nto create a bastion host setup • Managing these servers with a Load\nBalancer • Creating and managing Cloud Databases • Using pyrax to manage\nyour DNS • Object storage and management using pyrax\n", 'error': 'Bullet list ends without a blank line; unexpected unindent.'}]
 WARNING:root:conference: pycon-au-2013
 WARNING:root:[{'file_path': 'data/pycon-au-2013/videos/salt-how-to-be-truly-lazy.json', 'summary': 'In the immortal words of that modern day philosopher Homer Jay Simpson:\n"Can\'t Someone Else Do It?"\n\nIf you\'re too lazy to configure your own servers: let Salt do it for\nyou. Salt is an open source configuration management tool like Chef or\nPuppet but written in Python using ZeroMQ.\n\n| If you\'re too lazy to login to your servers to run commands: let Salt\ndo it for you. Salt is also a remote execution system. A single command\nrun from the "master" salt server can call tens, hundreds, or\n| thousands of remote servers.\n\nAnd if you\'re too lazy to install both an OS and Salt itself: let Salt\ndo it for you. Salt Cloud can spin up new boxes for you in the cloud,\ninstall Salt on them, and introduce them to the "master" salt server.\n\nSalty vagrants, masters, minions, states, pillars, grains, salt clouds,\nparallel execution: I\'ll attempt to touch on them all in this talk.\n\nYou can also be expected to be "assaulted" with a barrage of terrible\nsalt themed puns.\n', 'error': 'Line block ends without a blank line.'}]
 WARNING:root:conference: pycon-au-2015
 WARNING:root:[{'file_path': 'data/pycon-au-2015/videos/inviting-code.json', 'description': 'Projects tend to grow, expand and reshape as they progress, this can\nlead to; high code maintenance, legacy code and feature addition\nfatigue. Implementing a well-documented and malleable software\narchitecture is a major step in addressing these associated time sinks.\nThis talk will discuss viable software architecture approaches – while\ndemonstrating core python modules, concepts and design patterns from a\nbig data perspective.\n\nMany projects welcome code addition from third party developers, however\na well thought out software architecture should not be modified without;\nreview, testing, possible cascading code and documentation changes. To\nremove the additional complexity of reviewing and handling third party\ncode, designing and offering secure plugin infrastructure can allow for\noutside additions without the overhead.\n\nA big data visualisation python project utilised by Oxford and Cambridge\nuniversities will be used in code demonstrations to show the direct use\nof a project built with architecture design in mind, and a multi-level\ndynamic plugin infrastructure. Attendees will be provided with a clear\nunderstanding of how function documentation techniques (PEP-3107 &\nPEP-0484) can be utilised for class clarity and how to (re)design a code\nbase to allow for integration of third party code, while providing a\nconcise framework for code inclusion.\n\nAreas covered in this talk include\n\n| • Python 3.4 focused • Project architecture design • Abstract classes\n• Plugin frameworks • Documentation frameworks • Type hinting & Function\nannotations\n| • Class method decorators • Code validation • Image manipulation\n', 'error': 'Line block ends without a blank line.'}]
 WARNING:root:conference: pycon-us-2011
 WARNING:root:[{'file_path': 'data/pycon-us-2011/videos/pycon-2011--django--pitfalls-i-encountered-and-ho.json', 'description': 'Django: Pitfalls I Encountered and How to Avoid Them\n\nPresented by Luke Sneeringer\n\nAre you starting a moderate to large sized Django project? Do you need\nto plan ahead and build an application that will react to unanticipated\nneeds? This talk covers some techniques and pitfalls I encountered in\nwriting my first reasonably large Django site, and what I did\ndifferently the second time I started a project.\n\nAbstract\n\nWhen working on a company product, especially one where developers don\'t\nalways have full control over the scope and needs of the application,\nit\'s important to plan ahead for unanticipated needs.\n\nThis talk will cover simple tricks and methods that are a small amount\nof work up front, but can save you lots of time later.\n\nPyCon Talk Outline\n\n1. Introduction (5m)\n\n   -  Me!\n\n2. Making Mistakes\n\n   -  It happens. "Code quality can be measured by the number of WTFs\n      per minute in the code review."\n   -  When dealing with a big, expansive framework like Django,\n      sometimes you just don\'t know that something is there. Good docs\n      don\'t completely solve this...there\'s always going to be the thing\n      you don\'t find. Similarly, sometimes you don\'t realize how to\n      leverage something that you do know about until much later.\n\nMy regrets with my current project aren\'t sweeping architectural issues.\nI did most of the big stuff right. My regrets are mostly small things\nthat, because it was my first big project, there was this piece or that\npiece that I didn\'t see or didn\'t fully appreciate, and so now I have\nlittle blocks of code that are tougher to maintain than they need to be.\nEnd of the world? No. Worth thinking through for next time? Yes.\n\n3. 
Some trivial things (10m)\n\n   -  Preface: Yeah, some of these are dumb.\n   -  Beginning at the beginning: Project Setup\n   -  I had sys.path pointing to the directory above the project root,\n      like the tutorial does. I wish I hadn\'t done that.\n   -  Need to run two instances on the same box that don\'t share the\n      actual codebase (e.g. a staging server)? You still can, but it\'s\n      more awkward. Better to set sys.path at your project root.\n   -  Dude, where\'s my Media class?\n   -  How did I do it? First I had a magic template variable. Then I\n      copied Form.Media\n   -  Then, on a later project, I realized a block works just fine.\n   -  My boss wants \\_\\_\\_\\_ available on every page!\n   -  How did I do it? I had a method we called everywhere that took\n      arbitrary keyword arguments...\n   -  Oh, there\'s TEMPLATE\\_CONTEXT\\_PROCESSORS...\n\n      -  ...if you manually use RequestContext every time! So, just do\n         that. Always. Even if you don\'t need it.\n      -  I want .select\\_related(\'something\') every time!\n      -  ...so I typed it! A lot.\n      -  Oh, that can be done by overriding def queryset on the manager\n         class? That\'s easier to maintain...\n      -  ...but make sure you set the flag to use it on related fields!\n\n   -  We need sample data for so-and-so, such-and-such...\n   -  Disclaimer: This one actually isn\'t mine; my boss did it. But,\n      it\'s amusing, and worth mentioning.\n   -  We needed sample data so my boss could preview themes...so he set\n      up a second database, put in fake data, and hard-coded it in the\n      app-wide (not server-specific) settings.py.\n\n      -  Copied the entire DB structure...at the time. 
But it changes.\n      -  Oh, and the unit testing framework didn\'t appreciate it,\n         either.\n\n   -  Fixtures are the right way (and sooner or later I\'ll get this\n      fixed...it\'s still there).\n\n      -  (space reserved for my stumbling upon something else silly, and\n         hopefully humorous, that I did wrong)\n\n      1. How to avoid missing trivial things?\n\n   -  Read the documentation. Over and over.\n   -  Become familiar with the Django code.\n\n4. A non-trivial thing: Forms (10m)\n\n   -  Django forms can do anything...given sufficient shenanigans.\n      Always do it the Django forms way; your life will be easier.\n   -  Forms and ModelForms are static, and I needed dynamic choices on a\n      form...\n   -  ...so I just ditched newforms\n   -  But wait, this is Python. A trivial function that calls the\n      metaclass can solve this problem!\n\n      -  This looks complicated, but it\'s not. Walk through how to do\n         it.\n      -  It\'s quite maintainable, and you get all the other bells and\n         whistles.\n\n5. Questions? (5m)\n\n', 'error': 'Enumerated list start value not ordinal-1: "3" (ordinal 3)'}]
 WARNING:root:conference: pycon-za-2015
 WARNING:root:[{'file_path': 'data/pycon-za-2015/videos/thursday-lightning-talks.json', 'description': '-  `(0:00:00) <http://youtu.be/DiaE9GCJ0nM?t=0h0m0s>`__ **Racy interrupt\n   handling** by Bruce Merry\n-  `(0:06:00) <http://youtu.be/DiaE9GCJ0nM?t=0h6m0s>`__ **Vulture in\n   Python** by Philip Sterne\n-  `(0:11:09) <http://youtu.be/DiaE9GCJ0nM?t=0h11m9s>`__ **Edx** by Carl\n   Dawson\n-  `(0:17:39) <http://youtu.be/DiaE9GCJ0nM?t=0h17m39s>`__ **AST\n   linting** by Bryn Divey\n-  `(0:24:33) <http://youtu.be/DiaE9GCJ0nM?t=0h24m33s>`__ **Numpy in\n   Anger! ** by Laura Richter\n-  `(0:29:28) <http://youtu.be/DiaE9GCJ0nM?t=0h29m28s>`__ **How to screw\n   up loading CSVs in Python ** by James Saunders\n-  `(0:33:34) <http://youtu.be/DiaE9GCJ0nM?t=0h33m34s>`__ **PyQuery** by\n   Nicholas Spagnoletti\n-  `(0:37:27) <http://youtu.be/DiaE9GCJ0nM?t=0h37m27s>`__ **Debian\n   Python moves kicking and screaming to Git** by Stefano Rivera\n\n', 'error': 'Inline strong start-string without end-string.'}]
 WARNING:root:conference: pytexas-2011
 WARNING:root:[{'file_path': 'data/pytexas-2011/videos/asynchronous-web-development-with-tornado.json', 'description': 'Web frameworks like Django, Flask, etc. are great for most traditional\nweb sites. However, there is a growing need to produce web applications\nthat are responsive to external "events", whether a response from\nTwitter\'s API or a new message in an online chat room. Newer,\nnon-blocking frameworks like Tornado seek to address this in a scalable\nmanner.\n\nThis talk will briefly introduce non-blocking principles and patterns,\nand move quickly into an overview of the library, as well as use cases\nand anti- use cases. A portion of the time will also be spent pointing\nout community libraries that are building on Tornado\'s foundation.\n\nOutline:\n\n1. Introduction\n\n   1. Why another framework?\n   2. Intro to Asynchronous Design\n\n2. Overview of a Tornado project\n\n   1. \n\n      a. Application\n\n   2. \n\n      b. Basic (blocking) Request Handlers\n\n   3. \n\n      c. Templates\n\n   4. \n\n      d. Asynchronous Handlers\n\n   5. \n\n      e. To block or not to block\n\n3. Batteries included\n\n   1. Auth module\n   2. Options module\n   3. Database\n   4. UIModules\n   5. Security\n\n4. Community\n\n   1. \n\n      a. Tornad.io\n\n   2. \n\n      b. No-SQL libraries\n\n   3. \n\n      c. Twisted integration\n\n   4. \n\n      d. Torn Admin\n\n5. Q&A;\n\n', 'error': 'Enumerated list start value not ordinal-1: "b" (ordinal 2)'}]
 WARNING:root:conference: pytexas-2015
 WARNING:root:[{'file_path': 'data/pytexas-2015/videos/the-reference-model-for-disease-progression-and-l.json', 'description': '| The Reference Model for Disease progression is a league of disease\nmodels that determines fitness of publicly available populations and\nmodels and assumptions. The Reference Model has grown to the point that\nit was hard to maintain the code base and increase the number of\npopulations. Recent advances with the MIcro Simulation Tool (MIST), that\nruns the model, allow object oriented population generation.\nEvolutionary Computation, using the Python Inspyred library by Aaron\nGarrett, is a key element in MIST population generation. Recent\ndevelopments allow easier introductions of variations in population\ngeneration to figure out fitness of unknown base assumptions. This is\npowerful in conjunction with High Performance Computing that allows\ntesting multiple modeling assumptions in parallel.\n| Previous work was presented in this and other venues: \\*\nhttps://www.youtube.com/watch?v=vyvxiljc5vA \\*\nhttp://www.youtube.com/watch?v=AD896WakR94\n', 'error': 'Line block ends without a blank line.'}]
 WARNING:root:conference: scipy-2013
 WARNING:root:[{'file_path': 'data/scipy-2013/videos/data-processing-with-python-scipy2013-tutorial-7.json', 'summary': 'Presenters: Ben Zaitlen, Clayton Davis\n\nDescription\n\nThis tutorial is a crash course in data processing and analysis with\nPython. We will explore a wide variety of domains and data types (text,\ntime-series, log files, etc.) and demonstrate how Python and a number of\naccompanying modules can be used for effective scientific expression.\nStarting with NumPy and Pandas, we will begin with loading, managing,\ncleaning and exploring real-world data right off the instrument. Next,\nwe will return to NumPy and continue on with SciKit-Learn, focusing on a\ncommon dimensionality-reduction technique: PCA.\n\nIn the second half of the course, we will introduce Python for Big Data\nAnalysis and introduce two common distributed solutions: IPython\nParallel and MapReduce. We will develop several routines commonly used\nfor simultaneous calculations and analysis. Using Disco -- a Python\nMapReduce framework -- we will introduce the concept of MapReduce and\nbuild up several scripts which can process a variety of public data\nsets. Additionally, users will also learn how to launch and manage their\nown clusters leveraging AWS and StarCluster.\n\nOutline\n\n*Setup/Install Check (15) *\\ NumPy/Pandas (30) *Series *\\ Dataframe\n*Missing Data *\\ Resampling *Plotting *\\ PCA (15) *NumPy *\\ Sci-Kit\nLearn *Parallel-Coordinates *\\ MapReduce (30) *Intro *\\ Disco *Hadoop\n*\\ Count Words *EC2 and Starcluster (15) *\\ IPython Parallel (30) *Bitly\nLinks Example (30) *\\ Wiki Log Analysis (30)\n\n45 minutes extra for questions, pitfalls, and break\n\nEach student will have access to a 3 node EC2 cluster where they will\nmodify and execute examples. Each cluster will have Anaconda, IPython\nNotebook, Disco, and Hadoop preconfigured\n\nRequired Packages\n\nAll examples in this tutorial will use real data. 
Attendees are expected\nto have some familiarity with statistical methods and familiarity with\ncommon NumPy routines. Users should come with the latest version of\nAnaconda pre-installed on their laptop and a working SSH client.\n\nDocumentation\n\nPreliminary work can be found at:\nhttps://github.com/ContinuumIO/tutorials\n', 'error': 'Inline emphasis start-string without end-string.'}]
 WARNING:root:conference: scipy-2013
 WARNING:root:[{'file_path': 'data/scipy-2013/videos/data-processing-with-python-scipy2013-tutorial-6.json', 'summary': 'Presenters: Ben Zaitlen, Clayton Davis\n\nDescription\n\nThis tutorial is a crash course in data processing and analysis with\nPython. We will explore a wide variety of domains and data types (text,\ntime-series, log files, etc.) and demonstrate how Python and a number of\naccompanying modules can be used for effective scientific expression.\nStarting with NumPy and Pandas, we will begin with loading, managing,\ncleaning and exploring real-world data right off the instrument. Next,\nwe will return to NumPy and continue on with SciKit-Learn, focusing on a\ncommon dimensionality-reduction technique: PCA.\n\nIn the second half of the course, we will introduce Python for Big Data\nAnalysis and introduce two common distributed solutions: IPython\nParallel and MapReduce. We will develop several routines commonly used\nfor simultaneous calculations and analysis. Using Disco -- a Python\nMapReduce framework -- we will introduce the concept of MapReduce and\nbuild up several scripts which can process a variety of public data\nsets. Additionally, users will also learn how to launch and manage their\nown clusters leveraging AWS and StarCluster.\n\nOutline\n\n*Setup/Install Check (15) *\\ NumPy/Pandas (30) *Series *\\ Dataframe\n*Missing Data *\\ Resampling *Plotting *\\ PCA (15) *NumPy *\\ Sci-Kit\nLearn *Parallel-Coordinates *\\ MapReduce (30) *Intro *\\ Disco *Hadoop\n*\\ Count Words *EC2 and Starcluster (15) *\\ IPython Parallel (30) *Bitly\nLinks Example (30) *\\ Wiki Log Analysis (30)\n\n45 minutes extra for questions, pitfalls, and break\n\nEach student will have access to a 3 node EC2 cluster where they will\nmodify and execute examples. Each cluster will have Anaconda, IPython\nNotebook, Disco, and Hadoop preconfigured\n\nRequired Packages\n\nAll examples in this tutorial will use real data. 
Attendees are expected\nto have some familiarity with statistical methods and familiarity with\ncommon NumPy routines. Users should come with the latest version of\nAnaconda pre-installed on their laptop and a working SSH client.\n\nDocumentation\n\nPreliminary work can be found at:\nhttps://github.com/ContinuumIO/tutorials\n', 'error': 'Inline emphasis start-string without end-string.'}]
 WARNING:root:conference: scipy-2013
 WARNING:root:[{'file_path': 'data/scipy-2013/videos/data-processing-with-python-scipy2013-tutorial-5.json', 'summary': 'Presenters: Ben Zaitlen, Clayton Davis\n\nDescription\n\nThis tutorial is a crash course in data processing and analysis with\nPython. We will explore a wide variety of domains and data types (text,\ntime-series, log files, etc.) and demonstrate how Python and a number of\naccompanying modules can be used for effective scientific expression.\nStarting with NumPy and Pandas, we will begin with loading, managing,\ncleaning and exploring real-world data right off the instrument. Next,\nwe will return to NumPy and continue on with SciKit-Learn, focusing on a\ncommon dimensionality-reduction technique: PCA.\n\nIn the second half of the course, we will introduce Python for Big Data\nAnalysis and introduce two common distributed solutions: IPython\nParallel and MapReduce. We will develop several routines commonly used\nfor simultaneous calculations and analysis. Using Disco -- a Python\nMapReduce framework -- we will introduce the concept of MapReduce and\nbuild up several scripts which can process a variety of public data\nsets. Additionally, users will also learn how to launch and manage their\nown clusters leveraging AWS and StarCluster.\n\nOutline\n\n*Setup/Install Check (15) *\\ NumPy/Pandas (30) *Series *\\ Dataframe\n*Missing Data *\\ Resampling *Plotting *\\ PCA (15) *NumPy *\\ Sci-Kit\nLearn *Parallel-Coordinates *\\ MapReduce (30) *Intro *\\ Disco *Hadoop\n*\\ Count Words *EC2 and Starcluster (15) *\\ IPython Parallel (30) *Bitly\nLinks Example (30) *\\ Wiki Log Analysis (30)\n\n45 minutes extra for questions, pitfalls, and break\n\nEach student will have access to a 3 node EC2 cluster where they will\nmodify and execute examples. Each cluster will have Anaconda, IPython\nNotebook, Disco, and Hadoop preconfigured\n\nRequired Packages\n\nAll examples in this tutorial will use real data. 
Attendees are expected\nto have some familiarity with statistical methods and familiarity with\ncommon NumPy routines. Users should come with the latest version of\nAnaconda pre-installed on their laptop and a working SSH client.\n\nDocumentation\n\nPreliminary work can be found at:\nhttps://github.com/ContinuumIO/tutorials\n', 'error': 'Inline emphasis start-string without end-string.'}]
 WARNING:root:conference: scipy-2014
 WARNING:root:[{'file_path': 'data/scipy-2014/videos/socialite-python-intergrated-query-language-for.json', 'description': '| SociaLite is a Python-integrated query language for distributed data\nanalysis.\n| It makes scientific data analysis simple, yet achieves fast\nperformance with its compiler optimizations. The performance of\nSociaLite is often more than three orders of magnitude faster than\nHadoop programs, and close to optimized C programs. For example,\nPageRank algorithm can be implemented in just 2 lines of SociaLite\nquery, which runs nearly as fast as an optimal parallelized C code.\n\nSociaLite supports well-known high-level concepts to make data analysis\neasy for non-expert programmers. We support relational tables for\nstoring data, and relational operations, such as join, selection, and\nprojection, for processing the data. Moreover, SociaLite queries are\nfully integrated with Python, so both SociaLite and Python code can be\nused to implement data analysis logic. For the integration with Python,\nwe support embedding and extending SociaLite, where embedding supports\nusing SociaLite queries directly in Python code, and extending supports\nusing Python functions in SociaLite queries.\n\nThe Python integration makes it easy to implement various analysis\nalgorithms in SociaLite and Python. For example, the BLAST algorithm in\nbioinformatics can be implemented in just a few lines of SociaLite\nqueries and Python code. Also genome assembly algorithm -- generating a\nDe Bruijn graph and applying Eulerian cycle algorithm -- can be simply\nimplemented. In the talk, I will demonstrate these algorithms in\nSociaLite as well as more general algorithms such as K-means clustering\nand logistic regression.\n\nThe SociaLite queries are compiled to highly optimized\nparallel/distributed code; we apply optimizations such as pipelined\nevaluation and prioritization. 
The runtime system also speeds up the\nperformance; for example, the customized memory allocator reduces memory\nallocation time and footprint. In short, SociaLite makes\nhigh-performance data analysis easy with its high-level abstractions and\ncompiler/runtime optimizations.\n', 'error': 'Line block ends without a blank line.'}]
 WARNING:root:conference: scipy-2014
 WARNING:root:[{'file_path': 'data/scipy-2014/videos/bayesian-statistical-analysis-using-python-part.json', 'description': "The aim of this course is to introduce new users to the Bayesian\napproach of statistical modeling and analysis, so that they can use\nPython packages such as NumPy, SciPy and\n`PyMC <https://github.com/pymc-devs/pymc>`__ effectively to analyze\ntheir own data. It is designed to get users quickly up and running with\nBayesian methods, incorporating just enough statistical background to\nallow users to understand, in general terms, what they are implementing.\nThe tutorial will be example-driven, with illustrative case studies\nusing real data. Selected methods will include approximation methods,\nimportance sampling, Markov chain Monte Carlo (MCMC) methods such as\nMetropolis-Hastings and Slice sampling. In addition to model fitting,\nthe tutorial will address important techniques for model checking, model\ncomparison, and steps for preparing data and processing model output.\nTutorial content will be derived from the instructor's book *Bayesian\nStatistical Computing using Python*, to be published by Springer in late\n2014.\n\n.. figure:: http://d.pr/i/pqWT+\n   :alt: PyMC forest plot\n\n   PyMC forest plot\n.. figure:: http://d.pr/i/AHZV+\n   :alt: DAG\n\n   DAG\nAll course content will be available as a GitHub repository, including\nIPython notebooks and example data.\n\nTutorial Outline\n----------------\n\n1. Overview of Bayesian statistics.\n2. Bayesian Inference with NumPy and SciPy\n3. Markov chain Monte Carlo (MCMC)\n4. The Essentials of PyMC\n5. Fitting Linear Regression Models\n6. Hierarchical Modeling\n7. Model Checking and Validation\n\nInstallation Instructions\n-------------------------\n\nThe easiest way to install the Python packages required for this\ntutorial is via\n`Anaconda <https://store.continuum.io/cshop/anaconda/>`__, a scientific\nPython distribution offered by Continuum analytics. 
Several other\ntutorials will be recommending a similar setup.\n\nOne of the key features of Anaconda is a command line utility called\n``conda`` that can be used to manage third party packages. We have built\na PyMC package for ``conda`` that can be installed from your terminal\nvia the following command:\n\n::\n\n    conda install -c https://conda.binstar.org/pymc pymc\n\nThis should install any prerequisite packages that are required to run\nPyMC.\n\nOne caveat is that conda does not yet have a build of PyMC for **Python\n3**. Therefore, you would have to build it yourself via pip:\n\n::\n\n    pip install git+git://github.com/pymc-devs/[email protected]\n\nFor those of you on Mac OS X that are already using the\n`Homebrew <http://brew.sh>`__ package manager, I have prepared a script\nthat will install the entire Python scientific stack, including PyMC\n2.3. You can download the script\n`here <https://gist.github.com/fonnesbeck/7de008b05e670d919b71>`__ and\nrun it via:\n\n::\n\n    sh install_superpack_brew.sh\n\n", 'error': 'Explicit markup ends without a blank line; unexpected unindent.'}]
 WARNING:root:conference: scipy-2014
 WARNING:root:[{'file_path': 'data/scipy-2014/videos/bayesian-statistical-analysis-using-python-part-1.json', 'description': "The aim of this course is to introduce new users to the Bayesian\napproach of statistical modeling and analysis, so that they can use\nPython packages such as NumPy, SciPy and\n`PyMC <https://github.com/pymc-devs/pymc>`__ effectively to analyze\ntheir own data. It is designed to get users quickly up and running with\nBayesian methods, incorporating just enough statistical background to\nallow users to understand, in general terms, what they are implementing.\nThe tutorial will be example-driven, with illustrative case studies\nusing real data. Selected methods will include approximation methods,\nimportance sampling, Markov chain Monte Carlo (MCMC) methods such as\nMetropolis-Hastings and Slice sampling. In addition to model fitting,\nthe tutorial will address important techniques for model checking, model\ncomparison, and steps for preparing data and processing model output.\nTutorial content will be derived from the instructor's book *Bayesian\nStatistical Computing using Python*, to be published by Springer in late\n2014.\n\n.. figure:: http://d.pr/i/pqWT+\n   :alt: PyMC forest plot\n\n   PyMC forest plot\n.. figure:: http://d.pr/i/AHZV+\n   :alt: DAG\n\n   DAG\nAll course content will be available as a GitHub repository, including\nIPython notebooks and example data.\n\nTutorial Outline\n----------------\n\n1. Overview of Bayesian statistics.\n2. Bayesian Inference with NumPy and SciPy\n3. Markov chain Monte Carlo (MCMC)\n4. The Essentials of PyMC\n5. Fitting Linear Regression Models\n6. Hierarchical Modeling\n7. Model Checking and Validation\n\nInstallation Instructions\n-------------------------\n\nThe easiest way to install the Python packages required for this\ntutorial is via\n`Anaconda <https://store.continuum.io/cshop/anaconda/>`__, a scientific\nPython distribution offered by Continuum analytics. 
Several other\ntutorials will be recommending a similar setup.\n\nOne of the key features of Anaconda is a command line utility called\n``conda`` that can be used to manage third party packages. We have built\na PyMC package for ``conda`` that can be installed from your terminal\nvia the following command:\n\n::\n\n    conda install -c https://conda.binstar.org/pymc pymc\n\nThis should install any prerequisite packages that are required to run\nPyMC.\n\nOne caveat is that conda does not yet have a build of PyMC for **Python\n3**. Therefore, you would have to build it yourself via pip:\n\n::\n\n    pip install git+git://github.com/pymc-devs/[email protected]\n\nFor those of you on Mac OS X that are already using the\n`Homebrew <http://brew.sh>`__ package manager, I have prepared a script\nthat will install the entire Python scientific stack, including PyMC\n2.3. You can download the script\n`here <https://gist.github.com/fonnesbeck/7de008b05e670d919b71>`__ and\nrun it via:\n\n::\n\n    sh install_superpack_brew.sh\n\n", 'error': 'Explicit markup ends without a blank line; unexpected unindent.'}]
 WARNING:root:conference: scipy-2014
 WARNING:root:[{'file_path': 'data/scipy-2014/videos/anatomy-of-matplotlib-part-1.json', 'description': 'Introduction\n============\n\nPurpose of matplotlib\n---------------------\n\nOnline Documentation\n--------------------\n\n`matplotlib.org <http://matplotlib.org>`__\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nMailing Lists and StackOverflow\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nGithub Repository\n~~~~~~~~~~~~~~~~~\n\nBug Reports & Feature Requests\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nWhat is this "backend" thing I keep hearing about?\n==================================================\n\nInteractive versus non-interactive\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nAgg\n~~~\n\nTk, Qt, GTK, MacOSX, Wx, Cairo\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nPlotting Functions\n==================\n\nGraphs (plot, scatter, bar, stem, etc.)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nImages (imshow, pcolor, pcolormesh, contour[f], etc.)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nLesser Knowns: (pie, acorr, hexbin, streamplot, etc.)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nWhat goes in a Figure?\n======================\n\nAxes\n~~~~\n\nAxis\n~~~~\n\nticks (and ticklines and ticklabels) (both major & minor)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\naxis labels\n~~~~~~~~~~~\n\naxes title\n~~~~~~~~~~\n\nfigure suptitle\n~~~~~~~~~~~~~~~\n\naxis spines\n~~~~~~~~~~~\n\ncolorbars (and the oddities thereof)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\naxis scale\n~~~~~~~~~~\n\naxis gridlines\n~~~~~~~~~~~~~~\n\nlegend\n~~~~~~\n\nManipulating the "Look-and-Feel"\n================================\n\nIntroducing matplotlibrc\n~~~~~~~~~~~~~~~~~~~~~~~~\n\nProperties\n~~~~~~~~~~\n\n-  color (and edgecolor, linecolor, facecolor, etc...)\n-  linewidth and edgewidth and markeredgewidth (and the oddity that\n   happens in errorbar())\n-  linestyle\n-  fonts\n-  zorder\n-  visible\n\nWhat are 
toolkits?\n==================\n\naxes\\_grid1\n~~~~~~~~~~~\n\nmplot3d\n~~~~~~~\n\nbasemap\n~~~~~~~\n\n', 'error': 'Title level inconsistent:'}]
 WARNING:root:conference: scipy-2014
 WARNING:root:[{'file_path': 'data/scipy-2014/videos/bayesian-statistical-analysis-using-python-part-0.json', 'description': "The aim of this course is to introduce new users to the Bayesian\napproach of statistical modeling and analysis, so that they can use\nPython packages such as NumPy, SciPy and\n`PyMC <https://github.com/pymc-devs/pymc>`__ effectively to analyze\ntheir own data. It is designed to get users quickly up and running with\nBayesian methods, incorporating just enough statistical background to\nallow users to understand, in general terms, what they are implementing.\nThe tutorial will be example-driven, with illustrative case studies\nusing real data. Selected methods will include approximation methods,\nimportance sampling, Markov chain Monte Carlo (MCMC) methods such as\nMetropolis-Hastings and Slice sampling. In addition to model fitting,\nthe tutorial will address important techniques for model checking, model\ncomparison, and steps for preparing data and processing model output.\nTutorial content will be derived from the instructor's book *Bayesian\nStatistical Computing using Python*, to be published by Springer in late\n2014.\n\n.. figure:: http://d.pr/i/pqWT+\n   :alt: PyMC forest plot\n\n   PyMC forest plot\n.. figure:: http://d.pr/i/AHZV+\n   :alt: DAG\n\n   DAG\nAll course content will be available as a GitHub repository, including\nIPython notebooks and example data.\n\nTutorial Outline\n----------------\n\n1. Overview of Bayesian statistics.\n2. Bayesian Inference with NumPy and SciPy\n3. Markov chain Monte Carlo (MCMC)\n4. The Essentials of PyMC\n5. Fitting Linear Regression Models\n6. Hierarchical Modeling\n7. Model Checking and Validation\n\nInstallation Instructions\n-------------------------\n\nThe easiest way to install the Python packages required for this\ntutorial is via\n`Anaconda <https://store.continuum.io/cshop/anaconda/>`__, a scientific\nPython distribution offered by Continuum analytics. 
Several other\ntutorials will be recommending a similar setup.\n\nOne of the key features of Anaconda is a command line utility called\n``conda`` that can be used to manage third party packages. We have built\na PyMC package for ``conda`` that can be installed from your terminal\nvia the following command:\n\n::\n\n    conda install -c https://conda.binstar.org/pymc pymc\n\nThis should install any prerequisite packages that are required to run\nPyMC.\n\nOne caveat is that conda does not yet have a build of PyMC for **Python\n3**. Therefore, you would have to build it yourself via pip:\n\n::\n\n    pip install git+git://github.com/pymc-devs/[email protected]\n\nFor those of you on Mac OS X that are already using the\n`Homebrew <http://brew.sh>`__ package manager, I have prepared a script\nthat will install the entire Python scientific stack, including PyMC\n2.3. You can download the script\n`here <https://gist.github.com/fonnesbeck/7de008b05e670d919b71>`__ and\nrun it via:\n\n::\n\n    sh install_superpack_brew.sh\n\n", 'error': 'Explicit markup ends without a blank line; unexpected unindent.'}]
 WARNING:root:conference: scipy-2014
 WARNING:root:[{'file_path': 'data/scipy-2014/videos/wcsaxes-a-framework-for-plotting-astronomical-an.json', 'description': "Astronomical data (whether images on the sky, or other data) are\ntypically stored with information about their corresponding projection\n(Gnomonic, Mercator, Conical, Aitoff, and *many* more) and coordinate\nsystem (Equatorial, Galactic, Ecliptic, and so on).\n\nI will present `WCSAxes <https://github.com/astrofrog/wcsaxes>`__, a new\nframework for plotting such astronomical data, developed as part of the\n`Astropy <http://www.astropy.org>`__ project. WCSAxes consists primarily\nof a `Matplotlib <http://www.matplotlib.org>`__ Axes sub-class that\nseamlessly handles the plotting of ticks, tick labels, and grid lines\nfor arbitrary coordinate systems and projections.\n\nAs an example, the following plot was produced with WCSAxes:\n\n.. figure:: http://www.mpia.de/~robitaille/chandra_avm_small.png\n   :alt: The Galactic center as seen by the Chandra X-ray observatory\n\n   The Galactic Center as seen by Chandra\n(Image Credit: NASA/CXC/UMass/D. Wang et al. -\nhttp://chandra.harvard.edu/photo/2009/gcenter/)\n\nSince it is a sub-class of the Matplotlib Axes class, all the default\nMatplotlib methods such as plot, scatter, imshow, contour, as well as\npatches, lines, collections, and so on are supported, and WCSAxes - in\ncombination with Matplotlib's ability to accept arbitrary\ntransformations - makes it very easy to define whether the plotting\nshould apply to pixel coordinates, or a world coordinate system related\nto the data.\n\nWCSAxes has been designed as a framework that can be easily used in\nother Python tools, and it is planned for inclusion in\n`Glue <http://www.glueviz.org>`__, `APLpy <http://aplpy.github.io>`__,\nand other astronomical tools. 
While originally written for Astronomical\nimages, it should be easily extendable to any kind of map (such as\nEarth-based geospatial data) provided that the projection and coordinate\nsystem can be represented by a pixel-to-world transformation.\n", 'error': 'Explicit markup ends without a blank line; unexpected unindent.'}]
 WARNING:root:conference: scipy-2014
 WARNING:root:[{'file_path': 'data/scipy-2014/videos/anatomy-of-matplotlib-part-2.json', 'description': 'Introduction\n============\n\nPurpose of matplotlib\n---------------------\n\nOnline Documentation\n--------------------\n\n`matplotlib.org <http://matplotlib.org>`__\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nMailing Lists and StackOverflow\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nGithub Repository\n~~~~~~~~~~~~~~~~~\n\nBug Reports & Feature Requests\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nWhat is this "backend" thing I keep hearing about?\n==================================================\n\nInteractive versus non-interactive\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nAgg\n~~~\n\nTk, Qt, GTK, MacOSX, Wx, Cairo\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nPlotting Functions\n==================\n\nGraphs (plot, scatter, bar, stem, etc.)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nImages (imshow, pcolor, pcolormesh, contour[f], etc.)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nLesser Knowns: (pie, acorr, hexbin, streamplot, etc.)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nWhat goes in a Figure?\n======================\n\nAxes\n~~~~\n\nAxis\n~~~~\n\nticks (and ticklines and ticklabels) (both major & minor)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\naxis labels\n~~~~~~~~~~~\n\naxes title\n~~~~~~~~~~\n\nfigure suptitle\n~~~~~~~~~~~~~~~\n\naxis spines\n~~~~~~~~~~~\n\ncolorbars (and the oddities thereof)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\naxis scale\n~~~~~~~~~~\n\naxis gridlines\n~~~~~~~~~~~~~~\n\nlegend\n~~~~~~\n\nManipulating the "Look-and-Feel"\n================================\n\nIntroducing matplotlibrc\n~~~~~~~~~~~~~~~~~~~~~~~~\n\nProperties\n~~~~~~~~~~\n\n-  color (and edgecolor, linecolor, facecolor, etc...)\n-  linewidth and edgewidth and markeredgewidth (and the oddity that\n   happens in errorbar())\n-  linestyle\n-  fonts\n-  zorder\n-  visible\n\nWhat are 
toolkits?\n==================\n\naxes\\_grid1\n~~~~~~~~~~~\n\nmplot3d\n~~~~~~~\n\nbasemap\n~~~~~~~\n\n', 'error': 'Title level inconsistent:'}]
 WARNING:root:conference: scipy-2014
 WARNING:root:[{'file_path': 'data/scipy-2014/videos/anatomy-of-matplotlib-part-3.json', 'description': 'Introduction\n============\n\nPurpose of matplotlib\n---------------------\n\nOnline Documentation\n--------------------\n\n`matplotlib.org <http://matplotlib.org>`__\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nMailing Lists and StackOverflow\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nGithub Repository\n~~~~~~~~~~~~~~~~~\n\nBug Reports & Feature Requests\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nWhat is this "backend" thing I keep hearing about?\n==================================================\n\nInteractive versus non-interactive\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nAgg\n~~~\n\nTk, Qt, GTK, MacOSX, Wx, Cairo\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nPlotting Functions\n==================\n\nGraphs (plot, scatter, bar, stem, etc.)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nImages (imshow, pcolor, pcolormesh, contour[f], etc.)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nLesser Knowns: (pie, acorr, hexbin, streamplot, etc.)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nWhat goes in a Figure?\n======================\n\nAxes\n~~~~\n\nAxis\n~~~~\n\nticks (and ticklines and ticklabels) (both major & minor)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\naxis labels\n~~~~~~~~~~~\n\naxes title\n~~~~~~~~~~\n\nfigure suptitle\n~~~~~~~~~~~~~~~\n\naxis spines\n~~~~~~~~~~~\n\ncolorbars (and the oddities thereof)\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\naxis scale\n~~~~~~~~~~\n\naxis gridlines\n~~~~~~~~~~~~~~\n\nlegend\n~~~~~~\n\nManipulating the "Look-and-Feel"\n================================\n\nIntroducing matplotlibrc\n~~~~~~~~~~~~~~~~~~~~~~~~\n\nProperties\n~~~~~~~~~~\n\n-  color (and edgecolor, linecolor, facecolor, etc...)\n-  linewidth and edgewidth and markeredgewidth (and the oddity that\n   happens in errorbar())\n-  linestyle\n-  fonts\n-  zorder\n-  visible\n\nWhat are 
toolkits?\n==================\n\naxes\\_grid1\n~~~~~~~~~~~\n\nmplot3d\n~~~~~~~\n\nbasemap\n~~~~~~~\n\n', 'error': 'Title level inconsistent:'}]
diff --git a/valid.py b/valid.py
 """
 Validate rst data in `description` & `summary` fields of json files.

 Usage::

    python bin/validate.py
 """

 import glob
 import json
 import logging
 import subprocess

 import restructuredtext_lint


 LOG_FILENAME = 'pyvideo.log'
 logging.basicConfig(filename=LOG_FILENAME, level=logging.DEBUG)


 def validate_rst(rst):
    errors = restructuredtext_lint.lint(rst)
    return errors


 def validate_file(file_path):
    with open(file_path) as fp:
        content = json.load(fp)

    errors = []
    for section_name in ('description', 'summary'):
        section = content.get(section_name)
        if section:
            error = validate_rst(section)
            if error:
                errors.append({
                    'file_path': file_path,
                    section_name: section,
                    'error': error[0].message
                })
    return errors


 def run_command(command):
    output = subprocess.check_output(command.split())
    return output.decode('utf-8').split('\n')


 DATA_DIR = '../data'


 def validate(branch):
    # print(branch)
    try:
        run_command('git checkout {}'.format(branch))
    except subprocess.CalledProcessError:
        # Skip branches that git cannot check out.
        return
    branch = branch.split('/')[-1]
    pattern = '{}/{}/**/*.json'.format(DATA_DIR, branch)
    json_files = glob.iglob(pattern, recursive=True)
    # print(json_files, pattern)
    for json_file in json_files:
        # print(json_file)
        errors = validate_file(json_file)
        if errors:
            logging.warning('conference: {}'.format(branch))
            logging.warning(errors)
    run_command('git checkout master')



 branches = run_command('git branch -a')
 branches = [b.strip() for b in branches if 'origin' in b]
 # print(branches)

 for branch in branches:
    validate(branch)
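A possible refinement to the branch filter in `valid.py`, offered as a sketch rather than as part of this diff: the script keeps every `git branch -a` line containing `origin`, which typically also matches the symbolic `remotes/origin/HEAD -> origin/master` entry. The helper name `remote_branch_names` and the sample lines below are hypothetical, made up for illustration.

```python
def remote_branch_names(branch_lines):
    """Return remote branch names from `git branch -a` output lines.

    Hypothetical helper, not part of valid.py: it applies the same
    'origin' filter as the script and also skips the symbolic HEAD pointer.
    """
    names = []
    for line in branch_lines:
        name = line.strip()
        # Skip local branches, blank lines, and entries such as
        # "remotes/origin/HEAD -> origin/master".
        if 'origin' not in name or '->' in name:
            continue
        names.append(name)
    return names


# Made-up sample of `git branch -a` output, split into lines.
sample = [
    '* master',
    '  remotes/origin/HEAD -> origin/master',
    '  remotes/origin/chipy',
    '  remotes/origin/scipy-2014',
    '',
]
print(remote_branch_names(sample))
# ['remotes/origin/chipy', 'remotes/origin/scipy-2014']
```

Filtering the pointer line up front would avoid passing a string with `->` in it to `git checkout`, instead of relying on the `try`/`except` in `validate` to swallow the failure.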