@durden
Last active December 21, 2015 05:29
Notes from Pytexas 2013

Filtering and Deduplicating Data in IPython Notebook

  • Nice shortcut to create a list of strings
fields = '''\
test1
test2
test3
test4'''.split()  # no commas: split() only breaks on whitespace

instancecheck

James Powell

  • Dunder methods are looked up on the type for new-style classes

  • Dunder methods are looked up on the instance for old-style classes

  • Add iteration to a class object with metaclasses

class Foo(object):
    def __iter__(self):
        yield 'iterating over instance'
    class __metaclass__(type):
        def __iter__(self):
            yield 'iterating over class, not instance'
  • Could be useful to do something like iterating over the fields
  • Similar to how Django objects have a fields method to iterate
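The nested `class __metaclass__` form above is Python 2 syntax, and Python 3 ignores it. A minimal sketch of the same idea in Python 3 follows. The names `IterableClass` and `fields` are hypothetical, picked only to illustrate iterating over a class's fields:

```python
class IterableClass(type):
    """Metaclass that makes the class object itself iterable."""

    def __iter__(cls):
        # Runs for iter(Foo): dunders are looked up on type(Foo), i.e. here.
        return iter(cls.fields)


class Foo(metaclass=IterableClass):
    fields = ('name', 'email')  # hypothetical field names

    def __iter__(self):
        # Runs for iter(Foo()), i.e. iterating over an instance.
        yield 'iterating over instance'


class_items = list(Foo)       # iterates over the class
instance_items = list(Foo())  # iterates over an instance
```

The two `__iter__` methods do not clash because one lives on the metaclass and the other on the class.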

instancecheck dunder

__instancecheck__ and __subclasscheck__ exist on the metaclass or type itself, not on an instance

class Dangerous(object):
    class __metaclass__(type):
        def __instancecheck__(self, obj):
            return hasattr(obj, 'attack')

class Monster(object):
    def attack(self): pass

griffin = Monster()
assert isinstance(griffin, Dangerous)
  • A way to check for a capability (mixin-style) instead of checking for inheritance

issubclass dunder

class Dangerous(object):
    class __metaclass__(type):
        def __subclasscheck__(self, cls):
            return hasattr(cls, 'attack')

class Dragon(Monster):
    def __init__(self):
        self.attack = 10

print issubclass(Dragon, Dangerous)
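For reference, both hooks can be sketched together in Python 3 syntax. This is an illustrative port of the snippets above, not code from the talk; `DangerousMeta` and `Rabbit` are hypothetical names added for the example:

```python
class DangerousMeta(type):
    def __instancecheck__(cls, obj):
        # isinstance(obj, Dangerous) is routed here.
        return hasattr(obj, 'attack')

    def __subclasscheck__(cls, subclass):
        # issubclass(subclass, Dangerous) is routed here.
        return hasattr(subclass, 'attack')


class Dangerous(metaclass=DangerousMeta):
    pass


class Monster:
    def attack(self):
        pass


class Rabbit:  # hypothetical class with no attack attribute
    pass


griffin_is_dangerous = isinstance(Monster(), Dangerous)
monster_is_dangerous = issubclass(Monster, Dangerous)
rabbit_is_dangerous = isinstance(Rabbit(), Dangerous)
```

Neither `Monster` nor `Rabbit` inherits from `Dangerous`; the result depends only on whether an `attack` attribute is present.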

What are these useful for?

  • Modeling in terms of types (userland type systems)

    • Make a mixin that uses a metaclass and instancecheck for something like:
    class OddSequence(object):
        class __metaclass__(type):
            def __instancecheck__(self, obj):
                # True only if every element is odd
                return all([x % 2 for x in obj])
    
    print isinstance([1,2,3,4], OddSequence)
  • This boils down to either writing a function that tells us if a sequence is odd or using a class to make it look like a type check.

    • Using a class might be more useful for some scenarios b/c it actually constrains what you can do.
    • For example, a plain string field in an object comes with no limitations, so every check has to compare strings that could differ in case, etc.
      • So, when you use a type you constrain what you can do by putting this string into the type itself.
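The two options discussed above (a plain predicate function versus a class that makes the predicate look like a type check) can be put side by side. This is a Python 3 sketch; `is_odd_sequence` and `OddSequenceMeta` are hypothetical names used only for this comparison:

```python
def is_odd_sequence(seq):
    """Plain-function version: True if every element is odd."""
    return all(x % 2 for x in seq)


class OddSequenceMeta(type):
    def __instancecheck__(cls, obj):
        # Reuse the predicate so isinstance() and the function always agree.
        return is_odd_sequence(obj)


class OddSequence(metaclass=OddSequenceMeta):
    pass


mixed = isinstance([1, 2, 3, 4], OddSequence)
odd = isinstance([1, 3, 5], OddSequence)
```

Both forms compute the same answer; the class form only changes how the check reads at the call site.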

Classes and metaclasses

James Powell

  • Metaclasses are a tool for enforcing constraints from base to derived

  • Build a configuration system by using the existing import system instead of config files, etc.

    • This allows you to integrate your configuration directly into the language rather than through a separate, higher-level mechanism.
  • Instance methods in Python are created dynamically when they are looked up, so each access to obj.method returns a new bound method object

  • Use case for old-style classes:

    • Can change dunder methods on the fly
    • new-style dunder methods are looked up on the type object
  • Creation of a class in Python is really:

    • Create a function that knows how to create a class using a standard protocol:
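      • call __new__, __init__, etc. in this order
      • Think of it as just a function:
        def class_builder(meta, name, bases, body):
            # roughly what calling the metaclass does
            cls = meta.__new__(meta, name, bases, body)
            meta.__init__(cls, name, bases, body)
            return cls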
  • Use metaclasses to do something like creating your own object model:

    class foo():
        pass
    class baz():
        pass
    class bar():
        pass
    
    new_class = (foo | baz) & bar
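  • The metaclass lookup starts with the first base class, but the most derived metaclass among all the bases is the one that ends up being used

    class A(Foo, Bar, Baz):
        pass
    • Foo's metaclass is used (if it exists) unless Bar or Baz has a metaclass derived from it; unrelated metaclasses raise a "metaclass conflict" TypeError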
  • Use metaclass to add constraints to inherited classes from base class

    • Python 3.2
    class metaclass(type):
        def __init__(self, name, bases, body):
            if name == 'Derived':
                raise ValueError("I don't like your name")
    
            return type.__init__(self, name, bases, body)
    
    class Base(metaclass=metaclass):
        pass
    
    class Derived(Base):
        pass
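The name check above is a toy constraint. A slightly more practical sketch of enforcing a rule from base to derived is shown here in Python 3. The rule (every derived class must define `run`) and the names `RequireRun`, `Good` and `Bad` are assumptions made up for illustration:

```python
class RequireRun(type):
    def __init__(cls, name, bases, body):
        # bases is empty for the base class itself, so only
        # derived classes are checked.
        if bases and 'run' not in body:
            raise TypeError('%s must define run()' % name)
        super().__init__(name, bases, body)


class Base(metaclass=RequireRun):
    pass


class Good(Base):
    def run(self):
        return 'ok'


try:
    class Bad(Base):  # no run(): rejected while the class is being created
        pass
except TypeError as exc:
    error = str(exc)
else:
    error = None
```

The error is raised when the `class` statement executes, before any instance exists.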

Nobody expects the Python Packaging Authority

Nick Coghlan

  • Any interesting packaging tools need to support all the way back to Python 2.6 b/c 2.6 and 2.7 are still very heavily used.

setuptools vs distribute

  • Historically the setuptools development process was very closed, which was confusing for newcomers, so distribute was created
  • Earlier this year these two projects resolved their differences and merged into setuptools 0.8+

pip vs easy_install

  • Biggest benefit of pip was the defaults were better in line with what people wanted.
  • However, pip has limitations as well: it doesn't support the binary egg format
  • The latest version of pip (1.4+) can install binary packages using a new binary package format called wheel.
  • So, if you can regenerate your binaries as wheels instead of eggs, pip is the way forward.

What was wrong with binary eggs that required the creation of wheels?

  • https://pypi.python.org/pypi/wheel
  • eggs are just a big blob of files and don't play nice with file system hierarchies
  • file naming conventions in wheel are better and give you more information about what they are suited for
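To make the naming point concrete: PEP 427 defines wheel file names as `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`. Below is a rough sketch of reading those fields. `parse_wheel_filename` is a hypothetical helper written for this note (not part of pip or wheel), and the file name is invented:

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its PEP 427 components."""
    if not filename.endswith('.whl'):
        raise ValueError('not a wheel file: %r' % filename)
    parts = filename[:-len('.whl')].split('-')
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError('unexpected wheel file name: %r' % filename)
    return {
        'name': name,
        'version': version,
        'build': build_tag,
        'python': python_tag,      # e.g. which Python versions it targets
        'abi': abi_tag,            # e.g. 'none' for pure-Python wheels
        'platform': platform_tag,  # e.g. 'any' for platform-independent wheels
    }


# Invented file name, used only to exercise the parser.
info = parse_wheel_filename('example_pkg-1.0-py2.py3-none-any.whl')
```

The Python, ABI and platform tags are the extra information about what a file is suited for.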

Python packaging authority (PyPA)

  • Home of distlib, pip, setuptools, pypi, virtualenv, etc.
  • Python packaging user guide is coming from this group
  • Goals of group:
    • Have clear authority and guidance on what users should use
    • Make it easy to get started with pip
    • Fast, reliable, and reasonably secure
    • Better platform interoperability

Science tools are exception

  • Scientific community has tough dependencies so using wheels may not be the best way forward
  • The naming scheme for wheels is not always suitable for scientific tools.
  • Using hashdist on packaging side and anaconda on the install side
  • Not the end of the world b/c setuptools/pip get to focus on the common use-case and make it much better.
  • Another tool is zztop builder?

Pip in stdlib

  • Moving pip into the stdlib is controversial b/c it would be nice if pip could upgrade itself and not be tied to the Python install.
    • Make some pip bootstrap tools and include only those in the stdlib, not the pip version itself.
    • PEP 439 describes a bootstrapping mechanism that is considered unworkable and probably won't be used.
    • Hopefully these bootstrap tools will be in Python 3.4
    • For Python 2.7, 3.3 these tools will be hosted on pypi and allow older versions to get these as well.

Fast, reliable, and reasonably secure

  • Things that slow down distribution
    • Finding mirrors in pip is weird and complicated, so most people don't use it
    • Finding dependencies requires pulling down the entire package and reading its files, b/c the index server doesn't include metadata
  • Fastly donated CDN so pypi should be faster now
  • PEP 438: Eliminating scanning of external links
  • Pip can cache wheels to speed things up

What prevents secure distribution

  • No package signing
  • No SSL verification

Future (post Python 3.4)

  • Better metadata using JSON, not key value
  • PyPI can then hold the metadata and speed up installation b/c dependencies can be queried without pulling down the entire package first