A journey to python 3: porting redis-limpyd

Introduction

A few weeks ago, I - Twidi - made a proposal to talk about redis-limpyd at Pytong.

And it was accepted. But... the main subject of Pytong is "After python 2".

redis-limpyd currently only works on python 2.6 and 2.7, and as I said during my lightning talk at PyconFR in October 2013 (start at 47:50), "porting to python 3 is on it's way"... but the only thing that was done at this time - and it's still the case as I'm writing these lines - was a simple run of the 2to3 command to see the things that needed to be done, by Yohan Boniface, original author of the project.

So, now it's time to go. I'll write this post during my work on porting the project to python 3. So at this precise point in time, I still don't know how it will be done: it's my first time doing this.

The only thing I know is the goal: I want redis-limpyd to work for projects with python 2.7, and python 3.

Will it still support python 2.6? Will it have two distinct code bases?

We'll all know at the end of the journey.

Preparing the environments

Preparing git:

# Fetching the project
cd ~/dev
git clone [email protected]:twidi/redis-limpyd.git
cd redis-limpyd
# Stay up to date with the original version (mine is a fork)
git remote add upstream https://github.com/yohanboniface/redis-limpyd.git
git fetch upstream
git checkout master
git merge upstream/master

The work will be done in a branch I use git-flow so:

git flow feature start python3-compat

So my branch is:

git status
> On branch feature/python3-compat

Now that i have my project ready to work on, I'll create two environments, one for python 2.7 and one for python 3.4. The work will be done on these two versions of python by simplicity: they both come out of the box with Ubuntu 14.04.

I use virtualenv-wrapper so to create the environments, I simply open two terminals, and do:

For python 2.7, the default python, in terminal #1:

mkvirtualenv limpyd2.7
cd ~/dev/redis-limpyd

For python 3.4,in terminal #2:

mkvirtualenv -p `which python3.4` limpyd-3.4
cd ~/dev/redis-limpyd

Now I need to install requirements. In each terminal, I run the following command:

pip install -r requirements.txt

Everything is ok in the python 2.7 terminal, but not on the 3.4, because the unitest2 package is not python 3 compatible:

    except Exception, e:

                    ^

SyntaxError: invalid syntax

I remember the requirement of unittest2 had smoething to do wih python 2.6. So I checked the history, and it's cnfirmed by the commit 39897b5, which added the argparse and unittest2 packages to support python 2.6.

So I didn't even started that I already have to make a choice.

Rejecting python 2.6, or make separate requirements.txt files?

To be 2.6 or not to be 2.6

I asked Aymeric Augusting (code dev Django and CTO at Oscaro where I work) a few days ago about the need to support 2.6 and his answer was clear: python 2.6 is "end-of-life" for a year, so it's not really useful to complicate things by supporting it.

Ok, mYk, Ok, it will be simpler...

git being a wonderful tool, it provides a command to do exactly what I need: to create a commit that revert what was done by the commit I found earlier in 39897b5:

git revert 39897b5

git was not able to create the commit because of a very simple conflict in... the requirements.txt file.

The conflict is:

<<<<<<< HEAD
redis==2.9.1
argparse
unittest2
=======
redis
>>>>>>> parent of 39897b5... Add argparse and unittest2 requirements to work with python 2.6

It's easy to resolve, simple keep the redis==2.9.1 line, and run git add requirements.txt.

Before creating the final commit, I also have to inform travis-ci that I don't want him to test with python 2.6 anymore, so I remove the line with 2.6 in the .travis.yml file, then I add this file to be commited (git add .travis.yml)

There is now reason that something may be broken with python 2.7 by doing that, but just to be sure, in the python 2.7 terminal, I check by doing this:

$ python run_tests.py
[...]
Ran 300 tests in 3.910s

OK

So now I can create and push my first commit.

git ci -m "Remove support for python 2.6 (revert [39897b5])"
# It's my first push on this branch, so I ask git-flow to publish the branch
git flow publish

I can now install requirements on both environments:

# First line only needed in the python 2.7 terminal to remove now uneeded packages
for req in `pip freeze`; do pip uninstall -qy $req; done
pip install -r requirements.txt

And in both environments, the only required packags, redis==2.9.1 was correctly installed.

What needs to be changed?

Even if we (Yohan and myself) think that we wrote good code, it's not - at all - the question.

Python 3 is a compatibility killer. Changes in the way metaclasses work, iterators everywhere, strings vs unicode... and we used all of these stuff.

Simply trying to run the limpyd test suite in the python 3.4 terminal showed me the work to do:

$ python run_tests.py
Traceback (most recent call last):
  File "run_tests.py", line 7, in <module>
    from tests import base, model, utils, collection, lock, fields
  File "/home/twidi/dev/redis-limpyd/tests/model.py", line 9, in <module>
    from limpyd import model
  File "/home/twidi/dev/redis-limpyd/limpyd/model.py", line 7, in <module>
    from limpyd.fields import *
  File "/home/twidi/dev/redis-limpyd/limpyd/fields.py", line 5, in <module>
    from itertools import izip
ImportError: cannot import name 'izip'

But to know the amount of things that need to be changed, a tool is needed.

By searching on the internet about porting python 2 code to python 3, one relevant link is https://docs.python.org/3/howto/pyporting.html.

And the "Projects to Consider" paragraph told me that there are many options, but the one that attracted me was "If you want to write your compatible code to feel more like Python 3 there is the future project.", pointing us to the python-future.org website.

On this website, I easily found the Automatic conversion to Py2/3 page, which sounds great, giving me tools to show what needs to be changed, and even do a lot of the changes for me!

So let's start by installing the future package in both environments:

pip install future

And I also add this line to the requirements file:

future==0.13.0

And I can now easily see what needs to be changed in the limpyd package and the other python files (run_tests.py and setup.py) by running the following command (I use the non-default --unicode-literals option to manage strings that may not be prefixed by u in the current codebase):

futurize --unicode-literals *.py limpyd/ tests/

And below the output of the command, by python file:

run_tests.py:

@@ -1,5 +1,6 @@
 #!/usr/bin/env python
 
+from __future__ import unicode_literals
 import unittest
 import argparse

setup.py:

@@ -1,3 +1,5 @@
+from __future__ import unicode_literals
+from future.builtins import str
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-

limpyd/init.py:

@@ -1,6 +1,8 @@
 """Limpyd provides an easy way to store objects in Redis, without losing the
 power and the control of the Redis API, in a limpid way, with just as
 abstraction as needed."""
+from __future__ import unicode_literals
+from future.builtins import map
 
 VERSION = (0, 1, 3)

limpyd/collection.py:

@@ -1,3 +1,5 @@
+from __future__ import unicode_literals
+from future.builtins import object
 # -*- coding:utf-8 -*-
 
 from limpyd.utils import unique_key
@@ -290,7 +292,7 @@
 
     def _add_filters(self, **filters):
         """Define self._lazy_collection according to filters."""
-        for key, value in filters.iteritems():
+        for key, value in filters.items():
             if self.cls._field_is_pk(key):
                 pk = self.cls.get_field('pk').normalize(value)
                 self._lazy_collection['pks'].add(pk)

limpyd/database.py:

@@ -1,3 +1,6 @@
+from __future__ import unicode_literals
+from future.builtins import str
+from future.builtins import object
 # -*- coding:utf-8 -*-
 
 import redis

limpyd/exceptions.py:

@@ -1,3 +1,4 @@
+from __future__ import unicode_literals
 __all__ = [
     "UniquenessError",
     "ImplementationError",

limpyd/fields.py:

@@ -1,8 +1,13 @@
+from __future__ import unicode_literals
+from future.builtins import str
+from future.builtins import zip
+from future.builtins import object
 # -*- coding:utf-8 -*-
 
 from logging import getLogger
 from copy import copy
-from itertools import izip
+from future.utils import with_metaclass
+
 from redis.exceptions import RedisError
 from redis.client import Lock
 
@@ -78,15 +83,7 @@
         return it
 
 
-class RedisProxyCommand(object):
-
-    __metaclass__ = MetaRedisProxy
-
-    # Commands allowed for an object, by type, each attribute is a list/typle.
-    # If an attribute is not defined, the one from its parent class is used.
-    # Here the different attributes:
-    #  - available_getters:  commands that get data from redis
-    #  - available_modifiers: commands that set data in redis
+class RedisProxyCommand(with_metaclass(MetaRedisProxy, object)):
 
     @classmethod
     def _make_command_method(cls, command_name):
@@ -676,7 +673,7 @@
                     raise RedisError("ZADD requires an equal number of "
                                      "values and scores")
                 keys.extend(args[1::2])
-            for pair in kwargs.iteritems():
+            for pair in kwargs.items():
                 keys.append(pair[0])
             self.index(keys)
         return self._traverse_command(command, *args, **kwargs)
@@ -708,7 +705,7 @@
                                  "values and scores")
             pieces.extend(args)
 
-        for pair in kwargs.iteritems():
+        for pair in kwargs.items():
             pieces.append(pair[1])
             pieces.append(pair[0])
 
@@ -719,7 +716,7 @@
         scores = pieces[0::2]
 
         pieces = []
-        for z in izip(scores, values):
+        for z in zip(scores, values):
             pieces.extend(z)
 
         return pieces
@@ -833,7 +830,7 @@
     def _call_hmset(self, command, *args, **kwargs):
         if self.indexable:
             current = self.proxy_get()
-            _to_deindex = dict((k, current[k]) for k in kwargs.iterkeys() if k in current)
+            _to_deindex = dict((k, current[k]) for k in kwargs.keys() if k in current)
             self.deindex(_to_deindex)
             self.index(kwargs)
         return self._traverse_command(command, kwargs)
@@ -899,7 +896,7 @@
 
         if values is None:
             values = self.proxy_get()
-        for field_name, value in values.iteritems():
+        for field_name, value in values.items():
             if value is not None:
                 key = self.index_key(value, field_name)
                 self.add_index(key)
@@ -912,7 +909,7 @@
 
         if values is None:
             values = self.proxy_get()
-        for field_name, value in values.iteritems():
+        for field_name, value in values.items():
             if value is not None:
                 key = self.index_key(value, field_name)
                 self.remove_index(key)

limpyd/model.py:

@@ -1,3 +1,4 @@
+from __future__ import unicode_literals
 # -*- coding:utf-8 -*-
 
 from logging import getLogger
@@ -9,6 +10,7 @@
 from limpyd.exceptions import *
 from limpyd.database import RedisDatabase
 from limpyd.collection import CollectionManager
+from future.utils import with_metaclass
 
 __all__ = ['RedisModel', ]
 
@@ -46,7 +48,7 @@
         own_fields = []
 
         # limit to redis fields
-        redis_fields = [(k, v) for k, v in attrs.items() if not k.startswith('_') and isinstance(v, RedisField)]
+        redis_fields = [(k, v) for k, v in list(attrs.items()) if not k.startswith('_') and isinstance(v, RedisField)]
         # and keep them by declaration order
         redis_fields = [(k, v) for k, v in sorted(redis_fields, key=lambda f: f[1]._creation_order)]
 
@@ -107,12 +109,10 @@
         return it
 
 
-class RedisModel(RedisProxyCommand):
+class RedisModel(with_metaclass(MetaRedisModel, RedisProxyCommand)):
     """
     Base redis model.
     """
-
-    __metaclass__ = MetaRedisModel
 
     namespace = None  # all models in an app may have the same namespace
     lockable = True
@@ -173,7 +173,7 @@
             # redis do not has "real" transactions)
             # Here we do not set anything, in case one unique field fails
             kwargs_pk_field_name = None
-            for field_name, value in kwargs.iteritems():
+            for field_name, value in kwargs.items():
                 if self._field_is_pk(field_name):
                     if kwargs_pk_field_name:
                         raise ValueError(u'You cannot pass two values for the '
@@ -339,8 +339,8 @@
             raise ValueError(u"`Exists` method requires at least one kwarg.")
 
         # special case to check for a simple pk
-        if len(kwargs) == 1 and cls._field_is_pk(kwargs.keys()[0]):
-            return cls.get_field('pk').exists(kwargs.values()[0])
+        if len(kwargs) == 1 and cls._field_is_pk(list(kwargs.keys())[0]):
+            return cls.get_field('pk').exists(list(kwargs.values())[0])
 
         # get only the first element of the unsorted collection (the fastest)
         try:
@@ -361,8 +361,8 @@
             pk = args[0]
         elif kwargs:
             # special case to check for a simple pk
-            if len(kwargs) == 1 and cls._field_is_pk(kwargs.keys()[0]):
-                pk = kwargs.values()[0]
+            if len(kwargs) == 1 and cls._field_is_pk(list(kwargs.keys())[0]):
+                pk = list(kwargs.values())[0]
             else:  # case with many filters
                 result = cls.collection(**kwargs).sort(by='nosort')
                 if len(result) == 0:
@@ -430,7 +430,7 @@
         one redis call. You must pass kwargs with field names as keys, with
         their value.
         """
-        if kwargs and not any(kwarg in self._instancehash_fields for kwarg in kwargs.iterkeys()):
+        if kwargs and not any(kwarg in self._instancehash_fields for kwarg in kwargs.keys()):
             raise ValueError("Only InstanceHashField can be used here.")
 
         indexed = []
@@ -439,7 +439,7 @@
         try:
 
             # Set indexes for indexable fields.
-            for field_name, value in kwargs.items():
+            for field_name, value in list(kwargs.items()):
                 field = self.get_field(field_name)
                 if field.indexable:
                     indexed.append(field)

limpyd/utils.py:

@@ -1,3 +1,5 @@
+from __future__ import unicode_literals
+from future.builtins import str
 import uuid
 
 from functools import wraps
@@ -8,7 +10,7 @@
 
 def make_key(*args):
     """Create the key concatening all args with `:`."""
-    return u":".join(unicode(arg) for arg in args)
+    return u":".join(str(arg) for arg in args)
 
 
 def unique_key(connection):

limpyd/contrib/collection.py:

@@ -1,3 +1,6 @@
+from __future__ import unicode_literals
+from future.builtins import zip
+from future.builtins import object
 # -*- coding:utf-8 -*-
 
 from itertools import islice, chain
@@ -124,7 +127,7 @@
                                'stored at a key that does not exist anymore.')
 
         for set_ in sets:
-            if isinstance(set_, basestring):
+            if isinstance(set_, str):
                 all_sets.add(set_)
             elif isinstance(set_, ExtendedFilter):
                 # We have a RedisModel and we'll use its pk, or a RedisField
@@ -193,7 +196,7 @@
             elif isinstance(set_, MultiValuesField) and not getattr(set_, '_instance', None):
                 raise ValueError('%s passed to "intersect" must be bound'
                                  % set_.__class__.__name__)
-            elif not isinstance(set_, (tuple, basestring, MultiValuesField, _StoredCollection)):
+            elif not isinstance(set_, (tuple, str, MultiValuesField, _StoredCollection)):
                 raise ValueError('%s is not a valid type of argument that can '
                                  'be used as a set. Allowed are: string (key '
                                  'of a redis set), limpyd multi-values field ('
@@ -282,7 +285,7 @@
             by = parameters.get('by_score', None)
             if isinstance(by, SortedSetField) and getattr(by, '_instance', None):
                 by = by.key
-            elif not isinstance(by, basestring):
+            elif not isinstance(by, str):
                 by = None
 
             if by is None:
@@ -433,9 +436,9 @@
          tuples: [('id1', 'name1'), ('id2', 'name2')]
          dicts:  [{'id': 'id1', 'name': 'name1'}, {'id': 'id2', 'name': 'name2'}]
         """
-        result = zip(*([iter(collection)] * len(self._values['fields']['names'])))
+        result = list(zip(*([iter(collection)] * len(self._values['fields']['names']))))
         if self._values['mode'] == 'dicts':
-            result = [dict(zip(self._values['fields']['names'], a_result))
+            result = [dict(list(zip(self._values['fields']['names'], a_result)))
                                                 for a_result in result]
         return result
 
@@ -537,7 +540,7 @@
         """
         string_filters = filters.copy()
 
-        for field_name, value in filters.iteritems():
+        for field_name, value in filters.items():
 
             is_extended = False
 
@@ -749,7 +752,7 @@
         flat = kwargs.pop('flat', False)
         if kwargs:
             raise ValueError('Unexpected keyword arguments for the values method: %s'
-                             % (kwargs.keys(),))
+                             % (list(kwargs.keys()),))
 
         if not fields:
             fields = self._get_simple_fields()

limpyd/contrib/database.py:

@@ -1,3 +1,4 @@
+from __future__ import unicode_literals
 # -*- coding:utf-8 -*-
 
 from redis.client import StrictPipeline

limpyd/contrib/related.py:

@@ -1,3 +1,6 @@
+from __future__ import unicode_literals
+from future.builtins import map
+from future.builtins import object
 # -*- coding:utf-8 -*-
 
 import re
@@ -6,6 +9,7 @@
 from limpyd import model, fields
 from limpyd.exceptions import *
 from limpyd.contrib.collection import ExtendedCollectionManager
+from future.utils import with_metaclass
 
 # used to validate a related_name
 re_identifier = re.compile(r"\W")
@@ -152,7 +156,7 @@
         # models impacted by the change of database (impacted_models), to move
         # these relation on the new database
         reverse_relations = {}
-        for related_model_name, relations in original_database._relations.iteritems():
+        for related_model_name, relations in original_database._relations.items():
             for relation in relations:
                 reverse_relations.setdefault(relation[0], []).append((related_model_name, relation))
 
@@ -195,7 +199,7 @@
         return it
 
 
-class RelatedFieldMixin(object):
+class RelatedFieldMixin(with_metaclass(RelatedFieldMetaclass, object)):
     """
     Base mixin for all fields holding related instances.
     This mixin provides:
@@ -211,7 +215,6 @@
       "_commands_with_many_values_from_python" (see RelatedFieldMetaclass)
     - management of related parameters: "to" and "related_name"
     """
-    __metaclass__ = RelatedFieldMetaclass
 
     _copy_conf = copy(fields.RedisField._copy_conf)
     _copy_conf['kwargs'] += [('to', 'related_to'), 'related_name']
@@ -286,7 +289,7 @@
         if isinstance(self.related_to, type) and issubclass(self.related_to, RelatedModel):
             model_name = self.related_to._name
 
-        elif isinstance(self.related_to, basestring):
+        elif isinstance(self.related_to, str):
             if self.related_to == 'self':
                 model_name = self._model._name
             elif ':' not in self.related_to:
@@ -361,7 +364,7 @@
         """
         Apply the from_python to each values and return the final list
         """
-        return map(self.from_python, values)
+        return list(map(self.from_python, values))
 
     @classmethod
     def _make_command_method(cls, command_name, many=False):

tests/base.py:

@@ -1,3 +1,5 @@
+from __future__ import unicode_literals
+from future.builtins import object
 import unittest
 import sys

tests/collection.py:

@@ -1,3 +1,5 @@
+from __future__ import absolute_import
+from __future__ import unicode_literals
 # -*- coding:utf-8 -*-
 
 import unittest
@@ -6,8 +8,8 @@
 from limpyd import fields
 from limpyd.collection import CollectionManager
 from limpyd.exceptions import *
-from base import LimpydBaseTest, TEST_CONNECTION_SETTINGS
-from model import Boat, Bike, TestRedisModel
+from .base import LimpydBaseTest, TEST_CONNECTION_SETTINGS
+from .model import Boat, Bike, TestRedisModel
 
 
 class CollectionBaseTest(LimpydBaseTest):

tests/lock.py:

@@ -1,3 +1,7 @@
+from __future__ import absolute_import
+from __future__ import unicode_literals
+from future import standard_library
+standard_library.install_hooks()
 # -*- coding:utf-8 -*-
 
 import unittest
@@ -8,8 +12,8 @@
 from limpyd.utils import make_key
 from limpyd import fields
 
-from base import LimpydBaseTest
-from model import TestRedisModel
+from .base import LimpydBaseTest
+from .model import TestRedisModel
 
 
 class HookedStringField(fields.StringField):

tests/model.py:

@@ -1,3 +1,7 @@
+from __future__ import absolute_import
+from __future__ import unicode_literals
+from future import standard_library
+standard_library.install_hooks()
 # -*- coding:utf-8 -*-
 
 import unittest
@@ -9,7 +13,7 @@
 from limpyd import model
 from limpyd import fields
 from limpyd.exceptions import *
-from base import LimpydBaseTest, TEST_CONNECTION_SETTINGS
+from .base import LimpydBaseTest, TEST_CONNECTION_SETTINGS
 
 
 class TestRedisModel(model.RedisModel):
@@ -576,7 +580,7 @@
             wagons = fields.StringField(default=10)
 
         # Check that db is empty
-        self.assertEqual(len(self.connection.keys()), 0)
+        self.assertEqual(len(list(self.connection.keys())), 0)
         # Create two models, to check also that the other is not
         # impacted by the delete of some field
         train1 = Train(name="Occitan", kind="Corail")
@@ -590,20 +594,20 @@
         # - the 2 kind fields
         # - the kind index for "Corail"
         # - the 2 wagons fields
-        self.assertEqual(len(self.connection.keys()), 11)
+        self.assertEqual(len(list(self.connection.keys())), 11)
         # If we delete the name field, only 9 key must remain
         # the train1.name field and the name:"Occitan" index are deleted
         train1.name.delete()
-        self.assertEqual(len(self.connection.keys()), 9)
+        self.assertEqual(len(list(self.connection.keys())), 9)
         self.assertEqual(train1.name.get(), None)
         self.assertFalse(Train.exists(name="Occitan"))
         self.assertEqual(train1.wagons.get(), '10')
         self.assertEqual(train2.name.get(), 'Teoz')
-        self.assertEqual(len(self.connection.keys()), 9)
+        self.assertEqual(len(list(self.connection.keys())), 9)
         # Now if we delete the train1.kind, only one key is deleted
         # The kind:"Corail" is still used by train2
         train1.kind.delete()
-        self.assertEqual(len(self.connection.keys()), 8)
+        self.assertEqual(len(list(self.connection.keys())), 8)
         self.assertEqual(len(Train.collection(kind="Corail")), 1)
 
     def test_instancehashfield_keys_are_deleted(self):
@@ -615,7 +619,7 @@
             wagons = fields.InstanceHashField(default=10)
 
         # Check that db is empty
-        self.assertEqual(len(self.connection.keys()), 0)
+        self.assertEqual(len(list(self.connection.keys())), 0)
         # Create two models, to check also that the other is not
         # impacted by the delete of some field
         train1 = Train(name="Occitan", kind="Corail")
@@ -627,24 +631,24 @@
         # - 2 trains hash key
         # - the 2 names index (one by value)
         # - the kind index
-        self.assertEqual(len(self.connection.keys()), 7)
+        self.assertEqual(len(list(self.connection.keys())), 7)
         # The train1 hash must have three fields (name, kind and wagons)
         self.assertEqual(self.connection.hlen(train1.key), 3)
         # If we delete the train1 name, only 6 key must remain
         # (the name index for "Occitan" must be deleted)
         train1.name.delete()
-        self.assertEqual(len(self.connection.keys()), 6)
+        self.assertEqual(len(list(self.connection.keys())), 6)
         self.assertEqual(self.connection.hlen(train1.key), 2)
         self.assertEqual(train1.name.hget(), None)
         self.assertFalse(Train.exists(name="Occitan"))
         self.assertEqual(train1.wagons.hget(), '10')
         self.assertEqual(train2.name.hget(), 'Teoz')
-        self.assertEqual(len(self.connection.keys()), 6)
+        self.assertEqual(len(list(self.connection.keys())), 6)
         # Now if we delete the train1.kind, no key is deleted
         # Only the hash field must be deleted
         # The kind:"Corail" is still used by train2
         train1.kind.delete()
-        self.assertEqual(len(self.connection.keys()), 6)
+        self.assertEqual(len(list(self.connection.keys())), 6)
         self.assertEqual(self.connection.hlen(train1.key), 1)
         self.assertEqual(len(Train.collection(kind="Corail")), 1)
 
@@ -667,7 +671,7 @@
             wagons = fields.InstanceHashField(default=10)
 
         # Check that db is empty
-        self.assertEqual(len(self.connection.keys()), 0)
+        self.assertEqual(len(list(self.connection.keys())), 0)
         # Create two models, to check also that the other is not
         # impacted by the delete of some field
         train1 = Train(name="Occitan", kind="Corail")
@@ -680,10 +684,10 @@
         # - the 2 names index (one by value)
         # - the two kind keys
         # - the kind:Corail index
-        self.assertEqual(len(self.connection.keys()), 9)
+        self.assertEqual(len(list(self.connection.keys())), 9)
         # If we delete the train1, only 6 key must remain
         train1.delete()
-        self.assertEqual(len(self.connection.keys()), 6)
+        self.assertEqual(len(list(self.connection.keys())), 6)
         with self.assertRaises(DoesNotExist):
             train1.name.hget()
         with self.assertRaises(DoesNotExist):
@@ -691,7 +695,7 @@
         self.assertFalse(Train.exists(name="Occitan"))
         self.assertTrue(Train.exists(name="Teoz"))
         self.assertEqual(train2.name.hget(), 'Teoz')
-        self.assertEqual(len(self.connection.keys()), 6)
+        self.assertEqual(len(list(self.connection.keys())), 6)
         self.assertEqual(len(Train.collection(kind="Corail")), 1)
         self.assertEqual(len(Train.collection()), 1)

tests/utils.py:

@@ -1,8 +1,10 @@
+from __future__ import absolute_import
+from __future__ import unicode_literals
 # -*- coding:Utf-8 -*-
 
 import unittest
 
-from base import LimpydBaseTest
+from .base import LimpydBaseTest
 from limpyd.utils import make_key, unique_key

tests/contrib/collection.py:

@@ -1,3 +1,4 @@
+from __future__ import unicode_literals
 # -*- coding:utf-8 -*-
 
 import unittest
@@ -656,7 +657,7 @@
         self.assertEqual(len(stored_collection), 2)
 
     def test_len_should_work_with_values(self):
-        collection = Group.collection(active=1).sort(by='name').values()
+        collection = list(Group.collection(active=1).sort(by='name').values())
         self.assertEqual(len(collection), 2)

tests/contrib/database.py:

@@ -1,3 +1,4 @@
+from __future__ import unicode_literals
 # -*- coding:utf-8 -*-
 
 from ..base import LimpydBaseTest, TEST_CONNECTION_SETTINGS

tests/contrib/related.py:

@@ -1,3 +1,5 @@
+from __future__ import unicode_literals
+from future.builtins import object
 # -*- coding:utf-8 -*-
 
 from limpyd import model, fields
@@ -107,7 +109,7 @@
         self.assertEqual(set(ms_php.related_name_persontest_set()), set([ybon._pk]))
 
     def test_related_name_should_follow_namespace(self):
-        class SubTest():
+        class SubTest(object):
             """
             A class to create another model with the name "Group"
             """

tests/fields/init.py:

@@ -1,3 +1,4 @@
+from __future__ import unicode_literals
 from .sortedset import *
 from .pk import *
 from .instancehash import *

tests/fields/hash.py:

@@ -1,3 +1,4 @@
+from __future__ import unicode_literals
 # -*- coding:utf-8 -*-
 
 from limpyd import fields

tests/fields/instancehash.py:

@@ -1,3 +1,4 @@
+from __future__ import unicode_literals
 from redis.exceptions import DataError, ResponseError
 
 from limpyd import fields

tests/fields/list.py:

@@ -1,3 +1,4 @@
+from __future__ import unicode_literals
 # -*- coding:utf-8 -*-
 
 from redis.exceptions import RedisError
@@ -164,10 +165,10 @@
         self.assertEqual(obj.field.proxy_get(), ['foo'])
         self.assertCollection([obj._pk], field="foo")
 
-        nb_key_before = len(self.connection.keys())
+        nb_key_before = len(list(self.connection.keys()))
         obj.field.linsert('before', 'foo', 'thevalue')
         # It should only have add one key for the new index
-        nb_key_after = len(self.connection.keys())
+        nb_key_after = len(list(self.connection.keys()))
         self.assertEqual(nb_key_after, nb_key_before + 1)
 
         self.assertEqual(obj.field.proxy_get(), ['thevalue', 'foo'])

tests/fields/pk.py:

@@ -1,3 +1,4 @@
+from __future__ import unicode_literals
 from limpyd import fields
 from limpyd.exceptions import UniquenessError, ImplementationError

tests/fields/set.py:

@@ -1,3 +1,4 @@
+from __future__ import unicode_literals
 from limpyd import fields
 from limpyd.exceptions import UniquenessError

tests/fields/sortedset.py:

@@ -1,3 +1,4 @@
+from __future__ import unicode_literals
 from limpyd import fields
 from limpyd.exceptions import UniquenessError

tests/fields/string.py:

@@ -1,3 +1,4 @@
+from __future__ import unicode_literals
 # -*- coding:utf-8 -*-
 
 from limpyd import fields

So thanks to the future package, it is able to do a lot of things that would be too repetitive to do it manually.

But I can see that it's not perfect.

But it was just a preview...

Futurize limpyd

So as it seems that future and its futurize command will do a lot of work for us, I'll use them.

The future documentation offers us to do the work in two steps:

Stage 1 is for "safe" changes that modernize the code
Stage 2 is to complete the process

Stage 1

So, let's start - really this time, not a preview (see the -w parameter) - with Stage 1:

futurize -1 -w -n --unicode-literals *.py limpyd/ tests/

Nothing fancy was done here, only to things:

add the following line at the top of each file:

from __future__ import unicode_literals

add the following line at the top of all files with relatives imports, and change these imports to be relative:

from __future__ import absolute_import

Example of diff for a file with relative import:

--- tests/collection.py (original)
+++ tests/collection.py (refactored)
@@ -1,3 +1,5 @@
+from __future__ import absolute_import
+from __future__ import unicode_literals
 # -*- coding:utf-8 -*-
 
 import unittest
@@ -6,8 +8,8 @@
 from limpyd import fields
 from limpyd.collection import CollectionManager
 from limpyd.exceptions import *
-from base import LimpydBaseTest, TEST_CONNECTION_SETTINGS
-from model import Boat, Bike, TestRedisModel
+from .base import LimpydBaseTest, TEST_CONNECTION_SETTINGS
+from .model import Boat, Bike, TestRedisModel
 
 
 class CollectionBaseTest(LimpydBaseTest):

As I can see in this example, there is a problem. The following line must be the first line in a python file:

# -*- coding:utf-8 -*-

As there is too much files with this modification, I wrote a little helper to correct this automatically (note that the problem is not present with file starting with a shebang line):

for f in `find ./ -name '*.py'`
do
    if [[ `grep -LE '^#!' $f` ]]
    then
        coding_line=`grep -hnEi 'coding: ?utf-8' $f | cut -f1 -d:`
        if [[ 1 -lt "$coding_line" ]]
        then
            echo '# -*- coding:utf-8 -*-' > $f.new
            head -n $(( coding_line - 1 )) $f >> $f.new
            tail -n +$(( coding_line + 1)) $f >> $f.new
            mv $f.new $f
        fi
    fi
done

I can now run tests (in python 2.7 terminal) to see if all is still ok:

$ python run_tests.py
[...]
Ran 300 tests in 3.910s

OK

Stage 2

Now it's time to apply stage 2, which should give us a version that work in python3

futurize -2 -w -n --unicode-literals *.py limpyd/ tests/

At this point of time, all updates that were shown by the preview mode earlier are applied.

Now check if it still works on python 2.7:

$ python run_tests.py 
Traceback (most recent call last):
  File "run_tests.py", line 8, in <module>
    from tests import base, model, utils, collection, lock, fields
  File "/home/twidi/dev/redis-limpyd/tests/model.py", line 13, in <module>
    from limpyd import model
  File "/home/twidi/dev/redis-limpyd/limpyd/model.py", line 8, in <module>
    from limpyd.fields import *
  File "/home/twidi/dev/redis-limpyd/limpyd/fields.py", line 86, in <module>
    class RedisProxyCommand(with_metaclass(MetaRedisProxy, object)):
  File "/home/twidi/dev/envs/limpyd-2.7/local/lib/python2.7/site-packages/future/utils/__init__.py", line 133, in __new__
    return meta(name, bases, d)
  File "/home/twidi/dev/redis-limpyd/limpyd/fields.py", line 61, in __new__
    it = super(MetaRedisProxy, mcs).__new__(mcs, name, base, dct)
TypeError: Error when calling the metaclass bases
    metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

Hmmm not good, future, not good.

But continue by checking on python 3.4:

$ python run_tests.py 
[...]
======================================================================
FAIL: test_undefined_related_name_should_be_auto_created (tests.contrib.related.RelatedNameTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/twidi/dev/redis-limpyd/tests/contrib/related.py", line 88, in test_undefined_related_name_should_be_auto_created
    self.assertEqual(set(core_devs.person_set()), set([ybon._pk]))
AssertionError: Items in the first set but not the second:
b'ybon'
Items in the second set but not the first:
'ybon'

----------------------------------------------------------------------
Ran 300 tests in 0.997s

FAILED (failures=28, errors=235)

Hum, not good at all...

I still made a commit with all these changes, named "Apply stage2 of futurize". It will be squashed later with commits that solve the different problems, to only have commit in a valid working state.

The metaclass problem

Let's start by resolving the metaclass problem under python 2.7.

I found that passing object as a base of base metaclasses was that caused the error.

So by applying this diff, this problem was easily solved:

limpyd/contrib/related.py:

@@ -199,7 +199,7 @@ class RelatedFieldMetaclass(fields.MetaRedisProxy):
         return it
 
 
-class RelatedFieldMixin(with_metaclass(RelatedFieldMetaclass, object)):
+class RelatedFieldMixin(with_metaclass(RelatedFieldMetaclass)):
     """
     Base mixin for all fields holding related instances.
     This mixin provides:

limpyd/fields.py:

@@ -83,7 +83,7 @@ class MetaRedisProxy(type):
         return it
 
 
-class RedisProxyCommand(with_metaclass(MetaRedisProxy, object)):
+class RedisProxyCommand(with_metaclass(MetaRedisProxy)):
 
     @classmethod
     def _make_command_method(cls, command_name):

I create a commit "Correct with_metaclass calls" with these two updates.

Running tests with python 3.4 shows us that nothing has changed:

$ python run_tests.py 
[...]
FAILED (failures=28, errors=235)

But with python 2.7, the metaclass problem is not here anymore but...

$ python run_tests.py 
Traceback (most recent call last):
  File "run_tests.py", line 9, in <module>
    from tests.contrib import database, related, collection as contrib_collection
  File "/home/twidi/dev/redis-limpyd/tests/contrib/related.py", line 24, in <module>
    class Person(TestRedisModel):
  File "/home/twidi/dev/redis-limpyd/limpyd/model.py", line 92, in __new__
    field._attach_to_model(it)
  File "/home/twidi/dev/redis-limpyd/limpyd/contrib/related.py", line 252, in _attach_to_model
    self.related_to = self._get_related_model_name()
  File "/home/twidi/dev/redis-limpyd/limpyd/contrib/related.py", line 301, in _get_related_model_name
    raise ImplementationError("The `to` argument to a related field "
limpyd.exceptions.ImplementationError: The `to` argument to a related field must be a RelatedModel as a class or as a string (with or without namespace). Or simply 'self'.

WAT?

I launched the debugger in the _get_related_model_name method. And what i saw made me cry:

> /home/twidi/dev/redis-limpyd/limpyd/contrib/related.py(293)_get_related_model_name()
    292 
--> 293         elif isinstance(self.related_to, str):
    294             if self.related_to == 'self':

ipdb> p self.related_to
u'Group'
ipdb> p isinstance(self.related_to, str)
False
ipdb> p self.related_to.__class__
<type 'unicode'>
ipdb>

So it's the unicode problem. It's always the unicode problem :(

The string/unicode/bytes/whatever problem

So the isinstance to check if a thing is a str fails.

So lets check on the future website. I quickly found http://python-future.org/compatible_idioms.html#basestring which told me to add

from future.builtins import str

on top of the files doing this kind of isinstance call. Here are the files with this line added, and some with other simple adjustments:

limpyd/contrib/related.py: Simply added the line

limpyd/contrib/collection.py: Simply added the line

test/utils.py: Added the line and reworked a test

@@ -1,6 +1,7 @@
 # -*- coding:utf-8 -*-
 from __future__ import absolute_import
 from __future__ import unicode_literals
+from future.builtins import str
 
 import unittest
 
@@ -27,7 +28,7 @@ class UniqueKeyTest(LimpydBaseTest):
 
     def test_generated_key_must_be_a_string(self):
         key = unique_key(self.connection)
-        self.assertEqual(type(key), str)
+        self.assertTrue(isinstance(key, str))
 
     def test_generated_key_must_be_unique(self):
         key1 = unique_key(self.connection)

limpyd/utils: Added a call to str to convert a "string"

@@ -19,7 +19,16 @@ def unique_key(connection):
     keyspace.
     """
     while 1:
-        key = uuid.uuid4().hex
+        key = str(uuid.uuid4().hex)
         if not connection.exists(key):
             break
     return key

There was an other important problem with a string decoding:

    def from_python(self, value):
        """
        Coerce a value before using it in Redis.
        """
        if value and isinstance(value, str):
            value = value.decode('utf-8')
        return value

The import str line being imported, it doesn't work anymore.

I came up with this solution. Just import bytes with str:

from future.builtins import str, bytes

And replace the str in isinstance by bytes!

I discovered that I had to do this in another method so i made a function. Here is the diff about this part:

limpyd/utils.py:

@@ -1,5 +1,5 @@
 from __future__ import unicode_literals
-from future.builtins import str
+from future.builtins import str, bytes
 import uuid
 
 from functools import wraps
@@ -23,3 +23,12 @@ def unique_key(connection):
         if not connection.exists(key):
             break
     return key
+
+
+def normalize(value):
+    """
+    Simple method to always have the same kind of value
+    """
+    if value and isinstance(value, bytes):
+        value = value.decode('utf-8')
+    return value

limpyd/fields.py:

@@ -11,7 +11,7 @@ from future.utils import with_metaclass
 from redis.exceptions import RedisError
 from redis.client import Lock
 
-from limpyd.utils import make_key
+from limpyd.utils import make_key, normalize
 from limpyd.exceptions import *
 
 log = getLogger(__name__)
@@ -469,9 +469,7 @@ class RedisField(RedisProxyCommand):
         """
         Coerce a value before using it in Redis.
         """
-        if value and isinstance(value, str):
-            value = value.decode('utf-8')
-        return value
+        return normalize(value)
 
     def _reset(self, command, *args, **kwargs):
         """
@@ -520,7 +518,7 @@ class SingleValueField(RedisField):
         """
         if self.indexable:
             current = self.proxy_get()
-            if current != value:
+            if normalize(current) != normalize(value):
                 if current is not None:
                     self.deindex(current)
                 if value is not None:

And it was all I found by simply playing with str and bytes from the future.builtin modules.

So I made a commit "Use str and bytes from future.builtins".

Then checking the tests..

For python 3.4:

$ python run_tests.py 
[...]
FAILED (failures=158, errors=20)

There's a progress! But we'll see later.

And for python 2.7:

$ python run_tests.py 
[...]
======================================================================
ERROR: test_len_should_work_with_values (tests.contrib.collection.LenTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/twidi/dev/redis-limpyd/tests/contrib/collection.py", line 660, in test_len_should_work_with_values
    collection = list(Group.collection(active=1).sort(by='name').values())
  File "/home/twidi/dev/redis-limpyd/limpyd/collection.py", line 49, in __iter__
    return self._collection.__iter__()
  File "/home/twidi/dev/redis-limpyd/limpyd/contrib/collection.py", line 111, in _collection
    return super(ExtendedCollectionManager, self)._collection
  File "/home/twidi/dev/redis-limpyd/limpyd/collection.py", line 162, in _collection
    collection = self._final_redis_call(final_set, sort_options)
  File "/home/twidi/dev/redis-limpyd/limpyd/contrib/collection.py", line 251, in _final_redis_call
    final_set, sort_options)
  File "/home/twidi/dev/redis-limpyd/limpyd/collection.py", line 183, in _final_redis_call
    return conn.sort(final_set, **sort_options)
  File "/home/twidi/dev/envs/limpyd-2.7/local/lib/python2.7/site-packages/redis/client.py", line 1178, in sort
    return self.execute_command('SORT', *pieces, **options)
  File "/home/twidi/dev/envs/limpyd-2.7/local/lib/python2.7/site-packages/redis/client.py", line 461, in execute_command
    return self.parse_response(connection, command_name, **options)
  File "/home/twidi/dev/envs/limpyd-2.7/local/lib/python2.7/site-packages/redis/client.py", line 471, in parse_response
    response = connection.read_response()
  File "/home/twidi/dev/envs/limpyd-2.7/local/lib/python2.7/site-packages/redis/connection.py", line 344, in read_response
    raise response
ResponseError: One or more scores can't be converted into double

----------------------------------------------------------------------
Ran 300 tests in 4.084s

FAILED (errors=1)

ONLY ONE MORE FAILING TEST!

The unwanted list

So let's look at this problem. The failing code is this simple test, in tests/contrib/collections.py

    def test_len_should_work_with_values(self):
        collection = list(Group.collection(active=1).sort(by='name').values())
        self.assertEqual(len(collection), 2)

It seems strange, why a list to get it's length? I know that there is no need to cast to a list to get the length of a limpyd collection.

But... if.... YES! It was added by the "Stage 2" pass, thinking that it was a call the the values method of a dictionary!

@@ -656,9 +656,9 @@ class LenTest(BaseTest):
         stored_collection = collection.store().sort(by='name')
         self.assertEqual(len(stored_collection), 2)
 
     def test_len_should_work_with_values(self):
-        collection = Group.collection(active=1).sort(by='name').values()
+        collection = list(Group.collection(active=1).sort(by='name').values())
         self.assertEqual(len(collection), 2)
 
 
 class Boat(BaseBoat):

So let's remove this cast to list and rerun the test on the python 2.7 version:

$ python run_tests.py -v 0
----------------------------------------------------------------------
Ran 300 tests in 3.858s

OK

HURRAY! We're done for python 2.7!

Commit this change ("Remove erroneous list() added by futurize") and know, work on the python 3.4 version.

One setting to rule them all

By looking at the failing tests for the python 3.4 versions, I could easily find a pattern, having all messages being like:

AssertionError: b'bar' != 'bar'

AssertionError: Items in the first set but not the second:
b'1'
Items in the second set but not the first:
'1'

And so on...

So there is a global problem. All "string" comparisons failed.

I frequently heard about "compat" modules in libraries that needed to support python 2 and 3.

So I decided to look at the one for redis-py, the library used by redis-limpyd to talk to redis.

But what I found wasn't why I expected: All strings were encapsulated in a call to a b() function, defined in a different way for python 2 and 3.

Repass on all the tests was not what I wanted. So i kept looking on the redis-py source code.

And then I found an argument to the constructor of the main redis-py object, what they call the Client: a decode_responses argument, default to False.

What if....

I was really curious so i tried:

@@ -48,7 +48,7 @@ class RedisDatabase(object):
             settings = self.connection_settings
         connection_key = ':'.join([str(settings[k]) for k in sorted(settings)])
         if connection_key not in self._connections:
-            self._connections[connection_key] = redis.StrictRedis(**settings)
+            self._connections[connection_key] = redis.StrictRedis(decode_responses=True, **settings)
         return self._connections[connection_key]
 
     def reset(self, **connection_settings):

And I launched the tests, again:

$ python run_tests.py -v 0
----------------------------------------------------------------------
Ran 300 tests in 3.974s

OK

WAT? REA...LLY?

I made a double, and even a triple check. But that was real.

And checking in python 2.7 gave the same result.

I found all the issues, the porting is over. Now, redis-limpyd supports python 2.7 and python 3.4.

I just have to squash all the commits following the Stage 2 call to futurize to only have working commits in the final code base.

Conclusion

I didn't know what to expect. It was my first time, and... it seemed easy at the beginning until I have to fight against string/unicode/bytes/you name it...

I lost a lot of time trying a lot of things, without even knowing what i did. I read a lot of documentation, example on other python 2&3 compatible projects, and I finally succeeded.

Know the Pull-request is waiting for Yohon Boniface to accept it...

At a last effort I checked if tests passed with python 2.6 and 3.3, and it was ok, so we also support these two versions (I reverted the "remove 2.6 support" commit)

And finally, I made a pass over the changes in iterators and adapted them to keep performances for python 2.x and python 3x (thanks from future.utils import iteritems, iterkeys)

twidi/porting-redis-limpyd-to-python3.md

A journey to python 3: porting redis-limpyd

Introduction

Preparing the environments

To be 2.6 or not to be 2.6

What needs to be changed?

Futurize limpyd

Stage 1

Stage 2

The metaclass problem

The string/unicode/bytes/whatever problem

The unwanted list

One setting to rule them all

Conclusion