@makmanalp
Last active August 29, 2015 14:27
Weird null comparison issue in pandas / python / numpy

When I run this:

    def fill_parents(row):
        print (row.parent_id,
               type(row.parent_id),
               row.parent_id is pd.np.nan,
               row.parent_id == pd.np.nan,
               row.parent_id is None,
               row.parent_id == None,
               pd.isnull(row.parent_id)
               )
        if row.parent_id is pd.np.nan:
            # never reaches this
            row.parent_id = lookup_table[row.code[:2]]
        return row

On this:

    code                99773
    level        municipality
    name             Cumaribo
    parent_id             NaN
    Name: 1217, dtype: object

This returns:

    (nan, <type 'numpy.float64'>, False, False, False, False, True)

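What!! The value in the row turns out to be a different NaN object from pd.np.nan: the printed type is numpy.float64, while pd.np.nan is a plain Python float, so the "is" comparison fails. The "==" comparison fails as well, because NaN never compares equal to anything, not even itself. When I take the row out and replace the NaN like so: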

    x = parent_id_table.iloc[1217].to_dict()
    x["parent_id"] = pd.np.nan
    fill_parents(pd.Series(x))

Now it prints:

    (nan, <type 'float'>, True, False, False, False, True)
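
The two runs differ only in which NaN object sits in the row. Here is a minimal sketch that reproduces the difference without the original table. It assumes a current NumPy and pandas install and imports `np.nan` directly, since recent pandas releases no longer expose `pd.np`. The names `boxed_nan`, `plain_nan` and `checks` are made up for illustration.

```python
import numpy as np
import pandas as pd

# A NumPy scalar NaN, like the one found in the original row.
boxed_nan = np.float64("nan")
# np.nan itself is a plain Python float.
plain_nan = np.nan

checks = {
    "type of boxed": type(boxed_nan).__name__,
    "type of plain": type(plain_nan).__name__,
    # Identity only holds when it is literally the same object.
    "boxed is np.nan": boxed_nan is np.nan,
    "plain is np.nan": plain_nan is np.nan,
    # Equality with NaN is always False, even against itself.
    "boxed == np.nan": bool(boxed_nan == np.nan),
    "plain == plain": plain_nan == plain_nan,
    # pd.isnull() does not care which NaN object it gets.
    "pd.isnull(boxed)": bool(pd.isnull(boxed_nan)),
    "pd.isnull(plain)": bool(pd.isnull(plain_nan)),
}

for label, value in checks.items():
    print(label, "->", value)
```

Only the `pd.isnull()` rows give the same answer for both objects, which matches the last column of the two printed tuples.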

Lesson: watch out for null comparisons, and don't test for nulls with "is" or "==". pd.isnull() is your friend: it was the only check that returned True in both runs.
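
For reference, a sketch of how `fill_parents` might look with the check swapped for `pd.isnull()`. This is an assumption about the intended behaviour, not the original code: the `lookup_table` contents (prefix `"99"` mapping to `31`) are invented, the table is passed in as an argument instead of being a global, and the row is rebuilt by hand from the values printed above.

```python
import numpy as np
import pandas as pd

# Hypothetical lookup from the first two characters of a code to a parent id.
lookup_table = {"99": 31}


def fill_parents(row, lookup_table):
    """Return a copy of row with a missing parent_id filled from lookup_table."""
    row = row.copy()  # leave the caller's row untouched
    # pd.isnull() is True for None, float NaN and numpy.float64 NaN alike.
    if pd.isnull(row["parent_id"]):
        row["parent_id"] = lookup_table[row["code"][:2]]
    return row


# The problem row, with a numpy.float64 NaN as in the first run.
missing = pd.Series(
    {
        "code": "99773",
        "level": "municipality",
        "name": "Cumaribo",
        "parent_id": np.float64("nan"),
    },
    name=1217,
)
# A row that already has a parent id and should be left alone.
present = missing.copy()
present["parent_id"] = 7

filled = fill_parents(missing, lookup_table)
unchanged = fill_parents(present, lookup_table)
print(filled["parent_id"], unchanged["parent_id"])
```

Applied to a whole frame, this would be used as `parent_id_table.apply(fill_parents, axis=1, lookup_table=lookup_table)`, assuming `parent_id_table` has the columns shown above.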
