When I run this:
def fill_parents(row):
    print (row.parent_id,
           type(row.parent_id),
           row.parent_id is pd.np.nan,
           row.parent_id == pd.np.nan,
           row.parent_id is None,
           row.parent_id == None,
           pd.isnull(row.parent_id)
           )
    if row.parent_id is pd.np.nan:
        # never reaches this
        row.parent_id = lookup_table[row.code[:2]]
    return row
On this:
code                99773
level        municipality
name             Cumaribo
parent_id             NaN
Name: 1217, dtype: object
This returns:
(nan, <type 'numpy.float64'>, False, False, False, False, True)
What!! I'm guessing I somehow got a different nan object (pandas internal null vs. numpy null? something else?), so the "is" comparison fails. The printed type points the same way: the value is a numpy.float64, while pd.np.nan itself is a plain Python float. When I take the row out and replace the NaN like this:
x = parent_id_table.iloc[1217].to_dict()
x["parent_id"] = pd.np.nan
fill_parents(pd.Series(x))
Now it prints:
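(nan, <type 'float'>, True, False, False, False, True)
Lesson: watch out for null comparisons! "is" only matches when the value happens to be the very same object as pd.np.nan, and "==" is False in both runs because NaN never compares equal to anything, itself included. pd.isnull() is your friend.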
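For what it's worth, here is a minimal sketch of how the function might look with pd.isnull() doing the check. It is written for Python 3 with a current pandas, where pd.np no longer exists, so numpy is imported directly. The lookup_table contents and the "99000" parent id are made up for illustration, and the row is rebuilt by hand rather than taken from the real parent_id_table.

```python
import numpy as np
import pandas as pd

# Hypothetical lookup table: first two characters of the code -> parent id.
# The real table is not shown in the post, so this entry is invented.
lookup_table = {"99": "99000"}


def fill_parents(row):
    # pd.isnull() tests the value rather than object identity, so it
    # catches NaN whether it is a Python float or a numpy.float64.
    if pd.isnull(row["parent_id"]):
        row = row.copy()  # leave the caller's Series untouched
        row["parent_id"] = lookup_table[row["code"][:2]]
    return row


# A hand-built row mirroring the one in the post, with a numpy.float64 NaN.
numpy_nan = np.float64("nan")
row = pd.Series({
    "code": "99773",
    "level": "municipality",
    "name": "Cumaribo",
    "parent_id": numpy_nan,
})

# Identity and equality both fail for this NaN; only the null check works.
is_same_object = numpy_nan is np.nan
is_equal = bool(numpy_nan == np.nan)
is_null = bool(pd.isnull(numpy_nan))

filled = fill_parents(row)
```

With the null check in place the branch is taken for either flavour of NaN, so there is no need to care which one pandas hands back.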