@thomaswilley
Last active May 29, 2024 14:15
Get Apple Health data as Pandas DataFrame
# Get Apple Health data as Pandas DataFrame
# ===
# pre-reqs: python3, pandas (lxml was listed originally, but the code below only uses the standard library's xml.etree.ElementTree)
# to get started:
# export and mail yourself your data following the steps within the Health app on iPhone
# download and unzip the exported zip file; find the path to export.xml and set path_to_exportxml below
import datetime
import xml.etree.ElementTree

import pandas as pd

path_to_exportxml = "<path to apple health's export.xml>"


def iter_records(healthdata):
    """Yield one dict per <Record>, merged with the root element's attributes."""
    healthdata_attr = healthdata.attrib
    for rec in healthdata.iterfind('.//Record'):
        # start from a copy of the root attributes so they appear on every row
        rec_dict = healthdata_attr.copy()
        for k, v in rec.attrib.items():
            if 'date' in k.lower():
                # parse date attributes into timezone-aware datetime objects
                rec_dict[k] = datetime.datetime.strptime(v, '%Y-%m-%d %H:%M:%S %z')
            else:
                rec_dict[k] = v
        yield rec_dict


e = xml.etree.ElementTree.parse(path_to_exportxml).getroot()
df = pd.DataFrame(list(iter_records(e)))
df
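
A possible next step, sketched under assumptions: every non-date attribute in the resulting DataFrame is still a string, so numeric work needs an explicit conversion. The column names `type`, `value` and `startDate`, and the type name `HKQuantityTypeIdentifierStepCount`, are assumed here to match the attributes in your export; the helper `daily_totals` and the sample rows are hypothetical and made up for illustration.

```python
import datetime

import pandas as pd


def daily_totals(df, record_type="HKQuantityTypeIdentifierStepCount"):
    """Sum the numeric `value` column per calendar day for one record type."""
    subset = df[df["type"] == record_type].copy()
    # attribute values come out of the XML as strings, so convert explicitly;
    # anything that is not a number becomes NaN
    subset["value"] = pd.to_numeric(subset["value"], errors="coerce")
    # startDate holds datetime objects; reduce each one to its calendar date
    subset["day"] = subset["startDate"].map(lambda d: d.date())
    return subset.groupby("day")["value"].sum()


# tiny stand-in for the DataFrame built by the script (made-up values)
tz = datetime.timezone.utc
sample = pd.DataFrame([
    {"type": "HKQuantityTypeIdentifierStepCount", "value": "120",
     "startDate": datetime.datetime(2021, 9, 1, 8, 0, tzinfo=tz)},
    {"type": "HKQuantityTypeIdentifierStepCount", "value": "200",
     "startDate": datetime.datetime(2021, 9, 1, 18, 30, tzinfo=tz)},
    {"type": "HKQuantityTypeIdentifierHeartRate", "value": "62",
     "startDate": datetime.datetime(2021, 9, 1, 9, 0, tzinfo=tz)},
    {"type": "HKQuantityTypeIdentifierStepCount", "value": "50",
     "startDate": datetime.datetime(2021, 9, 2, 7, 15, tzinfo=tz)},
])

steps_per_day = daily_totals(sample)
print(steps_per_day)
```

With the real data, `daily_totals(df)` would be called on the DataFrame produced by the script instead of on `sample`.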
@BhargavaSwamy

I am learning pandas and this is a small project to use my skills to solve problems in real life.

I am trying to stick to a consistent routine, e.g. sleeping and waking up at roughly the same time every day, or completing around 8000 steps every day. I am writing a pandas script that will help me visually detect abnormal variation in my routine and ignore normal day-to-day variation. The script will plot control charts (https://bit.ly/3zAuIG0) for sleep time, wake-up time and other metrics from the Apple Health data.
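
One way the control-chart idea above could be sketched, assuming an individuals (XmR) chart is meant: the center line is the mean, and the limits sit 2.66 times the average moving range above and below it. The linked page may describe a different chart type, and the function names and step counts below are hypothetical, made up for illustration.

```python
import pandas as pd


def xmr_limits(values):
    """Return (lower, center, upper) limits for an individuals (XmR) chart."""
    values = pd.Series(values, dtype="float64")
    center = values.mean()
    # moving range: absolute difference between consecutive observations
    avg_moving_range = values.diff().abs().mean()
    # 2.66 is the standard XmR constant (3 / 1.128)
    spread = 2.66 * avg_moving_range
    return center - spread, center, center + spread


def out_of_control(values):
    """Return the observations that fall outside the control limits."""
    values = pd.Series(values, dtype="float64")
    lower, _, upper = xmr_limits(values)
    return values[(values < lower) | (values > upper)]


# made-up daily step counts; the last day is far below the usual routine
daily_steps = [8000, 8200, 7900, 8100, 8000, 7900,
               8100, 8000, 8200, 7900, 8100, 3000]

lower, center, upper = xmr_limits(daily_steps)
flagged = out_of_control(daily_steps)
print(lower, center, upper)
print(flagged)
```

Days inside the limits count as normal variation; only the flagged days would need a closer look. The same limits can be drawn as horizontal lines on a plot of the daily values.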

@aliciaboyce

aliciaboyce commented Sep 18, 2021

Hi Thomas,

Thank you for posting this. I am very new to programming, and this is my first time working with an XML file. I was able to get your code to work, but some of the data that I want is not included in the data frame. If you look at the XML code below, "totalDistance" is included, but its value is "0". It appears the actual distance is in the element MetadataEntry key="HKIndoorBikeDistance" value="8656 m". This MetadataEntry is not output by the code you have provided.

I have searched online and still have not found a way to bring this into a column of the final DataFrame.
Can you help me with this?

Thank you in advance
[Attached image: AppleHealth]

@thomaswilley
Author

Hi Alicia, sure. Let me see if I can help. Can you send me an email at [email protected] so we can follow up offline?
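
For readers with the same question, here is a minimal sketch of one way to pull MetadataEntry values into columns. It is not the author's answer. It assumes the entries are direct children of a `Workout` element, each with `key` and `value` attributes as in the question above; the helper `iter_with_metadata` and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# made-up sample shaped like the element described in the question
SAMPLE_XML = """<HealthData locale="en_US">
  <Workout workoutActivityType="HKWorkoutActivityTypeCycling" totalDistance="0">
    <MetadataEntry key="HKIndoorBikeDistance" value="8656 m"/>
  </Workout>
  <Workout workoutActivityType="HKWorkoutActivityTypeWalking" totalDistance="1.2"/>
</HealthData>"""


def iter_with_metadata(root, tag="Workout"):
    """Yield one dict per element, with each MetadataEntry added as a column."""
    for elem in root.iterfind(f".//{tag}"):
        row = dict(elem.attrib)
        # each direct MetadataEntry child becomes a column named after its key
        for entry in elem.iterfind("MetadataEntry"):
            row[entry.get("key")] = entry.get("value")
        yield row


root = ET.fromstring(SAMPLE_XML)
workouts = pd.DataFrame(list(iter_with_metadata(root)))
print(workouts)
```

Elements without a given MetadataEntry get NaN in that column. The values keep their unit suffix (for example "8656 m"), so they would still need to be split before any numeric use.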
