# Get Apple Health data as Pandas DataFrame
# ===
# pre-reqs: python3, pandas (XML parsing uses the standard library's xml.etree.ElementTree)
# to get started:
#   export and mail yourself your data following steps within the Health app on iPhone
#   download and unzip contents of exported zip file; find path to export.xml and set path_to_exportxml below
import pandas as pd
import xml.etree.ElementTree
import datetime

path_to_exportxml = "<path to apple health's export.xml>"


def iter_records(healthdata):
    # Attributes on the root <HealthData> element are copied into every row
    healthdata_attr = healthdata.attrib
    for rec in healthdata.iterfind('.//Record'):
        rec_dict = healthdata_attr.copy()
        for k, v in rec.attrib.items():
            if 'date' in k.lower():
                # Date attributes look like "2015-09-01 08:00:00 -0400"
                rec_dict[k] = datetime.datetime.strptime(v, '%Y-%m-%d %H:%M:%S %z')
            else:
                rec_dict[k] = v
        yield rec_dict


e = xml.etree.ElementTree.parse(path_to_exportxml).getroot()
df = pd.DataFrame(list(iter_records(e)))
df
You’re welcome, Bhargava, and thanks for your note! I’d be curious to hear what type of analysis you’re doing; I’m always interested in finding new ways to use health data.
I am learning pandas, and this is a small project to apply my skills to real-life problems.
I am trying to stick to a consistent routine, e.g. sleeping and waking up at roughly the same time every day, or completing around 8000 steps every day. I am writing a pandas script that will help me visually detect abnormal variation in my routine and ignore normal day-to-day variation. The script will plot control charts (https://bit.ly/3zAuIG0) for sleep time, wake-up time, and other metrics from the Apple Health data.
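For anyone curious what the control-chart idea might look like in pandas, here is a minimal sketch. It assumes an individuals (XmR) chart, where the limits are the mean plus or minus 2.66 times the average moving range; the `daily_steps` values and the `xmr_limits` helper are made up for illustration and are not part of the gist. Plotting is left out; only the limits and the flagged days are computed.

```python
import pandas as pd

# Hypothetical daily step totals. With real data these could come from
# summing the step-count records per day in the gist's DataFrame.
daily_steps = pd.Series(
    [8100, 7900, 8300, 8000, 7800, 8200, 8100,
     7900, 8000, 8200, 13000, 8100, 7900, 8000],
    index=pd.date_range("2021-07-01", periods=14, freq="D"),
    name="steps",
)


def xmr_limits(series):
    """Return (lower limit, center line, upper limit) for an individuals chart."""
    center = series.mean()
    # Average absolute difference between consecutive days (the moving range)
    mr_bar = series.diff().abs().mean()
    # 2.66 is the standard XmR constant (3 / 1.128)
    spread = 2.66 * mr_bar
    return center - spread, center, center + spread


lcl, center, ucl = xmr_limits(daily_steps)

# Days outside the limits are candidates for "abnormal" variation
outliers = daily_steps[(daily_steps < lcl) | (daily_steps > ucl)]
print(outliers)
```

With this sample only the 13000-step day falls outside the limits; the ordinary day-to-day wobble stays inside them.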
Hi Thomas,
Thank you for posting this. I am very new to programming, and this is my first time working with an XML file. I was able to get your code to work, but some of the data I want is not included in the data frame. If you look at the XML code below, "totalDistance" is included, but its value is "0". The actual distance appears to be in a MetadataEntry element with key="HKIndoorBikeDistance" and value="8656 m". This MetadataEntry is not output by the code you have provided.
I have researched online, and still have not found a way to bring this into a column for the final dataframe.
Can you help me with this?
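One possible approach, sketched here under some assumptions: the `MetadataEntry` elements are children of the element being read, so their `key`/`value` pairs can be merged into the same row dict as the parent's attributes. The sample XML, the `iter_with_metadata` helper, its `tag` parameter, and the `distance_m` column are all hypothetical; the sample assumes the entry sits under a `Workout` element, since that is where `totalDistance` appears.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical, heavily trimmed sample; a real export.xml has many more attributes.
SAMPLE_XML = """
<HealthData locale="en_US">
  <Workout workoutActivityType="HKWorkoutActivityTypeCycling" totalDistance="0">
    <MetadataEntry key="HKIndoorBikeDistance" value="8656 m"/>
  </Workout>
</HealthData>
""".strip()


def iter_with_metadata(healthdata, tag="Workout"):
    """Yield one dict per <tag> element, including its MetadataEntry children."""
    for elem in healthdata.iterfind(f".//{tag}"):
        row = dict(elem.attrib)
        # Each direct MetadataEntry child becomes its own column, named by its key
        for meta in elem.iterfind("MetadataEntry"):
            row[meta.get("key")] = meta.get("value")
        yield row


root = ET.fromstring(SAMPLE_XML)
workouts = pd.DataFrame(list(iter_with_metadata(root)))

# Optional: split "8656 m" and keep the numeric part in a separate column
workouts["distance_m"] = pd.to_numeric(
    workouts["HKIndoorBikeDistance"].str.split().str[0]
)
print(workouts[["totalDistance", "HKIndoorBikeDistance", "distance_m"]])
```

Passing `tag="Record"` instead should apply the same idea to `Record` elements that carry metadata. Rows without a given key simply get a missing value in that column.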
Hi Alicia, sure. Let me see if I can help. Can you send me an email at [email protected] so we can follow up offline?
Hi Thomas,
I used this script this week. It helped me analyse my Apple Health data in pandas. Thank you for sharing this.
regards,
Bhargava Swamy