devdattaT/ConvertGoogleHistory.py

devdattaT · 2024-06-26T05:00:18Z

@GitMae99 Keep your Records.json file in the same folder as this script, then run the following command:
python ConvertGoogleHistory.py Records.json out.csv
The script reads the first file (i.e. Records.json) and then generates the second file (out.csv).

davidyoder18 · 2024-06-26T16:08:40Z

@GitMae99 Thanks for making this script. Sorry, I'm new to running Python scripts, and I'm getting this error:

Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.

Install the latest PowerShell for new features and improvements! https://aka.ms/PSWindows

PS D:\OneDrive - Rubix LLC\Software Data\Google History> python ConvertGoogleHistory.py Records.json out.csv
Reading D:\OneDrive - Rubix LLC\Software Data\Google History\Records.json
Traceback (most recent call last):
  File "ConvertGoogleHistory.py", line 56, in <module>
    for r in reader:
  File "ConvertGoogleHistory.py", line 18, in make_reader
    for item in json_data['locations']:
TypeError: list indices must be integers, not str
PS D:\OneDrive - Rubix LLC\Software Data\Google History>

GitMae99 · 2024-06-26T16:38:01Z

@devdattaT Thanks for the response.
When I go to Google Takeout and export from there, I only get three files: encryptedBackups.txt, settings.json and tombstones.csv. So there is no Records.json that I can find.
When I run the script with the export I did from my Android phone (which does give me a .json data file), I get this error:
Reading C:\Temp\Records.json
Traceback (most recent call last):
  File "C:\Temp\ConvertGoogleHistory.py", line 56, in <module>
    for r in reader:
  File "C:\Temp\ConvertGoogleHistory.py", line 18, in make_reader
    for item in json_data['locations']:
                ~~~~~~~~~^^^^^^^^^^^^^
KeyError: 'locations'

Maybe it doesn't work with the file I exported from the Android phone through the settings?

pbsings · 2024-07-07T05:20:07Z

@GitMae99 @devdattaT I'm running into the same problem, and it looks like the location-history.json file from Google is in a different format. There is no "locations" array in the json file. It has a "semanticSegments" array at the top. I assume this is because of Google's change to the timeline. I probably only have access to the Semantic Location History information, and not the Raw Location History data.
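The errors reported so far point to three different top-level layouts: a dict with a "locations" array (what the original script expects), a dict with a "semanticSegments" array, and a bare list (which would explain the "list indices must be integers" TypeError). The following is only a sketch for telling them apart before choosing a parser; the function name `detect_format` and the returned labels are made up for illustration and are not part of the original script.

```python
import json


def detect_format(json_data):
    """Return a label (hypothetical names) for the layouts seen in this thread."""
    if isinstance(json_data, list):
        # Top level is a list, so json_data['locations'] raises TypeError
        return 'segment-list'
    if isinstance(json_data, dict):
        if 'locations' in json_data:
            # Layout expected by the original ConvertGoogleHistory.py
            return 'records'
        if 'semanticSegments' in json_data:
            # Layout described for location-history.json
            return 'semantic-segments'
    return 'unknown'


# Small inline samples instead of a real export file
print(detect_format(json.loads('{"locations": []}')))
print(detect_format(json.loads('{"semanticSegments": []}')))
print(detect_format(json.loads('[]')))
```

If the label comes back as 'unknown', printing the top-level keys of the file is probably the quickest way to see what the export actually contains.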

dinilj007 · 2024-07-24T22:50:03Z

I made a minor adjustment using ChatGPT (I'm not a coder at all, so please check it) and it worked!

Please replace the input and output paths with your own. The file it produced was Kepler-compatible for me! Maybe someone with more coding experience can adjust and improve it?

import json
import csv
from datetime import datetime
import os

def make_reader(in_json):
    # Open location history data
    with open(in_json, 'r') as file:
        json_data = json.load(file)
    
    for item in json_data:
        end_time = item.get('endTime')
        start_time = item.get('startTime')

        # strptime cannot parse a missing timestamp, so skip incomplete entries
        if not end_time or not start_time:
            continue

        if '.' in end_time:
            end_date = datetime.strptime(end_time, '%Y-%m-%dT%H:%M:%S.%f%z').date()
        else:
            end_date = datetime.strptime(end_time, '%Y-%m-%dT%H:%M:%S%z').date()

        # Drop the UTC offset whether it is written as '+hh:mm', '-hh:mm' or 'Z'
        end_tm = end_time.split('T')[1].split('+')[0].split('-')[0].split('Z')[0]

        if '.' in start_time:
            start_date = datetime.strptime(start_time, '%Y-%m-%dT%H:%M:%S.%f%z').date()
        else:
            start_date = datetime.strptime(start_time, '%Y-%m-%dT%H:%M:%S%z').date()

        # Same offset handling as for the end time
        start_tm = start_time.split('T')[1].split('+')[0].split('-')[0].split('Z')[0]

        # Extract location details
        location = 'Unknown Location'
        lat, lon = None, None
        
        if 'visit' in item:
            placeLocation = item['visit'].get('topCandidate', {}).get('placeLocation', 'Unknown')
            if 'geo:' in placeLocation:
                location = placeLocation.split('geo:')[1]
                lat, lon = location.split(',')
        
        elif 'activity' in item:
            placeLocation = item['activity'].get('start', 'Unknown')
            if 'geo:' in placeLocation:
                location = placeLocation.split('geo:')[1]
                lat, lon = location.split(',')

        yield [end_date, end_tm, start_date, start_tm, lat, lon]

def getFullPath(inPath):
    if not os.path.isabs(inPath):
        # we need to set up the absolute path
        script_path = os.path.abspath(__file__)
        path, file = os.path.split(script_path)
        inPath = os.path.join(path, inPath)
    return inPath

# Hard-coded file paths
in_file = 'path/to/your/input.json'  # Replace with your input JSON file path
out_file = 'path/to/your/output.csv'  # Replace with your desired output CSV file path

in_file = getFullPath(in_file)
out_file = getFullPath(out_file)

features = []
# add the Headers
features.append(['End Date', 'End Time', 'Start Date', 'Start Time', 'Latitude', 'Longitude'])
print("Reading {0}".format(in_file))

reader = make_reader(in_file)

for r in reader:
    features.append(r)

print('Read {0} Records'.format(len(features) - 1))

# write this data
with open(out_file, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(features)
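The script above pulls coordinates out of strings that start with "geo:" and leaves them as text. As a sketch only, the helper below does the same extraction but returns floats and tolerates values that are missing or not strings. The name `parse_geo` is hypothetical, and the assumed input shape ("geo:<lat>,<lon>") is taken from the script above, not from any Google documentation.

```python
def parse_geo(value):
    """Return (lat, lon) as floats from a 'geo:<lat>,<lon>' string, else (None, None)."""
    if not isinstance(value, str) or not value.startswith('geo:'):
        return None, None
    parts = value[len('geo:'):].split(',')
    if len(parts) < 2:
        return None, None
    try:
        return float(parts[0]), float(parts[1])
    except ValueError:
        # Something other than numbers followed the 'geo:' prefix
        return None, None


print(parse_geo('geo:12.5,-45.25'))
print(parse_geo('Unknown'))
```

Returning floats keeps the CSV columns numeric, which may matter for tools that infer column types.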

s0meguy1 · 2024-10-21T15:25:41Z

@dinilj007 - I do a decent amount of Python coding, but for small things like this I use GPT too. Yours didn't work for my timeline export, so I fed GPT a sample of my data, followed by your script and the error message, and it gave me a working script. It took less than 2 minutes, which is why I didn't figure it out myself haha:

import json
import csv
from datetime import datetime
import os

def make_reader(in_json):
    # Open location history data
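    # Read as UTF-8 so non-ASCII characters such as '°' decode correctly on every platform
    with open(in_json, 'r', encoding='utf-8') as file: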
        json_data = json.load(file)

    # Access the "semanticSegments" list
    segments = json_data.get("semanticSegments", [])

    for segment in segments:
        start_time = segment.get('startTime', 'Unknown')
        end_time = segment.get('endTime', 'Unknown')

        # Process the start and end times
        if start_time != 'Unknown':
            if '.' in start_time:
                start_date = datetime.strptime(start_time, '%Y-%m-%dT%H:%M:%S.%f%z').date()
            else:
                start_date = datetime.strptime(start_time, '%Y-%m-%dT%H:%M:%S%z').date()

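            # Drop the UTC offset whether it is written as '+hh:mm', '-hh:mm' or 'Z'
            start_tm = start_time.split('T')[1].split('+')[0].split('-')[0].split('Z')[0]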
        else:
            start_date = 'Unknown'
            start_tm = 'Unknown'

        if end_time != 'Unknown':
            if '.' in end_time:
                end_date = datetime.strptime(end_time, '%Y-%m-%dT%H:%M:%S.%f%z').date()
            else:
                end_date = datetime.strptime(end_time, '%Y-%m-%dT%H:%M:%S%z').date()

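            # Drop the UTC offset whether it is written as '+hh:mm', '-hh:mm' or 'Z'
            end_tm = end_time.split('T')[1].split('+')[0].split('-')[0].split('Z')[0]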
        else:
            end_date = 'Unknown'
            end_tm = 'Unknown'

        # Extract location details from "visit" or "timelinePath"
        lat, lon = None, None

        if 'visit' in segment:
            placeLocation = segment['visit'].get('topCandidate', {}).get('placeLocation', {}).get('latLng', None)
            if placeLocation:
                lat, lon = [float(coord.strip('°')) for coord in placeLocation.split(", ")]

        elif 'timelinePath' in segment:
            # Extract the first point from timelinePath if it exists
            if len(segment['timelinePath']) > 0:
                point = segment['timelinePath'][0].get('point', None)
                if point:
                    lat, lon = [float(coord.strip('°')) for coord in point.split(", ")]

        # Yield the extracted data
        yield [end_date, end_tm, start_date, start_tm, lat, lon]

def getFullPath(inPath):
    if not os.path.isabs(inPath):
        # we need to set up the absolute path
        script_path = os.path.abspath(__file__)
        path, file = os.path.split(script_path)
        inPath = os.path.join(path, inPath)
    return inPath

# Hard-coded file paths
in_file = 'input.json'  # Replace with your input JSON file path
out_file = 'output.csv'  # Replace with your desired output CSV file path

in_file = getFullPath(in_file)
out_file = getFullPath(out_file)

features = []
# add the Headers
features.append(['End Date', 'End Time', 'Start Date', 'Start Time', 'Latitude', 'Longitude'])
print("Reading {0}".format(in_file))

reader = make_reader(in_file)

for r in reader:
    features.append(r)

print('Read {0} Records'.format(len(features) - 1))

# write this data
with open(out_file, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(features)
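Both adapted scripts split each timestamp into a date and a time-of-day string. Below is a sketch of one helper that does this for positive offsets, negative offsets and a trailing "Z" alike. The name `split_timestamp` is hypothetical, and it assumes timestamps of the shape used in the scripts above (ISO 8601 with an optional fractional part and a UTC offset).

```python
from datetime import datetime


def split_timestamp(value):
    """Return (date, time_string) for a timestamp such as '2024-10-21T15:25:41.000-07:00'."""
    fmt = '%Y-%m-%dT%H:%M:%S.%f%z' if '.' in value else '%Y-%m-%dT%H:%M:%S%z'
    # Parsing validates the string; %z accepts '+hh:mm', '-hh:mm' and 'Z' (Python 3.7+)
    parsed = datetime.strptime(value, fmt)
    # Keep the time of day as written and cut off whichever offset marker follows it
    time_part = value.split('T')[1]
    for marker in ('+', '-', 'Z'):
        time_part = time_part.split(marker)[0]
    return parsed.date(), time_part


day, clock = split_timestamp('2024-10-21T15:25:41.000-07:00')
print(day, clock)
```

The time string is the local wall-clock time from the export; no conversion to UTC is attempted here.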

epk · 2024-12-30T09:17:54Z

I hacked this together https://gist.github.com/epk/a70dd9b7a2d5bf8e5d86ebdcaefb6b32

sitz · 2025-01-01T15:08:47Z

As of Dec 31, 2024, the JSON data format of the location timeline export on Android has changed and needs a change in the make_reader() function, which I added to my fork. Please feel free to pull it to sync if needed. Thanks for bootstrapping this script.

fab343 · 2025-02-06T19:11:03Z

@sitz I tried to run your code but I get a KeyError: 'semanticSegments'.

devdattaT/ConvertGoogleHistory.py

import json
import csv
import sys
from datetime import datetime
import os

def has_keys(dictionary, keys):
    return all(key in dictionary for key in keys)

def make_reader(in_json):

    # Open location history data
    json_data = json.loads(open(in_json).read())
    # Will read the following keys
    keys_to_check = ['timestamp', 'longitudeE7', 'latitudeE7', 'accuracy']

    # Get the easy fields
    for item in json_data['locations']:
        if has_keys(item, keys_to_check):
            timestamp = item['timestamp']
            if ('.' in timestamp):
                date = datetime.strptime(timestamp, '%Y-%m-%dT%H:%M:%S.%fZ').date()
            else:
                date = datetime.strptime(timestamp, '%Y-%m-%dT%H:%M:%SZ').date()
            tm = timestamp.split('T')[1].split('Z')[0]
            longitude = item['longitudeE7']/10000000.0
            latitude = item['latitudeE7']/10000000.0
            accuracy = item['accuracy']

            yield [date, tm, longitude, latitude, accuracy]


def getFullPath(inPath):
    if(not os.path.isabs(inPath)):
        # we need to set up the absolute path
        script_path = os.path.abspath(__file__)
        path, file = os.path.split(script_path)
        inPath = os.path.join(path, inPath)
    return inPath


# Read the Parameters
in_file = sys.argv[1]
out_file = sys.argv[2]

in_file = getFullPath(in_file)
out_file = getFullPath(out_file)

features = []
# add the Headers
features.append(['Date', 'Time', 'Longitude', 'Latitude', 'Accuracy'])
print("Reading {0}".format(in_file))

reader = make_reader(in_file)

for r in reader:
    features.append(r)

print('Read {0} Records'.format(len(features)-1))

# write this data
with open(out_file, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(features)
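To make the E7 fields concrete: the script divides latitudeE7 and longitudeE7 by 10,000,000 to get decimal degrees. The sketch below applies the same conversion to a single entry. The function name `convert_record` and the sample values are made up for illustration and do not come from a real export.

```python
from datetime import datetime


def convert_record(item):
    """Turn one 'locations' entry into [date, time, longitude, latitude, accuracy]."""
    timestamp = item['timestamp']
    fmt = '%Y-%m-%dT%H:%M:%S.%fZ' if '.' in timestamp else '%Y-%m-%dT%H:%M:%SZ'
    date = datetime.strptime(timestamp, fmt).date()
    tm = timestamp.split('T')[1].split('Z')[0]
    # E7 values are degrees multiplied by 10**7 and stored as integers
    longitude = item['longitudeE7'] / 10000000.0
    latitude = item['latitudeE7'] / 10000000.0
    return [date, tm, longitude, latitude, item['accuracy']]


# Invented sample entry with the four keys the script checks for
sample = {
    'timestamp': '2024-06-26T05:00:18.123Z',
    'longitudeE7': 123456789,
    'latitudeE7': -98765432,
    'accuracy': 20,
}
print(convert_record(sample))
```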
