@jamesls
Last active July 30, 2024 05:59
Making SQLite slower in Python by using threads:
import os
import time
import sqlite3
import threading

conn = None

def create_table():
    global conn
    try:
        os.remove("foo.db")
    except OSError:
        pass
    conn = sqlite3.connect("foo.db", check_same_thread=False)
    c = conn.cursor()
    c.execute("""CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT)""")

def writer_thread():
    writer = conn.cursor()
    for i in range(100000):
        writer.execute("INSERT INTO test (name) VALUES (?)", (str(i),))
    print "Writer done"

def reader_thread():
    reader = conn.cursor()
    count = 0
    for i in range(10):
        for row in reader.execute("SELECT * FROM test"):
            count += 1
    print "Reader done, num_rows seen:", count

# Single threaded.
print "Single threaded, write 100000, read all records 10 times"
start = time.time()
create_table()
writer_thread()
for i in range(10):
    reader_thread()
end = time.time()
print "Total single threaded: %.4f" % (end - start)

# Used for sanity check later.
first_result = list(conn.execute("SELECT * FROM test"))

# Threaded version.
print "\nMulti threaded, write 100000, read all records 10 times, all in separate threads."
start = time.time()
create_table()
write = threading.Thread(target=writer_thread)
readers = [threading.Thread(target=reader_thread) for i in range(10)]
write.start()
for r in readers:
    r.start()
write.join()
for r in readers:
    r.join()
end = time.time()
print "Total multi threaded: %.4f" % (end - start)

# Double check we aren't getting any different results
# in the final result between single threaded and multithreaded
# versions.
second_result = list(conn.execute("SELECT * FROM test"))
assert first_result == second_result
# OUTPUT when you run this:
# Single threaded, write 100000, read all records 10 times
# Writer done
# Reader done, num_rows seen: 1000000
# Reader done, num_rows seen: 1000000
# Reader done, num_rows seen: 1000000
# Reader done, num_rows seen: 1000000
# Reader done, num_rows seen: 1000000
# Reader done, num_rows seen: 1000000
# Reader done, num_rows seen: 1000000
# Reader done, num_rows seen: 1000000
# Reader done, num_rows seen: 1000000
# Reader done, num_rows seen: 1000000
# Total single threaded: 10.3086
#
# Multi threaded, write 100000, read all records 10 times, all in separate threads.
# Writer done
# Reader done, num_rows seen: 218389
# Reader done, num_rows seen: 335396
# Reader done, num_rows seen: 347060
# Reader done, num_rows seen: 395266
# Reader done, num_rows seen: 422447
# Reader done, num_rows seen: 409391
# Reader done, num_rows seen: 439284
# Reader done, num_rows seen: 487602
# Reader done, num_rows seen: 506180
# Reader done, num_rows seen: 501524
# Total multi threaded: 64.9558
@khanakia

@jamesls Can you explain why it behaves like this?

The single-threaded version is fast, but the multithreaded version becomes much slower.

I believe the issue is because of this:

The Python sqlite3 module has a threadsafety level of 1, which means that although you can't share database connections between threads, multiple threads can use the module simultaneously. So, you could have each thread create its own connection to the database.
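A minimal Python 3 sketch of that suggestion (not from the gist): each thread opens its own connection to the same database file, so `check_same_thread` can stay at its default and threads don't contend for one shared connection object. The filename `bar.db`, the row counts, and the helper names are all made up for illustration; the writer is joined before the readers start to keep the example deterministic and avoid "database is locked" errors.

```python
import os
import sqlite3
import threading

DB = "bar.db"  # hypothetical filename, distinct from the gist's foo.db

def setup():
    # Start from a fresh database file.
    try:
        os.remove(DB)
    except OSError:
        pass
    conn = sqlite3.connect(DB)
    conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT)")
    conn.commit()
    conn.close()

def writer():
    # This thread's own connection; no sharing across threads.
    conn = sqlite3.connect(DB)
    for i in range(1000):
        conn.execute("INSERT INTO test (name) VALUES (?)", (str(i),))
    conn.commit()
    conn.close()

def reader(results, idx):
    # Each reader also gets its own connection.
    conn = sqlite3.connect(DB)
    results[idx] = sum(1 for _ in conn.execute("SELECT * FROM test"))
    conn.close()

setup()
w = threading.Thread(target=writer)
w.start()
w.join()  # finish writing first so every reader sees the same row count

results = [0] * 4
threads = [threading.Thread(target=reader, args=(results, i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # each reader counted all 1000 rows
```

Note this avoids the shared-cursor contention but not SQLite's own locking: concurrent writers against the same file can still raise `sqlite3.OperationalError: database is locked` unless you serialize writes or enable WAL mode.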
