ccritchfield · December 5, 2019 04:14
diff --git a/_process_timer.readme b/_process_timer.readme
 --------------------------------------------
 Python Code Timer
 --------------------------------------------

 Quick-n-dirty Python code timer. Add one line of code
 before and one after the code you want to time: the
 first call starts a named timer, the second stops it
 and prints the elapsed time. You can have multiple
 timers going at once to test speed / bottlenecks of
 various parts of code.

 Purpose ...

 Sometimes you just want a quick-n-dirty code timer.
 Python has timeit (included a .py that demonstrates it),
 but it's kind of a pain to wrap around stand-alone
 code. It's easier to use from the command line; from
 an IDE you typically end up passing your code in as a
 string.
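
 For what it's worth, timeit.timeit() also accepts a callable
 in place of a string, which can take some of the pain out of
 using it from an IDE. A small sketch: build_squares is a
 made-up workload for this example and runs = 100 is an
 arbitrary choice.

```python
import timeit


def build_squares():
    """Hypothetical workload: build a list of 1,000 squares."""
    return [n * n for n in range(1000)]


runs = 100  # arbitrary; timeit's default is 1,000,000 runs

# stmt can be a zero-argument callable instead of a code string
total_seconds = timeit.timeit(build_squares, number=runs)
average_seconds = total_seconds / runs

print("total   =", total_seconds, "seconds for", runs, "runs")
print("average =", average_seconds, "seconds per run")
```

 timeit reports the total for all runs, so the average has to
 be worked out by dividing by the run count.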

 I made a VBA code timer years ago that I could
 just kick off timers anywhere I pleased, and call
 again to end them. So, I recreated that same thing
 in Python to time my Python code, especially as big
 data sets start to bog down data sci routines.

 I initially created this in "intro to python" class
 in college to help me trim down code times, test
 code to see which ways were more efficient, and to
 hand to the rest of the class I was in to help them
 get into the habit of timing code to make it more
 elegant. (I was the unofficial TA of my python class.
 We had undergrad and grad all in the same class, and
 prof was teaching it in "sink or swim" fashion. I
 would send out code examples to the class to help them
 understand what the prof was doing. A code timer
 was a natural evolution of that.)

 I later revised it in Data Sci class (which was
 Python-heavy) to help me see how long massive data
 runs (big data, unstructured data, etc) took to
 process in order to optimize processing.
diff --git a/process_timer.py b/process_timer.py
 ########################################
 """
 Process Timer

 As you do more complex programming, you might
 find yourself in need of a process timer...
 something that lets you time your overall
 code run, or snippets of it.

 This is where an ad-hoc process timer comes in.

 ---------------------------------------

 Main Process Timer functions are:

 procTimer(name: string) -> (returns nothing, but prints/logs results if ending a timer)
    Send it a timer name as a string.
    If it can't find the timer name in the timer dictionary, it creates
        the timer and snapshots the current time as its start time.
    If it can find the timer name, then it removes the timer from
        the dictionary and prints the elapsed time (stop time - start time),
        also appending it to the log file if logging is on.

 procLog(activate: bool, filename: string) -> (returns nothing)
    Use this function to activate text file logging of timers.
    The log file is opened with a relative filename, so it ends up
    in the current working directory of the code being run
    (eg: if your working directory is c:\\temp\\ then your log
    file will end up in c:\\temp\\ too), unless you pass a full path.
            
    activate = boolean True/False ...
        True = activate and log timers to log file
        False = stop logging timers
    filename = filename to use for appending timer string messages to.
        default = "timer_log.txt", so if you don't provide a value,
        it will use "timer_log.txt".

 procInfo() -> (returns nothing, but prints summary of proc timer stuff)
    Call it to get a summary of whether logging is on/off,
    what the log file name currently is, and a list of timers
    still active in the process timer dictionary.
    
    This is mainly useful if you're running code from the
    console one line at a time, b/c the variables there
    stay in memory between lines of input. So, you can
    run procInfo() to get a quick check of timers that
    are still active.

 ---------------------------------------

    If you want to import process_timer.py stuff from any
    code you're doing in the IDE without having to worry
    about copy/paste'ing process_timer.py file to the
    same folder as your code file, then you can toss
    process_timer.py into ...
    

    C:\\Users\\(username)\\Anaconda3\\Lib

    (have to use double \\, otherwise Python treats the
    single backslash as an escape character, since this
    docstring is a regular string literal, not a comment)

    That's the Windows directory that houses the standard
    library modules of an Anaconda install, eg: random.py.
    
    (Not sure what directory to use for Linux or Mac).
    
    One thing to keep in mind if you do this and turn on
    timer logging: the log file doesn't follow process_timer.py.
    It's opened with a relative filename, so it lands in the
    current working directory of the code being run. (I could
    have imported os.py and used that to pin down a log directory,
    but logging is an extracurricular feature and I didn't want
    to open up the os.py can of worms. This code file is already
    hitting scope-creep levels of bloat as it is.)
    
    I may expand on this later if I think of more
    stuff to add to it, or optimize it or whatever.
    But, for now, I feel it's good enough.
    
    - Craig Critchfield

 """
 ########################################
 # CODE LIBRARY IMPORTS
 ########################################

 # datetime.datetime.now() function
 # returns the current time down to microseconds
 from datetime import datetime

 ########################################
 # GLOBAL VARIABLES
 ########################################

 proc_timers = {}                # dictionary stores timer names & start times
 log_timers  = False             # flag if we're logging timers to txt log or not (default = no)
 log_file    = "timer_log.txt"   # default text file name to log to


 ########################################
 # FUNCTIONS - TIMER
 ########################################

 # single function to call to start/stop timers
 def procTimer( name ):
    if name in proc_timers:
        endTimer(name)
    else:
        addTimer(name)

 # add new timer to dict
 # with snapshot of current time
 def addTimer( name ):
    proc_timers[name] = datetime.now()

 # pop timer from dict,
 # printing results
 def endTimer( name ):
    stop  = datetime.now()
    start = proc_timers.pop(name)
    procTime = stop - start     # timedelta; str() formats it as h:mm:ss.ffffff
    msg = '{:^20}'.format(name) # center name within 20 char padding
    logtime = stop.strftime("%Y-%m-%d %H:%M:%S") # reuse the stop snapshot as the log timestamp
    msg = logtime + " ...  " + "timer ... " + msg + " ... " + str(procTime) + " (h:mm:ss.ffffff)"
    print(msg)

    if log_timers:
        logTimer( msg )


 ########################################
 # FUNCTIONS - LOG FILE
 ########################################

 # allows user to activate / deactivate log file
 # and change the filename that's used
 # Note that we have these vars setup as globals
 # so we use the "global" keyword to tell the
 # function to refer to the global vars, and not
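 # simply kick off local variables with the same names.
 # Also note that the defaults below are evaluated once, when
 # the function is defined (False and "timer_log.txt"), so
 # calling procLog(False) also resets the filename to the default.
 def procLog( activate = log_timers, filename = log_file):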
    global log_timers
    global log_file
    log_timers  = activate
    log_file    = filename

 # append timer to output log file
 def logTimer( msg ):
    # "with" closes the file for us, even if the write fails
    with open(log_file, "a") as file:  # create / open file to append
        file.write(msg + "\n")         # add line break after input

 ########################################
 # FUNCTIONS - TIMER INFO
 ########################################
    
 def procInfo():
    border = "-" * 50
    print(border)
    print( "logging timers = " + str(log_timers))
    print( "log file name  = " + log_file)
    print(border)

    print( "active timers...")

    for name in proc_timers:
        print("\t* " + name)
        
    print(border)
diff --git a/process_timer_test.py b/process_timer_test.py
 ###############################################
 # Code below demonstrates how to use the process_timer.py code lib...
 ###############################################

 #--------------------------------------------
 # importing procTimer() functions...
 # we're importing specific ones
 # in the way below, so we can call them
 # like "procTimer()" instead of 
 # like "process_timer.procTimer()"
 # Did them like this instead of
 # .. from process_timer import *
 # so you could comment out log or
 # info if you don't want to use them
 #--------------------------------------------

 # import our main function.. the process timer handler
 from process_timer import procTimer

 # import function to turn logging on / off
 from process_timer import procLog

 # import function to get summary of timers & logging
 from process_timer import procInfo


 #--------------------------------------------
 # activating log file to keep
 # record of timers and times
 # (so we can look them over
 # after we make code changes to
 # see if we really made improvements or not),
 # and using a different filename
 # than the default "timer_log.txt"
 #--------------------------------------------

 procLog(True, "my_timer_log.txt")


 #--------------------------------------------
 # this is just demonstrating that timers don't
 # carry over between runs. This script ends
 # by kicking off the "blahblah" timer and never
 # calling it again. But, when the code is done
 # running, all variables clear out. So, if you
 # run this code 2+ times, procInfo() here still
 # shows no timers, because we're kicking off a
 # fresh timer dict object as this code starts.
 #--------------------------------------------

 procInfo()


 #--------------------------------------------
 # kicking off a timer
 # to encapsulate several things
 procTimer("overall_process")

 # ... and another to just
 # time a single code snippet
 procTimer("test1000")

 # code snippet
 for i in range(1000):
    print(str(i))

 # call the timer again
 # to return time results
 procTimer("test1000")

 #--------------------------------------------
 # kicking off another timer
 procTimer("test10000")

 # another code snippet
 for i in range(10000):
    print(str(i))

 # return how long it took
 procTimer("test10000")

 #--------------------------------------------
 # we're done tracking this overall
 # chunk of processing, so
 # fetch the timer
 procTimer("overall_process")


 #-----------------------------------------
 # turn off the timer log
 #--------------------------------------------

 procLog(False)

 #--------------------------------------------
 # kick off another timer..
 # we can re-use a timer name from above,
 # b/c it was already removed from the timer dict
 procTimer("test1000")

 # another code snippet
 for i in range(1000):
    print(str(i))

 # return how long it took
 # but this time it's just displaying
 # in console output only since we're
 # not logging it
 procTimer("test1000")




 #-----------------------------------------
 # testing procInfo
 #-----------------------------------------

 # kick off a timer
 procTimer("blahblah")

 # get summary
 procInfo()

 # done ...
 # python clears out the variables once the script
 # finishes running, so "blahblah" won't exist if
 # we run the code again.
 # so, timers are only useful for tracking timings
 # within the same run they're started and stopped in.
diff --git a/process_timer_timeit.py b/process_timer_timeit.py
 ########################################
 """
 Demonstrating use of Python's built-in
 timeit library & function to time code snippets.

 -----------------------------------------

 timeit.timeit
 (
    stmt    = code to run (default = 'pass' (ie: run nothing))
              can be passed in as a string or as a callable

    setup   = code to pre-run before timing stmt (default = 'pass' (ie: run nothing))
              eg: an import (like import math) that the
              code stmt needs in order to run properly
    
    timer   = timer to use (default = default timer).. just leave as-is
    
    number  = number of times to run code stmt (default = 1000000)
    
    globals = namespace to run the code in
              passing "globals()" gives the timed code access to
              all the global variables in your current module,
              which is convenient.
 )

 -----------------------------------------
            
    timeit returns the elapsed time (in seconds, as a float) of ALL runs,
    not an avg of the runs.

    EG: if you do number = 1, then it will be elapsed time of 1 run.
        if you do number = 100, then it will be elapsed time of 100 runs.

        So, to get avg time, you can divide timeit's elapsed time by number
        of runs. This would require you tracking number of runs as
        a separate var to use for avg calculation afterwards (as shown below).

 -----------------------------------------

    My personal opinion is that timeit() is over-engineered (BY
    software engineers, FOR software engineers.. not user-friendly
    to new programmers).

    Sometimes you just want a simple timer to plug-n-play anywhere
    in your code to start timing and stop anywhere when you want.
    
    timeit(), while powerful, expects your code as a string (or wrapped
    up in a callable), has you manually pass in setup and globals, and
    if you don't remember to change the number it will default to
    running the code a million times (which can leave your console
    unresponsive and you thinking the code locked-up.)

    You can spend more time dicking around with your code to
    get it to run in the timeit() function than it would take
    to just code your own ad-hoc timer. I had to google up
    how to use timeit() from various sources just to figure
    it out myself!
    
    I just feel it's more complicated than it needs to be for what
    it's doing. The point of python is to be simple
    and easy, but then software engineers get involved and 
    over-complicate certain things that should have remained
    simple and easy.
    
    That's why I prefer my own procTimer().
    
    But, I wanted to demonstrate the built-in timeit() in case
    others want to try it out.
    
    - Craig
        
 """
 ########################################

 #------------------------------------
 # Imports
 #------------------------------------
 import timeit


 #------------------------------------
 # global constants
 #------------------------------------
 LIMIT = 10000


 #------------------------------------
 # global variables
 #------------------------------------
 number_of_runs = 1000
 border = "-" * 50
 my_list = []
 i = 0


 #------------------------------------
 # populate test list
 #------------------------------------
 while i < LIMIT:
    my_list.append(i)
    i += 1


 #------------------------------------
 # test looping list with len in loop
 #------------------------------------

 test_code = '''
 i = 0
 while i < len(my_list):
    i += 1
 '''

 elapsed_time = timeit.timeit(stmt = test_code, number = number_of_runs, globals = globals())
 average_time = elapsed_time / number_of_runs

 print ( border )
 print ( "while i < len(list)...")
 print ( "elapsed_time = " + str(elapsed_time) )
 print ( "average_time = " + str(average_time) )


 #------------------------------------
 # test looping list with len as constant
 #------------------------------------

 test_code = '''
 i = 0
 length = len(my_list)
 while i < length:
    i += 1
 '''

 elapsed_time = timeit.timeit(stmt = test_code, number = number_of_runs, globals = globals())
 average_time = elapsed_time / number_of_runs

 print ( border )
 print ( "while i < length...")
 print ( "elapsed_time = " + str(elapsed_time) )
 print ( "average_time = " + str(average_time) )