Last active
July 28, 2017 14:56
from tqdm import tqdm


def count(tally_dataset, tally_column, tally_count, feature_dataset, feature_column,
          comparator=False, comparator_value='', comparator_column=''):
    """
    Count occurrences of a key in one dataframe and write the counts into the matching rows of a second dataframe.

    Keyword arguments:
    tally_dataset -- dataframe in which the tally is recorded (modified in place)
    tally_column -- column of tally_dataset holding the key being tallied (e.g. ZIP code)
    tally_count -- column of tally_dataset in which the tally is recorded; it must already hold numeric values (e.g. 0)
    feature_dataset -- dataframe containing the rows whose occurrences are counted
    feature_column -- column of feature_dataset holding the key (e.g. each ZIP code in this column adds 1 to tally_count in the row whose tally_column matches)
    comparator -- if False, every row in feature_dataset is tallied; if True, only rows whose comparator_column equals comparator_value are tallied (default False)
    comparator_value -- value whose presence in comparator_column is tallied (default '')
    comparator_column -- column of feature_dataset searched for comparator_value (default '')
    """
    for index, row in tqdm(feature_dataset.iterrows(), total=len(feature_dataset)):
        # With a comparator, skip feature rows that do not hold comparator_value.
        # The comparator column is read from the feature row, as the docstring describes.
        if comparator and row[comparator_column] != comparator_value:
            continue
        for index_2, row_2 in tally_dataset.iterrows():
            if row[feature_column] == row_2[tally_column]:
                # Write through .loc: row_2 is a copy, so assigning to it would not update the dataframe.
                tally_dataset.loc[index_2, tally_count] += 1
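
The nested loops in `count` compare every feature row against every tally row, so the work grows with the product of the two dataframe lengths. As a rough sketch of an alternative, the same tally can be computed with pandas' `value_counts` and `map`. The name `count_vectorized`, the sample data, and the choice to return the dataframe are assumptions made for illustration and are not part of the original gist; the sketch also assumes `comparator_column` belongs to `feature_dataset`, as the docstring states.

```python
import pandas as pd


def count_vectorized(tally_dataset, tally_column, tally_count,
                     feature_dataset, feature_column,
                     comparator=False, comparator_value='', comparator_column=''):
    """Hypothetical vectorized variant of count(); adds tallies in place."""
    features = feature_dataset
    if comparator:
        # Keep only the feature rows that hold comparator_value.
        features = features[features[comparator_column] == comparator_value]
    # Number of occurrences of each key in feature_column.
    occurrences = features[feature_column].value_counts()
    # Look up each tally row's key; keys that never occur contribute 0.
    increments = tally_dataset[tally_column].map(occurrences).fillna(0).astype(int)
    tally_dataset[tally_count] = tally_dataset[tally_count] + increments
    return tally_dataset


# Example with made-up data (illustrative only).
tally = pd.DataFrame({'zip': ['10001', '10002', '10003'], 'n': [0, 0, 0]})
features = pd.DataFrame({
    'zip': ['10001', '10001', '10003', '99999'],
    'kind': ['noise', 'heat', 'noise', 'noise'],
})
count_vectorized(tally, 'zip', 'n', features, 'zip')
print(tally['n'].tolist())  # [2, 0, 1]
```

Keys that appear in `feature_dataset` but not in `tally_dataset` (here `'99999'`) are ignored, which matches the behavior of the looped version. No progress bar is needed because the work is done in a few whole-column operations.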