For CSV to dataclass...
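I originally had this logic to check whether a row of the CSV contained any blank values:

    n_cols = len(rows[0])
    for row in rows:
        # Rebuild the row without its blank fields and compare lengths
        if len([x for x in row if x]) != n_cols:
            continue  # skip rows that contain a blank
        pass  # placeholder: process the complete row here

But then I realized that the list comp has to iterate over every field, even if the row has a blank in the first field.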
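So I thought, "would a plain for loop be faster?", using a membership test that can stop at the first blank it finds:

    for row in rows:
        # `in` stops scanning as soon as it hits an empty string
        if "" in row:
            continue  # skip rows that contain a blank
        pass  # placeholder: process the complete row here

The for loop easily wins over the list comp:
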
| Type | Blank placement | Time (s) for 30M rows |
|---|---|---|
| for_loop | BEG | 0.32 |
| for_loop | MID | 1.04 |
| for_loop | END | 1.79 |
| list_comp | BEG | 6.62 |
| list_comp | MID | 6.63 |
| list_comp | END | 6.65 |
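
For anyone who wants to try a comparison like this themselves, here is a rough sketch of how it could be set up with `timeit`. It is not the script that produced the numbers above: the row width, the row count, and the helper names (`make_rows`, `count_complete_list_comp`, `count_complete_membership`) are all assumptions made for illustration, and the row count is kept far below 30M so it finishes quickly.

```python
import timeit

N_COLS = 5        # assumed row width (not stated in the answer)
N_ROWS = 100_000  # much smaller than the 30M rows timed above


def make_rows(blank_at):
    """Build N_ROWS rows, each with a single blank field at index blank_at."""
    row = ["x"] * N_COLS
    row[blank_at] = ""
    return [row] * N_ROWS


def count_complete_list_comp(rows):
    """Count rows without blanks by filtering every field of every row."""
    n_cols = len(rows[0])
    return sum(1 for row in rows if len([x for x in row if x]) == n_cols)


def count_complete_membership(rows):
    """Count rows without blanks using a test that stops at the first blank."""
    return sum(1 for row in rows if "" not in row)


if __name__ == "__main__":
    placements = (("BEG", 0), ("MID", N_COLS // 2), ("END", N_COLS - 1))
    for label, idx in placements:
        rows = make_rows(idx)
        for fn in (count_complete_membership, count_complete_list_comp):
            secs = timeit.timeit(lambda: fn(rows), number=5)
            print(f"{fn.__name__:28} {label}  {secs:.3f}s")
```

The absolute times will differ from the table, but the membership test should get slower as the blank moves toward the end of the row, while the list comp should stay roughly flat.
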

The original solution also called `.strip()` on every field, since I wasn't sure whether OP's input CSV would have a row like `a, ,b,c, ,e` (with multiple spaces between a and b). Then I remembered that `csv.reader` has the `skipinitialspace=` param which, when set to `True`, discards the spaces immediately following each delimiter; this reduces fields that are all spaces to `""`. That obviates the need to `.strip()` on the backend, saving a costly method call per field in the tight loop.
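
Here is a minimal sketch of how the two pieces fit together. The sample data and the `complete_rows` helper are made up for illustration; only `csv.reader` and its `skipinitialspace` parameter come from the standard library.

```python
import csv
import io

# Hypothetical input: the second field of the first row is spaces only
SAMPLE = "a,   ,b,c\nd, e,f,g\n"


def complete_rows(lines):
    """Yield only the rows that have no blank fields."""
    # skipinitialspace=True drops spaces right after each delimiter,
    # so a spaces-only field arrives as "" without calling .strip()
    reader = csv.reader(lines, skipinitialspace=True)
    for row in reader:
        if "" in row:
            continue  # skip rows that contain a blank
        yield row


if __name__ == "__main__":
    for row in complete_rows(io.StringIO(SAMPLE)):
        print(row)
```

Note that `skipinitialspace` only deals with spaces at the start of a field. If the input might have trailing whitespace (say, `e ,f`), that would still need separate handling.
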