
@sashadev-sky
Last active April 5, 2021 18:50
import csv
import uuid


def my_random_string(string_length=10):
    """Return a random uppercase hex string of length string_length (at most 32)."""
    random = str(uuid.uuid4())  # Convert the UUID to a Python string.
    random = random.upper()  # Make all characters uppercase.
    random = random.replace("-", "")  # Remove the UUID '-' separators.
    return random[0:string_length]  # Keep only the first string_length characters.


# print(my_random_string(6))  # For example, D9E50C

# A set drops repeated values, so keep drawing until 2,000,000 unique IDs exist.
ids = set()
while len(ids) < 2000000:
    ids.add(my_random_string(8))

# Write one ID per row; newline="" stops the csv module adding blank rows on Windows.
with open("hello.csv", "w", newline="") as f:
    cw = csv.writer(f)
    cw.writerows([r] for r in ids)
@sashadev-sky
Author

Find and print duplicate rows in Perl (each repeated row is printed once per extra occurrence): perl -ne 'print if $SEEN{$_}++' < input-file
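
For anyone who would rather stay in Python, here is a minimal sketch of the same check. The function name find_duplicate_rows and the sample rows are made up for illustration and are not part of the gist. It is meant to mirror the one-liner above: a row is reported every time it shows up after its first occurrence.

```python
def find_duplicate_rows(lines):
    """Yield each line that has already been seen earlier in the input."""
    seen = set()
    for line in lines:
        if line in seen:
            # Second or later occurrence: report it, like `print if $SEEN{$_}++`.
            yield line
        else:
            seen.add(line)


# Hypothetical sample rows, only to show the behaviour.
sample_rows = ["A1B2C3D4\n", "0F0F0F0F\n", "A1B2C3D4\n"]
for row in find_duplicate_rows(sample_rows):
    print(row, end="")
```

To check a real file, pass the open file object instead of the sample list, for example find_duplicate_rows(open("hello.csv")). Since the script above builds its IDs in a set, hello.csv should produce no output.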
