@MariusHerget
Forked from Allstreamer/snow.py
Last active September 25, 2024 20:35
An advanced script to summarize a Snowflake(Tor Project) Log file with the corresponding docker compose file
# This script uses the following dependency:
# pip install nums-from-string
# (datetime is part of the Python standard library and needs no installation)
#
# To Run this script type:
# python analyze_snowflake_logs.py <Log File Name>
#
# The default <Log File Name> is ./docker_snowflake.log
#
# Example:
# python analyze_snowflake_logs.py snow.log
#
# Written By Allstreamer_
# Licensed Under MIT
#
# Enhanced by MariusHerget
import nums_from_string
import sys
from datetime import datetime, timedelta
# Format of your timestamps in the beginning of the log
# e.g. "2022/01/01 16:50:30 <LOG ENTRY>" => "%Y/%m/%d %H:%M:%S"
timestamp_format = "%Y/%m/%d %H:%M:%S"
# Log file path from arguments (default: ./docker_snowflake.log)
logfile_path = sys.argv[1] if len(sys.argv) > 1 else "./docker_snowflake.log"
# Read in log file as lines
lines_all = []
with open(logfile_path, "r") as file:
    lines_all = file.readlines()
# Fallback for lines that do not start with a timestamp:
# return a placeholder date (1970/01/01) so that time-window filters drop them
def catchTimestampException(rowSubString, timestampFormat):
    try:
        return datetime.strptime(rowSubString, timestampFormat)
    except ValueError as e:
        # print(e)
        return datetime.strptime("1970/01/01 00:00:00", "%Y/%m/%d %H:%M:%S")
# Filter the log lines based on a time delta in hours
def filterLinesBasedOnTimeDelta(log_lines, hours):
    now = datetime.now()
    length_timestamp_format = len(datetime.strftime(now, timestamp_format))
    return filter(lambda row: now-timedelta(hours=hours) <= catchTimestampException(row[0:length_timestamp_format], timestamp_format) <= now, log_lines)
# Convert traffic information (in B, KB, MB, or GB) to B (Bytes) and add up to a sum
def get_byte_count(log_lines):
    byte_count = 0
    for row in log_lines:
        symbols = row.split(" ")
        if symbols[2] == "B":
            byte_count += int(symbols[1])
        elif symbols[2] == "KB":
            byte_count += int(symbols[1]) * 1024
        elif symbols[2] == "MB":
            byte_count += int(symbols[1]) * 1024 * 1024
        elif symbols[2] == "GB":
            byte_count += int(symbols[1]) * 1024 * 1024 * 1024
    return byte_count
# Filter important lines from the log
# Extract number of connections, uploaded traffic in GB and download traffic in GB
def getDataFromLines(lines):
    # Filter out important lines (Traffic information)
    lines = [row.strip() for row in lines]
    lines = filter(lambda row: "In the" in row, lines)
    lines = [row.split(",", 1)[1] for row in lines]
    # Drop all traffic log lines that report zero connections
    lines = list(filter(lambda row: nums_from_string.get_nums(row)[0] != 0, lines))
    # Extract number of connections as a sum
    connections = sum([nums_from_string.get_nums(row)[0] for row in lines])
    # Extract upload and download data
    lines = [row.split("Relayed")[1] for row in lines]
    upload = [row.split(",")[0].strip() for row in lines]
    # [:-1] removes the trailing character of the log line
    download = [row.split(",")[1].strip()[:-1] for row in lines]
    # Convert upload/download data to GB
    upload_gb = get_byte_count(upload) / 1024 / 1024 / 1024
    download_gb = get_byte_count(download) / 1024 / 1024 / 1024
    # Return information as a dictionary for better structure
    return {'connections': connections, 'upload_gb': upload_gb, 'download_gb': download_gb}
# Get the statistics for various time windows
# e.g. all time => getDataFromLines(lines_all)
# e.g. last 24h => getDataFromLines(filterLinesBasedOnTimeDelta(lines_all, 24))
# e.g. last Week => getDataFromLines(filterLinesBasedOnTimeDelta(lines_all, 24 * 7))
stats = {
    'All time': getDataFromLines(lines_all),
    'Last 24h': getDataFromLines(filterLinesBasedOnTimeDelta(lines_all, 24)),
    'Last Week': getDataFromLines(filterLinesBasedOnTimeDelta(lines_all, 24*7)),
}
# Get the width of the longest string per column for nicer printing (align lines)
formatting = {
    # max(stats.keys()) would pick the alphabetically last key, not the longest one
    'time': max(len(key) for key in stats),
    'connections': len(str(max(map(lambda x: stats[x]['connections'], stats)))),
    'upload': len(str(round(max(map(lambda x: stats[x]['upload_gb'], stats)), 4))),
    'download': len(str(round(max(map(lambda x: stats[x]['download_gb'], stats)), 4)))
}
# Print all the results
for time in stats:
    stat = stats[time]
    print(f"[{time:<{formatting['time']}}] " +
          f"Served {stat['connections']:>{formatting['connections']}} People with " +
          f"↑ {round(stat['upload_gb'],4):>{formatting['upload']}} GB, " +
          f"↓ {round(stat['download_gb'],4):>{formatting['download']}} GB")
version: "3.8"
# Docker Service Snowflake
# - Starts the latest snowflake proxy (tor) as a docker container
# - Uses the current directory as a log storage
# Log file name: "docker_snowflake.log"
#
# Start and run in background:
# docker-compose up -d snowflake-proxy
services:
  snowflake-proxy:
    network_mode: host
    image: thetorproject/snowflake-proxy:latest
    container_name: snowflake-proxy
    volumes:
      - .:/shared:rw
    restart: unless-stopped
    command: ["-verbose", "-log", "/shared/docker_snowflake.log"]
@MBurchard

MBurchard commented Nov 18, 2022

Hi, thank you for this script. Sadly, I'm not a Python developer and therefore not able to fix the issue.
It happens in line 38.

python analyze_snowflake_logs.py /var/log/snowflake.log
Traceback (most recent call last):
  File "/home/pi/snowflake/analyze_snowflake_logs.py", line 88, in <module>
    'Last 24h':  getDataFromLines(filterLinesBasedOnTimeDelta(lines_all, 24)),
  File "/home/pi/snowflake/analyze_snowflake_logs.py", line 60, in getDataFromLines
    lines = [row.strip() for row in lines]
  File "/home/pi/snowflake/analyze_snowflake_logs.py", line 60, in <listcomp>
    lines = [row.strip() for row in lines]
  File "/home/pi/snowflake/analyze_snowflake_logs.py", line 38, in <lambda>
    return filter(lambda row: now-timedelta(hours=hours) <= datetime.strptime(row[0:length_timestamp_format], timestamp_format) <= now, log_lines)
  File "/usr/lib/python3.9/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/usr/lib/python3.9/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'

The reason is these entries in the log file:

cat /var/log/snowflake.log | grep "sctp ERROR"
sctp ERROR: 2022/11/18 15:39:42 [0x400015ea80] stream 1 not found)
sctp ERROR: 2022/11/18 15:39:42 [0x400015ea80] stream 1 not found)
sctp ERROR: 2022/11/18 15:39:42 [0x400015ea80] stream 1 not found)
sctp ERROR: 2022/11/18 15:39:42 [0x400015ea80] stream 1 not found)
sctp ERROR: 2022/11/18 15:39:42 [0x400015ea80] stream 1 not found)
sctp ERROR: 2022/11/18 15:39:42 [0x400015ea80] stream 1 not found)
sctp ERROR: 2022/11/18 15:39:42 [0x400015ea80] stream 1 not found)

As one can see, these lines do not start with a timestamp.

Would you please fix this script?
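
One alternative to skipping such lines is to search for the timestamp anywhere in the line instead of assuming it sits at the start. The following is only a minimal sketch, assuming the script's default `%Y/%m/%d %H:%M:%S` format; `find_timestamp` and `TIMESTAMP_RE` are hypothetical names and not part of the gist.

```python
import re
from datetime import datetime

TIMESTAMP_FORMAT = "%Y/%m/%d %H:%M:%S"
# Matches a timestamp such as "2022/11/18 15:39:42" anywhere in a line.
TIMESTAMP_RE = re.compile(r"\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}")


def find_timestamp(line):
    """Return the first timestamp found in the line, or None if there is none."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(0), TIMESTAMP_FORMAT)
    except ValueError:
        # The digits matched the pattern but do not form a valid date.
        return None


line = "sctp ERROR: 2022/11/18 15:39:42 [0x400015ea80] stream 1 not found)"
print(find_timestamp(line))
```

With this approach, prefixed lines keep their real timestamp, while lines without any timestamp yield `None` and can be skipped explicitly.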

@MariusHerget
Author

@MBurchard I added a workaround that substitutes a placeholder date (1970/01/01) whenever a line does not start with a valid timestamp. The lines that produce this kind of error are filtered out later.

According to my quick testing, it should work for your case now (it only prints out the error).
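
The idea behind the workaround can be sketched as follows. This is a minimal, self-contained illustration rather than the gist's exact code: the names `parse_or_sentinel`, `within_last_hours`, and `SENTINEL` are made up for the example, the sample lines are invented, and `now` is fixed so the result is reproducible.

```python
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%Y/%m/%d %H:%M:%S"
# Placeholder for lines without a leading timestamp (hypothetical name).
SENTINEL = datetime(1970, 1, 1)


def parse_or_sentinel(prefix):
    """Parse a timestamp prefix, falling back to the sentinel on failure."""
    try:
        return datetime.strptime(prefix, TIMESTAMP_FORMAT)
    except ValueError:
        return SENTINEL


def within_last_hours(lines, hours, now):
    """Keep only lines whose leading timestamp lies in [now - hours, now]."""
    width = len(now.strftime(TIMESTAMP_FORMAT))
    start = now - timedelta(hours=hours)
    return [line for line in lines
            if start <= parse_or_sentinel(line[:width]) <= now]


# Sample lines invented for this illustration.
sample = [
    "2022/11/18 15:00:00 In the last 1h0m0s, there were 2 connections.",
    "sctp ERROR: 2022/11/18 15:39:42 [0x400015ea80] stream 1 not found)",
]
fixed_now = datetime(2022, 11, 18, 16, 0, 0)
recent = within_last_hours(sample, 24, fixed_now)
print(len(recent))
```

Because the sentinel lies far in the past, any line that fails to parse falls outside every time window and is dropped by the filter.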

@MBurchard

@MariusHerget Works like a charm, thank you for this extremely quick response 👍

python analyze_snowflake_logs.py /var/log/snowflake.log
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
time data 'sctp ERROR: 2022/11' does not match format '%Y/%m/%d %H:%M:%S'
[All time ] Served 26 People with ↑ 0.1473 GB, ↓ 0.0159 GB
[Last 24h ] Served 26 People with ↑ 0.1473 GB, ↓ 0.0159 GB
[Last Week] Served 26 People with ↑ 0.1473 GB, ↓ 0.0159 GB

@MrDrache333

I've created a docker-compose and Prometheus exporter based on your script. Please have a look. I've mentioned you as the author at the bottom of the README. https://github.com/MrDrache333/snowflake-prometheus-exporter

@MariusHerget
Author

> I've created a docker-compose and Prometheus exporter based on your script. Please have a look. I've mentioned you as the author at the bottom of the README. https://github.com/MrDrache333/snowflake-prometheus-exporter

Oh, very cool! I will check it out in the next few days. And thanks for the attribution <3

@Pietro395

snowflake-proxy | 2024/04/25 08:57:22 open /shared/docker_snowflake.log: permission denied

I am having the permission denied error in the log folder, any ideas?

@MariusHerget
Author

> snowflake-proxy | 2024/04/25 08:57:22 open /shared/docker_snowflake.log: permission denied
>
> I am having the permission denied error in the log folder, any ideas?

Hey, have you checked your Docker configuration and made sure to add the volume (see my docker compose file above, lines 15/16)? And have you checked whether the local folder you are referencing there is writable for Docker?
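
A quick way to test the second point from the host is to check whether the bind-mounted directory is writable. The sketch below rests on several assumptions: `is_writable_dir` is a hypothetical helper, it checks the user running the script (the container may run under a different user ID), and it probes the current directory because the compose file mounts `.` at `/shared`.

```python
import os
import tempfile


def is_writable_dir(path):
    """Return True if a file can actually be created inside the directory."""
    if not os.path.isdir(path):
        return False
    try:
        # Creating a temporary file tests real write access; the file is
        # removed automatically when the with-block ends.
        with tempfile.TemporaryFile(dir=path):
            pass
    except OSError:
        return False
    return True


print(is_writable_dir("."))
```

If this prints `False` for the directory that holds the compose file, the proxy will likely be unable to create `docker_snowflake.log` there either.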

@Pietro395

> > snowflake-proxy | 2024/04/25 08:57:22 open /shared/docker_snowflake.log: permission denied
> > I am having the permission denied error in the log folder, any ideas?
>
> Hey, have you checked your Docker configuration and made sure to add the volume (see my docker compose file above, lines 15/16)? And have you checked whether the local folder you are referencing there is writable for Docker?

Thank you, sorry :)

@MariusHerget
Author

> > > snowflake-proxy | 2024/04/25 08:57:22 open /shared/docker_snowflake.log: permission denied
> > > I am having the permission denied error in the log folder, any ideas?
> >
> > Hey, have you checked your Docker configuration and made sure to add the volume (see my docker compose file above, lines 15/16)? And have you checked whether the local folder you are referencing there is writable for Docker?
>
> Thank you, sorry :)

No worries! I hope it works now :)
