Mikko Heikkinen mikkohei13
mikkohei13 / atlas_winter.py
Created March 14, 2025 18:00
Fetches atlas observations from the winter period that can affect square results
# Script that fetches winter period atlas observations and checks which of them might affect atlas results.
import os
import requests
import json
import time
from datetime import datetime
# Fetch atlas observations
mikkohei13 / augmentation.py
Last active March 8, 2025 19:40
Image augmentation for machine learning
# Image augmentation for machine learning
# Script that loops through images in subdirectories, replaces the background using rembg, and applies transformations to a selected number of images in each directory
import rembg
from pathlib import Path
import random
from PIL import Image, ImageEnhance, ImageFilter
import numpy as np
import gc
import time
mikkohei13 / convert_to_flac.ps1
Created February 26, 2025 15:10
PowerShell script that converts all wav files in a folder (and subfolders) into flac format.
# Check if FFmpeg is installed
try {
    ffmpeg -version | Out-Null
} catch {
    Write-Error "FFmpeg is not installed or not in PATH. Please install FFmpeg first."
    exit 1
}
# Allow user to specify the root directory or use current directory as default
$rootDirectory = Read-Host "Enter the root directory path (press Enter to use current directory)"
# PowerShell script for Windows.
# This script finds directories with 10 or more photos in them,
# matching the filename patterns used by digital cameras: IMG_, DSC_, or DSCN.
param(
    [Parameter(Mandatory=$true)]
    [string]$DirectoryPath
)
# Verify the directory exists
# Join en and sv names from Syke's file to original Lely habitat file
# Habitat file, tab separated values
habitat_file = 'habitat_classification_v1.0.tsv'
"""
Example data of habitat_classification_v1.0.tsv:
Enum Vihkon elinympäristö 1.taso 2. taso 3. taso Swedish English
MY.habitatEnumValue1 Metsät Metsä Skog Forest
# Python script that creates a CSV file with image file paths and their corresponding categories
# Should work on both Linux & Windows, and with unicode filenames
'''
Input format:
directory;category
./lepidoptera/adult;adult
./lepidoptera/adult_specimen;adult_specimen
./lepidoptera/egg;egg
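The mapping from directories to categories described above could be turned into an image list roughly as follows. This is a minimal sketch, not the gist's actual code: the function names, the output column names (`filepath`, `category`), the set of image extensions, and the recursive directory walk are all assumptions, and the first line of the input is assumed to be the `directory;category` header.

```python
import csv
from pathlib import Path

# Assumed set of image file types; adjust to the data at hand.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def read_mapping(mapping_path):
    """Read 'directory;category' rows, treating the first line as a header."""
    with open(mapping_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter=";")
        return [(row["directory"], row["category"]) for row in reader]


def write_image_list(mapping, output_path):
    """Write one 'filepath,category' row per image found under each directory.

    Returns the number of image rows written.
    """
    count = 0
    with open(output_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["filepath", "category"])
        for directory, category in mapping:
            # rglob walks subdirectories too (an assumption of this sketch).
            for path in sorted(Path(directory).rglob("*")):
                if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                    # as_posix() gives forward slashes on both Linux and Windows.
                    writer.writerow([path.as_posix(), category])
                    count += 1
    return count
```

Opening both files with an explicit `encoding="utf-8"` keeps Unicode filenames intact regardless of the platform's default encoding.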
mikkohei13 / convert.py
Last active January 7, 2025 13:55
Convert Luke atlas dataset 2024 to FinBIF Data Bank format
'''
Converts "Luken aineistokooste 4. lintuatlakseen" tsv file into FinBIF Data Bank secondary data format.
Mikko Heikkinen 2023-12-29, updated 2025-01-07
'''
import pandas as pd
# Save file from Excel as UTF-8 CSV
# Load the file into a Pandas dataframe using tab as the delimiter. Keep "NA" as a value.
file_path = 'Luke_lintuatlasdata_2023-2024.csv'
mikkohei13 / parse_chatgpt_conversations.py
Created December 14, 2024 15:38
Script to parse ChatGPT conversations.json file to analyze how many messages were sent by ChatGPT and users, and how many conversations were created per year and month.
# Script to parse ChatGPT conversations.json file to analyze how many messages were sent by ChatGPT and users, and how many conversations were created per year and month.
# conversations.json is a JSON file containing a list of conversations. You can get this by starting a data export in ChatGPT settings.
import json
from collections import defaultdict
from datetime import datetime
# Load the JSON file
file_path = "conversations.json"
with open(file_path, "r") as file:
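The counting described in the gist's summary could look roughly like this. It is a sketch under stated assumptions, not the gist's own code: it assumes each conversation has a `create_time` Unix timestamp and a `mapping` dictionary whose nodes may hold a `message` with an `author.role` field, which is how exported files have commonly been structured but may change. The function name `summarize` is hypothetical, and `Counter` is used here in place of the `defaultdict` the original imports.

```python
from collections import Counter
from datetime import datetime, timezone


def summarize(conversations):
    """Count messages per author role and conversations per 'YYYY-MM'."""
    messages_by_role = Counter()
    conversations_by_month = Counter()
    for conversation in conversations:
        created = conversation.get("create_time")  # assumed Unix timestamp
        if created is not None:
            month = datetime.fromtimestamp(created, tz=timezone.utc).strftime("%Y-%m")
            conversations_by_month[month] += 1
        # Assumed layout: "mapping" maps node ids to nodes with an optional message.
        for node in conversation.get("mapping", {}).values():
            message = node.get("message")
            if message:  # some nodes carry no message
                role = message.get("author", {}).get("role", "unknown")
                messages_by_role[role] += 1
    return messages_by_role, conversations_by_month


# Usage with an exported file (not run here):
#   import json
#   with open("conversations.json", "r", encoding="utf-8") as file:
#       by_role, by_month = summarize(json.load(file))
#   for month, count in sorted(by_month.items()):
#       print(month, count)
```

Timestamps are converted in UTC so that the month buckets do not depend on the machine's local time zone.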
mikkohei13 / lajifi-monthly.py
Last active December 5, 2024 10:17
Generate a line graph from a FinBIF API json data file aggregated by year and month
# Script that reads a FinBIF API JSON data file aggregated by year and month and generates a line graph.
# Get data e.g. at
# https://api.laji.fi/v0/warehouse/query/unit/aggregate?aggregateBy=gathering.conversions.month%2Cgathering.conversions.year&onlyCount=true&taxonCounts=false&gatheringCounts=false&pairCounts=false&atlasCounts=false&excludeNulls=true&pessimisticDateRangeHandling=false&pageSize=1000&page=1&cache=false&useIdentificationAnnotations=true&includeSubTaxa=true&includeNonValidTaxa=true&time=2000%2F2024&collectionId=HR.3551&individualCountMin=1&qualityIssues=NO_ISSUES&access_token=TOKEN
import json
import pandas as pd
import matplotlib.pyplot as plt
from collections import defaultdict
'''
mikkohei13 / compare.py
Created November 14, 2024 14:52
Compares IDs from two files and outputs the IDs that are in the first file but not in the second file
# Compares IDs from two files and outputs the IDs that are in the first file but not in the second file
import pandas as pd
def read_column_to_list(filename, column_name, separator=','):
    """
    Reads a single column from a large CSV or TSV file into a list.
    :param filename: The path to the CSV or TSV file.
    :param column_name: The name of the column to read.
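The comparison this gist describes (IDs present in the first file but absent from the second) can be sketched with the standard `csv` module instead of pandas. The function names below are hypothetical and the code is an illustration of the idea, not the gist's implementation.

```python
import csv


def read_column(filename, column_name, separator=","):
    """Return the values of one named column, reading the file row by row."""
    with open(filename, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter=separator)
        return [row[column_name] for row in reader]


def ids_missing_from_second(first_ids, second_ids):
    """Return IDs found in first_ids but not in second_ids, in first-file order."""
    second = set(second_ids)  # set membership keeps the lookup fast for large files
    return [i for i in first_ids if i not in second]
```

Passing `separator="\t"` handles TSV input. Building a set from the second file avoids a quadratic scan when both files are large.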