George Carvalho geocarvalho
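import pandas as pd

# Read a 6-column BED file; the name column is assumed to look like "<gene>_<suffix>"
file = "input.bed"
df = pd.read_csv(file, sep="\t", names=["chr", "start", "end", "interval", "score", "strand"])
# Split the name on the first underscore only (n must be passed by keyword in pandas >= 2.0)
df[["gene", "extra"]] = df["interval"].str.split("_", n=1, expand=True)
df.drop(["interval", "score", "strand", "extra"], axis=1, inplace=True)
# Collapse each gene to a single interval spanning its smallest start and largest end
new_df = df.groupby("gene").agg({"chr": "unique", "start": "min", "end": "max"})
new_df.reset_index(inplace=True)
# Keep the first chromosome listed for each gene
new_df["chr"] = new_df["chr"].apply(lambda chroms: chroms[0])
new_df["start"] = new_df["start"].astype("str")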
geocarvalho / awesome_bioinformatics.md
Last active March 24, 2023 23:27
Learn clinical bioinformatics: my recommended resources for studying genomics bioinformatics and clinical bioinformatics

Learn Clinical Bioinformatics 📚

  • My personal list of recommended resources for studying genomics bioinformatics and clinical bioinformatics

Courses

  • This is the best open-source course I have found so far.
geocarvalho / CHANGELOG.md
Created June 9, 2023 22:23 — forked from juampynr/CHANGELOG.md
Sample CHANGELOG

Change Log

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

[Unreleased] - yyyy-mm-dd

Here we write upgrading notes for brands. It's a team effort to make them as