George Carvalho geocarvalho

WDL tutorials

Introducing the Learn WDL Course

My personal list of recommended resources for studying genomics bioinformatics and clinical bioinformatics

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Here we write upgrading notes for brands. It's a team effort to make them as

	import pandas as pd

	file = "input.bed"
	df = pd.read_csv(file, sep="\t", names=["chr", "start", "end", "interval", "score", "strand"])
	# Split on the first underscore only; n must be passed by keyword in pandas >= 2.0
	df[["gene", "extra"]] = df["interval"].str.split("_", n=1, expand=True)
	df.drop(["interval", "score", "strand", "extra"], axis=1, inplace=True)
	# One row per gene: distinct chromosomes, smallest start, largest end
	new_df = df.groupby("gene").agg({"chr": "unique", "start": "min", "end": "max"})
	new_df.reset_index(inplace=True)
	# Keep the first chromosome listed for each gene (avoid shadowing the builtin chr)
	new_df["chr"] = new_df["chr"].apply(lambda chroms: chroms[0])
	new_df["start"] = new_df["start"].astype("str")