#!/usr/bin/env python
import io
import os
import pandas as pd


def read_vcf(path):
    with open(path, 'r') as f:
        lines = [l for l in f if not l.startswith('##')]
    return pd.read_csv(
        io.StringIO(''.join(lines)),
        dtype={'#CHROM': str, 'POS': int, 'ID': str, 'REF': str, 'ALT': str,
               'QUAL': str, 'FILTER': str, 'INFO': str},
        sep='\t'
    ).rename(columns={'#CHROM': 'CHROM'})
Really convenient!
Oh thank you
Hi,
Thank you so much for this script! I am trying to run it on a VCF file.
Do you run the script like this: "python read_vcf.py vcf_filename"?
Thanks!
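As posted, the script only defines read_vcf() and never calls it, so running it with a file name produces no output. A minimal sketch of a command-line entry point follows, assuming the goal is to print the table as TSV. The `__main__` block and the choice of TSV output are assumptions for illustration, not part of the original gist:

```python
#!/usr/bin/env python
import io
import sys

import pandas as pd


def read_vcf(path):
    # Same logic as the gist: drop '##' meta lines, keep the '#CHROM' header row.
    with open(path, 'r') as f:
        lines = [l for l in f if not l.startswith('##')]
    return pd.read_csv(
        io.StringIO(''.join(lines)),
        dtype={'#CHROM': str, 'POS': int, 'ID': str, 'REF': str, 'ALT': str,
               'QUAL': str, 'FILTER': str, 'INFO': str},
        sep='\t'
    ).rename(columns={'#CHROM': 'CHROM'})


if __name__ == '__main__':
    # Hypothetical wrapper (not in the original gist): print the table as TSV.
    if len(sys.argv) == 2:
        read_vcf(sys.argv[1]).to_csv(sys.stdout, sep='\t', index=False)
    else:
        print('usage: python read_vcf.py <vcf_filename>', file=sys.stderr)
```

With this wrapper, "python read_vcf.py vcf_filename" writes the parsed table to standard output. Without it, import the function and call read_vcf('vcf_filename') from an interactive session or another script.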
I developed the pdbio package. Please use it. @pdorsaint
https://github.com/dceoy/pdbio
This package is a Pandas-based data handling tool and can also be used from the command line.
Example of VCF data handling:
$ pdbio vcf2csv --tsv ./test/example.vcf
Here is a way of doing it with PyVCF that uses all fields of any VCF: https://pyvcf.readthedocs.io/en/v0.4.6/INTRO.html
import pandas as pd
import vcf


def read(f):
    reader = vcf.Reader(open(f))
    df = pd.DataFrame([vars(r) for r in reader])
    out = df.merge(pd.DataFrame(df.INFO.tolist()),
                   left_index=True, right_index=True)
    return out
Then run read(your_vcf).
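If PyVCF is not installed, a similar INFO expansion can be sketched with pandas alone. The helper names parse_info and expand_info are hypothetical. The sketch assumes INFO uses the usual semicolon-separated KEY=VALUE layout, stores flag entries (no '=') as True, and keeps all other values as strings:

```python
import pandas as pd


def parse_info(info):
    """Turn 'DP=10;AF=0.5;DB' into {'DP': '10', 'AF': '0.5', 'DB': True}."""
    fields = {}
    if info in ('.', ''):
        # '.' marks a missing INFO field in VCF
        return fields
    for item in info.split(';'):
        key, sep, value = item.partition('=')
        # Entries without '=' are flags; record them as True
        fields[key] = value if sep else True
    return fields


def expand_info(df):
    """Append one 'INFO_<key>' column per INFO key (hypothetical helper)."""
    info = pd.DataFrame([parse_info(s) for s in df['INFO']], index=df.index)
    # The prefix avoids name clashes with the fixed VCF columns
    return df.join(info.add_prefix('INFO_'))


# Small made-up table for illustration only
df = pd.DataFrame({
    'CHROM': ['chr1', 'chr2'],
    'POS': [100, 101],
    'INFO': ['DP=10;AF=0.5', 'DP=7;DB'],
})
out = expand_info(df)
```

Keys missing from a record end up as NaN in that row. The INFO_ prefix is a design choice that keeps the new columns distinct from existing ones, where a plain merge would add _x/_y suffixes on a name clash.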
If anyone's interested, I was looking for a way to do this too and ended up writing the pyvcf submodule:
A quick example of pyvcf.VcfFrame:
data = {
    'CHROM': ['chr1', 'chr2'],
    'POS': [100, 101],
    'ID': ['.', '.'],
    'REF': ['G', 'T'],
    'ALT': ['A', 'C'],
    'QUAL': ['.', '.'],
    'FILTER': ['.', '.'],
    'INFO': ['.', '.'],
    'FORMAT': ['GT', 'GT'],
    'Steven': ['0/1', '1/1']
}
vf = pyvcf.VcfFrame.from_dict([], data)
vf.df
  CHROM  POS ID REF ALT QUAL FILTER INFO FORMAT Steven
0  chr1  100  .   G   A    .      .    .     GT    0/1
1  chr2  101  .   T   C    .      .    .     GT    1/1
To read a VCF file into VcfFrame:
vf = pyvcf.VcfFrame.from_file('example.vcf')
This was so so useful. Thank you very much @dceoy
It works great. Thanks
Hi,
Did you find a solution for not seeing any result after running the Python script? I am facing the same issue.
This was all I need for now. Thank you very much!! :)
That was indeed useful! Thank you very much!!
Nice. Very useful.