project description
list of required packages or software, versions of systems
(Note. If you missed the "Python Primer" lab and you have no Python experience, please come to Jiho's office hours. We can do this together.)
Fibonacci numbers are defined recursively as below:
f(0) = 0, f(1) = 1
f(n) = f(n-1) + f(n-2) for n ≥ 2
Implement a Fibonacci function fib(n), and use it to print the Fibonacci numbers for n ranging from 0 to 9, such as:
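A minimal sketch of one way to approach this exercise, assuming an iterative implementation is acceptable (the lab may expect a directly recursive one). Only the name fib(n) comes from the text; the error handling and the output format are illustrative assumptions.

```python
def fib(n):
    """Return the n-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    if n < 0:
        # Assumption: negative inputs are rejected; the text only defines n >= 0.
        raise ValueError("n must be non-negative")
    a, b = 0, 1  # a holds f(i), b holds f(i+1)
    for _ in range(n):
        a, b = b, a + b  # advance one step: f(i+1), f(i+2)
    return a


if __name__ == "__main__":
    # Print f(0) through f(9), one value per line (output format is an assumption).
    for n in range(10):
        print(n, fib(n))
```

The loop keeps only the last two values, so it avoids the repeated work that a naive recursive version would do for larger n.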
""" UTS (UMLS Terminology Services) API client """ | |
import json
from pathlib import Path
import requests
from lxml.html import fromstring
class UtsClient:
    """All the UTS REST API requests are handled through this client"""
    def __init__(self, apikey=None):
#!/usr/bin/env python3
# pylint: disable=invalid-name
"""Reads UMNSRS datasets where CUIs are used and adds the corresponding MeSH
terms to the CUIs"""
from pathlib import Path
import csv
from tqdm import tqdm
from BMET.uts_api_client import UtsClient
#!/usr/bin/env python3
"""
Parses the PubTator raw data file, creates an Elasticsearch index, and populates
the corpus (index: 'pubtator').
You can download the PubTator dataset from the following link:
ftp://ftp.ncbi.nlm.nih.gov/pub/lu/PubTator/bioconcepts2pubtator_offsets.gz
Place the file in 'data/' and run this code. There is no need to decompress the file.
We assume that Elasticsearch is properly installed and accessible via its API
#!/usr/bin/env python3
"""Preprocess the PubTator corpus and the ScopeNotes of MeSH descriptors for
language model training (LmBMET).
1) Given the original PubTator documents annotated with bioconcepts, this
interpolates the concept codes into the document texts. Before that, it counts
word frequencies and generates a vocabulary that includes the entire set of
bioconcepts (MeSH in particular). If a pre-trained embeddings file (.vec) is
provided, the vocabulary is obtained from the embeddings.
import csv
import sqlite3
eval_file = "data/eval/MayoSRS_mesh.csv"
db_file = "data/pubtator/pubtator-20190725-6496be10.db"
words = []
with open(eval_file) as f:
    csv_reader = csv.DictReader(f)
    for row in csv_reader:
instruction note (lab8)
Include contact.h from contactList.h. Use double quotation marks.
#pragma once tells the compiler to include the source code only once.
struct type_name {
    int var1, var2;
Easiest so far!!!