@egonw
Created January 13, 2016 21:11
WARNING: old DATASOURCENAME 'HMDB36CHEBI' doesn't match new DATASOURCENAME 'HMDB-CHEBI-WIKIDATA'
INFO: Lm is only in new database
INFO: Number of ids in Lm: 2565
INFO: Number of ids in Wd (Wikidata): 21293 (3208 added, 152 removed -> overall changed +16.8%)
INFO: Number of ids in Ck (KEGG Compound): 15941 (51 added, 6 removed -> overall changed +0.3%)
INFO: Number of ids in Wi (Wikipedia): 4731 (745 added, 5 removed -> overall changed +18.5%)
INFO: Number of ids in Kd (KEGG Drug): 2006 (270 added, 670 removed -> overall changed -16.6%)
WARNING: Number of ids in Kd has shrunk by more than 10%
INFO: Ik is only in new database
INFO: Number of ids in Ik: 41392
INFO: Number of ids in Ce (ChEBI): 67515 (3343 added, 50 removed -> overall changed +5.1%)
INFO: Number of ids in Cpc (PubChem-compound): 30374 (1489 added, 285 removed -> overall changed +4.1%)
INFO: Number of ids in Cs (Chemspider): 24891 (1049 added, 139 removed -> overall changed +3.8%)
INFO: Number of ids in Ch (HMDB): 41514 (0 added, 0 removed -> overall changed +0.0%)
INFO: Cks is only in new database
INFO: Number of ids in Cks: 4319
INFO: Number of ids in Ca (CAS): 41565 (3066 added, 122 removed -> overall changed +7.6%)
INFO: Attribute provided: Monoisotopic Weight
INFO: Attribute provided: InChIKey
INFO: Attribute provided: Symbol
INFO: Attribute provided: BrutoFormula
INFO: Attribute provided: InChI
INFO: Attribute provided: Synonym
INFO: Attribute provided: SMILES
INFO: new size is 188 Mb (changed +15.6%)
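
The "overall changed" percentages in the lines above appear to be the net change (added minus removed) relative to the old count, where the old count is the new count minus added plus removed. The sketch below reproduces the reported figures under that assumption. The function names and the 10% shrink threshold parameter are hypothetical, chosen only to mirror the log wording; this is not the code of the tool that produced the log.

```python
def percent_change(new_total, added, removed):
    """Net change relative to the old count, in percent.

    Assumption: old count = new count - added + removed.
    """
    old_total = new_total - added + removed
    if old_total == 0:
        return None  # no old entries to compare against
    return 100.0 * (added - removed) / old_total


def summarize(code, new_total, added, removed, shrink_threshold=10.0):
    """Return log-style lines for one data source (hypothetical format helper)."""
    change = percent_change(new_total, added, removed)
    lines = [
        f"INFO: Number of ids in {code}: {new_total} "
        f"({added} added, {removed} removed -> overall changed {change:+.1f}%)"
    ]
    # Warn when the source lost more than shrink_threshold percent of its ids.
    if change < -shrink_threshold:
        lines.append(
            f"WARNING: Number of ids in {code} has shrunk by more than "
            f"{shrink_threshold:.0f}%"
        )
    return lines


if __name__ == "__main__":
    for line in summarize("Kd", 2006, 270, 670):
        print(line)
```

Applied to the Kd figures (2006 ids, 270 added, 670 removed), the old count works out to 2406 and the net change to -400, which gives the -16.6% that triggers the warning.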
ERROR: 3991/3991 (100%) ids do not match expected pattern for Wikipedia
ERROR: expected pattern is ''
ERROR: aberrant ids are e.g. '1,2,4-Trimethylbenzene', '1,2,4-Trichlorobenzene', '1,1,1-Trichloroethane', '(Hydroxyethyl)methacrylate', '(S)-tetrahydroprotoberberine_N-methyltransferase', '1,1,2,2-Tetrachloroethane', '1,1-Dichloroethene', '1,2,4-Trihydroxyanthraquinone', '1,1,2-Trichloroethane', '1,2,3-Trimethylbenzene'
ERROR: 4816/64222 (7%) ids do not match expected pattern for ChEBI
ERROR: expected pattern is '^CHEBI:\d+$'
ERROR: aberrant ids are e.g. '100207', '1001644', '1002414', '1001275', '100241', '1000694', '10033', '100147', '10023', '100246'
WARNING: 27/38621 (0%) ids do not match expected pattern for CAS
WARNING: expected pattern is '^\d{1,7}\-\d{2}\-\d$'
WARNING: aberrant ids are e.g. '1952-41-6 15345-89-8', '15442-64-5 137090-60-9', '1405-86-3 103000-77-7', '20283-92-5 537-15-5', '10418-03-8 302-96-5', '11032-49-8 1182-68-9', '2180-92-9 38396-39-3', '14897-06-4 724691-52-5', '126-22-7:', '20369-67-9 708964-46-9'
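
The pattern checks above can be sketched as a full-string regular expression match per identifier. The two patterns are copied from the log; the helper name find_aberrant is hypothetical, and the use of a full match is an assumption about how the original tool compares ids. Under that assumption an empty expected pattern matches only the empty string, which would be consistent with 100% of the Wikipedia ids being reported as aberrant.

```python
import re

# Expected patterns as printed in the log; Wikipedia has an empty pattern.
PATTERNS = {
    "Wikipedia": r"",
    "ChEBI": r"^CHEBI:\d+$",
    "CAS": r"^\d{1,7}\-\d{2}\-\d$",
}


def find_aberrant(ids, pattern):
    """Return the ids that do not match the pattern as a whole string."""
    regex = re.compile(pattern)
    return [i for i in ids if regex.fullmatch(i) is None]


if __name__ == "__main__":
    # Sample ids taken from the log, plus one well-formed id per source.
    samples = {
        "Wikipedia": ["1,2,4-Trimethylbenzene"],
        "ChEBI": ["CHEBI:100207", "100207"],
        "CAS": ["126-22-7", "126-22-7:", "1952-41-6 15345-89-8"],
    }
    for source, ids in samples.items():
        bad = find_aberrant(ids, PATTERNS[source])
        print(f"{source}: {len(bad)}/{len(ids)} ids do not match {PATTERNS[source]!r}")
```

The sample ids suggest three separate causes: ChEBI ids stored without the CHEBI: prefix, CAS fields holding two registry numbers separated by a space or carrying a trailing colon, and a missing pattern for Wikipedia.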