This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
IAtomContainer mol = new QueryAtomContainer(bldr);
if (!Smarts.parse(mol, "[C,N;H0,H1+]-*", Smarts.FLAVOR_LOOSE)) {
    System.err.println("ERROR - " + Smarts.getLastErrorMesg());
    System.err.println(Smarts.getLastErrorLocation());
    return;
}
QueryAtom qatom1 = (QueryAtom) mol.getAtom(0); // instanceof for safety
Expr expr = qatom1.getExpression();
System.err.println(expr); // AND(OR(ALIPHATIC_ELEMENT=6,ALIPHATIC_ELEMENT=7),OR(TOTAL_H_COUNT=0,AND(TOTAL_H_COUNT=1,FORMAL_CHARGE=1)))
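The printed expression is a small boolean tree: the low-precedence `;` in the SMARTS becomes the outer AND, each `,` becomes an OR, and `H1+` is an implicit AND of a hydrogen count and a charge. As a rough illustration of how such a tree could be evaluated against an atom, here is a minimal Python sketch. It is not CDK code: the tuple encoding, the `matches` function, and the atom dictionaries are hypothetical stand-ins, and the sketch checks only the atomic number, ignoring the aliphatic requirement.

```python
# Hypothetical tuple encoding of the expression tree printed above (not a CDK API).
EXPR = ("AND",
        ("OR", ("ELEMENT", 6), ("ELEMENT", 7)),
        ("OR", ("TOTAL_H_COUNT", 0),
               ("AND", ("TOTAL_H_COUNT", 1), ("FORMAL_CHARGE", 1))))


def matches(expr, atom):
    """Recursively evaluate an expression tree against an atom given as a dict."""
    op = expr[0]
    if op == "AND":
        return all(matches(sub, atom) for sub in expr[1:])
    if op == "OR":
        return any(matches(sub, atom) for sub in expr[1:])
    if op == "ELEMENT":
        return atom["element"] == expr[1]
    if op == "TOTAL_H_COUNT":
        return atom["hcount"] == expr[1]
    if op == "FORMAL_CHARGE":
        return atom["charge"] == expr[1]
    raise ValueError("unknown expression type: " + op)


# Illustrative atoms: atomic number, total hydrogen count, formal charge.
quaternary_c = {"element": 6, "hcount": 0, "charge": 0}
charged_nh = {"element": 7, "hcount": 1, "charge": 1}
neutral_nh = {"element": 7, "hcount": 1, "charge": 0}

print(matches(EXPR, quaternary_c))  # True: carbon with no hydrogens
print(matches(EXPR, charged_nh))    # True: nitrogen, one hydrogen, charge +1
print(matches(EXPR, neutral_nh))    # False: one hydrogen but no charge
```

The third atom fails because `H1+` binds more tightly than the `,` next to it, so a single hydrogen only counts when the atom is also positively charged.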
curl ftp://ftp.ncbi.nlm.nih.gov/pubchem/Compound/Weekly/2020-01-05/Extras/CID-SMILES.gz | \
  gunzip -c | \
  head -n 10000000 | \
  awk '{print $2 " " $1}' > pubchem_first10m.smi
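The `head` and `awk` stages of the pipeline above keep the first N input lines and swap the first two whitespace-separated fields, so that each record reads SMILES first and identifier second. A minimal Python sketch of the same two steps follows, assuming the input is already decompressed text; the function name `swap_columns` and the sample records are made up for illustration and are not real PubChem data.

```python
from itertools import islice


def swap_columns(lines, limit):
    """Yield 'SMILES CID' for up to `limit` input lines of 'CID SMILES'."""
    for line in islice(lines, limit):
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines (awk would print them anyway)
        yield fields[1] + " " + fields[0]


# Made-up sample records in 'CID<tab>SMILES' layout.
sample = ["1\tCCO\n", "2\tc1ccccc1\n", "3\tCC(=O)O\n"]

for record in swap_columns(sample, 2):
    print(record)
```

As in the shell version, the limit applies to input lines before the swap, so it plays the role of `head -n`.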
elements = {'H': 1, 'He': 2}  # ...and so on for the remaining elements

def strtoelem(element):
    # Return the atomic number for an element symbol, or 0 if it is unknown.
    return elements.get(element, 0)
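One way to avoid typing each atomic number by hand is to build the table from a list of symbols in periodic order. The sketch below does this for the first ten elements only; the names `SYMBOLS`, `ELEMENTS`, and `str_to_elem` are illustrative, and a real table would continue through the rest of the periodic table.

```python
# Symbols of the first ten elements in order of atomic number.
SYMBOLS = ["H", "He", "Li", "Be", "B", "C", "N", "O", "F", "Ne"]

# Map each symbol to its atomic number (position in the list, starting at 1).
ELEMENTS = {symbol: number for number, symbol in enumerate(SYMBOLS, start=1)}


def str_to_elem(symbol):
    """Return the atomic number for a symbol, or 0 if it is not in the table."""
    return ELEMENTS.get(symbol, 0)


print(str_to_elem("C"))   # 6
print(str_to_elem("Xx"))  # 0
```

Returning 0 for an unrecognised symbol mirrors the default used in the snippet above.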
public static void alignMoleculeToSubstructure(IAtomContainer mol,
                                               IAtomContainer sub,
                                               boolean fixBonds) throws CDKException {
    Pattern substructurePattern = Pattern.findSubstructure(sub);
    Mappings mappings = substructurePattern.matchAll(mol);
    Set<IAtom> fixedAtoms = new HashSet<IAtom>();
    Set<IBond> fixedBonds = new HashSet<IBond>();
    for (Map<IChemObject, IChemObject> map : mappings.toAtomBondMap()) {
        GeometryUtil.scaleMolecule(sub,
public static void main(String[] args) throws CDKException {
    IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
    SmilesParser smipar = new SmilesParser(bldr);
    IAtomContainer root = smipar.parseSmiles("CC1=CC=NC2=C1C=CC(=C2)C1=CC([R1])=C([R2])C([R3])=C1");
    Map<IAtom, Map<Integer,IBond>> rootAttach = new HashMap<>();
    Map<Integer,RGroupList> rgrpMap = new HashMap<>();
    defineRgroup(root, rootAttach, rgrpMap, "R1", newRGroupList("[H].[CH2]CO.[CH2]Cl", 1));
    defineRgroup(root, rootAttach, rgrpMap, "R2", newRGroupList("[H].[CH2]CN.[CH2]F", 2));
    defineRgroup(root, rootAttach, rgrpMap, "R3", newRGroupList("[H].[CH2]CCl.[CH2]F", 3));
Building on Sand: Standard InChIs on Non-Standard Molfiles
John Mayfield
The molfile serves as a de facto standard for chemical information exchange. It is perhaps the most
widely supported format, with its core syntax being easy to understand, parse, and generate. Beyond
the core syntax, more advanced features such as sgroups and enhanced stereochemistry are rarely
supported, often only being partially implemented and used. Additionally, several vendors,
toolkits, and service providers have added extended syntax to their molfiles to solve particular
corner cases or representation problems. This talk will provide a brief summary of the less widely
supported features of the molfile including sgroups and enhanced stereochemistry. Additionally,
public static String toSmiles(CircularFingerprinter.FP fp, IAtomContainer mol) throws CDKException
{
    IAtomContainer part = mol.getBuilder().newAtomContainer();
    Set<IAtom> aset = new HashSet<>();
    int[] hcounts = new int[mol.getAtomCount()];
    for (int idx : fp.atoms) {
        IAtom atom = mol.getAtom(idx);
        aset.add(atom);
        part.addAtom(atom);
        hcounts[idx] = atom.getImplicitHydrogenCount();
Aromaticity arom = new Aromaticity(ElectronDonation.cdk(),
                                   Cycles.cdkAromaticSet());
SmilesParser smipar = new SmilesParser(SilentChemObjectBuilder.getInstance());
String smi = "*CCO*";
IAtomContainer mol = smipar.parseSmiles(smi);
Sgroup sgrp = new Sgroup();
sgrp.addAtom(mol.getAtom(1));
sgrp.addAtom(mol.getAtom(2));
sgrp.addAtom(mol.getAtom(3));
sgrp.addBond(mol.getBond(0)); // bond crossing bracket (xbond)
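A crossing bond is one with exactly one end inside the bracketed atom set. The Python sketch below works this out for the `*CCO*` chain without any toolkit. It assumes the bonds are numbered in the order they appear in the SMILES string, which is consistent with bond 0 being a crossing bond in the snippet above; the `crossing_bonds` helper is hypothetical and not part of the CDK.

```python
# Bonds of the linear chain *CCO* as (begin, end) atom index pairs;
# atoms 0 and 4 are the '*' attachment points.
bonds = [(0, 1), (1, 2), (2, 3), (3, 4)]

# The atoms added to the Sgroup above (C, C and O).
bracketed = {1, 2, 3}


def crossing_bonds(bonds, atoms):
    """Return indices of bonds with exactly one end inside `atoms`."""
    return [i for i, (beg, end) in enumerate(bonds)
            if (beg in atoms) != (end in atoms)]


print(crossing_bonds(bonds, bracketed))  # [0, 3]
```

Under that numbering assumption, the bracket around `CCO` is crossed by the first and last bonds of the chain.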
public static void main(String[] args) throws CDKException, IOException {
    String rxnfile = "$RXN\n"
                     + "\n"
                     + "\n"
                     + "\n"
                     + " 2 1 0\n"
                     + "$MOL\n"
                     + "\n"
                     + " Ketcher 01271718382D 1 1.00000 0.00000 0\n"
                     + "\n"