@dklassen
Created January 25, 2013 15:23
JRuby example of using the Chemistry Development Kit (CDK) to parse SMILES strings into a canonical format
# JRuby example using the Chemistry Development Kit (CDK) to parse SMILES strings into a
# canonical format
# author: Dana Klassen
require 'rubygems'
require 'java'
# $CLASSPATH << File.join(File.expand_path(File.dirname(__FILE__)),"lib")
Dir.glob(File.join(File.expand_path(File.dirname(__FILE__)),"lib/*.jar")) { |file| require file }
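# java_import is preferred over the bare `import`, which can clash with other libraries (e.g. Rake)
java_import 'org.openscience.cdk.smiles.SmilesParser'
java_import 'org.openscience.cdk.DefaultChemObjectBuilder'
java_import 'org.openscience.cdk.smiles.SmilesGenerator'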
parser = SmilesParser.new(DefaultChemObjectBuilder.getInstance())
generator = SmilesGenerator.new()
# example SMILES strings we are going to normalize
# both encode caffeine, written with different atom orderings :)
smiles1 = "CN2C(=O)N(C)C(=O)C1=C2N=CN1C"
smiles2 = "CN1C=NC2=C1C(=O)N(C)C(=O)N2C"
puts "INFO: inputting two SMILES strings:"
puts " #{smiles1}"
puts " #{smiles2}"
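# Round-trip each string through the parser and the generator to obtain its canonical form.
# NOTE: createSMILES is the method name in the CDK release this script was written against;
# other CDK releases may expose SMILES generation under a different method.
smiles1 = generator.createSMILES(parser.parseSmiles(smiles1))
smiles2 = generator.createSMILES(parser.parseSmiles(smiles2))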
puts "INFO: outputting canonical SMILES from input:"
puts " #{smiles1}"
puts " #{smiles2}"
puts "INFO: checking that these SMILES are identical"
# Print the comparison result either way, so a mismatch shows "false" instead of printing nothing
puts(smiles1 == smiles2)
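The same idea extends from comparing two strings to de-duplicating a whole list: canonicalize each entry and group the inputs by the result. The following is a minimal sketch, not part of the original script. `group_by_canonical` is a hypothetical helper, and the lookup table is a stand-in so the example runs without CDK on the classpath; its values are placeholder labels, not real canonical SMILES. With CDK loaded as in the script above, the callable could instead be `->(s) { generator.createSMILES(parser.parseSmiles(s)) }`.

```ruby
# Hypothetical helper: groups SMILES strings by whatever canonical form
# the supplied callable returns for each of them.
def group_by_canonical(smiles_list, canonicalizer)
  smiles_list.group_by { |smiles| canonicalizer.call(smiles) }
end

# Stand-in lookup table (an assumption for illustration only).
# The values are placeholder labels, NOT canonical SMILES produced by CDK.
stand_in_table = {
  "CN2C(=O)N(C)C(=O)C1=C2N=CN1C" => "caffeine",
  "CN1C=NC2=C1C(=O)N(C)C(=O)N2C" => "caffeine",
  "CCO"                          => "ethanol",
  "OCC"                          => "ethanol"
}.freeze

# Stand-in canonicalizer; swap in a CDK-backed lambda when the jars are available.
stand_in = ->(smiles) { stand_in_table.fetch(smiles) }

groups = group_by_canonical(stand_in_table.keys, stand_in)

# Each group lists the input spellings that share one canonical form.
groups.each do |canonical, members|
  puts "#{canonical}: #{members.join(', ')}"
end
```

Passing the canonicalizer in as a callable keeps the grouping logic independent of CDK, so it can be exercised in plain Ruby and wired to the real parser and generator under JRuby.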