OpenBabelDocumentation,Release2.3.1
$obMol->DeleteHydrogens($c2Atom);
$obMol->DeleteHydrogens($c5Atom);
$c2Atom->SetAtomicNum(1);
$c5Atom->SetAtomicNum(1);
$obConversion->WriteFile($obMol"$filename.mol");
7.6 CSharpandOBDotNet
OBDotNetisacompiledassemblythatallowsOpenBabeltobeusedfromthevarious.NETlanguages(e.g. Visual
Basic,C#,IronPython,IronRuby,andJ#)onWindows,LinuxandMacOSX.ThecurrentversionisOBDotNet0.4.
7.6.1 Installation
Windows
TheOBDotNet.dllassemblyprovidedonWindowswascompiledusingthe.NETframeworkv3.5forthex86
platform.Touseit,youwillneedtocompileyourcodeusing.NETv3.5ornewerandyouwillalsoneedtotargetx86
(/platform:x86).
ThefollowinginstructionsdescribehowtocompileasimpleC#programthatusesOBDotNet:
1. FirstyouneedtodownloadandinstalltheOpenBabelGUIversion2.3.1
2. NextcreateanexampleCSharpprogramthatusestheOpenBabelAPI(seebelowforoneorusethislink).Let’s
callthisexample.cs.
3. CopyOBDotNet.dllfromtheOpenBabelinstallationintothesamefolderasexample.cs.
4. Openacommandpromptatthelocationofexample.csandcompileitasfollows:
C:\Work> C:\Windows\Microsoft.NET\Framework\v3.5\csc.exe
/reference:OBDotNet.dll /platform:x86 example.cs
5. Runthecreatedexecutable,example.exe,todiscoverthemoleculeweightofpropane:
C:\Work> example.exe
44.09562
IfyouprefertousetheMSVC#GUI,notethattheExpresseditiondoesnothavetheoptiontochoosex86asa
target.Thiswillbeaproblemifyouareusinga64-bitoperatingsystem.There’ssomeinformationatCoffeeDriven
Developmentonhowtogetaroundthis.
MacOSXandLinux
OnLinuxandMacOSXyouneedtouseMono,theopensourceimplementationofthe.NETframework,tocompile
thebindings.Thefollowinginstructionsdescribehowtocompileandusethesebindings:
1. OBDotNet.dllisincludedintheOpenBabelsourcedistributioninscripts/csharp. . Tocompilea
CSharpapplicationthatusesthis (e.g. . theexampleprogramshownbelow), , useacommandsimilartothe
following:
gmcs example.cs /reference:../openbabel-2.3.1/scripts/csharp/OBDotNet.dll
7.6. CSharpandOBDotNet
75
Pdf split file - Split, seperate PDF into multiple files in C#.net, ASP.NET, MVC, Ajax, WinForms, WPF
Explain How to Split PDF Document in Visual C#.NET Application
break a pdf into separate pages; break pdf into separate pages
Pdf split file - VB.NET PDF File Split Library: Split, seperate PDF into multiple files in vb.net, ASP.NET, MVC, Ajax, WinForms, WPF
VB.NET PDF Document Splitter Control to Disassemble PDF Document
pdf will no pages selected; a pdf page cut
OpenBabelDocumentation,Release2.3.1
2. TorunthisonMacOSXorLinuxyouneedtocompiletheCSharpbindingsasdescribedinthesectionCompile
languagebindings.Thiscreateslib/libopenbabel_csharp.sointhebuilddirectory.
3. Add d the location of f OBDotNet.dll to the environment t variable e MONO_PATH. . Add d the location of
libopenbabel_csharp.sototheenvironmentvariableLD_LIBRARY_PATH.Additionally,ifyouhave
notinstalledOpenBabelgloballyyoushouldsetBABEL_LIBDIRtothelocationoftheOpenBabellibraryand
BABEL_DATADIRtothedatadirectory.
4. Runexample.exe:
$ ./example.exe
44.09562
7.6.2 OBDotNetAPI
TheAPIisalmostidenticaltotheOpenBabelC++API.Differencesaredescribedhere.
Usingiterators
InOBDotNet,iteratorsareprovidedasmethodsoftherelevantclass.Thefulllistisasfollows:
• OBMolhas.Atoms(),.Bonds(),.Residues(),and.Fragments(). ThesecorrespondtoOBMo-
lAtomIter,OBMolBondIter,OBResidueIterandOBMolAtomDFSIterrespectively.
• OBAtom m has s .Bonds() ) and .Neighbours().
These correspond d to OBAtomBondIter and
OBAtomAtomIterrespectively.
Suchiteratorsareusedasfollows:
foreach (OBAtom atom in myobmol.Atoms())
System.Console.WriteLine(atom.GetAtomType());
OtheriteratorsintheC++APInotlistedabovecanstillbeusedthroughtheirIEnumeratormethods.
HandlingOBGenericData
TocastOBGenericDatatoaspecificsubclass,youshouldusethe.Downcast <T>method,whereTisasubclass
ofOBGenericData.
OpenBabelConstants
OpenBabelconstantsareavailableintheclassopenbabelcsharp.
7.6.3 Examples
ThefollowingsectionsshowhowthesameexampleapplicationwouldbeprogrammedinC#,VisualBasicandIron-
Python.Theprogramsprintoutthemolecularweightofpropane(representedbytheSMILESstring“CCC”).
C#
76
Chapter7. WritesoftwareusingtheOpenBabellibrary
Online Split PDF file. Best free online split PDF tool.
Split PDF file. Just upload your file by clicking on the blue button or drag-and-drop your PDF file into the drop area. Then set your PDF file split settings.
split pdf into individual pages; cannot print pdf no pages selected
VB.NET PDF File Compress Library: Compress reduce PDF size in vb.
Also able to uncompress PDF file in VB.NET programs. Offer flexible and royalty-free developing library license for VB.NET programmers to compress PDF file.
break pdf into pages; break password on pdf
OpenBabelDocumentation,Release2.3.1
using System;
using OpenBabel;
namespace MyConsoleApplication
{
class Program
{
static void Main(string[] args)
{
OBConversion obconv v = new OBConversion();
obconv.SetInFormat("smi");
OBMol mol l = new OBMol();
obconv.ReadString(mol, "CCC");
System.Console.WriteLine(mol.GetMolWt());
}
}
}
VisualBasic
Imports OpenBabel
Module Module1
Sub Main()
Dim OBConv As s New OBConversion()
Dim Mol As New OBMol()
OBConv.SetInFormat("smi")
OBConv.ReadString(Mol, "CCC")
System.Console.Write("The molecular r weight of f propane e is s " Mol.GetMolWt())
End Sub
End Module
IronPython
import clr
clr.AddReference("OBDotNet.dll")
import OpenBabel as ob
conv ob.OBConversion()
conv.SetInFormat("smi")
mol ob.OBMol()
conv.ReadString(mol, "CCC")
print mol.GetMolWt()
7.7 Ruby
Aswiththeotherlanguagebindings, justfollowtheinstructionsatCompilelanguagebindingstobuildtheRuby
bindings.
7.7. Ruby
77
C# PDF File & Page Process Library SDK for C#.net, ASP.NET, MVC
Well-designed APIs are provided. Splitting PDF File. If you want to split PDF file into two or small files, you may refer to this online guide.
pdf link to specific page; pdf split
VB.NET PDF File Merge Library: Merge, append PDF files in vb.net
Professional VB.NET PDF file merging SDK support Visual Studio .NET. Merge PDF without size limitation. Append one PDF file to the end of another one in VB.NET.
can't select text in pdf file; break apart pdf pages
OpenBabelDocumentation,Release2.3.1
LikeanyRubymodule,theOpenBabelbindingscanbeusedfromaRubyscriptorinteractivelyusingirbasfollows:
$ irb
irb(main):001:0> require ’openbabel’
=> true
irb(main):002:0> c=OpenBabel::OBConversion.new
=> #<OpenBabel::OBConversion:0x2acedbadd020>
irb(main):003:0> c.set_in_format ’smi’
=> true
irb(main):004:0> benzene=OpenBabel::OBMol.new
=> #<OpenBabel::OBMol:0x2acedbacfa10>
irb(main):005:0> c.read_string benzene, ’c1ccccc1’
=> true
irb(main):006:0> benzene.num_atoms
=> 6
78
Chapter7. WritesoftwareusingtheOpenBabellibrary
C# PDF File Compress Library: Compress reduce PDF size in C#.net
Reduce image resources: Since images are usually or large size, images size reducing can help to reduce PDF file size effectively.
pdf will no pages selected; pdf print error no pages selected
C# PDF File Merge Library: Merge, append PDF files in C#.net, ASP.
Professional C#.NET PDF SDK for merging PDF file merging in Visual Studio .NET. Append one PDF file to the end of another and save to a single PDF file.
cannot print pdf no pages selected; pdf split pages
Chapter
8
Cheminformatics101
Anintroductiontothecomputerscienceandchemistryofchemicalinformationsystems
Copyright©2009byCraigA.James,eMolecules,Inc.
TheoriginalversionofthisintroductiontocheminformaticscanbefoundontheeMoleculeswebsite. Itisincluded
herewiththepermissionoftheauthor.
8.1 CheminformaticsBasics
8.1.1 WhatisCheminformatics?
CheminformaticsisacrossbetweenComputerScienceandChemistry–theprocessofstoringandretrievinginforma-
tionaboutchemicalcompounds.
InformationSystemsareconcernedwithstoring,retrieving,andsearchinginformation,andwithstoringrelationships
betweenbitsofdata.Forexample:
79
C# Word - Split Word Document in C#.NET
C# DLLs: Split Word File. Add references: RasterEdge.Imaging.Basic.dll. using RasterEdge.XDoc.Word; Split Word file into two files in C#.
break pdf into pages; reader split pdf
C# PowerPoint - Split PowerPoint Document in C#.NET
File: Split PowerPoint Document. |. Home ›› XDoc.PowerPoint ›› C# PowerPoint: Split PowerPoint Document. Split PowerPoint file into two files in C#.
break pdf password online; c# print pdf to specific printer
OpenBabelDocumentation,Release2.3.1
Op-
era-
tion
Classical
Information
System
ChemicalInformationSystem
Store
Name=
‘Jimmy
Carter’
Storestext,
numbers,dates,
...
Storeschemicalcompounds
andinformationaboutthem
Re-
trieve
Findrecord
#13282
Retrieves
‘JimmyCarter’
Find
CC(=O)C4CC3C2CC(C)C1=C(C)...
C(=O)CC(O)C1C2CCC3(C)C4
Retrieves:
Search
Find
Presidents
named‘Bush’
GeorgeBush
andGeorgeW.
Bush
Findmoleculescontaining
Retrieves:
Rela-
tion-
ship
YearCarter
waselected
Answer:
Electedin1976
What’sthelogP(o/w)of
Answer:logP(o/W)=2.62
8.1.2 HowisCheminformaticsDifferent?
Therearefourkeyproblemsacheminformaticssystemsolves:
1. StoreaMolecule
Computerscientistsusuallyusethevalencemodelofchemistrytorepresentcompounds. Thenextsection
RepresentingMolecules,discussesthisatlength.
2. Findexactmolecule
Ifyouask,“IsAbrahamLincolninthedatabase?”it’snothardtofindtheanswer.But,givenaspecificmolecule,
isitinthedatabase?Whatdoweknowaboutit?Thismayseemseemsimpleatfirstglance,butit’snot,aswe’ll
seewhenwediscusstautomers,stereochemistry,metals,andother“flaws”inthevalencemodelofchemistry.
3. Substructuresearch
Ifyouask,“IsanyonenamedLincolninthedatabase?”youusuallyexpecttofindtheformerPresidentanda
numberofothers-thisiscalledasearchratherthanalookup. Forachemicalinformaticssystem,wehavea
substructuresearch:Findallmoleculescontainingapartialmolecule(the“substructure”)drawnbytheuser.The
substructureisusuallyafunctionalgroup,“scaffold”,orcorestructurerepresentingaclassofmolecules.This
80
Chapter8. Cheminformatics101
OpenBabelDocumentation,Release2.3.1
tooisahardproblem,muchharderthanmosttextsearches,forreasonsthatgototheveryrootofmathematics
andthetheoryofcomputability.
4. Similaritysearch
Somedatabases canfindsimilar-soundingormisspelledwords, suchas“FindLincon”or“findCincinati”,
whichrespectivelymightfindAbrahamLincolnandCincinnati. Manychemicalinformationsystemscanfind
moleculessimilartoagivenmolecule, rankedbysimilarity. . Thereareseveralwaystomeasuremolecular
similarity,discussedfurtherinthesectiononMolecularSimilarity.
8.2 RepresentingMolecules
8.2.1 WhatisaMolecule?
Oneofthegreatestachievements inchemistrywas thedevelopmentofthevalencemodelofchemistry, wherea
moleculeisrepresentedasatomsjoinedbysemi-rigidbondsthatcanbesingle,double,ortriple. Thissimplemental
modelhaslittleresemblancetotheunderlyingquantum-mechanicalrealityofelectrons,protonsandneutrons,yetit
hasprovedtobearemarkablyusefulapproximationofhowatomsbehaveincloseproximitytooneanother,andhas
beenthefoundationofchemicalinstructionforwelloveracentury.
Thevalencemodelis alsothefoundationofmodernchemicalinformationsystems. . WhenaComputerScientist
approachesaproblem,thefirsttaskistofigureoutadatamodelthatrepresentstheproblemtobesolvedasinformation.
TotheComputerScientist,thevalencemodelnaturallytransformsintoagraph,wherethenodesareatomsandthe
edgesarebonds. ComputerScientistsknowhowtomanipulategraphs-mathematicalgraphtheoryandcomputer
sciencehavebeencloselyalliedsincetheinventionofthedigitalcomputer.
Thereareatomsandspace.Everythingelseisopinion.
—Democritus
However,thevalencemodelofchemistryhasmanyshortcomings. Themostobviousisaromaticity,whichquickly
requiredaddingtheconceptofanon-integral“aromatic”distributedbond, tothesingle/double/triplebondsofthe
simplevalencemodel. Andthatwasjustthestart-tautomers,ferrocenes,chargedmoleculesandahostofother
commonmoleculessimplydon’tfitthevalencemodelwell.
Thiscomplicateslifeforthecomputerscientist. Asweshallsee,theyarethesourceofmostofthecomplexityof
moderncheminformaticssystems.
8.2.2 Oldersystems:ConnectionTables
Mostoftheearly(andsomemodern)representationsofmoleculeswereinaconnectiontable,literally,atableenu-
meratingtheatoms,andatableenumeratingthebondsandwhichatomseachbondconnected.Hereisanexampleof
connection-table(CTAB)portionofanMDL“SD”file(thedataportionisnotshownhere):
MOLCONV
3 2
0
0
1
0
1 V2000
5.9800
-0.0000
-0.0000 Br
0
0
0
0
0
0
4.4000
-0.6600
0.8300 C
0
0
0
0
0
0
3.5400
-1.3500
-0.1900 C
0
0
0
0
0
0
1
2
1
0
2
3
1
0
Thissimpleexampleillustratesmostofthekeyfeatures. Themoleculehasthreeatoms,twobonds,andisprovided
withthree-dimensional(x,y,z)coordinates.MDLprovidesextensivedocumentationfortheirvariousCTFileformats
ifyouareinterestedinthedetails.
8.2. RepresentingMolecules
81
OpenBabelDocumentation,Release2.3.1
Connectiontablescancapturethevalencemodelofchemistryfairlywell,buttheysufferfromthreeproblems:
1.Theyareveryinefficient,takingontheorderofadozenortwoofbytesofdataperatomandperbond.Newerline
notations(discussedbelow)representamoleculeswithanaverageof1.2to1.5bytesperatom,or6-8bytesperatom
ifcoordinatesareadded.
2.Manysufferedfromlackofspecificity.Forexample,sincehydrogensareoftennotspecified,therecanbeambiguity
astotheelectronicstateofsomemolecules,becausetheconnection-tableformatdoesnotexplicitlystatethevalence
assumptions.
3.Mostmixtheconceptofconnectivity(whattheatomsareandhowtheyareconnected)withotherdatasuchas2D
and3Dcoordinates. Forexample,ifyouhadtwodifferentconformersofamolecule,mostconnectiontableswould
requireyoutospecifytheentiremoleculetwice,eventhoughtheconnectiontableisidenticalinboth.
8.2.3 LineNotations:InChI,SMILES,WLNandothers
Alinenotationrepresentsamoleculeasasingle-linestringofcharacters.
WLN-WisswesserLineNotation WLN,inventedbyWilliamJ.Wisswesserintheearly1950’s,was
thefirstcomprehensivelinenotation, capableofrepresentingarbitrarilycomplexmoleculescor-
rectlyandcompactly.
1H = CH4 Methane
2H = CH3-CH3 3 Ethane
3H = CH3-CH2-CH3 Propane
QVR BG G CG G DG G EG G FG G = = C7HCl5O2 2 Pentachlorbenzoate
WLNwasthefirstlinenotationtofeatureacanonicalform,thatis,therulesforWLNmeantthere
wasonlyone“correct”WLNforanyparticularmolecule. ThoseversedinWLNwereableto
writemolecularstructureinalineformat,communicatemolecularstructuretooneanotherandto
computerprograms. Unfortunately,WLN’scomplexitypreventedwidespreadadoption. Therules
forcorrectspecificationofWLNfilledasmallbook,encodingthoserulesintoacomputerproved
difficult,andtherulesforthecanonicalizationwerecomputationallyintractable.
SMILES-SimplifiedMolecularInputLineEntrySystem The e best-known line notation today is
SMILES.ItwasbyArthurandDavidWeiningerinresponsetoaneedforasimpler,more“hu-
manaccessible”notationthanWLN.WhileSMILESisnottrivialtolearnandwrite,mostchemists
cancreatecorrectSMILESwithjustafewminutestraining,andtheentireSMILESlanguagecan
belearnedinanhourortwo.Youcanreadmoredetailshere.Herearesomeexamples:
C
methane
CC
ethane
C=C
ethene
Oc1ccccc1
phenol
SMILES,likeWLN,hasacanonicalform,butunlikeWLN,Weiningerreliedonthecomputer,
ratherthanthechemist,toconvertanon-canonicalSMILEStoacanonicalSMILES.Thisimportant
separationofdutieswaskeytomakingSMILESeasytoenter. (Readmoreaboutcanonicalization
below.)
InChI InChIisthelatestandmostmodernofthelinenotations.Itresolvesmanyofthechemicalambi-
guitiesnotaddressedbySMILES,particularlywithrespecttostereocenters,tautomersandotherof
the“valencemodelproblems”mentionedabove.
YoucanreadmoreaboutInChIattheOfficialWebSite,orontheUnofficialInChIFAQpage.
82
Chapter8. Cheminformatics101
OpenBabelDocumentation,Release2.3.1
8.2.4 Canonicalization
Acriticalfeatureoflinenotationsiscanonicalization-theabilitytochooseone“blessed”representationfromamong
themany.Consider:
OCC
ethanol
CCO
ethanol
BothoftheseSMILESrepresentthesamemolecule. Ifwecouldallagreethatoneofthesewasthe“correct”or
“canonical”SMILESforethanol,thenwewouldalwaysstoreitthesamewayinourdatabase. Moreimportantly,if
wewanttoask,“Isethanolinourdatabase”weknowthatitwillonlybethereonce,andthatwecangeneratethe
canonicalSMILESforethanolandlookitup.
(Notethatintheoryonecancreateacanonicalconnectiontable,too,butit’snotasusefulsinceinformaticssystems
usuallyhavetroubleindexingBLOBs-largeobjects.)
8.2.5 LineNotationversusConnectionTables:Apracticalmatter
Whyarelinenotationspreferredoverconnection-tableformats?Intheory,eithercouldexpressthesameinformation.
Buttherearepracticaldifference,mostlyrelatedtothecomplexityof“parsing”aconnectiontable.Ifyouknowthat
thewholemoleculeisononelineofafile,it’seasytoparse.
Linenotationsarealsoverynicefordatabaseapplications.Relationaldatabaseshavedatatypesthat,roughlyspeaking,
aredividedintonumbers,text,and“everythingelse”,alsoknownas“BLOBs”(BinaryLargeOBjects).Youcanstore
linenotationsinthe“text”fieldsmuchmoreeasilythanconnectiontables.
Linenotationsalsohavepragmaticadvantages. ModernUnix-likesystems(suchasUNIX,LinuxandCygwin)have
anumberofverypowerful“filter”text-processingprogramsthatcanbe“piped”together(connectedend-to-end)to
performimportanttasks.Forexample,tocountthenumberofmoleculescontainingaliphaticnitrogeninaSMILES
file,Icansimply:
grep N file.smi i | | wc
(greplooksforaparticularexpression,inthiscaseN,andprintsanylinethatcontainsit,andwc(“wordcount”)counts
thenumberofwordsandlines.)
Thisisjustasimpleexampleofthepoweravailablevia“script”programsusing“filters”onUnix-likesystems.Unix
filtersaremuchlessusefulforconnection-tableformats,becauseeachmoleculeisspreadovermanylines.
8.2.6 QueryLanguages:SMARTS
Inadditiontoatypographicalwaytorepresentmolecules,wealsoneedawaytoenterqueriesaboutmolecules,such
as,“Findallmoleculesthatcontainaphenol.”
Withtext,we’refamiliarwiththeconceptoftypingapartialword,suchas“ford”tofind“HenryFord”aswellas“John
Hartford”.Forchemistry,wecanalsospecifypartialstructures,andfindanythingthatcontainsthem.Forexample:
8.2. RepresentingMolecules
83
OpenBabelDocumentation,Release2.3.1
Query
Database
Matches?
YES(matchedportion
highlightedinblue)
NO(doublebondindicated
doesn’tmatch)
eMolecules,Inc.
eMoleculesisaone-stopshopforsuppliersandinformationforover8millionchemicalcompounds.Underthe
hoodisachemicalregistrationtechnologybasedonOpenBabel.
ThesimplestquerylanguageforchemistryisSMILESitself:Justspecifyastructure,suchasOc1ccccc1,andsearch.
ThisishoweMolecules’basicsearchingworks(seeSidebar).It’ssimpleand,becauseofthehigh-performanceindexes
ineMolecules,isalsoveryfast.
However,forgeneral-purposecheminformatics,oneneedsmorepower. Whatifthesubstructureyou’relookingfor
isn’tavalidmolecule?ForexampleClccBr(1,2-substitutiononanaromaticring)isn’tawholemolecule,sincethe
conceptofaromaticityisonlysensibleinthecontextofawholeringsystem.
Orwhatifthethingwe’relookingforisn’tasimpleatomsuchasBr,butratheraconceptlike“Halogen”? Or,“A
terminalmethyl”?
Toaddressthis,cheminformaticssystemshavespecialquerylanguages,suchasSMARTS(SMilesARbitraryTarget
Specification).SMARTSisaclosecousintoSMILES,butithasexpressionsinsteadofsimpleatomsandbonds.For
example,[C,N]willfindanatomthatiseithercarbonornitrogen.
8.2.7 IUPACNames,TradeNames,CommonNames
Chemistryalsohasthreeotherimportantnamesystems:
IUPACNames IUPAC(theInternationalUnionofPureandAppliedChemistry)establishedanamingconvention
thatiswidelyusedthroughoutchemistry.Anychemicalcanbenamed,andallIUPACnamesareunambiguous.
Thistextualrepresentationisaimedathumans,notcomputers:ChemistsversedinIUPACnomenclature(which
iswidelytaught)canreadanIUPACnameandvisualizeordrawthemolecule.
TradeNames NamessuchasTylenol™andValium™aregiventocompoundsandformulationsbymanufacturers
formarketingandsalespurposes,andforregulatorypurposes.
84
Chapter8. Cheminformatics101
Documents you may be interested
Documents you may be interested