Cheminformatics Tutorials - Herong's Tutorial Examples - v2.03, by Herong Yang
https://www.herongyang.com/Cheminformatics
Copyright © 2019-2024 Herong Yang. All rights reserved.
This book is a collection of notes and tutorial examples written by the author while he was learning cheminformatics and related tools. Topics include: the SMILES (Simplified Molecular-Input Line-Entry System) specification; the Open Babel chemical toolbox for file format conversion; fingerprint index files used by Open Babel for fast search; RDKit for cheminformatics and machine learning; substructure search and decomposition with RDKit; RDKit performance on large molecule datasets; methods for generating molecular fingerprints; and AlphaFold, an AI system that predicts a protein's 3D structure. Updated in 2024 (version v2.03) with minor changes.
Table of Contents
SMILES (Simplified Molecular-Input Line-Entry System)
Branch Representations in SMILES
Disconnected Structures in SMILES
Charge Representations in SMILES
Isotope Representations in SMILES
Chirality Representations in SMILES
Hydrogen Representations in SMILES
Open Babel: The Open Source Chemistry Toolbox
Install Open Babel with Anaconda
Install Open Babel on Windows Computers
Run Open Babel GUI on Windows Computers
Change Display Command on Open Babel GUI
Open Babel Installation Options on Linux
Install Open Babel Binary Package on CentOS
"Open Babel Error in LoadAllPlugins" Error
Install Open Babel from Source Code
Install Open Babel 2.4.1 from Source Code
Open Babel Installation Options on macOS
Install Open Babel Binary Package on macOS
Using Open Babel Command: "obabel"
"obabel -i ..." - Input Data Format and Source
"obabel -o ... -O" - Output Data Format and Destination
"obabel -... --..." - Generic Conversion Options
"obabel" Command Option Argument Syntax
"obabel ... --gen2D" - Calculated 2D Coordinates
"obabel ... -f # -l #" - Split Large Molecule File
"obabel -h/-d" - Add/Remove Hydrogens in Molecule Data
"obabel --append ..." - Calculate Molecule Properties
"obabel -L formats" - List of File Formats Supported
"obabel -a..." - Extra Options for Input Reading
"obabel -x..." - Extra Options for Output Writing
"obabel" vs. "babel" Open Babel Commands
Generating SVG Pictures with Open Babel
"obabel -o svg" - Molecule Picture in SVG
"obabel -:... -o svg" - Generate SVG from SMILES
"obabel ... -o svg -xi" - Show Atom Indices in SVG
"obabel ... -o svg -xS" - Ball/Stick Depiction in SVG
"obabel ... -o svg -xX" - Hide Implicit H in SVG
"obabel ... -o svg -xC" - Hide Terminal C in SVG
"obabel ... -o svg -xP300" - Control Image Size
"obabel ... -o svg" - Two "svg" XML Tag Levels
"obabel ... -o svg -xd" - Hide Molecule Name
"babel ... -o svg -xd -xP300" - Open Babel 2.4 Bug
Scale SVG Images using "viewBox" Attribute
Substructure Search with Open Babel
"obabel -s ..." Command - Substructure Search
Substructure Search with Wildcard Atom "*"
Substructure Search with Wildcard Bond "~"
Substructure Search with SMARTS Expressions
Similarity Search with Open Babel
Fingerprint Index for Fastsearch with Open Babel
Stereochemistry with Open Babel
Read Stereoinformation from Input with Open Babel
Stereo Perception Performed by Open Babel
Write Stereoinformation to Output by Open Babel
Wedge-Hash Bond Changed by Open Babel
Hash Bond with Solid Line by Open Babel
Hash over Double Bond by Open Babel
Command Line Tools Provided by Open Babel
List of Open Babel Command Line Tools
"obchiral" - Print Chirality Information
"obconformer" - Generate Best Conformer
"obenergy" - Calculate Molecule Energy
"obfit" - Superimpose Two Molecules
"obgen" - Generate Molecule 3D Structures
"obgrep" - Search Molecules using SMARTS
"obminimize" - Optimize Geometry/Energy of Molecule
"obprobe" - Create Electrostatic Probe Grid
"obrotamer" - Generate Random Rotational Isomers
"obrotate" - Rotate Dihedral Angles with SMARTS
RDKit: Open-Source Cheminformatics Software
Install RDKit in an Anaconda Environment
Install RDKit Binary Package for CentOS
Build RDKit from Source Code on CentOS System
Compile, Link and Run RDKit C++ API Examples
Try Python API with RDKit Native Code
rdkit.Chem.rdchem - The Core Module
What Is rdkit.Chem.rdchem Module
rdkit.Chem.rdchem.Mol - The Molecule Class
rdkit.Chem.rdchem.Atom - The Atom Class
rdkit.Chem.rdchem.Bond - The Bond Class
rdkit.Chem.rdchem.RWMol - The RWMol Class
rdkit.Chem.rdmolfiles - Molecular File Module
What Is rdkit.Chem.rdmolfiles Module
MolFromSmiles/MolToSmiles for SMILES Format
MolFromMolBlock/MolToMolBlock for Mol Block
SmilesMolSupplier/SmilesWriter for SMILES Files
SDMolSupplier/SDWriter for SDF Files
rdkit.Chem.rdDepictor - Compute 2D Coordinates
What Is rdkit.Chem.rdDepictor Module
rdkit.Chem.Draw - Handle Molecule Images
What Is rdkit.Chem.Draw Module
MolToImage/MolToFile - Molecule PNG Image
rdkit.Chem.Draw.MolDrawing.DrawingOptions Class
rdkit.Chem.Draw.rdMolDraw2D.MolDraw2DCairo - 2D Molecule Drawing
rdkit.Chem.Draw.rdMolDraw2D.MolDraw2DCairo - Molecule PNG Image
rdkit.Chem.Draw.rdMolDraw2D.MolDraw2DSVG - Molecule SVG Image
rdkit.Chem.Draw.rdMolDraw2D.MolDrawOptions - Drawing Options
Drawing Diagrams with MolDraw2DCairo and MolDraw2DSVG
Molecule Substructure Search with RDKit
RDKit m.HasSubstructMatch(s) - Substructure Match
RDKit GenerateDepictionMatching2DStructure(m, s) - Substructure Orientation
RDKit rdMolDraw2D.PrepareAndDrawMolecule - Substructure Highlight
RDKit Substructure Search with SMARTS
rdkit.Chem.rdFMCS - Maximum Common Substructure
rdkit.Chem.rdSubstructLibrary - Substructure Library
Substructure Library in Binary and SMILES Formats
rdkit.Chem.rdmolops - Molecule Operations
What Is rdkit.Chem.rdmolops Module
Molecule Similarity Based on Fingerprints with RDKit
Molecule Core and Sidechains Decomposition with RDKit
R-Group Decomposition with RDKit
Daylight Fingerprint Generator in RDKit
What Is Daylight Fingerprint Generator in RDKit
RDKFingerprint() Method in RDKit
Impact of 'useBondOrder' on RDKFingerprint()
Impact of 'branchedPaths' on RDKFingerprint()
Impact of 'maxPath' on RDKFingerprint()
Impact of 'fpSize' on RDKFingerprint()
Impact of 'tgtDensity' on RDKFingerprint()
Impact of 'nBitsPerHash' on RDKFingerprint()
UnfoldedRDKFingerprintCountBased() Method in RDKit
GetRDKitFPGenerator() Method in RDKit
Morgan Fingerprint Generator in RDKit
What Is Morgan Fingerprint Generator in RDKit
GetMorganFingerprint() Method in RDKit
Impact of 'radius' on GetMorganFingerprint()
Impact of 'useCounts' on GetMorganFingerprint()
Impact of 'invariants' on GetMorganFingerprint()
Impact of 'useBondTypes' on GetMorganFingerprint()
Impact of 'fromAtoms' on GetMorganFingerprint()
GetMorganFingerprintAsBitVect() Method in RDKit
Impact of 'nBits' on GetMorganFingerprintAsBitVect()
GetHashedMorganFingerprint() Method in RDKit
Impact of 'nBits' on GetHashedMorganFingerprint()
GetMorganGenerator() Method in RDKit
Morgan Fingerprint Generator in RDKit for FCFP
RDKit Performance on Substructure Search
Introduction to Molecular Fingerprints
OCSR (Optical Chemical Structure Recognition)
StoneMIND Collector - Information Extraction System
Install StoneMIND Collector Client on Windows
Use StoneMIND Collector on Windows
Stop StoneMIND Collector on Windows
Use StoneMIND Collector Web Interface
AlphaFold - Protein Structure Prediction
Open Source Code for AlphaFold
Download AlphaFold Package and Databases
Keywords: Cheminformatics, Molecule, DNA, Gene, BioTech