Cheminformatics Tutorials - Herong's Tutorial Examples - v2.01, by Herong Yang
Impact of 'useBondTypes' on GetMorganFingerprint()
This section provides a tutorial example on the impact of the 'useBondTypes' option on fingerprint generation with the rdkit.Chem.rdMolDescriptors.GetMorganFingerprint() function.
The 'useBondTypes' option in the rdkit.Chem.rdMolDescriptors.GetMorganFingerprint() function call controls whether bond types (bond orders) are taken into account when an atom's identifier is updated with information from its neighboring atoms. The default value is useBondTypes=True.
1. For example, molecule "C=C" has a double bond. With the useBondTypes=True option, the identifier 3695448525 in the fingerprint represents a C atom with a neighboring C atom connected by a double bond.
    from rdkit.Chem import AllChem

    radius = 1
    bitInfo = {}  # filled with {identifier: ((atom index, radius), ...)}
    mol = AllChem.MolFromSmiles('C=C')
    fp = AllChem.GetMorganFingerprint(mol, radius, useBondTypes=True, bitInfo=bitInfo)
    print(fp.GetNonzeroElements())  # identifier -> count
    print(fp.GetTotalVal())         # sum of all counts
    print(bitInfo)

    # output:
    # {2246997334: 2, 3695448525: 1}
    # 3
    # {2246997334: ((0, 0), (1, 0)), 3695448525: ((0, 1),)}
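Note that I was expecting identifier 3695448525 to appear 2 times in the fingerprint, since the 2 C atoms have identical local structures, but it is reported only once, for atom 0. This is consistent with the duplicate removal step of the Morgan algorithm: the radius-1 environments of both atoms cover the same single bond, so the second environment is treated as redundant and dropped.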
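The duplicate removal can be checked with a small sketch. It assumes that your RDKit release accepts the 'includeRedundantEnvironments' keyword on GetMorganFingerprint(); the helper name morgan_counts() is made up for this illustration. If redundant environments are kept, the total count for "C=C" should be larger than the default total of 3.

```python
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors

mol = Chem.MolFromSmiles('C=C')


def morgan_counts(include_redundant):
    """Return {identifier: count} for radius 1 (hypothetical helper)."""
    fp = rdMolDescriptors.GetMorganFingerprint(
        mol, 1, includeRedundantEnvironments=include_redundant)
    return fp.GetNonzeroElements()


default_counts = morgan_counts(False)   # duplicate environments dropped
redundant_counts = morgan_counts(True)  # duplicate environments kept

print(default_counts)
print(redundant_counts)
```

Comparing the two dictionaries shows which identifiers were suppressed as duplicates in the default run.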
2. If we use useBondTypes=False, the bond order is no longer encoded, and the identifier representing a C atom with a neighboring C atom changes to 3695449228.
    from rdkit.Chem import AllChem

    radius = 1
    bitInfo = {}
    mol = AllChem.MolFromSmiles('C=C')
    fp = AllChem.GetMorganFingerprint(mol, radius, useBondTypes=False, bitInfo=bitInfo)
    print(fp.GetNonzeroElements())  # identifier -> count
    print(fp.GetTotalVal())         # sum of all counts
    print(bitInfo)

    # output:
    # {2246997334: 2, 3695449228: 1}
    # 3
    # {2246997334: ((0, 0), (1, 0)), 3695449228: ((0, 1),)}
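Conclusion: The 'useBondTypes' option is only used during the identifier updating phase. It has no impact on the initial identifier generation phase: the radius-0 identifier 2246997334 is the same in both runs, while the radius-1 identifier changes from 3695448525 to 3695449228. We should keep the default of useBondTypes=True to encode bond types into updated identifiers.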
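The conclusion can also be checked programmatically. The following is a minimal sketch, not part of the original example: the helper ids_by_radius() is a hypothetical name that groups the identifiers reported in bitInfo by the radius at which they were generated, so that the two settings can be compared radius by radius.

```python
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors


def ids_by_radius(smiles, use_bond_types, radius=1):
    """Group Morgan identifiers by generation radius (hypothetical helper)."""
    mol = Chem.MolFromSmiles(smiles)
    bit_info = {}
    rdMolDescriptors.GetMorganFingerprint(
        mol, radius, useBondTypes=use_bond_types, bitInfo=bit_info)
    grouped = {r: set() for r in range(radius + 1)}
    # bit_info maps identifier -> ((atom index, radius), ...)
    for identifier, environments in bit_info.items():
        for _atom_index, env_radius in environments:
            grouped[env_radius].add(identifier)
    return grouped


with_types = ids_by_radius('C=C', use_bond_types=True)
without_types = ids_by_radius('C=C', use_bond_types=False)

# Radius 0 holds the initial identifiers; radius 1 holds the updated ones.
print('radius 0 unchanged:', with_types[0] == without_types[0])
print('radius 1 unchanged:', with_types[1] == without_types[1])
```

For "C=C", the radius-0 sets should match and the radius-1 sets should differ, in line with the two outputs listed above.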