Molecule Tutorials - Herong's Tutorial Examples - v1.25, by Dr. Herong Yang
Try RDKit Python API
This section provides a tutorial example on how to use the RDKit Python API. Unfortunately, it does not work yet because the boost_python library is missing.
Now I am ready to try the RDKit Python API with the Python 2 engine. I expected it to work with the build I did with "-DRDK_BUILD_PYTHON_WRAPPERS=OFF".
1. Import the "rdkit" package into Python 2. I see an "ImportError: No module named rdBase" error. I have no idea where the "rdBase" module is located.
herong$ export PYTHONPATH=/home/herong/rdkit
herong$ python2
Python 2.7.16 (default, Nov 17 2019, 00:07:27)
[GCC 8.3.1 20190507 (Red Hat 8.3.1-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from rdkit import Chem
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/herong/rdkit/rdkit/__init__.py", line 2, in <module>
    from .rdBase import rdkitVersion as __version__
ImportError: No module named rdBase
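To check whether a compiled "rdBase" module exists anywhere in the package directory, a small script like the one below could help. It is only a sketch: the helper name locate_submodule is mine and not part of RDKit, and it assumes a Python 3 interpreter started with the same PYTHONPATH as above. It looks for a file named rdBase with any suffix Python can import, without running the package's __init__.py.

```python
import importlib.machinery
import importlib.util
import os


def locate_submodule(package, submodule):
    """Return the file path of package.submodule, or None if it is absent.

    Hypothetical helper: it only inspects the package directory and does
    not execute the package's __init__.py.
    """
    # A top-level find_spec() call locates the package without importing it.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    for directory in spec.submodule_search_locations:
        # all_suffixes() covers .py, .pyc and compiled extension suffixes.
        for suffix in importlib.machinery.all_suffixes():
            candidate = os.path.join(directory, submodule + suffix)
            if os.path.isfile(candidate):
                return candidate
    return None


if __name__ == "__main__":
    path = locate_submodule("rdkit", "rdBase")
    if path:
        print("Found:", path)
    else:
        print("rdkit.rdBase not found on the current module search path")
```

If nothing is found, that would be consistent with a build in which the Python wrappers were never compiled, although the script alone cannot prove the cause.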
2. Reading the RDKit Python documentation, I found this note: "Beginning with the 2019.03 release, the RDKit is no longer supporting Python 2. If you need to continue using Python 2, please stick with a release from the 2018.09 release cycle." So I have to rebuild RDKit with "-DRDK_BUILD_PYTHON_WRAPPERS=ON" to work with Python 3.
3. Unzip rdkit-master.zip into ~/rdkit and build it again with no options, which takes the default setting of "-DRDK_BUILD_PYTHON_WRAPPERS=ON". I see an error about the PYTHON_LIBRARY variable being set to NOTFOUND.
herong$ unzip rdkit-master.zip
herong$ mv rdkit-master rdkit
herong$ cd rdkit
herong$ mkdir build
herong$ cd build
herong$ cmake ..
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
PYTHON_LIBRARY (ADVANCED)
    linked by target "RDBoost" in directory /home/herong/rdkit/Code/RDBoost
    linked by target "rdBase" in directory /home/herong/rdkit/Code/RDBoost/Wrap
...
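The PYTHON_LIBRARY error means CMake could not locate the Python development files. As a rough check, the Python 3 interpreter itself can report where it expects its headers and shared library to be. The sketch below uses only the standard sysconfig module; the function name python_build_paths is hypothetical, and the reported library path is only the interpreter's own expectation, so the file may be missing (or live in a different directory on some distributions) until a development package is installed.

```python
import os
import sysconfig


def python_build_paths():
    """Report where this interpreter expects its headers and library.

    Hypothetical helper for illustration; run it with the Python 3
    interpreter that the build is meant to target.
    """
    include_dir = sysconfig.get_paths()["include"]
    lib_dir = sysconfig.get_config_var("LIBDIR")      # may be None on some platforms
    lib_name = sysconfig.get_config_var("LDLIBRARY")  # may be None on some platforms
    library = os.path.join(lib_dir, lib_name) if lib_dir and lib_name else None
    return {
        "include_dir": include_dir,
        "library": library,
        "header_present": os.path.isfile(os.path.join(include_dir, "Python.h")),
        "library_present": bool(library) and os.path.isfile(library),
    }


if __name__ == "__main__":
    for key, value in python_build_paths().items():
        print(key, "=", value)
```

If both files are reported as present but CMake still fails, passing the two paths explicitly, for example as -DPYTHON_LIBRARY=... and -DPYTHON_INCLUDE_DIR=..., may be worth trying. Whether the RDKit build honors those variables is an assumption on my part.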
4. Install "platform-python-devel" and run "cmake" again. I see the "No Boost libraries were found" error.
herong$ sudo dnf install platform-python-devel
...
Installed:
  platform-python-devel-3.6.8-15.1.el8.x86_64
  python-rpm-macros-3-37.el8.noarch
  python3-rpm-generators-5-4.el8.noarch

herong$ cmake ..
CMake Error at /usr/share/cmake/Modules/FindBoost.cmake:2044 (message):
  Unable to find the requested Boost libraries.
  Boost version: 1.66.0
  Boost include path: /usr/include
  Could not find the following Boost libraries:
          boost_python
  No Boost libraries were found.  You may need to set BOOST_LIBRARYDIR to the
  directory containing Boost libraries or BOOST_ROOT to the location of
  Boost.
...
5. Search for the boost_python library file. The listing shows no boost_python library.
herong$ ls -l /usr/lib64/libboost_p*
    35 May 13  2019 /usr/lib64/libboost_prg_exec_monitor.so -> libboost_prg_exec_monitor.so.1.66.0
 89688 May 13  2019 /usr/lib64/libboost_prg_exec_monitor.so.1.66.0
    34 May 13  2019 /usr/lib64/libboost_program_options.so -> libboost_program_options.so.1.66.0
701288 May 13  2019 /usr/lib64/libboost_program_options.so.1.66.0
...

herong$ dnf info boost
Installed Packages
Name         : boost
Version      : 1.66.0
Release      : 6.el8
Architecture : x86_64
Size         : 1.3 k
Source       : boost-1.66.0-6.el8.src.rpm
Repository   : @System
From repo    : AppStream
Summary      : The free peer-reviewed portable C++ source libraries
URL          : http://www.boost.org
Too bad. The "boost 1.66" package I installed does not have the boost_python library. I am not sure if I have to install it manually.
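The "ls" search above only covers /usr/lib64. A slightly broader search can be scripted as shown below. This is a minimal sketch: the function name find_boost_libraries and the default directory list are my own assumptions and may not match the library locations on every system.

```python
import glob
import os

# Assumed candidate directories; adjust them to the system being checked.
DEFAULT_LIB_DIRS = ("/usr/lib64", "/usr/lib", "/usr/local/lib64", "/usr/local/lib")


def find_boost_libraries(component, lib_dirs=DEFAULT_LIB_DIRS):
    """List files named libboost_<component>* in the given directories.

    Hypothetical helper for illustration; directories that do not exist
    simply contribute no matches.
    """
    matches = []
    for directory in lib_dirs:
        pattern = os.path.join(directory, "libboost_" + component + "*")
        matches.extend(glob.glob(pattern))
    return sorted(matches)


if __name__ == "__main__":
    found = find_boost_libraries("python")
    if found:
        for path in found:
            print(path)
    else:
        print("No libboost_python* files found in", ", ".join(DEFAULT_LIB_DIRS))
```

An empty result here would agree with the FindBoost error from step 4: the Boost headers are installed, but no boost_python library is available for CMake to link against.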