Updating RDKit

Revision as of 18:08, 10 April 2017 by Enkhjargal (talk | contribs)

Create a Python virtualenv:

   /nfs/soft/python/install/scripts/create-virtualenv.sh -s /nfs/soft/python/versions/python-2.7.11 /nfs/soft/python/envs/rdkit/python-2.7.11-rdkit-2016_03_01

Install the new version of RDKit in the Python env:

  /nfs/soft/python/install/scripts/install-rdkit.sh /nfs/soft/python/install/extra/RDKit_2016_03_1.tgz /nfs/soft/python/envs/rdkit/python-2.7.11-rdkit-2016_03_01

= Notes =

See RDKit

Each version is now installed in $PYTHON_ROOT/local/rdkit-VERSION. The symlink $PYTHON_ROOT/local/rdkit points to the current version.
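As a quick check of which version is current, the symlink can be read back. This is a minimal sketch, assuming the versioned directories follow the rdkit-VERSION naming described above; the function name rdkit_current_version is made up for this example.

```shell
# Hypothetical helper: print the VERSION part of the directory that
# <python_root>/local/rdkit points to (assumes the rdkit-VERSION naming).
rdkit_current_version() {
    local link target
    link="$1/local/rdkit"
    if [ ! -L "$link" ]; then
        echo "not a symlink: $link" >&2
        return 1
    fi
    target=$(readlink "$link")      # e.g. /path/to/local/rdkit-2013.09
    target=${target%/}              # drop a trailing slash, if any
    target=${target##*/}            # keep only the last path component
    echo "${target#rdkit-}"         # strip the "rdkit-" prefix
}
```

Typical use would be rdkit_current_version "$PYTHON_ROOT", run before and after switching the symlink.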

It's worth noting how this works. $PYTHON_ROOT contains a directory named $PYTHON_ROOT/local. In addition to the build of RDKit described above ($PYTHON_ROOT/local/rdkit), it also contains $PYTHON_ROOT/local/lib, which is simply a directory of symlinks to important shared libraries.

When compiling Python, set LD_RUN_PATH to $ORIGIN/../lib:$ORIGIN/../local/lib so that the binary always searches those two directories ($ORIGIN expands to the directory containing the python binary). This way Python can find important shared libraries, such as those of RDKit, modeller, etc. When we create a new virtualenv, we create a symlink in the virtualenv to the original local directory, so that the right shared libraries can always be found.
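To make the $ORIGIN substitution concrete, the sketch below expands the run path for a given binary location the way the dynamic loader would. The expand_origin helper and the RUN_PATH variable are purely illustrative and not part of any tool. Note the single quotes: $ORIGIN has to reach the linker unexpanded.

```shell
# Value as it would be exported (as LD_RUN_PATH) before building Python.
# Single quotes keep the shell from expanding $ORIGIN; the dynamic loader
# resolves it at run time.
RUN_PATH='$ORIGIN/../lib:$ORIGIN/../local/lib'

# Illustrative helper (not a real tool): list the directories a binary would
# search, by replacing $ORIGIN with the directory that contains the binary.
# Assumes the binary path contains no "|" or "&" characters (sed limitation).
expand_origin() {
    local origin
    origin=$(dirname "$1")
    printf '%s\n' "$2" | tr ':' '\n' | sed "s|[\$]ORIGIN|$origin|g"
}
```

For a hypothetical binary at /opt/python/bin/python, expand_origin /opt/python/bin/python "$RUN_PATH" prints /opt/python/bin/../lib and /opt/python/bin/../local/lib, one per line. On Linux, readelf -d on the built binary should show the same unexpanded value in its RPATH or RUNPATH entry.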

To make sure we always have the right Python package installed, there is a file at $PYTHON_ROOT/lib/python2.7/site-packages/rdkit.pth. In Python package resolution, .pth files let you specify additional locations to search for modules. This file contains one line:

  ../../../local/rdkit

The path is relative to the directory containing the .pth file, so it resolves to the (symlink to the) current version of RDKit. Note that the RDKit Python package itself has the path $PYTHON_ROOT/local/rdkit/rdkit (two rdkits).
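Creating and sanity-checking that file could look roughly like this. Both function names are hypothetical, and the sketch assumes the python2.7 layout used on this page.

```shell
# Hypothetical helper: write rdkit.pth into the site-packages directory of
# the Python installation rooted at $1 (python2.7 layout, as on this page).
write_rdkit_pth() {
    local site="$1/lib/python2.7/site-packages"
    mkdir -p "$site"
    # Relative paths in a .pth file are resolved against the directory that
    # contains the file, so this points at <root>/local/rdkit.
    printf '%s\n' '../../../local/rdkit' > "$site/rdkit.pth"
}

# Hypothetical check: succeed only if the .pth entry leads to a directory
# that contains the rdkit package (<root>/local/rdkit/rdkit).
check_rdkit_pth() {
    local site="$1/lib/python2.7/site-packages" entry
    entry=$(head -n 1 "$site/rdkit.pth") || return 1
    [ -d "$site/$entry/rdkit" ]
}
```

Running write_rdkit_pth "$PYTHON_ROOT" followed by check_rdkit_pth "$PYTHON_ROOT" only confirms that the directory exists; it does not prove that the package imports.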

PTH files notes: There are actually two other pth files I have created: modeller.pth and hask.pth. It would probably be best to create a single pth file for all of the packages stored in the "local" directory: create a directory called local/packages (or similar), put symlinks to rdkit, haks, modeller, etc. in it, and then create one pth file containing the line: ../../../local/packages
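A rough sketch of that consolidated layout follows. Everything except the local/packages directory name is an assumption: the function names, the local-packages.pth file name, and the use of relative symlinks. For rdkit the link has to point at the importable package directory (local/rdkit/rdkit), since the pth entry adds local/packages itself to the search path. The locations of the other packages are not given here, so they are left out.

```shell
# Sketch of the consolidated layout (names other than local/packages are
# assumptions). Usage: link_local_package <python_root> <name> <target>
# where <target> is the importable package directory, given relative to
# <python_root>/local/packages (e.g. ../rdkit/rdkit).
link_local_package() {
    local packages="$1/local/packages"
    mkdir -p "$packages"
    ln -sfn "$3" "$packages/$2"
}

# Write one .pth file that covers everything linked under local/packages.
# The file name local-packages.pth is made up for this example.
write_packages_pth() {
    local site="$1/lib/python2.7/site-packages"
    mkdir -p "$site"
    printf '%s\n' '../../../local/packages' > "$site/local-packages.pth"
}
```

For RDKit this would be link_local_package "$PYTHON_ROOT" rdkit ../rdkit/rdkit followed by write_packages_pth "$PYTHON_ROOT"; the per-package pth files could then be removed once imports are confirmed to work.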

= Install procedure =

  • 0) Have a Python installation to install RDKit into (this should be the base installation for the virtualenvs)
  • 1) Download the RDKit tarball
  • 2) Extract the RDKit tarball; the extracted directory is $RDKIT_SRC
  • 3) Download InChI support: cd $RDKIT_SRC/External/INCHI-API; ./download-inchi.sh
  • 4) Create a build directory: cd $RDKIT_SRC; mkdir build; cd build
  • 5) Configure with cmake (this is a long command; note the version-specific directories and files)
cmake \
-DPYTHON_LIBRARY=$PYTHON_PREFIX/lib/libpython2.7.so \
-DPYTHON_NUMPY_INCLUDE_PATH=$PYTHON_PREFIX/lib/python2.7/site-packages/numpy/core/include \
  • 6) Build RDKit: make -j4 (parallel build) or make (serial build)
  • 7) Wait for a while
  • 8) Install RDKit: make install
  • 9) Run python install: cd $RDKIT_SRC; $PYTHON_ROOT/bin/python setup.py install
  • 10) The install doesn't actually work, so do it manually:
mkdir $PYTHON_PREFIX/local/rdkit-2013.09
cp -rv lib $PYTHON_PREFIX/local/rdkit-2013.09/lib
cp -rv rdkit $PYTHON_PREFIX/local/rdkit-2013.09/rdkit
  • 11) Update the symlink: ln -svfn $PYTHON_PREFIX/local/rdkit-2013.09 $PYTHON_PREFIX/local/rdkit
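Steps 10 and 11 can be wrapped in one function so the version string is typed only once. This is a sketch, not the script actually used here: the function name and argument order are assumptions, and the guard against an existing directory is an addition that the manual commands above do not have.

```shell
# Sketch of steps 10 and 11 as one function (name and argument order are
# assumptions). Usage: install_rdkit_build <rdkit_src> <prefix> <version>
install_rdkit_build() {
    local src="$1" prefix="$2" version="$3"
    local dest="$prefix/local/rdkit-$version"
    # Added guard: refuse to overwrite an existing versioned install.
    if [ -e "$dest" ]; then
        echo "already exists: $dest" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    cp -r "$src/lib" "$dest/lib" || return 1
    cp -r "$src/rdkit" "$dest/rdkit" || return 1
    # -n replaces the old symlink itself instead of descending into its target.
    ln -sfn "$dest" "$prefix/local/rdkit"
}
```

With the values from the steps above this would be install_rdkit_build "$RDKIT_SRC" "$PYTHON_PREFIX" 2013.09. Because old versions stay in place, rolling back is just a matter of pointing the symlink at the previous rdkit-VERSION directory.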

This process works but of course doesn't cover all of the details of potential installations. For one, this doesn't even touch on the upgrading of the PostgreSQL cartridge, which is a nasty black art. I'll explain that once I have enough goats to sacrifice.