
Sunday, 20 November 2022

glibc 2.36 vs. CentOS 7: a tale of failure

My favourite part of coding is planning and implementing some cool idea for doing something, especially if it involves some fun maths I read up on Wikipedia a minute beforehand. In reality, polishing dirty data, refactoring someone else's bad code, reverse engineering the use of a module and trying to get stuff to work are what take up most of my time.

Having got cocky, I thought I could get the latest GNU C library (glibc) working on CentOS 7. I failed miserably; here is my sorry tale down the rabbit hole.

The cluster I am working on still runs CentOS 7, a dead distro that came out in 2014 and is no longer supported bar security updates, and even those reach end of life (EOL) in 2024. It is still in use as Red Hat abandoned the project and CentOS 8 was never properly completed. CentOS Stream 9 is an attempt at moving it along, but was never recommended by Red Hat nor gained universal usage. Rocky Linux is the unofficial successor. Most clusters are moving towards Ubuntu as a result. For example, HTCondor was CentOS 7 only: CentOS Stream 9 is not an option supported by HTCondor, but Ubuntu and Rocky Linux are. This is a rather common situation unfortunately.

This is a problem for an increasing number of Python packages with C bindings, as the system glibc (GNU C library) is version 2.17, which to the best of my knowledge cannot be updated or circumvented, as discussed here.

Examples of such packages in compbiochem are pytorch, pyrosetta, rdkit and pymol.

When a package is installed from a wheel or conda and its glibc version requirement is not satisfied, one gets /lib64/libm.so.6: version `GLIBC_2.27' not found (required by package_name). Installing from source may or may not work. For example, pyrosetta will complain about missing functions, and if you force it with different tricks the compiled result does not work, in my experience at least.
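The mismatch can be checked ahead of time. Here is a minimal Python sketch, for illustration only: it pulls the required version out of an error message like the one above and compares it numerically with the system glibc. The helper names (parse_version, required_glibc, is_satisfied) are made up for this example and are not part of any package.

```python
import platform
import re


def parse_version(text):
    """Turn a dotted version such as '2.27' into a tuple of ints, (2, 27)."""
    return tuple(int(part) for part in text.split("."))


def required_glibc(error_message):
    """Extract the GLIBC_x.y version named in a loader error, or None."""
    match = re.search(r"GLIBC_(\d+(?:\.\d+)+)", error_message)
    return match.group(1) if match else None


def is_satisfied(required, available):
    """True if the available glibc is at least the required version."""
    return parse_version(available) >= parse_version(required)


message = "/lib64/libm.so.6: version `GLIBC_2.27' not found (required by package_name)"
needed = required_glibc(message)

# platform.libc_ver() returns a (name, version) pair; the version may be an
# empty string on platforms where glibc cannot be detected.
system_glibc = platform.libc_ver()[1]
print(f"needed={needed} system={system_glibc or 'unknown'}")
if system_glibc:
    print("satisfied:", is_satisfied(needed, system_glibc))
```

The versions are compared as tuples of integers rather than as strings, because a plain string comparison would rank "2.9" above "2.17".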

Classic work-arounds

There are two ways to normally circumvent an old glibc with conda. The first is setting the CONDA_OVERRIDE_GLIBC variable before environment creation in conda or mamba:

CONDA_OVERRIDE_GLIBC=2.36 conda create -n my_new_py38_env python=3.8

The other is using the tool patchelf, which can replace the libraries used by a given package as documented here:

patchelf --add-rpath /path/newer_glibc broken_package

This requires a compiled glibc 2.36 library. However, there is no glibc newer than 2.17 available for CentOS 7 in the package distributions. This explains why the former method does not work, as the override only changes the glibc version the conda solver assumes and does not provide a newer library:

import platform, os

assert os.environ['CONDA_DEFAULT_ENV'] == 'my_new_py38_env'
print(f'glibc_version = {platform.libc_ver()[1]}')  # 2.17

There is a conda package called glibc, but it is 9 years old and at version 2.19, so utterly pointless even if it were to work.

Compiling glibc 2.36

To compile glibc 2.36, modern kernel headers are required, as CentOS 7 runs Linux kernel 3, not 6. In a blog post there is a snippet which makes it sound straightforward, but I failed to compile it myself. Here is what I tried:

  • providing different kernel-headers
  • with the flag to not raise warnings as errors
  • using modern C compilers thanks to conda (clang or gcc)

There is a conda package called kernel-headers_linux-64, which does not seem to take effect, but there are packages for clang (C compiler), clangxx (C++ compiler), ninja, gcc (GNU C compiler) and libgcc, which are handy (because I do not have root access and the system ones are ancient).

# throwing everything at it, including the kitchen sink:
mamba install -y -c anaconda -c conda-forge cmake make kernel-headers_linux-64 clang clangxx ninja gcc libgcc ld_impl_linux-64

mkdir $CONDA_PREFIX_1/custom_lib

wget https://ftp.gnu.org/gnu/glibc/glibc-2.25.tar.gz
tar -xvzf glibc-2.25.tar.gz
cd glibc-2.25/
mkdir build
cd build/
../configure --prefix=$CONDA_PREFIX_1/custom_lib/glibc-2.25/

In the above, $CONDA_PREFIX_1 is the path to base conda, while $CONDA_PREFIX is the path to the active environment. The C compiler can be specified with $BUILD_CC or $CC:

CC=`which gcc` ../configure --prefix=$CONDA_PREFIX_1/custom_lib/glibc-2.25/

The above says the compiler is too old with clang (10.0), but with gcc (12.2) it gives:

configure: error: GNU libc requires kernel header files from
Linux 3.2.0 or later to be installed before configuring.
The kernel header files are found usually in /usr/include/asm and
/usr/include/linux; make sure these directories use files from
Linux 3.2.0 or later.  This check uses <linux/version.h>, so
make sure that file was built correctly when installing the kernel header
files.  To use kernel headers not from /usr/include/linux, use the
configure option --with-headers.

Kernel headers

The kernel-headers_linux-64 conda package seems relevant, but it does not seem to add linux or asm to $CONDA_PREFIX/include, so I am not sure what it does.
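To see what a given include directory actually provides, one can read linux/version.h directly, since that is the file the configure check says it uses. Kernel headers encode the version as LINUX_VERSION_CODE = (major << 16) + (minor << 8) + patch. The Python sketch below decodes it; the function names and the temporary demo file are assumptions for illustration, and the value 199168 is simply the encoding of 3.10.0.

```python
import re
import tempfile
from pathlib import Path


def decode_version_code(code):
    """Split a LINUX_VERSION_CODE integer into (major, minor, patch)."""
    return (code >> 16, (code >> 8) & 0xFF, code & 0xFF)


def headers_version(include_dir):
    """Read <include_dir>/linux/version.h; return (major, minor, patch) or None."""
    version_h = Path(include_dir) / "linux" / "version.h"
    if not version_h.is_file():
        return None
    match = re.search(r"#define\s+LINUX_VERSION_CODE\s+(\d+)", version_h.read_text())
    return decode_version_code(int(match.group(1))) if match else None


# Demo on a stand-in header tree (hypothetical content, not a real download).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "linux").mkdir()
    (Path(tmp) / "linux" / "version.h").write_text(
        "#define LINUX_VERSION_CODE 199168\n"
    )
    demo = headers_version(tmp)

print(demo)
# The configure error above asks for 3.2.0 or later.
print("new enough for configure:", demo is not None and demo >= (3, 2, 0))
```

In practice one would call headers_version on the directory handed to --with-headers, or on $CONDA_PREFIX/include; a result of None means there is no linux/version.h there at all, which would match the observation above that the conda package does not seem to add it.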

Downloading the newest 3.x kernel-headers package for CentOS 7 x86_64 and pointing configure at those also fails:

# https://centos.pkgs.org/7/centos-x86_64/kernel-headers-3.10.0-1160.el7.x86_64.rpm.html
cd
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/kernel-headers-3.10.0-1160.el7.x86_64.rpm
rpm2cpio kernel-headers-3.10.0-1160.el7.x86_64.rpm | cpio -idmv
cd ~/glibc-2.25/build/

CC=`which gcc` LIBS=$HOME/usr/include ../configure --prefix=$CONDA_PREFIX_1/custom_lib/glibc-2.25/ --with-headers=$HOME/usr/include --disable-werror

As you can see, I tried quite a few steps, way more than most would before giving up and either using an older version of the glibc-grumpy packages or using a Docker or Singularity image if possible.

Ironically, in my case I tried the Docker universe in HTCondor, but I am one version behind, and that is a rabbit hole tale for another time!

1 comment:

  1. This is interesting. I face a similar problem. I have a few CentOS-7 machines, but I run the 4.1x elrepo kernels on them. They work nicely, but the glibc is version 2.17, and there is also no libGLX, and I can't even find one. (It's an Nvidia driver, compiled into the /bin/chat binary program I am trying to get running) The app is called GPT4all, which is a localized (works without internet connection) large-language AI model, like ChatGPT. But an ldd of the binary, shows it needs the current glibc libraries. I've already found and installed libraries for images and lossless data-compression, but am now failing on the "libGLX.so.0" not being found. This library is not even needed, since our machines do not have Nvidia cards, but it is likely an artifact of the way the installable GPT4all binary code is packaged for Linux. So, I think we are pooched.

    What I've learned, is that mostly, one just has to roll up one's sleeves, and build any open-source code one wants to run, from the source, since any sort of binary package will just about *never* work in the World of Linux. This is because the production-world (which values stability-of-operation above everything else), is always behind the latest-and-greatest technostuff the young developers are running. You either build it and maintain it yourself, or your business-critical software will break and die as the upgrades are pushed down upon you.

    This harsh reality, explains the growing failure of various clever "open-source" projects, and the continued growth of commercial entities like Microsoft and Apple (and now IBM+RedHat?). All your open-source code, will just die, unless you can fully fabricate it yourself, in your own lab. And even your fabrication tools, will be under constant "upgrade" assault, so they too, will be destroyed, if any means exists for that to happen.

    This is not an accident. This is by design. The entire technology industry actively embraces an economic model where they attempt to "disrupt" your existing (and perhaps most effective and profitable!) technological solutions, into destruction, so you are forced to re-acquire and re-engineer constantly.

    This reality explains why most economic progress has stopped in many technologically-sophisticated fields, and is in fact running retrograde in many areas. We have not "progress", but only the illusion of progress. In reality, everyone is made to scramble around like ants, in an anthill that is constantly being kicked over, by the technology industry, so that no-one can build anything that can be made to endure.

    A few major players thus dominate and control all major technologically-sophisticated industries, and no new developers or product designers are able to mount any effective challenges to the existing order of things.
    Windows 11 updates *weekly* now, and most new open-source projects are rigged to be unusable on most existing technology. 64-bit machines replaced 32-bit machines, and run (in almost every case we have looked at) slower, due to the increased complexity of the absurd architecture of multi-core processors, and the bloated nature of most operating systems. It's just comical, how awful much of what we all now rely on, actually is, "under the hood". We have determined this is not an accident or the fault of "poor design decisions", but is in reality, a carefully crafted strategy.

    "Resistance is Futile", so it seems.

    Or.. as we have discovered: "The more we *upgrade*, the less we get..!" :)
    - M.
