@FooBarWidget
Last active August 17, 2024 18:47

Understanding shared library dependency scanning

The script scans executables and libraries for the shared libraries they require, identifies the packages that provide those libraries, and ensures that the correct versions are installed. When multiple packages can satisfy the same dependency, it also manages alternative dependencies by building a list of acceptable packages.

At its core, the approach relies on extracting symbol and version information from the shared libraries and packages, comparing them with the needs of executables, and resolving all possible package alternatives.

Scanning for Library Dependencies

The process begins with identifying the shared libraries an executable needs. Executables in Linux are often dynamically linked to shared libraries, meaning they don't include the entire code of the libraries they use; instead, they rely on these external files to be present when the program runs.

The first task is to scan the executable and extract the list of libraries it depends on. Linux executables use ELF (Executable and Linkable Format). An ELF file's dynamic section contains metadata, including a list of "NEEDED" entries: these name the shared libraries the executable requires.

By reading the dynamic section of an executable, it's possible to compile a list of these required libraries. This allows the system to begin searching for the corresponding packages that provide these libraries.
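
As a minimal sketch of this step, the Python below runs readelf -d (from binutils) and pulls the NEEDED entries out of its output. The function names parse_needed and needed_libraries are illustrative only and are not taken from the script; the sketch assumes readelf is installed and on the PATH.

```python
import os
import re
import subprocess

# Matches dynamic-section lines such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[(?P<soname>[^\]]+)\]")


def parse_needed(readelf_output: str) -> list[str]:
    """Return the sonames listed as NEEDED, in the order they appear."""
    return [m.group("soname") for m in NEEDED_RE.finditer(readelf_output)]


def needed_libraries(path: str) -> list[str]:
    """Run `readelf -d` on an ELF file and return its NEEDED sonames."""
    result = subprocess.run(
        ["readelf", "-d", path],
        check=True,
        capture_output=True,
        text=True,
        # Force untranslated output so the regular expression matches.
        env={**os.environ, "LC_ALL": "C"},
    )
    return parse_needed(result.stdout)
```

Keeping the parsing separate from the subprocess call makes the parser easy to exercise with captured output, without needing a real binary.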

Mapping Libraries to Debian Packages

Once the libraries required by the executable are identified, the next step is to determine which Debian packages provide those libraries. Debian packages often bundle shared libraries, and identifying the correct package ensures that the executable will have access to the library at runtime.

In Debian, the local package database records which files each installed package owns. When a library like libfoo.so.1 is required, the system queries that database to find the installed packages that provide it.
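
One way to perform this lookup, sketched here under stated assumptions, is to ask dpkg -S which installed packages own a file with the library's name. The helper names parse_dpkg_search and packages_providing are hypothetical, and the script itself may query the database differently.

```python
import os
import subprocess


def parse_dpkg_search(output: str) -> list[str]:
    """Extract package names from `dpkg -S` output, dropping arch qualifiers."""
    packages: list[str] = []
    for line in output.splitlines():
        # Skip blank lines and diversion notices, which name no owner.
        if not line.strip() or line.startswith("diversion by "):
            continue
        owners, sep, _path = line.partition(": ")
        if not sep:
            continue
        # Several owners of one path are listed as "pkga, pkgb: /path".
        for owner in owners.split(", "):
            name = owner.split(":", 1)[0]  # "libfoo1:amd64" -> "libfoo1"
            if name not in packages:
                packages.append(name)
    return packages


def packages_providing(soname: str) -> list[str]:
    """Return installed packages owning a file named exactly `soname`."""
    result = subprocess.run(
        # The leading "*/" anchors the match to the end of the path, so
        # "libfoo.so.1" does not also match "libfoo.so.1.2.3".
        ["dpkg", "-S", f"*/{soname}"],
        capture_output=True,
        text=True,
        env={**os.environ, "LC_ALL": "C"},
    )
    # dpkg exits non-zero when nothing matches; that simply yields [].
    return parse_dpkg_search(result.stdout)
```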

If a single package provides the library, the dependency is straightforward: the executable will depend on that package. However, it's common for multiple packages to provide the same library, such as when both a standard and a development version of the library are available. These are known as alternative dependencies.

Symbols and Version Constraints

Libraries evolve over time, often gaining or losing functionality as new versions are released. An executable may require specific functionality (symbols) from a library, and not every version of the library will necessarily provide the needed symbols.

Symbols are the individual functions, variables, or data fields that a shared library exposes to other programs. For example, a library like libfoo.so.1 might expose a function called foo_function. If an executable uses this function, it depends on the library providing it.

To manage these dependencies correctly, Debian uses symbols files. These files are generated during the packaging process and list all the symbols a library exports, along with the minimum version of the package that contains each symbol. Symbols files are stored in the /var/lib/dpkg/info/ directory and are named after the package, for example, libfoo1.symbols.

When a package is built, its symbols file is created by inspecting the shared libraries it provides. This file tells packaging tools which version of the library package is needed to provide the required symbols. If a symbol like foo_function was introduced in version 1.2 of libfoo1, and an executable depends on foo_function, the system will ensure that the executable declares a dependency on libfoo1 (>= 1.2).
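
The version selection described above can be sketched as follows: given the "introduced in" version of each symbol and the set of symbols an executable uses, the required minimum is the highest of those versions. The names version_key and minimum_version are illustrative. The comparison here is a deliberate simplification that only handles plain dotted numbers; real Debian versions (epochs, revisions, "~") need dpkg's own ordering, for example via dpkg --compare-versions.

```python
from typing import Optional


def version_key(version: str) -> tuple[int, ...]:
    """Simplified sort key for plain dotted versions such as "1.2".

    Assumption: this is NOT the full Debian version ordering.
    """
    return tuple(int(part) for part in version.split("."))


def minimum_version(
    symbol_versions: dict[str, str], used_symbols: set[str]
) -> Optional[str]:
    """Return the highest "introduced in" version among the used symbols.

    Used symbols that are not listed are ignored; None is returned when
    none of the used symbols is listed.
    """
    needed = [symbol_versions[s] for s in used_symbols if s in symbol_versions]
    return max(needed, key=version_key) if needed else None


# Example with made-up data: foo_function arrived in 1.2, so 1.2 is required.
symbols = {"foo_init": "1.0", "foo_function": "1.2"}
min_ver = minimum_version(symbols, {"foo_init", "foo_function"})
print(f"libfoo1 (>= {min_ver})")
```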

This ensures that when the executable is installed, the correct version of the library will be present, avoiding runtime errors caused by missing functionality.

Alternative Dependencies

In some cases, multiple packages can provide the same shared library. For example, both libfoo1 and libfoo1-dev might provide the library libfoo.so.1. When this happens, the system needs to handle these alternative dependencies correctly.

Alternative dependencies are expressed as a list of packages, separated by pipes (|). This indicates that any of the packages in the list can satisfy the dependency. For example, if both libfoo1 and libfoo1-dev provide libfoo.so.1, the dependency might be expressed as:

libfoo1 | libfoo1-dev

The process begins by identifying all the packages that provide the library. This can be done by querying the local package database and collecting all possible matches. Once the packages are identified, the next step is to determine if any version constraints apply.

If an executable requires a specific version of the library, the version constraint is attached to the corresponding package in the alternative dependency list. For instance, if libfoo1 version 1.2 or higher is needed but libfoo1-dev can also satisfy the dependency, the result might be:

libfoo1 (>= 1.2) | libfoo1-dev

This indicates that the executable needs libfoo1 version 1.2 or greater but will also accept libfoo1-dev if available.
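
Building such an alternative group can be sketched in a few lines. The function name format_alternatives and its parameters are hypothetical; the sketch assumes the providing packages and any minimum versions have already been determined.

```python
def format_alternatives(packages: list[str], min_versions: dict[str, str]) -> str:
    """Join packages with " | ", adding "(>= version)" where one is known."""
    parts = []
    for package in packages:
        version = min_versions.get(package)
        parts.append(f"{package} (>= {version})" if version else package)
    return " | ".join(parts)


# Only libfoo1 carries a version constraint in this made-up example.
group = format_alternatives(["libfoo1", "libfoo1-dev"], {"libfoo1": "1.2"})
print(group)
```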

Final Dependency Resolution

Once all libraries have been processed, the system generates the final list of dependencies. This list includes the required packages, with any necessary version constraints and alternative dependencies correctly handled. The output is in a format that package managers like apt can interpret, ensuring that the executable has access to all the libraries it needs at runtime.
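
As a rough illustration of this last step, the dependency groups collected for each library can be de-duplicated and joined with commas, which is how Debian dependency fields separate their entries. The function name format_depends is an assumption made for this sketch, and the package names in the example are made up.

```python
def format_depends(groups: list[str]) -> str:
    """Join dependency groups with ", ", dropping exact duplicates.

    dict.fromkeys keeps the first occurrence of each group and preserves
    the original order.
    """
    return ", ".join(dict.fromkeys(groups))


# Two libraries resolved to the same package, so it is listed only once.
depends = format_depends([
    "libbar2",
    "libfoo1 (>= 1.2) | libfoo1-dev",
    "libbar2",
])
print(depends)
```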

By scanning executables for their shared library dependencies, mapping those libraries to Debian packages, and handling alternative dependencies and version constraints, the system ensures that all necessary libraries are present when the executable runs. This process prevents runtime failures caused by missing libraries or incompatible versions and ensures smooth package installations in Debian-based systems.

A Closer Look at Symbols and Symbols Files

What are Symbols?

As we touched on earlier, symbols are the individual components of a library that other programs can use. These can be functions, global variables, or other exported data. For instance, if an executable needs to call the malloc() function, it relies on the shared library libc.so.6 to provide that symbol. Each symbol has a name, and potentially a version, that identifies it.

When you link an executable to a shared library, it’s the symbols inside the library that the executable actually uses. If the library changes (for example, if a function is removed or replaced), the executable might break unless it can be guaranteed that the required symbols are still available.
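
To see which symbols an executable expects its libraries to provide, one option is to list its undefined dynamic symbols with nm -D --undefined-only (from binutils). The sketch below parses that output; the function names are illustrative, and the script may gather this information by other means.

```python
import os
import subprocess


def parse_undefined_symbols(nm_output: str) -> set[str]:
    """Return names of undefined dynamic symbols from `nm -D` output.

    Undefined entries have no address, so each line is "<type> <name>",
    where the type is "U" (undefined) or "w"/"v" (weak, unresolved).
    A version suffix such as "@GLIBC_2.2.5" is stripped from the name.
    """
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] in ("U", "w", "v"):
            symbols.add(fields[1].split("@", 1)[0])
    return symbols


def undefined_symbols(path: str) -> set[str]:
    """Run nm on an ELF file and return the symbols it needs from elsewhere."""
    result = subprocess.run(
        ["nm", "-D", "--undefined-only", path],
        check=True,
        capture_output=True,
        text=True,
        env={**os.environ, "LC_ALL": "C"},
    )
    return parse_undefined_symbols(result.stdout)
```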

How are Symbols Files Generated?

Symbols files are created when a Debian package containing a shared library is built. These files record every symbol that the library exports and the minimum version of the package that contains that symbol. This allows other packages to declare precise dependencies on the package.

For instance, if a new version of libfoo.so.1 adds a function new_function, the symbols file will be updated to reflect that new_function was introduced in version 1.3 of the package. If an executable uses new_function, the package containing that executable must depend on libfoo1 (>= 1.3) to ensure that the function will be available.
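
The sketch below reads the basic layout of a symbols file: a header line naming the library and its dependency template, followed by indented lines that each give a symbol and the minimum package version. The sample content is made up, parse_symbols_file is a hypothetical name, and the parser ignores alternative templates and metadata lines rather than interpreting them.

```python
def parse_symbols_file(text: str) -> dict[str, dict[str, str]]:
    """Map each library soname to {symbol: minimum package version}."""
    libraries: dict[str, dict[str, str]] = {}
    current = None
    for line in text.splitlines():
        # Skip blank lines, comment-like lines, alternative dependency
        # templates ("| ...") and metadata fields ("* Field: value").
        if not line.strip() or line.startswith(("#", "|", "*")):
            continue
        if line.startswith(" "):
            # Symbol line: " name@version minimum-version [template-id]"
            fields = line.split()
            if current is not None and len(fields) >= 2:
                current[fields[0]] = fields[1]
        else:
            # Header line: "soname dependency-template"
            current = libraries.setdefault(line.split()[0], {})
    return libraries


sample = """\
libfoo.so.1 libfoo1 #MINVER#
* Build-Depends-Package: libfoo-dev
 foo_function@Base 1.2
 new_function@Base 1.3
"""
parsed = parse_symbols_file(sample)
print(parsed["libfoo.so.1"]["new_function@Base"])
```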

Where are Symbols Files Stored?

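Symbols files are stored in the /var/lib/dpkg/info/ directory. Each package that ships one has its own symbols file, which is named after the package (multiarch packages carry an architecture qualifier in the name, such as libfoo1:amd64.symbols). For example: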

/var/lib/dpkg/info/libfoo1.symbols

These files are critical for handling precise versioning in Debian packages and are automatically generated during the packaging process.
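
Locating the symbols file for a package can be sketched as below. The names symbols_file_candidates and find_symbols_file are hypothetical; the sketch assumes that multiarch packages may use an architecture-qualified file name, so it tries that form first when an architecture is given.

```python
from pathlib import Path
from typing import Optional

DPKG_INFO_DIR = Path("/var/lib/dpkg/info")


def symbols_file_candidates(
    package: str, arch: Optional[str] = None, info_dir: Path = DPKG_INFO_DIR
) -> list[Path]:
    """List the paths where a package's symbols file may live."""
    candidates = []
    if arch:
        # Multiarch form, e.g. libfoo1:amd64.symbols
        candidates.append(info_dir / f"{package}:{arch}.symbols")
    candidates.append(info_dir / f"{package}.symbols")
    return candidates


def find_symbols_file(
    package: str, arch: Optional[str] = None, info_dir: Path = DPKG_INFO_DIR
) -> Optional[Path]:
    """Return the first existing candidate, or None if the package has none."""
    for candidate in symbols_file_candidates(package, arch, info_dir):
        if candidate.is_file():
            return candidate
    return None
```

Returning None rather than raising reflects that not every package ships a symbols file, so callers can fall back to an unversioned dependency.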

Conclusion

To summarize, the Debendable script works by scanning executables to find their shared library dependencies, figuring out which Debian packages provide those libraries, and determining the necessary versions of the packages based on the required symbols. It handles alternative dependencies when multiple packages can provide the same library, ensuring that executables have all the libraries they need to run properly.

As a new maintainer, your focus will be on ensuring that this flow continues to work smoothly, even as libraries evolve and new dependencies arise. Now that you know how the script works and how Debian handles symbols and shared libraries, you’re well-prepared to maintain and extend the script as needed. Good luck!
