From 066bcca741b688f1e430caf6d835c5b7f23ecbfd Mon Sep 17 00:00:00 2001 From: "David E. Wheeler" Date: Fri, 1 Nov 2024 17:27:30 -0400 Subject: [PATCH] Finish RFC draft --- .../rfc-extension-packaging-lookup.md | 318 +++++++++++------- 1 file changed, 201 insertions(+), 117 deletions(-) diff --git a/content/post/postgres/rfc-extension-packaging-lookup.md b/content/post/postgres/rfc-extension-packaging-lookup.md index 8caab705..b949141d 100644 --- a/content/post/postgres/rfc-extension-packaging-lookup.md +++ b/content/post/postgres/rfc-extension-packaging-lookup.md @@ -22,35 +22,36 @@ more unified, public proposal. ## The Problems A number of challenges face extension users in various configurations, thanks -to the status quo of extension file organization in the Postgres core. +to the status quo of extension file organization in the Postgres core. The +common thread among them is the need to add extensions without changing the +Postgres installation itself. ### Packager Testing On Debian systems, the user account that creates extension packages lacks -permission to install add into the `root` user-owned Postgres install. But -testing extensions requires installing the extension files where Postgres can -find them. - -Furthermore, the Postgres installation should be a clean install for each -extension package built, and installing an extension in order to run `make -installcheck` would pollute it. +permission to add files to the `root` user-owned Postgres install. But testing +extensions requires installing the extension files where Postgres can find +them. + +Furthermore, extensions should ideally be built against a clean Postgres +install; adding an extension in order to run `make installcheck` would pollute +it. [Christoph's patch][destdir] solves these problems by adding a second lookup -path for extensions and dynamic modules, so that Postgres can load them from -the package build directory. 
+path for extensions and dynamic modules, so that Postgres can load them +directly from the package build directory. -Alas, the patch isn't ideal, because it uses a prefix for the directory, -after which `pg_config` directories are appended. For example, if -`pg_config --sharedir` outputs `/opt/share` and `extension_destdir` GUC is set -to `/tmp/build/myext`, Postgres will search in `/tmp/build/myext/opt/share`. -This works well for the packaging use case, since prefixing is exactly the -pattern used for building packages, but would be a bit weird for other use -cases. +Alas, the patch isn't ideal, because it simply specifies a prefix and appends +the full `pg_config` directory paths to it. For example, if `--sharedir` +outputs `/opt/share` and the `extension_destdir` GUC is set to `/tmp/build/myext`, +the patch will search in `/tmp/build/myext/opt/share`. This approach works well for +the packaging use case, which explicitly uses a prefix pattern, but would be a +bit weird for other use cases. ### Docker Immutability -Docker images are immutable. To install persistent extensions in a Docker -container, one must create a persistent volume, map it to +Docker images are immutable. To install persistent extensions in a running +Docker container, one must create a persistent volume, map it to `SHAREDIR/extensions`, and copy over all the core extensions (or muck with [symlink magic]). Then do it again for shared object libraries (`PKGLIBDIR`), and perhaps also for other `pg_config` directories, like `--bindir`. @@ -62,11 +63,10 @@ deployment configuration complexity. ### Postgres.app Immutability -The [Postgres.app] project supports installing extensions. But because they -must go into the [SHAREDIR/extensions], installing one changes the contents of -the Postgres.app bundle, breaking Apple provisioned signature validation of -the bundle. The OS will no longer be able to validate that the app is legit, -so will refuse to start it. +The macOS [Postgres.app] supports extensions.
But installing one into +[SHAREDIR/extensions] changes the contents of the Postgres.app bundle, +breaking the app's Apple-required signature validation. The OS will no longer +be able to validate that the app is legit, so will refuse to start it. ## Solution @@ -79,39 +79,22 @@ First, when an extension is installed, all of its files should live in a single directory. These include: * The Control file that describes extension -* Subdirectories for SQL, shared libraries, docs, binaries - -Subdirectories correspond to the `pg_config --*dir` options, except for -`include` and `sysconf` directories: - -``` console -❯ pg_config --help | grep 'dir\b' | grep -v 'include\sysconf' - --bindir show location of user executables - --docdir show location of documentation files - --htmldir show location of HTML documentation files - --libdir show location of object code libraries - --pkglibdir show location of dynamically loadable modules - --localedir show location of locale support files - --mandir show location of manual pages - --sharedir show location of architecture-independent support files - -``` +* Subdirectories for SQL, shared modules, docs, binaries -In other words: +Subdirectories roughly correspond to the `pg_config --*dir` options: -* `bin` -* `doc` -* `html` -* `lib` -* `pkglib` -* `locale` -* `man` -* `share` +* `sql`: SQL files +* `bin`: Executables +* `doc`: Documentation files +* `html`: HTML documentation files +* `lib`: Dynamically loadable modules +* `locale`: Locale support files +* `man`: Manual pages +* `share`: Other architecture-independent support files -This layout reduces the cognitive overhead of the current layout, in which the -the files for an extension are distributed to multiple locations. Want to know -what's included in the `widget` extension? Everything is in the `widget` -directory. +This layout reduces the cognitive overhead for understanding what files belong +to what extension. Want to know what's included in the `widget` extension? 
+Everything is in the `widget` directory. ### Configuration Parameters @@ -139,40 +122,39 @@ plpgsql xml2 ``` -Any OS vendor or packaging systems would install extensions into +Any OS vendor or packaging system would install non-core extensions into `--extdir-vendor`, while end-user extensions would be installed into -`--extdir-site`. [PGXS] will be updated to default to installing into -`--extdir-site`, with an option to specify the vendor directory or any other -directory. External projects that install extensions without using PGXS (like -[pgrx]) should do the same. +`--extdir-site`. [PGXS] will be updated to install into `--extdir-site` by +default, with an option to specify the vendor directory (used by packagers) or +any other directory. External projects that install extensions without using +PGXS (like [pgrx]) should do the same. Like all other `pg_config` options, these values can be customized at compile -time. By default, the each of these values should point to different -directories, so that core, vendor, and end-user extensions are always kept -separate. Perhaps default to: +time. By default, each should point to a different directory, so that +core, vendor, and end-user extensions are always kept separate. Perhaps +default to: ``` -SHAREDIR/extensions/(core|site|vendor) +PG_INSTALL_ROOT/extensions/(core|site|vendor) ``` -## Extension Path +### Extension Path Add an extension lookup path akin to [`dynamic_library_path`]. For the purposes of this RFC, let's call it `extension_path`. It lists all the -directories that Postgres should search for extensions and their files, -including control, SQL, and shared library files. The default value for this -GUC would be: +directories that Postgres should search for extensions and their files.
The +default value for this GUC would be: ``` ini extension_path = '$extdir_site,$extdir_vendor,$extdir_core' ``` -These special values, `$extdir_site`, `$extdir_vendor`, and `$extdir_core`, -correspond to `--extdir-site`, `--extdir-vendor`, and `--extdir-core`, -respectively, and function exactly as `$libdir` does for the -`dynamic_library_path` GUC. +The special values `$extdir_site`, `$extdir_vendor`, and `$extdir_core` +correspond to the `pg_config` `--extdir-site`, `--extdir-vendor`, and +`--extdir-core` options, respectively, and function exactly as `$libdir` does +for the `dynamic_library_path` GUC. -## Lookup Execution +### Lookup Execution Update PostgreSQL's `CREATE EXTENSION` command to search the directories in `extension_path` for an extension. For each directory in the list, it should @@ -183,18 +165,48 @@ $dir/$extension/$extension.control ``` The first one it finds should thereafter be considered the canonical location -for the extension. For example, if the control file named `pair` was found at -`/opt/pg17/ext/pair/pair.control`, then Postgres must load files only from the -appropriate subdirectories, e.g.: +for the extension. For example, if the control file for the `pair` extension +was found at `/opt/pg17/ext/pair/pair.control`, then Postgres must load files +only from the appropriate subdirectories, e.g.: * SQL files from `/opt/pg17/ext/pair/sql` -* Library files from `/opt/pg17/ext/pair/pkglib` +* Library files from `/opt/pg17/ext/pair/lib` + +### PGXS + +Update the extension installation behavior of [PGXS] to install extension files +into the new locations. A new variable, `EXTDIR`, will define the directory +into which an extension will be installed, and will default to +`--extdir-site`. It can be set to the literal values `$extdir_site`, +`$extdir_vendor`, or `$extdir_core`, or to any path.
+ +The installation behavior will be changed for the following variables: + +* `EXTENSION`: Creates `$EXTDIR/$EXTENSION`, installs + `$EXTDIR/$EXTENSION/$EXTENSION.control` +* `MODULES` and `MODULE_big`: Installed into `$EXTDIR/$EXTENSION/lib` +* `MODULEDIR`: Removed +* `DATA`: Installed into `$EXTDIR/$EXTENSION/sql` if they end in `.sql`, + otherwise into `$EXTDIR/$EXTENSION/share` +* `SQL`: New variable, like `DATA` but just for SQL files +* `DATA_built`: Installed into `$EXTDIR/$EXTENSION/share` +* `DATA_TSEARCH`: Installed into `$EXTDIR/$EXTENSION/share/tsearch_data` +* `DOCS`: Installed into `$EXTDIR/$EXTENSION/doc` +* `PROGRAM`, `SCRIPTS` and `SCRIPTS_built`: Installed into + `$EXTDIR/$EXTENSION/bin` + +Another new variable, `LINKBINS`, would default to true and symlink +`$EXTDIR/$EXTENSION/bin` files to `pg_config --bindir`. Installers can set it +to false to skip the symlinking, e.g., for immutable Postgres installs. -## +### Control File + +The `directory` and `module_pathname` control file variables would be removed +and ignored. ## Use Cases -Here’s how the proposed file layout and `extension_path` GUC would work for +Here’s how the proposed file layout and `extension_path` GUC would work for the [use cases that have driven it](#the-problems). ### Packager Testing @@ -207,45 +219,43 @@ PostgreSQL install would follow these steps: the packaging install. Something like `$RPM_BUILD_ROOT/$(pg_config --extdir-vendor)` * Install the extension into that directory: - `make install BASE_DIR=$RPM_BUILD_ROOT` + `make install EXTDIR=$RPM_BUILD_ROOT` * Run `make installcheck` This should allow PostgreSQL to find and load the extension during the tests. The Postgres installation will not have been modified, only the -`extension_path` will have been changed. +`extension_path` will have changed. ### Postgres.app -The contents of the macOS Postgres.app bundle must be immutable in order to -validate against the signature generated by an Apple-provided certificate.
In -order to allow extensions to be installed without changing the app bundle, the -app would be compiled to have `pg_config --extdir-site` point to a well-known -directory outside the bundle. +To allow extension installation without invalidating the Postgres.app bundle +signature, the app would be compiled to have `--extdir-site` point to a +well-known directory outside the app bundle. -Thus any extensions installed by the user would be placed in that directory, -avoiding any change to the Postgres.app bundle. +Any extensions installed by the user would be placed in that directory, +avoiding any change to the Postgres.app bundle. Postgres would know to find +extensions in that location thanks to the inclusion of `$extdir_site` in the +`extension_path` GUC. ### Docker/Kubernetes -Like Postgres.app, Docker images are immutable, but unlike Postgres.app, they -represent the entire system. The solution is identical to that for -Postgres.app, except that instead of using a directory outside the PostgreSQL -installation, one or more [volumes] could be used. A couple of options: +To allow extensions to be added to a container and to persist beyond the +container, one or more [volumes] could be used. A couple of options: * Mount the `--extdir-site` and/or `--extdir-vendor` directories as a - persistent volumes. Then any extensions installed into them will persist, - with no need for any [symlink magic]. If a new container spins up, as long - as it uses the same persistent volumes, it will have the same extensions. + persistent volumes. Then any extensions installed into them will persist. + If a new container spins up, as long as it uses the same persistent + volumes, it will have the same extensions. * Create separate images for each extension, and then "install" them by - simply mounting read-only volumes in the appropriate subdirectory of - `--extdir-site` or `--extdir-vendor`, as appropriate.
Thereafter, any new - containers would simply have to mount the same volumes to have a - consistent number of extensions. + using the [Kubernetes image volume feature] to mount them as read-only + volumes in the appropriate subdirectory of `--extdir-site` or + `--extdir-vendor`. Thereafter, any new containers would simply have to + mount the same volumes to have a consistent number of extensions. ## Extension Directory Examples -A core extension, like [citext], would live in +A core extension, like [citext], would live in `$(pg_config --extdir-core)/citext`, and have a structure such as: ``` tree @@ -268,10 +278,10 @@ citext ``` Third-party extensions would live in one or more other directories on the file -system, generally in the `pg_config --extdir-site` directory for -user-installed extensions, and `pg_config --extdir-vendor` for OS-packaged -extensions. But they may be installed anywhere, as long as the `extension_path` GUC -points to them and they're accessible to/owned by the Postgres system account. +system, generally in the `--extdir-site` directory for user-installed +extensions, and `--extdir-vendor` for OS-packaged extensions. But they may be +installed anywhere, as long as the `extension_path` GUC points to them and +they're accessible to/owned by the Postgres system account. Say that `--extdir-site` is set to `/opt/pgxn`. Within that directory, we might have a directory for a pure SQL extension in a directory named “pair” @@ -329,11 +339,84 @@ semver └── semver--1.1.sql ``` -Another example: a binary-only extension loaded via LOAD (or -`*_preload_libraries`), and not `CREATE EXTENSION`, like `auto_explain`: +## Phase Two: Preloading + +The [solution](#solution) proposed above does not allow extension modules to +compatibly be loaded via [shared library preloading], because extension +modules will be installed in extension directories and no longer in the +[`dynamic_library_path`].
Users can work around this with the full path; instead of + +``` ini +shared_preload_libraries = 'pg_partman_bgw' +``` + +one could write: + +```ini +shared_preload_libraries = '/opt/postgres/extensions/pg_partman_bgw/lib/pg_partman_bgw' +``` + +However, this could become cumbersome, especially if an extension ships with +multiple shared modules. Perhaps some special syntax could be added, for +example: + +```ini +shared_preload_libraries = '$extension_path:pg_partman_bgw' +``` + +But this overloads the semantics of `shared_preload_libraries` and friends +rather heavily, not to mention the [`LOAD`] command. + +As a follow-up to the [solution](#solution) proposed above, this RFC proposes +these additional changes to PostgreSQL. + +### Extension Preloading + +Add new GUCs that complement [shared library preloading], but for *extension* +module preloading: + +* `shared_preload_extensions` +* `session_preload_extensions` +* `local_preload_extensions` + +Each takes a list of extensions for which to preload shared modules. In +addition, another new GUC, `local_plugins`, would be an administrator-only GUC +that contains a list of extensions allowed in `local_preload_extensions`. This +would complement [`local_preload_libraries`]'s use of a `plugins` directory. + +Then modify the preloading code to also preload these files. For each +extension in a list, it would: + +* Search each `$extension_path` for the extension. +* When found, load all the shared libraries in `$extension/lib`. + +For example, to load all shared modules in the `pg_partman` extension, set: + +```ini +shared_preload_extensions = 'pg_partman' +``` + +To load a single shared module from an extension, give its name after the +extension name and a slash. This example will load only the `pg_partman_bgw` +shared module from the `pg_partman` extension: + +```ini +shared_preload_extensions = 'pg_partman/pg_partman_bgw' +``` + +This change will require a one-time change to existing preload configurations +on upgrade.
+ +## Future: Deprecate LOAD + +For a future change, consider adding support for shared module extensions +without SQL to `CREATE EXTENSION`. This would allow extensions such as +`auto_explain` to be handled like any other extension; it would live in +`--extdir-core` with a directory structure like this: ``` tree auto_explain +├── auto_explain.control └── lib ├── auto_explain.dylib ├── bitcode @@ -342,18 +425,13 @@ auto_explain └── auto_explain.index.bc ``` -A non-core extension would be the same, but might include other files like a -`README`, `LICENSE`, etc. - -## Implementation +Note the `auto_explain.control` file. We would need to add a variable to +indicate that the extension includes no SQL files, so `CREATE EXTENSION` and +related commands wouldn't try to find them. -* New GUCs -* `CREATE EXTENSION` path search -* Module path search -* PGXS install into single directory -* PGXS install prefix option -* PGXS `MODULE_PATHNAME` mapping -* PGXN symlink binaries +With these changes, extensions could become the primary, recommended interface +for extending PostgreSQL. Perhaps the `LOAD` command could be deprecated, and +the `*_preload_libraries` GUCs along with it.
## Future Changes @@ -394,6 +472,8 @@ This RFC does not include or attempt to address the following issues: [destdir]: https://commitfest.postgresql.org/50/4913/ [symlink magic]: https://speakerdeck.com/ongres/postgres-extensions-in-kubernetes?slide=14 "Postgres Extensions in Kubernetes: StackGres" + [Kubernetes image volume feature]: https://kubernetes.io/docs/tasks/configure-pod-container/image-volumes/ + "Kubernetes Docs: Use an Image Volume With a Pod" [Postgres.app]: https://postgresapp.com "Postgres.app: The easiest way to get started with PostgreSQL on the Mac" [PGXS]: https://www.postgresql.org/docs/current/extend-pgxs.html @@ -405,4 +485,8 @@ This RFC does not include or attempt to address the following issues: [pg_top]: https://pgxn.org/dist/pg_top/ "PGXN: pg_top" [semver]: https://pgxn.org/dist/semver/ "PGXN: semver" [volumes]: https://docs.docker.com/engine/storage/volumes/ - "Docker Docs: Volumes" + "Docker Docs: Volumes" + [shared library preloading]: https://www.postgresql.org/docs/current/runtime-config-client.html#RUNTIME-CONFIG-CLIENT-PRELOAD + "PostgreSQL Docs: Shared Library Preloading" + [`local_preload_libraries`]: https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-LOCAL-PRELOAD-LIBRARIES + "PostgreSQL Docs: local_preload_libraries"