From fb122e5c5b3db9030a1e783c5ce8b3a1a1d03f7d Mon Sep 17 00:00:00 2001 From: "David E. Wheeler" Date: Mon, 4 Nov 2024 16:39:17 -0500 Subject: [PATCH] Revise RFC; ready for feedback --- .../rfc-extension-packaging-lookup.md | 235 +++++++++--------- 1 file changed, 122 insertions(+), 113 deletions(-) diff --git a/content/post/postgres/rfc-extension-packaging-lookup.md b/content/post/postgres/rfc-extension-packaging-lookup.md index d15fa26c..a5b6db35 100644 --- a/content/post/postgres/rfc-extension-packaging-lookup.md +++ b/content/post/postgres/rfc-extension-packaging-lookup.md @@ -1,8 +1,8 @@ --- title: "RFC: Extension Packaging & Lookup" slug: rfc-extension-packaging-lookup -date: 2024-10-30T19:41:45Z -lastMod: 2024-10-30T19:41:45Z +date: 2024-11-04T19:07:44Z +lastMod: 2024-11-04T19:07:44Z description: | A proposal to modify the PostgreSQL core so that all files required for an extension live in a directory named for the extension, along with a search @@ -14,12 +14,11 @@ draft: true A few weeks ago, I started [a pgsql-hackers thread] proposing a new extension file organization and a search path [GUC] for finding extensions. The -[discussion] of [Christoph Berg]'s [`extension_destdir` patch][destdir], which -I submitted earlier this year, inspired this proposal. These threads cover -quite a lot of territory, so I thought it would be useful to pull together a -more unified, public proposal. +[discussion] of [Christoph Berg]'s [`extension_destdir` patch][destdir] +inspired this proposal. These threads cover quite a lot of territory, so I +thought it would be useful to pull together a more unified, public proposal. -## The Problems +## The Problem A number of challenges face extension users in various configurations, thanks to extension file organization in the Postgres core. The common thread among @@ -29,13 +28,10 @@ itself. 
### Packager Testing On Debian systems, the user account that creates extension packages lacks -permission to add files to the `root` user-owned Postgres install. But testing -extensions requires installing the extension files where Postgres can find -them. - -Furthermore, extensions should ideally be built against a clean Postgres -install; adding an extension in order to run `make installcheck` would pollute -it. +permission to add files to the Postgres install. But testing extensions requires +installing the extension files where Postgres can find them. Furthermore, +extensions should ideally be built against a clean Postgres install; adding an +extension in order to run `make installcheck` would pollute it. [Christoph's patch][destdir] solves these problems by adding a second lookup path for extensions and dynamic modules, so that Postgres can load them @@ -46,7 +42,7 @@ the full `pg_config` directory paths to it. For example, if `--sharedir` outputs `/opt/share` and `extension_destdir` GUC is set to `/tmp/build/myext`, the patch will search in `/tmp/build/myext/opt/share`. This approach works for the packaging use case, which explicitly uses full paths with a prefix, but -would be a bit weird for other use cases. +would be weird for other use cases. ### Docker Immutability @@ -54,19 +50,23 @@ Docker images are immutable. To install persistent extensions in a running Docker container, one must create a persistent volume, map it to `SHAREDIR/extensions`, and copy over all the core extensions (or muck with [symlink magic]). Then do it again for shared object libraries (`PKGLIBDIR`), -and perhaps also for other `pg_config` directories, like `--bindir`. +and perhaps also for other `pg_config` directories, like `--bindir`. Once it's +all set up, one can install a new extension and its files will be distributed +to the relevant persistent volumes. This pattern makes upgrades tricky, because the core extensions are mixed in -with third-party extensions. 
Plus, the number of directories that must be +with third-party extensions. Worse, the number of directories that must be mounted into volumes depends on the features of an extension, increasing -deployment configuration complexity. +deployment configuration complexity. It would be preferable to have all the +files for an extension in one place, rather than scattered across multiple +persistent volumes. ### Postgres.app Immutability The macOS [Postgres.app] supports extensions. But installing one into `SHAREDIR/extensions` changes the contents of the Postgres.app bundle, -breaking the app's Apple-required signature validation. The OS will no longer -be able to validate that the app is legit and refuse to start it. +breaking Apple-required signature validation. The OS will no longer be able to +validate that the app is legit and will refuse to start it. ## Solution @@ -75,8 +75,8 @@ lookup patterns for PostgreSQL extensions. ### Extension Directories -First, when an extension is installed, all of its files should live in a -single directory. These include: +First, when an extension is installed, all of its files will live in a single +directory named for the extension. The contents include: * The Control file that describes extension * Subdirectories for SQL, shared modules, docs, binaries @@ -89,8 +89,7 @@ Subdirectories roughly correspond to the `pg_config --*dir` options: * `lib`: Dynamically loadable modules * `locale`: Locale support files * `man`: Manual pages -* `sql`: SQL files -* `share`: Other architecture-independent support files +* `share`: SQL and other architecture-independent support files This layout reduces the cognitive overhead for understanding what files belong to what extension. Want to know what's included in the `widget` extension? 
@@ -122,14 +121,13 @@ plpgsql xml2 ``` -Any OS vendor or packaging systems would install non-core extensions into -`--extdir-vendor`, while end-user extensions would be installed into +OS vendor and packaging systems would install non-core extensions into +`--extdir-vendor`, while user-installed extensions will be put into `--extdir-site`. Like all other `pg_config` options, these values can be customized at compile -time. By default, the each should point to different directories, so that -core, vendor, and end-user extensions are always kept separate. Perhaps -default to: +time. By default, they'll point to different directories, so that core, +vendor, and end-user extensions are always kept separate. Perhaps default to: ``` PG_INSTALL_ROOT/extensions/(core|site|vendor) @@ -138,43 +136,44 @@ PG_INSTALL_ROOT/extensions/(core|site|vendor) ### Extension Path Add an extension lookup path GUC akin to [`dynamic_library_path`], called -`extension_path`. It lists all the directories that Postgres should search for -extensions and their files. The default value for this GUC would be: +`extension_path`. It lists all the directories that Postgres will search for +extensions and their files. The default value for this GUC will be: ``` ini extension_path = '$extdir_site,$extdir_vendor,$extdir_core' ``` The special values `$extdir_site`, `$extdir_vendor`, and `$extdir_core` -correspond to `pg_config` `--extdir-site`, `--extdir-vendor`, and -`--extdir-core` options, respectively, and function exactly as `$libdir` does -for the `dynamic_library_path` GUC, substituting the appropriate values. +correspond to the `pg_config` options `--extdir-site`, `--extdir-vendor`, and +`--extdir-core`, respectively, and function exactly as `$libdir` does for the +`dynamic_library_path` GUC, substituting the appropriate values. ### Lookup Execution Update PostgreSQL's `CREATE EXTENSION` command to search the directories in -`extension_path` for an extension. 
For each directory in the list, it should -look for the extension control file in a directory named for the extension: +`extension_path` for an extension. For each directory in the list, it will +look for the extension control file in a directory named for the +extension: ``` sh $dir/$extension/$extension.control ``` -The first one it finds should be considered the canonical location for the -extension. For example, if the control file for the `pair` extension was found -at `/opt/pg17/ext/pair/pair.control`, then Postgres must load files only from -the appropriate subdirectories, e.g.: +The first match will be considered the canonical location for the extension. +For example, if Postgres finds the control file for the `pair` extension at +`/opt/pg17/ext/pair/pair.control`, then it will load files only from the +appropriate subdirectories, e.g.: -* SQL files from `/opt/pg17/ext/pair/sql` -* Library files from `/opt/pg17/ext/pair/lib` +* SQL files from `/opt/pg17/ext/pair/share` +* Shared module files from `/opt/pg17/ext/pair/lib` ### PGXS -Update extension installation behavior of [PGXS] to install extension files -into the new locations. A new variable, `EXTDIR`, will define the directory -into which an extension will be installed, and will default to -`--extdir-site`. It can be set to the literal values `$extdir_site`, -`$extdir_vendor`, or `$extdir_core`, or to any path. +Update the extension installation behavior of [PGXS] to install extension +files into the new locations. A new variable, `EXTDIR`, will define the +directory into which to install an extension, and will default to +`--extdir-site`. It can be set to the values `$extdir_site`, `$extdir_vendor`, +or `$extdir_core`, or to any literal path. The `$EXTENSION` variable will be changed to allow only one extension name. 
If it's set, the installation behavior will be changed for the following @@ -184,23 +183,20 @@ variables: `$EXTDIR/$EXTENSION/$EXTENSION.control` * `MODULES` and `MODULE_big`: Installed into `$EXTDIR/$EXTENSION/lib` * `MODULEDIR`: Removed -* `DATA`: Installed into `$EXTDIR/$EXTENSION/sql` if end in `.sql`, - otherwise into `$EXTDIR/$EXTENSION/share` -* `SQL`: New variable, like `DATA` but just for SQL files +* `DATA`: Installed into `$EXTDIR/$EXTENSION/share` * `DATA_built`: Installed into `$EXTDIR/$EXTENSION/share` * `DATA_TSEARCH`: Installed into `$EXTDIR/$EXTENSION/share/tsearch_data` * `DOCS`: Installed into `$EXTDIR/$EXTENSION/doc` * `PROGRAM`, `SCRIPTS` and `SCRIPTS_built`: Installed into `$EXTDIR/$EXTENSION/bin` -Another new variable, `LINKBINS`, would default to true and symlink -`$EXTDIR/$EXTENSION/bin` files in `pg_config --bindir`. Installers can be set +Another new variable, `LINKBINS`, will default to true and symlink +`$EXTDIR/$EXTENSION/bin` files in `pg_config --bindir`. Installers can set it to false to skip the symlinking, e.g., for immutable Postgres installs. -> [!NOTE] -> External projects that install extensions without using PGXS, like [pgrx], -> should be updated to either follow the same pattern or to delegate -> installation to [PGXS]. +> [!NOTE] External projects that install extensions without using PGXS, like +> [pgrx], must also be updated to either follow the same pattern or to +> delegate installation to [PGXS]. ### MODULE_PATHNAME @@ -209,13 +205,13 @@ install path for shared modules, `$EXTDIR/$EXTENSION/lib`. ### Control File -The `directory` and `module_pathname` control file variables would be +The `directory` and `module_pathname` control file variables will be deprecated and ignored. ## Use Cases -Here’s how the proposed file layout and `extension_path` GUC would work for -the [use cases that have driven it](#the-problems). 
+Here’s how the proposed file layout and `extension_path` GUC address the [use +cases](#the-problem) that inspired this RFC. ### Packager Testing @@ -229,36 +225,39 @@ follow these steps: `make install EXTDIR=$RPM_BUILD_ROOT` * Run `make installcheck` -This should allow PostgreSQL to find and load the extension during the tests. -The Postgres installation will not have been modified, only the +This will allow PostgreSQL to find and load the extension during the tests. +The Postgres installation will not have been modified; only the `extension_path` will have changed. -### Postgres.app - -To allow extension installation without invalidating the Postgres.app bundle's -signature, the app would be compiled to have `--extdir-site` point to a -well-known directory outside the app bundle. - -Any extensions installed by the user would be placed in that directory, -without changing the contents of the Postgres.app bundle. Postgres.app would -know to find extensions in that location thanks to the inclusion of -`$extdir_site` in the `extension_path` GUC. - ### Docker/Kubernetes -To allow extensions to be added to a container and to persist beyond the -container, one or more [volumes] could be used. A couple of options: +To allow extensions to be added to a Docker container and to persist beyond +its lifetime, one or more [volumes] could be used. A couple of options: * Mount the `--extdir-site` and/or `--extdir-vendor` directories as a - persistent volumes. Then any extensions installed into them will persist. - If a new container spins up, as long as it uses the same persistent - volumes, it will have the same extensions. + persistent volumes (or one volume and a subdirectory for each). Then any + extensions installed into them will persist. Files for any one extension + will live on a single volume. If a new container spins up, as long as it + uses the same persistent volume(s), it can access the same extensions. 
* Create separate images for each extension, and then "install" them by using the [Kubernetes image volume feature] to mount them as read-only volumes in the appropriate subdirectory of `--extdir-site` or `--extdir-vendor`. Thereafter, any new containers would simply have to - mount the same volumes to have a consistent number of extensions. + mount all the same extension image volumes to persistently provide the same + extensions to all containers. + +### Postgres.app + +To allow extension installation without invalidating the Postgres.app bundle's +signature, the app could be compiled to have `--extdir-site` and +`--extdir-vendor` point to subdirectories of well-known directories outside the +app bundle, such as `/Library/Application Support/Postgres`. + +Any vendor or user extensions installed would be placed in those +subdirectories, without changing the contents of the Postgres.app bundle. +Postgres.app would know to find extensions in that location thanks to the +inclusion of `$extdir_site` in the `extension_path` GUC. ## Extension Directory Examples @@ -274,7 +273,7 @@ citext │ ├── citext │ │ └── citext.bc │ └── citext.index.bc -└── sql +└── share ├── citext--1.0--1.1.sql ├── citext--1.1--1.2.sql ├── citext--1.2--1.3.sql @@ -284,14 +283,8 @@ citext └── citext--1.5--1.6.sql ``` -Third-party extensions would live in other directories on the file system, -generally in the `--extdir-site` directory for user-installed extensions, and -`--extdir-vendor` for OS-packaged extensions. But they may be installed -anywhere, as long as the `extension_path` GUC points to them and they're -accessible to/owned by the Postgres system account. 
- -Within the install directory, we might have a subdirectory for a pure SQL -extension in a directory named “pair” that looks like this: +A pure SQL extension named "pair" would live in a directory named +“pair” that looks something like this: ``` tree pair @@ -302,7 +295,7 @@ pair │ ├── html │ │ └── pair.html │ └── pair.md -└── sql +└── share ├── pair--1.0--1.1.sql └── pair--1.1.sql ``` @@ -340,30 +333,31 @@ semver │ ├── semver │ │ └── semver.bc │ └── semver.index.bc -└── sql +└── share ├── semver--1.0--1.1.sql └── semver--1.1.sql ``` ## Phase Two: Preloading -The [solution](#solution) proposed above does not allow extension modules to -compatibly be loaded via [shared library preloading], because extension -modules will be installed in extension directories and no longer in the -[`dynamic_library_path`]. Users can use the full path; for example, instead of +The above-proposed [solution](#solution) does not allow shared modules +distributed with extensions to compatibly be loaded via [shared library +preloading], because extension modules will no longer live in the +[`dynamic_library_path`]. Users can specify full paths, however. For example, +instead of: ``` ini shared_preload_libraries = 'pg_partman_bgw' ``` -One would use: +One could use: ```ini shared_preload_libraries = '/opt/postgres/extensions/pg_partman_bgw/lib/pg_partman_bgw' ``` -However, this could become cumbersome, especially if an extension ships with -multiple shared modules. Perhaps some special syntax could be added, for +But users will likely find this pattern cumbersome, especially for extensions +with multiple shared modules. Perhaps some special syntax could be added, for example: ```ini @@ -373,12 +367,12 @@ shared_preload_libraries = '$extension_path:pg_partman_bgw' But this overloads the semantics of `shared_preload_libraries` and friends rather heavily, not to mention the [`LOAD`] command. 
-As a follow up to the [solution](#solution) proposed above, this RFC proposes -these additional changes to PostgreSQL. +Therefore, as a follow-up to the [solution](#solution) proposed above, this RFC +proposes additional changes to PostgreSQL. ### Extension Preloading -Add new GUCs the complement [shared library preloading], but for *extension* +Add new GUCs that complement [shared library preloading], but for *extension* module preloading: * `shared_preload_extensions` @@ -386,9 +380,10 @@ module preloading: * `local_preload_extensions` Each takes a list of extensions for which to preload shared modules. In -addition, another new GUC, `local_plugins`, would contains a list of -administrator-approved extensions allowed in `local_preload_extensions`. This -would complement [`local_preload_libraries`]'s use of a `plugins` directory. +addition, another new GUC, `local_plugins`, will contain a list of +administrator-approved extensions users are allowed to include in +`local_preload_extensions`. This GUC complements [`local_preload_libraries`]'s +use of a `plugins` directory. Then modify the preloading code to also preload these files. For each extension in a list, it would: @@ -410,15 +405,15 @@ shared module from the `pg_partman` extension: shared_preload_extensions = 'pg_partman/pg_partman_bgw' ``` -This change will require a one-time change to existing preload configurations -on upgrade. +This change requires a one-time change to existing preload configurations on +upgrade. ## Future: Deprecate LOAD -For a future change, consider adding support for shared module extensions -without SQL to `CREATE EXTENSION`. This would allow extensions such as -`auto_explain` to be handled like any other extension; it would live in -`--extdir-core` with a directory structure like this: +For a future change, consider modifying `CREATE EXTENSION` to support shared +module-only extensions. 
This would allow extensions with no SQL component, +such as `auto_explain`, to be handled like any other extension; it would live +in `--extdir-core` with a directory structure like this: ``` tree auto_explain @@ -443,11 +438,22 @@ the `*_preload_libraries` GUCs along with it. * The The `directory` and `module_pathname` control file variables and the `MODULEDIR` PGXS variable would be deprecated and ignored. -* `*_preload_libraries` would no longer be used to find extension modules. - Administrators would have to move extensions listed in those GUCs to the - new `*_preload_extensions` variables. +* `*_preload_libraries` would no longer be used to find extension modules + without full paths. Administrators would have to remove module names from + these GUCs and add the relevant extension names to the + `*_preload_extensions` variables. To ease upgrades, we might consider + adding a PGXS variable that, when true, would symlink shared modules into + `--pkglibdir`. * `LOAD` would no longer be able to find shared modules included with - extensions. + extensions, unless we add a PGXS variable that, when true, would symlink + shared modules into `--pkglibdir`. +* The `EXTENSION` PGXS variable will no longer support multiple extension + names. +* The change in extension installation locations must also be adopted by + projects that don't use PGXS for installation, like [pgrx]. Or perhaps + they could be modified to also use PGXS. Long term it might be useful to + replace the `Makefile`-based PGXS with another installation system, + perhaps a CLI. ## Out of Scope @@ -457,7 +463,7 @@ This RFC does not include or attempt to address the following issues: consistent in a Docker/Kubernetes environment or for non-system binary packaging patterns presents its own challenges, though they're not specific to PostgreSQL or the patterns described here. Research is ongoing - into potential solutions, but will be addressed elsewhere. 
+ into potential solutions, and will be addressed elsewhere. [a pgsql-hackers thread]: https://postgr.es/m/2CAD6FA7-DC25-48FC-80F2-8F203DECAE6A%40justatheory.com [GUC]: https://pgpedia.info/g/guc.html "GUC - Grand Unified Configuration" @@ -486,3 +492,6 @@ This RFC does not include or attempt to address the following issues: "PostgreSQL Docs: local_preload_libraries" [`LOAD`]: https://www.postgresql.org/docs/17/sql-load.html "PostgreSQL Docs: LOAD" + [auto_explain]: https://www.postgresql.org/docs/current/auto-explain.html + "PostgreSQL Docs: auto_explain— log execution plans of slow queries" + \ No newline at end of file