Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Add support to autofill env-vars in config #764

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

menof36go
Copy link

@menof36go menof36go commented Aug 1, 2024

  • Environment variables should be autofilled when the config is loaded

Fixes: #935

Plenty of tasks and middlewares offer environment variables for some parts of their configuration. I agree with the aforementioned issue that this leads to inconsistent env variable usage between tasks and forces developers into a repetitive pattern of manually adding env vars for configuration options they deem worthy.

The approach of directly referencing env variables in the configuration files seems like the most elegant solution to me, because it doesn't "hide" that the configuration occurs from the user and because it lets the user decide how they want to name their env variables (Which also allows them to reuse them between tasks freely).

Example where we replace the string "${placeholder}" in all js files with the username (e.g. Fabian on my machine)

builder:
  customTasks:
    - name: ui5-tooling-stringreplace-task
      afterTask: replaceVersion
      configuration:
        debug: true
        files:
          - "**/*.js"
        replace:
          - placeholder: ${placeholder}
            value: env:USERNAME

env:USERNAME is replaced with the content of environment variable "USERNAME" at runtime.
This works with any value in a ui5.yaml. The change should be non-breaking as long as the wrapper / prefix of the environment variables is reasonably different from a string a user might enter. I chose env: because that's the format @sap/ux-ui5-tooling currently uses, but a format that actually wraps the env variable is probably preferable. If the environment variable is not set, it currently does not get replaced.

I have not added any documentation yet. I look forward to your feedback, thanks in advance

- Environment variables should be autofilled when the config is loaded

Fixes: #935
Copy link

cla-assistant bot commented Aug 1, 2024

CLA assistant check
All committers have signed the CLA.

@flovogt
Copy link
Member

flovogt commented Aug 5, 2024

@menof36go Thanks a lot for your efforts. We will have a closer look in the team and will get in touch with you if we can share any news

@flovogt flovogt added the enhancement New feature or request label Aug 5, 2024
@flovogt flovogt self-assigned this Aug 5, 2024
@coveralls
Copy link

Coverage Status

coverage: 95.697% (-0.02%) from 95.717%
when pulling 174b695 on menof36go:add-env-auto-fillin
into 380e358 on SAP:main.

Copy link
Member

@RandomByte RandomByte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@menof36go apologies for the delay, we are quite low on capacity this summer.

Again, thank you very much for your contribution! I like your approach and it generally looks pretty good. I left some comments that we can iterate on.

In addition I have some general points to discuss (with you, but also with the team):

  1. Should we restrict the use of environment variables to certain sections of the configuration?
    • E.g. only allow it in the configuration sections of custom tasks/middleware and the "customConfiguration" section? I'm worried about the possibility to circumvent our validation by changing relevant configuration values after a yaml has already been read and validated. Basically we would need validate the whole object again.
    • This PR and Support for environment variables ui5-tooling#935 only mention custom task/middleware configuration
  2. Should we restrict the names of the environment variables somehow? E.g. to require a UI5_TOOLING_ prefix?
    • This is probably unwanted? At least I'm not aware of other tool with a similar restriction.
    • However, this would give the developer more control over what environment variables may be accessed by UI5 Tooling, and especially by any UI5 dependencies a project might have.
    • Dependencies could read arbitrary environment variables. This could either expose sensitive information, or lead to errors (e.g. a variable may exist on macOS but not on Windows).
    • Custom tasks and middleware can of course already read any environment variables from within Node.js (accessing process.env). But maybe we'll be able to sandbox those in the future.
  3. I assume this should work for the configuration object of the current root project as well as for any dependencies?

Looking forward to your feedback!

* @static
* @readonly
*/
static _ENVIRONMENT_VARIABLE_REGEX = /env:\S+/g;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static _ENVIRONMENT_VARIABLE_REGEX = /env:\S+/g;
static _ENVIRONMENT_VARIABLE_REGEX = /^env:\S+$/g;

Match the start and end of the string to prevent partial substitution like "env:should" in this env:should not be replaced.

At least I think that would be unwanted/unexpected by most users. Or did you expect this to work as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered this aswell, but figured this would actually be expected behavior since environment variables behave like this in batch and powershell script string interpolation.

Consider a custom task that takes a configuration value in a specific format, e.g. something like

configuration:
      example1: "key1=value1;key2=value2"
      example2: "value2"

With the current implementation, the user could inject their environment variables however they might like, e.g. something like

configuration:
      example1: "key1=env:myvalue;key2=value2"
      example2: "env:myvalue"

This approach allows a more fine-grained configuration IMO. Now one might argue that the dev should not have implemented their custom configuration like this in the first place (and I agree, at least for this example a yaml array would have sufficed), but then again the goal is to move the responsibility away from the dev.

I think it's very unlikely that a user might have a substring in a config value that coincidentally matches the environment variable regex given a sufficiently unique env var format, so the additional control and flexibility the user gets in return seems worth it.

I'll gladly change it if you want me to, though

const envVarName = substring.slice(4);
return process.env[envVarName] || substring;
});
object[key] = value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether this requires additional checks to prevent prototype pollution, since the source of the configuration might not be fully trusted.

In case the configuration object has been loaded from a YAML, the yaml-js module already used defineProperty for any __proto__ keys when creating the object. I'm not sure whether we still need to do the same when updating the property's value.

case "string": {
value = value.replace(Module._ENVIRONMENT_VARIABLE_REGEX, (substring) => {
const envVarName = substring.slice(4);
return process.env[envVarName] || substring;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether we should do this fallback in case the environment variable is empty or not set.

I would expect that we always replace the env-pattern, since the pattern itself is probably never the correct value for whatever you are passing it to. An empty value seems more likely to trigger checks and result in a reasonable error for the user (i.e. "parameter xyz is empty" from a custom task).

I'll have to check again, but I think that would be in line with other tools like Docker as well.
From dotenv:

empty values become empty strings (EMPTY= becomes {EMPTY: ''})

I'm also wondering whether we should differentiate between empty environment variables and variables that are not set at all. I'm leaning towards treating them the same, but we should check how other tools are handling this.

Copy link
Author

@menof36go menof36go Sep 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I agree that the current fallback is bad.
We have to consider that the task has no way of knowing that the user might have forgotten something since it doesn't and shouldn't know about the auto-fill. So there are 2 possible outcomes:

  1. empty string is a legal value for the configuration key's value
  2. empty string is an illegal value for the configuration key's value

In case 2, we are fine, the task will (hopefully) throw an error.

But what about case 1?
In case 1 the task may not fail at all and the results may be very different from what the user expects.
Imagine how hard it might be for a user to figure out why their results are all messed up in this case, when seemingly no error occurred.

For this reason, I think it would be best if we take responsibility and throw a reasonable error ourselves instead of delegating it to the task. At the very least we should warn the user imo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, we can't tell whether an empty string is a legal value for a given configuration option.

Are you proposing to raise an error in case an env-var is not set at all, or also in case it is set to an empty string? By now I think I'm more in favor of the first one, since empty strings may be valid configuration options.

I understand that we both agree the name of the environment variable should never be used as a value.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you proposing to raise an error in case an env-var is not set at all, ... By now I think I'm more in favor of the first one, since empty strings may be valid configuration options.

Yes, exactly, I think it is the best way to avoid unexpected behavior

I understand that we both agree the name of the environment variable should never be used as a value.

Yes, I agree

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussing this with some colleagues, I'm not sure whether raising an expection will work well with some of the expectations from SAP/ui5-tooling#935.

Especially when environment variables are used in (transitive-)dependencies (e.g. a reuse-library), you would have to make sure to set all those variables when building the root project, as otherwise you will face an error.

  1. For things like enabling the debug mode of a custom task, this doesn't seem ideal. Surely you don't want to always have to set DEBUG= to explicitly disable the debug mode. I'd rather expect the debug mode to be disabled by default, unless DEBUG=true is set.
  2. If a dependency adds an environment variable in a new release, your project's build would break after updating the dependency, until you also provide the new environment variable (which might be something irrelevant for your project)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussing this with some colleagues, I'm not sure whether raising an expection will work well with some of the expectations from SAP/ui5-tooling#935.

Especially when environment variables are used in (transitive-)dependencies (e.g. a reuse-library), you would have to make sure to set all those variables when building the root project, as otherwise you will face an error.

1. For things like enabling the debug mode of a custom task, this doesn't seem ideal. Surely you don't want to always have to set `DEBUG=` to _explicitly disable the debug mode_.  I'd rather expect the debug mode to be disabled by default, _unless_  `DEBUG=true` is set.

2. If a dependency adds an environment variable in a new release, your project's build would break after updating the dependency, until you also provide the new environment variable (which might be something irrelevant for your project)

I agree, environment variables inside your dependencies yamls sound like a hot mess, I will gladly give that up (see comment below). If we do however restrict the feature to the root project, would you then agree that an exception is the way to go? The only way for an exception to pop up in that case is if the user messed up their own config and updates to dependencies should no longer be a concern

name: "application.a-object",
},
customConfiguration: {
stringWithEnvVar: `env:${testEnvVars[0].name}`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add a case where the pattern is inside a string (see my comment on the regex)

test/lib/graph/Module.js Outdated Show resolved Hide resolved
Co-authored-by: Merlin Beutlberger <[email protected]>
@menof36go
Copy link
Author

@menof36go apologies for the delay, we are quite low on capacity this summer.

Again, thank you very much for your contribution! I like your approach and it generally looks pretty good. I left some comments that we can iterate on.

In addition I have some general points to discuss (with you, but also with the team):

1. Should we restrict the use of environment variables to certain sections of the configuration?
   
   * E.g. only allow it in the configuration sections of custom tasks/middleware and the "customConfiguration" section? I'm worried about the possibility to circumvent our validation by changing relevant configuration values after a yaml has already been read and validated. Basically we would need validate the whole object again.
   * This PR and [Support for environment variables ui5-tooling#935](https://github.com/SAP/ui5-tooling/issues/935) only mention custom task/middleware configuration

2. Should we restrict the names of the environment variables somehow? E.g. to require a `UI5_TOOLING_` prefix?
   
   * This is probably unwanted? At least I'm not aware of other tool with a similar restriction.
   * However, this would give the developer more control over what environment variables may be accessed by UI5 Tooling, and **especially by any UI5 dependencies** a project might have.
   * Dependencies could read arbitrary environment variables. This could either expose sensitive information, or lead to errors (e.g. a variable may exist on macOS but not on Windows).
   * Custom tasks and middleware can of course already read _any environment variables_ from within Node.js (accessing `process.env`). But maybe we'll be able to sandbox those in the future.

3. I assume this should work for the configuration object of the current root project as well as for any dependencies?

Looking forward to your feedback!

  1. I think you are right. We don't want to validate the object twice and we don't want autofill to invalidate the config. I think the best course of action would be to move the autofill step into module#_readConfigFile. The autofill should only be relevant for yaml configurations anyways and this ensures that the autofill is always performed before the config validation. If we do it like this, there should not be a reason to restrict the usage to certain sections from a technical point of view

  2. This is a somewhat separate topic, because like you said, currently there is no sandboxing anyways, but I think it would be preferable to solve this without restricting the env vars to a prefix.
    The ~/.ui5rc seems like a decent place to let the user configure environment variable sandboxing on a project-independent level (e.g. whitelisting / blacklisting with regex support so users can come up with their own prefix if that's a solution they like).
    In my understanding, the only "trusted" ui5.yaml is the root project's ui5.yaml and similar environment variable sandboxing configuration could be implemented here (and overwrite the global config from ~/.ui5rc in that case, similar to how tools like npm handle their configuration hierarchy).

  3. Yes, i think so

@RandomByte
Copy link
Member

I added a comment in SAP/ui5-tooling#935 (comment) and want to share some details here:

After discussing this in the team we have found some concerns regarding general environment variable substitution in the ui5.yaml.

To begin, we missed an important feature that we already have and want to expand on in the future: Reusable build results. Right now you can build a project with the additional parameter --create-build-manifest. The dist folder will then contain a .ui5/build-manifest.json which allows other UI5 Tooling builds to use this build result instead of having to build the project again. Currently this feature is semi-public but heavily used in the SAPUI5 framework build. We plan to extend it to allow local caching and reuse of build results for UI5 app developers.

Adding the ability to influence how a dependency of your project is built by setting environment variables makes this a lot harder. The build-manifest would have to list all relevant environment variables and their values that were used when building the project to eventually decide whether the result can be re-used. The same applies for all transitive dependencies, since any of their build results could influence those above them.

Another problem I see is that dependencies might add environment variables to their ui5.yaml at any time and consuming projects might run into all sorts of trouble because of that. Maybe a dependency of your project suddenly adds a new variable that you need to set. Or one that you don't need to set but already do because you used the same name. There's no easy way for UI5 projects to communicate which variables they use and which ones are meant for their own internal testing/development and which ones for external consumers. Keep in mind that you might be dealing with multiple layers of transitive dependencies.

Therefore, and depending on responses to SAP/ui5-tooling#935 (comment), I argue that we should restrict this functionality to the custom middleware configuration for now. This configuration is only used if the project is the current root project. Therefore it can be ignored when building dependencies.

@menof36go
Copy link
Author

menof36go commented Sep 30, 2024

I added a comment in SAP/ui5-tooling#935 (comment) and want to share some details here:

After discussing this in the team we have found some concerns regarding general environment variable substitution in the ui5.yaml.

To begin, we missed an important feature that we already have and want to expand on in the future: Reusable build results. Right now you can build a project with the additional parameter --create-build-manifest. The dist folder will then contain a .ui5/build-manifest.json which allows other UI5 Tooling builds to use this build result instead of having to build the project again. Currently this feature is semi-public but heavily used in the SAPUI5 framework build. We plan to extend it to allow local caching and reuse of build results for UI5 app developers.

Adding the ability to influence how a dependency of your project is built by setting environment variables makes this a lot harder. The build-manifest would have to list all relevant environment variables and their values that were used when building the project to eventually decide whether the result can be re-used. The same applies for all transitive dependencies, since any of their build results could influence those above them.

Another problem I see is that dependencies might add environment variables to their ui5.yaml at any time and consuming projects might run into all sorts of trouble because of that. Maybe a dependency of your project suddenly adds a new variable that you need to set. Or one that you don't need to set but already do because you used the same name. There's no easy way for UI5 projects to communicate which variables they use and which ones are meant for their own internal testing/development and which ones for external consumers. Keep in mind that you might be dealing with multiple layers of transitive dependencies.

Therefore, and depending on responses to SAP/ui5-tooling#935 (comment), I argue that we should restrict this functionality to the custom middleware configuration for now. This configuration is only used if the project is the current root project. Therefore it can be ignored when building dependencies.

I actually agree with almost all of this. Dependencies should expose their configuration via yaml. I see no point in having an env var in a yaml outside of the root project, I will gladly restrict it. Restricting the feature to the root project should also alleviate the concerns regarding the build-manifest, if I understand this feature correctly.

However, I dont think we should exclude custom tasks. For example, we are currently using ui5-tooling-stringreplace environment variables to replace certain strings during build based on environment variables. We also use open-ux-tools/deploy-tooling with environment variables. I am sure there are more examples of ui5 custom tasks that rely on environment variables, but those are the two examples off the top of my head.

If we generally exclude custom tasks, we will continue to have different notations, varying grades of env var support and users and devs alike will have to deal with a rather unpleasant new asymmetry between tasks and middlewares (For example, a user could use env:Var notation in his stringreplace middleware notation, but would have to rely on the notation + separate .env file for the task, yuck!)

Are your concerns related to the build-manifest? If I understand the feature correctly, it allows reusing your build in another project. So if we restrict the feature to the root project, this should no longer be an issue / concern

  1. I assume this should work for the configuration object of the current root project as well as for any dependencies?

This was just a side-effect of how it was implemented, I will gladly give this up if it helps reduce concerns

@flovogt
Copy link
Member

flovogt commented Oct 2, 2024

We track this PR SAP internally in backlog item CPOUI5FOUNDATION-919.

@RandomByte
Copy link
Member

Quick update, sadly there hasn't been much progress on the topic of caching/reusing build results recently, but I think this PR has impact on that topic so we should better wait until that's fully figured out. Sorry for that, hope to get back onto it soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for environment variables
4 participants