Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

topdown: improve scientific notation parsing in extractNumAndUnit #7147

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

berdanA
Copy link

@berdanA berdanA commented Oct 31, 2024

Why the changes in this PR are needed?

In the units.parser (or specifically in the extractNumAndUnit function in topdown/parse_bytes.go), scientific notation is not currently supported for parsing numeric values with units. This limitation impacts functionality when using OPA Gatekeeper as the policy engine in Kubernetes clusters, where certain resource specifications permit the use of scientific notation. To maintain flexibility and improve user experience, I would like to propose a solution that adds support for scientific notation without altering existing functionality.

What are the changes in this PR?

  • Updated extractNumAndUnit to recognize and correctly parse scientific notation, including cases with 'e' or 'E' followed by an exponent (e.g., "1e10", "3.2E4").
  • Ensures that 'e' or 'E' is treated as part of the number if followed by digits, enhancing compatibility with scientific notation inputs.
  • Scientific notation with units now also parses correctly (e.g., "1e10GB" extracts "1e10" as the number and "GB" as the unit).
  • Maintains behavior for cases without units, allowing either the number or unit portion to be empty, while improving overall string parsing logic.
  • Added tests for scientific notation parsing in units.parse_bytes, including both SI and binary units (e.g., KB, MiB, GiB, KiB).
  • Included cases for uppercase, lowercase, and mixed case formats with scientific notation.
  • Added validation for numbers without units defaulting to bytes.

Fixes #7142
Signed-off-by: Berdan Akar [email protected]

- Updated extractNumAndUnit to recognize and correctly parse scientific notation, including cases with 'e' or 'E' followed by an exponent (e.g., "1e10", "3.2E4").
- Ensures that 'e' or 'E' is treated as part of the number if followed by digits, enhancing compatibility with scientific notation inputs.
- Scientific notation with units now also parses correctly (e.g., "1e10GB" extracts "1e10" as the number and "GB" as the unit).
- Maintains behavior for cases without units, allowing either the number or unit portion to be empty, while improving overall string parsing logic.
- Added tests for scientific notation parsing in `units.parse_bytes`, including both SI and binary units (e.g., KB, MiB, GiB, KiB).
- Included cases for uppercase, lowercase, and mixed case formats with scientific notation.
- Added validation for numbers without units defaulting to bytes.

Fixes open-policy-agent#7142
Signed-off-by: bakar <[email protected]>
if !isNum(r) {
for idx := 0; idx < len(s); idx++ {
r := rune(s[idx])
if !isNum(r) && r != 'e' && r != 'E' && r != '+' && r != '-' {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a small note for this check will be helpful.

Copy link
Member

@ashutosh-narkar ashutosh-narkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@berdanA thanks for the contribution. The test coverage looks great. Just one small comment 👇 and we can get this in. It would be helpful if we update the docs for the built-ins.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Scientific notation ("e" notation) support for unit parsing
2 participants