Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] From field and timestamp circoms generated with negate regexes are invalid. #19

Closed
SoraSuegami opened this issue Sep 30, 2023 · 2 comments

Comments

@SoraSuegami
Copy link
Contributor

If from field and timestamp circoms are generated with negation regexes, their tests do not pass.
https://github.com/zkemail/zk-regex/blob/feat/invalid_dfa/packages/circom/circuits/common/from_addr.json
https://github.com/zkemail/zk-regex/blob/feat/invalid_dfa/packages/circom/circuits/common/timestamp.json

In particular, my current implementation generates a DFA by regarding the negation [^abc] as a group [\u{ff}abc]=(\u{ff}|a|b|c). When generating a circom based on the DFA, if the edge string contains \u{ff}, it makes constraints that the state moves only if the input character is not in the list of the edge string except for \u{ff}.
We can regard this modification of constraints as the modification of edges in the already-generated DFA.
Therefore, it turns DFA back into an NFA with multiple possible edges.

One simple idea is that we convert [^abc], to a non-negate regex (\u00 | \u01 | \u02 | ... | \uff), which means a group regex of all 1-byte characters except for ones in the group of the negate regex, before generating the first DFA.
However, it will not be able to reduce the circom size because the resulting DFA will be still complex.
I am not sure this complexity is due to theoretical limitations or not.

@SoraSuegami
Copy link
Contributor Author

New idea:

  1. Change the negation [^abc] to a group [\u{ff}abc]=(\u{ff}|a|b|c) and generate its minimized DFA.
  2. The DFA is converted to an object of NFA compatible with nfaToDfa in regex.js.
  3. Find nodes whose edge contains a character \u{ff}.
  4. For each node in 3, Find other edges that do not contain \u{ff}. Let C be a set of characters on those edges.
  5. For each node in 3, if the edge with \u{ff} goes to state x, add characters in C to that edge in the converted NFA.
  6. Apply transitions of NFA->DFA->minimized DFA to the NFA in 5 again.

@SoraSuegami
Copy link
Contributor Author

This is solved based on the first approach.
9273a2e

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant