Integrate combinable DFA capture resolution for our PCRE dialect regexes, part 1. #440

Open
wants to merge 54 commits into
base: sv/integration-target--combinable-DFA-capture-resolution
Commits
db702ff
Add src/adt/idmap.c, a state -> ID set map.
silentbicycle Feb 16, 2023
41a92d7
Switch adt code to `uint64_t` for hashes, not `unsigned long`.
silentbicycle May 1, 2023
ec30958
stateset: Add `EXPENSIVE_CHECKS` guard around expensive asserts.
silentbicycle May 2, 2023
d1ce686
Namespace epsilon_closure and closure_free with "fsm_".
silentbicycle May 2, 2023
deb14a8
tests/capture/: Delete outdated capture tests.
silentbicycle May 3, 2023
ffea22e
fuzz/run_fuzzer: Run single seed file when given as argument.
silentbicycle May 31, 2023
2beeff4
Completely rework capture resolution.
silentbicycle May 31, 2023
31ff234
Add pcre-anchor test for anchoring edge case.
silentbicycle Jun 2, 2023
77f9260
fuzz/target.c: Add multi-regex and single regex cmp against PCRE.
silentbicycle Jun 2, 2023
318c9ad
re: Add -FC flag to disable captures.
silentbicycle Jun 2, 2023
5e49713
tests/*/Makefile: Add `-FC` (no captures) for some calls to RE.
silentbicycle Feb 16, 2023
6e410b3
ast_analysis: Reject '\z' as unsupported.
silentbicycle Jun 2, 2023
5ddb4f0
Merge branch 'main' into sv/integrate-combinable-DFA-capture-resolution
silentbicycle Jun 15, 2023
86ec9e4
union.c: Remove EXPENSIVE_CHECKS based on removed interface.
silentbicycle Jun 15, 2023
d54ba0d
bugfix: resize endid buffer for carry_end_metadata properly.
silentbicycle Jun 15, 2023
e2ac4d7
Address a couple warnings from scan-build.
silentbicycle Jun 15, 2023
c11f372
fuzz/target.c: In MULTI mode, check endid behavior.
silentbicycle Jun 15, 2023
8b49a1a
minimisation: Distinct sets of end IDs should not split ECs.
silentbicycle Jul 10, 2023
a3a79ca
Remove theft test harness for deleted ADT (ipriq).
silentbicycle Jul 10, 2023
bfc9a80
capture_vm_exec: Address scan-build warnings.
silentbicycle Jul 11, 2023
f43585a
fuzz/target.c: Expand build_and_check_multi.
silentbicycle Jul 17, 2023
3a93f6d
re_capvm_compile.c: Update #include for EXPENSIVE_CHECKS.
silentbicycle Jul 17, 2023
da51d9e
Remove nullable ALT backpatching, it's now unreachable.
silentbicycle Jul 17, 2023
9f3ae29
capture vm: Cleanup & clarifying comments throughout.
silentbicycle Jul 17, 2023
26483cf
re_capvm_compile: Change "active_node" categorization.
silentbicycle Jul 18, 2023
e9353af
capture_vm: rename split's .cont and .new to .greedy and .nongreedy.
silentbicycle Jul 21, 2023
1e55220
capture_vm: Add error state for step limit reached.
silentbicycle Jul 21, 2023
36dfddf
Merge branch 'main' into sv/integrate-combinable-DFA-capture-resolution
silentbicycle Jul 24, 2023
937c9f0
ast_analysis: Remove variables that are no longer used.
silentbicycle Jul 24, 2023
5e54a8b
src/adt/stateset.c: Reference `adt/common.h`'s `EXPENSIVE_CHECKS`.
silentbicycle Jul 24, 2023
94d6e7f
fuzz/target.c: Use default mode for the empty string.
silentbicycle Jul 25, 2023
dced7c7
re_capvm_compile.c: Avoid potential sign-extension bug.
silentbicycle Jul 25, 2023
8f0a683
Do non-zero allocations to silence EFENCE.
silentbicycle Jul 25, 2023
722260a
re_capvm_compile: subtree_represents_character_class: handle empty b.
silentbicycle Jul 25, 2023
75bb88d
state_set: Improve `state_set_search` performance, correct result.
silentbicycle Aug 29, 2023
709b8cc
stateset: Avoid memmove of size 0.
silentbicycle Aug 29, 2023
cead0d9
stateset: Add note about potentially expensive assertion.
silentbicycle Aug 29, 2023
cbfeddd
stateset: Comment struct fields.
silentbicycle Aug 29, 2023
c3dab77
edgeset: Fix indentation for `#if`'d block.
silentbicycle Aug 29, 2023
1ada07c
edgeset: Switch from linear to binary searching in edge_set_add_bulk.
silentbicycle Aug 29, 2023
7122d2f
edgeset: Commit to using binary search.
silentbicycle Aug 29, 2023
937585a
determinise: Drastically reduce calls to qsort.
silentbicycle Aug 30, 2023
30e34ef
edgeset: Remove stale comment.
silentbicycle Aug 30, 2023
cf6051f
UBSan: Avoid implicit signed/unsigned conversion.
silentbicycle Aug 30, 2023
c1e1282
UBSan: Avoid implicit signed/unsigned conversion.
silentbicycle Aug 30, 2023
6eff0f9
bugfix: The range is min..max inclusive, so add 1.
silentbicycle Aug 30, 2023
98ee906
Address a couple warnings from scan-build.
silentbicycle Jun 15, 2023
51892e3
Add src/adt/idmap.c, a state -> ID set map.
silentbicycle Feb 16, 2023
7c6644f
Remove theft test harness for deleted ADT (ipriq).
silentbicycle Jul 10, 2023
c646868
Add pcre-anchor test for anchoring edge case.
silentbicycle Jun 2, 2023
0789d61
fuzz/run_fuzzer: Run single seed file when given as argument.
silentbicycle May 31, 2023
1ca3726
Don't purge the seed cache for PRs syncing clones.
katef Sep 12, 2023
5ece637
Merge branch 'main' into sv/integrate-combinable-DFA-capture-resolution
silentbicycle Sep 27, 2023
239927c
ci.yml: Update fuzzer modes for CI.
silentbicycle Oct 2, 2023
11 changes: 9 additions & 2 deletions .github/workflows/ci.yml
@@ -374,7 +374,7 @@ jobs:
 cc: [ clang ]
 make: [ bmake ]
 debug: [ DEBUG, RELEASE ] # RELEASE=1 is a no-op
-mode: [ m, p, d ]
+mode: [ r, s, m, i, M, p ]
 exclude:
 - os: macos
 cc: gcc # it's clang anyway
@@ -409,8 +409,15 @@ jobs:
 path: ${{ env.build }}
 key: build-${{ matrix.make }}-${{ matrix.os }}-${{ matrix.cc }}-${{ matrix.debug }}-${{ matrix.san }}-${{ github.sha }}

-# note we do the fuzzing unconditionally; each run adds to the corpus
+# note we do the fuzzing unconditionally; each run adds to the corpus.
+#
+# We only run fuzzing for PRs in the base repo, this prevents attempting
+# to purge the seed cache from a PR syncing a forked repo, which fails
+# due to a permissions error (I'm unsure why, I think PRs from clones can't
+# purge a cache in CI presumably for security/DoS reasons). PRs from clones
+# still run fuzzing, just from empty, and do not save their seeds.
 - name: Restore seeds (mode ${{ matrix.mode }})
+if: github.repository == 'katef/libfsm'
 uses: actions/cache/restore@v3
 id: cache-seeds
 with:
1 change: 1 addition & 0 deletions Makefile
@@ -108,6 +108,7 @@ SUBDIR += src
 SUBDIR += tests/capture
 SUBDIR += tests/complement
 SUBDIR += tests/gen
+SUBDIR += tests/idmap
 SUBDIR += tests/intersect
 #SUBDIR += tests/ir # XXX: fragile due to state numbering
 SUBDIR += tests/eclosure
11 changes: 10 additions & 1 deletion fuzz/Makefile
@@ -7,6 +7,15 @@ ${BUILD}/fuzz/: ${BUILD}

 DIR += ${BUILD}/fuzz

+# Uncomment to enable capture fuzzing using PCRE as a test oracle.
+#PCRE_CMP=1
+
+.if PCRE_CMP
+PKG += libpcre2-8
+LFLAGS.fuzzer += ${LIBS.libpcre2-8}
+CFLAGS.${SRC:Mfuzz/target.c} += -DCMP_PCRE=1
+.endif
+
 .for src in ${SRC:Mfuzz/*.c}
 CFLAGS.${src} += -std=c99
 .endfor
@@ -15,7 +24,7 @@ CFLAGS.${src} += -std=c99
 fuzz:: ${BUILD}/fuzz/fuzzer

 ${BUILD}/fuzz/fuzzer: mkdir
-${CC} -o $@ ${LFLAGS} ${.ALLSRC:M*.o} ${.ALLSRC:M*.a}
+${CC} -o $@ ${LFLAGS} ${LFLAGS.fuzzer} ${.ALLSRC:M*.o} ${.ALLSRC:M*.a}

 .for lib in ${LIB:Mlibfsm} ${LIB:Mlibre}
 ${BUILD}/fuzz/fuzzer: ${BUILD}/lib/${lib:R}.a
10 changes: 8 additions & 2 deletions fuzz/run_fuzzer
@@ -4,6 +4,8 @@ BUILD=../build
 FUZZER=${BUILD}/fuzz/fuzzer
 SEEDS=${BUILD}/fuzz/fuzzer_seeds

+ARG=$1
+
 SECONDS=${SECONDS:-60}
 WORKERS=${WORKERS:-4}
 SEEDS=${SEEDS:-seeds}
@@ -25,5 +27,9 @@ if [ ! -d "${SEEDS}" ]; then
 mkdir -p "${SEEDS}"
 fi

-echo "\n==== ${FUZZER}"
-${FUZZER} -jobs=${WORKERS} -workers=${WORKERS} -max_total_time=${SECONDS} ${SEEDS}
+if [ -z "${ARG}" ]; then
+echo "\n==== ${FUZZER}"
+exec ${FUZZER} -jobs=${WORKERS} -workers=${WORKERS} -max_total_time=${SECONDS} ${SEEDS}
+else
+exec ${FUZZER} ${ARG}
+fi
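
For reviewers, the dispatch added to fuzz/run_fuzzer can be sketched as a standalone script that prints the command it would run instead of exec'ing it, so it works without a built fuzzer. This is only an illustration: `fuzzer_cmd` is a hypothetical helper name, and `FUZZ_SECONDS` is a rename made here (the script itself uses `SECONDS`, which bash treats as a special variable).

```shell
#!/bin/sh
# Sketch of the argument dispatch in fuzz/run_fuzzer (not the script itself).
# Assumptions: fuzzer_cmd is a hypothetical helper; FUZZ_SECONDS stands in
# for the script's SECONDS variable.

BUILD=${BUILD:-../build}
FUZZER=${BUILD}/fuzz/fuzzer
SEEDS=${BUILD}/fuzz/fuzzer_seeds
FUZZ_SECONDS=${FUZZ_SECONDS:-60}
WORKERS=${WORKERS:-4}

# Print the command line that would be exec'd for the given (optional) argument.
fuzzer_cmd() {
    if [ -z "$1" ]; then
        # No argument: a timed, multi-worker run over the seed directory.
        printf '%s\n' "${FUZZER} -jobs=${WORKERS} -workers=${WORKERS} -max_total_time=${FUZZ_SECONDS} ${SEEDS}"
    else
        # One argument: pass it straight to the fuzzer, e.g. a single seed file.
        printf '%s\n' "${FUZZER} $1"
    fi
}

fuzzer_cmd              # full fuzzing run
fuzzer_cmd crash-input  # single seed file given as argument
```

With no argument the behaviour matches the old script; with one argument the fuzzer is invoked on just that file, which is the case the "Run single seed file when given as argument" commit adds.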