Sync performance improvements and misc. bugfixes from upstream #23

Merged 51 commits on Sep 12, 2023
8a704d7
tests/endids/Makefile: Move SRC definition out of the loop.
silentbicycle Jun 12, 2023
b9b4cde
Move definitions into include/adt/common.h. Add BUILD_FOR_FUZZER.
silentbicycle Jun 13, 2023
8a9298e
Update generated parser code, due to BUILD_FOR_FUZZER change.
silentbicycle Jun 13, 2023
0cb89ef
ast_compile: Remove direct access to fsm->statecount.
silentbicycle Jun 13, 2023
fb5b74d
bugfix: Ensure combine_info is zeroed during early return.
silentbicycle Jun 13, 2023
5148314
bugfix: Eliminate possible write to null pointer.
silentbicycle Jun 13, 2023
656b32f
Missing API comments.
katef May 14, 2023
e6c11fb
Another pass at ABI fixes for rust, and introduce rust for CI.
katef May 14, 2023
64a4bc1
Missing assertion (gcc complains about this but not clang).
katef May 16, 2023
dc0c728
Assert on errno == 0 just to catch any pre-existing missing error han…
katef May 16, 2023
b066001
Missing argument list for __asan_default_options()
katef May 17, 2023
d759723
Update kmkf for -fsanitize=fuzzer-no-link
katef May 17, 2023
2dcca17
Run fuzzing in CI.
katef May 16, 2023
d5c6132
abort() on unreachable codepaths for -DNDEBUG
katef May 19, 2023
d0da06a
Missing header.
katef May 20, 2023
6166469
Add an explicit "default" mode for fuzzing.
katef May 19, 2023
b8212f9
Ignore timeouts when fuzzing under EXPENSIVE_CHECKS.
katef May 20, 2023
e2f5431
Cache seeds between CI runs.
katef May 19, 2023
33cc196
Consolidate `DEBUG` under `EXPENSIVE_CHECKS`, but only for CI.
katef May 19, 2023
af8f8fe
-fork for fuzzing, this ignores OOMs by default.
katef May 20, 2023
89b3e01
Apparently ASAN_OPTIONS= and UBSAN_OPTIONS= do actually both go into …
katef May 20, 2023
784d98c
Update kmkf for DWARF-4 workaround.
katef May 20, 2023
d939bfa
Two attempts to deal with timeouts when fuzzing.
katef May 20, 2023
cf4eae6
Retry on error, workaround for kmkf#14
katef May 20, 2023
0ea03dc
Combine ASAN and UBSAN just to save some time.
katef May 21, 2023
4ee299d
No need to rebuild the entire repo just to test makefiles.
katef May 21, 2023
c732e00
A cheesy attempt at parallelisation.
katef May 20, 2023
dd98636
No need to strdup here.
katef May 22, 2023
f216cb4
Just DEBUG for tests and fuzzing, not EXPENSIVE_CHECKS.
katef Jun 2, 2023
b5d710e
Explicit cache save/restore actions for seeds.
katef Jun 2, 2023
4e5bcce
None of these are valid rust byte escapes.
katef Jun 3, 2023
7a43207
Only `mut` when we modify a variable.
katef Jun 3, 2023
479d103
Mark unused input when there are no fetch instructions.
katef Jun 3, 2023
75bb88d
state_set: Improve `state_set_search` performance, correct result.
silentbicycle Aug 29, 2023
709b8cc
stateset: Avoid memmove of size 0.
silentbicycle Aug 29, 2023
cead0d9
stateset: Add note about potentially expensive assertion.
silentbicycle Aug 29, 2023
cbfeddd
stateset: Comment struct fields.
silentbicycle Aug 29, 2023
c3dab77
edgeset: Fix indentation for `#if`'d block.
silentbicycle Aug 29, 2023
1ada07c
edgeset: Switch from linear to binary searching in edge_set_add_bulk.
silentbicycle Aug 29, 2023
7122d2f
edgeset: Commit to using binary search.
silentbicycle Aug 29, 2023
937585a
determinise: Drastically reduce calls to qsort.
silentbicycle Aug 30, 2023
30e34ef
edgeset: Remove stale comment.
silentbicycle Aug 30, 2023
cf6051f
UBSan: Avoid implicit signed/unsigned conversion.
silentbicycle Aug 30, 2023
c1e1282
UBSan: Avoid implicit signed/unsigned conversion.
silentbicycle Aug 30, 2023
6eff0f9
bugfix: The range is min..max inclusive, so add 1.
silentbicycle Aug 30, 2023
98ee906
Address a couple warnings from scan-build.
silentbicycle Jun 15, 2023
51892e3
Add src/adt/idmap.c, a state -> ID set map.
silentbicycle Feb 16, 2023
7c6644f
Remove theft test harness for deleted ADT (ipriq).
silentbicycle Jul 10, 2023
c646868
Add pcre-anchor test for anchoring edge case.
silentbicycle Jun 2, 2023
0789d61
fuzz/run_fuzzer: Run single seed file when given as argument.
silentbicycle May 31, 2023
1ca3726
Don't purge the seed cache for PRs syncing clones.
katef Sep 12, 2023
181 changes: 167 additions & 14 deletions .github/workflows/ci.yml
@@ -6,6 +6,7 @@ on: [ push, pull_request ]
env:
pcre2: pcre2-10.40
wc: wc
seeds: seeds
build: build
cvtpcre: build/test/retest
prefix: prefix
@@ -70,15 +71,15 @@ jobs:
id: cache-cvtpcre
with:
path: ${{ env.cvtpcre }}
key: cvtpcre-bmake-ubuntu-gcc-DEBUG-ASAN-${{ github.sha }}-${{ env.pcre2 }}
key: cvtpcre-bmake-ubuntu-gcc-DEBUG-AUSAN-${{ github.sha }}-${{ env.pcre2 }}

- name: Fetch build
if: steps.cache-cvtpcre.outputs.cache-hit != 'true'
uses: actions/cache@v3
id: cache-build
with:
path: ${{ env.build }}
key: build-bmake-ubuntu-gcc-DEBUG-ASAN-${{ github.sha }} # arbitrary build, just for cvtpcre
key: build-bmake-ubuntu-gcc-DEBUG-AUSAN-${{ github.sha }} # arbitrary build, just for cvtpcre

- name: Convert PCRE suite
if: steps.cache-cvtpcre.outputs.cache-hit != 'true'
@@ -135,11 +136,11 @@ jobs:
strategy:
fail-fast: true
matrix:
san: [ NO_SANITIZER, ASAN, UBSAN, MSAN, EFENCE ] # NO_SANITIZER=1 is a no-op
san: [ NO_SANITIZER, AUSAN, MSAN, EFENCE, FUZZER ] # NO_SANITIZER=1 is a no-op
os: [ ubuntu ]
cc: [ clang, gcc ]
make: [ bmake ] # we test makefiles separately
debug: [ DEBUG, EXPENSIVE_CHECKS, RELEASE ] # RELEASE=1 is a no-op
debug: [ DEBUG, RELEASE ] # RELEASE=1 is a no-op
exclude:
- os: macos
cc: gcc # it's clang anyway
@@ -149,6 +150,8 @@ jobs:
san: MSAN # not supported
- os: macos
make: pmake # not packaged
- san: FUZZER
cc: gcc # -fsanitize=fuzzer is clang-only

steps:
- name: Fetch checkout
@@ -186,28 +189,44 @@ jobs:
id: cpu-cores

- name: Make
if: steps.cache-build.outputs.cache-hit != 'true'
if: matrix.san != 'FUZZER' && steps.cache-build.outputs.cache-hit != 'true'
run: |
# note: lexer.h first, because parser.? depends on it
find . -name 'lexer.?' -exec touch '{}' \; # workaround for git checkout timestamps
find . -name 'parser.?' -exec touch '{}' \; # workaround for git checkout timestamps
${{ matrix.make }} -r -j $((${{ steps.cpu-cores.outputs.count }} + 1)) -C ${{ env.wc }} BUILD=../${{ env.build }} ${{ matrix.san }}=1 ${{ matrix.debug }}=1 PKGCONF=pkg-config CC=${{ matrix.cc }} NODOC=1

# We aren't building the CLI executables here, just the fuzzer
# matrix.san=FUZZER implies UBSAN and ASAN (but not MSAN, MSAN is incompatible with ASAN)
# XXX: needing to explicitly mkdir here is a makefile bug
- name: Make (Fuzzer)
if: matrix.san == 'FUZZER' && steps.cache-build.outputs.cache-hit != 'true'
run: |
# note: lexer.h first, because parser.? depends on it
find . -name 'lexer.?' -exec touch '{}' \; # workaround for git checkout timestamps
find . -name 'parser.?' -exec touch '{}' \; # workaround for git checkout timestamps
${{ matrix.make }} -r -j $((${{ steps.cpu-cores.outputs.count }} + 1)) -C ${{ env.wc }} BUILD=../${{ env.build }} ${{ matrix.san }}=1 ${{ matrix.debug }}=1 AUSAN=1 PKGCONF=pkg-config CC=${{ matrix.cc }} NODOC=1 mkdir
${{ matrix.make }} -r -j $((${{ steps.cpu-cores.outputs.count }} + 1)) -C ${{ env.wc }} BUILD=../${{ env.build }} ${{ matrix.san }}=1 ${{ matrix.debug }}=1 AUSAN=1 PKGCONF=pkg-config CC=${{ matrix.cc }} NODOC=1 fuzz

# testing different bmake dialects
# the goal here is to exercise the build system, not the code
# we don't care about e.g. different compilers here
#
# I'm including EXPENSIVE_CHECKS here just so we have some coverage
# of the build during CI, even if we don't run that during tests.
test_makefiles:
name: "Test (Makefiles) ${{ matrix.make }} ${{ matrix.os }} ${{ matrix.debug }}"
runs-on: ${{ matrix.os }}-latest
needs: [ checkout ]
needs: [ checkout, build ]

strategy:
fail-fast: false
matrix:
san: [ NO_SANITIZER ] # NO_SANITIZER=1 is a no-op
os: [ ubuntu ]
cc: [ clang ]
make: [ bmake, pmake ]
debug: [ DEBUG, RELEASE ] # RELEASE=1 is a no-op
debug: [ EXPENSIVE_CHECKS, DEBUG, RELEASE ] # RELEASE=1 is a no-op
exclude:
- os: macos
make: pmake # not packaged
@@ -220,6 +239,24 @@ jobs:
path: ${{ env.wc }}
key: checkout-${{ github.sha }}

# An arbitrary build.
- name: Fetch build
uses: actions/cache@v3
id: cache-build
with:
path: ${{ env.build }}
key: build-${{ matrix.make }}-${{ matrix.os }}-${{ matrix.cc }}-${{ matrix.debug }}-${{ matrix.san }}-${{ github.sha }}

# We don't need to build the entire repo to know that the makefiles work,
# I'm just deleting a couple of .o files and rebuilding those instead.
- name: Delete something
if: steps.cache-build.outputs.cache-hit == 'true'
run: find ${{ env.build }} -type f -name '*.o' | sort -r | head -5 | xargs rm

- name: Outdate something
if: steps.cache-build.outputs.cache-hit == 'true'
run: find ${{ env.wc }} -type f -name '*.c' | sort -r | head -5 | xargs touch

- name: Dependencies (Ubuntu)
if: matrix.os == 'ubuntu'
run: |
@@ -251,8 +288,16 @@ jobs:
# Same for lx
run: ${{ matrix.make }} -r -j $((${{ steps.cpu-cores.outputs.count }} + 1)) -C ${{ env.wc }} BUILD=../${{ env.build }} ${{ matrix.debug }}=1 PKGCONF=pkg-config SID='true; echo sid' LX='true; echo lx' CC=${{ matrix.cc }} NODOC=1 test

# there's an unfixed intermittent makefile bug under -j for
# kmkf duplicate install targets, it's not interesting for libfsm's CI,
# so I'm retrying on error here. # github.com/katef/kmkf/issues/14
- name: Install
run: ${{ matrix.make }} -r -j $((${{ steps.cpu-cores.outputs.count }} + 1)) -C ${{ env.wc }} BUILD=../${{ env.build }} ${{ matrix.debug }}=1 PKGCONF=pkg-config PREFIX=../${{ env.prefix }} NODOC=1 install
uses: nick-fields/[email protected]
with:
timeout_seconds: 10 # required, but not a problem for the kmkf bug
max_attempts: 3
retry_on: error
command: ${{ matrix.make }} -r -j $((${{ steps.cpu-cores.outputs.count }} + 1)) -C ${{ env.wc }} BUILD=../${{ env.build }} ${{ matrix.debug }}=1 PKGCONF=pkg-config PREFIX=../${{ env.prefix }} NODOC=1 install

test_san:
name: "Test (Sanitizers) ${{ matrix.san }} ${{ matrix.cc }} ${{ matrix.os }} ${{ matrix.debug }}"
@@ -262,11 +307,11 @@ jobs:
strategy:
fail-fast: false
matrix:
san: [ ASAN, UBSAN, MSAN, EFENCE ]
san: [ AUSAN, MSAN, EFENCE ]
os: [ ubuntu ]
cc: [ clang, gcc ]
make: [ bmake ]
debug: [ DEBUG, EXPENSIVE_CHECKS, RELEASE ] # RELEASE=1 is a no-op
debug: [ DEBUG, RELEASE ] # RELEASE=1 is a no-op
exclude:
- os: macos
cc: gcc # it's clang anyway
@@ -315,6 +360,114 @@ jobs:
# I don't want to build SID just for sake of its -l test
run: ${{ matrix.make }} -r -j $((${{ steps.cpu-cores.outputs.count }} + 1)) -C ${{ env.wc }} BUILD=../${{ env.build }} ${{ matrix.san }}=1 ${{ matrix.debug }}=1 PKGCONF=pkg-config SID='true; echo sid' CC=${{ matrix.cc }} NODOC=1 LX=../${{ env.build }}/bin/lx test

test_fuzz:
name: "Fuzz (mode ${{ matrix.mode }}) ${{ matrix.cc }} ${{ matrix.os }} ${{ matrix.debug }}"
runs-on: ${{ matrix.os }}-latest
timeout-minutes: 5 # this should never be reached, it's a safeguard for bugs in the fuzzer itself
needs: [ build ]

strategy:
fail-fast: false
matrix:
san: [ FUZZER ]
os: [ ubuntu ]
cc: [ clang ]
make: [ bmake ]
debug: [ DEBUG, RELEASE ] # RELEASE=1 is a no-op
mode: [ m, p, d ]
exclude:
- os: macos
cc: gcc # it's clang anyway

steps:
- name: Fetch checkout
uses: actions/cache@v3
id: cache-checkout
with:
path: ${{ env.wc }}
key: checkout-${{ github.sha }}

- name: Dependencies (Ubuntu)
if: matrix.os == 'ubuntu'
run: |
uname -a
sudo apt-get install bmake
${{ matrix.cc }} --version

- name: Dependencies (MacOS)
if: matrix.os == 'macos'
run: |
uname -a
brew update
brew install bmake
${{ matrix.cc }} --version

- name: Fetch build
uses: actions/cache@v3
id: cache-build
with:
path: ${{ env.build }}
key: build-${{ matrix.make }}-${{ matrix.os }}-${{ matrix.cc }}-${{ matrix.debug }}-${{ matrix.san }}-${{ github.sha }}

# note we do the fuzzing unconditionally; each run adds to the corpus.
#
# We only run fuzzing for PRs in the base repo, this prevents attempting
# to purge the seed cache from a PR syncing a forked repo, which fails
# due to a permissions error (I'm unsure why, I think PRs from clones can't
# purge a cache in CI presumably for security/DoS reasons). PRs from clones
# still run fuzzing, just from empty, and do not save their seeds.
- name: Restore seeds (mode ${{ matrix.mode }})
if: github.repository == 'katef/libfsm'
uses: actions/cache/restore@v3
id: cache-seeds
with:
path: ${{ env.seeds }}-${{ matrix.mode }}
key: seeds-${{ matrix.mode }}-${{ matrix.debug }}

- name: mkdir seeds
if: steps.cache-seeds.outputs.cache-hit != 'true'
run: mkdir -p ${{ env.seeds }}-${{ matrix.mode }}

- name: Get number of CPU cores
uses: SimenB/github-actions-cpu-cores@v1
id: cpu-cores

- name: Fuzz
env:
MODE: ${{ env.mode }}
UBSAN_OPTIONS: ASAN_OPTIONS=detect_leaks=0:halt_on_error=1 UBSAN_OPTIONS=print_stacktrace=1:halt_on_error=1
run: ./${{ env.build }}/fuzz/fuzzer -fork=$((${{ steps.cpu-cores.outputs.count }} + 1)) -max_total_time=60 ${{ env.seeds }}-${{ matrix.mode }}

# saving the cache would fail because this key already exists,
# so I'm just explicitly purging by key here.
# the key contains "-${{ matrix.debug }}" so we don't lose seeds
# found from another instance in the matrix when purging.
- name: Purge cached seeds (mode ${{ matrix.mode }}-${{ matrix.debug }})
if: steps.cache-seeds.outputs.cache-hit == 'true'
run: |
set +e
gh extension install actions/gh-actions-cache
gh actions-cache delete ${{ steps.cache-seeds.outputs.cache-primary-key }} -R ${{ github.repository }} --confirm
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

# if: always() means we keep these even on error, so we don't need to re-find
# the same seeds for a given bug.
# The explicit cache/restore and cache/save actions are just for that.
- name: Save seeds (mode ${{ matrix.mode }}-${{ matrix.debug }})
uses: actions/cache/save@v3
if: always()
with:
path: ${{ env.seeds }}-${{ matrix.mode }}
key: ${{ steps.cache-seeds.outputs.cache-primary-key }}

# nothing to do with the caching, I'm uploading the seeds so a developer can grab them to fuzz locally
- name: Upload seeds (mode ${{ matrix.mode }}-${{ matrix.debug }})
uses: actions/upload-artifact@v3
with:
name: seeds-${{ matrix.mode }}-${{ matrix.debug }}
path: ${{ env.seeds }}-${{ matrix.mode }}

test_pcre:
name: "Test (PCRE suite) ${{ matrix.lang }} ${{ matrix.san }} ${{ matrix.cc }} ${{ matrix.os }} ${{ matrix.debug }}"
runs-on: ${{ matrix.os }}-latest
@@ -323,12 +476,12 @@ jobs:
strategy:
fail-fast: false
matrix:
san: [ ASAN, UBSAN, MSAN, EFENCE ]
san: [ AUSAN, MSAN, EFENCE ]
os: [ ubuntu ]
cc: [ clang, gcc ]
make: [ bmake ]
debug: [ DEBUG, EXPENSIVE_CHECKS, RELEASE ] # RELEASE=1 is a no-op
lang: [ "vm -x v1", "vm -x v2", asm, c, vmc, vmops, go, goasm ]
debug: [ DEBUG, RELEASE ] # RELEASE=1 is a no-op
lang: [ "vm -x v1", "vm -x v2", asm, c, rust, vmc, vmops, go, goasm ]
exclude:
- os: macos
cc: gcc # it's clang anyway
@@ -371,7 +524,7 @@ jobs:
id: cache-cvtpcre
with:
path: ${{ env.cvtpcre }}
key: cvtpcre-bmake-ubuntu-gcc-DEBUG-ASAN-${{ github.sha }}-${{ env.pcre2 }}
key: cvtpcre-bmake-ubuntu-gcc-DEBUG-AUSAN-${{ github.sha }}-${{ env.pcre2 }}

- name: Run PCRE suite (${{ matrix.lang }})
run: CC=${{ matrix.cc }} ./${{ env.build }}/bin/retest -O1 -l ${{ matrix.lang }} ${{ env.cvtpcre }}/*.tst
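To reproduce the `Fuzz` step from the `test_fuzz` job outside CI, the command line can be assembled roughly as follows. This is a minimal sketch, assuming the fuzzer was built to `build/fuzz/fuzzer` and that seeds live in `seeds-<mode>` as in the workflow. The `fuzz_cmd` helper and its arguments are hypothetical and not part of this PR; it only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): print the libFuzzer command line
# that the CI "Fuzz" step would run for a given core count and mode.
fuzz_cmd() {
    cores=$1   # number of CPU cores, e.g. taken from nproc
    mode=$2    # one of the matrix modes: m, p, d
    # CI uses one more fork than there are cores, and a 60 second budget.
    printf './build/fuzz/fuzzer -fork=%d -max_total_time=60 seeds-%s\n' \
        "$((cores + 1))" "$mode"
}

fuzz_cmd 4 m
```

The workflow additionally sets sanitizer options through the environment for this step; those are left out here to keep the sketch small.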
17 changes: 17 additions & 0 deletions Makefile
@@ -15,6 +15,9 @@ BUILD_IMPOSSIBLE="attempting to use .OBJDIR other than .CURDIR"

# targets
all:: mkdir .WAIT dep .WAIT lib prog
.if make(fuzz)
fuzz:: mkdir
.endif
doc:: mkdir
dep::
gen::
@@ -31,6 +34,19 @@ RE ?= re
BUILD ?= build
PREFIX ?= /usr/local

# libfsm has EXPENSIVE_CHECKS which are a superset of assertions;
# this is here just so CI can only set one flag at a time.
.if defined(EXPENSIVE_CHECKS)
CFLAGS += -DEXPENSIVE_CHECKS
DEBUG ?= 1
.endif

# combined just to save time in CI
.if defined(AUSAN)
ASAN ?= 1
UBSAN ?= 1
.endif

# ${unix} is an arbitrary variable set by sys.mk
.if defined(unix)
.BEGIN::
@@ -92,6 +108,7 @@ SUBDIR += src
SUBDIR += tests/capture
SUBDIR += tests/complement
SUBDIR += tests/gen
SUBDIR += tests/idmap
SUBDIR += tests/intersect
#SUBDIR += tests/ir # XXX: fragile due to state numbering
SUBDIR += tests/eclosure
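The `AUSAN` block in this Makefile change reads as: when `AUSAN` is defined, `ASAN` and `UBSAN` each default to 1 unless they were already set. A rough shell analogy of that behaviour is sketched below. It is illustrative only, the `apply_ausan` function is hypothetical, and shell's `${VAR=1}` is merely close to make's `?=`, not identical to it.

```shell
#!/bin/sh
# Illustrative analogy only: the real logic lives in the Makefile.
# ${VAR=1} assigns only when VAR is unset, which is close to make's `?=`.
apply_ausan() {
    # mirrors `.if defined(AUSAN)`: true when AUSAN is set, even if empty
    if [ "${AUSAN+set}" = set ]; then
        : "${ASAN=1}"
        : "${UBSAN=1}"
    fi
}

unset ASAN UBSAN
AUSAN=1
apply_ausan
echo "ASAN=${ASAN} UBSAN=${UBSAN}"
```

In the CI workflow this corresponds to passing `AUSAN=1` on the make command line, which saves a matrix entry compared with building ASAN and UBSAN separately.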
10 changes: 8 additions & 2 deletions fuzz/run_fuzzer
@@ -4,6 +4,8 @@ BUILD=../build
FUZZER=${BUILD}/fuzz/fuzzer
SEEDS=${BUILD}/fuzz/fuzzer_seeds

ARG=$1

SECONDS=${SECONDS:-60}
WORKERS=${WORKERS:-4}
SEEDS=${SEEDS:-seeds}
@@ -25,5 +27,9 @@ if [ ! -d "${SEEDS}" ]; then
mkdir -p "${SEEDS}"
fi

echo "\n==== ${FUZZER}"
${FUZZER} -jobs=${WORKERS} -workers=${WORKERS} -max_total_time=${SECONDS} ${SEEDS}
if [ -z "${ARG}" ]; then
echo "\n==== ${FUZZER}"
exec ${FUZZER} -jobs=${WORKERS} -workers=${WORKERS} -max_total_time=${SECONDS} ${SEEDS}
else
exec ${FUZZER} ${ARG}
fi
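The dispatch added to `fuzz/run_fuzzer` can be sketched in isolation as follows. This is a minimal sketch, not the script itself: the fuzzer is replaced by a hypothetical `fuzzer_args` function that only prints the arguments it would receive, the values for workers, seconds and the seed directory are stand-ins, and `crash-input` is a made-up file name.

```shell
#!/bin/sh
# Sketch of the argument dispatch in fuzz/run_fuzzer. Nothing is executed:
# the function prints the arguments the fuzzer would be started with.
fuzzer_args() {
    arg=$1
    workers=4; seconds=60; seeds=seeds   # hypothetical stand-ins for the script's settings
    if [ -z "$arg" ]; then
        # no argument: a full fuzzing run over the seed directory
        echo "-jobs=$workers -workers=$workers -max_total_time=$seconds $seeds"
    else
        # argument given: pass just that one seed file to the fuzzer
        echo "$arg"
    fi
}

fuzzer_args ""
fuzzer_args crash-input
```

Running a single seed file this way is convenient for replaying one input, for example a seed downloaded from the CI artifacts, without starting a full fuzzing session.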