Improved Memory Operations #174

Merged 38 commits on Oct 12, 2024

Commits on Sep 27, 2024

  1. d04993d
  2. Docs: Weaknesses of LibC

    ashvardanian committed Sep 27, 2024
    172bf93
  3. Fix: Missing, but documented partition(':')

    Closes #172

    Co-authored-by: Takuya Hashimoto <[email protected]>
    ashvardanian and toge committed Sep 27, 2024
    432fb3d
  4. a0e9be7
  5. Docs: Avoid AppleClang

    ashvardanian committed Sep 27, 2024
    b5fcc62
  6. ee6f754
  7. 97cf753
  8. 224a3a0
  9. 97535bc
  10. e4e138c

Commits on Sep 28, 2024

  1. Improve: Faster memcpy in AVX-512

    Throughput on the Leipzig1M dataset, LibC vs SZ:

    ~ 128-byte lines, aligned: 2.3 vs 2.6 GB/s
    ~ 128-byte lines, unaligned: 2.34 vs 2.53 GB/s
    ~ 5-byte tokens, aligned: 0.1 vs 0.1 GB/s
    ~ 5-byte tokens, unaligned: 0.1 vs 0.1 GB/s
    ~ 124 MB, aligned: 19.6 vs 20.3 GB/s
    ~ 124 MB, unaligned: 19.6 vs 20.3 GB/s
    ashvardanian committed Sep 28, 2024
    affebc0
  2. a265d3b

Commits on Sep 30, 2024

  1. 36df73d
  2. Make: Lighter debugging in VS Code

    Previously SZ would build too many
    targets for each debugging session.
    ashvardanian committed Sep 30, 2024
    5d522cf
  3. 5388ab4

Commits on Oct 1, 2024

  1. Add: Faster memory ops in AVX2

    This commit accelerates `sz_fill_avx2` and
    `sz_copy_avx2` by avoiding unaligned writes.

    It adds `sz_equal_avx2` to help validate large
    files with matching checksums faster, and a
    placeholder for `sz_order_avx2`, discouraging
    further optimizations.

    A C++ API with a matching argument order was added
    to mimic `std::memcpy`, `std::memset`, `std::memmove`.

    The matching `test_memory_utilities` tests were extended.
    ashvardanian committed Oct 1, 2024
    cef29c9
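
The "Add: Faster memory ops in AVX2" commit above attributes the speedup to avoiding unaligned writes. The idea can be sketched in portable C, assuming 8-byte words stand in for 32-byte AVX2 registers. The name `fill_aligned` is hypothetical and is not part of the StringZilla API; the actual `sz_fill_avx2` kernel may be organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not a StringZilla function: fills `n` bytes at `dst`
 * with `value`, issuing wide stores only at 8-byte-aligned addresses. */
static void fill_aligned(void *dst, unsigned char value, size_t n) {
    assert(dst != NULL || n == 0);
    unsigned char *p = (unsigned char *)dst;

    /* Head: byte stores until `p` reaches an 8-byte boundary. */
    while (n > 0 && ((uintptr_t)p % 8) != 0) {
        *p++ = value;
        --n;
    }

    /* Body: one 8-byte store per iteration, always to an aligned address.
     * The multiplication broadcasts `value` into every byte of the word. */
    uint64_t word = 0x0101010101010101ull * value;
    for (; n >= 8; p += 8, n -= 8) memcpy(p, &word, sizeof(word));

    /* Tail: the remaining 0..7 bytes. */
    while (n > 0) {
        *p++ = value;
        --n;
    }
}
```

A SIMD version would follow the same head, body, tail split, with the body using aligned vector stores instead of 8-byte words.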

Commits on Oct 3, 2024

  1. 6d326d9
  2. a383e9e
  3. 69060ac
  4. Improve: Using more registers for small moves

    In AVX-512, similar to GLibC, we should use the
    register space to load more data simultaneously,
    avoiding loops and data dependencies between
    iterations.
    ashvardanian committed Oct 3, 2024
    696797d
  5. e2f8cc7
  6. Add: SVE kernels

    ashvardanian committed Oct 3, 2024
    02b9d68
  7. bba72a6
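
The "Improve: Using more registers for small moves" commit above describes loading more data up front instead of looping. A minimal portable sketch of that pattern follows, assuming two 16-byte temporaries stand in for vector registers. The name `move_16_to_32` is hypothetical and is not taken from StringZilla or GLibC.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a StringZilla function: moves `n` bytes,
 * 16 <= n <= 32, between possibly overlapping buffers without a loop. */
static void move_16_to_32(void *dst, const void *src, size_t n) {
    assert(n >= 16 && n <= 32);
    unsigned char head[16], tail[16];

    /* Load phase: both chunks are read before anything is written.
     * The chunks overlap in the middle whenever n < 32. */
    memcpy(head, src, 16);
    memcpy(tail, (const unsigned char *)src + n - 16, 16);

    /* Store phase: the source is no longer needed, so overlap is harmless. */
    memcpy(dst, head, 16);
    memcpy((unsigned char *)dst + n - 16, tail, 16);
}
```

Because every load completes before the first store, the same code path serves both forward and backward overlapping moves, and the two loads do not depend on each other.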

Commits on Oct 11, 2024

  1. Add: Look-Up Table transforms

    The new `sz_look_up_transform` API applies a
    256-byte lookup table, implemented in serial code
    and AVX-512, and can significantly accelerate text
    and image processing.

    The AVX-512 implementation reaches 18 GB/s on an
    Intel Sapphire Rapids CPU, while serial code stays
    around 3 GB/s for large files.
    ashvardanian committed Oct 11, 2024
    850e4e8
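
The serial variant of the look-up table transform described in the commit above can be sketched as a byte-by-byte table substitution. The names `lut_transform` and `make_ascii_upper_lut` are hypothetical, and the argument order is an assumption; the real `sz_look_up_transform` signature may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical serial helper; the real `sz_look_up_transform` signature may differ.
 * Writes lut[src[i]] into dst[i] for every i < n. `dst` may alias `src`. */
static void lut_transform(const uint8_t *src, size_t n, const uint8_t lut[256], uint8_t *dst) {
    assert(lut != NULL);
    for (size_t i = 0; i < n; ++i) dst[i] = lut[src[i]];
}

/* Builds a table that upper-cases ASCII letters and keeps other bytes as-is. */
static void make_ascii_upper_lut(uint8_t lut[256]) {
    for (int c = 0; c < 256; ++c)
        lut[c] = (uint8_t)((c >= 'a' && c <= 'z') ? c - ('a' - 'A') : c);
}
```

The same loop covers other byte-wise mappings, such as case folding for text or intensity curves for 8-bit image channels, by swapping in a different 256-entry table.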

Commits on Oct 12, 2024

  1. 014bcf2
  2. 26a0fea
  3. 423ad99
  4. Fix: NEON cast

    ashvardanian committed Oct 12, 2024
    11272e5
  5. 4d8ac78
  6. 82146b0
  7. 165986f
  8. Fix: sz_move_neon order

    ashvardanian committed Oct 12, 2024
    be6c93b
  9. 3898481
  10. 1baa3a9
  11. 1db702a
  12. 5c1426f
  13. Fix: Reorder tests

    ashvardanian committed Oct 12, 2024
    78937f9
  14. c0c1dcb