Move math types over to nalgebra with Generic dimensions > 2 and f32/f64 support #96

marstaik · 2023-04-22T07:11:09Z

The BVH should now work (in theory) in 2D, 3D, and 4D+ with both f32 and f64, using the base nalgebra types.

Performance is consistent, and most code paths get vectorized (SIMD). On my older first-gen Threadripper, I see performance improvements of up to 10% across the board. Results that match or come in slightly higher are usually within the margin of error.

Axis and some re-exports were simply deleted as they aren't needed when using the nalgebra types.

The only outliers are:

- bench_intersects_aabb, which runs a bit slower than the fastest pre-nalgebra variant.
  I wrote this to use pure SIMD, but it doesn't optimize as well as it should. I should be able to make this much faster with some manual SIMD later.
- bench_optimize_bvh_*: the optimization benchmarks run over 200% faster and blow the old optimize code out of the water. It is actually faster to optimize now than to rebuild.
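For readers unfamiliar with what bench_intersects_aabb measures, here is a minimal sketch of a dimension-generic ray/AABB slab test. It is not the PR's implementation: the `Aabb` and `Ray` types below are hypothetical stand-ins that use plain arrays and const generics instead of nalgebra's vector types, and the scalar type is fixed to `f32` for brevity.

```rust
/// Axis-aligned bounding box in `D` dimensions (hypothetical stand-in type).
#[derive(Clone, Copy, Debug)]
struct Aabb<const D: usize> {
    min: [f32; D],
    max: [f32; D],
}

/// Ray that stores the reciprocal of its direction (hypothetical stand-in type).
#[derive(Clone, Copy, Debug)]
struct Ray<const D: usize> {
    origin: [f32; D],
    inv_dir: [f32; D],
}

impl<const D: usize> Ray<D> {
    fn new(origin: [f32; D], dir: [f32; D]) -> Self {
        let mut inv_dir = [0.0; D];
        for i in 0..D {
            // Becomes +/- infinity for axis-parallel rays, which the slab test tolerates.
            inv_dir[i] = 1.0 / dir[i];
        }
        Ray { origin, inv_dir }
    }

    /// Slab test: intersect the per-axis entry/exit intervals along the ray.
    fn intersects_aabb(&self, aabb: &Aabb<D>) -> bool {
        let mut t_enter = 0.0_f32; // the ray starts at t = 0
        let mut t_exit = f32::INFINITY;
        for i in 0..D {
            let t0 = (aabb.min[i] - self.origin[i]) * self.inv_dir[i];
            let t1 = (aabb.max[i] - self.origin[i]) * self.inv_dir[i];
            t_enter = t_enter.max(t0.min(t1));
            t_exit = t_exit.min(t0.max(t1));
        }
        t_enter <= t_exit
    }
}
```

The same function body serves 2D, 3D, and 4D because the loop bound is the const parameter `D`; a SIMD version would replace the per-axis loop with lane-wise min/max operations.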

test	glam	nalgebra
test bvh::bvh_impl::bench::bench_build_1200_triangles_bvh	821,735 ns/iter (+/- 67,897)	771,730 ns/iter (+/- 21,811)
test bvh::bvh_impl::bench::bench_build_120k_triangles_bvh	103,104,360 ns/iter (+/- 4,907,146)	101,412,600 ns/iter (+/- 4,414,230)
test bvh::bvh_impl::bench::bench_build_12k_triangles_bvh	8,944,690 ns/iter (+/- 267,942)	8,685,235 ns/iter (+/- 131,969)
test bvh::bvh_impl::bench::bench_build_sponza_bvh	79,784,630 ns/iter (+/- 2,508,824)	78,247,340 ns/iter (+/- 1,407,163)
test bvh::bvh_impl::bench::bench_intersect_1200_triangles_bvh	130 ns/iter (+/- 2)	149 ns/iter (+/- 6)
test bvh::bvh_impl::bench::bench_intersect_120k_triangles_bvh	765 ns/iter (+/- 14)	863 ns/iter (+/- 12)
test bvh::bvh_impl::bench::bench_intersect_12k_triangles_bvh	321 ns/iter (+/- 6)	371 ns/iter (+/- 5)
test bvh::bvh_impl::bench::bench_intersect_sponza_bvh	2,289 ns/iter (+/- 64)	1,905 ns/iter (+/- 36)
test bvh::iter::bench::bench_intersect_128rays_sponza_iter	159,109 ns/iter (+/- 4,056)	167,965 ns/iter (+/- 3,013)
test bvh::iter::bench::bench_intersect_128rays_sponza_vec	293,305 ns/iter (+/- 7,316)	244,404 ns/iter (+/- 4,788)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_00p	764 ns/iter (+/- 12)	864 ns/iter (+/- 10)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_01p	128,246 ns/iter (+/- 10,879)	145,156 ns/iter (+/- 11,400)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_10p	1,634,400 ns/iter (+/- 387,476)	1,761,105 ns/iter (+/- 425,368)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_50p	2,234,812 ns/iter (+/- 440,313)	2,395,035 ns/iter (+/- 498,051)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_00p	764 ns/iter (+/- 19)	862 ns/iter (+/- 7)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_01p	824 ns/iter (+/- 29)	928 ns/iter (+/- 21)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_10p	1,843 ns/iter (+/- 62)	2,000 ns/iter (+/- 39)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_50p	2,128 ns/iter (+/- 46)	2,282 ns/iter (+/- 77)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_00p	1,613 ns/iter (+/- 51)	1,392 ns/iter (+/- 25)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_01p	2,495 ns/iter (+/- 113)	3,051 ns/iter (+/- 58)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_10p	3,781 ns/iter (+/- 153)	4,421 ns/iter (+/- 180)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_50p	5,534 ns/iter (+/- 394)	6,352 ns/iter (+/- 220)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_00p	1,609 ns/iter (+/- 36)	1,390 ns/iter (+/- 26)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_01p	1,729 ns/iter (+/- 36)	1,527 ns/iter (+/- 26)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_10p	2,074 ns/iter (+/- 144)	1,948 ns/iter (+/- 44)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_50p	2,665 ns/iter (+/- 71)	2,578 ns/iter (+/- 51)
test bvh::optimization::bench::bench_optimize_bvh_120k_00p	1,090,720 ns/iter (+/- 23,355)	1,164,270 ns/iter (+/- 79,435)
test bvh::optimization::bench::bench_optimize_bvh_120k_01p	7,373,650 ns/iter (+/- 1,720,855)	2,237,075 ns/iter (+/- 218,183)
test bvh::optimization::bench::bench_optimize_bvh_120k_10p	40,201,330 ns/iter (+/- 14,485,239)	12,243,750 ns/iter (+/- 1,361,338)
test bvh::optimization::bench::bench_optimize_bvh_120k_50p	222,987,100 ns/iter (+/- 68,341,041)	55,583,600 ns/iter (+/- 13,194,002)
test bvh::optimization::bench::bench_randomize_120k_50p	6,262,345 ns/iter (+/- 360,555)	5,717,540 ns/iter (+/- 229,427)
test flat_bvh::bench::bench_build_1200_triangles_flat_bvh	676,387 ns/iter (+/- 16,147)	658,178 ns/iter (+/- 16,926)
test flat_bvh::bench::bench_build_120k_triangles_flat_bvh	97,173,840 ns/iter (+/- 6,132,352)	93,301,800 ns/iter (+/- 3,916,466)
test flat_bvh::bench::bench_build_12k_triangles_flat_bvh	9,569,305 ns/iter (+/- 447,199)	9,162,365 ns/iter (+/- 242,592)
test flat_bvh::bench::bench_flatten_120k_triangles_bvh	7,332,465 ns/iter (+/- 984,515)	7,517,620 ns/iter (+/- 537,332)
test flat_bvh::bench::bench_intersect_1200_triangles_flat_bvh	147 ns/iter (+/- 2)	179 ns/iter (+/- 2)
test flat_bvh::bench::bench_intersect_120k_triangles_flat_bvh	890 ns/iter (+/- 31)	1,121 ns/iter (+/- 21)
test flat_bvh::bench::bench_intersect_12k_triangles_flat_bvh	364 ns/iter (+/- 3)	473 ns/iter (+/- 6)
test ray::bench::bench_intersects_aabb	2,897 ns/iter (+/- 46)	4,475 ns/iter (+/- 23)
test ray::bench::bench_intersects_aabb_branchless	3,469 ns/iter (+/- 40)	-
test ray::bench::bench_intersects_aabb_naive	6,037 ns/iter (+/- 84)	-
test testbase::bench_intersect_120k_triangles_list	7 ns/iter (+/- 0)	7 ns/iter (+/- 0)
test testbase::bench_intersect_120k_triangles_list_aabb	7 ns/iter (+/- 0)	7 ns/iter (+/- 0)
test testbase::bench_intersect_sponza_list	7 ns/iter (+/- 0)	7 ns/iter (+/- 0)
test testbase::bench_intersect_sponza_list_aabb	7 ns/iter (+/- 0)	7 ns/iter (+/- 0)

marstaik · 2023-04-22T19:23:12Z

Added the f32x3 optimization. It can quite easily be extended to also handle f32x2/3/4 and f64x2/3/4, which I will do later.

Some new bench numbers:

test	nalgebra
test bvh::bvh_impl::bench::bench_build_1200_triangles_bvh	777,386 ns/iter (+/- 22,640)
test bvh::bvh_impl::bench::bench_build_120k_triangles_bvh	99,323,460 ns/iter (+/- 1,955,579)
test bvh::bvh_impl::bench::bench_build_12k_triangles_bvh	8,601,280 ns/iter (+/- 160,429)
test bvh::bvh_impl::bench::bench_build_sponza_bvh	79,983,790 ns/iter (+/- 1,201,449)
test bvh::bvh_impl::bench::bench_intersect_1200_triangles_bvh	133 ns/iter (+/- 1)
test bvh::bvh_impl::bench::bench_intersect_120k_triangles_bvh	802 ns/iter (+/- 7)
test bvh::bvh_impl::bench::bench_intersect_12k_triangles_bvh	335 ns/iter (+/- 5)
test bvh::bvh_impl::bench::bench_intersect_sponza_bvh	1,403 ns/iter (+/- 24)
test bvh::iter::bench::bench_intersect_128rays_sponza_iter	156,681 ns/iter (+/- 3,499)
test bvh::iter::bench::bench_intersect_128rays_sponza_vec	179,903 ns/iter (+/- 2,719)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_00p	802 ns/iter (+/- 8)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_01p	134,032 ns/iter (+/- 10,309)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_10p	1,606,707 ns/iter (+/- 389,896)
test bvh::optimization::bench::bench_intersect_120k_after_optimize_50p	2,200,720 ns/iter (+/- 453,999)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_00p	801 ns/iter (+/- 8)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_01p	865 ns/iter (+/- 8)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_10p	1,887 ns/iter (+/- 26)
test bvh::optimization::bench::bench_intersect_120k_with_rebuild_50p	2,157 ns/iter (+/- 29)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_00p	1,297 ns/iter (+/- 21)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_01p	2,655 ns/iter (+/- 56)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_10p	3,847 ns/iter (+/- 114)
test bvh::optimization::bench::bench_intersect_sponza_after_optimize_50p	5,992 ns/iter (+/- 426)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_00p	1,300 ns/iter (+/- 20)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_01p	1,437 ns/iter (+/- 33)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_10p	1,814 ns/iter (+/- 29)
test bvh::optimization::bench::bench_intersect_sponza_with_rebuild_50p	2,370 ns/iter (+/- 48)
test bvh::optimization::bench::bench_optimize_bvh_120k_00p	1,158,720 ns/iter (+/- 7,658)
test bvh::optimization::bench::bench_optimize_bvh_120k_01p	2,232,760 ns/iter (+/- 89,304)
test bvh::optimization::bench::bench_optimize_bvh_120k_10p	12,362,530 ns/iter (+/- 1,015,947)
test bvh::optimization::bench::bench_optimize_bvh_120k_50p	58,228,630 ns/iter (+/- 11,172,616)
test bvh::optimization::bench::bench_randomize_120k_50p	5,814,270 ns/iter (+/- 232,303)
test flat_bvh::bench::bench_build_1200_triangles_flat_bvh	790,160 ns/iter (+/- 13,313)
test flat_bvh::bench::bench_build_120k_triangles_flat_bvh	104,565,130 ns/iter (+/- 4,193,432)
test flat_bvh::bench::bench_build_12k_triangles_flat_bvh	10,120,240 ns/iter (+/- 224,187)
test flat_bvh::bench::bench_flatten_120k_triangles_bvh	6,722,320 ns/iter (+/- 1,746,270)
test flat_bvh::bench::bench_intersect_1200_triangles_flat_bvh	131 ns/iter (+/- 1)
test flat_bvh::bench::bench_intersect_120k_triangles_flat_bvh	889 ns/iter (+/- 15)
test flat_bvh::bench::bench_intersect_12k_triangles_flat_bvh	354 ns/iter (+/- 4)
test ray::bench::bench_intersects_aabb	2,560 ns/iter (+/- 9)
test testbase::bench_intersect_120k_triangles_list	7 ns/iter (+/- 0)
test testbase::bench_intersect_120k_triangles_list_aabb	7 ns/iter (+/- 0)
test testbase::bench_intersect_sponza_list	7 ns/iter (+/- 0)
test testbase::bench_intersect_sponza_list_aabb	7 ns/iter (+/- 0)

svenstaro

Hey, great stuff. There appears to be some dead code that should be handled.

src/ray.rs

svenstaro · 2023-04-23T00:24:14Z

src/testbase.rs

 use crate::bounding_hierarchy::{BHShape, BoundingHierarchy};
-use crate::ray::Ray;
+
+// TODO These all need to be realtyped and bounded


Is this TODO current?

Yeah, I plan to expand the unit tests later to handle 2d and 4d cases

Would you like to do that in this PR?

@marstaik ping

src/ray.rs

svenstaro · 2023-04-23T00:30:42Z

src/ray.rs

+#[inline(always)]
+#[cfg(target_arch = "x86_64")]
+fn vec_to_mm(vec: &SVector<f32, 3>) -> __m128 {
+    unsafe { _mm_set_ps(vec.z, vec.z, vec.y, vec.x) }


Really digging all this custom SIMD but I'm wondering whether perhaps packed_simd might be the way forward?

packed_simd says on the readme that std::simd is the way forward

Hm true. I didn't know there's still no good stabilized SIMD stuff in Rust. What do you think instead about wide?

I'll look into it, and maybe try Simba (what nalgebra uses under the hood) again. But with Simba, performance was orders of magnitude worse than with the pure SIMD, and I'm unsure why. I was having difficulty looking into it with cargo asm due to Windows linking issues.

There's also simdeez which might be worth checking out.
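As background for the `vec_to_mm` hunk discussed in this thread, here is a small sketch of the same packing idea using only `core::arch::x86_64`. The input is a plain `[f32; 3]` rather than nalgebra's `SVector`, and `component_min_max` is a hypothetical helper added purely for illustration. Note that `_mm_set_ps` takes its arguments from the highest lane to the lowest, which is why `z` appears twice: it pads the unused fourth lane.

```rust
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::{__m128, _mm_max_ps, _mm_min_ps, _mm_set_ps, _mm_storeu_ps};

/// Packs a 3-vector into a 128-bit register as lanes (x, y, z, z).
#[cfg(target_arch = "x86_64")]
#[inline(always)]
fn vec_to_mm(v: [f32; 3]) -> __m128 {
    // SAFETY: SSE is part of the x86_64 baseline, so this intrinsic is always available.
    unsafe { _mm_set_ps(v[2], v[2], v[1], v[0]) }
}

/// Component-wise minimum and maximum of two 3-vectors (hypothetical helper).
#[cfg(target_arch = "x86_64")]
fn component_min_max(a: [f32; 3], b: [f32; 3]) -> ([f32; 3], [f32; 3]) {
    let mut lo = [0.0_f32; 4];
    let mut hi = [0.0_f32; 4];
    // SAFETY: both buffers hold four f32 values, and the store is unaligned.
    unsafe {
        let (va, vb) = (vec_to_mm(a), vec_to_mm(b));
        _mm_storeu_ps(lo.as_mut_ptr(), _mm_min_ps(va, vb));
        _mm_storeu_ps(hi.as_mut_ptr(), _mm_max_ps(va, vb));
    }
    // Drop the padding lane again.
    ([lo[0], lo[1], lo[2]], [hi[0], hi[1], hi[2]])
}

/// Scalar fallback so the sketch also builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn component_min_max(a: [f32; 3], b: [f32; 3]) -> ([f32; 3], [f32; 3]) {
    let mut lo = [0.0_f32; 3];
    let mut hi = [0.0_f32; 3];
    for i in 0..3 {
        lo[i] = a[i].min(b[i]);
        hi[i] = a[i].max(b[i]);
    }
    (lo, hi)
}
```

Duplicating `z` into the padding lane keeps lane-wise min/max results well defined without affecting the three lanes that are read back.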

src/lib.rs

svenstaro · 2023-04-23T00:32:33Z

src/aabb.rs

+    /// use bvh::aabb::AABB;
+    ///
+    /// # fn main() {
+    /// let aabb :AABB<f32,3> = AABB::infinite();


Some dead code here.

svenstaro · 2023-04-23T00:49:00Z

@marstaik Are you likely to make more large contributions? I could give you contributor access if you're interested in helping with development and maintenance, but I couldn't find your email or Matrix for further communication. If you're interested, it might be good to talk first.

marstaik · 2023-04-24T03:49:25Z

@svenstaro I can use SIMD with pure core::arch::x86_64, as nothing else really works (simba, wide, and most other libraries don't give me all of the intrinsics needed for good performance, such as shuffle).

The issue I am having is that I still need the specialization feature from nightly to be safely able to handle all cases and dimensions. I am currently trying to find a way to do this without, but it seems very convoluted.

I have a generic ray_intersects_bvh for any Vector size, but I need to manually specialize for the lower dimension cases, as well as for when x86_64 is available.
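To illustrate the kind of per-dimension special-casing described here, the sketch below shows one way to do it on stable Rust inside a single const-generic function. This is an assumption-laden illustration, not the approach taken in the PR (which relies on nightly specialization): `max_component` is a hypothetical function, and matching on the slice length only helps when every case can live in one function body rather than in separate trait impls.

```rust
/// Largest component of a `D`-dimensional vector (hypothetical example function).
///
/// The slice patterns act as hand-written fast paths for 2, 3, and 4 dimensions;
/// every other dimension falls through to the generic loop. Because the length is
/// a compile-time constant in each monomorphized copy, the compiler can usually
/// discard the arms that do not apply.
fn max_component<const D: usize>(v: &[f32; D]) -> f32 {
    match v.as_slice() {
        &[x, y] => x.max(y),
        &[x, y, z] => x.max(y).max(z),
        &[x, y, z, w] => x.max(y).max(z.max(w)),
        _ => v.iter().copied().fold(f32::NEG_INFINITY, f32::max),
    }
}
```

For example, `max_component(&[1.0, -3.0, 2.0])` takes the three-element arm, while a five-element array uses the fold. This pattern cannot select a different implementation based on the scalar type, which is one reason specialization remains attractive for the f32/f64 split.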

svenstaro · 2023-04-24T08:18:27Z

Well realistically people will mostly be using this on x86_64. However, even if we make good performance nightly only for the time being, that should be put behind a feature flag so people can still choose to use this on stable.

svenstaro · 2023-04-24T12:54:26Z

I'm not sure it was clear earlier: If we have to use nightly for the speedy stuff anyway, then might as well use the most convenient solution to achieve that so you don't necessarily have to switch to core.

marstaik · 2023-04-25T00:34:05Z

I ended up sticking with core::arch as it was just way easier to use the native intrinsics than every library's attempt at a custom wrapper for SIMD...

I added a feature flag called "full_simd" that only works with nightly enabled, allowing for specialization of the fast vectors.

I would like to add some benches and tests for 2d and 4d as well.
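A rough sketch of how a Cargo feature such as "full_simd" can gate an intrinsics path while keeping a portable fallback is shown below. The `dot` and `dot_scalar` functions are hypothetical and not taken from the PR; the sketch also does not use specialization, so unlike the PR's feature it would not need nightly.

```rust
/// Portable scalar implementation, always compiled.
fn dot_scalar(a: &[f32; 4], b: &[f32; 4]) -> f32 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

/// SSE path, compiled only when the feature is enabled on x86_64.
#[cfg(all(feature = "full_simd", target_arch = "x86_64"))]
fn dot(a: &[f32; 4], b: &[f32; 4]) -> f32 {
    use core::arch::x86_64::{_mm_loadu_ps, _mm_mul_ps, _mm_storeu_ps};
    let mut prod = [0.0_f32; 4];
    // SAFETY: both inputs and `prod` hold four f32 values; loads and stores are unaligned.
    unsafe {
        let p = _mm_mul_ps(_mm_loadu_ps(a.as_ptr()), _mm_loadu_ps(b.as_ptr()));
        _mm_storeu_ps(prod.as_mut_ptr(), p);
    }
    prod.iter().sum()
}

/// Fallback used on stable builds without the feature, or on other architectures.
#[cfg(not(all(feature = "full_simd", target_arch = "x86_64")))]
fn dot(a: &[f32; 4], b: &[f32; 4]) -> f32 {
    dot_scalar(a, b)
}
```

Callers only ever see `dot`, so enabling or disabling the feature changes the implementation without changing the public surface.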

svenstaro · 2023-04-25T01:49:49Z

Please make sure to change the CI so it runs twice: Once with and once without the full_simd feature. Also, why not just call it simd?

svenstaro · 2023-04-25T01:48:08Z

Cargo.toml

@@ -30,7 +30,7 @@ doc-comment = "0.3"

 [features]
 bench = []
-serde = ["dep:serde", "glam/serde"]
+full_simd = []


Make sure to document this in the README.

marstaik · 2023-04-27T06:41:01Z

Hey, I fixed the CI builds up a bit and added the SIMD builds via nightly only. The maintainer of the script for the rust environment disappeared and CI complained node was out of date, so I just overhauled it a bit.

I fixed all of the clippy errors, and all that was left were errors about BVH and AABB naming conventions so I also renamed BVH, AABB, BVHNode to Bvh, Aabb, BvhNode, and made the "Testing" versions start with a "T" to see how it would look. I kind of like it, but if you don't want to do a breaking change name wise I can revert those changes, and we can ignore on clippy again.

I fixed a couple of other issues clippy complained about as well.

marstaik · 2023-04-27T06:43:18Z

Also, the README probably needs an update to reflect new data about rebuilding times versus building the tree from scratch again, because the rebuild numbers have drastically changed. That and the generics/testing of f32/f64 N dimensions would be nice to get done in a separate commit.

svenstaro · 2023-04-27T12:54:49Z

.github/workflows/ci.yml

      - name: cargo build
-        uses: actions-rs/cargo@v1
-        with:
-          command: build
-          args: --workspace
+        run: cargo build --workspace

      - name: cargo test
-        uses: actions-rs/cargo@v1
-        with:
-          command: test
-
-      - name: cargo fmt
-        uses: actions-rs/cargo@v1
-        with:
-          command: fmt
-          args: --all -- --check
-
-      - name: cargo clippy
-        uses: actions-rs/cargo@v1
-        with:
-          command: clippy
-          args: --workspace --all-targets --all-features -- -D warnings
-        if: matrix.rust == 'nightly'
-
-      # - name: Run cargo-tarpaulin
-      #   uses: actions-rs/[email protected]
-      #   if: matrix.os == 'ubuntu-latest' && matrix.rust == 'stable'
+        run: cargo test


Let's make this just

    - run: cargo build
    - run: cargo test

Since we're not actually using a workspace, we don't need the extra flag.

svenstaro · 2023-04-27T12:55:21Z

.github/workflows/ci.yml

+      - name: cargo build
+        run: cargo build --workspace --features simd
+
+      - name: cargo test
+        run: cargo test --features simd


svenstaro · 2023-04-27T12:55:46Z

.github/workflows/ci.yml

+      - name: Run clippy
+        run: cargo clippy --workspace --all-targets --all-features -- -D warnings
+
+  build-and-test-no-features:


Suggested change

-  build-and-test-no-features:
+  build-and-test-no-simd:

svenstaro · 2023-04-27T12:56:01Z

.github/workflows/ci.yml

+        run: cargo clippy --workspace --all-targets --all-features -- -D warnings
+
+  build-and-test-no-features:
+    name: CI with ${{ matrix.rust }} on ${{ matrix.os }} [no-features]


Suggested change

-    name: CI with ${{ matrix.rust }} on ${{ matrix.os }} [no-features]
+    name: CI with ${{ matrix.rust }} on ${{ matrix.os }} [no SIMD]

svenstaro · 2023-04-27T12:56:10Z

.github/workflows/ci.yml


      - name: Upload coverage report to codecov.io
        uses: codecov/codecov-action@v1
        if: matrix.os == 'ubuntu-latest' && matrix.rust == 'stable'
+
+  build-and-test-simd:
+    name: CI with nightly on ${{ matrix.os }} [simd]


Suggested change

-    name: CI with nightly on ${{ matrix.os }} [simd]
+    name: CI with nightly on ${{ matrix.os }} [SIMD]

svenstaro · 2023-04-27T12:57:18Z

README.md

@@ -12,42 +12,42 @@ volume hierarchies.**
 ## About

 This crate can be used for applications which contain intersection computations of rays
-with primitives. For this purpose a binary tree BVH (Bounding Volume Hierarchy) is of great
-use if the scene which the ray traverses contains a huge number of primitives. With a BVH the
+with primitives. For this purpose a binary tree Bvh (Bounding Volume Hierarchy) is of great


Please revert the capitalization changes in docs and other prose where the docs do not reference a type. The code change is fine and in line with Rust best practices, though. I think this was probably renamed by mistake by find and replace.

Also check other files.

Ideally this rename would have been done in another PR since this one is gargantuan anyway but oh well.

README.md

src/utils.rs

svenstaro · 2023-04-27T13:03:15Z

src/aabb.rs

+use nalgebra::ClosedAdd;
+use nalgebra::ClosedMul;
+use nalgebra::ClosedSub;
+use nalgebra::Point;
+use nalgebra::SVector;
+use nalgebra::Scalar;
+use nalgebra::SimdPartialOrd;
+use num::Float;
+use num::FromPrimitive;
+use num::One;
+use num::Signed;
+use num::Zero;


Let's merge these.

src/ray/intersect_default.rs

svenstaro · 2023-04-27T13:16:21Z

I've also added a dummy CHANGELOG.md on master and I'd like to ask you to write a notice there and if you feel like it a short migration guide as well. We haven't had a changelog before and I feel this is as good an opportunity as any to start one at last.
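One pattern a migration guide for the `AABB` to `Aabb` rename could describe is a deprecated type alias that keeps old call sites compiling while warning about the new name. The thread does not say that the crate ships such aliases, so treat the following as a hypothetical sketch with a simplified stand-in for the real type.

```rust
/// Simplified stand-in for the renamed type (not the crate's real definition).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Aabb<T, const D: usize> {
    pub min: [T; D],
    pub max: [T; D],
}

/// Old spelling kept as an alias so downstream code still builds, with a warning.
#[deprecated(note = "renamed to `Aabb`")]
pub type AABB<T, const D: usize> = Aabb<T, D>;
```

Code that still names `AABB<f32, 3>` then compiles with a deprecation warning pointing at `Aabb`, which gives users a release cycle to update.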

marstaik · 2023-04-27T19:54:18Z

I think that should be everything :)

svenstaro · 2023-04-28T02:53:50Z

README.md

@@ -83,7 +95,7 @@ it is faster to update the tree, instead of rebuilding it from scratch.
 First of all, optimizing is not helpful if more than half of the scene is not static.
 This is due to how optimizing takes place:
 Given a set of indices of all shapes which have changed, the optimize procedure tries to rotate fixed constellations
-in search for a better surface area heuristic (SAH) value. This is done recursively from bottom to top while fixing the AABBs
+in search for a better surface area heuristic (SAH) value. This is done recursively from bottom to top while fixing the Aabbs


There are still a few mis-capitalized occurrences like this in prose and docs. Could you do another pass?

Woops, fixed! I didn't actually find any in the docs though.

marstaik · 2023-04-29T16:56:54Z

@svenstaro is this ready for merge?

svenstaro · 2023-04-29T22:26:34Z

Will take another look in a bit.

Cargo.toml

svenstaro · 2023-05-03T16:47:43Z

I'm doing one final check now. One thing I was wondering about: Should we kick out optimization? It never performed super well, and it degraded the tree quickly. Should we perhaps just focus on super quick tree building using multithreading and SIMD?

svenstaro

Alright, very close now. I'm just missing some docs on the SIMD functions and some capitalization fixes.

svenstaro · 2023-05-03T16:56:15Z

src/bvh/optimization.rs

 //! By passing the indices of shapes that have changed, the function determines possible
-//! tree rotations and optimizes the BVH using a SAH.
+//! tree rotations and optimizes the Bvh using a SAH.


Suggested change

-//! tree rotations and optimizes the Bvh using a SAH.
+//! tree rotations and optimizes the BVH using a SAH.

svenstaro · 2023-05-03T16:56:29Z

src/bvh/optimization.rs

 use rand::{thread_rng, Rng};
 use std::collections::HashSet;

-// TODO Consider: Instead of getting the scene's shapes passed, let leaf nodes store an AABB
+// TODO Consider: Instead of getting the scene's shapes passed, let leaf nodes store an Aabb


Suggested change

-// TODO Consider: Instead of getting the scene's shapes passed, let leaf nodes store an Aabb
+// TODO Consider: Instead of getting the scene's shapes passed, let leaf nodes store an AABB

svenstaro · 2023-05-03T16:56:35Z

src/bvh/optimization.rs

 // that is updated from the outside, perhaps by passing not only the indices of the changed
-// shapes, but also their new AABBs into optimize().
-// TODO Consider: Stop updating AABBs upwards the tree once an AABB didn't get changed.
+// shapes, but also their new Aabbs into optimize().


Suggested change

-// shapes, but also their new Aabbs into optimize().
+// shapes, but also their new AABBs into optimize().

svenstaro · 2023-05-03T16:56:47Z

src/bvh/optimization.rs

-// shapes, but also their new AABBs into optimize().
-// TODO Consider: Stop updating AABBs upwards the tree once an AABB didn't get changed.
+// shapes, but also their new Aabbs into optimize().
+// TODO Consider: Stop updating Aabbs upwards the tree once an Aabb didn't get changed.


Suggested change

-// TODO Consider: Stop updating Aabbs upwards the tree once an Aabb didn't get changed.
+// TODO Consider: Stop updating AABBs upwards the tree once an AABB didn't get changed.

svenstaro · 2023-05-03T16:56:59Z

src/bvh/optimization.rs

    /// Based on
-    /// [`https://github.com/jeske/SimpleScene/blob/master/SimpleScene/Util/ssBVH/ssBVH.cs`]
+    /// [`https://github.com/jeske/SimpleScene/blob/master/SimpleScene/Util/ssBvh/ssBvh.cs`]


Suggested change

-    /// [`https://github.com/jeske/SimpleScene/blob/master/SimpleScene/Util/ssBvh/ssBvh.cs`]
+    /// [`https://github.com/jeske/SimpleScene/blob/master/SimpleScene/Util/ssBVH/ssBVH.cs`]

svenstaro · 2023-05-03T17:00:44Z

src/flat_bvh.rs

    };

    #[bench]
-    /// Benchmark the flattening of a BVH with 120,000 triangles.
+    /// Benchmark the flattening of a Bvh with 120,000 triangles.


Suggested change

-    /// Benchmark the flattening of a Bvh with 120,000 triangles.
+    /// Benchmark the flattening of a BVH with 120,000 triangles.

src/ray/intersect_x86_64.rs

svenstaro · 2023-05-03T17:03:39Z

src/testbase.rs

 use crate::bounding_hierarchy::{BHShape, BoundingHierarchy};
-use crate::ray::Ray;
+
+// TODO These all need to be realtyped and bounded


@marstaik ping

svenstaro · 2023-05-03T17:04:23Z

src/testbase.rs

@@ -515,17 +522,17 @@ fn bench_intersect_sponza_list(b: &mut ::test::Bencher) {
    intersect_list(&triangles, &bounds, b);
 }

-/// Benchmark intersecting the `triangles` list with `AABB` checks, but without acceleration
+/// Benchmark intersecting the `triangles` list with `Aabb` checks, but without acceleration


Suggested change

-/// Benchmark intersecting the `triangles` list with `Aabb` checks, but without acceleration
+/// Benchmark intersecting the `triangles` list with `AABB` checks, but without acceleration

svenstaro · 2023-05-03T17:04:28Z

src/testbase.rs

    let mut seed = 0;
    b.iter(|| {
        let ray = create_ray(&mut seed, bounds);

        // Iterate over the list of triangles.
        for triangle in triangles {
-            // First test whether the ray intersects the AABB of the triangle.
+            // First test whether the ray intersects the Aabb of the triangle.


Suggested change

-            // First test whether the ray intersects the Aabb of the triangle.
+            // First test whether the ray intersects the AABB of the triangle.

marstaik · 2023-05-03T18:59:53Z

@svenstaro I'll make the requested changes and look into things later today - but before I do that I was wondering if the Aabb to AABB capitalization changes in the docs make sense - they are generally referring to them via their class names rather than the AABB as a concept.

It doesn't bother me either way but I wanted to note it.

svenstaro · 2023-05-03T19:58:29Z

That's a good point. Let's make them clickable where they refer to the specific classes by doing

[`Aabb`]
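A short sketch of what this looks like in practice: the general concept stays spelled AABB in prose, while a reference to the concrete type uses a rustdoc intra-doc link. The `Aabb` struct and `contains` function here are simplified, hypothetical stand-ins rather than the crate's actual items.

```rust
/// An axis-aligned bounding box (AABB) in three dimensions.
pub struct Aabb {
    pub min: [f32; 3],
    pub max: [f32; 3],
}

/// Returns `true` if `point` lies inside the given [`Aabb`].
///
/// The bracketed, backticked name above is resolved by rustdoc into a link
/// to the type, whereas the plain word AABB is ordinary prose.
pub fn contains(aabb: &Aabb, point: [f32; 3]) -> bool {
    (0..3).all(|i| aabb.min[i] <= point[i] && point[i] <= aabb.max[i])
}
```

Because rustdoc warns about intra-doc links it cannot resolve, linked type names also stay correct across renames like the one in this PR.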

dbenson24 · 2023-05-03T22:09:59Z

Once this PR lands I can take a look at adapting my work in #80. I think there's probably 3 PRs worth of stuff in that. The multithreaded builds, the replacement of optimize with the ability to add/remove individual nodes, and the additional shapes you can query with. The return of nalgebra to this crate makes the stuff I had been doing to have 64bit bvhs obsolete which makes me very happy.

marstaik · 2023-05-04T03:57:56Z

I'm doing one final check now. One thing I was wondering about: Should we kick out optimization? It never performed super well, and it degraded the tree quickly. Should we perhaps just focus on super quick tree building using multithreading and SIMD?

I think we need to take another look at this. The new rebuild times are really, really fast: roughly 4x faster, e.g. about 223 ms down to about 56 ms for 50% modified (bench_optimize_bvh_120k_50p). It might actually be worth it now. Maybe we can look into a better self-optimizing BVH on removals and additions as well.

We should definitely look at both and compare, especially after multithreaded building is done. I wonder if optimizing can be improved via multithreading too, though. Maybe @dbenson24 can look into it, as my next focus will probably be better unit tests and some more docs.

dbenson24 · 2023-05-04T05:35:06Z

If I recall correctly, the change to make optimize utilize the add/remove node functions was because that method produced trees with much faster traversal times compared to optimize

marstaik · 2023-05-04T05:54:08Z

I may have gone overboard on the document cleaning :)

svenstaro · 2023-05-04T09:28:11Z

Merging as-is. Amazing work!

marstaik added 2 commits April 21, 2023 23:22

Move math types over to nalgebra

29f41d2

Initial f32x3 optimization for SIMD

b535240

svenstaro requested changes Apr 23, 2023

svenstaro requested changes Apr 23, 2023

Try using simba

a5fa1da

Added intrinsics for f32/64 2-4 dimensions on x86_65, guarded by nigh…

2006787

…tly and "full_simd" feature

svenstaro requested changes Apr 25, 2023

marstaik added 6 commits April 24, 2023 21:30

Update CI configurations and rename full_simd to simd

28fdd22

Update README

6ddbaf9

Fix CI builds

ccfdc28

Fix Serde

2ddfac4

Fix Clippy errors, including renaming of all uppercase names (Bvh, Bv…

7ea3a1b

…hNode)

Formatting fix

64f8986

Remove upper-case acronyms flags

1c63a6d

svenstaro requested changes Apr 27, 2023

marstaik and others added 5 commits April 27, 2023 08:47

Bring back capitalization in docs & prose

90feed5

CR changes to ci file

1493fe4

Organize imports

b48e0a7

Documentation updates

e37e311

Merge branch 'master' into nalgebra

25a8070

marstaik added 2 commits April 27, 2023 10:04

Fix format on lib.rs

9804f7a

Add comments to fast_min/fast_max

cdbe274

svenstaro reviewed Apr 28, 2023

Fix Aabb in prose to AABB

608f897

svenstaro reviewed May 2, 2023

svenstaro requested changes May 3, 2023

Document spring cleaning!

e59f1f0

marstaik added 2 commits May 3, 2023 22:38

Merge remote-tracking branch 'origin/nalgebra' into nalgebra_2

2eab008

Add comments to ray intersect x86_64

102a7ab

svenstaro merged commit 7482caf into svenstaro:master May 4, 2023
