
Haruto2017/Project1-CUDA-Flocking-1

 
 


University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 1 - Flocking

  • Penghai Wei
  • Tested on: Windows 11 Build 22000, i9-12900K @ 5.20GHz, RTX 3080 Ti 12GB

Coherent uniform grid 10000 boids with 0.2 delta time

Coherent uniform grid 50000 boids with 0.2 delta time

Coherent uniform grid 100000 boids with 0.2 delta time

Coherent uniform grid 50000 boids with 0.05 delta time

Coherent uniform grid 100000 boids with 0.05 delta time

Frame Rate Histogram (1-to-1)

Frame Rate Histogram (Logarithmic)

Questions to answer:

  1. For each implementation, how does changing the number of boids affect performance? Why do you think this is?

Brute-force search degrades quadratically as the number of boids grows, because every boid iterates over every other boid, which is O(n^2) work per step. The two uniform grid implementations degrade at a close-to-linear rate. Each boid only searches the boids in neighboring grid cells, and that is probably a roughly constant fraction of the total number of boids, determined by the cell size. The uniform grid without the coherence optimization suffers more cache misses, so it is a little slower.
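The difference in search cost can be sketched on the CPU by counting how many candidate boids each approach examines. This is a minimal host-side C++ sketch, not the project's CUDA kernels. The names `Vec3`, `bruteForceChecks`, `gridChecks` and `makeLatticeBoids` are hypothetical, positions are assumed to lie in a cube `[0, sceneSize)^3`, and the grid search is assumed to visit the 27 cells surrounding each boid.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Brute force: every boid is compared against every other boid.
std::size_t bruteForceChecks(std::size_t n) {
    return n == 0 ? 0 : n * (n - 1);
}

// Map one coordinate to a cell index, clamped to the grid bounds.
static int cellCoord(float v, float cellWidth, int res) {
    int c = static_cast<int>(std::floor(v / cellWidth));
    return std::max(0, std::min(c, res - 1));
}

// Uniform grid: count boids per cell, then let each boid examine only
// the boids in its own cell and the (up to) 26 surrounding cells.
// Assumption: a 27-cell search; the real project may use a different stencil.
std::size_t gridChecks(const std::vector<Vec3>& pos, float sceneSize, float cellWidth) {
    assert(sceneSize > 0.0f && cellWidth > 0.0f);
    const int res = static_cast<int>(std::ceil(sceneSize / cellWidth));
    std::vector<std::size_t> count(static_cast<std::size_t>(res) * res * res, 0);
    auto flat = [res](int x, int y, int z) {
        return (static_cast<std::size_t>(z) * res + y) * res + x;
    };
    for (const Vec3& p : pos) {
        ++count[flat(cellCoord(p.x, cellWidth, res),
                     cellCoord(p.y, cellWidth, res),
                     cellCoord(p.z, cellWidth, res))];
    }
    std::size_t checks = 0;
    for (const Vec3& p : pos) {
        const int cx = cellCoord(p.x, cellWidth, res);
        const int cy = cellCoord(p.y, cellWidth, res);
        const int cz = cellCoord(p.z, cellWidth, res);
        for (int dz = -1; dz <= 1; ++dz)
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    const int x = cx + dx, y = cy + dy, z = cz + dz;
                    if (x < 0 || y < 0 || z < 0 || x >= res || y >= res || z >= res)
                        continue;  // neighbor cell lies outside the grid
                    checks += count[flat(x, y, z)];
                }
        checks -= 1;  // the boid's own cell included the boid itself
    }
    return checks;
}

// Hypothetical test data: one boid at the center of every cell.
std::vector<Vec3> makeLatticeBoids(int perAxis, float cellWidth) {
    std::vector<Vec3> pos;
    for (int z = 0; z < perAxis; ++z)
        for (int y = 0; y < perAxis; ++y)
            for (int x = 0; x < perAxis; ++x)
                pos.push_back({(x + 0.5f) * cellWidth,
                               (y + 0.5f) * cellWidth,
                               (z + 0.5f) * cellWidth});
    return pos;
}
```

For the 64-boid lattice from `makeLatticeBoids(4, 1.0f)`, brute force performs 4032 checks while the grid search performs 936. With a fixed density per cell, the grid count grows roughly in proportion to the number of boids, whereas the brute-force count grows with its square.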

  2. For the coherent uniform grid: did you experience any performance improvements with the more coherent uniform grid? Was this the outcome you expected? Why or why not?

Yes. The coherent grid reorganizes the boid data so that consecutive accesses fall in a contiguous region of memory, so cache misses are much less frequent for each CUDA warp. Boids in the same warp are also now probably in the same or neighboring grid cells in space.
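The reordering idea can be illustrated with a small host-side C++ sketch. It is an illustration under assumptions, not the project's actual code: `sortByCell` and `gatherCoherent` are hypothetical names, and each boid's grid cell index is assumed to be already computed. In a CUDA version the sort would typically run on the GPU (for example with Thrust's `sort_by_key`) and the gather would be a kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Return boid indices ordered by grid cell, so boids that share a cell
// end up next to each other. A stable sort keeps the original order
// among boids in the same cell.
std::vector<int> sortByCell(const std::vector<int>& cellOfBoid) {
    std::vector<int> order(cellOfBoid.size());
    std::iota(order.begin(), order.end(), 0);
    std::stable_sort(order.begin(), order.end(),
                     [&cellOfBoid](int a, int b) {
                         return cellOfBoid[a] < cellOfBoid[b];
                     });
    return order;
}

// Copy per-boid data into cell order. After this step a neighbor search
// can read out[i] directly instead of following data[order[i]], so boids
// in the same cell are read from adjacent memory.
template <typename T>
std::vector<T> gatherCoherent(const std::vector<T>& data,
                              const std::vector<int>& order) {
    assert(data.size() == order.size());
    std::vector<T> out(data.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        out[i] = data[order[i]];
    }
    return out;
}
```

The scattered uniform grid keeps only the sorted index array and pays an extra indirection on every position and velocity read. The coherent variant pays for one gather per step and then reads contiguous memory during the neighbor search.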
