Test stability of assignment algorithm #38

Open · 2 tasks
BZ-BowenZhang opened this issue Aug 7, 2024 · 5 comments
Labels: reproducibility

@BZ-BowenZhang (Collaborator)

According to the discussion on 7 Aug between @BZ-BowenZhang, @sgreenbury and @Hussein-Mahfouz, we need to measure the differences in assignment results across runs to test and demonstrate this algorithm's stability. This involves at least two parts:

  • Compare the outputs of repeated runs with the same input configuration to demonstrate stability (see the sketch after this list).
  • Compare the travel flow distributions produced by different input configurations to test the sensitivity to specific configurations.
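A minimal sketch of the first check, assuming the pipeline can be invoked as a script and writes assigned plans to a parquet file; the run command, output paths, `--out` flag and `dzone` column are hypothetical placeholders:

```python
# Sketch: compare assignment outputs from two runs with the same config and seed.
# The CLI, output layout and column names are hypothetical placeholders.
import hashlib
import subprocess
from pathlib import Path

import pandas as pd


def run_pipeline(out_dir: Path) -> Path:
    """Run the assignment step once, writing outputs under out_dir (hypothetical CLI)."""
    subprocess.run(
        ["python", "scripts/3.2.3_assign_secondary_zone.py", "--out", str(out_dir)],
        check=True,
    )
    return out_dir / "assigned_plans.parquet"


def file_sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


out_a = run_pipeline(Path("run_a"))
out_b = run_pipeline(Path("run_b"))

# Byte-identical outputs demonstrate determinism for this config and seed.
identical = file_sha256(out_a) == file_sha256(out_b)
print("identical:", identical)

# If not identical, summarise where the runs diverge, e.g. destination zone counts.
if not identical:
    za = pd.read_parquet(out_a)["dzone"].value_counts()
    zb = pd.read_parquet(out_b)["dzone"].value_counts()
    print(za.sub(zb, fill_value=0).abs().sort_values(ascending=False).head())
```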
@BZ-BowenZhang added the reproducibility label Aug 7, 2024
@BZ-BowenZhang self-assigned this Aug 7, 2024
@sgreenbury (Collaborator)

I have started looking at this to see whether the same RNG seed generates outputs deterministically.

It looks like the scripts up to 3.2.3 are currently deterministic, but the secondary location assignment with PAM is not.
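One way to locate where determinism breaks is to hash each stage's outputs from two runs with the same seed and report the first stage that diverges; a sketch, with the directory layout and stage names as hypothetical placeholders:

```python
# Sketch: find the first pipeline stage whose outputs differ between two
# runs with the same seed. Directory layout and stage names are hypothetical.
import hashlib
from pathlib import Path


def dir_digest(stage_dir: Path) -> str:
    """Combined hash of all files a stage wrote, in a stable order."""
    h = hashlib.sha256()
    for f in sorted(stage_dir.rglob("*")):
        if f.is_file():
            h.update(f.name.encode())
            h.update(f.read_bytes())
    return h.hexdigest()


stages = ["3.1_matching", "3.2.2_assign_primary", "3.2.3_assign_secondary"]
for stage in stages:
    if dir_digest(Path("run_a") / stage) != dir_digest(Path("run_b") / stage):
        print(f"first non-deterministic stage: {stage}")
        break
else:
    print("all stages deterministic")
```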

@sgreenbury (Collaborator)

Exploring this would also be simplified by implementing #53 first.

@sgreenbury (Collaborator)

From discussion with @BZ-BowenZhang and @Hussein-Mahfouz, we'd like to measure three things:

  1. Running with the same seed leads to identical outputs.
  2. Running with different seeds generates different stochastic samples from the pipeline's distribution; invariance of aggregate properties across seeds measures the stability (see the sketch after this list).
  3. Running with a different config and a different seed generates outputs under different assumptions, allowing a measure of the sensitivity.
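A possible sketch of point 2: compare an aggregate property across runs with different seeds, here a hypothetical trip-distance column checked with a two-sample KS test from scipy (the file paths are also placeholders):

```python
# Sketch: check that aggregate properties are invariant across seeds.
# File paths and the "distance" column are hypothetical placeholders.
import pandas as pd
from scipy.stats import ks_2samp

runs = {
    seed: pd.read_parquet(f"run_seed_{seed}/assigned_plans.parquet")
    for seed in (0, 1, 2)
}

baseline = runs[0]["distance"]
for seed, df in runs.items():
    if seed == 0:
        continue
    stat, p_value = ks_2samp(baseline, df["distance"])
    # Large p-values are consistent with all runs sampling the same distribution.
    print(f"seed {seed}: KS statistic={stat:.4f}, p={p_value:.3f}")
```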

@Hussein-Mahfouz (Collaborator)

@sgreenbury the non-determinism in 3.2.3_assign_secondary_zone.py may be due to the following:

  • We are using our internal update_population_plans here, which is a wrapper around PAM's DiscretionaryTrips class. We use the update_plan function from that class, which itself uses DiscretionaryTripsOD and DiscretionaryTripsRound.
  • Both of these classes use the sample_weighted function to select a zone, and that function uses the stdlib random module (see here); a small demo follows this list.
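The effect is easy to reproduce with the stdlib directly; a small demo (not PAM's actual code) showing that weighted selection through the random module is only repeatable once random.seed() is set, independently of any numpy seeding:

```python
# Demo (not PAM's code): weighted zone selection via the stdlib random module
# is non-deterministic across runs unless random.seed() is called.
import random

zones = ["E0001", "E0002", "E0003"]
weights = [0.6, 0.3, 0.1]

# Unseeded: this sequence varies from process to process, even if numpy's RNG
# has been seeded, because numpy and the random module are independent.
print([random.choices(zones, weights=weights)[0] for _ in range(5)])

# Seeded: the same sequence on every run.
random.seed(42)
print([random.choices(zones, weights=weights)[0] for _ in range(5)])
```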

@sgreenbury (Collaborator)

Thanks @Hussein-Mahfouz for identifying these calls into random.choice(); I think that's it. I've updated the init_rng() method to also set the seed for random, and this now produces deterministic outputs for a given seed. I'll open a PR! A sketch of the shape of the fix is below.
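A sketch of the shape of the fix (not the repository's actual init_rng(), which may differ): seed the stdlib random module alongside numpy so that PAM's random-based sampling is covered by the same seed:

```python
# Sketch of the fix's shape (not the repository's actual init_rng()):
# seed numpy and the stdlib random module together so PAM's random-based
# sampling is covered by the same seed as the rest of the pipeline.
import random

import numpy as np


def init_rng(seed: int) -> np.random.Generator:
    random.seed(seed)       # covers PAM's sample_weighted / random.choice calls
    np.random.seed(seed)    # covers any legacy np.random.* consumers
    return np.random.default_rng(seed)  # preferred Generator for new code
```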

sgreenbury added a commit that referenced this issue Oct 7, 2024