Automate Increases in Fidelity #10

Open
jamesbevins opened this issue Feb 23, 2018 · 8 comments

@jamesbevins
Collaborator

Gnowee Utilities module.
Currently the increases happen at set fitness points, but not all problems converge to the same fitness criteria. A couple of options:
a) Have the run time increase at set fractions of the best initial fitness.
b) Have a routine that evaluates whether increasing fidelity is needed based on the statistics of the output tallies.

Note, both a and b could be used.
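
A minimal sketch of what the two options could look like, for illustration only. The function names, the 5% relative error target, and the example fractions are all assumptions rather than anything in Gnowee or Coeus, and option (a) is written assuming a minimization problem where fitness shrinks as designs improve.

```python
def fitness_thresholds(initial_best_fitness, fractions=(0.5, 0.25)):
    """Option (a): fitness values at which the run time would be increased.

    `fractions` is a hypothetical user input; each entry is a fraction of the
    best fitness found in the initial population.
    """
    return [f * initial_best_fitness for f in fractions]


def needs_more_fidelity(rel_errors, max_rel_error=0.05):
    """Option (b): True if any tally's relative error exceeds the target.

    `max_rel_error` is an assumed threshold, not a value from the source.
    """
    return any(r > max_rel_error for r in rel_errors)
```

Both checks are cheap, so using them together (as noted above) would cost little beyond reading the tallies.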

@jamesbevins
Collaborator Author

On reflection, I think b is best. We should also track run time so we can increase/decrease the nodes assigned to a particular walker based on it. Solutions are occasionally generated that run much longer than the expected average run time.

@jamesbevins jamesbevins self-assigned this Mar 18, 2019
@jamesbevins jamesbevins removed their assignment Mar 18, 2019
@jamesbevins jamesbevins added this to the v2.0 milestone Mar 18, 2019
@jamesbevins
Collaborator Author

@dholland4

Let's develop a method that supports user-defined statistical uncertainty levels; the code will determine the number of particles on the fly to meet that threshold.

@dholland4
Collaborator

My thought is to have two user-defined parameters: the initial number of particles (N) and the number of levels (L).
Assume the initial fidelity is level 1 (L1).

  1. Using N, we generate tallies for a population.
  2. Gnowee determines which designs should remain in the population.
  3. From the tallies, we calculate a relative error term for those designs, similar to MCNP, etc. (This gets more complicated with multiple tallies; maybe sum the relative errors?)
  4. We increase the fidelity of these designs based on the design with the current best fidelity (relative error):
    • Assuming R_i = c_i/sqrt(N_i), calculate c_i for each design i.
    • Split the R = c/sqrt(N) relative error convergence into L bins, where c and N are from the initial best relative error. The initial relative error is assumed to be at the midpoint of the L1 bin.
    • Calculate Rnew for the midpoint of the next level.
    • Calculate the new N_i for design i that will produce Rnew, using Rnew = c_i/sqrt(N_i).
  5. Evaluate the chosen designs using the new N_i.
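
The steps above could be sketched roughly as follows. The binning is not fully specified in the comment, so this assumes L equal-width bins in R spanning from zero up to a maximum chosen so that the best initial relative error sits at the midpoint of the level-1 bin. That choice, and the function names, are assumptions for illustration only.

```python
import math


def level_midpoint(r_best, n_levels, level):
    """Target relative error at the midpoint of `level` (1 = lowest fidelity).

    Assumes n_levels equal-width bins in R, with r_best at the midpoint of
    the level-1 bin (one possible reading of the binning step).
    """
    if not 1 <= level <= n_levels:
        raise ValueError("level must be between 1 and n_levels")
    return r_best * (2 * (n_levels - level) + 1) / (2 * n_levels - 1)


def particles_for_level(rel_errors, n_particles, n_levels, next_level):
    """New particle count N_i for each surviving design.

    rel_errors:  relative error R_i of each design, all run with n_particles.
    Uses R_i = c_i / sqrt(N), so c_i = R_i * sqrt(N) and
    N_i = (c_i / R_new)**2.
    """
    c = [r * math.sqrt(n_particles) for r in rel_errors]
    r_new = level_midpoint(min(rel_errors), n_levels, next_level)
    return [math.ceil((c_i / r_new) ** 2) for c_i in c]
```

Under this sketch, designs with a larger c_i (slower-converging tallies) are given proportionally more particles so that every design reaches the same target relative error at the next level.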

@dholland4
Collaborator

Complication: There are now designs at various levels of fidelity. Do we include new population members at a lower fidelity level or at the highest one? What is passed to Gnowee?

Personal opinion:
I like the idea of continuing the optimization using the highest-level population ("localish" search) until that population shrinks. Then (with increasing fidelity), run new lower-fidelity designs ("globalish" search) together with the lower-fidelity versions of the highest-fidelity population until you find designs that can supplement (with replacement) your high-fidelity population to its nominal size. Continue to the next higher fidelity level if it exists (or stay at the highest level) until convergence is reached.

@jamesbevins
Collaborator Author

We currently run designs at different levels of fidelity too. We scale the CPUs to (roughly) maintain load balancing.

I think the way forward is to specify the initial number of particles (N) and the statistical significance desired at each level. The code then automatically adjusts N from generation to generation to meet that level and scales N to achieve higher fidelity.

The trick is figuring out when to scale to a higher fidelity level. I think using a fraction of the initial best member fitness makes some sense, but what fraction? The easiest thing might be to keep at L1 for G generations, then have the top D designs at the top level, the next best designs at the next highest level, etc. There is some inefficiency here in the mid-time calculations, but it is the most general approach I can come up with, since the fitness is relative to the objective function specified and the problem being solved, making it difficult to know a priori what a good fitness is.
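
As a rough sketch of that generation-based scheme: the function and parameter names below are hypothetical, lower fitness is assumed to be better, and "the next best designs" is read as the next D designs per level. None of this reflects the current Coeus code.

```python
def assign_levels(fitnesses, n_levels, designs_per_level, generation, warmup_generations):
    """Return a fidelity level (1 = lowest) for each design.

    Every design stays at level 1 for the first `warmup_generations` (G).
    After that, the best `designs_per_level` (D) designs go to the top level,
    the next D to the level below, and so on; the rest remain at level 1.
    """
    if generation < warmup_generations:
        return [1] * len(fitnesses)

    # Rank designs from best (lowest fitness) to worst.
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i])
    levels = [1] * len(fitnesses)
    for rank, idx in enumerate(order):
        levels[idx] = max(1, n_levels - rank // designs_per_level)
    return levels
```

Because the assignment depends only on rank, it does not need a problem-specific notion of what a "good" fitness value is.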

@dholland4
Collaborator

Good news about the different fidelity levels!

Perhaps I am mixing up fitness (objective function values), fidelity, and statistical uncertainty (accuracy). My understanding is that it's impossible to know if increasing N will improve or worsen the fitness value(s). Increasing N can only decrease the uncertainty and increase the accuracy associated with the fitness value(s). Improving the actual fitness value is the task of the optimization approach (Gnowee).

Thus, I understood the question to be: when and how should we increase a design's fidelity (our knowledge of the true fitness value's accuracy, which depends on N)? Please correct me if I am mistaken.

There are two parts to this question:

  1. How do we decide that a design's accuracy should be increased?
    (suggestion: choose the designs with the best potential, based on Gnowee's evaluation, which I believe uses elitism. You could have a set number of generations, G, as mentioned)
  2. How should we increase N to allow a fair comparison between designs at the next level of fidelity?
    (suggestion: use the 1/sqrt(N) approximation based on user-defined statistical significance levels or a uniform spacing, e.g., decrease R by 1/2 for each level)
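
To make the uniform-spacing example concrete: under the R ∝ 1/sqrt(N) approximation, cutting R by a factor f requires multiplying N by 1/f², so halving R at each level quadruples N. A small sketch, with a hypothetical function name and parameters:

```python
import math


def particles_per_level(n_initial, n_levels, reduction=0.5):
    """Particle count at each level when R shrinks by `reduction` per level.

    Since R is proportional to 1/sqrt(N), scaling R by `reduction` scales N
    by 1/reduction**2 (a factor of 4 when reduction is 0.5).
    """
    return [math.ceil(n_initial / reduction ** (2 * k)) for k in range(n_levels)]
```

For example, starting from 10,000 particles with three levels and the default halving, the levels would use 10,000, 40,000, and 160,000 particles.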

@dholland4
Collaborator

On reflection, I think this is really an optimization issue. I suggest that Coeus (being a wrapper) should not make decisions regarding the optimization, but only handle input/output, scheduling, etc. Thus, perhaps the optimizer should choose N for the desired design simulations, and return to Coeus either N or a design-specific relative resource requirement value (say between 0 and 10). Then, Coeus can take the number of desired design simulations, the N or relative values, and the number of resources (nodes/processors) to determine the best resource allocation. If running asynchronously, the more intensive runs could use more nodes/processors and be chosen first. If running synchronously, the more intensive runs could be allocated more nodes/processors.
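
One way the wrapper side of this could look, as a sketch only: each design gets at least one node, and the remaining nodes are split in proportion to the relative requirement values using a largest-remainder rule. The function names and the allocation rule are assumptions, not existing Coeus behavior.

```python
def allocate_nodes(weights, total_nodes):
    """Split total_nodes across designs in proportion to their weights.

    weights: relative resource requirement per design (e.g. N or a 0-10 score).
    Every design receives at least one node.
    """
    n = len(weights)
    total_w = sum(weights)
    if total_nodes < n:
        raise ValueError("need at least one node per design")
    if total_w <= 0:
        raise ValueError("weights must sum to a positive value")

    spare = total_nodes - n
    shares = [spare * w / total_w for w in weights]
    alloc = [1 + int(s) for s in shares]

    # Hand out nodes lost to rounding, largest fractional share first.
    leftover = total_nodes - sum(alloc)
    by_remainder = sorted(range(n), key=lambda i: shares[i] - int(shares[i]), reverse=True)
    for i in by_remainder[:leftover]:
        alloc[i] += 1
    return alloc


def launch_order(weights):
    """Indices of designs, most intensive first (for asynchronous runs)."""
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
```

In a synchronous run only `allocate_nodes` would matter; in an asynchronous run `launch_order` would decide which designs start first.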

@jamesbevins
Collaborator Author

Correct, increasing N gets us lower statistical uncertainty and a better understanding of the true fitness, and yes, the question is determining when (in an automated fashion) to increase N for a high fitness design to make sure the high fitness is not an artifact of poor statistics.

The two parts that you outlined seem correct to me; the biggest question is part 1. That is one approach as suggested, but there are others, such as setting the levels based on some fraction of the best (or average) fitness from the first population, etc.

This is something that will be handled in the interface that Coeus provides. Gnowee doesn't traditionally look at this because it is evaluating an objective function based on a perturbed set of parameters. It assumes perfect knowledge of those parameters, and values derived from those parameters. These are quick calculations, so this leveled approach to maximize efficiency isn't needed.

However, we could calculate a fitness uncertainty too, and use that in the algorithm; it isn't clear to me though that this would be beneficial.

2 participants