Robert Carlsen edited this page May 19, 2016 · 4 revisions

The optimization process generates several artifacts/files:

  • <stdout>: The pswarmdriver command prints the best point found at each iteration. It is often useful to redirect this output to a file for later use. Each per-iteration best point is printed in the format:

      Iter <iter-number> (<cumulative-number-evals>):  f[<var1-val> <var2-val> <var3-val> ...] = <objective-val>
    
  • run.log: This file contains the stdout from running the actual simulations (i.e. Cyclus commands). It is empty if the simulations are being run remotely.

  • obj.log: This is a human-readable text file containing a list of every evaluation (i.e. variable values and associated objective value) performed by the optimizer, one per line. The evaluations appear in iteration order, although evaluations within a single iteration appear in no particular order. The format is:

      f[<var1-val> <var2-val> <var3-val> ...] = <objective-val>
    
  • pswarm.sqlite: This is a SQLite database file with several tables recording a detailed progression of the entire optimization process. A description of each of these tables and their fields follows:

    • patterninfo: Provides the per-iteration overview of the algorithm's progression.

      • iter: optimizer iteration number - one entry for each iteration.

      • step: step size of the grid/mesh used for, e.g., pattern-search polling.

      • nsearch: number of objective evaluations performed on this iteration by the particle swarm portion of the algorithm.

      • npoll: number of objective evaluations performed on this iteration by the pattern-search portion of the algorithm.

      • val: best objective value found by the start of this iteration.

      • posid: position ID indexing into the posid of the points table where the best current objective value was found.

    • patternpolls: Provides a list of polling points for each iteration selected by the pattern-search portion of the algorithm.

      • iter: optimizer iteration number - several entries for each iteration.

      • val: objective value for this polling position.

      • posid: polling position ID indexing into the posid of the points table where this objective value was found.

    • swarmbest: Records the best point found by the particle swarm portion of the algorithm at each iteration - one entry for each iteration.

      • iter: optimizer iteration number.

      • val: best objective value found by the swarm up to the current iteration.

      • posid: position ID indexing into the posid of the points table where the best current objective value was found.

    • swarmparticles: Records the point sampled by each particle in the particle swarm portion of the algorithm at each iteration - one entry for each particle on each iteration.

      • particle: particle ID number.

      • iter: optimizer iteration number.

      • val: objective value for the current particle position.

      • posid: position ID indexing into the posid of the points table for the current particle position.

      • velid: ID indexing into the posid of the points table for the "point" representing the current velocity in each dimension of the particle.

      • vel: L2 norm of the particle's current velocity (i.e. its speed).

    • swarmparticlesbest: Records the best point found (among all iterations up to the current one) by each particle in the particle swarm portion of the algorithm at each iteration - one entry for each particle on each iteration.

      • particle: particle ID number.

      • iter: optimizer iteration number.

      • best: best objective value found by the particle up to this iteration.

      • posid: position ID indexing into the posid of the points table for the particle's best found position.

    • swarmparticlesmesh: Records the mesh-projected points sampled by each particle in the particle swarm portion of the algorithm at each iteration. This provides the same info as the swarmparticles table because the in-code configuration for the algorithm has been set to not project the particle positions onto the pattern-search mesh. There is one entry for each particle on each iteration.

      • particle: particle ID number.

      • iter: optimizer iteration number.

      • val: objective value for the projected current particle position.

      • posid: position ID indexing into the posid of the points table for the particle's projected position.

    • points: Records the optimization variable value for each dimension of the optimization for each evaluated point. This table provides the actual variable values associated with all the positions/points provided in the other tables. There is one row for each dimension for each position.

      • posid: ID for the position.

      • dim: dimension or variable number.

      • val: variable value
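
Since every other table refers to positions only by posid, a lookup against the points table is needed to recover the actual variable values. The following is a minimal sketch of such a lookup, not part of cloudlus: the function name point_values is hypothetical, it assumes the sqlite3 command is installed, and it relies only on the posid, dim and val columns described above. The "select distinct" guards against duplicate rows, which the points table can contain.

```shell
# Hypothetical helper (not part of cloudlus): print the variable values of
# one position, one value per line, ordered by dimension.
#   $1 = path to pswarm.sqlite
#   $2 = posid (non-negative integer)
point_values() {
    # Reject anything that is not a plain integer before building the query.
    case "$2" in
        ''|*[!0-9]*) echo "posid must be a non-negative integer" >&2; return 1 ;;
    esac
    # The inner select drops duplicate (dim, val) rows; the outer one orders by dim.
    sqlite3 "$1" "select val from (select distinct dim, val from points where posid = $2) order by dim"
}

# Example usage (the iteration number 42 is only an illustration):
#   posid=$(sqlite3 pswarm.sqlite "select posid from patterninfo where iter = 42")
#   point_values pswarm.sqlite "$posid" > vars.dat
```

The output is line-separated, so it can be saved and reused wherever a list of variable values is expected.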

Using the Data

Some of the information in the generated files is redundant (e.g. the per-iteration best points appear both in stdout and in the SQLite database). Users are free to use whichever format(s) they are most comfortable with, although the SQLite database is the most complete.

Below are several common things you might want to do with the data and examples of how to do it.

  • Watch optimization progress live:

    # assuming you have the sqlite3 command installed
    $ watch --interval 100 'sqlite3 -header -column pswarm.sqlite "select iter,nsearch,npoll,step,val from patterninfo order by iter desc limit 15"'
    Every 100.0s: sqlite3 -header -column pswarm.sqlite "...  Thu May 19 11:02:09 2016
    
    iter        nsearch     npoll       step        val
    ----------  ----------  ----------  ----------  -----------------
    51          60          0           0.17        0.254641625417702
    50          60          47          0.289       0.254641625417702
    49          60          51          0.289       0.254648451111965
    48          60          48          0.4913      0.254648451111965
    47          60          49          0.4913      0.254671206069217
    46          60          50          0.289       0.254693965093522
    45          60          0           0.289       0.25665039713272
    44          60          0           0.289       0.257354933087717
    43          60          52          0.289       0.257550290082958
    42          60          54          0.289       0.257797235689992
    41          60          0           0.289       0.263173894539359
    40          60          51          0.289       0.263265359318117
    39          60          50          0.4913      0.263265359318117
    38          60          50          0.4913      0.263668560787351
    37          60          51          0.4913      0.264259549973836
    
    
    # or just using the info from stdout (assuming redirection to "optim.log")
    $ watch --interval 100 "tail -n15 optim.log | sed 's/\[.*\]//'"
    Every 100.0s: tail -n15 optim.log | sed 's/\[.*\]//'      Thu May 19 11:05:24 2016
    
    Iter 41 (4129 evals):  f = 0.26317389453935885
    Iter 42 (4189 evals):  f = 0.257797235689992
    Iter 43 (4303 evals):  f = 0.2575502900829583
    Iter 44 (4415 evals):  f = 0.2573549330877174
    Iter 45 (4475 evals):  f = 0.25665039713271975
    Iter 46 (4535 evals):  f = 0.2546939650935218
    Iter 47 (4645 evals):  f = 0.25467120606921695
    Iter 48 (4754 evals):  f = 0.2546484511119649
    Iter 49 (4862 evals):  f = 0.2546484511119649
    Iter 50 (4973 evals):  f = 0.25464162541770163
    Iter 51 (5080 evals):  f = 0.25464162541770163
    
  • Extract variable values from the files:

    # grab last/best variable combo from pswarmdriver's stdout. This method also
    # works for the "obj.log" file.
    $ grep ' = ' optim.log | tail -n1 | sed 's/.*\[\(.*\)\].*/\1/' | xargs cycobj -transform -scen scenario.json
    Prototype        BuildTime Lifetime Number
    slow_reactor     1         192      1
    slow_reactor     1         204      1
    slow_reactor     1         216      1
    ...
    
    
    # grab best variable combo found by a particular iteration (e.g. 42).  Note
    # the nested select is to get around duplicate entries in the "points" table
    # due to sloppy implementation.
    $ sqlite3 pswarm.sqlite "select val from (select distinct p.dim,p.val as val from patterninfo as pi join points as p on pi.posid=p.posid where pi.iter=42 order by p.dim)" | cycobj -transform -scen scenario.json
    Prototype        BuildTime Lifetime Number
    slow_reactor     1         192      1
    slow_reactor     1         204      1
    slow_reactor     1         216      1
    ...
    
  • You can plot the per-iteration convergence of the optimization:

    # get data from pswarmdriver's stdout
    $ grep ' = ' optim.log | sed 's/Iter \([0-9]\+\).* = \(.*\)/\1 \2/' > converge.dat
    
    # or get data from sqlite database
    $ sqlite3 -column pswarm.sqlite "select iter,val from patterninfo order by iter asc" > converge.dat
    
    # and plot it using your favorite tool (e.g. gnuplot)
    $ gnuplot -p -e 'plot "converge.dat" u 1:2 w lp'
    

    This produces a line plot of the best objective value (y-axis) against the iteration number (x-axis).
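
Convergence can also be viewed against the cumulative number of objective evaluations instead of the iteration number, since pswarmdriver's stdout carries both. The following is a minimal sketch, assuming the stdout was redirected to "optim.log" and that each line follows the "Iter ... (...):  f[...] = ..." layout described earlier. The function name converge_by_evals and the output file name are hypothetical.

```shell
# Hypothetical helper: print "<iter> <cumulative-evals> <objective>" for every
# per-iteration line in pswarmdriver's stdout; other lines are ignored.
#   $1 = file holding the redirected stdout (e.g. optim.log)
converge_by_evals() {
    # Capture the iteration number, the first number inside the parentheses,
    # and everything after the final " = " (the objective value).
    sed -n 's/^Iter \([0-9][0-9]*\) (\([0-9][0-9]*\)[^)]*):.* = \(.*\)$/\1 \2 \3/p' "$1"
}

# Example usage:
#   converge_by_evals optim.log > converge_evals.dat
#   gnuplot -p -e 'plot "converge_evals.dat" u 2:3 w lp'
```

Plotting column 2 against column 3 makes it easier to compare runs whose iterations differ in cost.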

Investigating single evaluations with cycobj

The cycobj command is provided as part of the cloudlus package as a tool for handling optimization scenario evaluations and single-evaluation/simulation-level introspection. cycobj expects a set of variable values, either passed as space-separated arguments or piped to stdin one per line. A deployment schedule (as printed out by cycobj) can be piped to stdin instead if the -sched flag is given to cycobj. Several examples of typical cycobj usage follow:

  • Transform a set of optimization variable values into a deployment schedule:

    # assuming space-separated variables in "vars.dat" (e.g. from obj.log)
    $ cat vars.dat | xargs cycobj -transform -scen scenario.json
    Prototype        BuildTime Lifetime Number
    slow_reactor     1         192      1
    slow_reactor     1         204      1
    slow_reactor     1         216      1
    ...
    
    # assuming line-separated variables in "vars.dat" (e.g. from pswarm.sqlite)
    $ cat vars.dat | cycobj -transform -scen scenario.json
    Prototype        BuildTime Lifetime Number
    slow_reactor     1         192      1
    slow_reactor     1         204      1
    slow_reactor     1         216      1
    ...
    
  • You can run the simulation corresponding to a particular set of variables or deployment schedule. cycobj will show you the output of the cyclus command followed by the objective value for the simulation. This generates normal Cyclus databases that can be queried and analyzed with all the typical methods/tools. The databases are given uuid-prefixed names.

    # from space-separated variable values in "vars.dat"
    $ cat vars.dat | xargs cycobj -scen scenario.json
    
                :                                                               
            .CL:CC CC             _Q     _Q  _Q_Q    _Q    _Q              _Q   
          CC;CCCCCCCC:C;         /_\)   /_\)/_/\\)  /_\)  /_\)            /_\)  
          CCCCCCCCCCCCCl       __O|/O___O|/O_OO|/O__O|/O__O|/O____________O|/O__
       CCCCCCf     iCCCLCC     /////////////////////////////////////////////////
       iCCCt  ;;;;;.  CCCC                                                      
      CCCC  ;;;;;;;;;. CClL.                          c                         
     CCCC ,;;       ;;: CCCC  ;                   : CCCCi                       
      CCC ;;         ;;  CC   ;;:                CCC`   `C;                     
    lCCC ;;              CCCC  ;;;:             :CC .;;. C;   ;    :   ;  :;;   
    CCCC ;.              CCCC    ;;;,           CC ;    ; Ci  ;    :   ;  :  ;  
     iCC :;               CC       ;;;,        ;C ;       CC  ;    :   ; .      
    CCCi ;;               CCC        ;;;.      .C ;       tf  ;    :   ;  ;.    
    CCC  ;;               CCC          ;;;;;;; fC :       lC  ;    :   ;    ;:  
     iCf ;;               CC         :;;:      tC ;       CC  ;    :   ;     ;  
    ...
    ...
    ...
    ...
    Status: Cyclus run successful!
    Output location: ffa4316f-eb1b-4dd0-8b6e-f055c67df14d.sqlite
    Simulation ID: 51e7c73c-2d8b-47ee-92f2-4b3b054e02b6
    0.254375708458661
    
    
    # from a (possibly hand-modified) deployment schedule in "cycobj -transform" format
    $ cat sched.dat | cycobj -sched -scen scenario.json
    
                :                                                               
            .CL:CC CC             _Q     _Q  _Q_Q    _Q    _Q              _Q   
          CC;CCCCCCCC:C;         /_\)   /_\)/_/\\)  /_\)  /_\)            /_\)  
          CCCCCCCCCCCCCl       __O|/O___O|/O_OO|/O__O|/O__O|/O____________O|/O__
       CCCCCCf     iCCCLCC     /////////////////////////////////////////////////
       iCCCt  ;;;;;.  CCCC                                                      
      CCCC  ;;;;;;;;;. CClL.                          c                         
     CCCC ,;;       ;;: CCCC  ;                   : CCCCi                       
      CCC ;;         ;;  CC   ;;:                CCC`   `C;                     
    lCCC ;;              CCCC  ;;;:             :CC .;;. C;   ;    :   ;  :;;   
    CCCC ;.              CCCC    ;;;,           CC ;    ; Ci  ;    :   ;  :  ;  
     iCC :;               CC       ;;;,        ;C ;       CC  ;    :   ; .      
    CCCi ;;               CCC        ;;;.      .C ;       tf  ;    :   ;  ;.    
    CCC  ;;               CCC          ;;;;;;; fC :       lC  ;    :   ;    ;:  
     iCf ;;               CC         :;;:      tC ;       CC  ;    :   ;     ;  
    ...
    ...
    ...
    ...
    Status: Cyclus run successful!
    Output location: ffa4316f-eb1b-4dd0-8b6e-f055c67df14d.sqlite
    Simulation ID: 51e7c73c-2d8b-47ee-92f2-4b3b054e02b6
    0.254375708458661
    
  • You can transform a (possibly hand-modified) deployment schedule back into variable values - one value per line:

    $ cat sched.dat | cycobj -sched -scen scenario.json -transform
    1
    0
    0
    0
    1
    0
    ...
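
The cycobj examples above all start from a "vars.dat" file. The following is a minimal sketch of one way to produce it from obj.log by picking the evaluation with the lowest objective value. It assumes lower objective values are better (consistent with the decreasing values in the sample output above). The function names best_eval and vars_of are hypothetical and not part of cloudlus.

```shell
# Hypothetical helper: print the line with the lowest objective value from a
# file of "f[<vals>] = <objective>" lines (obj.log or redirected stdout).
#   $1 = input file
best_eval() {
    awk -F' = ' '
        /f\[.*\] = / {
            obj = $2 + 0                      # force numeric comparison
            if (!seen || obj < best) { best = obj; line = $0; seen = 1 }
        }
        END { if (seen) print line }
    ' "$1"
}

# Hypothetical helper: read "f[<vals>] = <objective>" lines on stdin and print
# the bracketed variable values, one per line.
vars_of() {
    sed 's/.*\[\(.*\)\].*/\1/' | awk '{ for (i = 1; i <= NF; i++) print $i }'
}

# Example usage:
#   best_eval obj.log | vars_of > vars.dat
#   cat vars.dat | cycobj -transform -scen scenario.json
```

If several evaluations tie for the lowest objective, the first one in the file is kept.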