Evan Weinberg edited this page Aug 27, 2018 · 4 revisions

A unit test for the construction of the HISQ stencil is given in tests/hisq_stencil_test.cpp.

The discussion here is based on the initial paper on the HISQ discretization, [https://arxiv.org/abs/hep-lat/0610092]. The HISQ stencil is constructed out of an ASQTAD smearing, a unitarization, and a second step of ASQTAD smearing plus a three-link correction to improve the dispersion relation. The paper above notes that the sequence of ASQTAD+unitarization+ASQTAD can be algebraically rearranged to skip creating one type of fat link, which is why the two smearing steps described below aren't identical.

The sequence of calls is based on tracking through the calls within the function create_hisq_links_milc defined in the MILC source code in the file generic_ks/fermion_links_hisq_load_milc.c. The calls can be tracked all of the way to two or three calls to the function computeKSLinkQuda defined in interface_quda.cpp within the QUDA source tree.

There are, broadly, three steps to constructing the HISQ stencil:

  1. Creating the fat7 smeared links, V_\mu, and unitarizing them, W_\mu.
  2. Creating the non-relativistic epsilon correction if the coefficient epsilon is non-zero. If it's zero, this step can be skipped.
  3. Creating the fat7 plus Lepage "fat links" X_\mu, and Naik term, out of the W_\mu links. These are added to the epsilon correction if it exists.

As a note, the paths and coefficients for each of the three steps are spelled out in the MILC source code in the file generic_ks/imp_actions/hisq/hisq_action.h. Note that the non-relativistic correction is the third path, even though it is constructed second.
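The ordering of the three steps, and which coefficient array each one uses, can be summarized in a small self-contained sketch. The helper hisq_stage_order is hypothetical (it is not part of QUDA or MILC) and only records the order of operations; it does not touch any gauge field.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical helper, for illustration only: returns the construction
// stages in the order they are executed, labelled by the coefficient array
// each stage uses.
std::vector<std::string> hisq_stage_order(double epsilon) {
  assert(std::isfinite(epsilon));
  std::vector<std::string> stages;
  // Step 1: fat7 smearing (V) followed by unitarization (W).
  stages.push_back("act_path_coeff_1: U -> V -> W");
  // Step 2: the non-relativistic correction, skipped when epsilon is zero.
  if (epsilon != 0.0) {
    stages.push_back("act_path_coeff_3: W -> epsilon correction");
  }
  // Step 3: final fat links X and long links, built from W.
  stages.push_back("act_path_coeff_2: W -> X and long link");
  return stages;
}
```

Note that the array suffixes (1, 3, 2) follow the path numbering in hisq_action.h, not the order of construction.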

First step: constructing V_\mu and W_\mu links

The V_\mu and W_\mu links are constructed via the call

computeKSLinkQuda(vlink,nullptr,wlink,milc_sitelink,act_path_coeff_1, &qudaGaugeParam);

The second argument, which is supplied a nullptr, is where a long-link field would go if it were constructed. We note that V_\mu, saved in vlink, isn't strictly necessary for any subsequent step in constructing the HISQ stencil (though it is checked as part of the unit test). A point of optimization would be keeping it only on the GPU and deallocating it after unitarizing and copying it into the W field. The unitarized link, W_\mu, is stored in wlink. The input gauge field U_\mu is stored in milc_sitelink.

The path coefficients for this step are:

 double act_path_coeff_1[6] = {
     ( 1.0/8.0),                 /* one link */
       0.0,                      /* Naik */
     (-1.0/8.0)*0.5,             /* simple staple */
     ( 1.0/8.0)*0.25*0.5,        /* displace link in two directions */
     (-1.0/8.0)*0.125*(1.0/6.0), /* displace link in three directions */
       0.0                       /* Lepage term */
 };

We note that, in the CPU part of this unit test, V_\mu and W_\mu are computed separately.
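As a quick sanity check on the fat7 coefficients above, the sketch below sums their magnitudes weighted by the number of paths of each type per link direction. The multiplicities (1 one-link, 6 staples, 24 five-link paths, 48 seven-link paths in four dimensions) and the reading of the alternating signs as a path-sign convention are assumptions of this sketch, not taken from the MILC or QUDA source. Under those assumptions the weighted sum is 1, i.e. the fat7 link reduces to the identity for a trivial gauge field.

```cpp
#include <cassert>
#include <cmath>

// Index order follows the coefficient arrays:
// one link, Naik, staple, 5-link, 7-link, Lepage.
// Naik and Lepage are excluded from this sum (both are zero in
// act_path_coeff_1), so their multiplicity is set to zero here.
double fat7_free_field_sum(const double coeff[6]) {
  assert(coeff != nullptr);
  // Assumed number of paths of each type per link direction in 4D.
  const double multiplicity[6] = {1.0, 0.0, 6.0, 24.0, 48.0, 0.0};
  double sum = 0.0;
  for (int i = 0; i < 6; ++i) {
    sum += multiplicity[i] * std::fabs(coeff[i]);
  }
  return sum;
}
```

Each of the four contributions works out to 1/8, 3/8, 3/8, and 1/8 respectively.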

Second step: constructing the non-relativistic correction

The epsilon correction, if it's non-zero, is constructed with the call:

computeKSLinkQuda(fatlink, longlink, nullptr, wlink, act_path_coeff_3, &qudaGaugeParam);

Here, following the MILC convention, the links are constructed into the final fatlink and longlink fields. After constructing these fields, the links are rescaled by the coefficient epsilon and copied into the separate epsilon fields fatlink_eps and longlink_eps. Epsilon is the coefficient of the non-relativistic correction.

The path coefficients for this step are:

double act_path_coeff_3[6] = {
    ( 1.0/8.0),    /* one link b/c of Naik */
    (-1.0/24.0),   /* Naik */
      0.0,         /* simple staple */
      0.0,         /* displace link in two directions */
      0.0,         /* displace link in three directions */
      0.0          /* Lepage term */
  };

The epsilon parameter can be set on the command line in the unit tests with --epsilon-naik. The default value is 0.0. If it's left at the default value of zero, the HISQ stencil unit test skips constructing the non-relativistic correction.

Third step: constructing X_\mu and the long link

The last step is constructing the final fat (X_\mu) and long links out of the W_\mu field. They are constructed with the call:

computeKSLinkQuda(fatlink, longlink, nullptr, wlink, act_path_coeff_2, &qudaGaugeParam);

Here the fat and long links are saved into fatlink and longlink, respectively. The third argument is set to nullptr to denote that there is no unitarization step after the second round of smearing. The calculation here overwrites any existing values in fatlink and longlink, which may be non-zero or uninitialized depending on whether the non-relativistic correction was computed beforehand.

If a non-relativistic correction was computed, fatlink and longlink are subsequently accumulated into fatlink_eps and longlink_eps. A physically relevant use case for keeping around all of these fields is a 2+1+1 QCD simulation: fatlink and longlink are used in the HISQ stencil for the light and strange quarks, while fatlink_eps and longlink_eps are used in the stencil for the charm quark.
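Because the link construction is linear in the path coefficients for a fixed W_\mu field, the accumulation described above can be mirrored at the level of coefficients. The sketch below is an illustration only (effective_eps_coeffs and PathCoeffs are hypothetical names); it combines the step-three coefficients with epsilon times the step-two coefficients to give the effective coefficients behind fatlink_eps and longlink_eps.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Six entries, in the same order as the act_path_coeff arrays.
using PathCoeffs = std::array<double, 6>;

// Effective coefficients of the epsilon-corrected links:
// coeff_2 (final fat and long links) plus epsilon * coeff_3 (correction).
PathCoeffs effective_eps_coeffs(const PathCoeffs& coeff_2,
                                const PathCoeffs& coeff_3,
                                double epsilon) {
  assert(std::isfinite(epsilon));
  PathCoeffs out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = coeff_2[i] + epsilon * coeff_3[i];
  }
  return out;
}
```

With the arrays listed on this page, the Naik entry becomes -(1.0+epsilon)/24.0 and the one-link entry becomes 1.0 + epsilon/8.0; the remaining entries are unchanged.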

The path coefficients for this step are:

double act_path_coeff_2[6] = {
    (( 1.0/8.0)+(2.0*6.0/16.0)+(1.0/8.0)),   /* one link */
        /* One link is 1/8 as in fat7 + 2*3/8 for Lepage + 1/8 for Naik */
    (-1.0/24.0),                             /* Naik */
    (-1.0/8.0)*0.5,                          /* simple staple */
    ( 1.0/8.0)*0.25*0.5,                     /* displace link in two directions */
    (-1.0/8.0)*0.125*(1.0/6.0),              /* displace link in three directions */
    (-2.0/16.0)                              /* Lepage term, correct O(a^2) 2x ASQTAD */
  };

We note that the second and third steps cannot be (easily) fused into a single step because, in the case of the non-relativistic correction, the resulting fat link is simply a rescaled W_\mu gauge link, while in the fat link construction it is a sum of the one-link, staple, two-link, three-link, and Lepage terms.
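The difference in the one-link terms of the two steps can be made concrete with a little arithmetic. The sketch below rebuilds the one-link coefficient from a fat7 base value plus compensations for the Lepage and Naik terms, following the comment in act_path_coeff_2. The factor of 6 (Lepage paths per link direction) and the factor of 3 (from the three-hop Naik derivative) are this sketch's interpretation of that comment, not something taken from the MILC or QUDA source, and expected_one_link is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>

// coeff uses the same index order as the act_path_coeff arrays:
// [1] is the Naik coefficient, [5] is the Lepage coefficient.
// fat7_base is 1/8 when the fat7 paths are present and 0 when they are not.
double expected_one_link(double fat7_base, const double coeff[6]) {
  assert(coeff != nullptr);
  const double lepage_compensation = 6.0 * std::fabs(coeff[5]);
  const double naik_compensation = 3.0 * std::fabs(coeff[1]);
  return fat7_base + lepage_compensation + naik_compensation;
}
```

For act_path_coeff_2 this gives 1/8 + 3/4 + 1/8 = 1, and for act_path_coeff_3 (no fat7 paths, no Lepage term) it gives 1/8, matching the one-link entries of both arrays.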

Comment on using reconstruct-13 with the long links

Reconstruct-13 takes advantage of the fact that the long links live in U(3) = U(1) x SU(3), so they can be compressed as a phase plus recon-12. This still leaves the question of where the factor of -1.0/24.0 (or, in the case of the non-relativistic correction, -(1.0+eps_naik)/24.0) is applied.

These factors are addressed by setting the parameters tadpole_coeff and scale in the QudaGaugeParam structure. For HISQ fermions, set tadpole_coeff = 1.0 and scale = -(1.0+eps_naik)/24.0. When recon-13 is used, the long links are multiplied by 1.0/scale when they are compressed (because the scale factor is already baked into the links that come from the CPU), and the factor of scale is re-applied when the HISQ stencil is applied.

As a remark on backwards compatibility, when using bona-fide ASQTAD fermions, tadpole_coeff needs to be set to the tadpole coefficient (generally [plaq]^(1/4), but for historical reasons not always), and scale needs to be set to -1.0/(24.0*tadpole_coeff*tadpole_coeff). (Using two factors of tadpole_coeff as opposed to three is not a typo; it is a convention of pulling out one factor of the tadpole correction, similar to the convention of pulling out the factor of 1/2 in front of the finite difference.)
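The two parameter choices above, and the divide-then-multiply round trip used with recon-13, can be sketched as follows. LongLinkScaling is a hypothetical stand-in for the two relevant fields of QudaGaugeParam, and the element-wise helpers only illustrate the arithmetic on a single real number; they are not QUDA's compression routines.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the tadpole_coeff and scale fields.
struct LongLinkScaling {
  double tadpole_coeff;
  double scale;
};

// HISQ: tadpole_coeff = 1.0, scale = -(1.0 + eps_naik) / 24.0.
LongLinkScaling hisq_scaling(double eps_naik) {
  return {1.0, -(1.0 + eps_naik) / 24.0};
}

// ASQTAD: scale carries two factors of the tadpole coefficient.
LongLinkScaling asqtad_scaling(double tadpole_coeff) {
  assert(tadpole_coeff > 0.0);
  return {tadpole_coeff, -1.0 / (24.0 * tadpole_coeff * tadpole_coeff)};
}

// Remove the baked-in scale before compression.
double remove_scale(double link_element, const LongLinkScaling& s) {
  assert(s.scale != 0.0);  // eps_naik = -1 would make the scale vanish
  return link_element / s.scale;
}

// Re-apply the scale when the stencil is applied.
double reapply_scale(double unscaled_element, const LongLinkScaling& s) {
  return unscaled_element * s.scale;
}
```

Dividing out the scale is what leaves a link that can be stored as a phase plus recon-12.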

Comment on the physics of HISQ smearing

A single fat7 smearing suppresses all O(a^2) taste-breaking errors (a consequence of its one-link, staple, two-link, and three-link structure), but introduces extra O(a^2) errors unrelated to taste breaking. The addition of the Lepage term cancels these additional O(a^2) errors without reintroducing taste-breaking effects. The combination of fat7 smearing plus the Lepage term is the definition of ASQTAD smearing.

HISQ smearing starts from the idea that if one iteration of smearing is good, two are better. However, while ASQTAD smearing suppresses taste-breaking effects, it enhances contributions from dimension-5 operators. This enhancement is suppressed by the unitarization step, which explicitly bounds the size of these dimension-5 contributions.

The fat link contribution of HISQ smearing can be formally defined as an ASQTAD smearing, a unitarization, and another hit of ASQTAD smearing. As noted above, this can be rearranged algebraically into a fat7 smearing (giving the V_\mu field), unitarization (giving the W_\mu field), and an ASQTAD smearing with twice the Lepage term (giving the X_\mu field).
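Schematically, the rearranged sequence can be written as below. The operator symbols are notation introduced here for illustration: \mathcal{F}_7 denotes fat7 smearing and \mathcal{L} the single Lepage correction, so that ASQTAD smearing is \mathcal{F}_7 - \mathcal{L}. The expression for W_\mu is one common way of writing a projection onto U(3) and is an assumption of this sketch; the unit test and QUDA may compute the unitarization differently.

```latex
\begin{align}
V_\mu &= \mathcal{F}_7 \, U_\mu , \\
W_\mu &= V_\mu \left( V_\mu^\dagger V_\mu \right)^{-1/2} , \\
X_\mu &= \left( \mathcal{F}_7 - 2\,\mathcal{L} \right) W_\mu .
\end{align}
```

The factor of 2 in front of \mathcal{L} is the "twice the Lepage term" of the text, and matches the (-2.0/16.0) Lepage entry in act_path_coeff_2.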

As a last step, a long link (Naik) term is added, constructed from the W_\mu fields. This improves the dispersion relation to O(p^4).

Further, a non-relativistic correction can be added, originally motivated by applications of HISQ fermions to charm physics. This is the addition of the appropriate combination of a one-link and Naik term.
