Commit

Doc clarifications
WardBrian committed Aug 5, 2024
1 parent c89010b commit 1dcaf78
Showing 3 changed files with 39 additions and 20 deletions.
28 changes: 17 additions & 11 deletions src/reference-manual/statements.qmd
@@ -15,7 +15,7 @@ are used --- if they do not, the behavior is undefined.
The basis of Stan's execution is the evaluation of a log probability
function (specifically, a log probability density function) for a given
set of (real-valued) parameters. Log probability functions can be
constructed by using distribution statements and log probability increment
constructed by using distribution statements and log probability increment
statements. Statements may be grouped
into sequences and into for-each loops. In addition, Stan allows
local variables to be declared in blocks and also allows an empty
@@ -248,7 +248,8 @@ of the posterior up to an additive constant. Data and transformed
data are fixed before the log density is evaluated. The total log
probability is initialized to zero. Next, any log Jacobian
adjustments accrued by the variable constraints are added to the log
density (the Jacobian adjustment may be skipped for optimization).
density (the Jacobian adjustment may be skipped for maximum likelihood estimation
via optimization).
Distribution statements and log probability increment statements may add to the log
density in the model block. A log probability increment statement
directly increments the log density with the value of an expression as
@@ -370,9 +371,9 @@ or in functions ending with `_jacobian` to mimic the log Jacobian
adjustments accrued by built-in variable transforms.

Similarly to those implemented for the built-in transforms, these Jacobian adjustments
may be skipped for optimization.
may be skipped for maximum likelihood estimation via optimization.

For example, here is a program which re-creates the existing
For example, here is a program which recreates the existing
[`<upper=x>` transform](transforms.qmd#upper-bounded-scalar) on real numbers:

```stan
@@ -391,16 +392,21 @@ parameters {
transformed parameters {
real b = upper_bound_jacobian(b_raw, ub);
}
model {
// use b as if it was declared `real<upper=ub> b;` in parameters
// e.g.
// b ~ lognormal(0, 1);
}
```

### Accessing the log density {-}

To access accumulated log density up to the current execution point,
To access the accumulated log density up to the current execution point,
the function `target()` may be used.
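
As a minimal sketch of how `target()` might be used for debugging, the following program prints the accumulated log density after each statement in the model block. The data and parameter names (`N`, `y`, `mu`, `sigma`) are hypothetical and chosen only for illustration.

```stan
data {
  int<lower=0> N;       // hypothetical number of observations
  vector[N] y;          // hypothetical observations
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 10);
  // log density accumulated so far, including any Jacobian adjustments
  print("target after prior on mu: ", target());
  y ~ normal(mu, sigma);
  print("target after likelihood: ", target());
}
```

Because distribution statements drop constant terms, the printed values are only meaningful up to an additive constant.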

## Sampling statements {#sampling-statements.section}

The term "sampling statement" has been replaced with
The term "sampling statement" has been replaced with
[distribution statement](#distribution-statements.section).

## Distribution statements {#distribution-statements.section}
@@ -464,7 +470,7 @@ terms in the model block. Equivalently, each $\sim$ statement
corresponds to a multiplicative factor in the unnormalized posterior
density.

Distribution statements (`~`) accept only built-in or user-defined
Distribution statements (`~`) accept only built-in or user-defined
distributions on the
right side. The left side of a distribution statement may be data,
parameter, or a complex expression, but the evaluated type needs to
@@ -484,8 +490,8 @@ target += normal_lpdf(sigma | 0, 1);
```

Stan models can mix distribution statements and log probability
increment statements. Although statistical models
are usually defined with distributions in the literature,
increment statements. Although statistical models
are usually defined with distributions in the literature,
there are several scenarios in which we may want to code the log
likelihood or parts of it directly, for example, due to computational
efficiency (e.g. censored data model) or coding language limitations
@@ -517,7 +523,7 @@ target += dist_lpmf(y | theta1, ..., thetaN);

This will be well formed if and only if `dist_lpdf(y | theta1,
..., thetaN)` or `dist_lpmf(y | theta1, ..., thetaN)` is a
well-formed expression of type `real`. User defined distributions
well-formed expression of type `real`. User defined distributions
can be defined in functions block by using function names ending with `_lpdf`.
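
For illustration, here is a minimal sketch of a user-defined distribution. The function name `my_exponential_lpdf` and the variables `y` and `rate` are hypothetical; the function simply re-implements the exponential log density so that it can be used on the right side of a distribution statement.

```stan
functions {
  // hypothetical user-defined log density; the name must end in _lpdf
  real my_exponential_lpdf(real y, real rate) {
    return log(rate) - rate * y;
  }
}
data {
  real<lower=0> y;
}
parameters {
  real<lower=0> rate;
}
model {
  // the _lpdf suffix is dropped when used in a distribution statement
  y ~ my_exponential(rate);
  // the same contribution written as an increment statement would be:
  // target += my_exponential_lpdf(y | rate);
}
```

Only one of the two forms should be used for a given term; including both would count the term twice.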


@@ -913,7 +919,7 @@ The equivalent code for a vectorized truncation depends on which of the
variables are non-scalars (arrays, vectors, etc.):

1. If the variate `y` is the only non-scalar, the result is the same as
described in the above sections, but the `lcdf`/`lccdf` calculation is
described in the above sections, but the `lcdf`/`lccdf` calculation is
multiplied by `size(y)`.

2. If the other arguments to the distribution are non-scalars, then the
12 changes: 8 additions & 4 deletions src/reference-manual/user-functions.qmd
@@ -95,7 +95,7 @@ arguments to produce an expression, which has a value when executed.
### Functions as statements {-}

Functions with void return types may be applied to arguments and used
as [statements](statements.qmd).
as [statements](statements.qmd).
These act like distribution statements or print
statements. Such uses are only appropriate for functions that act
through side effects, such as incrementing the log probability
@@ -161,7 +161,11 @@ used in place of parameterized distributions on the right side of
Functions of certain types are restricted on scope of usage.
Functions whose names end in `_lp` assume access to the log
probability accumulator and are only available in the transformed
parameter and model blocks.
parameters and model blocks.

Functions whose names end in `_jacobian` assume access to the log
probability accumulator and may only be used within the transformed
parameters block.

Functions whose names end in `_rng`
assume access to the random number generator and may only be used
@@ -293,8 +297,8 @@ a function elsewhere results in a compile-time error.

### Log probability access in functions {-}

Functions that include
[distribution statements](statements.qmd#distribution-statements.section) or
Functions that include
[distribution statements](statements.qmd#distribution-statements.section) or
[log probability increment statements](statements.qmd#increment-log-prob.section)
must have a name that ends in `_lp`.
Attempts to use distribution statements or increment log probability
19 changes: 14 additions & 5 deletions src/stan-users-guide/user-functions.qmd
@@ -293,15 +293,20 @@ Functions whose names end in `_jacobian` can use the
`jacobian +=` statement. This can be used to implement a custom
change of variables for arbitrary parameters.

For example, here is a program which re-creates the built-in
For example, this function recreates the built-in
`<upper=x>` transform on real numbers:
```stan
real upper_bound_jacobian(real x, real ub) {
jacobian += x;
return ub - exp(x);
}
```
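
To see why the increment is `x`, it may help to write out the change of variables. This is a short derivation sketch, where $p_b$ denotes the density placed on the constrained value $b$ and $p_x$ the implied density on the unconstrained value $x$ (both symbols are introduced here for illustration):

$$
b = \mathrm{ub} - \exp(x),
\qquad
\left| \frac{\mathrm{d}b}{\mathrm{d}x} \right| = \exp(x),
\qquad
\log p_x(x) = \log p_b\bigl(\mathrm{ub} - \exp(x)\bigr) + x .
$$

The final term, $x = \log \exp(x)$, is the log absolute Jacobian determinant, which is the quantity added by `jacobian += x`.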

It can be used as a replacement for `real<upper=ub>` as follows:

```stan
functions {
real upper_bound_jacobian(real x, real ub) {
jacobian += x;
return ub - exp(x);
}
// upper_bound_jacobian as above
}
data {
real ub;
@@ -312,6 +317,10 @@ parameters {
transformed parameters {
real b = upper_bound_jacobian(b_raw, ub);
}
model {
b ~ lognormal(0, 1);
// ...
}
```

## Functions acting as random number generators