-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathheron.diderot
172 lines (143 loc) · 6.56 KB
/
heron.diderot
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
#version 1.0
/* ==========================================
## heron.diderot: Computing square roots via Heron's method
This program finds square roots of numval reals between minval and maxval
using Heron's method (aka the Babylonion method)
https://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Babylonian_method
After compiling, you can run this program with:
<!--(ignore lines starting with "\tab#"; for testing with ../runtests.py)-->
#!
#=diderotc
./heron
#> vrie.nrrd 0
The output stores four numbers for each value processed (by index along fast axis):
<ol start=0>
<li> the value whose square root was found
<li> the computed square root
<li> the number of iterations taken to compute it
<li> the error, relative to Diderot's sqrt() function
</ol>
To see the output (four numbers per line):
#R
unu save -f text -i vrie.nrrd
You can see that with the defaults, it took at most 7 iterations to converge:
#R
unu slice -i vrie.nrrd -a 0 -p 2 | unu minmax -
The command-line executables produced by Diderot have hest-generated
usage infomation. Try:
#R
./heron --help
to see how to set the input values and output filename. Note
that the input variables self-document their purpose with ("...")
annotations, which are in turn included in the generated `--help`
usage information. So the declarations:
#R
input real minval ("min value to find root of") = 1;
input real maxval ("max value to find root of") = 100;
input int numval ("how many values to compute") = 100;
input real eps ("relative error convergence test") = 0.000001;
become in the usage information:
#R
-minval <x> = min value to find root of (double); default: "1"
-maxval <x> = max value to find root of (double); default: "100"
-numval <int> = how many values to compute (long int); default: "100"
-eps <x> = relative error convergence test (double); default: "1e-06"
There are also command-line options that are unrelated to input variables:
#R
-l <int> , --limit <int> = specify limit on number of super-steps (0
means unlimited) (unsigned long int); default: "0"
-print <str> = specify where to direct printed output (string); default: "-"
-v , --verbose = enable runtime-system messages
-t , --timing = enable execution timing
heron.diderot shows the utility of `-l` option to limit the number of iterations
(called "super-steps" in Diderot) for which the computation can run. If you try to increase the precision
of the result by lowering `eps`, as with:
#R
./heron -eps 1e-8
The program may not ever finish, because the limited precision of 32-bit
floats prevents the computation from reaching sufficient accuracy. So, noting
from above that at most 7 iterations were needed with the default `eps`, we
can try limiting the computation at something higher than 7:
#^ ./heron -eps 1e-8 -l 20
#_ ./heron -eps 1e-8 -l 20 -o vrie-lim.nrrd
#> vrie-lim.nrrd 0
This will finish promptly, after 20 iterations. The initialization of the program output variable
with:
#R
output vec4 vrie = [val,-1,-1,-1];
makes clear which strands never stabilized because they were halted by the `-l
20` limit. When you view the output from the command above with `unu save -f
text -i vrie.nrrd`, you'll see lines like:
#R
89 -1 -1 -1
90 -1 -1 -1
91 -1 -1 -1
for the values where the algorithm didn't converge.
The consequences of active strands being halted by an iteration limit is
different for strand **collections** vs strand **arrays**. If the program runs
as a collection of strands, i.e. the last line of the program is instead
#R
initially { sqroot(lerp(minval, maxval, 1, ii, numval)) | ii in 1 .. numval };
(note `initially {}` instead of `initially []`), then running `./heron -eps
1e-8 -l 20` will still finish after 20 iterations, but no output values will
be saved for those strands that didn't stabilize in time. That is, `unu save -f text
-i vrie.nrrd` will produce fewer than 100 lines of text, including only the
values for which the algorithm converged with 20 iterations or fewer.
#T
grep -v "initially \[" heron.diderot > coll.diderot
echo "initially { sqroot(lerp(minval, maxval, 1, ii, numval)) | ii in 1 .. numval };" >> coll.diderot
diderotc --exec coll.diderot
#tmp coll.diderot
./coll -eps 1e-8 -l 20 -o vrie-coll.nrrd
#> vrie-coll.nrrd 0
On the other hand, we can also increase the precision of a Diderot
`real` itself by using a C `double` to store a `real`, instead of a (single-precision) `float`,
the default. We do this by compiling with:
#R
diderotc --double --exec heron.diderot
at which point
#R
./heron -eps 1e-8
should finish, with every strand producing a more accurate answer.
========================================== */
input real minval ("min value to find root of") = 1;
input real maxval ("max value to find root of") = 100;
input int numval ("how many values to compute") = 100;
input real eps ("relative error convergence test") = 0.000001;
/* One strand per value. The name of the strand need
not be related to the name of the source file. */
strand sqroot (real val) {
real root = val; // first guess at sqrt(N) is N
int iter = 0; // count iterations required
/* the output: value, root, iters, error. In Diderot, all
strand variables must be initialized at declaration. */
output vec4 vrie = [val,-1,-1,-1];
// Each update does one iteration of Heron's method.
update {
if (val <= 0) {
// No work to do
stabilize;
}
root = (root + val/root) / 2.0;
iter += 1; // "iter++;" is not in the syntax
/* The convergence test expression below is a manual re-write of
(|root^2 - val|/val < eps) to increase numerical precision. However
that expression would compile too because ^ is exponentiation. */
if (|(root/val)*root - 1| < eps) {
stabilize;
}
}
/* The stabilize method is called upon strand stabilization; you can use
this to do some post-processing on whatever computation has finished,
prior to saving the output. Having the stabilize method is especially
handy if update can call stabilize from multiple places */
stabilize {
// We use the sqrt() built-in to report actual relative error
vrie = [val, root, iter, (root-sqrt(val))/sqrt(val)];
}
}
/* Initialize strands with value to take the square root of; with this
5-argument lerp, the output ranges from minval to maxval as ii ranges
from 1 to numval. The "initially [];" declaration creates an array
of strands (as opposed to a collection, created by "initially {};"). */
initially [ sqroot(lerp(minval, maxval, 1, ii, numval)) | ii in 1 .. numval ];