Parse Ushahidi reports download (csv) and look for discrepancies.
Right now it just takes a download of the full list of reports,
and looks for outliers among reports that share exactly the same
location name.
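
A minimal sketch of the grouping step, in Python. The column names
LOCATION, LATITUDE and LONGITUDE are assumptions about the csv header,
and cluster_by_location is a hypothetical helper, not necessarily how
the script itself is organized.

```python
import csv
from collections import defaultdict

def cluster_by_location(csv_file, name_col="LOCATION",
                        lat_col="LATITUDE", lon_col="LONGITUDE"):
    """Group (lat, lon) points by exact location name.

    The default column names are assumptions; adjust them to match
    the header of the actual download.
    """
    clusters = defaultdict(list)
    for row in csv.DictReader(csv_file):
        try:
            name = row[name_col]
            point = (float(row[lat_col]), float(row[lon_col]))
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric fields
        clusters[name].append(point)
    return dict(clusters)
```

Names are compared exactly, so "Zawiya" and "Zawiya, Libya" would end
up in different clusters.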

Here are some quick hints on the calculations and the currently
cryptic output format.

I group each set of locations that share the same name into a
"LocationCluster". For each one, I calculate a "median" location,
which is just the median of the latitudes and the median of the
longitudes. The median lat and lon may come from different reports,
and each is averaged when there is an even number of points, so the
result is quite possibly not an actual point from a report, but it
should be in the middle in some relatively robust sense.
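
The per-coordinate median described above can be sketched like this.
median_location is a hypothetical name, and points are assumed to be
(lat, lon) tuples.

```python
from statistics import median

def median_location(points):
    """Return (median of latitudes, median of longitudes).

    statistics.median averages the two middle values when the
    number of points is even.
    """
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return (median(lats), median(lons))
```

Because the two coordinates are handled separately, the result need
not coincide with any single input point.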

For each LocationCluster I find the bounding box of the set of points,
and calculate the length of the diagonal of the bounding box (the "extent").
If the extent is more than 0.2 degrees, I print a report like this one:

7.46928 7 ll (32.0640 12.7365) ur (32.7850 20.1709) Zawiya, Libya
median (32.7630 12.7365)
(32.0640 20.1709) http://cal.libyacrisismap.net/admin/reports/edit/397
(32.7850 12.7441) http://cal.libyacrisismap.net/admin/reports/edit/21

The first line has the extent, the number of points, the lower left ("ll") and
upper right ("ur") points of the bounding box, and the name.
Then comes the median, and then one outlier per line, with a link to
the report.
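
The bounding-box check and the first output line can be sketched as
follows. The 7.46928 in the sample is consistent with a plain
Euclidean diagonal measured in degrees, so that is what this sketch
uses. The function names and the exact format string are assumptions
chosen to match the sample, not taken from the script.

```python
import math

EXTENT_THRESHOLD = 0.2  # degrees, from the description above

def bounding_box(points):
    """Return ((min lat, min lon), (max lat, max lon)) for (lat, lon) points."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return (min(lats), min(lons)), (max(lats), max(lons))

def extent(points):
    """Length of the bounding-box diagonal, in degrees."""
    (lat0, lon0), (lat1, lon1) = bounding_box(points)
    return math.hypot(lat1 - lat0, lon1 - lon0)

def summary_line(name, points):
    """Format the first output line, or return None for a tight cluster."""
    ext = extent(points)
    if ext <= EXTENT_THRESHOLD:
        return None  # nothing is printed for clusters within the threshold
    ll, ur = bounding_box(points)
    return "%.5f %d ll (%.4f %.4f) ur (%.4f %.4f) %s" % (
        ext, len(points), ll[0], ll[1], ur[0], ur[1], name)
```

Distances in degrees are not distances on the ground, so the 0.2
threshold corresponds to different ground distances at different
latitudes.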