%------------------------------------------------------------------------------
% Electron diffraction data processing with DIALS
%------------------------------------------------------------------------------
%
%
\documentclass[preprint]{iucr}
%\documentclass[preprint, pdf]{iucr}
\pdfoptionpdfminorversion=5
%----------------------------------------------------------------------------
% Extra packages
%----------------------------------------------------------------------------
\usepackage{graphicx} % For graphics
\usepackage{mathtools} % Math stuff
\usepackage{bm} % Bold in maths
\usepackage{listings} % Code snippets
\usepackage{bold-extra} % Bold mono space for code snippets
\usepackage{url} % For URLs
\usepackage{xspace} % Spacing in macros
\usepackage{color} % Colours
\usepackage{textcomp} % Required by listings when upquote=true
\usepackage{gensymb}
\usepackage{booktabs}
\usepackage{siunitx} % Proper formatting for units
%----------------------------------------------------------------------------
% Information about the paper
%----------------------------------------------------------------------------
\paperprodcode{a000000}
\paperref{xx9999}
\papertype{FA}
\paperlang{english}
%----------------------------------------------------------------------------
% Information about journal
%----------------------------------------------------------------------------
\journalcode{D}
\journalyr{2018}
%\journaliss{1}
%\journalvol{56}
%\journalfirstpage{000}
%\journallastpage{000}
\journalreceived{\relax}
\journalaccepted{\relax}
\journalonline{\relax}
%----------------------------------------------------------------------------
% Bits of formatting used throughout document
%----------------------------------------------------------------------------
\newcommand{\cctbx}{\emph{cctbx}\xspace}
\newcommand{\dxtbx}{\emph{dxtbx}\xspace}
\newcommand{\lstbx}{\emph{lstbx}\xspace}
\newcommand{\dials}{\emph{DIALS}\xspace}
\newcommand{\dialsreport}{\emph{dials.report}\xspace}
\newcommand{\dialsestimategain}{\emph{dials.estimate\_gain}\xspace}
\newcommand{\dialsfindspots}{\emph{dials.find\_spots}\xspace}
\newcommand{\dialsimport}{\emph{dials.import}\xspace}
\newcommand{\dialsindex}{\emph{dials.index}\xspace}
\newcommand{\dialsrefinebravaislattice}{\emph{dials.refine\_bravais\_lattice}\xspace}
\newcommand{\dialsrefine}{\emph{dials.refine}\xspace}
\newcommand{\dialsintegrate}{\emph{dials.integrate}\xspace}
\newcommand{\dialsimageviewer}{\emph{dials.image\_viewer}\xspace}
\newcommand{\dialsreciprocallatticeviewer}{\emph{dials.reciprocal\_lattice\_viewer}\xspace}
\newcommand{\dialsexport}{\emph{dials.export}\xspace}
\newcommand{\ccpfour}{\emph{CCP4}\xspace}
\newcommand{\labelit}{\emph{LABELIT}\xspace}
\newcommand{\cctbxxfel}{\emph{cctbx.xfel}\xspace}
\newcommand{\code}{\texttt}
\newcommand{\xds}{\emph{XDS}\xspace}
\newcommand{\mosflm}{\emph{MOSFLM}\xspace}
\newcommand{\pointless}{\emph{POINTLESS}\xspace}
\newcommand{\aimless}{\emph{AIMLESS}\xspace}
\newcommand{\blend}{\emph{BLEND}\xspace}
\newcommand{\refmac}{\emph{REFMAC5}\xspace}
\newcommand{\phaser}{\emph{PHASER}\xspace}
% use bold face for vectors
\renewcommand{\vec}[1]{\mathbf{#1}}
\newcommand{\mat}[1]{\mathbf{#1}}
% derivatives
\newcommand{\pder}[2][]{\frac{\partial#1}{\partial#2}}
\newcommand{\tder}[2][]{\frac{\mathrm{d}#1}{\mathrm{d}#2}}
%----------------------------------------------------------------------------
% Configure the listing environment
%----------------------------------------------------------------------------
% Define the Python style
\definecolor{pykeyword}{rgb}{0,0,0.5}
\definecolor{pystring}{rgb}{0,0.5,0}
\definecolor{pycomment}{rgb}{0.6,0.6,0.6}
\newcommand\pythonstyle{
\lstset{
language=Python,
basicstyle=\scriptsize\ttfamily,
upquote=true,
showstringspaces=false,
otherkeywords={self},
keywordstyle=\color{pykeyword},
stringstyle=\color{pystring},
commentstyle=\color{pycomment},
}
}
% Python environment
\lstnewenvironment{python}[1][] {
\pythonstyle
\lstset{#1}
} {}
% use to fix order in bibtex entries
\newcommand{\mockalph}[1]{}
% Comments from DW:
\newcounter{DWCounter}
\newcommand{\DW}[1]{%
\stepcounter{DWCounter}%
{\color{red}{\textbf{DW \#\arabic{DWCounter}: }#1}}%
}
% Comments from TG:
\newcounter{TGCounter}
\newcommand{\TG}[1]{%
\stepcounter{TGCounter}%
{\color{green}{\textbf{TG \#\arabic{TGCounter}: }#1}}%
}
% Comments from MC:
\newcounter{MCCounter}
\newcommand{\MC}[1]{%
\stepcounter{MCCounter}%
{\color{blue}{\textbf{MC \#\arabic{MCCounter}: }#1}}%
}
\begin{document}
%----------------------------------------------------------------------------
% Title of the paper + short title for header
%----------------------------------------------------------------------------
\title{Electron diffraction data processing with \dials}
\shorttitle{\dials for ED}
%----------------------------------------------------------------------------
\author[a]{Max T.B.}{Clabbers}
\author[b]{Tim}{Gruene}
\author[c]{James M.}{Parkhurst}
\author[a,b]{Jan Pieter}{Abrahams}
\cauthor[d,e]{David G.}{Waterman}{[email protected]}{}
%----------------------------------------------------------------------------
% Affiliations
%----------------------------------------------------------------------------
\aff[a]{Center for Cellular Imaging and NanoAnalytics (C-CINA),
Biozentrum, University of Basel,
Mattenstrasse 26,
4058 Basel, Switzerland}
\aff[b]{Paul Scherrer Institute,
5232 Villigen PSI,
Switzerland}
\aff[c]{Diamond Light Source Ltd,
Harwell Science and Innovation Campus,
Didcot,
OX11 0DE,
UK}
\aff[d]{STFC Rutherford Appleton Laboratory,
Didcot,
OX11 0FA,
UK}
\aff[e]{CCP4,
Research Complex at Harwell,
Rutherford Appleton Laboratory,
Didcot,
OX11 0FA,
UK}
\shortauthor{Clabbers, Gruene, Parkhurst, Abrahams \& Waterman}
%----------------------------------------------------------------------------
% Create the title
%----------------------------------------------------------------------------
\maketitle
%------------------------------------------------------------------------------
% Synopsis
%------------------------------------------------------------------------------
\begin{synopsis}
Adaptations to the \dials package are described that make it a suitable choice
for processing challenging continuous rotation electron diffraction data.
Results of using the extended package are presented for a case consisting of
seven example datasets.
\end{synopsis}
%------------------------------------------------------------------------------
% Abstract
%------------------------------------------------------------------------------
\begin{abstract}
Electron diffraction is a relatively novel alternative to X--ray crystallography
for the structure determination of macromolecules from three-dimensional
nanometre-sized crystals. The rotation method of data collection has
been adapted for the electron microscope. However, there are important
differences in geometry that must be considered for successful data
integration. The wavelength of electrons in a transmission electron microscope (TEM) is typically around forty
times shorter than that of X--rays, implying a nearly flat Ewald sphere and consequently low diffraction
angles and a large effective sample to detector distance. Nevertheless, the
\dials software package can, with specific adaptations, successfully process
continuous rotation electron diffraction data. Pathologies encountered
specifically in electron diffraction make the data integration more
challenging. Errors can arise from instrumentation, such as beam drift or
distorted diffraction patterns from lens imperfections. The diffraction
geometry brings additional challenges such as strong correlation between
lattice parameters and detector distance. These issues are compounded if
calibration is incomplete, leading to uncertainty in experimental geometry, such
as the effective detector distance and the rotation rate or direction. Dynamic
scattering, absorption, radiation damage and incomplete wedges of data are
additional factors that complicate data processing. Here, recent features of
\dials as adapted to electron diffraction processing are shown, including
diagnostics for problematic diffraction geometry refinement, refinement of a
smoothly-varying beam model and corrections for distorted diffraction images.
These novel features, combined with the existing tools in \dials, make
data integration and refinement feasible for electron crystallography, even in
difficult cases.
\end{abstract}
\newpage
\section{Introduction}
Electron diffraction allows structural analysis of nanometre-sized samples of
crystalline material. Since the maximal radiation dose is proportional to
sample volume, electron diffraction of organic and macromolecular compounds was
long limited to 2D samples\footnote{S. Hovm{\"o}ller, Workshop ``3D Electron
Crystallography of Macromolecular Compounds'', 2017.}\cite{unwin-henderson:1975}.
In contrast to X--ray crystallography, the three domains of inorganic, organic,
and macromolecular electron crystallography were developed rather independently
of each other
\cite{vainshtein:1964,dorset:1995,adt:2007,glaeser_downing_derosier:2007,zou:2011}.
Physical and instrumental limitations, such as miniature sample size, dynamic
scattering effects and lens distortions, affect data precision. However,
several studies show that model accuracy is comparable with that of X--ray structures
\cite{weirich:1996,zeo_adt:2014,dorset:1992,palatinus:2017}. Only about one and
a half decades ago, electron diffraction of 3D crystals was pioneered with
automated diffraction tomography (ADT) and further refined with rotation electron
diffraction (RED) \cite{adt:2007,rotmethod_e:2010,gemmi_adt:2015}. Recently,
single crystal 3D electron diffraction has also been applied to protein
crystals, by using the standard rotation method
\cite{Arndt1977,Nederlof2013,Hattne2015,Yonekura2015,Clabbers2017}. That
integration software with profile fitting and scaling has only very recently
been used for such data is indicative of the
independent development of electron diffraction. These methods have been in use
for decades in X--ray crystallography, improving the quality of diffraction
intensities and their standard uncertainties, whilst enabling heuristic
correction for systematic errors \cite{pflugrath:1999,leslie1999integration}.
\dials is a relatively new package for diffraction integration
\cite{Winter2018}, designed as an extensible toolkit for the implementation of
algorithms relevant to diffraction data analysis. The core set of algorithms
is presented as a suite of command-line programs that can be used following
simple protocols to integrate datasets collected using the rotation method
\cite{Arndt1977}. Many of these algorithms are implementations of tried and
tested methods described in numerous publications over the past three decades
\cite{leslie1999integration,LURE1986phase1and2,LURE1986phase3,kabsch2010xds}.
However, the toolkit design of \dials facilitates the construction of new
algorithms \cite{Gildea2014,Parkhurst2016,Parkhurst2017}. \dials is an
open-source project, allowing scientists from outside the core collaboration to
contribute software, or to use \dials within their own projects.
% See cctbx.xfel, iota, prime for example. These are rather off-topic for
% citation here
To date, \dials development has focused on macromolecular and chemical
crystallography datasets, optimised for continuous rotation data collected in
fine slices using photon counting detectors at synchrotron light sources.
Despite this emphasis, with suitable modification of parameters at certain
steps, high quality results have also been obtained for wide-sliced X--ray
datasets recorded on CCD detectors \cite{dials_adsc:2016a,dials_adsc:2016b}. The common
fundamental assumption is that reciprocal lattice points pass through the Ewald
sphere by constant-velocity rotation around a single axis.
%Reciprocal lattice
%points have a finite extent and the total scattered intensity for one
%reflection is the integration over the reflecting range of slices of
%the reciprocal lattice point that instantaneously satisfy Bragg's law.
No artificial restrictions on the diffraction geometry are imposed, allowing the
modelling of diffraction experiments using a generic vectorial description
\cite{Waterman2016}. By default, two measurements are made for each reflection:
summation integration and three dimensional profile fitting, along with
estimated errors \cite{Winter2018}. The simplicity of this approach, avoiding
assumptions inherent in the details of any particular technique, means that
\dials is readily adapted for analysis beyond the original scope of its design.
A common feature shared between \dials programs is the global modelling of an
experiment, in which data are assumed to be complete before analysis begins.
This has some advantages over the traditional approach of processing data by
means of a moving window that passes over the complete data set in blocks of a
local range of images. One is that the expensive step of integration can be
performed with a high level of parallelism, as the experimental model is
determined completely ahead of time. A second is that the programs can consider
multiple experiments simultaneously without losing track of the connections
between them. This feature has particular relevance to the global refinement of
diffraction geometry, for which experiments may share some models
\cite{Waterman2016}, certain parameters may be constrained to shift together,
or restraints may be applied between multiple crystal models. These features
can be important for the analysis of electron diffraction datasets, for which
determining accurate diffraction geometry may be challenging
\cite{review_adt_red:2015}, and current technology usually imposes the
collection of incomplete wedges of data for each crystal. Here we discuss the
use of \dials for the analysis of electron diffraction data that have been
collected using the rotation method. As a motivating example we describe the
stages of data processing with reference to seven datasets collected on
orthorhombic crystals of a dimeric form of hen egg-white lysozyme, previously
reported in \citeasnoun{Clabbers2017}.
\section{Methods and results}
\subsection{Image formats}
The first stage in processing rotation data with \dials is to import the images
constituting the data set to form a \code{DataBlock}, using the \dxtbx library
\cite{Parkhurst2014}. This library contains format reading classes for the
majority of common file formats used in X--ray crystallography. The classes are
arranged in a hierarchy, from generic classes that contain code to read image
data and construct an experimental model solely from metadata contained in the
image headers, to specific classes that may recognise a particular instrument
and can override incorrect metadata or supply missing metadata. This feature is important
for reading file formats used in electron microscopy because current
instruments usually do not transfer all the information that is required to
reconstruct the experimental geometry. There are three main approaches that can
be taken to import electron diffraction data into \dials.
\begin{enumerate}
\item Externally convert the native format into a format more common for MX.
This is the usual approach adopted for data processing with
other programs such as \mosflm~\cite{leslie2007} and
\xds~\cite{kabsch2010xds}. For example, data sets have been converted to SMV
\cite{Hattne2015}, PCK \cite{Clabbers2017}, or CBF images \cite{Gruene2018}.
%%CBF, the
%Crystallographic Binary Format, \cite{Bernstein2005}, is probably the most
%widely used nowadays. It is a self-describing format combining text strings
%that conform to the imgCIF dictionary to provide comprehensive metadata with
%binary blocks of image data.
Where external conversion programs exist, this has the advantage that no
coding or understanding of the original file format is required by the user.
Often, missing metadata can be supplied during the conversion so the
resulting images contain a proper description of the experiment and no
additional overrides are required when importing the dataset into \dials. The
same set of images can then also be used with other data processing packages.
However, the reliance on an external conversion tool has some drawbacks.
There is scope for errors when metadata are introduced manually during
the conversion. The proliferation of conversion tools adds complication for
the user and the fidelity of the conversion process must be checked. For
example, image export functions within microscope vendor-supplied software to
common formats such as TIFF might not preserve the real pixel intensities,
and this fact may not be clear to the user. Even when data are properly
converted, the generic readers for standard MX formats may contain
assumptions that are not appropriate for electron diffraction, such as the
creation of a polarised beam model. Generic readers might also not allow the
desired interpretation for sophisticated cases, such as splitting a data
array for a multiple panel detector model or defining masks for certain
regions of images.
\item Extend the \dxtbx library to recognise native data formats. This
approach entails writing a format class (typically a single, small Python
module) to contribute to \dxtbx, following the published description
\cite{Parkhurst2014}, and existing examples. This requires knowledge of the
native data format and conventions used by \dxtbx, as well as coordination
with the \dials developers. The advantage of investing this effort is that
once included in the library, the native data format will be supported for
all users with no additional conversion steps. In practice, however, where
native formats lack the metadata describing the diffraction experiment, this
will have to be supplied each time during data import, either by providing
parameters at the command line or in a file in the PHIL format, a simple data
interchange format used within the \cctbx~\cite{Grosse-Kunstleve2002}.
Appendix~\ref{app:PHIL_example} contains an example of such a file. Format
classes for native file types that have now been added to \dxtbx include
image stacks in the TIA Series Data (ESD) format used by software provided
with FEI microscopes and image stacks in Gatan DM4 format.
\item For local installations, testing, or one-off developments for a
particular data processing problem it may be more appropriate to create a
format class as a plugin rather than contributing to the \dxtbx library.
There is no difference in the procedure required to implement the class; the
resulting Python module should simply be placed in a \code{.dxtbx} directory
in the user's home area and this will automatically be picked up at runtime
when required. Various plugins for electron diffraction are collected at
\url{https://github.com/dials/dxtbx_ED_formats} and can be downloaded and
modified freely.
\end{enumerate}
The seven lysozyme datasets discussed here consist of diffraction images from a
$1024\times1024$ pixel detector composed of a $2\times2$ array of Timepix quad detectors
\cite{Clabbers2017}. Large gaps between the Timepix quads are imposed by the
form factor of each quad. For the original \xds processing of these data, the
images were converted into PCK format, in which pixel values were interpolated
onto an orthogonal grid, with the gaps forming `dead' areas of the image array.
For processing with \dials we chose a multiple panel description
\cite{Parkhurst2014}. The images were converted to
CBF\footnote{\url{https://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Timepix2cbf}}
without interpretation of the gaps. We created a \dxtbx format class specific
for these images, which represents each quad as a separate panel of a composite
detector. In this way, no interpolation is required because each panel has an
independent position and orientation, thus sub-pixel shifts and rotations can be
represented precisely. The \dialsimageviewer takes account of the relative
position and orientation of independent panels and displays a composite image
projected onto a viewing plane, as shown by
Figure~\ref{fig:spotfinding}.
A $512\times512$ pixel Timepix quad is an assembly of four abutting Timepix ASICs,
each with $256\times256$ square, $\SI{55}{\micro\metre}$ pixels. However, the distance
between two abutting Timepix ASICs is $\SI{350}{\micro\metre}$, corresponding
to a pitch for the abutting pixels that is about three times that of the other
pixels. Since these pixels have a larger surface area, they collect
more electrons on average. To correct for this non-uniformity, the conversion to
CBF splits pixels with an \emph{x}- (and/or \emph{y}-) coordinate that equals 256 or 257 into three
pixels that are $\SI{55}{\micro\metre}$ wide (or high). This results in $516\times516$ pixel
frames with a discernible, 6-pixel wide cross, in which the pixels have a gain that
is about three times higher than that of the other pixels outside the cross. This was
corrected by multiplying the counts of the unaffected pixels by a factor of three. The
converted images therefore model a detector with a gain of 3. This was recorded in the
\dxtbx format class so that the correct gain value would be used automatically, e.g.
in the calculation of error estimates for integrated intensities.
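
The frame size quoted above follows from simple counting, sketched here for one
axis of a quad. Of the 512 native pixel columns, the two that abut the join
between ASICs are each replaced by three columns of regular width, so that each
wide pixel is represented by $3 \times \SI{55}{\micro\metre} = \SI{165}{\micro\metre}$
of detector surface and the number of columns $N$ (a symbol introduced here for
illustration only) becomes
\begin{equation*}
N = 512 - 2 + 2 \times 3 = 516 .
\end{equation*}
The same counting applies along the other axis, giving the $516\times516$ pixel
frame with a cross that is $2 \times 3 = 6$ pixels wide.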
\subsection{Spot finding \label{sec:spot_finding}}
The spot finding algorithm used in \dials is rather sensitive to the detector
gain. No automatic evaluation of the gain is performed prior to spot finding,
although a value can be determined using the program \dialsestimategain. This
uses the mean and variance of pixels within a region of interest
\cite{leslie2006integration} and may significantly underestimate the true gain
for detectors that have non-negligible point spread, or corrections applied
that reapportion signal between neighbouring pixels \cite{Waterman2010}. If the
correct gain is known it is usual for this to be set by the format class used
to import images. Otherwise, a suitable value should be passed to
\dialsfindspots for use by the spot finding algorithm. In difficult cases it
may be necessary to optimise the gain and other spot-finding parameters, the
effects of which can be explored interactively using \dialsimageviewer. For the
seven example datasets discussed here we typically found that it was necessary
to increase the sensitivity of spot-finding, and then reduce additional noise
by using a global threshold. Appropriate spot-finding settings were determined
manually for each dataset separately. The effect of these settings for
\emph{dataset 1} is shown in Figure~\ref{fig:spotfinding}.
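
The principle behind a mean and variance estimate of the gain can be sketched as
follows. This is an idealised illustration that assumes pure Poisson counting
statistics within a uniform region of interest, not a description of the exact
procedure in \dialsestimategain; the symbols $n$, $I$ and $G$ are introduced
here for illustration. If $n$ quanta are detected in a pixel and the recorded
value is $I = G n$, then
\begin{equation*}
\langle I \rangle = G \langle n \rangle, \qquad
\mathrm{Var}(I) = G^{2}\,\mathrm{Var}(n) = G^{2} \langle n \rangle
\quad\Longrightarrow\quad
G = \frac{\mathrm{Var}(I)}{\langle I \rangle} .
\end{equation*}
Point spread, or any correction that shares signal between neighbouring pixels,
smooths the image and lowers the observed variance without changing the mean,
which is why the ratio then underestimates $G$.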
\subsection{Experiment geometry}
The most substantial difference between the processing of rotation data from
electron diffraction compared to X--ray diffraction lies in the modelling of the
diffraction geometry. The short wavelength of an electron beam
($\SI{0.02508}{\angstrom}$ for 200 keV electrons compared to
$\SI{1.0332}{\angstrom}$ for 12 keV X--rays) implies a correspondingly large
Ewald sphere, with a small
$2\theta$ scattering angle even for the highest resolution reflections.
The low
diffraction angles imply a large effective sample to detector distance needed to magnify
the diffraction pattern and achieve sufficient spatial
separation between peaks. Large detectors are advantageous
for crystallography because they allow the sample to detector distance to
be increased, which both reduces diffuse background and improves spatial
separation of peaks \cite{Stanton1993}. However,
the detector distance is limited in a transmission electron microscope (TEM)
by the largest possible magnification,
and the relatively small size of the detectors. Whilst the true camera position underneath the
TEM column is always at a fixed distance, the effective detector distance
is set by the objective lens and does not correspond directly to a quantity that
can be measured mechanically. Similar to an X--ray beamline, the sample to detector distance in
a TEM is easily calibrated with reliable test crystals. However,
inaccuracy in the recorded effective distance may be difficult to correct by the
usual process of diffraction geometry refinement due to the high correlation
between unit cell parameters and the detector distance when
$2\theta_{\text{max}}$ is small, because the Ewald sphere construction then becomes
nearly invariant under linear scaling (see Section~\ref{sec:refinement})
\cite{VanGenderen2016}. In addition, imperfections in the lens system may
introduce distortions in the recorded diffraction images. Disregarding
such defects, discussed further in Section~\ref{sec:distortion}, the processing
software can ignore the lens system and model the experiment with an effective
detector distance.
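
The origin of the correlation between unit cell parameters and detector distance
can be seen from a small-angle sketch, assuming a flat detector normal to the
beam at an effective distance $D$; the symbols $D$, $r$ and $d$ are introduced
here for illustration. A reflection with lattice spacing $d$ is recorded at a
radial distance $r$ from the beam centre given by
\begin{equation*}
r = D \tan 2\theta, \qquad \lambda = 2 d \sin\theta
\quad\Longrightarrow\quad
r \approx \frac{D \lambda}{d} \quad (2\theta \ll 1) .
\end{equation*}
In this limit the spot positions depend only on the ratio $D/d$, so multiplying
the detector distance and all unit cell lengths by a common factor leaves the
predicted pattern essentially unchanged.
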
The relatively
extreme geometry of electron diffraction is unfamiliar to many X--ray
crystallographers. It is instructive to compare graphical schematics, such as
Figure 6 in \citeasnoun{Clabbers2018} for the real space geometry of the instruments
and Figure~\ref{fig:ewald} for a comparison of the Ewald construction in
reciprocal space for the two cases.
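
A rough numerical comparison makes the difference concrete. It uses the two
wavelengths quoted above and a nominal resolution of $d = \SI{2}{\angstrom}$,
which is chosen here purely for illustration. The radius of the Ewald sphere is
$1/\lambda$ and the scattering angle follows from Bragg's law,
$2\theta = 2\arcsin(\lambda/2d)$:
\begin{align*}
\text{electrons:}\quad & 1/\lambda \approx \SI{39.9}{\angstrom^{-1}}, &
2\theta &= 2\arcsin\left(\frac{0.02508}{2\times 2}\right) \approx 0.72^{\circ}, \\
\text{X--rays:}\quad & 1/\lambda \approx \SI{0.968}{\angstrom^{-1}}, &
2\theta &= 2\arcsin\left(\frac{1.0332}{2\times 2}\right) \approx 29.9^{\circ} .
\end{align*}
For these values the Ewald sphere for electrons has a radius about 41 times
larger than that for X--rays, and the whole diffraction pattern to
$\SI{2}{\angstrom}$ lies within less than one degree of the direct beam.
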
Another potential source of inaccuracy in the initial model for the diffraction
geometry arises because of the relatively poor characteristics of the sample
positioning stage of electron microscopes, as compared to X--ray goniometers for the
purpose of rotation method experiments. Improved setups are possible,
but are not widely available \cite{Yonekura2015,Shi2016}.
The rotation range per image is generally assumed constant and accurate.
Instruments used
for electron diffraction should therefore be well calibrated
\cite{gemmi_adt:2015}. Small, smooth deviations from the expected rotation angle
can then be modelled as part of the scan-varying refinement of the crystal.
Generally, there may be
uncertainty regarding the orientation of the rotation axis, the direction of
rotation, and the rotation range per image. A reasonable estimate of the
rotation axis orientation in the plane of the images can be made by finding a
line through the beam centre along which reflections have the widest reflecting
range, and few reflections are found. The direction of rotation around the axis
is more difficult to determine. For an X--ray experiment, the curvature of the
Ewald sphere makes the incorrect choice obvious, for example using a visual
tool such as \dialsreciprocallatticeviewer \cite{Winter2018}. By contrast, the
flatness of the Ewald sphere in electron diffraction ensures that either choice
of handedness of rotation will produce regular reciprocal lattice positions,
as shown by Figure~\ref{fig:invert_axis}. If indexing is successful, it is
likely to work either way. For any case where there is ambiguity, the inverse
direction should also be tested and results compared. The correct solution will
have a lower RMSD for the angular residual between the predicted and observed
positions of the reflections.
\subsection{Image distortion due to lens effects \label{sec:distortion}}
Image distortion is not unique to electron crystallography. In X--ray
crystallography, geometrical distortions may be present due to components
of the detector system. A familiar example is the spatial distortion
introduced by the fibre-optic taper in a phosphor-taper-CCD area detector
\cite{Stanton1992}. In that case, the distortion is a fixed property of the
detector and it is usual for images to be corrected by manufacturer-supplied
routines prior to analysis. Nevertheless, data processing packages such as \xds
have facilities for applying a distortion correction in the form of look-up
tables. Even with the advent of Hybrid Pixel Array Detectors, which have a
direct coupling between the detector surface and the counting electronics,
geometrical distortion corrections may be used to account for sub-pixel shifts and
misorientations between the modules of the detector array. With electron
crystallography, geometrical distortions of the detector are no less relevant,
while there is the additional factor of the possibility of distortion of
the diffraction pattern itself due to effects of the electron optical system.
Possible distortions include
anisotropic magnification where the diffraction pattern is elongated
in one direction, transforming a circular powder pattern to an ellipse
\cite{lenscorr_2dx:2006,Clabbers2017}. Care must be taken to investigate the
presence of these effects in electron diffraction datasets and, as they are
not mechanical properties of the instrument, it is necessary to recalibrate
when instrument settings are changed.
Although the distortion arises in the direction of the scattered
rays rather than as a property of the detector,
it is reasonable to correct images by the same means as for other sources of
distortion. The difference in scattering angle implied by distorted or corrected
images is likely to have a negligible effect
on parallax correction. In any case, the detector model for the example datasets
discussed here has no parallax correction, thus making the assumption that
observed intensity is essentially deposited on the surface of the ``virtual
detector'' \cite{Parkhurst2014}. Within \dials, we implemented a mode
for distortion correction similar to that used in \xds. A pair of distortion maps encodes the
pixel offset across the detector for both the fast and slow directions. These
maps are equal in size to the pixel array of the detector (for a multiple panel
detector the correction files encode a list of separate maps for each panel).
No interpolation is performed during the application of the distortion maps. In
principle, sharp changes to correct for shear defects would be possible;
however, for the case of lens aberration, the offset varies slowly over the face
of the detector so that neighbouring values in the look-up table are similar.
The distortion maps are applied during the conversion between detector pixel
coordinates and virtual detector millimetre coordinates. During the
transformation from millimetre coordinates to pixel coordinates, the
uncorrected pixel coordinate is first calculated and the correction is applied
to obtain the distortion corrected pixel coordinate. Likewise, during the
transformation from pixel coordinates to millimetre coordinates, the reverse
correction is first applied and the millimetre coordinate is calculated from
the reverse corrected pixel coordinate.
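As a minimal sketch of this two-way mapping, let $\vec{q} = (q_f, q_s)$ denote
an uncorrected pixel coordinate, $\vec{q}'$ the distortion corrected one, and
$\vec{\delta}(\vec{q}) = (\delta_f, \delta_s)$ the pair of look-up table offsets
read, without interpolation, at the pixel containing $\vec{q}$. These symbols
and the sign convention are assumptions made for illustration, not a
description of the exact implementation. The millimetre-to-pixel direction then
reads
\begin{equation}
\vec{q}' = \vec{q} + \vec{\delta}\left( \vec{q} \right),
\end{equation}
while the pixel-to-millimetre direction first forms
\begin{equation}
\vec{q} \approx \vec{q}' - \vec{\delta}\left( \vec{q}' \right)
\end{equation}
before the usual conversion to millimetres. Under this sketch the two
operations are mutual inverses only to the extent that $\vec{\delta}$ changes
little between $\vec{q}$ and $\vec{q}'$, which holds when the offsets vary
slowly across the detector.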
The \emph{datasets 2--7} of our examples all showed a significant elliptical
distortion. The parameters of this distortion were determined as described
previously \cite{Clabbers2017}. We extended the program
\emph{dials.generate\_distortion\_maps}
to produce $X$ and $Y$ distortion maps for the 4 panel detector model based on
the known parameters. These maps were registered for each relevant dataset
during the \dialsimport step, after which they were loaded and applied
whenever required by \dials programs.
\subsection{Indexing}
Provided a sufficient number of strong spots have been collected (\emph{cf.}
Sec.~\ref{sec:spot_finding}), indexing of electron diffraction works with
similar reliability as with X--ray diffraction data. Difficulties mostly arise
from systematic errors such as instability of the rotation axis and, above all, the
often large variation in oscillation width $\Delta \Phi$. The \dialsindex
program offers three different methods for determining the unit cell basis
vectors. The default is based on the three-dimensional FFT, but alternatively a
method based on one-dimensional FFT similar to the programs DPS
\cite{Steller1997} and \mosflm \cite{leslie2007} can be used. When the cell
parameters are known, a simplification of the Fourier transform-based methods
can be used that is particularly successful for very narrow wedges of data
\cite{Gildea2014}. The program \dialsindex performs refinement of the initial
solution, therefore the guidance listed in Section~\ref{sec:refinement} for
refinement of ED geometry is also relevant, and it is possible to pass
options for the \dialsrefine program into \dialsindex where required.
Unless a model space group was chosen by the user, the indexing results are
presented with triclinic symmetry. The compatibility of other choices of
Bravais lattice with the triclinic solution can be tested using the program
\dialsrefinebravaislattice \cite{Winter2018,Sauter2006}. There is no difference
in usage compared with X--ray data, however for electron diffraction the
results might be more difficult to interpret. In particular, the metric fit
reported for each trial solution \cite{LePage1982} may be large (e.g. greater
than $1^\circ$) even for a correct solution, whereas much smaller values are
expected for good quality X--ray data. The correlation coefficients between
intensities related by symmetry operations of the lattice are affected by data
incompleteness and by factors that cause deviation from expected intensities
such as dynamic diffraction. As a result, these are not as useful to decide
on the correct lattice as they are in X--ray experiments. The key criterion then
is the RMSD between predictions and observations. A pool of solutions with
RMSDs similar to the original triclinic solution are good candidates. Any
solution resulting in a significant increase in RMSD is suggestive of an
over-constrained lattice and should be discarded.
For 6 of the 7 datasets of our example, indexing with an approximately correct
orthorhombic cell was successful with default options apart from fixing some
detector parameters, as described in Section~\ref{sec:refinement}. For
\emph{dataset 6} we additionally fixed the beam orientation parameters and
provided the expected unit cell and a restraint to this target cell during
refinement. This dataset shows relatively poor diffraction. Rather few spots
were successfully indexed and RMSDs between the predicted and observed rotation
angles remained high after refinement (see Table~\ref{tab:geometry}).
The action of both constraints and restraints helps to stabilise and
guide refinement in such difficult cases.
\subsection{Global refinement of the unit cell and instrument parameters
\label{sec:refinement}}
Following indexing, the model for the diffraction experiment is refined as
described previously \cite{Waterman2016}. In common with X--ray data processing
with \dials, it is usual to first refine a ``static'' model for the whole data
set, in which parameters such as the crystal unit cell and orientation angles
are not allowed to vary across the scan. The global refinement of a data set
improves the stability of the refinement procedure. However, the geometry of an
electron diffraction experiment raises particular issues that should be taken
into account, especially if data quality is limited by low resolution
diffraction for some or all of the scan, poor quality spot centroids or the
scan is an especially narrow wedge. In this section we offer some practical
advice for \dials refinement tasks with challenging electron diffraction data.
It is more difficult to refine unit cell parameters using electron
diffraction data than using X--ray data. This is mainly caused by the weaker
signal and the much smaller diffraction angles $2\theta_\text{max}$ in electron diffraction
\footnote{As the diffraction angle $\theta \to 0$, $\tan\theta \approx \sin\theta
\approx \theta$. Substituting this into Bragg's law, $\lambda = 2 d \sin\theta$, and the geometry of the diffraction,
$r/D = \tan 2\theta$ (where
$D$ is the detector distance, and $r$ is the distance between the central beam position and a Bragg
spot with resolution $d$), results in $D/r = d/\lambda$; with $r$ and $\lambda$ known, $d$ and $D$ are
linearly correlated.}.
A weak diffraction signal implies fewer diffraction spots and lower accuracies in
determining their centroids, compromising the accuracy of the refinement. The small diffraction
angle implies a low Ewald sphere curvature and a very high correlation between detector
distance and a uniform unit cell scale factor. In the limiting case, the relative
accuracy of the unit cell scale varies linearly with the relative accuracy of the detector
distance calibration. In cases where unit cell imprecision does not prevent structure
solution, the parameters can be refined, as recently implemented in \refmac (see
Section~\ref{sec:phase-refine}).
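To make the limiting case explicit, consider the small-angle relation
$d = \lambda D / r$, obtained from Bragg's law $\lambda = 2 d \sin\theta$
combined with $r/D = \tan 2\theta$ as $\theta \to 0$. For fixed $r$ and
$\lambda$, a small error $\delta D$ in the detector distance propagates as
\begin{equation}
\frac{\delta d}{d} = \frac{\delta D}{D}.
\end{equation}
As a purely illustrative figure, a miscalibration of $D$ by 1\% would therefore
rescale every cell length by about 1\%, and the cell volume by about 3\%,
without changing the predicted spot positions in this approximation.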
%The most noticeable problem seen with electron diffraction geometry refinement
%is that the accuracy of the refined unit cell may be poor. With $2\theta$
%ranges typical for X--ray data collection, the Ewald sphere construction is
%gauged by the radius of the sphere \DW{What does that sentence mean?}.
%Deviations from the correct unit cell parameters or detector distance results
%in a non--linear deviation of predicted spot positions across the detector
%surface. The smaller $2\theta_\text{max}$, the more linear the deviation
%becomes and thus more difficult to detect: unit cell volume and detector
%distance can be scaled together while maintaining precise spot prediction. In
%cases where unit cell imprecision does not prevent structure solution, the
%parameters can be refined, as recently implemented in REFMAC5 (see
%Section~\ref{sec:phase-refine}). While a change in the detector parameters can
%only result in a projective transformation of the positions of the recorded
%spots on any image, changes in the cell parameters will act to move the
%reciprocal lattice points towards or away from the surface of the Ewald sphere,
%altering not only the predicted positions of the spots on the detector, but
%also the rotation angle and therefore the predicted image number on which these
%spots are expected to appear. When the Ewald sphere is essentially flat, this
%distinguishing factor is much reduced and it is more difficult to separate the
%effects of the detector and unit cell parameters.
The high level of correlation between parameters in diffraction geometry
refinement problems has long been recognised. The method of eigenvalue
filtering was proposed to allow refinement to proceed in such cases
\cite{Reeke1984,LURE1986phase3}, by automatically selecting only those
parameters, or linear combinations of parameters, that have the greatest effect
at each step of refinement. This was deemed necessary at the time to refine
crystal parameters using data from a single oscillation film. Within \dials,
all available data is used for a global refinement. This reduces correlations
and provides a better determination for parameters when the scan range is wide,
thus the default behaviour is to refine simultaneously the beam, crystal and
detector parameters, which works well for X--ray data. We have seen that when
limited to a narrow wedge of data recorded with the geometry of the electron
diffraction experiment, high correlations are again problematic. \dials
refinement does not use the eigenvalue filtering method, but by default uses a
Levenberg-Marquardt algorithm, which provides an alternative approach for
dealing with near-singular least-squares problems. In practice, we find that
this algorithm is robust even in the presence of very high parameter
correlations. However, experience shows that the most challenging problems with
electron diffraction geometry may need many steps before convergence is
achieved, where this is defined as a negligible further reduction in RMSDs. For
this reason, from \dials version 1.8 the maximum number of iterations before
refinement terminates has been raised from 20 to 100 for the
Levenberg-Marquardt algorithm (the limit can always be adjusted by the user via the
\code{max\_iterations} parameter).
If a good estimate for the unit cell is available as prior knowledge, this can
be incorporated into refinement by the use of restraints, tying the unit cell
model to an external target. Unit cell restraints are currently available for
static refinement of unit cell models but not scan-varying refinement, as they
were originally developed for XFEL serial crystallography where scan-varying
refinement is irrelevant. The unit cell parameterisation in \dials is expressed
with reciprocal metrical matrix elements as parameters \cite{Waterman2016}.
However, for ease of use, restraints are specified in terms of the real space
cell, as shown by the example given in Appendix~\ref{app:PHIL_example}. Each
crystal included in refinement can add up to six restraint terms (for the
triclinic case). Irrelevant restraints for cell parameters that are already
constrained by lattice symmetry are automatically excluded. Every restraint
term adds a pseudo-observation to refinement. Taking the cell parameter $a$ as
an example, the pseudo-observation term $R_a$ consists of the squared residual
between that parameter and its target value $a_t$, with a weighting factor. In
common with the real observations, the first derivatives of the
pseudo-observations with respect to the refinable parameters (here, arbitrarily
denoted $p$) are also required for refinement by non-linear least-squares
methods.
\begin{equation}
\label{eq:restraint_to_target}
R_a = \frac{\left( a - a_t \right)^2}{\sigma_a^2}
\end{equation}
\begin{equation}
\pder[R_a]{p} = 2 \pder[a]{p} \frac{\left( a - a_t \right)}{\sigma_a^2}
\end{equation}
In principle, statistical weighting could be achieved by setting the weights
equal to the inverse variance of the target cell parameter values. However,
numerical uncertainties from refinement are
known to be underestimated \cite{Dauter2015}. For X--ray diffraction refinement
we usually try values between $\sigma \sim 0.001$ for qualitatively ``strong''
restraints and $\sigma \sim 0.1$ for ``weak'' restraints, monitoring the effect
on the refined RMSDs. In the electron diffraction case setting even very weak
restraints to a target cell can avoid issues with the unit cell and detector
distance drifting when these are refined simultaneously. Nevertheless, the high
correlation between these parameters means that the problem of distinguishing
between cell volume and detector distance remains salient, and indeed the unit
cell can be driven towards a target cell of incorrect volume with minimal
increase in refined RMSDs if the detector distance is also refined. It is
generally advisable to accurately calibrate the effective detector distance
prior to ED data collection and then to fix this during data processing. Other
parameters that it may be prudent to fix include the detector $\tau_2$ and
$\tau_3$ values, which describe rotations around axes in the plane of the
detector, similar to \mosflm's TILT and TWIST. Joint refinement of these
parameters along with the beam direction and detector translations
within the detector plane can be unstable.
For 6 of the example datasets, fixing the detector distance, $\tau_2$ and $\tau_3$
gave acceptable results for joint refinement of the beam, crystal and detector
in-plane translation and rotation parameters. For the more difficult case,
\emph{dataset 6}, no additional parameters were fixed, but a restraint to
the target cell as given in Appendix~\ref{app:PHIL_example} was used. Only 139
reflections were available for refinement in this case after outlier rejection.
The use of the restraint ensured that the refined cell remained reasonable. In
particular, without the restraint the long axis dimension drifted to above
$\SI{108}{\angstrom}$. Including the restraint increased the RMSDs in X and Y
by less than 0.07 and 0.14 pixels respectively, and had a negligible effect on
the RMSD in the rotation angle, demonstrating a case in which this feature can
be used to guide refinement without resulting in a model that conflicts
with the centroid data.
\subsection{Scan-varying refinement of crystal and beam parameters
\label{sec:sv-refinement}}
In typical use of \dials, the global static model for a dataset is used as a
starting point for scan-varying refinement. As originally implemented
\cite{Waterman2016}, this was intended to capture changes to the crystal unit
cell and orientation parameters during data collection. These parameters were
allowed to vary in a smooth manner by evenly distributing sample points across
the scan and interpolating values at any one position using a Gaussian
smoother. The beam and detector parameters could be jointly refined to global,
static values alongside the scan-varying crystal.
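In schematic form, and assuming a normalised Gaussian weighting, the value of a
scan-varying parameter $p$ at scan position $z$ is interpolated from the values
$p_k$ refined at sample positions $z_k$ as
\begin{equation}
p(z) = \frac{\sum_k w_k(z) \, p_k}{\sum_k w_k(z)}, \qquad
w_k(z) = \exp\left( -\frac{\left( z - z_k \right)^2}{2 \ell^2} \right),
\end{equation}
where $\ell$ sets the width of the smoother. The symbols here are introduced
only for illustration; the number of sample points that contribute to each
position and the choice of $\ell$ are implementation details not specified in
this sketch. The refinable quantities are then the $p_k$, and the derivative of
$p(z)$ with respect to each $p_k$ is simply the normalised weight
$w_k(z) / \sum_j w_j(z)$.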
The analysis of electron
diffraction images raises a new issue in that instrument stability during the
course of data collection cannot be simply assumed, as it is for MX data. In
some cases, there is significant drift of the beam centre during data
collection caused by instability of the alignment or charging effects.
Previous methods to handle this involve procedures to identify the shift
for each image and write out corrected images in which the beam centre remains
constant, effectively describing the drift in terms of shifts of the detector
\cite{Wan2013,Nederlof2013,Hattne2015}. The procedures differ in the
way that the beam centre is determined for each image. In the simplest case,
the high scattering cross section for electrons allows, for some
instrumentation, the direct beam to be
recorded simultaneously with diffraction spots, avoiding the need for a beam
stop. When images are not corrected,
software such as \mosflm or \xds can be set to independently refine the beam
centre for each image, or within small blocks of images.
% Andrew Leslie notes that default behaviour of MOSFLM is for all detector
% parameters (beam position, YSCALE detector distance, detector twist and tilt)
% to be refined independently on every image. The XDS default is
% REFINE(INTEGRATE)=POSITION BEAM ORIENTATION, which includes beam position
% refinement within small blocks (DELPHI degrees)
The focus on global refinement in \dials means that an alternative approach was
sought. Beam drift in electron diffraction experiments, at least those
collected by a continuous rotation protocol, appears to occur gradually.
Therefore it seems reasonable to assume that a smoothly-varying model for the
beam direction vector would suffice to represent this effect. For small
magnitudes of the total drift, the difference between correction by implicit
detector shifts and modelling of a drifting beam will be negligible. For the
purposes of ED data processing, we extended the scan-varying refinement
methodology from crystal parameters to optionally apply also to the beam
parameters, available from \dials version 1.9 onwards.
The difficulties with refinement inherent to electron diffraction geometry are
exacerbated during scan-varying refinement. Like static
refinement, scan-varying refinement in \dials is also global, in that data from
the full rotation scan is used in a single optimisation procedure. However, at
any point in the scan the local values for the crystal unit cell, angular
misset, and potentially the beam direction parameters, are dominated by the data
close to that point. Spot centroids
at rotation angles further from that point have a diminishing effect on the
local model, controlled by a Gaussian smoother. While this allows the
model to express genuine smooth changes,
it reduces the stability of the refinement procedure. This has been seen in
cases where a static crystal model allows global refinement of both the
detector and crystal parameters to reasonable values, but scan-varying
refinement of the crystal results in a drift of the average unit cell volume
and detector distance. Despite these observations, scan-varying refinement is
still preferable to static refinement of the beam, crystal and detector models
within local narrow wedges, which suffers even more from high parameter
correlations. To stabilise a problematic scan-varying refinement task we must
either restrain or constrain (fix) some parameters of the model. There is no
automatic determination of a suitable parameterisation for refinement in
\dialsrefine. Diagnostics (see Section~\ref{sec:diag}) may help to understand
the details of a particular case and guide choices, however ultimately the user
must inspect the resulting models for reasonable geometry as well as the final
RMSD values.
%Currently,
%there is no facility available in \dials to restrain scan-varying models to
%the best (in a least squares sense) overall values for the dataset,
%as determined during global, static refinement. One way to improve stability
%is to modify options for the smoother used in scan-varying
%refinement. Each parameterisation of a model that is to be refined in a
%scan-varying manner has an associated interval width
%parameter that controls the number of refinable subparameters that will be used
%to describe changes during the scan, with a default of $36^\circ$.
%Increasing this value, or directly setting a small absolute number of
%intervals, ensures a greater degree of smoothing by reducing the number of
%parameters in refinement. The most robust method for stabilising refinement
%is to fix parameters of the static models that have high degrees of correlation
%with parameters of the scan-varying models. For example, if detector distance
%is optimised during the static refinement phase (or even better, known in
%advance by accurate calibration), then this parameter can be fixed for
%scan-varying refinement of the unit cell. If a smoothly-varying beam direction
%is refined then fixing the detector in-plane translations and the orientation
%angles $\tau_2$ and $\tau_3$ helps to stabilise refinement. There is no
%automatic determination of a suitable parameterisation for refinement in
%\dialsrefine. Diagnostics (see Section~\ref{sec:diag}) may help to understand
%the details of a particular case and guide choices, however ultimately the user
%must inspect the resulting models for reasonable geometry as well as the final
%RMSD values.
We performed scan-varying refinement prior to integration for the 7 example
datasets. A variety of protocols was tested, and the best chosen for each
dataset according to merging statistics after scaling of that dataset in
isolation by \aimless \cite{Evans2013}. In each case, we fixed all detector
parameters so that the detector maintained the geometry from the static
refinement step. For \emph{dataset 1} a significant drift of the beam centre
was observed. We enabled scan-varying refinement of both beam direction angles,
$\mu_1$ and $\mu_2$ in the nomenclature of \citeasnoun{Waterman2016}.
Remarkably, the simplest model consisting of two refinable sub-parameters for
each angle resulted in the best merged dataset, rather than models with more
sub-parameters that are smoothed less in order to track higher frequency
changes to the beam drift. Scan-varying refinement of the beam was tested for
each of the other datasets. For two cases, \emph{dataset 4} and
\emph{dataset 5}, merging statistics favoured static refinement of the beam
direction. In the other cases, the simple two sub-parameter model for each
beam angle was used. For each dataset, the three crystal orientation ``misset''
angles were refined in a scan-varying manner, using default smoother parameters.
A scan-varying unit cell was refined for each case, except for datasets
\emph{3}, \emph{4} and \emph{6}, for which refining a global, static cell
stabilised refinement and produced better merging statistics.
Further details about the diffraction geometry modelling for each dataset
are given in Table~\ref{tab:geometry}.
\subsection{Diagnostics for problematic diffraction geometry refinement
\label{sec:diag}}
Sections~\ref{sec:refinement} and \ref{sec:sv-refinement} describe parameters that need to be adjusted in
difficult cases. To date, even electron diffraction data sets from standard
proteins can be difficult \cite{Clabbers2017,Hattne2015}. At this early
development stage, diagnostic tools are important for fine--tuning parameters.
The program \dialsrefine provides some facilities for investigating the main
issue we have identified, namely the high level of correlation between the
effects of different parameters on the model. This information is contained
within the Jacobian matrix built up as part of each step taken by a non-linear
least squares optimisation algorithm. In this section we present two
diagnostics based on analysis of the Jacobian matrix and pick out the salient
differences that occur simply as a feature of the refinement of geometry at the
very short wavelength typical for electron diffraction.
Each step of a non-linear least squares problem is expressed as a linearised
sub-problem of the form
\begin{equation}
\label{eq:linearised_step}
\mat{J} \vec{\Delta p} = \vec{\Delta r}.
\end{equation}
By convention, the three-dimensional observations are split so that
$\vec{\Delta r}$, the vector of residuals, contains first the $(X - X_o)$
components, followed by the $(Y - Y_o)$ components, and finally the $(\phi -
\phi_o)$ values. $\mat{J}$, the Jacobian matrix of first partial derivatives of
the residuals with respect to each parameter of the problem, is thus similarly
formed in blocks, with the upper third of the matrix corresponding to
$\pder[X]{p}$ values, the second to $\pder[Y]{p}$ and the lower third to
$\pder[\phi]{p}$. The vector $\vec{\Delta p}$ is the parameter shift vector to
be determined for the step.
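Written out, and using subscripts to label the three blocks (a notation
introduced here for illustration only), the linearised step for $n$ reflections
and $m$ parameters has the structure
\begin{equation}
\left( \begin{array}{c} \mat{J}_X \\ \mat{J}_Y \\ \mat{J}_\phi \end{array} \right)
\vec{\Delta p} =
\left( \begin{array}{c} \vec{\Delta r}_X \\ \vec{\Delta r}_Y \\ \vec{\Delta r}_\phi \end{array} \right),
\end{equation}
where each of the blocks $\mat{J}_X$, $\mat{J}_Y$ and $\mat{J}_\phi$ has $n$
rows and $m$ columns, so that $\mat{J}$ is a $3n \times m$ matrix. Element
$(i, j)$ of $\mat{J}_X$ is $\pder[X_i]{p_j}$ for reflection $i$ and parameter
$p_j$, and similarly for the other two blocks.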
The first diagnostic consists of graphical ``corrgrams'', which are a way of
rapidly assessing correlations between the parameters of refinement in a visual
manner. The data represented by a corrgram consists of the matrix of pairwise
correlation values calculated between columns of the Jacobian. Since their
introduction, described in \citeasnoun{Waterman2016}, this diagnostic has been
improved. Rather than calculating a single corrgram using correlation between
each full column of the Jacobian, the three-dimensional nature of the centroid
data is respected and three corrgrams are produced: one for each of the blocks
of the Jacobian, corresponding to the dimensions $X$, $Y$ and $\phi$. These
separate figures are more appropriate for assessing the levels of correlations
between parameters implied by the data, whereas a single corrgram can obscure
these features. That is because derivatives of calculated centroid positions
with respect to some parameter $\pder[X]{p}$, $\pder[Y]{p}$ and
$\pder[\phi]{p}$ come from different distributions and thus should not be
combined in a meaningful calculation of correlation.
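For concreteness, and assuming that the usual Pearson coefficient is the
correlation measure in use, the corrgram entry for parameters $p_i$ and $p_j$
in, say, the $X$ block can be written
\begin{equation}
\rho^{X}_{ij} = \frac{\sum_k \left( u_{ki} - \bar{u}_i \right) \left( u_{kj} - \bar{u}_j \right)}
{\sqrt{\sum_k \left( u_{ki} - \bar{u}_i \right)^2 \sum_k \left( u_{kj} - \bar{u}_j \right)^2}},
\qquad u_{ki} = \pder[X_k]{p_i},
\end{equation}
where the sums run over reflections $k$ and $\bar{u}_i$ is the mean of column
$i$ of that block. The symbols $u$ and $\rho$ are introduced here for
illustration. Values of $\rho^{X}_{ij}$ close to $\pm 1$ indicate that shifts
in the two parameters have nearly indistinguishable, or nearly opposite,
effects on the predicted $X$ centroids. Analogous expressions apply to the $Y$
and $\phi$ blocks.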
While the corrgram diagnostic qualitatively identifies which parameters are the
least distinguishable from each other, it might still not give a clear
indication of which refinement cases will actually cause problems. Certain
correlations are high anyway even in unproblematic cases. For this reason we
also investigated an alternative, quantitative, diagnostic with a simpler
interpretation, namely the \emph{condition number} of the Jacobian matrix,
$\mat{J}$. This provides a measure of how well-posed the sub-problem given
by Equation~\ref{eq:linearised_step} is, but does not pick out which parameters
are culpable. A condition number $\kappa \left( \mat{J} \right)$ of infinity
means that $\mat{J}$ is singular, while a finite value of $\kappa \left(
\mat{J} \right)$ gives a bound on the accuracy of the solution to
Equation~\ref{eq:linearised_step}.
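Assuming the conventional 2-norm definition, the condition number can be
written in terms of the largest and smallest singular values of the Jacobian,
here denoted $s_\text{max}$ and $s_\text{min}$ for illustration,
\begin{equation}
\kappa \left( \mat{J} \right) = \frac{s_\text{max} \left( \mat{J} \right)}{s_\text{min} \left( \mat{J} \right)},
\end{equation}
so that $\kappa \left( \mat{J} \right) \to \infty$ as $s_\text{min} \to 0$,
that is, when some linear combination of parameter shifts has no effect on the
predicted centroids to first order.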
The Jacobian used to calculate both the corrgram and the condition number
diagnostics does not include any additional blocks related to
pseudo-observations that may be used as restraints in refinement. For that
reason, it should be noted that the diagnostics give information about
underlying degeneracy of parameters determined only by the geometry of the
problem, not including the effects of modifications to the problem that may
have been introduced to improve the robustness of the procedure. Similarly, the
diagnostics inform us directly about properties of the normal equations of the
Gauss-Newton problem implied by Equation~\ref{eq:linearised_step} rather than
the modified normal equations of the default Levenberg-Marquardt algorithm that
is typically in fact used to find the solution. This ensures that these
diagnostics can be used to warn us of problems with the set up of the
diffraction geometry refinement itself, without conflation with factors
relating to implementation details of the algorithm used to perform the
optimisation.
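In symbols, and omitting observation weights for clarity, the Gauss-Newton
normal equations that follow from Equation~\ref{eq:linearised_step} are
\begin{equation}
\mat{J}^\mathsf{T} \mat{J} \, \vec{\Delta p} = \mat{J}^\mathsf{T} \vec{\Delta r},
\end{equation}
whereas a Levenberg-Marquardt step solves a damped system, one common textbook
form of which is
\begin{equation}
\left( \mat{J}^\mathsf{T} \mat{J} + \mu \, \mathrm{diag}\left( \mat{J}^\mathsf{T} \mat{J} \right) \right)
\vec{\Delta p} = \mat{J}^\mathsf{T} \vec{\Delta r},
\end{equation}
with damping parameter $\mu \geq 0$. The particular damping term shown is an
assumption chosen for illustration and may differ in detail from the one
implemented. The diagnostics discussed here relate to the undamped matrix
$\mat{J}$ of the first form.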
To investigate the difficulties faced with refinement problems that are solely
a result of the electron diffraction geometry, we elected to perform refinement
against simulated data. That way, we could compare two refinement procedures,
using an identical crystal model, beam direction and rotation axis, while
altering the wavelength and detector distance to match typical values for
electron diffraction in one case, and X--ray diffraction in the other.
Details about how the simulated data was constructed are presented in
the Supplementary Material. Refinement
was performed for the same sets of reflections with both versions
of the geometry, using default settings in
\dialsrefine. In each case, 13 parameters were refined in total: six to
describe the detector position and orientation, one beam orientation angle,
three crystal orientation angles and three reciprocal metrical matrix elements
for the unit cell. For the final step of refinement prior to termination at
RMSD convergence, corrgrams were produced and the condition number calculated
for comparisons.
The two sets of three corrgrams are shown complete in Supplementary Figure~1.
The pattern of high correlations between parameters
that affect the predicted reflection positions $(X, Y)$ on the detector plane
is similar in the cases of electron and X--ray diffraction geometries. However,
in general, the absolute values of correlations are higher for the electron
diffraction geometry. The most striking difference between the two cases is
shown on the corrgram for the parameters that affect the predicted rotation
angle, $\phi_c$. None of the detector parameters affect $\phi_c$, so only the
beam and crystal parameters are of interest. The relevant subset of the
corrgram is reproduced in Figure~\ref{fig:corrgram}. This figure shows that
absolute correlations between certain parameters are high in either case, but
that the electron diffraction geometry shows increased absolute correlations
between $\phi_3$, the crystal orientation around the $Z$ axis, and other
parameters. In general, absolute correlations are smallest between the
parameter $g^*_{11}$, here corresponding to the short axis of the cell, and
other parameters for either version of the geometry. For this dataset, the
short cell axis was aligned closest to the rotation axis. As a result, this
dimension is relatively well-determined by centroid data from images throughout
the dataset. However, even for this parameter the electron diffraction geometry