MAXIMUM ENTROPY DATA ANALYSIS



JOURNAL DE PHYSIQUE
Colloque C5, supplément au n° 8, Tome 47, août 1986

MAXIMUM ENTROPY DATA ANALYSIS

R. K. BRYAN

European Molecular Biology Laboratory, Meyerhofstrasse 1, D-6900 Heidelberg, F.R.G.

Abstract. The maximum entropy method has been shown to be the only regularisation method which does not introduce correlations into the solution of an inverse problem for which there is no evidence in the data. If the data and image are linearly related, there is a unique maximum entropy solution. In such cases the solution can be found reliably by fast numerical algorithms. The method is discussed here particularly in relation to the deconvolution problem, considering also the effects of missing data points and the possibility of refining parameters defining the point spread function, illustrated by many computational examples.

I. Introduction. The general problem of reconstructing an image from incomplete and noisy data is considered. Many problems are of the same general type, in that the data are related to the original object by a well-defined transform, often an integral transform as in a convolution or projection. An inverse to this forward transform may well exist, but it may also take the form of a continuous infinite integral, and is thus only exact if the data are complete and noise free. Often the inversion formula is numerically unstable, such as the Radon formula for reconstruction from projections, which includes a differentiation. In practice, the exact inversion formula is often applied directly to the data, interpolating unknown data or setting them to zero. The result is often dreadfully noisy! The data-analyst then resorts to fix-ups to reduce noise and artifacts: filtering, windowing, etc. The result may then become visually acceptable, but when re-transformed it fails to fit the data, and may also violate simple physical constraints, such as positivity. There is an inevitable trade-off between noise-suppression and accuracy in any reconstruction method, and the maximum entropy method discussed here is claimed to be optimal in this respect.

II. Maximum Entropy. We wish to reconstruct an image represented by a set of positive numbers f, for which there are some experimental data d, subject to noise ε of known statistical distribution, related to f by a transform Γ, so that

d = Γ(f) + ε.

Γ(f) are the data which would be observed in the absence of noise were f to represent the object actually observed. Any image f which obeys the physical constraints on the solution (e.g. positivity, size, ...) and which when transformed predicts the observed data, in the sense that any differences between Γ(f) and d can be accounted for solely by noise, is a possible solution to the problem. Such images are termed "feasible". A statistical test is applied to the differences to determine whether they form a likely noise distribution. It has become conventional /1,2/ to use the χ² test in the form

χ² = Σ_{k=1..M} w_k (Γ_k(f) − d_k)²,

where w_k, k = 1, ..., M, are the inverse variances of the data d_k.
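As a concrete aside (not from the original paper), the feasibility test is only a few lines of numpy; `gamma` stands in for the forward transform Γ and all names are illustrative:

```python
import numpy as np

def chi_squared(f, d, w, gamma):
    """chi^2 = sum_k w_k (Gamma_k(f) - d_k)^2.

    w holds the inverse variances of the data points; a weight of zero
    drops a point from the test (used below for missing or overexposed data).
    """
    r = gamma(f) - d                 # residuals Gamma_k(f) - d_k
    return np.sum(w * r * r)

def is_feasible(f, d, w, gamma):
    """Feasible means chi^2 does not exceed the number of (weighted) observations."""
    return chi_squared(f, d, w, gamma) <= np.count_nonzero(w)
```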


If, for a particular trial image f, χ² is significantly larger than the number of observations M, it is unlikely that the distribution of residuals d_k − Γ_k(f) arose by chance from noise, so the image is inconsistent with the data. Hence those images for which χ² ≤ M are the feasible images. For practical sizes of images, this is a very large set, and is impossible to display in any comprehensible form. One of the feasible images should be selected and displayed as "the result". This may be done by selecting that feasible image which is an extremum of some functional R(f), a procedure known as regularisation. The problem then devolves to the selection of a suitable criterion R. Two seemingly different arguments have been made to show that using the Shannon entropy as the regularising functional is the optimum choice. One is based on counting the number of ways that the distribution of image density can be built up from 'quanta' /2/, and the other, a formal argument by Shore & Johnson /3/ based on simple axioms requiring consistency in the way information is combined, shows that it is the only optimisation criterion which does not introduce correlations between elements in the image which are not required by the data. The connection between these two approaches has recently been discussed by Jaynes /4/. Here, the comparison between various functionals is illustrated by a simple example due to Gull /5/. Suppose we are reconstructing a 2 × 2 image, and we have the following data: 1/3 of the total intensity comes from the top half of the image, and 1/3 from the left-hand half. Depending on one's imagination, this could represent a reconstruction from an incomplete set of projections, or data on the (0,1) and (1,0) Fourier components. We are interested only in the shape of the image, and not the absolute scaling, so we work in terms of proportions p_i = f_i / Σ_j f_j. The complete feasible set has one degree of freedom, and can be parameterised as

p = (p₁₁, p₁₂, p₂₁, p₂₂) = (θ, 1/3 − θ, 1/3 − θ, 1/3 + θ),  0 ≤ θ ≤ 1/3.

Any value of θ within these limits gives a possible solution to the problem. The particular value θ = 1/9 gives no correlation between rows and columns. Any lower value of θ gives a negative correlation, with a lower proportion of the intensity of the top half at the left than for the image as a whole, and any higher value of θ a positive correlation. Without any evidence that there is such a correlation, we should, if forced to make a choice of a single preferred image, select the one which displays no correlation. Now suppose that we calculate the values of θ for four different regularising functionals, all of which have been suggested for practical reconstruction. The results are

Function        θ
Σ log p         0.1301
Σ p^(1/2)       0.1218
−Σ p log p      0.1111 (= 1/9)
−Σ p²           0.0833 (= 1/12)

Clearly, all but the Shannon entropy −Σ p log p have introduced correlations between rows and columns: data relating only to the x-direction is affecting the reconstruction in the y-direction, and vice versa.
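These values are easy to verify numerically. The sketch below (mine, not from the paper) scans the one-parameter feasible set derived above and locates the maximiser of each functional:

```python
import numpy as np

# Feasible set for Gull's 2x2 example: p11 = theta, and the two data
# constraints (top half = 1/3, left half = 1/3) fix the other pixels.
theta = np.linspace(1e-6, 1/3 - 1e-6, 100_000)
p = np.stack([theta, 1/3 - theta, 1/3 - theta, 1/3 + theta])

functionals = {
    "sum log p":    np.sum(np.log(p), axis=0),
    "sum sqrt p":   np.sum(np.sqrt(p), axis=0),
    "-sum p log p": -np.sum(p * np.log(p), axis=0),   # Shannon entropy
    "-sum p^2":     -np.sum(p * p, axis=0),
}
for name, values in functionals.items():
    print(f"{name:>13}: theta = {theta[np.argmax(values)]:.4f}")
# -> 0.1301, 0.1218, 0.1111 (= 1/9) and 0.0833 (= 1/12), as in the table
```

Only the Shannon entropy recovers the uncorrelated value θ = 1/9.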

Why is this the case? The data provide two linear equality constraints, which we write as

a·p = 0 and b·p = 0. Maximising S subject to the first of these constraints gives

0 = ∇_i S + λ a_i = −1 − log p_i + λ a_i,

hence

p_i ∝ exp(λ a_i),

and applying both constraints gives

p_i ∝ exp(μ a_i + ν b_i),

where λ, μ and ν are Lagrange multipliers, and the constants of proportionality are chosen such that Σ p = 1. Extra data combine multiplicatively with the old in the solution, owing to the logarithmic form of the derivative of the entropy. The maximum entropy method requires that the member of the feasible set with the greatest configurational entropy is chosen as the preferred reconstruction. The total intensity is spread out as uniformly as possible over the image, and contains only such detail as is required to fit the data constraints. It is, however, no more likely to be the 'true' solution than any other feasible image.

III. Numerical algorithms. If the global entropy maximum lies within the feasible set, then this will be the maximum entropy solution. This should only happen with extremely noisy data. Otherwise, since the entropy function is convex, the solution will lie on the boundary χ² = χ₀². Moreover, if the constraint function is also convex, which is always the case for a linear problem with the χ² statistic, this maximum will be unique. The problem is therefore to maximise the entropy S over χ² = χ₀². The usual size of image precludes the use of Newton-like algorithms, which require storage of the N × N Hessian matrix, although this method has been used for small problems /6/. Conventional optimisation methods for large problems without the use of second derivatives, such as steepest descents and conjugate gradients, turned out to be insufficiently powerful for this problem. Gull & Daniell /2/ were the first to design an algorithm which could be used on realistically-sized images. Their approach was to set the derivative of Q = S − λχ² with respect to f to zero,

and to rearrange it in the form

f_j = A exp(−λ ∇_j χ²).

This equation was used iteratively, inserting the current estimate for f on the right-hand side, and using the result as the new value. The exponentiation means that the result is automatically positive, and large values of f can be reached rapidly. Unfortunately, this iteration can be unstable, and smoothing between successive iterates is required /2,7/. There is also the problem of calculating the correct value of the Lagrange multiplier λ to give the required value of χ². A number of ideas were used to give a reliable and robust algorithm /8,9,10,11/. As is usual with nonlinear optimisation algorithms, we attempt to find an increment δf to the nth iterate f⁽ⁿ⁾, such that f⁽ⁿ⁺¹⁾ = f⁽ⁿ⁾ + δf is a better approximation to the solution than f⁽ⁿ⁾, with the following considerations in mind:

1) each increment should resemble the Newton-Raphson increment δf = −(∇∇Q)⁻¹·∇Q,

2) the step length should be restricted at each iteration, so that the quadratic approximation to S remains valid,

3) a specific value of χ² can be achieved.

These considerations led to the following algorithm. The inverse Hessian matrix in the Newton-Raphson step can be expanded in powers of λ to get

δf = −(∇∇S)⁻¹·∇Q + λ (∇∇S)⁻¹·(∇∇χ²)·(∇∇S)⁻¹·∇Q + ...,

and substituting −(∇∇S)⁻¹ = diag{f}, we obtain

δf = diag{f}·∇Q + λ diag{f}·(∇∇χ²)·diag{f}·∇Q + ... .

Note that the first term is a first order approximation to Gull & Daniell's integral equation, since f_j(1 + ∇_jQ) ≈ f_j exp(∇_jQ) = A exp(−λ ∇_jχ²).


Restricting the step length with a Euclidean metric is hopeless. It is essential to allow components of f with high values to change rapidly, but to restrict more those approaching zero. This can be achieved by using the distance limit

Σ_i (δf_i)² / f_i ≤ ℓ²,

which is equivalent to imposing a metric diag{1/f} onto f-space. This is just −∇∇S, and the multiplications by diag{f} above can be seen to be the result of changing from covariant to contravariant components. We can also interpret the first-order approximation to the integral equation as steepest ascents with an entropy metric. The increment δf is seen to be a linear combination of diag{f}·∇S, diag{f}·∇χ², and the matrix diag{f}·(∇∇χ²) acting (perhaps repeatedly) on these vectors. Let e_μ, μ = 1, 2, ..., represent these search directions, so δf = Σ_μ x_μ e_μ. Although ∇∇χ² is formally an N × N matrix, it is only ever used in the form (∇∇χ²)·e. Such expressions can be evaluated by vector operations and transforms between image space and data space only. Thus e is transformed to data space, where ∇∇χ² is diagonal and acts by point-by-point multiplications, and the result is transformed back to image space by the transpose transform. Quadratic models of S and χ² are constructed in the subspace spanned by the search directions, so the problem now becomes one of constrained quadratic optimisation in a low-dimensional subspace, distances again being evaluated using the second derivative of S as a metric, the 'entropy metric' /8,11/, now also interpreted as a second order approximation to the relative entropy between successive iterates /12,13/. Within the 'trust region' defined by the distance limit, the x_μ are selected to give an increment towards the local maximum of S over χ², whilst χ² is reduced towards its target value χ₀². This step, a constrained quadratic optimisation problem in a low dimension space, is easily and quickly performed /11/. For problems with convex constraints, a total of three or four search directions, constructed at each iteration from the contravariant gradients of S and χ², and diag{f}·(∇∇χ²) acting on these, is usually sufficient. Additional directions can sometimes be necessary, particularly if ∇∇χ² is not positive definite /8/. The convergence of the algorithm is checked by calculating the angle between ∇S and ∇χ² at the solution. These vectors should be parallel if S is truly a maximum. The computational requirements of this algorithm are of course much greater than for a simple inverse calculation. For each iteration, the bulk of the computation is in the calculation of the search directions, each one requiring one forward transform from image to data space, and one transpose transform. The rest of the calculation is mostly pointwise vector operations within either image or data space, or matrix operations within the low-dimensional search subspace. The overall speed thus depends mostly on the efficiency of the image-data transform. The number of iterations required depends principally on the signal-to-noise ratio, and is almost independent of the number of pixels. Very noisy data, containing little information, can be processed extremely rapidly. For example, the deconvolution problem of fig. 2, with 10% noise, took only 14 iterations using 4 search directions, or just over 200 fast Fourier transforms in all.
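To make the structure concrete, the following is a schematic single iteration for the convolution case, in Python. It is a simplified sketch of the ideas above, not the algorithm of /11/: the fixed λ, the crude trust-region rescaling and the positivity safeguard are naive placeholders for the careful control described there, and all names are mine.

```python
import numpy as np
from numpy.fft import fft2, ifft2

def forward(f, B):        # image -> data: convolution, with B = fft2(psf)
    return np.real(ifft2(B * fft2(f)))

def transpose(g, B):      # data -> image: the transpose transform
    return np.real(ifft2(np.conj(B) * fft2(g)))

def maxent_step(f, d, w, B, lam, dist_frac=0.1):
    gS = -np.log(f) - 1.0                                   # grad S, S = -sum f log f
    gC = 2.0 * transpose(w * (forward(f, B) - d), B)        # grad chi^2
    hess = lambda e: 2.0 * transpose(w * forward(e, B), B)  # (grad grad chi^2).e

    # Search directions: contravariant gradients diag{f}.grad S and
    # diag{f}.grad chi^2, plus diag{f}.(grad grad chi^2) acting on them.
    e = [f * gS, f * gC]
    e += [f * hess(e[0]), f * hess(e[1])]
    He = [hess(ei) for ei in e]        # one forward + one transpose transform each

    # Quadratic models of S and chi^2 in the subspace, with the entropy
    # metric G_mn = sum e_m e_n / f (i.e. e_m.(-grad grad S).e_n).
    G = np.array([[np.sum(a * b / f) for b in e] for a in e])
    M = np.array([[np.sum(a * hb) for hb in He] for a in e])
    s = np.array([np.sum(a * gS) for a in e])
    c = np.array([np.sum(a * gC) for a in e])

    # Maximise the model of Q = S - lam*chi^2: (G + lam*M) x = s - lam*c.
    x = np.linalg.solve(G + lam * M, s - lam * c)

    # Trust region: bound the entropy-metric step length sum (df)^2/f.
    r2 = x @ G @ x
    limit = dist_frac * f.sum()        # one plausible choice for the limit l^2
    if r2 > limit:
        x *= np.sqrt(limit / r2)

    df = sum(xi * ei for xi, ei in zip(x, e))
    return np.maximum(f + df, 1e-10 * f.mean())   # keep the image positive
```

A production version would also adjust λ from iteration to iteration so that χ² moves towards its target χ₀², as described in /11/.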

IV. Results. A selection of example computations is presented here, mainly as applied to deconvolution from space-invariant point spread functions, so Γ_k(f) = Σ_j b_{k−j} f_j. This is not as specialised as it may appear, since many other problems also have the form of a linear transform between image and data spaces, but it has the advantage that convolutions are relatively fast to compute, using Fourier transforms. The data can also be displayed in a comprehensible manner, although one should remember that the image and data are in completely different spaces! The first example is intended to be a straightforward demonstration of the method. The original, fig. 1a, is convolved with a 6-pixel radius point spread function, and noise added to give fig. 1b. For comparison purposes, figs. 1c-e show three images produced by the Wiener filter, with

F_k = B*_k D_k / (|B_k|² + γ),

where capital letters denote the Fourier transform of the corresponding variable and the asterisk complex conjugation.
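In numpy the filter takes only a few lines; a minimal sketch (function name mine), in which γ = 0 reproduces the raw inverse filter of fig. 1c:

```python
import numpy as np
from numpy.fft import fft2, ifft2

def wiener_deconvolve(d, b, gamma):
    """Filtered inverse: F_k = conj(B_k) D_k / (|B_k|^2 + gamma).

    d: blurred, noisy data; b: point spread function, same shape as d
    and wrapped so that its centre lies at the origin; gamma >= 0 damps
    the frequencies where the transform of the psf is small.
    """
    B, D = fft2(b), fft2(d)
    return np.real(ifft2(np.conj(B) * D / (np.abs(B) ** 2 + gamma)))
```

Scanning γ upwards retraces figs. 1c-e: first ringing, then amplified noise, then loss of resolution.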

Figure 1. Top row: a, b, c; bottom row: d, e, f. a) Original picture, 192 × 192 pixels, 256 grey levels. Entropy S = 14.801, compared with 14.987 for a uniform image. b) Convolution of (a) with a 6-pixel radius point spread function, with uniform Gaussian noise of standard deviation 0.1% of the maximum image value added. c-e) Examples of filtered deconvolutions of (b), with increasing damping. f) Maximum entropy deconvolution of (b). S = 14.819.

Fig. 1c, with γ = 0, is the inverse filtered image, and is completely dominated by ringing at spatial frequencies where there is a zero in the transform of the point spread function. Increasing γ (fig. 1d) suppresses the ringing and gives a considerable improvement in resolution, but the result is contaminated by noise; this is reduced by a further increase of γ to 10⁻² (fig. 1e), but with a loss of resolution. Maximum entropy gives the result of fig. 1f, which shows both noise suppression and an increase in resolution. Increasing the noise on the data (figs. 2a & c) also shows noise suppression in the maximum entropy reconstruction, but with little increase in resolution: information in the data has been lost in the noise. With a further increase in noise (figs. 2b & d), the maximum entropy image loses more resolution, but still with no increase in noise level in the restoration. Maximum entropy always gives the optimum trade-off between noise suppression and resolution in the restoration, and only displays detail for which there is evidence in the data. As the noise increases, the detail in the restoration fades away smoothly. We now simulate the effect of 'missing' data by throwing away 75% of the data points, giving the data of fig. 2e. In the χ² test, the weights of the missing points are set to zero. Maximum entropy gives a recognisable reconstruction (fig. 2f), again with good noise suppression. There is no way that conventional linear filter methods can use such data: they depend on Fourier transforming the data, and such transforms do not exist for sparse data, as in this example. If the amount of data used is further reduced, the maximum entropy restoration loses more and more detail, until at the limit of about 1 sample point per point spread function area, the separate areas decouple,


Figure 2. Top row: a, b, e; bottom row: c, d, f. a,b) As fig. 1b, but 1% and 5% noise, respectively. c,d) Maximum entropy deconvolutions from (a) and (b). S = 14.846 and S = 14.915. e) As fig. 1b, but with 75% of the points discarded. f) Maximum entropy deconvolution of (e). S = 14.841.

to give independent uniform patches of about the size of the point spread function /10/. The next example, taken from Bryan & Skilling /14/, is also at first sight a straightforward deconvolution. Fig. 3a shows an optical picture of the galaxy M87, taken with the Mount Palomar 200-inch telescope, with an inset of the image of a nearby star, which defines the point spread function. The spread is mainly due to atmospheric effects. There is clearly noise on the picture, which is related to the image intensity in a way known from the characteristics of the photographic plate, increasing with image intensity. As before, we deconvolve using maximum entropy and the χ² test, giving the result of fig. 3b. The noise has clearly been reduced and the resolution increased to show more structure, yet the peak intensities are unexpectedly less than in the original. Examining the normalised residuals n_k = √w_k (Γ_k(f) − d_k), we find that they are large and positive at the peaks; moreover, comparing the distribution of the residuals with the expected Gaussian (fig. 3c), the bulk of the histogram is too narrow and off-centre. This means that the background, which contains most of the points in the image, is shifted systematically from the data, and follows variations in it too closely. The large value data points could be fitted more closely by constraining to a smaller value of χ², but this would then lead to spurious resolution elsewhere, with noise on the data being interpreted as true signal /2/.

A statistical test is needed that restricts the outliers in the distribution. This can be achieved by ordering the residuals, so that n_(1) ≤ n_(2) ≤ ... ≤ n_(M), and fitting the ordered residuals to the values that they would have if they really came from a normal distribution. That is, n_(i) should take the value

ν_i = Φ⁻¹((i − ½)/M),

where

Φ(z) = (2π)^(−1/2) ∫_{−∞}^{z} exp(−t²/2) dt

is the cumulative normal probability. The distance between n and ν can be measured by

E² = Σ_{i=1..M} (n_(i) − ν_i)²,

Figure 3. a) Contour map of an optical photograph of the galaxy M87; contour levels 10, 20, ..., 100, 120, 140, .... Inset: point spread function, equal contour intervals. b) Maximum entropy deconvolution of (a), using the χ² statistic; same contour levels as (a). c) Histogram of χ² residuals compared with unit Gaussian. d) Maximum entropy deconvolution of (a), using E². e) Histogram of E² residuals compared with unit Gaussian.
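A sketch of the statistic in code (names mine; the plotting position (i − ½)/M is an assumption about the exact form used):

```python
import numpy as np
from scipy.stats import norm

def e_squared(f, d, w, gamma):
    """Order-statistic misfit E^2: the sorted normalised residuals are
    compared with their expected values under a unit normal."""
    keep = w > 0                                         # skip zero-weight points
    n = np.sort(np.sqrt(w[keep]) * (gamma(f) - d)[keep]) # ordered n_(i)
    M = n.size
    nu = norm.ppf((np.arange(1, M + 1) - 0.5) / M)       # nu_i = Phi^-1((i-1/2)/M)
    return np.sum((n - nu) ** 2)
```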


whose expectation value is asymptotically log log M /14/. The only extra step in the algorithm is to normalise and order the residuals for use in the statistical test. The solution is now non-unique: the algorithm stops when an image is found which has the maximum entropy on a surface with the correct E², such that no other ordering of the residuals gives a lower E², and not necessarily that with the global entropy maximum over all the M! E² surfaces corresponding to all possible orderings of the residuals. It can be shown, however, that the expected error is less than one standard deviation, spread somehow over the whole image, and is therefore negligible /14/. The result of applying this to the M87 picture is shown in fig. 3d, and is a clear improvement, with an increase in resolution and yet still with a smooth background. The existence of additional structure has since been confirmed from observations under better 'seeing' conditions and with more sophisticated (CCD) detectors (Fielden, private communication). As expected, the histogram of residuals now conforms to a Gaussian (fig. 3e). The penalty paid for this more sophisticated test is about a factor of two in the number of iterations required, and it is also essential to know the noise statistics accurately.

We have so far assumed that the point spread function is known exactly. If not, but its functional form, defined in terms of a few parameters, is known, then we can attempt to optimise the fit to the data with respect to these parameters as well. We make an initial estimate of their values, and attempt to solve the problem by maximum entropy. We may succeed in fitting the data if the estimate is sufficiently good, or perhaps not, due to a large number of points being forced down to zero. In either case, the values of the parameters are then adjusted so as to minimise the value of χ², keeping the current image fixed. A new maximum entropy solution can then be calculated using the new parameters, and the process iterated. We illustrate this with an example drawn from phase-contrast electron microscopy, where the Fourier transform of the point spread function (the contrast transfer function) can be approximated as sin(α₁k⁴ − α₂k²) exp(−α₃k⁴), where k is the reciprocal space vector, and α₁, α₂ and α₃ are parameters related to the spherical aberration, defocus and chromatic aberration respectively. α₁ and α₃ are essentially fixed for a given microscope, but the defocus is at the control of the experimenter. To record even moderately low resolution data in phase contrast, a very large value of defocus must be used, which creates a very rapidly oscillating CTF, and correction for the resulting effects is essential. A first estimate of the defocus can be made from the positions of the zeros in the Fourier transform of the image. A simulation of this (fig. 4) shows an initial object (not to be taken too seriously), and the 'data' derived from this using a defocus of 1.5 μm, with added noise of about 1% of the maximum signal, admittedly much less than expected from genuine EM data. Fig. 4c shows the maximum entropy restoration of this using a defocus of 1.3 μm, with the larger blobs evident, but confused by a very noisy background. The value of χ² attained was about 3χ₀². The procedure described above was then followed until χ² = χ₀² was reached, with χ₀² simultaneously being a minimum of χ² with respect to the defocus value. The calculated value of defocus was 1.49 μm. The result (fig. 4d) shows much better resolution of the smaller peaks, and a smoother background. The large-scale variation in background is not surprising, as the low resolution information is removed by the point spread function. Note that the entropy of the final reconstruction is greater than for that calculated with the incorrect defocus value, despite the closer fit to the data. This technique of parameter refinement has been used in other maximum entropy applications, such as calibration of phases in radio astronomy /15/ and NMR spectroscopy /16/, and refinement of heavy atom positions /17/ in isomorphous replacement calculations using fibre diffraction data.

Figure 4. a) Contour map of simulated object. S = 7.638, compared with 9.720 for a flat image. b) Simulated data from (a), using a phase contrast transfer function with 1.5 μm defocus and 1% noise added. c) Deconvolution of (b) using 1.3 μm defocus. S = 9.480. d) Result of simultaneously refining the defocus value and maximising the image entropy, leading to a calculated defocus value of 1.49 μm. S = 9.600.
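The alternation between image and parameter steps can be sketched as follows. This is illustrative only: `maxent_solve` is a hypothetical stand-in for the algorithm of section III, and scipy's bounded scalar minimiser replaces whatever line search was actually used:

```python
import numpy as np
from numpy.fft import fft2, ifft2
from scipy.optimize import minimize_scalar

def ctf(k2, a1, a2, a3):
    """Contrast transfer function sin(a1 k^4 - a2 k^2) exp(-a3 k^4),
    evaluated on a grid k2 of squared reciprocal-space distances."""
    k4 = k2 * k2
    return np.sin(a1 * k4 - a2 * k2) * np.exp(-a3 * k4)

def chi2_of_defocus(a2, f, d, w, k2, a1, a3):
    """chi^2 as a function of the defocus parameter, image held fixed."""
    mock = np.real(ifft2(ctf(k2, a1, a2, a3) * fft2(f)))
    return np.sum(w * (mock - d) ** 2)

def refine_defocus(f0, d, w, k2, a1, a2, a3, maxent_solve, n_cycles=10):
    """Alternate a maximum entropy reconstruction (image step) with a
    one-dimensional minimisation of chi^2 over the defocus (parameter step)."""
    f, a2_cur = f0, a2
    for _ in range(n_cycles):
        f = maxent_solve(d, w, ctf(k2, a1, a2_cur, a3))   # image step
        res = minimize_scalar(chi2_of_defocus,
                              bounds=(0.5 * a2_cur, 2.0 * a2_cur),
                              method="bounded",
                              args=(f, d, w, k2, a1, a3))
        a2_cur = res.x                                    # parameter step
    return f, a2_cur
```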

The final example is taken from X-ray fibre diffraction. The Fourier transform of a perfectly aligned array of helical particles is a set of parallel layer lines. Imperfect alignment and limited coherence length cause layer line broadening in the directions of the polar angle and the fibre axis respectively /18,19/. The point spread function is spatially variant, and the broadening cannot be calculated by a convolution, so the values relating each point on the layer lines to each point in film space must be calculated separately /20/. Fig. 5a shows an example of a diffraction pattern from a well oriented fibre of the filamentous virus Pf1 /21/. Maximum entropy has been applied to this deconvolution problem (R. K. Bryan and C. Nave, in preparation). A few points with exceptionally strong intensity have overexposed the film; this can be allowed for by setting to zero the weights on these data points, the layer line intensity then being derived from the surrounding points of the broadened layer line. The closely overlapping layer lines from the strong 10 Å near-equatorial region are shown in fig. 5b. Although reasonably well resolved, there is clearly some spillover, as witnessed by the poor fit of the intensity computed from a structure determined from these data /17/ to the tails of layer lines with only high order Bessel function terms. An approach to correcting this problem is discussed further in the next section. Ideally, we would like to refine the disorientation parameters, as for the defocus in the previous problem, but the functional form is sufficiently complicated that it has so far been impracticable to compute the necessary derivatives. The parameters were instead estimated by trial and error, looking at the fit of the layer line profiles in various parts of the diffraction pattern.

Figure 5. a) Fibre diffraction pattern of filamentous virus Pf1 (from ref. /21/). b) Layer line amplitudes in the strong 10 Å region derived from (a) by maximum entropy deconvolution (thin line), with the transform of the resulting structure superimposed (thick line). The selection rules (l, n) are (5,16), (6,5), (7,−6), (8,−17), where n is the order of the Bessel function. Horizontal scale 0.0025 Å⁻¹ per division. (From R. K. Bryan, M. Bansal, W. Folkhard, C. Nave and D. A. Marvin, unpublished.)

V. Discussion. The examples in the previous section have demonstrated the power of the maximum entropy method in the deconvolution problem. In principle, it can be applied to any inverse problem in which the data which would arise from a particular trial image can be predicted. Any instrumental effect that the experimenter can describe can be incorporated into the forward transform. If the image-data transform allows us to define a convex set of feasible solutions, then the maximum entropy solution is unique. Many other problems have intrinsically ambiguous solutions; in particular the phase problem, which has been the subject of much recent work (e.g. /13,22/), may yet succumb to the method. However, there is one important aspect ignored so far in this discussion. The entropy expression used is not the most general form satisfying the consistency requirements. The relative entropy

S(p, m) = −Σ_i p_i log(p_i / m_i)

measures the entropy of the distribution p relative to an initial model or 'prior map' m. So far m has been implicitly taken as uniform. In the absence of any data, the maximum entropy solution is p = m, so that the method is now looking for evidence in the data that p is different from m. This is potentially a powerful way of incorporating prior knowledge of the solution into the analysis. For example, in the layer line deconvolution problem, information on which orders of Bessel functions occur on which layer lines was not incorporated into the analysis described above. Not surprisingly, intensity sometimes appears incorrectly on meridional regions of layer lines with high order Bessel function terms which are adjacent to those with strong intensity and low order terms. When using the data for subsequent calculations of electron density /17/, the obviously erroneous data were discarded by hand. The deconvolution calculation was correct from the maximum entropy point of view, since the intensity is spread out as much as possible within the constraints provided by the data. An appropriate prior map, set to zero near the meridian, could partly supply this information. Even more information could be incorporated once an initial structure has been deduced from the layer line data. The calculated intensities will probably not fit the layer line data exactly where, for instance, strong layer lines overlap and have been incorrectly deconvolved. Using the calculated intensities as a prior should permit more accurate deconvolution. Additionally, if the layer line data are to be derived from more than one specimen, using calculated intensities as priors enables information from well-aligned specimens to be incorporated in the layer line deconvolution of others. The relative entropy expression should be thought of as a way of combining prior information with new data. Its potential in this respect has only just begun to be explored.
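In code the change from the Shannon form is a one-line generalisation; a minimal sketch (function name mine):

```python
import numpy as np

def relative_entropy(p, m):
    """S(p, m) = -sum p log(p/m).

    For normalised p and m this is <= 0, with equality only at p = m,
    so in the absence of data the maximum is the prior map itself.
    Bins where m is near zero are heavily penalised unless p is also
    small there, which is how a prior set to zero near the meridian
    would suppress spurious meridional intensity.
    """
    return -np.sum(p * np.log(p / m))
```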

References.

/1/ Ables, J. G., Astr. Astrophys. Suppl., 15, (1974), 383.
/2/ Gull, S. F. and Daniell, G. J., Nature, 272, (1978), 686.
/3/ Shore, J. E. and Johnson, R. W., IEEE Trans., IT-26, (1980), 26; and IEEE Trans., IT-29, (1983), 942.
/4/ Jaynes, E. T., "Monkeys, Kangaroos, and N", Paper presented at the Workshop on Bayesian and Maximum Entropy Methods in Inverse Problems, University of Calgary, (1984). To be published by Cambridge University Press.
/5/ Gull, S. F. and Skilling, J., "The maximum entropy method", in Indirect Imaging, ed. J. A. Roberts, Cambridge University Press, (1984), 267.
/6/ Frieden, B. R., J. Opt. Soc. Am., 62, (1972), 511.
/7/ Willingale, R., Mon. Not. R. astr. Soc., 194, (1981), 359.
/8/ Bryan, R. K., Maximum Entropy Image Processing, PhD Thesis, University of Cambridge, (1980).
/9/ Skilling, J., "Algorithms and applications", Paper presented at the First Workshop on Maximum Entropy Estimation and Data Analysis, University of Wyoming, (1981).
/10/ Burch, S. F., Gull, S. F. and Skilling, J., Computer Vision, Graphics, and Image Processing, 23, (1983), 113.
/11/ Skilling, J. and Bryan, R. K., Mon. Not. R. astr. Soc., 211, (1984), 111.
/12/ Skilling, J. and Livesey, A. K., Paper presented at the EMBO Workshop on Maximum Entropy Methods in the X-ray Phase Problem, Orsay, France, (1984).
/13/ Bricogne, G., Acta Cryst., A40, (1984), 410.
/14/ Bryan, R. K. and Skilling, J., Mon. Not. R. astr. Soc., 191, (1980), 69.
/15/ Scott, P. F., Mon. Not. R. astr. Soc., 194, (1981), 25P.
/16/ Sibisi, S., Nature, 301, (1983), 134.
/17/ Bryan, R. K., Bansal, M., Folkhard, W., Nave, C. and Marvin, D. A., Proc. Natl. Acad. Sci. USA, 80, (1983), 4728.
/18/ Holmes, K. C. and Barrington Leigh, J., Acta Cryst., A30, (1974), 635.
/19/ Fraser, R. D. B., Macrae, T. P., Miller, A. and Rowlands, R. J., J. Appl. Cryst., 9, (1976), 81.
/20/ Provencher, S. W. and Gloeckner, J., J. Appl. Cryst., 15, (1982), 132.
/21/ Nave, C., Brown, R. S., Fowler, A. G., Ladner, J. E., Marvin, D. A., Provencher, S. W., Tsugita, A., Armstrong, J. and Perham, R. N., J. Mol. Biol., 149, (1981), 675.
/22/ Wilkins, S. W., Varghese, J. N. and Lehmann, M. S., Acta Cryst., A39, (1983), 47.
/23/ Jaynes, E. T., IEEE Trans., SSC-4, (1968), 227.