
5. Multi-Conjugate Adaptive Optics

MCAO general idea

5.1. Why MCAO?

Multi-Conjugate Adaptive Optics (MCAO) is a further development of the original AO concept. It consists of correcting the turbulence in three dimensions with more than one deformable mirror (DM). Each DM (see the Figure) is optically conjugated to a certain distance from the telescope. We call this distance the conjugation altitude, although the term range would be more correct. The benefit of MCAO is reduced anisoplanatism, hence a larger compensated field of view (FoV).

In order to avoid vignetting, the projected diameter of a DM must be equal to $D + 2 \Theta H$, where D is the telescope diameter, $\Theta$ is the radius of the FoV and H is the conjugation altitude. Hence, the upper DMs must be larger than the telescope pupil; the areas they cover are called meta-pupils. The beams from scientific objects and guide stars (GSs) do not sample the whole meta-pupil, but have smaller footprints.
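The geometry above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the numerical values are arbitrary, and the footprint formula $D (1 - H/H_{\rm src})$ for a source at finite range (with a sodium layer assumed at about 90 km) is a simple cone-geometry assumption not stated in the text.

```python
import math

ARCSEC_PER_RAD = 206264.806  # arcseconds in one radian


def meta_pupil_diameter(d_tel_m, fov_radius_arcsec, h_conj_m):
    """Projected DM diameter D + 2*Theta*H needed to avoid vignetting."""
    theta_rad = fov_radius_arcsec / ARCSEC_PER_RAD
    return d_tel_m + 2.0 * theta_rad * h_conj_m


def beam_footprint_diameter(d_tel_m, h_conj_m, source_range_m=math.inf):
    """Footprint of a single beam at range h_conj_m.

    A natural star (infinite range) keeps the full diameter D.
    A source at finite range gives a cone that shrinks linearly
    (assumed geometry): D * (1 - H / H_source).
    """
    return d_tel_m * (1.0 - h_conj_m / source_range_m)


# Arbitrary illustrative case: 4 m telescope, DM at 10 km, FoV radius 30 arcsec
meta = meta_pupil_diameter(4.0, 30.0, 10_000.0)
ngs = beam_footprint_diameter(4.0, 10_000.0)
lgs = beam_footprint_diameter(4.0, 10_000.0, source_range_m=90_000.0)
print(f"meta-pupil {meta:.2f} m, NGS footprint {ngs:.2f} m, LGS footprint {lgs:.2f} m")
```

The meta-pupil grows linearly with both the FoV radius and the conjugation altitude, while each individual beam never covers more than the telescope diameter.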

Question: What is the meta-pupil diameter for an 8 m telescope with DM2 conjugated to 8 km and a FoV diameter of 2 arcminutes? What are the diameters of the beam footprints for a sodium LGS?

The signals driving these DMs are obtained from several WFSs, each observing its own guide star (GS). The information from the WFSs is processed by a reconstructor in order to retrieve the 3-dimensional instantaneous wave-front perturbations, as in medical tomography, where the 3D structure of an object is derived by viewing it from different angles. In our context the technique is called turbulence tomography; the reconstruction is done by matrix multiplication.

AO, tomography, and MCAO

Tomography is useful even with only one DM, because it makes it possible to infer the compensation signal for a target which is at a large angular distance from the GSs. In this way a better compensation quality is achieved compared to the use of just one GS, and the sky coverage of an AO system with natural GSs is improved. With LGSs, tomography helps to reduce the cone effect: solving for several turbulent layers from the signals of several GSs makes it possible to combine these layers in a correct way (without stretching or missing portions) to achieve the best compensation for a selected target. Thus, tomography can be used without MCAO, but MCAO would not work without tomography.

The currently mounting interest in tomography and MCAO is directly related to the prospect of turbulence correction at large telescopes (LTs) and at future Extremely Large Telescopes (ELTs) in the whole optical and IR range. The logic leading to this conclusion is shown in this scheme and is as follows:

The enthusiasm about MCAO and a few erroneous papers contributed to the impression that this is a magic solution for a complete removal of turbulence effects. In the following we provide some realistic estimates of the large (although still finite) gain brought by MCAO and of its associated problems.

5.2. Correction-limited field-of-view

Suppose that by some magic the instantaneous perturbations in all atmospheric layers are known, and that we have at our disposal a finite number M of ideal deformable mirrors. How large a FoV can be corrected? By answering this question, we obtain the size of the correction-limited FoV.

It was shown (JOSA, V. A17, P. 1819, 2000) that in the limit of a very large telescope the residual phase variance due to anisoplanatism, $\langle \epsilon_{\rm aniso}^2 \rangle$, is given by a familiar expression

\begin{displaymath}
\langle \epsilon_{\rm aniso}^2 \rangle =
\left( \frac{\theta}{\theta_M} \right)^{5/3}, \qquad (1)
\end{displaymath}

where $\theta$ is the angular distance of the target from the FoV center (which is perfectly corrected). The new parameter $\theta_M$ replaces the classical isoplanatic patch $\theta_0$; the gain in the FoV size is equal to the ratio of these angles. The dependence on imaging wavelength remains the same, $\theta_M \propto \lambda^{6/5}$.
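As a minimal numerical sketch of Eq. (1) and of the wavelength scaling, the following Python functions may help. The function names and the value of $\theta_M$ (10 arcseconds at 0.5 microns) are purely hypothetical and chosen only for illustration.

```python
def aniso_variance(theta_arcsec, theta_m_arcsec):
    """Residual phase variance (rad^2) from Eq. (1): (theta / theta_M)^(5/3)."""
    return (theta_arcsec / theta_m_arcsec) ** (5.0 / 3.0)


def scale_patch(theta_m_arcsec, lam_ref_um, lam_um):
    """Rescale theta_M to another wavelength using theta_M ~ lambda^(6/5)."""
    return theta_m_arcsec * (lam_um / lam_ref_um) ** (6.0 / 5.0)


# Hypothetical value: theta_M = 10 arcsec at 0.5 microns
theta_m_visible = 10.0
theta_m_k_band = scale_patch(theta_m_visible, 0.5, 2.2)

# Variance at the same 20 arcsec offset, in the visible and at 2.2 microns
print(aniso_variance(20.0, theta_m_visible))
print(aniso_variance(20.0, theta_m_k_band))
```

By construction the variance equals 1 square radian at $\theta = \theta_M$, so $\theta_M$ can be read as the field radius at which the anisoplanatic error becomes significant.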

Weighting functions

In order to compute $\theta_M$, the turbulence altitude profile $C_n^2(h)$ must be known. The profile must be multiplied by some weighting function F(h) and integrated over altitude to obtain the residual phase error, hence to obtain $\theta_M$. The weighting function is zero at the conjugate altitudes of DMs (corresponding layers are perfectly corrected by these ideal DMs), and positive at intermediate altitudes. These functions are plotted in the Figure for classical AO (F0, one DM conjugated to zero altitude), for the case when one DM is conjugated to 5 km (F1), as in the Gemini-N Altair AO system, and for the case of an MCAO with 2 DMs (F2) conjugated to 2 km and 10 km.

The 2-DM curve was obtained by assuming that each intermediate layer is corrected by both DMs, and that the correction is shared between the DMs in an optimum proportion. This strategy gives better results than a simple "allocation" of turbulent slabs to be corrected by the adjacent DMs, although an order-of-magnitude FoV size can be estimated crudely by summing up the anisoplanatic effects of all slabs. The same strategy of optimally shared corrections applies to more than 2 DMs.

For any particular $C_n^2(h)$ the conjugation altitudes of DMs that result in the largest FoV can be found. When there are strong turbulent layers in the atmosphere, it is advantageous to conjugate DMs to these layers. However, in all cases a significant fraction of turbulence is distributed continuously at all altitudes (see the profile), hence the performance gain obtained for "layered" profiles is small compared to continuous profiles with the same $\theta_0$. The optimum conjugation altitudes for 1, 2, and 3 DMs are shown HERE for a particular profile. The compensation quality is only a weak function of exact DM conjugation altitudes.

Question: What would be the compensated FoV size in an ideal MCAO system considered here with 2 DMs and all turbulence in 2 thin layers?

The actual gain in the FoV size brought by an increasing number of DMs was computed for the 12 profiles at Cerro Paranal: 4-5 times with 2 DMs, 7-10 times with 3 DMs. Increasing the number of DMs further brings smaller benefits, and for a large M, eventually, $\theta_M \propto M$. This result is intuitively clear for continuously distributed turbulence: the thickness of the slab assigned to each DM is inversely proportional to M (recall that correcting the first Zernike modes brings the largest gain; here the situation is somewhat similar). So, to correct a large FoV of diameter 2$\Theta$ with sub-aperture size d, we need, very roughly, $M = 2 \Theta H/d$ DMs, where H is the thickness of the turbulent atmosphere.
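The rough rule $M = 2 \Theta H/d$ is easy to evaluate numerically. The sketch below uses a hypothetical function name and arbitrary input values, and takes H to be the thickness of the turbulent atmosphere; it is an order-of-magnitude estimate, not a design formula.

```python
import math

ARCSEC_PER_RAD = 206264.806  # arcseconds in one radian


def dm_count_estimate(fov_diam_arcsec, thickness_m, subap_m):
    """Very rough number of DMs, M = 2*Theta*H/d, for a FoV of diameter 2*Theta."""
    fov_diam_rad = fov_diam_arcsec / ARCSEC_PER_RAD
    return fov_diam_rad * thickness_m / subap_m


# Arbitrary illustrative case: 2 arcmin FoV, 10 km of turbulence, d = 0.5 m
m = dm_count_estimate(120.0, 10_000.0, 0.5)
print(f"M is roughly {m:.1f}, i.e. about {math.ceil(m)} DMs")
```

The estimate scales linearly with the FoV diameter and inversely with the sub-aperture size, which is why wide-field correction at short wavelengths (small d) quickly becomes demanding.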

Question: Supposing that by using 2 DMs instead of one we widen the FoV size by 5 times, estimate the achievable FoV diameter at 0.5 and 2.2 microns if $\theta_0$ at 0.5 microns is 2.5 arcseconds.

5.3. Turbulence tomography

The first works in turbulence tomography aimed at modeling the turbulent atmosphere as a few thin layers and at trying to infer the phase perturbations in these layers from the signals obtained on several GSs (by solving a system of linear equations). In order to do this, the number of unknowns must be less than or equal to the number of measurements, which means at least 1 GS per layer. The layers were identified with DMs, of course.

In reality there is an infinite number of turbulent layers in the atmosphere and the WFS data are noisy. This calls for statistical techniques like optimum filtering. What is actually needed is not the reconstruction of the whole turbulent volume, but the best possible estimate of the compensating signals using the information actually available from the GSs. This approach is also called tomography; it has been demonstrated experimentally (Nature, V. 403, P. 54, 2000).

Tomographic solution for 2 layers and 2 GSs

Suppose again that the telescope is very big, and that only NGSs are used (no wave-front stretching). Then the problem can be treated by Fourier techniques. Each spectral component of the wave-front distortion (a sinusoidal perturbation) is measured and corrected separately. As seen in the Figure, the relative spatial shift of the signals between two sources separated by an angle $\Theta$ is $\Theta H$, where H is the distance between the layers. For a Fourier component with spatial frequency f the phase shift is $2\pi f \Theta H$. The WFSs measure the combined effect (sum) of both layers, which is different for the two GSs owing to the phase shift. From these two signals it is possible to reconstruct the two layers by solving an algebraic system of two equations. However, when the phase shift is exactly $2 \pi$, the two signals become identical and the system cannot be solved. This happens at the critical frequency $f_c = 1/(\Theta H)$.
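The two-equation system described above can be written out explicitly for a single Fourier component. In the sketch below, the complex amplitudes a (lower layer) and b (upper layer), the function names, and the choice to place the first GS on axis are all assumptions made for illustration: the first GS measures a + b, the second measures a + b exp(i 2 pi f Theta H).

```python
import cmath
import math


def critical_frequency(theta_rad, h_m):
    """Spatial frequency f_c = 1/(Theta*H) at which the phase shift reaches 2*pi."""
    return 1.0 / (theta_rad * h_m)


def solve_two_layers(s1, s2, f, theta_rad, h_m, eps=1e-9):
    """Recover layer amplitudes (a, b) from the two GS signals.

    Assumed model: s1 = a + b (on-axis GS), s2 = a + b*exp(i*phi),
    with phi = 2*pi*f*Theta*H.
    """
    phi = 2.0 * math.pi * f * theta_rad * h_m
    det = cmath.exp(1j * phi) - 1.0  # vanishes when phi is a multiple of 2*pi
    if abs(det) < eps:
        raise ValueError("signals are identical: the layers cannot be separated")
    b = (s2 - s1) / det
    a = s1 - b
    return a, b


# Arbitrary geometry: Theta = 1e-4 rad (about 20 arcsec), H = 10 km
THETA, H = 1e-4, 10_000.0
a_true, b_true = 1 + 2j, 0.5 - 1j

f = 0.25  # cycles per metre, well below f_c
shift = cmath.exp(2j * math.pi * f * THETA * H)
s1, s2 = a_true + b_true, a_true + b_true * shift
print(solve_two_layers(s1, s2, f, THETA, H))

try:
    solve_two_layers(s1, s1, critical_frequency(THETA, H), THETA, H)
except ValueError as err:
    print("at f_c:", err)
```

Close to the critical frequency the denominator becomes small, so any measurement noise in the two signals is strongly amplified even before the system becomes exactly singular.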

If there is turbulence between the layers, the situation remains qualitatively similar. For phase shifts more than 1 radian, the Fourier components from different GSs become de-correlated and the achievable degree of turbulence compensation diminishes. It means that in order to correct small-scale perturbations (large f), the distance between guide stars $\Theta$ (i.e. FoV size) must become smaller. When the whole atmosphere becomes thinner (smaller H), the corrected FoV opens up.

Question: Estimate the maximum FoV that can be corrected with sub-apertures of 1 m for a turbulence thickness of H=5 km.

Question: For a uniform turbulence distribution of H=5 km thickness estimate the number of DMs and GSs needed to correct a FoV of 5 arcminute diameter in the visible (sub-aperture size d=0.3 m).

These considerations lead to a formula (JOSA V. A18, P. 873, 2001) expressing the residual error of a wave-front of some scientific target that can be reconstructed using a constellation of K bright NGSs at some radius $\Theta$ around the target:

\begin{displaymath}
\langle \epsilon_{\rm tom}^2 \rangle =
\left( \frac{\Theta}{\gamma_K} \right)^{5/3}. \qquad (2)
\end{displaymath}

Here $\gamma_K$ is the tomographic patch size, which, again, is computed from the $C_n^2(h)$ profile. The dependence on wavelength is the same as for $\theta_0$. In fact, $\gamma_K$ can be written as $\gamma_K = r_0/\delta_K$, where the equivalent thickness of the atmosphere $\delta_K$ plays the same role as the equivalent height $\bar{h}$ in the classic formula $\theta_0 = r_0/\bar{h}$.

Using more GSs, we reduce the apparent "thickness" of the atmosphere and open up the tomographic FoV. The gain in the FoV size is equal to $\gamma_K/\theta_0 = \bar{h}/\delta_K$. For typical profiles it amounts to about 10 with 3 GSs and about 20 with 5 GSs.
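The gain quoted above follows directly from the two definitions, and it can be translated into a reduction of the error at a fixed GS radius. The comparison with the classical anisoplanatic variance $(\Theta/\theta_0)^{5/3}$ in the second relation is an assumption made here for illustration only.

```latex
\[
\frac{\gamma_K}{\theta_0}
  = \frac{r_0/\delta_K}{r_0/\bar{h}}
  = \frac{\bar{h}}{\delta_K},
\qquad
\frac{\langle \epsilon_{\rm tom}^2 \rangle}{(\Theta/\theta_0)^{5/3}}
  = \left( \frac{\theta_0}{\gamma_K} \right)^{5/3}
  = \left( \frac{\delta_K}{\bar{h}} \right)^{5/3}.
\]
```

For a gain of $\gamma_K/\theta_0 = 10$, the second ratio is $10^{-5/3} \approx 0.02$, i.e. under this assumption the phase variance at a given angular distance would drop by a factor of about 50.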

This theory is very general and does not take into account the telescope diameter, for example. In reality the overlap between the footprints of GS beams on the high-altitude layers will be incomplete; some portions of the layers may not be sampled at all and hence remain unmeasured. These effects may dominate the total tomographic error under certain conditions (a 4 m telescope in the IR range) or may be insignificant in other cases (an 8 m telescope in the visible range, or ELTs). The tomographic patch size $\gamma_K$ provides a system-independent lower limit to the error of phase reconstruction.

Using several GSs, we collect more photons. Does it mean that the tolerance on the GS brightness can be relaxed and fainter GSs can be used for tomography, as compared to classical AO? The answer depends on the size of the reconstructed FoV. If the FoV is much smaller than $\gamma_K$, the GS signals are correlated and, indeed, individual GSs can be fainter than a single GS. On the other hand, if we want to take advantage of the full tomographic FoV, the GSs must be at least as bright as (or even brighter than) a single GS in classical AO, because the solution of the tomographic problem leads to noise amplification (as in other inverse problems).

There may not be enough NGSs for tomography, especially at shorter imaging wavelengths. Moreover, the WFSs must be re-configured for each telescope pointing, and the command matrix of the MCAO must be updated accordingly. Clearly, LGSs should be a better solution for an MCAO system (see Gemini MCAO). However, in view of the LGS problems, some researchers think of using several NGSs and tomography to correct the scientific object, even with a single DM. F. Rigaut proposes to correct only the lower atmospheric layers. The resulting performance will be well below the diffraction limit, but an improved seeing over a wide FoV can be profitable for observations in the visible. E. Gendron proposes to build a multi-object spectrometer where each target will be corrected by a miniature AO system using the signals from several surrounding NGSs and tomographic reconstruction.

5.4. Tilt problem in MCAO

Null modes in MCAO

Tips and tilts of several LGSs remain undetermined for the same reason as in single-LGS AO systems. As a consequence, the information brought by the LGSs becomes insufficient for a full solution of the tomographic problem. In addition to the overall tip and tilt, there appear at least 3 additional undetermined modes (or null modes). They correspond to the differential astigmatism and defocus between the two DMs (see the Figure). These modes do not influence the on-axis image quality, but rather produce a differential tilt between the different parts of the FoV, or tilt anisoplanatism (this is why they cannot be measured with LGSs). Simulations show that if tilt anisoplanatism is left uncorrected, the stars in the FoV will move with respect to each other, as though the whole FoV were randomly distorted.

GS layout

Question: Draw the relative displacements of 5 GSs located in the FoV as shown that are provoked by the Zernike modes 4, 5, 6 applied to the upper DM.

The three additional modes can be sensed with two additional NGSs, making their total number 3. The differential tilts between the NGSs constrain these modes. Alternatively, a single NGS can be used to sense Zernike modes 2 to 6 (radial orders 1 and 2). This requires a brighter NGS, of course. The first solution seems to produce better performance and better sky coverage and hence is preferable.

What happens if the tip-tilt sensors of the 3 NGSs are positioned with small errors? The MCAO system will compensate these errors in the closed loop, hence the FoV will be distorted! For example, the plate scale (arcseconds per pixel) will change if the upper DM has a static defocus. Special procedures must be applied to ensure that these errors do not compromise the astrometric performance of an MCAO system (like flattening of the upper DM before closing the loop).

The insight into the tilt anisoplanatism provided by MCAO leads to a suggestion to use 3 NGSs even for the "standard" tip-tilt correction. If, in addition to the tip and tilt, the modes 4-6 are corrected by a DM conjugated to some altitude, a large part of the tilt anisoplanatism will vanish. It means that an improvement of the image quality can be achieved not only in the vicinity of the tip-tilt NGS, but in a wider FoV. A second advantage of using 3 NGSs for tip-tilt correction is that the tilt anisoplanatism is measured even without the addition of a low-order DM. Hence, a better correction of the scientific target can be achieved, e.g. in LGS-based AO systems. The calculations of sky coverage show that the need to have 3 NGSs instead of one is more than compensated by the increased FoV where these NGSs can be found. Hence, "tilt tomography" promises an improved sky coverage even for LGS-based single-conjugate AO systems.

5.5. Modal MCAO systems

The general scheme of an MCAO system is shown in the Figure in Sect. 5.1. The WFSs provide information on a certain number of wave-front parameters, e.g. they measure several Zernike modes. This data vector is multiplied by a command matrix (see Reconstructors) to obtain the correction signals applied to the DMs. These signals can also be specified as Zernike modes, which is why we call it a modal MCAO.

The problem of command matrix optimization was treated in a number of works. When noise and turbulence statistics are taken into account, something like a Wiener filter is obtained. Typically, the optimization criterion is the minimum weighted residual phase variance over the FoV (or at some specified FoV locations). A simpler and more traditional approach consists in constructing an interaction matrix and inverting it to obtain the command matrix. However, optimization gives substantially better results (see below).
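The "traditional" approach of inverting an interaction matrix can be illustrated on a toy system. The sketch below is an assumption-laden example, not the optimized reconstructor discussed in the text: the interaction matrix D (3 measurements, 2 modes) is invented, the helper names are hypothetical, and the inversion is a plain least-squares inverse $(D^T D)^{-1} D^T$ without any noise or turbulence statistics.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix; raises if it is (nearly) singular."""
    (p, q), (r, s) = m
    det = p * s - q * r
    if abs(det) < 1e-12:
        raise ValueError("interaction matrix is rank-deficient")
    return [[s / det, -q / det], [-r / det, p / det]]


def command_matrix(interaction):
    """Least-squares inverse (D^T D)^-1 D^T for an interaction matrix with 2 columns."""
    dt = transpose(interaction)
    return matmul(inverse_2x2(matmul(dt, interaction)), dt)


# Invented interaction matrix: rows are WFS measurements, columns are DM modes
D = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
C = command_matrix(D)

modes_true = [[0.3], [-0.2]]            # modal coefficients to be recovered
measurements = matmul(D, modes_true)    # noiseless WFS data vector
modes_estimated = matmul(C, measurements)
print(modes_estimated)
```

In this noiseless toy case the modes are recovered exactly; the optimized (Wiener-like) command matrix differs from this inverse precisely because it weighs the measurements by the noise and turbulence statistics.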

The performance of MCAO systems can be studied using a complete Monte-Carlo simulation. This technique requires a large amount of computation and is suitable for a detailed performance analysis of an MCAO system at the design stage. Alternatively, the optimized command matrix and the resulting performance can be derived from second-order statistical quantities, like covariances of Zernike coefficients (in modal MCAO) or covariances of S-H signals and DM actuator signals (in zonal MCAO). The modal covariance codes are the fastest tools to date.

Modal tomography results

Monte-Carlo simulations and covariance codes address the issues which were neglected in the Fourier theory, namely beam overlap, cone effect (in the case of LGSs), and finite order of correction. In the Figure, the residual phase variance of the first 66 Zernike modes in an 8 m telescope is plotted for an object at the FoV center (this corresponds to tomography, because the object can be corrected with only one DM). The solid lines show the results with 3 and 5 NGSs at increasing distance from the object. The dotted line shows the limiting tomographic error for 3 NGSs and an infinite telescope; as can be seen, the actual errors are much larger, because here beam overlap is the major source of tomographic error.

The dashed line shows the performance achievable with 3 sodium LGSs, under the condition that the tip and tilt are perfectly compensated for the object. At close LGS separation the performance is worse than with NGSs because of the cone effect. When the LGS radius reaches 9 arcseconds, their distance from the telescope axis is just 4 m; in this case the cone effect is partially removed by tomography, and the residual error is below 1 square radian at 0.5 microns (caution: higher modes must be considered before stating that the cone effect is beaten and LGS correction in the visible is possible).

It might seem strange that at large separations 3 LGSs give better results than 3 NGSs; the reason for this paradox is the tip-tilt compensation, supposed to be perfect for LGSs. The dash-dot line shows the case when LGSs are replaced by NGSs with perfect tip-tilt compensation, to demonstrate that the apparent gain is indeed due to this assumption.

MCAO with 3 GSs and 2 DMs

Both covariance codes and Monte-Carlo simulations demonstrate that the quality of compensated images delivered by MCAO is much more uniform over the FoV than in classical AO. For example, the variation of the Strehl ratio (at 2.2 microns) over the 2 arcminute FoV is plotted for a 2-DM MCAO system using 3 NGSs (full line). Each of the 2 DMs corrects 66 Zernike modes. For comparison, the performance of MCAO with an inverse command matrix (dashed line) and the performance of classical AO (dotted lines) are over-plotted. The positions of the GSs and test points are shown in the inset.

PSF variation in AO (left) and in MCAO (right)

The variations of the PSF shape across the FoV were simulated by R. Conan for a classical AO compensating 66 Zernike modes at an 8 m telescope (left) and for an MCAO with 3 DMs and 3 NGSs (right). The FoV size is 4x4 arcminutes; the guide stars (marked as red dots) are the 14th to 15th magnitude natural stars around the planetary nebula NGC 2346. The imaging wavelength is 2.2 microns.

Another unexpected result of the simulations is that for a 2-DM MCAO system the conjugation altitude of the second DM can be changed over a wide range without affecting the performance. This is very useful: the distance between the telescope and the turbulent layers changes in time and, additionally, depends on the telescope zenith distance. These changes can be accommodated by re-optimizing the command matrix, so there is no need to change the optical conjugation.

5.6. Layer-oriented MCAO

The concept of layer-oriented MCAO is developed by R. Ragazzoni and his colleagues. It is close to the original MCAO idea of J. Beckers and to the early versions of medical tomography, when the layers of a 3-D object were isolated by "focusing" on them while illuminating the object from different angles.

Layer-oriented MCAO

Suppose that we measure the wave-fronts using many natural guide stars. If the WFSs are optically conjugated to some altitude H, the signals of all NGSs corresponding to this layer would be identical. However, another layer at altitude h will be seen with various relative shifts. If all signals are averaged, the measured layer will not be affected, but other layers will be smoothed with a typical length of 2$\Theta$(H-h), where $\Theta$ is the radius of the FoV. In short, the contribution of our selected layer will be enhanced as compared to other layers.
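The smoothing length 2$\Theta$(H-h) is simple to evaluate. In the sketch below the function name and the numerical values are arbitrary, and the absolute value of (H-h) is taken so that layers above and below the conjugation altitude are treated alike, which is an assumption of this example.

```python
ARCSEC_PER_RAD = 206264.806  # arcseconds in one radian


def smoothing_length(fov_radius_arcsec, h_wfs_m, h_layer_m):
    """Typical smoothing length 2*Theta*|H - h| of a layer at altitude h
    seen by WFSs conjugated to altitude H and averaged over the FoV."""
    theta_rad = fov_radius_arcsec / ARCSEC_PER_RAD
    return 2.0 * theta_rad * abs(h_wfs_m - h_layer_m)


# Arbitrary illustrative case: FoV radius 60 arcsec, WFS at 8 km, layer at 2 km
print(f"{smoothing_length(60.0, 8_000.0, 2_000.0):.2f} m")
# The conjugated layer itself is not smoothed at all
print(smoothing_length(60.0, 8_000.0, 8_000.0))
```

Perturbations in the non-conjugated layer that are smaller than this length are averaged out, while larger scales still leak into the measurement.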

In a layer-oriented system (LOS), the averaging of the signals from many stars is done not in a computer, but by adding their light on a single detector (this can be achieved with a multi-pyramid WFS). The combined signal is fed to a DM conjugated to the same altitude. Part of the stellar light is used by another WFS, conjugated to another altitude, which drives a second DM. Of course, the layers are not completely independent: a WFS at some layer H "sees" smoothed wave-fronts at all other layers and the smoothed corrections applied to all other DMs. But the system works in closed loop, trying to adjust itself and to drive the signals in all layers to zero. There is a hope that the contributions of individual layers will eventually be disentangled. The simulations and theory show that this indeed happens under some conditions.

Question: For a layer-oriented MCAO system with two DM-WFS pairs separated by 5 km and a FoV diameter of 5 arcminutes, estimate the size of perturbations that are left un-compensated in an intermediate turbulent layer. For the same conditions, estimate at which spatial scales there will be a strong cross-talk between the layers.

Multi-pyramid WFS

Layer-oriented MCAO can be regarded as an attempt to solve the tomographic problem in hardware. The system cannot be optimized with respect to the brightness of individual stars, turbulence profile, etc. It is expected that under given conditions it will perform slightly worse than an optimized modal MCAO. On the other hand, the LOS concept has several advantages: simplified computing (the AO loop is closed in each layer independently), the possibility of using many very faint NGSs (to overcome detector readout noise), and the possibility of tracking the wind-driven turbulence in individual layers with longer exposure times. Not all problems in the actual implementation of this concept have been solved yet.

Question: How will the photon noise change if the number of layers in a LOS is doubled?

5.7. MCAO: the near future

The Gemini team has embarked on actually building an MCAO system for the Gemini-S telescope. The project is now (2001) past the conceptual design stage. The goal is to achieve a uniform turbulence compensation in the near IR J, H, K bands over a 1 arcminute FoV.

Although the system parameters may still change, its main features are summarized in the Table.

DM conjugate ranges: 0, 4.5 and 9 km
DM orders: 16, 16 and 8 actuators across the pupil
Number of guide stars: 5 sodium LGSs and 3 NGSs
LGS geometry: center and 4 corners of a 42.5 arcsec square
WFS orders: S-H, 16 by 16 (LGS); tip-tilt (NGS)
LGS laser power: 10 W per beam
Launch telescope: behind the telescope secondary, 45 cm
NGS magnitudes: 3 times 19 (for 50% Strehl reduction in H)
Control bandwidths: 33 Hz (LGS); 0-90 Hz (NGS)

The compensated FoV will be at least 1 arcminute square (up to 2 arcminutes for partial compensation in the K band); the variations of the Strehl ratio across the FoV are constrained to be a few percent. Detailed performance characteristics are available at the Gemini web site.

In a parallel effort, the European Southern Observatory (in collaboration with several European institutions) is planning to build an MCAO demonstrator for an 8-m VLT telescope that will use NGSs. The goal of this project is to show the feasibility of MCAO, which is perceived as a major milestone towards ELT projects (ELTs are considered useless without MCAO).

Work on MCAO is also being done at the Lund observatory, at Durham (UK) and at Palomar Observatory.

Summary. Multi-conjugate AO will attempt to correct the 3-dimensional turbulence, improving the accessible FoV and other AO parameters, especially when LGSs are used. It relies on turbulence tomography - a technique to extract the multi-layer correction signals in the optimum way by measuring several GSs. With 2-3 DMs and 3-5 GSs, the FoV is opened up by a factor of 5-10, depending on the vertical turbulence profile. Tomography will improve AO performance even with a single DM in different ways (correcting the cone effect; better tip-tilt correction; better sky coverage).
