SpectroPolDeconvolution

2014-05-19 09:40 | Source

python
work

We face the problem of correcting two-dimensional spectropolarimetric data from the perturbation introduced by the PSF of the Hinode solar optical telescope. The two-dimensional data has been obtained by scanning a slit on the surface of the Sun and recording the information of the four Stokes profiles $(I,Q,U,V)$ on each point along the slit for a set of discrete wavelength points around the 630 nm Fe \textsc{i} doublet. As a consequence, the data can be considered to be four three-dimensional cubes of images. We use the notation $\mathbf{I}(\lambda)$, $\mathbf{Q}(\lambda)$, $\mathbf{U}(\lambda)$ and $\mathbf{V}(\lambda)$ to refer to observed images at a certain wavelength $\lambda$. In practice, given the scanning process, these are not strictly speaking images, because each column of the image is taken at a different time. In general, in the standard image formation paradigm, the observed image $\mathbf{I}$ (for simplicity we focus on Stokes $I$ but the same expressions apply to any Stokes parameter) that one obtains in the detector after degradation by the atmosphere and the optical devices of the telescope at a given wavelength can be written as: $$ \mathbf{I} = \mathbf{O} * \mathbf{P} + \mathbf{N}, \label{eq:image_formation} $$ where $\mathbf{P}$ is the PSF of the atmosphere+telescope in the image of interest while $\mathbf{O}$ is the original unperturbed image that one would obtain with a perfect instrument without diffraction and without any atmospheric perturbation. The operator $*$ is the standard convolution operator. The quantity $\mathbf{N}$ is the noise contribution in the image formation produced at the camera. We assume that we are not in the low illumination regime and assume that $\mathbf{N}$ follows a Gaussian distribution with zero mean and diagonal covariance matrix with equal variance $\sigma_n^2$. The previous expression can be applied to individual monochromatic images, with potentially different PSFs for each wavelength. For simplicity, we make the assumption that the PSF is wavelength-independent in this work, which turns out to be a very good approximation given the wavelength ranges that we are dealing with (less than 2.5 \AA\ in the Hinode case). The specific PSF that we consider has been described in \cite{vannoort12} and obtained from the pupil specified by \cite{suematsu08}, which takes into account the entrance pupil of the telescope and the presence of a spider. Note that, although the PSF shown in the figure spans 9''$\times$9'', only the central few arcsec are really important. Under the presence of uncertainties induced by the noise, any deconvolution must be treated under a statistical framework. Consequently, we have only access to the distribution of reconstructed images. Using the Bayes theorem, the posterior distribution $p(\mathbf{O}|\mathbf{I},\mathbf{P})$ that describes the probability of the restored image given the observed image and the information about the image-forming system is given by: $$ p(\mathbf{O}|\mathbf{I},\mathbf{P}) = \frac{p(\mathbf{I}|\mathbf{O},\mathbf{P}) p(\mathbf{O})}{p(\mathbf{I})}, $$ where $p(\mathbf{I}|\mathbf{O},\mathbf{P})$ is the likelihood or, in other words, the probability that an observed image $\mathbf{I}$ has been obtained given an original image $\mathbf{O}$ and the PSF. $p(\mathbf{O})$ encodes all the a-priori information we have for the images (i.e., degree of smoothness, presence of large gradients, etc.). Finally, $p(\mathbf{I})$ is a normalization constant (termed evidence) that is independent of the unperturbed image. Under the assumption of uncorrelated Gaussian noise in every pixel of the image, the likelihood can be written as: $$ p(\mathbf{I}|\mathbf{O},\mathbf{P}) = \prod_{k=1}^N \exp \left[ - \frac{ \left(I_k - (\mathbf{O} * \mathbf{P})_k\right)^2 }{2\sigma_n^2} \right], $$ where the product is done over all $N$ pixels of the image, $I_k$ represents the $k$-th pixel of the observed image (in lexicographic order) and $(\mathbf{O} * \mathbf{P})_k$ is the $k$-th pixel (in lexicographic order) of the original image convolved with the PSF. If we additionally assume that the prior over restored images is flat (all images are equally probable), the most probable image is the one that maximizes the likelihood. Taking the derivative of the previous Gaussian likelihood with respect to the original image $\mathbf{O}$ results in the following iterative algorithm known as the Gaussian version of the Richardson-Lucy (RL) algorithm: $$ \mathbf{O}_\mathrm{new} = \mathbf{O}_\mathrm{old} + \left[\mathbf{I} - \left(\mathbf{O}_\mathrm{old} * \mathbf{P} \right) \right] \otimes \mathbf{P}, \label{eq:rl_deconvolution} $$ where the symbol $\otimes$ represents image correlation (which can be written in terms of the convolution operator). The previous iterative scheme does not guarantee positivity of the images even in the case that the observed images are positive and thus has to be forced somehow for Stokes $I$. However, note that for the Stokes parameters, the pixel values can be positive and negative. Additionally, since the RL deconvolution is a maximum-likelihood algorithm, it is sensitive to overreconstruction produced by the presence of noise. The most notable effect is the appearance of high-frequency structures in the reconstructed image. In order to avoid this problem, it is customary to stop the iterative scheme before such artifacts appear. The most straightforward way to deconvolve two-dimensional spectropolarimetric data is to deconvolve the monochromatic images of the Stokes profiles, in a way similar to what is done with imaging data \citep[e.g.,][]{vannoort05}. This approach presents two clear drawbacks. First, the number of deconvolutions one has to carry out is large. For instance, the spectropolarimetric data of Hinode SOT/SP contains 112 wavelength points. Second, many of these wavelengths contain practically no information in Stokes $Q$, $U$ and $V$. This is the case of the continuum wavelengths where, unless strong velocity fields are present, the polarimetric signal is expected to be zero. Therefore, one ends up in the difficult situation of having to deconvolve very noisy images. The nature of the RL algorithm then induces an exponential increase of the spatially high-frequency noise, making the final images useless. In general, and as a consequence of the smoothing introduced by the PSF, some information is irremediably lost. This unavoidably transforms the deconvolution process into an ill-posed problem. In particular, a set of solutions with potentially diverging power in the high spatial frequencies are perfectly compatible with the observations. Standard spatial deconvolution techniques solve this dilemma by ad-hoc spatial filtering methods that avoid the divergence of high frequencies during the deconvolution process. Typical methods include setting a hard or soft threshold on the resulting modulation transfer function that avoids the appearance of high-frequencies in the resulting image.

Orthogonal basis¶

Inspired by the work of \cite{vannoort12}, we pursue here a regularized deconvolution. Contrary to the typical procedure in image deconvolution, the regularization that we propose acts on the spectral dimension and not on the spatial dimensions. We assume that the unperturbed Stokes profiles at each pixel can be written as a linear combination of the elements of a complete orthonormal basis formed by the eigenfunctions $\{\phi_i(\lambda)\}$. Consequently, any of the unperturbed Stokes profiles can be written as: $$ \mathbf{O}(\lambda) = \sum_{i=1}^{N_\lambda} \omega_i \phi_i(\lambda), \label{eq:pca_decomposition} $$ where $N_\lambda$ is the number of wavelength points that we have in the Stokes profiles. If only a few elements of the eigenbasis are enough to reproduce the unperturbed Stokes profiles, it is advisable to truncate the previous sum and only take into account the first $N \ll N_\lambda$ eigenfunctions. Therefore, the unperturbed data is now described by a set of images $\omega_i$ (that we term projected images) which are built by projecting the Stokes profiles of each pixel on the orthonormal wavelength basis functions. Given that we have assumed that the monochromatic PSF is wavelength-independent, the observed perturbed Stokes profiles is obtained after applying Eq. (\ref{eq:image_formation}): $$ \mathbf{I}(\lambda) = \sum_{i=1}^{N} \left( \omega_i * \mathbf{P} \right) \phi_i(\lambda) + \mathbf{N}, $$ where we have used the fact that the convolution operator only acts on the spatial dimensions. Because of the presence of noise, we can find the original unperturbed Stokes profiles by computing the projection of the previous equation onto the orthonormal basis functions: $$ \langle \mathbf{I}(\lambda), \phi_k(\lambda) \rangle = \sum_{i=1}^{N} \left( \omega_i * \mathbf{P} \right) \langle \phi_i(\lambda), \phi_k(\lambda) \rangle + \mathbf{N}, \label{eq:projectionPCA} $$ where $\langle \cdot,\cdot\rangle$ indicates the dot product of the two functions. The noise term still maintains the same statistical properties because the basis functions are normalized to unit norm. Since the basis set is orthonormal, the previous expression simplifies to: $$ \langle \mathbf{I}(\lambda), \phi_k(\lambda) \rangle = \omega_k * \mathbf{P} + \mathbf{N}. $$ Consequently, the regularization process we have used implies that we have to deconvolve the projected images (associated to the basis functions $\phi_k(\lambda)$) from the PSF and reconstruct back the unperturbed image. This deconvolution is done using the RL iteration. The previous approach is valid for any set of orthonormal functions that can be used to explain the Stokes profiles \citep[e.g.,][]{hermite_deltoro03}. However, the basis obtained after PCA is ideal in our case. The reason is that the PCA decomposition transformation is defined in such a way that the first principal component accounts for as much of the variability in the data as possible, and each additional principal component in turn explains the largest variability in the data under the orthogonality constraint. Therefore, working with PCA-projected images we find that the real signal present in each pixel appears only associated to the first few elements of the basis set, while the remaining elements are used to explain the noise. Consequently, the influence of noise is largely minimized if one only focuses on the maps of low-order coefficients. This is a huge advantage with respect to the wavelength-by-wavelength deconvolution. The procedure starts by building the $M \times N_\lambda$ matrix of measurements, where the Stokes profiles (with the mean Stokes profile substracted) are placed as the rows of a matrix for each one of the $M$ observed pixels. This matrix (or equivalently its covariance matrix) is diagonalized using the singular value decomposition \citep[e.g.,][]{numerical_recipes86}. The eigenvectors obtained after the diagonalization form a basis that is efficient in reproducing the observed Stokes profiles and only a few of them are needed. In principle, and according to Eq. (\ref{eq:pca_decomposition}), we should use the PCA eigenvectors obtained with the original Stokes profiles. Since we do not have access to those profiles, we have used the observed Stokes profiles to build such database. Unless the original profiles are radically different to the observed ones, we expect the eigenbasis to be also efficient in reproducing the original Stokes profiles.

Non-orthogonal basis¶

If the Stokes parameters are expanded in a non-orthogonal basis (a so-called dictionary), we cannot directly reconstruct the profiles by projecting over the elements of the basis. Equation (\ref{eq:projectionPCA}) is then transformed into: $$ \langle \mathbf{I}(\lambda), \phi_k(\lambda) \rangle = \sum_{i=1}^{N} \left( \omega_i * \mathbf{P} \right) K_{ik} + \mathbf{N}, $$ where $K_{ik}$ are the elements of the Gram matrix of the dictionary. The previous equations constitute a linear system for the convolved images. A possible approach here is to solve the noisy linear system by minimizing the $\ell_2$ norm of the residual (for instance, using singular vector decomposition) for each pixel and then deconvolve the resulting images $\omega_i * \mathbf{P}$ using the same approach as before. The final Stokes profiles are obtained as a linear combination of the deconvolved images and the elements of the dictionary.