Campiotti
Back to Psych 221 Projects 2013
Introduction
The primary motivation for this project is the proliferation of forged images in modern society (e.g. in advertising, viral images, and even political races) and the increasing need to detect these forgeries in a forensic setting. This project seeks to reproduce the results of [1], which proposes the use of the underlying statistics of an image resulting from color filter array (CFA) interpolation (i.e. demosaicking) to detect localized forgeries in an image.
CFA Interpolation
Most CMOS sensors used in cameras today have sensitivities spanning the entire visible spectrum or more. In order to obtain information about the different color bands when a photograph is taken, a CFA is placed in front of the sensor array. With this in place, each pixel in the array can only detect one band of colors, determined by the color filter in front of it. Numerous CFA patterns are utilized, each featuring three or more colors. The most common is the Bayer pattern, shown in Figure 1.
The remaining color channels for a given pixel must be interpolated from neighboring pixels. In order to do this, a CFA demosaicking algorithm must be implemented, of which there are literally hundreds (for a description of some of these algorithms, see [1]). The common theme amongst all CFA algorithms is that the interpolated values are some combination of neighboring measured values. The method proposed in [1] and emulated here assumes a linear model (i.e. that interpolated values are a weighted sum of neighboring measured values).
Methods
As mentioned previously, the methods used here are those used in [1], which describes an algorithm the authors call expectation/maximization (EM), a version of iteratively reweighted least squares.
Expectation/Maximization Algorithm
Even assuming a linear model, there are still several issues that need to be addressed. The primary issue is that it is not known which pixels are interpolated and which pixels were actually measured by the sensor (estimated in the expectation step) or by how much each neighbor for a given pixel should be weighted (estimated in the maximization step).
Expectation Step
The EM algorithm begins by estimating the probability that each pixel is interpolated and fits the linear model
(1)
where f(x,y) denotes the pixel value for a given color channel, αu,v denotes the weighting of each neighbor, N is the chosen neighborhood size, and n(x,y) is an error (i.e. noise) term.
Maximization Step
In the maximization step, weighted least squares is used to determine the αu,v terms that minimizes the total error over the entire image between the actual pixel values and the estimates for the pixel values determined by the linear model.
(4)
Each term in the sum is weighted by the probability that the pixel is interpolated, given by equation (2). If these probabilities were exactly correct (i.e. 1 for interpolated pixels and 0 for measured pixels), only pixels that were truly interpolated would contribute to the total error. As shown in [2], equation (4) is the characteristic error equation for iteratively reweighted least squares, which have solutions for the αu,v terms of the form
(5)
where t here represents the current iteration.
Similarity Measure
The EM algorithm essentially provides two outputs (one from each of its steps): 1) a probability map of the likelihoods that each pixel is interpolated and 2) the αu,v terms that best fit the linear model. The authors of [1] propose using the Fourier transform of the probability map as a metric to determine whether an image is a forgery by comparing it to the Fourier transform of a synthesized, ideal probability map. For each color channel, this ideal probability map is generated by assuming the CFA has a Bayer pattern (as shown in Figure 1) and distributing the probabilities perfectly (i.e. the interpolation probability is 0 for measured pixels and 1 for interpolated pixels). This is illustrated for each color channel in equation (6).
(6)
As shown in equation (7), [1] proposes a measure of similarity between the two Fourier transforms as the inner product of their magnitudes. Magnitudes are used in order for the similarity measure to be phase invariant (i.e. so that the synthesized probability map is not required to align perfectly with the EM-generated probability map).
(7)
The authors of [1] propose setting an empirically determined similarity measure threshold, above which a color channel is determined to be CFA interpolated and thus not forged. As will be discussed in the Results section, this similarity measure has too large a range for the image set in question in order to reliably generate a threshold that could be used to determine forgeries. The cosine similarity (equation 8) is proposed as an alternative measure of similarity, which takes into account the magnitudes of the vectors in question and generates a similarity measure in the range [0,1].
(8)
Results
Retinotopic models in native space
Some text. Some analysis. Some figures.
Retinotopic models in individual subjects transformed into MNI space
Some text. Some analysis. Some figures.
Retinotopic models in group-averaged data on the MNI template brain
Some text. Some analysis. Some figures. Maybe some equations.
Equations
If you want to use equations, you can use the same formats that are use on wikipedia.
See wikimedia help on formulas for help.
This example of equation use is copied and pasted from wikipedia's article on the DFT.
The sequence of N complex numbers x0, ..., xN−1 is transformed into the sequence of N complex numbers X0, ..., XN−1 by the DFT according to the formula:
- Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle X_k = \sum_{n=0}^{N-1} x_n e^{-\frac{2 \pi i}{N} k n} \quad \quad k = 0, \dots, N-1}
where i is the imaginary unit and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle e^{\frac{2 \pi i}{N}}} is a primitive N'th root of unity. (This expression can also be written in terms of a DFT matrix; when scaled appropriately it becomes a unitary matrix and the Xk can thus be viewed as coefficients of x in an orthonormal basis.)
The transform is sometimes denoted by the symbol Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathcal{F}} , as in Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathbf{X} = \mathcal{F} \left \{ \mathbf{x} \right \} } or Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathcal{F} \left ( \mathbf{x} \right )} or Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mathcal{F} \mathbf{x}} .
The inverse discrete Fourier transform (IDFT) is given by
- Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle x_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k e^{\frac{2\pi i}{N} k n} \quad \quad n = 0,\dots,N-1.}
Retinotopic models in group-averaged data projected back into native space
Some text. Some analysis. Some figures.
Conclusions
Here is where you say what your results mean.
References
Software
Appendix I - Code and Data
Code
Data
Appendix II - Work partition (if a group project)
Brian and Bob gave the lectures. Jon mucked around on the wiki.