Comparison between Large Displacement Optical Flow Algorithms
Revision as of 16:34, 22 March 2014
Background
Large Displacement
Optical flow was long considered an almost solved problem after the algorithm suggested by Lucas and Kanade [1]. However, because the conventional algorithms [1,2] rest on the assumption of constant brightness among neighboring pixels, they often fail when there is a large displacement caused by fast object motion, camera motion, or occlusion.
Large Displacement Optical Flow Algorithm
Brox and Malik proposed an optical flow algorithm that adopts a variational approach and combines it with descriptor matching [3]. The idea of their work is to find corresponding regions using sparse SIFT [4] descriptors. Their method can be formulated as follows.
Energy
The algorithm suggested by Brox and Malik [3] mainly consists of three parts: a data term (color and gradient constancy), a smoothness term, and a descriptor matching term.
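A simplified sketch of this energy (up to the exact robust penalties and weights used in [3], so treat it as illustrative rather than the paper's verbatim formula) is

<math>
E(\mathbf{w}) = \int_{\Omega} \Psi\big(|I_2(\mathbf{x}+\mathbf{w}) - I_1(\mathbf{x})|^2\big) + \gamma\,\Psi\big(|\nabla I_2(\mathbf{x}+\mathbf{w}) - \nabla I_1(\mathbf{x})|^2\big) + \alpha\,\Psi\big(|\nabla u|^2 + |\nabla v|^2\big) + \beta\,\delta(\mathbf{x})\,|\mathbf{w}(\mathbf{x}) - \mathbf{w}_{\mathrm{SIFT}}(\mathbf{x})|^2 \; d\mathbf{x}
</math>

where <math>\mathbf{w} = (u, v)</math> is the flow field, <math>\Psi(s^2) = \sqrt{s^2 + \varepsilon^2}</math> is a robust penalty, and <math>\delta(\mathbf{x})</math> is nonzero only where a sparse SIFT descriptor match <math>\mathbf{w}_{\mathrm{SIFT}}</math> is available.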

Methods - Deep Flow and Deep Matching
Deep Flow
Deep Flow, proposed in [5], uses Deep Matching for its descriptor matching. In spirit, Deep Flow is similar to Brox and Malik's work [3]: it also minimizes a variational energy and incorporates descriptor matching to find corresponding regions. It differs in that Brox and Malik [3] perform sparse descriptor matching, while Deep Matching produces dense correspondences. The following sections explain Deep Flow's properties, structure, and formulation in more detail.
Deep Matching (1). independently movable subpatches
As the figure below illustrates, with conventional HoG template matching (as used in SIFT descriptor matching) the second configuration is the best matching result. Rather than using a rigid configuration of subpatches, Deep Matching allows each subpatch of the reference image to move independently to find its best fit in the target image.
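The idea can be illustrated with a toy score function (a hypothetical helper, not the actual DeepMatching code): each 4-by-4 subpatch of an 8-by-8 reference patch searches a small neighborhood around its rigid position and contributes its best correlation, so the patch as a whole tolerates small non-rigid deformations.

```python
import numpy as np

def subpatch_score(ref_patch, target, top_left, radius=2):
    """Toy illustration of Deep Matching's idea: score an 8x8 reference
    patch by letting each of its four 4x4 subpatches shift independently
    (within `radius` pixels) to its best-correlating position in `target`,
    instead of forcing one rigid placement as in HoG template matching."""
    total = 0.0
    for dy in (0, 4):                       # four non-overlapping subpatches
        for dx in (0, 4):
            sub = ref_patch[dy:dy + 4, dx:dx + 4]
            best = -np.inf
            for sy in range(-radius, radius + 1):   # independent local search
                for sx in range(-radius, radius + 1):
                    y = top_left[0] + dy + sy
                    x = top_left[1] + dx + sx
                    if (0 <= y and 0 <= x and
                            y + 4 <= target.shape[0] and x + 4 <= target.shape[1]):
                        win = target[y:y + 4, x:x + 4]
                        best = max(best, float(np.sum(sub * win)))
            total += best                   # patch score = sum of subpatch bests
    return total
```

Because the zero shift is always among the candidates, the deformable score is never worse than the rigid-placement score.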
Deep Matching (2). convolution
The reference (first) image is divided into non-overlapping patches of 4 by 4 pixels, each of which is convolved with the target image.
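A naive version of this bottom level might look as follows (`response_maps` is an illustrative helper, not Deep Matching's API; the real implementation normalizes patch responses and uses fast convolutions):

```python
import numpy as np

def response_maps(reference, target, patch=4):
    """Sketch of the bottom level of Deep Matching: split the reference
    image into non-overlapping `patch` x `patch` blocks and correlate each
    block with every same-sized window of the target image, yielding one
    response map per reference patch."""
    H, W = target.shape
    maps = {}
    for py in range(0, reference.shape[0] - patch + 1, patch):
        for px in range(0, reference.shape[1] - patch + 1, patch):
            block = reference[py:py + patch, px:px + patch]
            block = block - block.mean()                 # crude normalization
            rm = np.empty((H - patch + 1, W - patch + 1))
            for y in range(rm.shape[0]):
                for x in range(rm.shape[1]):
                    win = target[y:y + patch, x:x + patch]
                    rm[y, x] = np.sum(block * (win - win.mean()))
            maps[(py, px)] = rm                          # one map per patch
    return maps
```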
Deep Matching (3). aggregating into larger patches
After the response maps are generated by the convolution, max-pooling and sub-sampling reduce each map to half its size, and neighboring patches are aggregated so that 8-by-8 patches are built from the 4-by-4 response maps. This process is repeated until 16-by-16 and 32-by-32 patches are obtained. This multi-layer structure is similar to deep convolutional nets [6], which is where the names Deep Matching and Deep Flow come from.

[[File:aggregationdetail.png]]
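One aggregation step can be sketched like this (simplified: the actual algorithm in [5] also shifts each child's map according to its offset inside the parent before combining):

```python
import numpy as np

def aggregate(rm_tl, rm_tr, rm_bl, rm_br):
    """Sketch of one aggregation step of Deep Matching: the response maps
    of four neighboring N x N patches are max-pooled, sub-sampled by two,
    and averaged, giving the response map of the 2N x 2N parent patch."""
    def pool_half(rm):
        # 3x3 max-pooling followed by taking every second pixel
        H, W = rm.shape
        padded = np.pad(rm, 1, mode="edge")
        pooled = np.empty_like(rm)
        for y in range(H):
            for x in range(W):
                pooled[y, x] = padded[y:y + 3, x:x + 3].max()
        return pooled[::2, ::2]
    children = [pool_half(r) for r in (rm_tl, rm_tr, rm_bl, rm_br)]
    return np.mean(children, axis=0)    # parent response = average of children
```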
Deep Matching (4). quasi-dense correspondences
In the pyramid of multi-scale response maps, a local maximum is retrieved for every matched patch, even in weakly textured areas. This is what makes Deep Matching and Deep Flow stronger than Brox and Malik's method [3], whose sparse descriptor matching tends to miss points in textureless areas.

[[File:multiscalepyramid.png]]
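Retrieving the maxima at one pyramid level can be sketched in plain NumPy (a toy `local_maxima` helper; the back-tracking of each maximum down the pyramid, which yields the actual pixel correspondences, is omitted here):

```python
import numpy as np

def local_maxima(response, threshold=0.0):
    """Return (row, col) positions of strict local maxima (4-neighborhood)
    in a response map, ignoring the one-pixel border. Deep Matching
    retrieves such maxima at every pyramid level, even in weakly textured
    areas, before back-tracking them to correspondences."""
    r = response
    interior = r[1:-1, 1:-1]
    is_peak = ((interior > r[:-2, 1:-1]) & (interior > r[2:, 1:-1]) &
               (interior > r[1:-1, :-2]) & (interior > r[1:-1, 2:]) &
               (interior > threshold))
    ys, xs = np.nonzero(is_peak)
    return np.stack([ys + 1, xs + 1], axis=1)   # offset back to full-map coords
```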
Deep Flow Energy
The Deep Flow energy keeps the variational formulation of [3], combining data and smoothness terms with a matching term, but the matching term is now driven by the quasi-dense correspondences produced by Deep Matching rather than by sparse SIFT matches.
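Schematically, and again as a sketch rather than the exact formula of [5], the energy has the form

<math>
E(\mathbf{w}) = \int_{\Omega} E_{\mathrm{data}} + \alpha\, E_{\mathrm{smooth}} + \beta\, E_{\mathrm{match}} \; d\mathbf{x}, \qquad E_{\mathrm{match}} = c(\mathbf{x})\, \Psi\big(\|\mathbf{w}(\mathbf{x}) - \mathbf{w}'(\mathbf{x})\|^2\big)
</math>

where <math>\mathbf{w}'</math> is the quasi-dense flow produced by Deep Matching and <math>c(\mathbf{x})</math> a per-pixel matching confidence. Because <math>\mathbf{w}'</math> is dense rather than sparse, the matching term guides the minimization almost everywhere in the image.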
References
1. B. D. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the 7th International Joint Conference on Artificial Intelligence, 1981.
2. B. Horn and B. Schunck. Determining optical flow. Artificial Intelligence, 17:185-203, 1981.
3. T. Brox, C. Bregler, and J. Malik. Large displacement optical flow. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
4. D. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91-110, 2004.
5. P. Weinzaepfel, J. Revaud, Z. Harchaoui, and C. Schmid. DeepFlow: Large displacement optical flow with deep matching. In IEEE International Conference on Computer Vision (ICCV), 2013.
6. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, November 1998.




