Comparison between Large Displacement Optical Flow Algorithms

From Psych 221 Image Systems Engineering
Revision as of 17:35, 22 March 2014 by Projects221

Background

Large Displacement


Optical flow was long considered an almost solved problem after the algorithm proposed by Lucas and Kanade[1]. However, because the conventional algorithms[1,2] rely on a brightness constancy assumption within a small neighborhood of pixels, they often fail when there is a large displacement caused by fast object motion, camera motion, or occlusions.

Large Displacement Optical Flow Algorithm

Brox and Malik proposed an optical flow algorithm that adapts the variational approach by combining it with descriptor matching[3]. The idea of their work is to find corresponding regions using sparse SIFT[4] descriptors. Their method can be described by the energy formulation below.

Energy

The algorithm suggested by Brox and Malik[3] mainly consists of three parts: a data term (color and gradient constancy), a smoothness term, and a descriptor matching term.
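The structure of the energy can be sketched as follows. The notation here is assumed for illustration, following the three-part description above rather than the paper's exact formulation: <math>\mathbf{w}=(u,v)</math> is the flow field, <math>\mathbf{w}_1</math> the flow suggested by descriptor matches, <math>\Psi</math> a robust penalizer, <math>\delta(\mathbf{x})</math> an indicator that a match is available at <math>\mathbf{x}</math>, and <math>\gamma, \alpha, \beta</math> weighting parameters:

```latex
E(\mathbf{w}) =
\underbrace{\int_{\Omega} \Psi\!\left( |I_2(\mathbf{x}+\mathbf{w}) - I_1(\mathbf{x})|^2
  + \gamma\, |\nabla I_2(\mathbf{x}+\mathbf{w}) - \nabla I_1(\mathbf{x})|^2 \right) d\mathbf{x}}_{\text{data term}}
\;+\; \underbrace{\alpha \int_{\Omega} \Psi\!\left( |\nabla u|^2 + |\nabla v|^2 \right) d\mathbf{x}}_{\text{smoothness term}}
\;+\; \underbrace{\beta \int_{\Omega} \delta(\mathbf{x})\, |\mathbf{w}(\mathbf{x}) - \mathbf{w}_1(\mathbf{x})|^2\, d\mathbf{x}}_{\text{matching term}}
```

The data term enforces color and gradient constancy, the smoothness term regularizes the flow field, and the matching term pulls the flow toward the descriptor correspondences where they exist.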

Methods - Deep Flow and Deep Matching

Deep Flow


In [5], Deep Flow is proposed, which uses Deep Matching for its descriptor matching. In spirit, Deep Flow is similar to Brox and Malik's work[3]: it also uses a variational formulation for energy minimization and incorporates descriptor matching to find corresponding regions. It differs in that Brox and Malik[3] perform sparse descriptor matching, whereas Deep Flow performs dense correspondence matching, which the authors call Deep Matching. The following sections explain the properties, structure, and formulation of Deep Flow in more detail.

Deep Matching (1). independently movable subpatches

As we can see from the figure below, in SIFT descriptor matching with conventional rigid HoG template matching, the second configuration is the best matching result. Rather than keeping a rigid configuration of subpatches, however, Deep Matching allows each subpatch to independently find its best fit in the target image.
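The idea of independently movable subpatches can be sketched as follows. This is a toy illustration, not the DeepMatching implementation: each 4x4 subpatch searches a small window around its rigid grid location and keeps its own best position, using plain SSD on grayscale arrays (the function name, window radius, and matching cost are all choices made here for illustration).

```python
import numpy as np

def best_subpatch_positions(ref, target, patch=4, radius=2):
    """For each non-overlapping 4x4 subpatch of the reference image,
    search a small window around its rigid location in the target and
    keep the best-matching position (SSD cost, for simplicity).
    Each subpatch may thus deviate from the rigid grid."""
    positions = {}
    for i in range(0, ref.shape[0] - patch + 1, patch):
        for j in range(0, ref.shape[1] - patch + 1, patch):
            p = ref[i:i+patch, j:j+patch].astype(float)
            best, best_pos = np.inf, (i, j)
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    y, x = i + dy, j + dx
                    # Stay inside the target image.
                    if 0 <= y <= target.shape[0] - patch and 0 <= x <= target.shape[1] - patch:
                        q = target[y:y+patch, x:x+patch].astype(float)
                        ssd = ((q - p) ** 2).sum()
                        if ssd < best:
                            best, best_pos = ssd, (y, x)
            positions[(i, j)] = best_pos
    return positions
```

For example, if the target is the reference shifted one pixel to the right, each subpatch that can see its shifted copy within the search window reports a position one column over, even though no global rigid template is ever matched.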

Deep Matching (2). convolution

The reference (first) image is divided into non-overlapping patches of 4 by 4 pixels, and each patch is convolved with the target image.
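This step can be sketched as follows, assuming grayscale arrays and plain normalized cross-correlation as the patch similarity; DeepMatching itself works on richer local descriptors, so the function name and similarity measure here are illustrative choices only.

```python
import numpy as np

def patch_response_maps(ref, target, patch=4):
    """Correlate each non-overlapping 4x4 reference patch with the
    whole target image, producing one response map per patch
    (normalized cross-correlation, for simplicity)."""
    H, W = target.shape
    maps = {}
    for i in range(0, ref.shape[0] - patch + 1, patch):
        for j in range(0, ref.shape[1] - patch + 1, patch):
            p = ref[i:i+patch, j:j+patch].astype(float)
            p = p - p.mean()
            pnorm = np.linalg.norm(p) + 1e-8
            resp = np.zeros((H - patch + 1, W - patch + 1))
            # Slide the patch over every valid target position.
            for y in range(resp.shape[0]):
                for x in range(resp.shape[1]):
                    q = target[y:y+patch, x:x+patch].astype(float)
                    q = q - q.mean()
                    resp[y, x] = (p * q).sum() / (pnorm * (np.linalg.norm(q) + 1e-8))
            maps[(i, j)] = resp
    return maps
```

Each response map peaks where its patch matches the target well; these per-patch maps are the input to the aggregation described next.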

Deep Matching (3). aggregating into larger patches

After the response maps are generated by convolution, max-pooling and subsampling reduce each response map to half its size, and the patches are aggregated so that 8 by 8 patches are created from the 4 by 4 response maps. This process is repeated until 16 by 16 and 32 by 32 patches are obtained. This multi-layer structure is similar to deep convolutional nets[6], which is where the names Deep Matching and Deep Flow come from.
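One aggregation step can be sketched as below. This is a simplification made here for illustration: the pooling/subsampling halves each child response map, and the four child maps of a parent patch are then combined by a plain average, whereas the actual DeepMatching aggregation shifts each child map toward its corner of the parent before combining.

```python
import numpy as np

def max_pool_half(resp):
    """Max-pool with stride 2: keep the strongest response in each
    2x2 cell, halving the map (the subsampling step in the text)."""
    H, W = resp.shape[0] // 2 * 2, resp.shape[1] // 2 * 2
    return resp[:H, :W].reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

def aggregate_children(children):
    """Combine the pooled response maps of four child subpatches
    (top-left, top-right, bottom-left, bottom-right) into the map of
    the parent patch twice their size (simple average here)."""
    return sum(max_pool_half(c) for c in children) / len(children)
```

Applying this step repeatedly builds the pyramid of response maps for 8 by 8, 16 by 16, and 32 by 32 patches described above.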


Deep Matching (4). quasi-dense correspondences

In the pyramid of multi-scale response maps, local maxima are retrieved from every matched patch, even in weakly textured areas. This is what makes Deep Matching and Deep Flow stronger than Brox and Malik's method, whose sparse descriptor matching tends to miss points in textureless areas.
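The retrieval of local maxima from a single response map can be sketched as below. This is only the per-map step, assuming a strict 4-neighborhood maximum test and a threshold parameter chosen here for illustration; DeepMatching itself backtracks top-down through the pyramid from these maxima.

```python
import numpy as np

def local_maxima(resp, threshold=0.0):
    """Return coordinates where the response map is a strict local
    maximum over its 4-neighborhood and exceeds a threshold."""
    # Pad with -inf so border pixels can still be maxima.
    pad = np.pad(resp, 1, mode='constant', constant_values=-np.inf)
    center = pad[1:-1, 1:-1]
    is_max = (
        (center > pad[:-2, 1:-1]) & (center > pad[2:, 1:-1]) &
        (center > pad[1:-1, :-2]) & (center > pad[1:-1, 2:]) &
        (center > threshold)
    )
    return list(zip(*np.nonzero(is_max)))
```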

Results

The figure below shows one example from the MPI-Sintel dataset[7]. The result of the optical flow computation is shown in the last column, visualizing the movement of the arm and head.


Conclusions and Future Work


References

1. B. D. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of the 7th International Joint Conference on Artificial Intelligence, 1981.
2. B. Horn and B. Schunck. Determining optical flow. Artificial Intelligence, 17:185-203, 1981.
3. T. Brox, C. Bregler, and J. Malik. Large displacement optical flow. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
4. D. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91-110, 2004.
5. P. Weinzaepfel, J. Revaud, Z. Harchaoui, and C. Schmid. DeepFlow: Large displacement optical flow with deep matching. In IEEE International Conference on Computer Vision (ICCV), December 2013.
6. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, November 1998.
7. D. J. Butler, J. Wulff, G. B. Stanley, and M. J. Black. A naturalistic open source movie for optical flow evaluation. In ECCV, 2012.