LuppescuShah: Difference between revisions
imported>Student2016 |
imported>Student2016 |
||
| Line 34: | Line 34: | ||
<!-- Same example as in Wikipedia:WikiProject_Computer_science/Manual_of_style#Algorithms --> | |||
'''algorithm''' ford-fulkerson '''is''' | |||
'''input:''' Graph ''G'' with flow capacity ''c'', | |||
source node ''s'', | |||
sink node ''t'' | |||
'''output:''' Flow ''f'' such that ''f'' is maximal from ''s'' to ''t'' | |||
''(Note that f<sub>(u,v)</sub> is the flow from node u to node v, and c<sub>(u,v)</sub> is the flow capacity from node u to node v)'' | |||
'''for each''' s6edge (''u'', ''v'') '''in''' ''G''<sub>''E''</sub> '''do''' | |||
''f''<sub>(''u'', ''v'')</sub> ← 0 | |||
''f''<sub>(''v'', ''u'')</sub> ← 0 | |||
'''while''' there exists a path ''p'' from ''s'' to ''t'' '''in''' the residual network ''G''<sub>''f''</sub> '''do''' | |||
let ''c''<sub>''f''</sub> be the flow capacity of the residual network ''G''<sub>''f''</sub> | |||
''c''<sub>''f''</sub>(''p'') ← min{''c''<sub>''f''</sub>(''u'', ''v'') | (''u'', ''v'') '''in''' ''p''} | |||
'''for each''' edge (''u'', ''v'') '''in''' ''p'' '''do''' | |||
''f''<sub>(''u'', ''v'')</sub> ← ''f''<sub>(''u'', ''v'')</sub> + ''c''<sub>''f''</sub>(''p'') | |||
''f''<sub>(''v'', ''u'')</sub> ← −''f''<sub>(''u'', ''v'')</sub> | |||
'''return''' ''f'' | |||
The main point of mean shift is that instead of specifying a fixed number of clusters (like in K-means), one must simply specify a search radius. Intuitively, the smaller the search radius, the more clusters you will have, as there will be fewer overlapped clusters. | The main point of mean shift is that instead of specifying a fixed number of clusters (like in K-means), one must simply specify a search radius. Intuitively, the smaller the search radius, the more clusters you will have, as there will be fewer overlapped clusters. | ||
Revision as of 04:03, 14 December 2016
Introduction
Background
Methods
Hole Filling
The pixels in the depth image with zero values were considered to be holes.
Mean filter-based Hole Filling
We implemented a basic hole-filling algorithm using a mean filter. A mean filter of size NxN updates the central pixels with the mean of the NxN neighborhood around the central pixel. However, directly applying a mean filter to the entire image updates not only the value of the holes but also the value of the pixels with valid depth values. Hence, we first find the pixels that correspond to holes i.e. have value = 0 and apply the mean filter to update values of only the pixels that correspond to the holes. Also, the neighborhood of a hole may have holes as well. For faster convergence, it is useful not to include the holes in the neighborhood while calculating the mean. Thus, for each pixel corresponding to a hole, this can be written mathematically as -
The hole-filling pipeline is given below -
Median filter-based Hole Filling
Hole-filling can also be implemented using the algorithm mentioned by using a median filter instead of a mean filter. A median filter updates the value of a pixel to the median of the values in a neighborhood around the pixel. Again, to speed-up the convergence, one should only consider the non-zero values to calculate the median. Thus, for each pixel corresponding to a hole, this can be written mathematically as -
The hole-filling pipeline using the median filter is given below -
Segmentation
K-means Segmentation
K-means is one of the simplest image segmentation algorithms. The main purpose of this algorithm is to cluster data. In our case, we want to cluster pixels by depth such that pixels at similar depths will be in the same cluster. A theoretical treatment of this problem ultimately leads to a two-step algorithm: Step1 Step 2. In step 1, each data point is assigned to the cluster with the closest centroid. Step 2 then recalculates the cluster centroids given the new assignments from step 1. These two steps are repeated until convergence, i.e. the change in cluster centers is below a certain threshold.
Mean shift Segmentation
As an alternative to K-means, we also implemented the mean shift segmentation algorithm. The mean shift algorithm seeks local maxima of densities in a given distribution, i.e. it seeks local modes in distributions. The equation to determine the centroid is:
Consider the following figure in a feature space of dimension 2:
In the first figure, the blue point is the centroid of the previous cluster and the orange point is the new centroid of the current cluster. At each iteration, the center of the search area is shifted to the new centroid, and this occurs until convergence, where the difference between the new centroid and the old centroid is below some threshold. In the case of depth map images, the feature space is simply the 1d histogram of the grayscale values. In general, the pseudo code for the mean shift algorithm is as follows:
algorithm ford-fulkerson is
input: Graph G with flow capacity c,
source node s,
sink node t
output: Flow f such that f is maximal from s to t
(Note that f(u,v) is the flow from node u to node v, and c(u,v) is the flow capacity from node u to node v)
for each s6edge (u, v) in GE do
f(u, v) ← 0
f(v, u) ← 0
while there exists a path p from s to t in the residual network Gf do
let cf be the flow capacity of the residual network Gf
cf(p) ← min{cf(u, v) | (u, v) in p}
for each edge (u, v) in p do
f(u, v) ← f(u, v) + cf(p)
f(v, u) ← −f(u, v)
return f
The main point of mean shift is that instead of specifying a fixed number of clusters (like in K-means), one must simply specify a search radius. Intuitively, the smaller the search radius, the more clusters you will have, as there will be fewer overlapped clusters.
Simulating Depth of Field
Depth of field is region around the focal plane that appears acceptably sharp in an image. Sometimes it is desirable to have certain parts of image in focus and others out of focus. For example, focusing on the foreground, while blurring the background. For this purpose, we need a shallow depth of field. For images, where almost everything in the scene should be in focus, we need a larger depth of field. Typically, depth of field depends on the type of lens, aperture and focal length.
Since depth maps give us information about how far objects in a scene are, we figured we could use that information to simulate depth of field. Given a reference point from the depth map, the amount of blur applied to each pixel is determined by the distance of that pixel from the reference pixel. The farther away the pixel from the reference pixel, the more blur applied. Essentially, we made the variance of a Gaussian blurring kernel proportional to how far a pixel was from a reference pixel.
Simulating Tilt-shift
Tilt-shift is achieved by tilting and shifting the camera such that one can selectively focus a region of an image. This effect can be simulated by making a wedge-shaped plane to be in focus and by blurring out the other parts of an image relative to the plane. For the blurring aspect of tilt-shift, we incorporate depth information for more realistic blurring.
Results
Conclusion & Future Work
In this project, we explored different hole filling procedures for depth maps, implemented different segmentation algorithms, and simulated depth of field and tilt-shift photography as post processing effects based on depth maps. Hole filling worked very well for the close range images. For next year, an extension of hole filling could be to implement more complex hole filling algorithms like colorization and hierarchical hole filling and to compare the results to the simpler methods presented in this project. We are happy we explored both K-means and mean shift segmentation, as both seemed to work well in different scenarios. For the short range camera, we achieved foreground detection by using K-means with number of clusters = 2. For the longer range depth image, we used mean shift with a radius of 0.04. K-means failed to segment out the body of the person standing in the image from the rest of the clutter in the long range image, but mean shift did reasonably well. The downside to mean shift is that it is much more computationally expensive. For next year, students could explore other types of segmentation algorithms for depth maps like the n-cuts segmentation algorithm.
The tilt-shift algorithm works reasonably well, as there is a miniaturization effect when done correctly on an image. However, we were not able to incorporate a large amount of depth information because, in order for tilt-shift to work, the object needs to be sufficiently far away -- which is mostly outside of the range of the long-range depth camera.
Lastly, depth maps allowed us to automatically simulate depth of field given a point of interest, which would be quite tedious given only RGB information. We noticed that there were edge effects when computing depth of field, creating a “glowing” effect around an object in focus. For next year, students could try to improve this depth of field algorithm by accounting for those edge artifacts to reduce the “glowing.” Also, we chose an arbitrary amount to blur based on distance, so another extension of this project could be to incorporate camera information like focal length and f-number to simulate how images would look with different types of cameras.
References
Appendix
Appendix 1
Appendix 2
Work breakdown:
Raj - Got cameras working so we could get depth images from both the long range and short range cameras.
Greg - Implemented GUI.
Both - All the algorithms were implemented together.