LuppescuShah
Revision as of 02:31, 15 December 2016
Introduction
Background
Methods
Hole Filling
The pixels in the depth image with zero values were considered to be holes.
Mean filter-based Hole Filling
We implemented a basic hole-filling algorithm using a mean filter. A mean filter of size NxN updates the central pixel with the mean of the neighborhood around it. However, directly applying a mean filter to the entire image updates not only the holes but also the pixels with valid depth values. Hence, we first find the pixels that correspond to holes, i.e. those with value = 0, and apply the mean filter to update only those pixels. Also, the neighborhood of a hole may itself contain holes. For faster convergence, it is useful not to include the holes in the neighborhood while calculating the mean. Thus, for each pixel (i, j) corresponding to a hole, this can be written mathematically as

d'(i, j) = (1 / |N_v(i, j)|) * sum over (k, l) in N_v(i, j) of d(k, l)

where
d'(i, j) is the value of the pixel at (i, j) after mean filtering
d(k, l) is the value of the pixel at (k, l) before mean filtering
N_v(i, j) is the set of valid (non-zero) pixels in the neighborhood around the central pixel
The hole-filling pipeline is given below:
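The mean-filter update described above can be sketched in Python with NumPy; the function name and defaults below are our own, not from the original implementation:

```python
import numpy as np

def fill_holes_mean(depth, size=3, iterations=4):
    """Fill zero-valued holes by averaging the valid (non-zero) neighbors."""
    d = depth.astype(float).copy()
    r = size // 2
    for _ in range(iterations):
        padded = np.pad(d, r, mode="constant")   # zero padding: border stays "invalid"
        out = d.copy()
        for i, j in np.argwhere(d == 0):         # update only the hole pixels
            window = padded[i:i + size, j:j + size]
            valid = window[window > 0]           # exclude holes from the mean
            if valid.size:
                out[i, j] = valid.mean()
        d = out
    return d
```

Because each pass fills only holes that have at least one valid neighbor, several iterations are needed for larger holes, matching the 4 iterations used in the results below.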
Median filter-based Hole Filling
Hole-filling can also be implemented using the algorithm mentioned above with a median filter instead of a mean filter. A median filter updates the value of a pixel to the median of the values in a neighborhood around the pixel. Again, to speed up convergence, one should only consider the non-zero values when calculating the median. Thus, for each pixel (i, j) corresponding to a hole,

d'(i, j) = median of { d(k, l) : (k, l) in N_v(i, j) }

The hole-filling pipeline using the median filter is given below:
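The median variant only changes the statistic applied to the valid neighbors; a minimal sketch (names our own, as above):

```python
import numpy as np

def fill_holes_median(depth, size=3, iterations=4):
    """Fill zero-valued holes with the median of the valid (non-zero) neighbors."""
    d = depth.astype(float).copy()
    r = size // 2
    for _ in range(iterations):
        padded = np.pad(d, r, mode="constant")
        out = d.copy()
        for i, j in np.argwhere(d == 0):
            window = padded[i:i + size, j:j + size]
            valid = window[window > 0]
            if valid.size:
                out[i, j] = np.median(valid)     # median instead of mean
        d = out
    return d
```

The median is more robust to outlier depth values near object boundaries, at the cost of slightly slower smoothing of large holes.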
Segmentation
K-means Segmentation
K-means is one of the simplest image segmentation algorithms. The main purpose of this algorithm is to cluster data. In our case, we want to cluster pixels by depth such that pixels at similar depths will be in the same cluster. A theoretical treatment of this problem ultimately leads to a two-step algorithm that alternates between an assignment step and an update step. In step 1, each data point is assigned to the cluster with the closest centroid. Step 2 then recalculates the cluster centroids given the new assignments from step 1. These two steps are repeated until convergence, i.e. until the change in cluster centers is below a certain threshold.
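The two-step loop can be sketched for depth values as a 1-D clustering problem; the initialization scheme and names below are our own choices:

```python
import numpy as np

def kmeans_depth(depth, k=2, tol=1e-4, max_iter=100):
    """Cluster depth values with 1-D k-means (Lloyd's algorithm)."""
    x = depth.ravel().astype(float)
    # initialize centroids evenly across the depth range (one simple choice)
    centers = np.linspace(x.min(), x.max(), k)
    for _ in range(max_iter):
        # step 1: assign each pixel to the nearest centroid
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        # step 2: recompute each centroid from its assigned pixels
        new_centers = np.array([x[labels == c].mean() if np.any(labels == c)
                                else centers[c] for c in range(k)])
        # converged when the centers stop moving
        if np.max(np.abs(new_centers - centers)) < tol:
            centers = new_centers
            break
        centers = new_centers
    return labels.reshape(depth.shape), centers
```

With k = 2 this reduces to the foreground/background split used for the short-range camera results described in the conclusion.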
Mean shift Segmentation
As an alternative to K-means, we also implemented the mean shift segmentation algorithm. The mean shift algorithm seeks local maxima of density in a given distribution, i.e. it seeks the modes of the distribution. With a flat kernel, the new centroid is simply the mean of the points inside the search window S(x) centered at the current estimate x:

m(x) = (1 / |S(x)|) * sum over x_i in S(x) of x_i
Consider the following figure in a feature space of dimension 2:
In the first figure, the blue point is the centroid of the previous cluster and the orange point is the new centroid of the current cluster. At each iteration, the center of the search area is shifted to the new centroid, and this repeats until convergence, i.e. until the difference between the new centroid and the old centroid is below some threshold. In the case of depth map images, the feature space is simply the 1-D histogram of the grayscale values. In general, the pseudocode for the mean shift algorithm is as follows:
Until all pixels have been “seen”:
    Choose a starting point that hasn’t been “seen” (in our case, the grayscale value of one of the pixels in our depth image)
    Repeat until convergence:
        Compute the mean of the data within the search window for the given search radius
        Mark all pixels in the search radius as “seen”
        Center the search window at the new mean location
    Once converged:
        If the current search window does not overlap with any existing cluster:
            Declare a new cluster
        Otherwise:
            Do not declare a new cluster
Assign all pixels to the closest cluster centers
The main point of mean shift is that instead of specifying a fixed number of clusters (like in K-means), one must simply specify a search radius. Intuitively, the smaller the search radius, the more clusters you will have, as there will be fewer overlapped clusters.
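A minimal 1-D sketch of the procedure above (flat kernel; the per-pixel “seen” bookkeeping is dropped for brevity, so every value is used as a starting point):

```python
import numpy as np

def mean_shift_modes(values, radius, tol=1e-4, max_iter=100):
    """Find modes of a 1-D distribution via flat-kernel mean shift."""
    values = np.asarray(values, dtype=float)
    modes = []
    for start in values:
        m = start
        for _ in range(max_iter):
            # mean of the data inside the current search window
            in_window = values[np.abs(values - m) <= radius]
            new_m = in_window.mean()
            if abs(new_m - m) < tol:     # converged to a mode
                break
            m = new_m
        # declare a new cluster only if no existing mode overlaps this window
        if not any(abs(m - existing) <= radius for existing in modes):
            modes.append(m)
    # assign every value to its closest mode
    labels = np.argmin(np.abs(values[:, None] - np.array(modes)[None, :]), axis=1)
    return np.array(modes), labels
```

Shrinking the radius splits the data into more, narrower windows, which is exactly the radius/cluster-count trade-off noted above.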
Simulating Depth of Field
Depth of field is the region around the focal plane that appears acceptably sharp in an image. Sometimes it is desirable to have certain parts of an image in focus and others out of focus, for example, focusing on the foreground while blurring the background. For this purpose, we need a shallow depth of field. For images where almost everything in the scene should be in focus, we need a larger depth of field. Typically, depth of field depends on the type of lens, the aperture, and the focal length.
Since depth maps give us information about how far away objects in a scene are, we figured we could use that information to simulate depth of field. Given a reference point from the depth map, the amount of blur applied to each pixel is determined by the distance of that pixel's depth from the reference depth: the farther a pixel is from the reference, the more blur is applied. Essentially, we made the variance of a Gaussian blurring kernel proportional to how far a pixel was from the reference pixel.
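One common way to approximate a per-pixel variable Gaussian blur is to pre-blur the image at a few sigma levels and pick a level per pixel by depth distance; this layered approximation, and all names below, are our own, not necessarily the original implementation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def simulate_dof(image, depth, ref, max_sigma=3.0, n_levels=4):
    """Blur each pixel in proportion to its depth distance from a reference depth."""
    img = image.astype(float)
    dist = np.abs(depth.astype(float) - ref)
    dist = dist / (dist.max() + 1e-8)            # normalize distances to [0, 1]
    # precompute the image blurred at a few sigma levels (sigma 0 = sharp)
    sigmas = np.linspace(0.0, max_sigma, n_levels)
    stack = [img] + [gaussian_filter(img, sigma=s) for s in sigmas[1:]]
    # pick, per pixel, the blur level matching its depth distance
    idx = np.clip(np.round(dist * (n_levels - 1)).astype(int), 0, n_levels - 1)
    out = np.zeros_like(img)
    for lv in range(n_levels):
        mask = idx == lv
        out[mask] = stack[lv][mask]
    return out
```

Selecting between pre-blurred layers avoids running a different kernel at every pixel, but the hard transitions between layers are one source of the edge artifacts discussed in the conclusion.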
Simulating Tilt-shift
Tilt-shift is achieved by tilting and shifting the camera such that one can selectively focus a region of an image. This effect can be simulated by making a wedge-shaped plane to be in focus and by blurring out the other parts of an image relative to the plane. For the blurring aspect of tilt-shift, we incorporate depth information for more realistic blurring.
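A simplified sketch of the blurring side of this effect, using a horizontal in-focus band rather than a full wedge-shaped plane and ignoring the depth term (names and defaults are our own):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def tilt_shift(image, focus_row, band=20, max_sigma=3.0, n_levels=4):
    """Keep a horizontal band of rows sharp and blur the rest by distance from it."""
    img = image.astype(float)
    rows = np.arange(img.shape[0])
    # distance of each row from the in-focus band (zero inside the band)
    dist = np.clip(np.abs(rows - focus_row) - band, 0, None).astype(float)
    dist = dist / (dist.max() + 1e-8)
    # pre-blur at a few sigma levels and pick one per row
    sigmas = np.linspace(0.0, max_sigma, n_levels)
    stack = [img] + [gaussian_filter(img, sigma=s) for s in sigmas[1:]]
    idx = np.clip(np.round(dist * (n_levels - 1)).astype(int), 0, n_levels - 1)
    out = np.zeros_like(img)
    for r in rows:
        out[r] = stack[idx[r]][r]
    return out
```

Replacing the row distance with a combination of row distance and depth-map distance gives the more realistic depth-aware blurring described above.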
Results
Hole Filling Results
Given below are the results we obtained for the mean filter-based hole filling algorithm. The image was taken using the Intel RealSense SR300 camera. Four iterations of hole filling were applied to the depth image, with a neighborhood size of 3x3. It can be seen that after each iteration more holes are filled, but at the same time there is a loss of detail and sharp edges. Hence, there is a trade-off between the amount of hole filling and the sharpness of the image. It is important to note that warmer colors like red represent higher depth values and cooler colors like blue represent smaller depth values. Thus, the darkest shade of blue (the color covering most of the image) corresponds to zero values and hence to holes. The background is mostly made up of zero values, as that part of the scene is beyond the range of the short-range camera.
Conclusion & Future Work
In this project, we explored different hole filling procedures for depth maps, implemented different segmentation algorithms, and simulated depth of field and tilt-shift photography as post-processing effects based on depth maps. Hole filling worked very well for the close-range images. For next year, an extension of hole filling could be to implement more complex hole filling algorithms, like colorization and hierarchical hole filling, and to compare the results to the simpler methods presented in this project. We are happy we explored both K-means and mean shift segmentation, as each seemed to work well in different scenarios. For the short-range camera, we achieved foreground detection by using K-means with the number of clusters set to 2. For the longer-range depth image, we used mean shift with a radius of 0.04. K-means failed to segment the body of the person standing in the image from the rest of the clutter in the long-range image, but mean shift did reasonably well. The downside to mean shift is that it is much more computationally expensive. For next year, students could explore other types of segmentation algorithms for depth maps, like the n-cuts segmentation algorithm.
The tilt-shift algorithm works reasonably well, as there is a miniaturization effect when it is applied correctly to an image. However, we were not able to incorporate a large amount of depth information because, in order for tilt-shift to work, the object needs to be sufficiently far away, which is mostly outside the range of the long-range depth camera.
Lastly, depth maps allowed us to automatically simulate depth of field given a point of interest, which would be quite tedious given only RGB information. We noticed that there were edge effects when computing depth of field, creating a “glowing” effect around an object in focus. For next year, students could try to improve this depth of field algorithm by accounting for those edge artifacts to reduce the “glowing.” Also, we chose an arbitrary amount to blur based on distance, so another extension of this project could be to incorporate camera information like focal length and f-number to simulate how images would look with different types of cameras.
References
Appendix
Appendix 1
Appendix 2
Work breakdown:
Raj - Got cameras working so we could get depth images from both the long range and short range cameras.
Greg - Implemented GUI.
Both - All the algorithms were implemented together.