Smartphone Camera Quality

From Psych 221 Image Systems Engineering



Revision as of 07:49, 15 December 2017

Introduction

With the increasing quality of smartphone cameras and software, everyone, from professional photographers to amateurs taking selfies, is using a smartphone as the primary device for capturing images. In recent years, consumer desire to take and share high-quality photos with a device as portable as a smartphone has exploded, as evidenced by the popularity of image capture and sharing applications like Snapchat and Instagram and the ever-increasing need for photo storage space through services like Apple iCloud and Google Photos. Following this trend, there is increasing interest in quantifying and comparing the quality control of images captured across all the smartphones that leave the manufacturing plant. The DxOMark rankings are treated as a reliable metric for comparison, but the complexity of the tests required to generate these rankings limits the sample size to only a few units of each phone.

For our project, we would like to assess the unit-to-unit variation in image quality among smartphones in the real world. In particular, we are curious whether there are variations based on smartphone manufacturer, model, price, firmware/OS version, length of ownership/use, and/or physical condition. We believe that properties tied to the physical camera hardware may vary from unit to unit, so we expect that chromatic aberration, distortion, and/or color metrics might show some variation in our preliminary results. Smartphones are now approaching $1000, with almost all recent improvements devoted to the camera, so we believe that this is an important and technologically relevant area to explore for our final project.

Background

Smartphone camera quality is an emerging area of interest for researchers and consumers. There are two primary standards for smartphone camera quality evaluation: the DxOMark ratings [1] and the IEEE standard [2].

DxOMark

DxOMark is a company that performs camera and lens image quality assessments and then provides ratings for consumers. In addition to digital camera sensors and lenses, DxOMark reviews mobile phone cameras and ranks them based on a variety of measurements. On a smartphone, they analyze the performance of the imaging pipeline in its entirety, including lens, sensor, camera control, and image processing. Their protocol includes a combination of lab testing and perceptual evaluation of images taken in the lab and in a variety of indoor and outdoor scenes. DxOMark reports sub-scores in several different categories in addition to an overall score, which is used to rank the smartphone cameras. DxOMark reviews both photos and videos captured on the smartphones. For photos, their evaluation metrics include:

  • Exposure and contrast, including dynamic range, exposure repeatability, and contrast
  • Color, including saturation and hue, white balance, white balance repeatability, and color shading
  • Texture and noise
  • Autofocus, including AF speed and repeatability
  • Artifacts, including softness in the frame, distortion, vignetting, chromatic aberrations, ringing, flare, ghosting, aliasing, moiré patterns, and more
  • Flash
  • Zoom at several subject distances
  • Bokeh

For videos, their evaluation metrics include:

  • Exposure
  • Color
  • Texture and noise
  • Autofocus
  • Artifacts
  • Stabilization

DxOMark is seen as the industry standard and their rankings are referenced in popular press and industry publications, including Forbes, The Verge, Wired, and TechRadar.

IEEE Standard

IEEE Std 1858-2016: Camera Phone Image Quality provides a detailed specification of test conditions and apparatus for evaluating smartphone image quality. The standard includes protocols for lab-based assessments as well as subjective perceptual evaluations. The evaluation metrics considered include spatial frequency response, lateral chromatic displacement, chroma level, color uniformity, local geometric distortion, visual noise, and texture blur.


Methods

Data Collection

For our project, we recruited participants to capture images with their smartphones on Stanford’s campus. Our objective was to have a broad sample of smartphone cameras, so we collected data at several different locations on campus over a period of two weeks. Our data collection locations included the Stanford Bookstore, the Graduate School of Business, the SCIEN Image Systems Engineering Seminar, the Psychology 221 class, and the SCIEN Affiliates Meeting.

[Image: MfakBookstore.JPG. Caption: Collecting data at the Stanford Bookstore]

Our objective for our data collection procedure was for it to be controlled and repeatable. We created a series of three image targets and had participants take multiple photos of each of them. In order to provide a consistent illuminant, we borrowed a light booth and brought it to each data collection location and used the daylight illuminant for every image captured. We covered the front of the light booth with a piece of cardboard and cut a small opening to point the phone camera through in order to minimize stray light entering the light booth. We used a piece of Styrofoam mounted on the cardboard front cover to serve as a stand for the phones while participants were taking photos. While different phones have the camera at different locations, the stand ensured that all phones were held in the same horizontal alignment during image capture.

We had participants capture three consecutive photos of each of three different image targets. All photos were taken with the phone in landscape orientation, using the following settings:

  • Photo mode
  • Flash, live, and filters off
  • Zoom out
  • Highest resolution
  • No image compression
  • All other settings (HDR, etc.) to auto

The image targets were mounted on pieces of cardboard that we cut to match the dimensions of the rear panel of the light booth. For each round of photos with a given image target, we removed the light booth’s lid, placed the piece of cardboard (with the image target on it) flush against the rear panel of the light booth, and used binder clips to fasten it to the upper edge of the rear panel. We then placed the lid back on the light booth and had the participant take three photos. After these three photos were taken, we removed the light booth lid and replaced the cardboard panel with another panel bearing a different image target.

Once participants had captured all nine images, we directed them to complete a Google form. On each face of the light booth, we posted a sign that included the URL for the form and a QR code that would take users directly to the form. The form included the following questions:

  1. How long have you had your smartphone?
  2. What brand of smartphone do you have?
  3. What model of smartphone do you have?
  4. What operating system version is your phone running?
  5. Image upload
  6. If you would like to be entered into a raffle for a $50 Amazon gift card, please enter your email

We encouraged participants to complete the form while they were with us at the light booth, since otherwise they might forget to submit it after the fact. We used the form responses to track our progress and participant count, and it does not appear that many participants, if any, took the photos but failed to complete the form. We spent most of our data collection time at events whose attendees are interested in image systems engineering and our project topic, as we believed they would be more willing to participate and more likely to complete the process once they started. Over two weeks, we had 51 different participants take photos, with a total of 472 images uploaded (some people submitted additional photos beyond the required nine).

After we had received all of these form responses, we exported the metadata (responses to the questions) into a spreadsheet and the image files into a folder on our local machines for further processing. For the metadata, we created a unique ID for each participant and used that to associate their images with the correct smartphone model for analysis. A summary of the metadata is included in the Results section below.

For each image target, we computed values to use as our evaluation metrics and compared these values across images captured by the same model of smartphone.

Image Targets

Each cardboard panel had the image target mounted on it, surrounded by a rectangular border made of bright yellow duct tape. We chose brightly colored tape so that it would be clearly visible to participants: if they could see all four sides of the rectangle formed by the yellow tape, they knew they were capturing the entire area of the image target. The three image targets were as follows:

  • Color checker + grey card: One X-Rite ColorChecker Classic and one 5.3 in x 7.28 in Anwenk 18% grey card affixed to the cardboard panel. Both were ordered online from Amazon.com.
  • Grid of dots: A grid of 14 x 22 dots printed on 11 in x 17 in white paper at 1200 dpi. Each dot has diameter 0.12 in and 0.39 in spacing between dots. The dots were located inside of a 10 in x 16 in box with 0.5 in margins on all sides, with a blue outline of width 1 pt. The image target was created in PowerPoint and is attached in Appendix II.
  • Grid of lines: A 10 in x 16 in grid of squares, each with side length 0.5 in. The lines are black and of width 1 pt. The grid has 0.5 in margins on all sides and is printed on 11 in x 17 in white paper at 1200 dpi. The image target was created in PowerPoint and is attached in Appendix II.


Image Processing

Image Standardization

Once all photos were collected and uploaded, we began processing the images. Even though we did our best to control the exact scene that each camera should see, varying resolutions, fields of view, and even slight changes in tilt based on how heavily the photographer pressed the camera against our light booth produced a surprising amount of variation between the pictures. Since it would have been a poor use of time to manually crop roughly 500 images, the first thing we had to do was standardize each of the samples using the following process:

  • Rotating the photo 2 degrees counterclockwise – Since Matlab is only capable of cropping photos into perfect rectangles, it was important to find where the corners of the target region were located. Built-in Matlab functions (discussed next) are capable of detecting where edges exist in an image, but these edges are stored as arrays of 2-dimensional points. Since Matlab would not know what shape these points were forming, we extracted the maximum and minimum X and Y coordinates in order to identify which points marked the corners of the rectangle. Since the images were imperfect, there were cases where the extreme values appeared in unexpected locations. Therefore, the image was first rotated to ensure that the top left corner was located at the minimum X coordinate. While rotation may have introduced some error, this experiment compares differences between similar phones, so the error would be present, but offset equally, in all photos.
  • Detecting the edges in the photo – Matlab has quite a few built-in functions for detecting edges in a photo. We had the most success converting the photo to grayscale (rgb2gray), binarizing the photo using the Canny method with a threshold varying between 0.1 and 0.5 (edge), and grouping the resulting edge pixels together based on similar locations (bwboundaries).
  • Determining the boundary of the critical region – The above step left us with a list of dozens of boundaries and the challenge of finding which one outlined our intended features. We found a consistent edge location that was generally free from noise on the lower half of the left side of the image. At this point, the contrast between the tape and the cardboard (or the black box and white background for the lines and dots) was large enough for us to start on the left side of this line and move right pixel by pixel until a boundary was found. We took this to be the boundary that boxed each of our display cards; ideally, it would form a complete box around our targets.
  • Rotation and Cropping – Using the corners that were found earlier, the image was rotated by an angle that was calculated to line up the extremes in vertical and horizontal rows. After determining the length of the edges of the critical region, the image was then cropped down to a standard field.
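The corner-finding and deskewing logic described above can be sketched as follows. This is a minimal Python sketch rather than the original Matlab code; "find_corners" and "deskew_angle" are hypothetical names, and the boundary is assumed to already be a list of (x, y) points as produced by an edge tracer:

```python
import math

def find_corners(boundary):
    """Given a list of (x, y) boundary points, return the four extreme
    points used as rectangle corners: min-x, max-x, min-y, max-y."""
    left   = min(boundary, key=lambda p: p[0])
    right  = max(boundary, key=lambda p: p[0])
    top    = min(boundary, key=lambda p: p[1])
    bottom = max(boundary, key=lambda p: p[1])
    return left, right, top, bottom

def deskew_angle(top_left, top_right):
    """Rotation angle (degrees) that would bring the top edge of the
    detected rectangle back to horizontal before cropping."""
    dx = top_right[0] - top_left[0]
    dy = top_right[1] - top_left[1]
    return math.degrees(math.atan2(dy, dx))
```

After deskewing by this angle, the extreme points line up into vertical and horizontal rows and the image can be cropped to a standard field.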

Grey Card

Since the exact RGB values of the 18% grey card are known [118 118 118], our goal was to evaluate how two things varied across different units of the same phone model:

  1. Accuracy: Was the average grey color similar to that of the known value?
  2. Consistency: Was there a large spread between the grey pixels across the field of view?

Since the images were now consistent, we were able to hardcode the locations (as percentages of the width and height, since different cameras had different resolutions) that corresponded to the dimensions of the grey card. Since each pixel had an RGB value, we were able to compute the ΔE value for each pixel relative to the known grey, using a white point that was measured in the light booth. However, when this proved to be too computationally demanding, we instead determined the average red, green, and blue values and then computed the ΔE with those values only. Due to the nonlinear properties of the CIELAB 1976 functions, we knew this value would not be the true average, but since our main objective was to compare relative differences in results (instead of absolute results), we determined that this would be a consistent offset for all samples. Therefore, we would still be able to compare ΔE values relative to each other. This average ΔE value was returned. In order to determine the spread of the variations, we paired together both the minimum red, green, and blue values as well as the maximum ones. Since the final step in computing ΔE is the Euclidean distance between the sample and known L*, a*, b* values, the value will always be positive. Therefore, the maximum ΔE produced by either the min or max RGB values was returned to show the magnitude of the spread of color.
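The ΔE computation can be sketched as follows. This Python sketch assumes the standard sRGB-to-XYZ conversion and the CIE76 ΔE formula; the project used a white point measured in the light booth, while the sketch falls back to the D65 white point as a stand-in:

```python
def srgb_to_linear(c):
    """Undo the sRGB gamma curve for one 8-bit channel value."""
    c = c / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def rgb_to_xyz(rgb):
    """Convert an (R, G, B) triple to CIE XYZ via the sRGB (D65) matrix."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    return (0.4124 * r + 0.3576 * g + 0.1805 * b,
            0.2126 * r + 0.7152 * g + 0.0722 * b,
            0.0193 * r + 0.1192 * g + 0.9505 * b)

def xyz_to_lab(xyz, white):
    """CIELAB 1976 conversion relative to a measured white point."""
    def f(t):
        return t ** (1 / 3) if t > (6 / 29) ** 3 else t / (3 * (6 / 29) ** 2) + 4 / 29
    fx, fy, fz = (f(v / w) for v, w in zip(xyz, white))
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

def delta_e76(rgb1, rgb2, white):
    """CIE76 ΔE: Euclidean distance between two colors in L*a*b* space."""
    L1, a1, b1 = xyz_to_lab(rgb_to_xyz(rgb1), white)
    L2, a2, b2 = xyz_to_lab(rgb_to_xyz(rgb2), white)
    return ((L1 - L2) ** 2 + (a1 - a2) ** 2 + (b1 - b2) ** 2) ** 0.5
```

Averaging R, G, and B first and then converting, as described above, gives a slightly different number than averaging per-pixel ΔE values, which is the consistent offset noted in the text.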

Color Checker

Just as with the grey card, the color checker was used to compare the ΔE values between the average RGB values of each color swatch and the known values of each color under consistent lighting conditions.

Grid of Dots

The intention of this target was to evaluate chromatic aberration using a set of uniformly spaced and sized dots. A large amount of aberration would show up as a difference in diameter between the dots located at the center and those at the edges of the target. After the images were standardized, the boundaries were once again computed. These boundaries captured the pixels occupied by each of the dots. If a dot was too close to the edge of the image (within 2% of the width/height of the image), there was a chance that the full dot was not seen, so these dots were excluded from the computation. Each dot’s boundary was made up of roughly 200 (x, y) coordinates. Using the “polyarea” function, the area was then computed for each of the dots, and an equivalent diameter was derived from each area. The difference between the largest and smallest diameters of these dots was returned.
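The dot measurement can be sketched as follows. This Python sketch assumes each boundary is a closed polygon of (x, y) points; poly_area reimplements the shoelace formula behind Matlab's polyarea, and "dot_diameter" and "spread_metric" are hypothetical names:

```python
import math

def poly_area(points):
    """Shoelace formula for the area of a closed polygon,
    equivalent to Matlab's polyarea."""
    n = len(points)
    s = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def dot_diameter(boundary):
    """Equivalent-circle diameter of a dot, derived from its boundary area."""
    return 2.0 * math.sqrt(poly_area(boundary) / math.pi)

def spread_metric(diameters):
    """Difference between the largest and smallest dot diameters in an image."""
    return max(diameters) - min(diameters)
```

A value near zero from spread_metric means the edge dots spread about as much as the center dots; a larger value is the proxy for aberration used in the Results section.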

Grid of Lines

The intention of this target was to evaluate the curvature of the grid lines at the edges of the image relative to the center by evaluating how the slope of each line changes. Much like the dots, once the image was cropped, the boundaries were found again, this time with each small box seen as an individual boundary. Once again applying the 2% buffer, the “fit” function was used to determine the best linear fit for each box. This resulted in an equation from which the slope was easily extracted. The slopes from both edges were computed, and the “worst case scenario” value was taken. Taking into consideration the resolution of the photo, the slope was returned as the maximum number of millimeters the line would rise or fall over the span of one box (1/2 inch).
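The slope-to-millimeters conversion can be sketched as follows. This Python sketch stands in for Matlab's "fit" with an ordinary least-squares line; "bend_per_box_mm" is a hypothetical name, and it assumes the slope is dimensionless (pixels of rise per pixel of run), so multiplying by the 12.7 mm box width gives the rise or fall over one grid square:

```python
def fit_slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    num = sum((x - mx) * (y - my) for x, y in points)
    den = sum((x - mx) ** 2 for x, y in points)
    return num / den

BOX_MM = 12.7  # one 0.5 in grid square expressed in millimeters

def bend_per_box_mm(points):
    """Worst-case rise/fall in mm over one grid square for a fitted line."""
    return abs(fit_slope(points)) * BOX_MM
```

A perfectly straight, level line yields zero; a tilted or bowed edge line yields the positive bend-per-box value plotted in the Results section.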

Results

Summary of Data Collected

In total, we had 51 participants take photos for us, with a total of 472 images uploaded. Of these images, 157 captured the color checker and grey card, 154 captured the grid of lines, and 161 captured the grid of dots. After image pre-processing (cropping, rotating, etc.), we had 76 color checker/grey card images, 135 grid of dots images, and 109 grid of lines images that we were able to analyze and compare.

There were seven different smartphone manufacturers in our sample, with the distribution of smartphone manufacturers in our sample shown below:

There were seven different smartphone models in our sample, with the distribution of smartphone models in our sample shown below:

In the form, we captured information about how long each participant has had his or her smartphone. Given that we are located in the heart of Silicon Valley, we anticipated that most of the smartphones in our sample would be relatively new, but we found that over half of the smartphones in our sample were over one year old. The length of ownership of smartphones in our sample is distributed as shown below:

We were concerned about having too small a sample size for some phone models (in some cases we had only one participant with a given smartphone model), so we decided to group together the images from smartphones made by the same manufacturer with the same or very similar camera specifications. The groupings were defined as follows, along with the number of samples and images for each group:

Image Targets

For each of our image targets, we chose to represent our results using a box-and-whiskers plot for the computed value of interest for each smartphone model group. As in a standard box-and-whiskers plot, the hash marks at the top and bottom represent the maximum and minimum values, respectively. The top edge of the shaded box represents the 3rd quartile value, the bottom edge represents the 1st quartile value, and the line in the middle of the box represents the median value. The “X” mark represents the mean value. Along the horizontal axis, the groups are ordered from largest number of images in the sample to smallest, starting from the left. A box-and-whiskers plot where the values are all close together would indicate consistency across different units of the same model of smartphone, while a box-and-whiskers plot where the values are farther apart would indicate less consistency. All charts shown below were created in this way.
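The statistics drawn in each box-and-whiskers plot can be reproduced as follows. This Python sketch uses statistics.quantiles with method="inclusive", a common spreadsheet quartile convention; the original charts were built in a spreadsheet tool and may use a slightly different quartile rule:

```python
import statistics

def box_stats(values):
    """Five-number summary plus mean, as drawn in each box-and-whiskers plot."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),               # bottom hash mark (whisker)
        "q1": q1,                         # bottom edge of the shaded box
        "median": med,                    # line inside the box
        "q3": q3,                         # top edge of the shaded box
        "max": max(values),               # top hash mark (whisker)
        "mean": statistics.mean(values),  # the "X" mark
    }
```

A tight spread between min and max for a group indicates unit-to-unit consistency for that smartphone model group; a wide spread indicates less consistency.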

Color Checker and Grey Card

For the color checker/grey card image target, we computed the CIELAB ΔE value of each color patch imaged by the smartphone camera relative to the true patches in the color checker. As the true values for each patch in the color checker and the grey card, and for the white point, we used values measured in the lab with a spectrophotometer under the same illuminant conditions (the daylight illuminant in the light booth). The ΔE value is on the vertical axis in the plot. A value of zero means that it would be difficult to distinguish the color produced by the smartphone camera from the true color in the color checker, while a larger positive value means that the difference would be easier to distinguish.

Prior to data collection, we expected each smartphone camera to have values that were non-zero, and that the relative magnitude of the ΔE values would be somewhat consistent across smartphones made by the same manufacturer (i.e. all iPhones should produce images with similar colors). As can be seen in the plots, the 6th-generation and most recent generation iPhones appear to be fairly consistent unit-to-unit, while the 7th-generation iPhones appear to show more unit-to-unit variation. For the dark skin patch, the iPhone models appear to have a similar magnitude relative to zero, but for other color patches, we do not see this trend.

Our results for the grey card are given below:

For brevity, we have included results for a selection of patches from the color checker in this report. Because the appearance of skin tones is very important for the subjective perception of image realism, we have included the results for the light and dark skin patches on the color checker. Because of their use in color ink for printers, we have included the results for the cyan, magenta, and yellow patches on the color checker. Full results for all color patches can be found in the spreadsheet in Appendix II.

Our results for the light and dark skin patches are given below:

Our results for the cyan, magenta, and yellow patches are given below:

Grid of Dots

For the grid of dots image target, we computed the proportional diameter difference between the dots at the center of the image and the dots at the edge of the image, which is on the vertical axis in this plot. We are using this value as a proxy for chromatic aberration and geometric distortion. A value of zero means that there is no additional spreading for the dots at the edges of the image relative to those at the center, while a positive value means that the dots at the edges of the image exhibit more spreading than the dots at the center of the image.

Prior to data collection, we expected each smartphone camera to have values that were non-zero, and that the relative magnitude of the values would be smaller for newer smartphone models (i.e. newer phones have better lenses). As can be seen in the plots, the phones on the left-hand side of the chart seem to show distributions of a similar magnitude and no one phone model seems to be closer in magnitude to zero than any of the others.


Our results for the grid of dots images are as follows:

Grid of Lines

For the grid of lines image target, we computed the relative bend per box (in mm) of the boxes at the edges of the image relative to the boxes at the center of the image, which is on the vertical axis in this plot. We are using this value as a proxy for geometric distortion. A value of zero means that there is minimal geometric distortion, while a positive value means that there is more distortion at the edges of the image relative to the center.

Prior to data collection, we expected each smartphone camera to have values that were non-zero, and that the relative magnitude of the values would be smaller for newer smartphone models (i.e. newer phones have better lenses). As can be seen in the plots, the phones on the left-hand side of the chart seem to show distributions of a similar magnitude and no one phone model seems to be closer in magnitude to zero than any of the others.

Our results for the grid of lines images are as follows:

Conclusions

References

Appendix I

Appendix II