Revision as of 06:54, 22 March 2012
Implementation and analysis of a perceptual metric for photo retouching
Introduction
Retouched images are everywhere today. Magazine covers feature impossibly fit and blemish-free models, and advertisements frequently show people too thin to be real. While some of these alterations could be considered comical, a growing number of studies show that these pictures lead to low self-image and other mental health problems for many of those who view them. To help address this problem, lawmakers in several countries, including France and the UK, have proposed legislation that would require publishers to label any severely retouched images, and within the last few days, Israel has passed the first law requiring labels on retouched images (in this case, images retouched to make the model appear thinner).
Legislation requiring the labeling of modified images raises a number of issues. First, how do we define "severely retouched"? Nearly all published images are modified in some way, whether through basic cropping or color adjustments or through more significant alterations. Which, if any, of these changes are acceptable? Second, a huge number of photographs are published every day. How can they all be analyzed for retouching in a timely, cost-effective manner?
In their 2011 paper “A perceptual metric for photo retouching,” Kee and Farid proposed a perceptual photo rating scheme to solve these problems. With their method, an algorithm would analyze the original and retouched versions of an image to determine the extent of the geometric (e.g., stretching, warping) and photometric (e.g., blurring, sharpening) changes made to the original. The results of this analysis would be compared to a database of human-rated altered images to automatically assign a perceptual modification score between 1 (“very similar”) and 5 (“very different”). This scheme, intended to deliver an objective measure of perceptual modification with minimal human involvement, would allow authorities or publishers to define a threshold for a “severely retouched” image and label them accordingly.
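The pipeline described above can be sketched end to end. The authors' implementation is in MATLAB; the sketch below is an illustrative Python stand-in, and the `summary_statistics` function is a hypothetical placeholder rather than the paper's actual eight geometric/photometric statistics.

```python
# Illustrative sketch of the Kee & Farid-style pipeline: summary statistics
# computed from a before/after pair are mapped to a 1-5 perceptual rating by
# an SVR trained on human-rated image pairs. All data here are synthetic.
import numpy as np
from sklearn.svm import SVR

def summary_statistics(before, after):
    """Placeholder for the paper's geometric/photometric statistics.
    Here we use trivial stand-ins based on the pixel-wise difference."""
    diff = np.abs(after - before)
    return np.array([diff.mean(), diff.std(), diff.max(),
                     np.percentile(diff, 90)])

# A stand-in "database" of human-rated before/after pairs.
rng = np.random.default_rng(0)
pairs = [(rng.random((16, 16)), rng.random((16, 16))) for _ in range(50)]
X = np.array([summary_statistics(b, a) for b, a in pairs])
y = rng.uniform(1, 5, size=50)          # stand-in observer ratings

model = SVR(kernel="rbf").fit(X, y)
score = model.predict(X[:1])[0]         # perceptual rating for one pair
```

In the real system, a new before/after pair would be reduced to the same statistics and scored by the trained model, and a publisher-defined threshold on the score would trigger a "severely retouched" label.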
This project is an effort to reproduce the results of the Kee and Farid paper. Accordingly, the algorithm and methods described in the paper have been implemented and tested on a set of images. The rest of this report describes the implementation process, discusses the results of applying the algorithm to a set of retouched images, and suggests potential changes to improve the algorithm's effectiveness and practicality.
Methods
Results
Caption: We compared the mean of our ratings for each before/after image pair to the ratings obtained for the same images in Kee & Farid's (2011) study.
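A comparison like the one in this caption amounts to correlating two sets of per-image mean ratings. The sketch below uses illustrative stand-in numbers, not the actual ratings from either study.

```python
# Correlate our group's mean rating per before/after pair with the mean
# Mechanical Turk rating reported by Kee & Farid (2011) for the same pairs.
# The rating values below are made-up stand-ins for illustration.
import numpy as np

ours = [2.0, 3.5, 1.5, 4.5, 3.0]        # our mean rating per image pair
published = [2.2, 3.1, 1.8, 4.6, 2.7]   # published mean for the same pairs

r = float(np.corrcoef(ours, published)[0, 1])   # Pearson correlation
```

A Pearson r near 1 would indicate that our small group's ratings track the large Mechanical Turk sample despite the much smaller number of raters.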
Caption: A nonlinear SVR was used to predict user ratings from the summary statistics. The SVR model was trained and tested on the same image set, with parameters determined using 5-fold cross-validation.
Caption: A nonlinear SVR was used to predict user ratings from the summary statistics. The SVR model was trained and tested on separate but equally sized image subsets, with parameters determined using 5-fold cross-validation on the training subset.
Caption: A nonlinear SVR was used to predict user ratings from the four photometric statistics. The SVR model was trained and tested on separate but equally sized image subsets, with parameters determined using 5-fold cross-validation on the training subset. Several out-of-range values were discarded.
Caption: A nonlinear SVR was used to predict user ratings from the four geometric statistics. The SVR model was trained and tested on separate but equally sized image subsets, with parameters determined using 5-fold cross-validation on the training subset. Several out-of-range values were discarded.
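The train/test protocol described in these captions can be sketched as follows. Our implementation used MATLAB with libSVM; this is a hedged scikit-learn equivalent on synthetic data, and the parameter grid is illustrative rather than the one we actually searched.

```python
# Sketch of the SVR fitting protocol from the captions: split the images into
# equally sized train/test subsets, pick SVR parameters by 5-fold
# cross-validation on the training subset only, then evaluate on the test
# subset. Features and ratings are synthetic stand-ins.
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.random((200, 8))                 # 8 summary statistics per image pair
y = 1 + 4 * X[:, :4].mean(axis=1)        # synthetic ratings in roughly [1, 5]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=1)  # equally sized subsets

grid = GridSearchCV(SVR(kernel="rbf"),
                    {"C": [0.1, 1, 10], "gamma": ["scale", 0.1, 1.0]},
                    cv=5)                 # 5-fold CV on the training subset
grid.fit(X_train, y_train)
test_r2 = grid.score(X_test, y_test)      # held-out performance
```

Evaluating on a subset that was never seen during parameter selection avoids the optimistic bias of the train-equals-test condition in the first caption.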
Conclusions
References - Resources and related work
References
Software
Appendix I - Code and Data
Data
User ratings recorded by group members for subsets of the images
File:Farid ratings.zip: User ratings data graciously provided by Prof. Farid. This set differs from the one provided with the publication in that its numbering matches the authors' photo sets.
Amazon Mechanical Turk observer ratings on retouching severity
Photoset: http://www.cs.dartmouth.edu/farid/downloads/publications/pnas11/beforeafter.tar
Corresponding Masks: http://www.cs.dartmouth.edu/farid/downloads/publications/pnas11/masks.tar
Fifty observer ratings for each of the 468 before/after images used in Kee & Farid’s 2011 paper were acquired from the supplementary resources for the research paper. The authors gathered the user data using the process and observers described below:
Task: Each observer session lasted approximately 30 minutes and was structured as follows:
1. Each participant was initially shown a representative set of 20 before/after images to help them gauge the range of distortions they could expect to see.
2. Each participant was then asked to rate 70 pairs of before/after images on a scale of 1 ("very similar") to 5 ("very different"). The presentation of images was self-timed; participants could manually toggle between the before and after images as many times as they wished. Each observer rated a random set of 5 images 3 times each to measure the consistency of their responses.
Observers: These ratings were provided by 390 observers who were recruited through Amazon's Mechanical Turk. Each observer was paid $3 for participating in the session. 9.5% of observers were excluded because they responded with high variance on repeated trials and frequently toggled only once between the before and after images, suggesting a lack of consistency or care in their ratings.
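The exclusion criteria above can be expressed as a simple filter. The thresholds and data layout below are illustrative assumptions, not values from the paper.

```python
# Sketch of the observer-exclusion criteria: flag an observer who both rated
# repeated trials with high variance AND rarely toggled between the before
# and after images. Thresholds and field names are hypothetical.
import statistics

def exclude_observer(repeat_ratings, toggle_counts,
                     var_threshold=1.0, min_toggles=2):
    """repeat_ratings: {image_id: [the three ratings of that repeated pair]}
    toggle_counts: before/after toggle count for each pair the observer rated."""
    high_variance = any(statistics.pvariance(r) > var_threshold
                        for r in repeat_ratings.values())
    rarely_toggled = statistics.mean(toggle_counts) < min_toggles
    return high_variance and rarely_toggled

# An inconsistent observer who barely toggled is excluded...
careless = exclude_observer({"img1": [1, 4, 5], "img2": [2, 2, 3]},
                            toggle_counts=[1, 1, 1, 2])
# ...while a consistent, engaged observer is kept.
careful = exclude_observer({"img1": [3, 3, 4], "img2": [2, 2, 2]},
                           toggle_counts=[3, 4, 2, 5])
```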
Code
This .zip contains several files, including:
photo_batch.pl: Perl script used to batch out statistics-gathering jobs to multiple machines in the Farmshare cluster. Modify/run this script to gather stats.
image_farm.m: master MATLAB script, modified by photo_batch.pl, that gathers statistics on the cluster.
photometric.m, stats.m, vfield.m: function files called by image_farm.m or run_in_serial...m.
jsub: executable needed to submit jobs to the Farmshare cluster.
run_in_serial_331_340.m: serial adaptation of image_farm.m.
prepAndRunSVM_revised....m: different variants of the code used to run SVR on the gathered statistics. See the files for differences.
Other code needed (ssim_index.m, CVX, image registration code, libSVM) is cited in the report.
Appendix II - Work partition
Much of the work for this project was performed cooperatively, with all three group members meeting frequently to discuss and explore the algorithm and its implementation. However, each member focused on different aspects of the project. Andrew Danowitz led much of the early code exploration and implemented most of the photometric (filter and SSIM) components. He also integrated the different implementation components and set up the ability to submit much of the computational work to the Farmshare cluster. Andrew contributed substantially to the report and presentation slides as well.
In addition to taking part in the collaborative aspects of the project, Andrea Zvinakis wrote much of the report and presentation slides. She also performed statistical analysis and, when the Farmshare cluster was unable to support our workload, ran the summary-statistics code on hundreds of images on other computers. Andrea also set up the system the group used to rate images from the photo set.
Bradley Collins likewise took part in the collaborative aspects of the project and made smaller contributions to the report and presentation slides. He explored and implemented several parts of the geometric component of the algorithm, helped adapt the summary-statistics code to run jobs serially on campus computers, and used that code to gather many of the image statistics. In addition, Bradley was responsible for most of the SVR implementation and analysis runs.




