TimmarajuHng - Predicting Human Performance Using ISETBIO

From Psych 221 Image Systems Engineering
Revision as of 22:07, 16 March 2014 by imported>Projects221

Predicting Human Performance Using ISETBIO - Aditya Timmaraju, Gerald Hng


Introduction

This project models a computational observer using ISETBIO and a machine learning classifier. ISETBIO allows us to model the optical pipeline – from scene creation, to modeling the optics of the human eye, to calculating the photons absorbed by the cones. The machine learning algorithm then plays the part of the "brains" of the computational observer, classifying the cone absorption data.

The metric of evaluation in this project was vernier acuity, and we investigated four main questions using the computational observer:

  1. What is the minimum display resolution (ppi) at a viewing distance of 1 meter for vernier acuity?
  2. What is the minimum scene size at a display resolution of 400 ppi for vernier acuity?
  3. What is the difference between using a support vector machine and logistic regression as the machine learning algorithm?
  4. Finally, how do the results from this project compare with real-life human performance?
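Questions 1 and 2 hinge on converting a display's pixel pitch and viewing distance into a visual angle. The short sketch below is our own illustration of that geometry (the function name is ours, not an ISETBIO call):

```python
import math

def pixel_angle_arcmin(ppi: float, distance_m: float) -> float:
    """Visual angle subtended by one pixel, in arcminutes."""
    pixel_m = 0.0254 / ppi                      # pixel pitch in meters (1 inch = 0.0254 m)
    angle_rad = 2 * math.atan(pixel_m / (2 * distance_m))
    return math.degrees(angle_rad) * 60         # radians -> degrees -> arcminutes

# e.g. a 400 ppi display viewed at 1 meter
print(round(pixel_angle_arcmin(400, 1.0), 3))  # → 0.218 arcmin (about 13 arcsec per pixel)
```

A one-pixel offset on such a display therefore subtends far more than the ~1 arcsec hyperacuity limit, which is why the scene size and resolution sweeps below are needed.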


Background

From Literature

Vernier acuity (also known as hyperacuity) is defined as the minimum misalignment that can be detected in the co-linearity of two lines. Much past work has been done to determine vernier acuity in humans. Westheimer and McKee (1977) found that the human threshold for discriminating the relative position of two lines is a few seconds of arc [1]. Klein and Levi (1984) discovered that under ideal conditions, this threshold may even reach 1 arc second [2]. Williams, Enoch and Essock (1984) found that at certain feature separations, the vernier acuity threshold is quite resistant to optical degradation [3].

In addition to tests on human subjects, some work has been done using ideal observer analysis to estimate the limits of hyperacuity and visual discrimination in humans. An ideal observer is a theoretical person or device that performs a given task optimally, given the available information. The performance of an ideal observer thus provides a useful yardstick against which to compare human performance and better understand the human visual system: for example, if the ideal observer performs much better than the human, we can conclude that the human visual system is not fully using some of the information available at the retina. Geisler (1984) compared the performance of human and ideal observers and found humans to be 3-6 times less sensitive [4]. In another paper, Geisler (1989) repeated the experiment of Westheimer (1976) with an ideal observer and found that its performance did not deteriorate at larger base separations as it did for the human subjects [5].
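To make the ideal observer idea concrete, the toy sketch below (our own illustration, not taken from the papers above) implements the optimal decision rule for Poisson-distributed photon counts as a log-likelihood ratio test between two candidate stimuli; the mean absorption patterns are made-up numbers, not actual ISETBIO output:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical mean cone-absorption patterns for two stimuli
# (illustrative numbers only).
mean_a = np.array([100.0, 120.0, 100.0])
mean_b = np.array([100.0, 100.0, 120.0])

def ideal_decision(counts, mean_a, mean_b):
    """Poisson log-likelihood ratio: choose 'a' if the counts fit mean_a better."""
    llr = np.sum(counts * (np.log(mean_a) - np.log(mean_b)) - (mean_a - mean_b))
    return "a" if llr > 0 else "b"

# Simulate trials where stimulus 'a' was actually shown, and measure accuracy
trials = [ideal_decision(rng.poisson(mean_a), mean_a, mean_b) for _ in range(2000)]
accuracy = trials.count("a") / len(trials)
print(accuracy)  # well above chance, but below 1.0 because of photon noise
```

This kind of rule is "ideal" in the sense that no decision strategy can achieve higher accuracy given only the noisy counts; any shortfall of a real observer relative to it reflects information lost downstream.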

Vernier Acuity

The metric of evaluation in this project is vernier acuity, so let us take a moment to explain it further. Vernier acuity is defined as the minimum misalignment that can be detected in the co-linearity of two lines. The underlying principle of this type of acuity can be explained using figure X. The photoreceptors on the retina are shown as rectangles at the base of the figure. Assume that the observer sees two light spots and that the left spot gives rise to a bell-shaped intensity distribution, represented by the solid line. The second spot, slightly to the right of the first (at a distance of a fraction of the cone separation), causes a correspondingly shifted intensity distribution, represented by the dashed line.

The vertical bars are the measured excitations of the respective cones, and it can be seen that the small displacement of the spots causes a significant difference in the perceived light intensity. The human photoreceptors can discriminate these intensity differences, and with additional neural processing it is possible for humans to detect spot position differences that are just a fraction of the cone separation. In fact, Klein and Levi [2] found that under ideal conditions, vernier acuity can reach the 1 arc second range. This is much finer than what can be resolved from the spacing between the retinal cones, indicating what Westheimer termed hyperacuity, which surpasses traditional acuity (~1 arc min) by at least an order of magnitude. An alternative version of this test replaces the two spots with two misaligned lines, as depicted in figure Z.
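The sampling argument above can be made concrete with a small numeric sketch (our own illustration; the cone spacing and spot width are arbitrary units, not physiological values):

```python
import numpy as np

cone_pitch = 1.0                           # cone spacing (arbitrary units)
cones = np.arange(-3, 4) * cone_pitch      # cone center positions

def excitations(spot_center, width=1.2):
    """Bell-shaped spot intensity sampled at each cone position."""
    return np.exp(-0.5 * ((cones - spot_center) / width) ** 2)

# Shift the spot by a quarter of the cone separation
shift = 0.25 * cone_pitch
delta = excitations(shift) - excitations(0.0)
print(round(np.max(np.abs(delta)), 3))     # → 0.125
```

Even though the shift is far smaller than the cone spacing, the pattern of cone excitations changes by roughly 12% of the peak intensity, which is the signal that downstream neural processing can exploit.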

Vernier acuity demonstrated with the two-spot test (left) and an alternative version with lines (right). The bars show the intensity on the photoreceptors caused by the stimuli.

Method

Workflow

The following diagram shows the workflow used in this project. We created the scenes, optics and sensors using the ISETBIO toolbox. The output from the sensor is a set of cone absorption data which we then processed using machine learning classifiers to obtain the classification accuracy.

The details of each step in the workflow can be found in the subsequent sections.

Diagram summarizing the workflow used in this project


Scenes

According to the literature, vernier acuity can be tested with either the two-line or the two-spot experiment. We chose the two-line version for this project. Two scenes were used, both designed to test for vernier acuity. First, we created a scene with a straight, unbroken vertical line, as shown in figure X. Next, a scene was created with a line misaligned by one pixel width in the middle, depicted in figure Y. The scenes were created on the "LCD-Apple" display in ISETBIO with visual angles ranging from 6 to 36 minutes of arc, depending on the test conducted.
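A minimal sketch of how such a pair of stimuli can be generated as pixel arrays (our own illustration; ISETBIO builds its scenes from calibrated display models rather than raw arrays like this):

```python
import numpy as np

def vernier_stimulus(size=64, offset_px=0):
    """Binary image of a vertical line; the lower half is shifted by offset_px."""
    img = np.zeros((size, size))
    col = size // 2
    img[: size // 2, col] = 1.0               # upper segment
    img[size // 2 :, col + offset_px] = 1.0   # lower segment, possibly offset
    return img

aligned = vernier_stimulus(offset_px=0)       # scene 1: straight unbroken line
offset = vernier_stimulus(offset_px=1)        # scene 2: 1-pixel misalignment
print(int((aligned != offset).sum()))         # → 64 (the 32 lower rows differ in 2 columns)
```

The classifier's task is then to tell these two stimuli apart after optics, eye movements and photon noise have been applied.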

Scenes used in the tests. Scene 1 consists of a straight unbroken line and scene 2 consists of a line misaligned by 1 pixel in the middle

Optics and Sensor

The optics used in the project were the standard human optics in ISETBIO. As the optics are imperfect, they introduce effects such as chromatic aberration and blurring. The extent of blur depends on the point spread function of the lens.
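The effect of the point spread function can be sketched as a convolution of the stimulus with a blur kernel. The snippet below (our own illustration) uses a single Gaussian PSF as a simple stand-in; the actual human PSF in ISETBIO is wavelength-dependent, which is what produces the chromatic aberration:

```python
import numpy as np

def gaussian_psf(size=9, sigma=1.5):
    """Normalized 2D Gaussian blur kernel, a stand-in for the human PSF."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return psf / psf.sum()

def blur(image, psf):
    """Circular convolution of the image with the PSF (via FFT)."""
    pad = np.zeros_like(image)
    k = psf.shape[0]
    pad[:k, :k] = psf
    pad = np.roll(pad, (-(k // 2), -(k // 2)), axis=(0, 1))  # center kernel at origin
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(pad)))

line = np.zeros((32, 32))
line[:, 16] = 1.0                      # a one-pixel-wide vertical line
blurred = blur(line, gaussian_psf())
print(round(blurred.max(), 2))         # the peak drops as light is spread over neighbors
```

Because the kernel is normalized, blurring conserves total intensity while spreading the line over several pixels, which is exactly what makes the misalignment harder to detect.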

The sensor emulates a human retina, and the proportions of cones in the cone mosaic were set to a ratio of 0.6 : 0.3 : 0.1 for the long-, medium- and short-wavelength receptors respectively. We modeled eye movements as a 2D Gaussian, and photon noise was introduced in the sensor. The sensor calculates cone absorption samples based on the various test parameters (scene, viewing distance, optics, etc.), and this set of samples was then used as training and validation data for the machine learning classifier.
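The comparison in question 3 between a support vector machine and logistic regression can be sketched with scikit-learn. The data below is a synthetic stand-in for the cone absorption samples (the cone count, mean levels and offset are made up, not real ISETBIO output):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in for cone absorption samples: Poisson counts around two mean
# patterns that differ slightly at a few "cones" (purely illustrative).
n_per_class, n_cones = 200, 50
mean_aligned = np.full(n_cones, 100.0)
mean_offset = mean_aligned.copy()
mean_offset[20:25] += 8.0                    # small localized difference

X = np.vstack([rng.poisson(mean_aligned, (n_per_class, n_cones)),
               rng.poisson(mean_offset, (n_per_class, n_cones))]).astype(float)
y = np.array([0] * n_per_class + [1] * n_per_class)

# Cross-validated accuracy of each classifier on the same samples
for clf in (SVC(kernel="linear"), LogisticRegression(max_iter=2000)):
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(type(clf).__name__, round(acc, 2))
```

With a linear kernel, both classifiers learn a linear decision boundary over the cone responses, so large accuracy differences between them would point to training or regularization effects rather than a fundamentally different use of the data.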

The combination of blurring, eye movements and photon noise adds randomness to the captured image and makes classification more difficult. The retinal image can be seen in figure X; the line is now noticeably harder to distinguish than in the original scene.

The retinal images of the respective scenes. The images are very different from the scenes due to blurring and chromatic aberration