TimmarajuHng - Predicting Human Performance Using ISETBIO

From Psych 221 Image Systems Engineering
Revision as of 07:51, 17 March 2014 by imported>Projects221

Predicting Human Performance Using ISETBIO - Aditya Timmaraju, Gerald Hng


Introduction

This project models a computational observer using ISETBIO and a machine learning classifier. ISETBIO allows us to model the optical pipeline: from creating the scene, to modeling the optics of the human eye, to calculating the photons absorbed by the cones. The machine learning classifier then plays the part of the “brains” of the computational observer, classifying the cone absorption data.

The metric of evaluation in this project was vernier acuity, and we used the computational observer to investigate four main questions:

  1. What is the minimum display resolution (ppi) for vernier acuity at a viewing distance of 1 meter?
  2. What is the minimum scene size for vernier acuity at a display resolution of 400 ppi?
  3. What is the difference between using a support vector machine and logistic regression as the machine learning algorithm?
  4. Finally, how do the results from this project compare with real-life human performance?


Background

From Literature

Vernier acuity (also known as hyperacuity) is defined as the minimum misalignment that can be detected in the co-linearity of two lines. Much past work has been done to determine vernier acuity in humans. Westheimer and McKee (1977) found that the human threshold for discriminating the relative position of two lines is a few seconds of arc [1]. Klein and Levi (1984) discovered that under ideal conditions, this threshold may even reach 1 second of arc [2]. William, Enoch and Essock (1984) found that at certain feature separations, the vernier acuity threshold is quite resistant to optical degradation [3].

In addition to tests on human subjects, there has been work on using ideal observer analysis to estimate the limits of hyperacuity and visual discrimination in humans. An ideal observer is a theoretical person or device that performs a given task optimally, given the available information. The performance of an ideal observer is thus near optimal, providing a useful yardstick against which to compare human performance and helping us better understand the human visual system: for example, if the ideal observer performs much better than humans, we can conclude that the human visual system is not fully using some of the information available at the retina. Geisler (1984) compared the performance of human and ideal observers and found humans to be 3-6 times less sensitive [4]. In another paper (1989), Geisler repeated the experiment of Westheimer (1976) with an ideal observer and found that its performance did not deteriorate at larger base separations the way it did for human subjects [5].

Vernier Acuity

Since the metric of evaluation in this project is vernier acuity, let us discuss it in more detail. Vernier acuity is defined as the minimum misalignment that can be detected in the co-linearity of two spots or lines. The underlying principle of this type of acuity can be explained using the left figure below. The photoreceptors on the retina are shown as rectangles at the base of the figure. Assume that the observer sees two light spots, and that the left spot gives rise to a bell-shaped intensity distribution represented by the solid line. The second spot, which is slightly to the right of the first (at a distance of a fraction of the cone separation), causes a correspondingly shifted intensity distribution, represented by the dashed line.

The vertical bars are the measured excitations of the respective cones, and it can be seen that the small displacement of the spots causes a significant difference in the light intensity received by each cone. The human photoreceptors can discriminate these intensity differences, and with additional neural processing it is possible for humans to detect spot position differences that are just a fraction of the cone separation. In fact, Klein and Levi [2] found that under ideal conditions, vernier acuity can be as fine as 1 arc second. This is much smaller than what the spacing between the retinal cones alone would suggest can be resolved, indicating what Westheimer termed hyperacuity, which surpasses traditional acuity (~1 arc min) by at least an order of magnitude. An alternative version of this test replaces the two spots with two misaligned lines, as depicted in the right figure.

Vernier acuity demonstrated with the two-spots test (left) and an alternative version with lines (right). The bars show the intensity on the photoreceptors caused by the stimuli.
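The cone-excitation picture above can be sketched numerically. The short Python sketch below (all values are arbitrary and chosen only for illustration) samples a Gaussian spot profile at evenly spaced cone positions, and shows that shifting the spot by only a fraction of the cone spacing still measurably changes the excitation pattern:

```python
import math

def cone_excitations(spot_center, cone_spacing=1.0, sigma=1.2, n_cones=7):
    """Sample a Gaussian spot intensity profile at evenly spaced cone centers."""
    centers = [(i - n_cones // 2) * cone_spacing for i in range(n_cones)]
    return [math.exp(-((c - spot_center) ** 2) / (2 * sigma ** 2)) for c in centers]

# Two spots shifted by a quarter of the cone spacing (a sub-cone offset).
left = cone_excitations(0.0)
right = cone_excitations(0.25)

# Even this sub-cone shift measurably changes the excitation pattern.
max_diff = max(abs(a - b) for a, b in zip(left, right))
```

With additional neural processing pooling these excitation differences, an observer can in principle detect offsets far smaller than one cone width, which is the essence of hyperacuity.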

Method

Workflow

The following diagram shows the workflow used in this project. We use the Image Systems Engineering Toolbox for Biology (ISETBIO) to model the optical pipeline. ISETBIO allows us to model the front end of the visual system: creating the scenes generated by a display, modeling the human eye with an accurate optics model, and calculating the optical irradiance image that impinges on the retina and the number of photons absorbed by the photoreceptors.

We created the scenes, optics and sensors using the ISETBIO toolbox. The output from the sensor is a set of cone absorption data which we then processed using machine learning classifiers to obtain the classification accuracy.

The details of each step in the workflow can be found in the subsequent sections.

Diagram summarizing the workflow used in this project


Scenes

According to the literature, vernier acuity can be tested with either the two-lines or the two-spots experiment. We chose the two-lines version in this project. Two scenes were used, both designed to allow us to test for vernier acuity. First, we created a scene with a straight, unbroken vertical line, as shown in the left figure (scene 1). Next, a scene was created with a line that was misaligned by one pixel width in the middle, depicted in the right figure (scene 2). The scenes were created on the “LCD-Apple” display in ISETBIO, with visual angles ranging from 6 to 36 minutes of arc depending on the test conducted.

Scenes used in the tests. Scene 1 consists of a straight unbroken line and scene 2 consists of a line misaligned by 1 pixel in the middle
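The two stimuli can be sketched as simple binary images. The sketch below (a hypothetical 32×32 pixel image, not the actual ISETBIO scene objects) builds scene 1 as a straight vertical line and scene 2 as the same line offset by one pixel in its lower half:

```python
def make_scenes(height=32, width=32, offset_px=1):
    """Scene 1: straight vertical line; scene 2: the same line shifted by
    offset_px in its lower half. 1 marks a line pixel, 0 the background."""
    col = width // 2
    scene1 = [[1 if x == col else 0 for x in range(width)] for _ in range(height)]
    scene2 = [[1 if x == (col + offset_px if y >= height // 2 else col) else 0
               for x in range(width)] for y in range(height)]
    return scene1, scene2

scene1, scene2 = make_scenes()
# Upper halves are identical; lower halves differ by the one-pixel offset.
```

In the actual pipeline these images are rendered through the display model, so each pixel carries the spectral emission of the “LCD-Apple” display rather than a binary value.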

Optics and Sensor

The optics used in the project was the standard human lens model in ISETBIO. As the optics are imperfect, they introduce effects such as chromatic aberration and blurring. The extent of the blur depends on the point spread function of the lens.

The sensor emulates a human retina, with the proportions of cones in the cone mosaic set to a ratio of 0.6 : 0.3 : 0.1 for the long-, medium- and short-wavelength receptors respectively. We modeled eye movements as a 2D Gaussian, and photon noise was also taken into account. The sensor calculates the cone absorption samples based on the various test parameters (scene, viewing distance, optics, etc.), and this set of samples was then used as training and verification data for the machine learning classifier.

The combination of blurring, eye movements and photon noise adds randomness to the captured image and makes classification more difficult. As the retinal images in the figure below show, the line is now harder to distinguish (e.g. the misalignment is no longer as obvious). A total of 3000 samples were gathered for each scene, of which a portion was used to train the classifier.

The retinal images of the respective scenes. The images are very different from the original scenes due to blurring and chromatic aberration
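The two noise sources can be sketched roughly as follows, assuming a 1D profile of mean cone absorptions (the actual ISETBIO sensor computation is far more detailed): a Gaussian eye-movement jitter shifts the profile, and each cone's count is then drawn from a Poisson distribution to model photon noise. The jitter magnitude and mean counts below are arbitrary illustrative values.

```python
import math
import random

random.seed(0)

def sample_poisson(lam):
    """Knuth's method for drawing a Poisson-distributed photon count."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

def noisy_capture(mean_absorptions, jitter_sigma=0.5):
    """One capture: shift the mean absorption profile by a Gaussian eye
    movement (rounded to whole cones for simplicity), then replace each
    mean count with a Poisson draw (photon noise)."""
    shift = round(random.gauss(0.0, jitter_sigma))
    n = len(mean_absorptions)
    shifted = [mean_absorptions[min(max(i - shift, 0), n - 1)] for i in range(n)]
    return [sample_poisson(m) for m in shifted]

means = [10, 10, 80, 10, 10]  # hypothetical mean absorptions along a row of cones
samples = [noisy_capture(means) for _ in range(100)]
```

Repeating the capture many times, as in the last line, mirrors how the 3000 cone absorption samples per scene were generated: each sample is a different random realization of the same underlying stimulus.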

Machine Learning Classifier

Support Vector Machine

To be added
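As a placeholder illustration only (not the authors' actual implementation, which may rely on an off-the-shelf SVM library), a linear SVM can be trained with stochastic sub-gradient descent on the hinge loss (the Pegasos algorithm). The toy 2D points below stand in for the much higher-dimensional cone-absorption vectors:

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos: stochastic sub-gradient descent on the hinge loss, y in {-1, +1}."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]           # regularization step
            if margin < 1:                                     # hinge-loss violation
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def svm_predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

# Toy separable 2D points standing in for the two classes of cone-absorption vectors.
X = [[1.0, 2.0], [2.0, 1.5], [-1.0, -2.0], [-2.0, -1.0]]
y = [1, 1, -1, -1]
w = train_linear_svm(X, y)
svm_accuracy = sum(svm_predict(w, x) == t for x, t in zip(X, y)) / len(X)
```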


Logistic Regression

To be added
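Likewise as an illustration only, logistic regression fits a weight vector by gradient descent on the logistic loss, producing a probability that a sample came from the misaligned scene. The toy data below stands in for the cone-absorption vectors:

```python
import math

def train_logistic(X, y, lr=0.5, epochs=200):
    """Plain gradient descent on the logistic (cross-entropy) loss, y in {0, 1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # predicted P(class 1)
            err = p - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_prob(w, b, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Toy data standing in for the two classes of cone-absorption vectors.
X = [[1.0, 2.0], [2.0, 1.5], [-1.0, -2.0], [-2.0, -1.0]]
y = [1, 1, 0, 0]
w, b = train_logistic(X, y)
lr_accuracy = sum((predict_prob(w, b, x) > 0.5) == bool(t) for x, t in zip(X, y)) / len(X)
```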


Tests Conducted

We conducted four main tests in this project and we describe them in detail here. For every test, a fraction of the cone absorption samples collected were used as training data for the classifier and the rest were used as verification data.

Test 1: Minimum ppi for vernier acuity at 1 meter

In this test, we collected the cone absorption data for both scenes at display resolutions from 100 ppi to 1000 ppi, at a fixed distance of 1 meter, with all other variables held constant. The data was then processed by the machine learning classifier to find the minimum ppi at which there is vernier acuity. We judge that there is vernier acuity when the classifier achieves a classification accuracy of more than 75%.
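Since the misalignment in scene 2 is one pixel wide, each tested ppi maps directly to an offset size in seconds of arc. A sketch of the conversion (the 350 ppi input is illustrative; one pixel at 350 ppi viewed from 1 m subtends roughly 14.97 arcsec, consistent with the acuity value reported in test 4):

```python
import math

def pixel_subtense_arcsec(ppi, viewing_distance_m):
    """Visual angle subtended by one display pixel, in seconds of arc."""
    pixel_pitch_m = 0.0254 / ppi   # 1 inch = 0.0254 m
    return math.degrees(math.atan(pixel_pitch_m / viewing_distance_m)) * 3600.0

print(round(pixel_subtense_arcsec(350, 1.0), 2))  # → 14.97
```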

Test 2: Minimum scene size for vernier acuity at 400 ppi

Next, we collected cone absorption data for scene sizes varying from 6 to 36 minutes of arc in increments of 3 minutes of arc. This test allows us to examine the trend of classification accuracy with increasing scene size. The display resolution was fixed at 400 ppi. The data was then processed by the machine learning classifier to find the minimum scene size with vernier acuity. Again, 75% classification accuracy is used as the threshold for vernier acuity.
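The tested scene sizes can be related to pixel counts on the display. A sketch of the conversion, assuming the same 1 m viewing distance as in test 1:

```python
import math

def scene_width_pixels(arcmin, ppi, viewing_distance_m=1.0):
    """Number of display pixels spanned by a given visual angle."""
    width_m = math.tan(math.radians(arcmin / 60.0)) * viewing_distance_m
    return width_m / (0.0254 / ppi)

# Pixels spanned by each tested scene size at 400 ppi and 1 m:
# the 6 arcmin scene covers about 27 pixels, the 36 arcmin scene about 165.
widths = {a: round(scene_width_pixels(a, 400)) for a in range(6, 37, 3)}
```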

Test 3: Comparison between different machine learning classifiers

We used 3 different machine learning classifiers in this test to compare the results obtained from each of them. ADD MORE…

Test 4: Comparison with human performance

Finally, we compared the results from test 1 with the performance of real-life human observers reported in the literature. This gives us an idea of how accurately our computational observer emulates a human.


Results and Analysis

Test 1 results: Minimum ppi for vernier acuity at 1 meter

File:Results SVM ppi2.png
Classification accuracy for different ppi at viewing distance of 1m

Descriptions to be added!!


Test 2 results: Minimum scene size for vernier acuity at 400 ppi

File:Results SVM scene size GA.png
Classification accuracy for different scene sizes at 400 ppi and viewing distance of 1m

Descriptions to be added!!


Test 3 results: Comparison between different machine learning classifiers

Test 4 results: Comparison with human performance

According to results from Westheimer [1, 6], vernier acuity in humans is around 6 seconds of arc. Klein and Levi [2] claimed that vernier acuity could go as low as 1 second of arc under ideal conditions. From the results obtained in test 1, we found the vernier acuity of our computational observer to be 14.97 seconds of arc.

We believe this may be due to the size of the scene used (6 minutes of arc), which may be smaller than those used in the tests on human subjects. It is conceivable that a larger scene size would produce results closer to those recorded in the human tests, since more information would be available in the scenes. Furthermore, the lines used in the literature were thicker than the ones used in this project, which may also contribute to the performance difference.


Vernier acuity of human vs. computational observer

                   Human observer      Computational observer
  Vernier acuity   6 seconds of arc    14.97 seconds of arc


Conclusions

TODO: Add conclusions!!

Possible future extensions of this work include:

  1. Trying other machine learning algorithms, e.g. neural networks
  2. Narrowing the performance gap between the computational observer and the human observer by increasing line thickness or using larger scene sizes
  3. Varying eye properties to mimic humans with different conditions
  • Colour blindness
  • Astigmatism and myopia. The literature suggests optical degradation affects results [3].


References

[1] G. Westheimer and S. McKee. Spatial Configuration for Visual Hyperacuity. Vision Res. Vol. 17 (1977), pp. 941–947.

[2] S. Klein and D. Levi. Hyperacuity thresholds of 1 sec: theoretical predictions and empirical validations. J. Opt. Soc. Am. A Vol. 2, No. 7 (1985).

[3] R. William, J. Enoch and E. Essock. The Resistance of Selected Hyperacuity Configurations to Retinal Image Degradation. Annual Meeting of the Association for Research in Vision and Ophthalmology (1982).

[4] W. Geisler. Physical limits of acuity and hyperacuity. J. Opt. Soc. Am. A Vol. 1, No. 7 (1984).

[5] W. Geisler. Sequential Ideal-Observer Analysis of Visual Discriminations. Psychological Review Vol. 96, No. 2 (1989).

[6] G. Westheimer and G. Hauske. Temporal and Spatial Interference with Vernier Acuity. Vision Res. Vol. 15 (1975), pp. 1137–1141.


Appendix A - Source codes and results

Source code for scene and cone absorption data generation can be downloaded here.

Source code for machine learning classifiers can be downloaded here.

Cone absorption data can be downloaded here.

Classification results can be downloaded here.


Appendix B - Breakdown of Work

Gerald - Generation of cone absorption data, literature survey

Aditya - Machine learning classifiers

Gerald & Aditya - Result analysis, conclusions, wiki page and slides