Fundamental Psychophysical Experiments: Color discrimination contours, colorblind perceptions, and contrast sensitivity

From Psych 221 Image Systems Engineering

Introduction

Rachel Liao, Yi-Chen Tsai, Kan-Yun Tu, Hsin-Fang Wu.

This project strives to accurately model the human eye with regard to color discrimination contours, colorblind perceptions, and contrast sensitivity. Previous work has been done in this field; our class TA and mentor for this project, Haomiao Jiang, created a model that uses MATLAB and the ISETBIO toolbox to produce various color contours based on cone ratio and cone opponency. His model uses a random wiring of L and M cones to produce cone opponency. To determine a particular cone's response, if an L cone is excited, the responses from the weighted M cones wired to it are subtracted from the L cone's response. If the excitation is from an M cone, the responses from the weighted L cones wired to it are subtracted from the M cone's response.
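The random-wiring opponency described above can be sketched in a few lines. The following is a minimal illustration, written in Python rather than the project's MATLAB, with hypothetical responses and weights; it is not the actual ISETBIO implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def opponent_response(center_response, neighbor_responses, weights):
    """Opponent signal for one cone: its own response minus the
    weighted responses of the opposite-type cones wired to it."""
    return center_response - np.dot(weights, neighbor_responses)

# Hypothetical example: an L cone randomly wired to three M cones.
l_response = 1.0
m_responses = np.array([0.2, 0.5, 0.1])
m_weights = rng.dirichlet(np.ones(3))   # random positive weights summing to 1
signal = opponent_response(l_response, m_responses, m_weights)
```

Because the weights are drawn at random, the same mosaic produces a different wiring on every run, which is exactly the lack of physical grounding the next paragraph criticizes.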

This cone opponency-based algorithm produces reasonable results; however, there is no physical meaning behind it, since the cones are randomly wired together. The purpose of our project was to produce a similar algorithm that more accurately models the biology of the human retina, making it more realistic while still producing reasonable results.

Background

As we have learned in class, our eyes contain two main types of photoreceptors that convert light into signals our brains can interpret - rods are primarily used for peripheral vision and function best in low light, while cones are primarily used for color vision and function best in bright light. Since we are mainly interested in color vision and contrast sensitivity, we will focus primarily on cone cells and the role they play in color vision. However, given our project objectives, we need to understand the biological structure of these cells more deeply so that we can accurately model their responses to certain stimuli.

In order for visual information to pass to the brain, the photoreceptors, which are located near the outermost layer of the retina, need to send signals to the retinal ganglion cells (RGCs), neurons located near the inner surface of the retina. The human eye contains about 1.2 to 1.5 million of these RGCs, and about 80% of them are midget cells. Midget cells receive inputs from relatively few rods and cones and respond strongly to changes in color but weakly to changes in contrast. About 10% of retinal ganglion cells are parasol cells, which are the opposite of midget cells: they receive inputs from relatively many rods and cones and respond weakly to changes in color but strongly to changes in contrast.

Both midget cells and parasol cells have receptive fields which can be represented with two regions, the center and the surround. There are two ways to describe the center surround of retinal ganglion cells. First, there is the on-center with off-surround. This means the center cone is excited and the surrounding cones inhibit that cell's response. The second is the opposite with an off-center and on-surround. In this receptive field, the surrounding cones are excited while the center cone inhibits the surrounding region's response.

Types of receptive fields: on-center/off-surround (left) and off-center/on-surround (right).

Methods

The main software tools we used were MATLAB and the ISETBIO toolbox, which is open source on GitHub. ISETBIO stands for Image Systems Engineering Toolbox for Biology; it can be used to model a general imaging system from the scene through the optics to the sensor, whether human or not. The output from this model can then be analyzed for accuracy using machine learning classifiers and the results interpreted. Using these tools, we implemented an algorithm that reasonably models the human retina based on the physical responses of the cones.

Display

Our scene was set up with no background contrast and a cone mosaic of 8568 cones (size 84 x 102); the cone ratios in the mosaic were changed as described in the Sensor section below. A cone patch with 45 cones is cropped from the center of the cone mosaic for machine learning classification. The color directions in the L-M contrast sensitivity plane were set to a large range of angles (0°, 40°, 45°, 50°, 90°, 135°, -180°, -140°, -135°, -130°, -90°), and we took a total of 3000 samples. One caveat: once we included the cone type in our center-surround implementation, we removed angles 40°, 50°, -140°, and -130° to speed up the otherwise 2-3 hour simulation.

Optics

The optics we used were those of the standard human eye implemented in ISETBIO.

Sensor

To simplify the modeling of the human retina, we modeled the cone mosaic as a 2-D matrix consisting of L, M, and S cones. L cones are most sensitive to long wavelengths, M cones are most sensitive to medium wavelengths, and S cones are most sensitive to shorter wavelengths - their normalized sensitivities are shown below.

Normalized sensitivities of L, M, and S cones in reference to wavelength.

Since we were interested in colorblind perceptions, many of our simulations required us to model both trichromats and dichromats, so we often needed to adjust the L, M, and S cone ratios accordingly. To model trichromats, the standard (L, M, S) ratio was (0.6, 0.3, 0.1). To model dichromats with protanopia (the lack of L cones), the (L, M, S) ratio was (0, 0.9, 0.1), while for dichromats with deuteranopia (the lack of M cones), the (L, M, S) ratio was (0.9, 0, 0.1).
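A mosaic with these ratios can be sketched by drawing each cone's type independently from the (L, M, S) probabilities. This Python sketch is only an illustration; the project itself built its mosaics with ISETBIO's MATLAB routines:

```python
import numpy as np

def make_cone_mosaic(rows, cols, ratios, seed=0):
    """Random 2-D cone mosaic; entries are 'L', 'M', or 'S' drawn
    i.i.d. with the given (L, M, S) probabilities."""
    rng = np.random.default_rng(seed)
    return rng.choice(np.array(["L", "M", "S"]), size=(rows, cols), p=ratios)

# Trichromat mosaic and a protanope (no L cones) mosaic at the
# 84 x 102 size used in our simulations.
trichromat = make_cone_mosaic(84, 102, [0.6, 0.3, 0.1])
protanope  = make_cone_mosaic(84, 102, [0.0, 0.9, 0.1])
```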

Theory

Since about 90% of our retinal ganglion cells have center-surround receptive fields, a center-surround algorithm was implemented in place of the original random-wiring algorithm. To model the responses from the cones, we filtered the ideal responses with Gaussian distributions. A typical Gaussian distribution, with mean μ and standard deviation σ, is shown below:

G(x) = (1 / (σ√(2π))) · exp(−(x − μ)² / (2σ²))

Gaussian distribution where μ is the mean and σ is the standard deviation.

Assuming our 2-D cone mosaic matrix is as given below, we chose different standard deviations for the Gaussian filter representing the response from the excitation cone (σ_c) and the Gaussian filter representing the responses from the surrounding, inhibiting cones (σ_s). The standard deviation of the center response was kept small relative to that of the surround response; a small standard deviation shrinks the pulse width, so the distribution approaches an ideal delta function. We modeled the center cone's response this way because it mimics the response the cone would have with no coupling from the surrounding cones. Likewise, we modeled the surrounding cones' responses with a wider pulse so that the filter can account for coupling from as many surrounding cones as possible.

Center-surround model (right) for a 2-D cone mosaic matrix (left).

The final response from the excitation cone is its center response minus the weighted surround response. One more parameter that can be tweaked is the weight w, used in the formula below:

response = center − w · surround

By changing the weight, we can choose and adjust how much impact the surrounding field has on the center cone.
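The center-minus-weighted-surround filtering described above is essentially a difference of Gaussians. A minimal Python sketch (the project's actual code is in MATLAB), with hypothetical values for σ_c, σ_s, and the weight w, might look like:

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized 2-D Gaussian of shape (size, size)."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def center_surround_response(patch, sigma_c=0.5, sigma_s=2.0, w=0.7):
    """Response of the cone at the patch center: a narrow center
    Gaussian minus w times a broad surround Gaussian (a
    difference-of-Gaussians) applied to the local cone responses."""
    size = patch.shape[0]
    dog = gaussian_kernel(size, sigma_c) - w * gaussian_kernel(size, sigma_s)
    return float((dog * patch).sum())
```

Since both kernels are normalized, a uniform patch of ones yields a response of 1 − w, the net gain of the center-surround filter; a bright spot at the center yields a larger response, reflecting the on-center/off-surround behavior.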

Software

Machine Learning Classifiers

We applied different machine learning algorithms to train classifiers and plot the color discrimination contour. While keeping other variables constant, we mainly tested different sets of color directions to see their effect on the orientation and size of the color contour. To readily compare the performance of different classifiers, we used the standard trichromat cone ratio, (L, M, S) = (0.6, 0.3, 0.1), as our model. The second-site noise was generated using random wiring of L and M cones. In every simulation, each sample was a cone mosaic of size 23x23.

  • Supervised Algorithms
    • Support Vector Machine (SVM)
      • Support vector machine aims to choose hyperplanes that maximize the boundary margins to separate clusters of data. In the SVM model, we tuned the parameter k in k-fold cross validation to find the best k value for the following supervised learning algorithms. We also adjusted the parameter "nSamples" for the number of samples in simulations for the following machine learning algorithms.
    • K Nearest Neighbors (KNN)
      • K nearest neighbors classifies data by assigning each object data to the most common type of class among its k nearest neighbors.
    • Decision Tree (Tree)
      • Decision tree maps observations to labels. There are three representations in tree models: internal node, branch, and leaf.
        • Each internal node tests an attribute.
        • Each branch corresponds to an attribute value.
        • Each leaf node assigns a cluster.
    • Linear Discriminant Analysis (LDA)
      • Linear discriminant analysis finds the component axes to classify the observations while reducing dimensionality and preserving as much of the class discriminatory information as possible.
  • Unsupervised Algorithms
    • K-means Clustering (K-means)
      • K-means clustering aims to partition n observations into k clusters by assigning each observation to the cluster with the closest centroid, serving as a prototype of the cluster. Since data with high dimensionality cannot converge in a few iterations, this algorithm needs to be terminated by limiting the number of iterations. K-means clustering is an unsupervised learning algorithm so the meaning or label of each cluster needs to be defined manually by assigning the label to the cluster that generates a higher accuracy.
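As a concrete illustration of one of the classifiers above, k-nearest neighbors can be written from scratch in a few lines. This Python sketch uses hypothetical toy clusters in place of real cone-mosaic responses; the project itself used MATLAB's built-in classifiers:

```python
import numpy as np

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training points (Euclidean distance)."""
    d = np.linalg.norm(train_X - query, axis=1)
    nearest = train_y[np.argsort(d)[:k]]
    vals, counts = np.unique(nearest, return_counts=True)
    return vals[np.argmax(counts)]

# Hypothetical toy data: two well-separated clusters standing in
# for "reference" vs "test" cone-response samples.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(2, 0.3, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
```

In the real experiments each sample is a 23x23 mosaic response flattened to a 529-dimensional vector, which is exactly why the high-dimensionality concerns discussed later arise.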

Delta E

Delta E is an empirical formula for the just-noticeable color difference. We used the latest version, CIEDE2000. The Delta E plot we generated uses gray, with (R, G, B) = (0.5, 0.5, 0.5), as both the reference color and the background color.

Color contours in Delta E.
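The idea behind Delta E can be illustrated with the much simpler CIE76 formula, which is just Euclidean distance in CIELAB. (We used the more elaborate CIEDE2000 formula in the project; this Python sketch and its Lab values are only illustrative.)

```python
import numpy as np

def delta_e_76(lab1, lab2):
    """CIE76 color difference: Euclidean distance in CIELAB.
    CIEDE2000 adds lightness/chroma/hue weighting on top of this
    basic idea of distance in a perceptually uniform space."""
    return float(np.linalg.norm(np.asarray(lab1) - np.asarray(lab2)))

# Mid-gray reference vs a slightly different color.
gray = (53.4, 0.0, 0.0)   # approximate CIELAB of sRGB (0.5, 0.5, 0.5)
test = (53.4, 1.0, 0.5)
```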

To compare the Delta E contours with the color contours generated from the machine learning classifiers, we first need the ratio between the number of cones per degree in the human eye and the number of cones in our simulated mosaic. Given the cone diameter d and the focal length f of the human eye, the small-angle approximation gives the visual angle subtended by a single cone as

θ ≈ d / f

Alternatively, the number of cones per degree is 1 / θ (with θ expressed in degrees).

Dividing the number of cones per degree in the human eye by the number of cones across our simulated mosaic gives the scale factor r between the two coordinate systems.

Example:

If the contour at a 45° orientation in the SVM simulation crosses the L and M contrast axes at (c_L, c_M), the corresponding position in the Delta E plot is (r·c_L, r·c_M), and the corresponding Delta E value is read off at that position.
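As a rough numeric sanity check of the visual-angle calculation, one can plug in commonly quoted values: a foveal cone diameter of about 2 μm and an eye focal length of about 17 mm. These numbers are assumptions for illustration, not values taken from our simulation setup.

```python
import math

# Assumed anatomical values (illustrative, not from our setup).
cone_diameter_m = 2e-6    # ~2 um foveal cone diameter
focal_length_m = 17e-3    # ~17 mm eye focal length

# Small-angle approximation: angle (radians) subtended by one cone.
theta_rad = cone_diameter_m / focal_length_m
theta_deg = math.degrees(theta_rad)

# Reciprocal gives the cone density in cones per degree.
cones_per_degree = 1.0 / theta_deg
```

With these assumed values the calculation gives on the order of 150 cones per degree in the fovea.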

Results

We split our results into two main sections - one focusing on the center-surround algorithm and one focusing on the machine learning classifiers.

Center-Surround Algorithm

The center-surround algorithm we implemented resulted in the color discrimination contour seen below.

Color contour generated with second-site noise from random wiring of L and M cones (left) and color contour generated without second-site noise from center-surround algorithm (right).

There are a couple of things to notice about the comparison above. First, both ellipses are oriented at about 45°, which is expected since all the parameters, including the L, M, and S ratios, are the same. Second, the ellipse generated by the center-surround algorithm is much smaller than that generated by the random wiring of L and M cones. This is because second-site noise was included in the random-wiring algorithm but not in the center-surround implementation. Even with these observations, it is hard to tell whether the center-surround algorithm is reasonable. Therefore, our next test was to remove the L or M cones to mimic the retina of a dichromat. The resulting color contours are shown below.

Color contour generated from center-surround with no L cones (left) and color contour generated from center-surround with no M cones (right).

Someone suffering from protanopia has a deficient or absent population of L cones. Given that, we expect the color contour to be two parallel horizontal lines, because no matter how high the L contrast, this type of dichromat will not be sensitive to any colors along 0° and 180°. On the other hand, someone suffering from deuteranopia has a deficient or absent population of M cones. Given that, we expect the color contour to be two parallel vertical lines, because no matter how high the M contrast, this type of dichromat will not be sensitive to any colors along 90° and 270°.

Region where dichromats suffering from protanopia (no L cones) are more sensitive to color than trichromats (left) and region where dichromats suffering from deuteranopia (no M cones) are more sensitive to color than trichromats (right).

Accounting for Cone Type

After generating the initial results and noting that they seemed reasonable, we looked to increase the accuracy of our model. To do that, we decided to try and take into account the cone type - depending on which cone type is being excited, it has different weights associated with it that can impact the color discrimination contour. To do this, we created multiple center functions and surround functions based on cone type. If an L cone is being excited, the final cone response would be the center function for an L cone () subtracted by the surround function for an M cone (). The vice versa is true where an M cone is being excited - the final cone response would be the center function for an M cone () subtracted by the surround function for an L cone ().

Generating different center and surround functions based on individual cone types.
Color discrimination contour after taking into account cone type.

From the plot, it can be seen that the resulting color contour is still reasonable; the small size is because there is no additional second-site noise. The orientation is slightly different from that generated by the initial center-surround algorithm, and the data points do not perfectly fit the best-fit ellipse. We think this could be because we did not optimize this algorithm: given our time constraints, we could not sweep the parameters to find an optimal combination. However, since the results still form an ellipse and there are no drastic changes from the previous algorithm, we conclude that this way of accounting for cone type is reasonable and can be further explored in future work.

Machine Learning

Support Vector Machine

Color contour generated with 3-fold cross validation (left), color contour generated with 6-fold cross validation (middle), and color contour generated with 10-fold cross validation (right). The color directions in the L-M contrast sensitivity plane were set to a large range of angles (0°, 40°, 45°, 50°, 90°, 135°, -180°, -140°, -135°, -130°, -90°).

There are a couple of things to notice about the comparison above. First, all three ellipses are oriented at about 45°, which is expected since all the parameters are the same except the number of folds for cross validation. Second, the color contour did not change much across the different numbers of folds. Thus, we chose k = 3 folds for cross validation for the rest of the simulations.

Color contour generated with different number of nSamples. The color directions in the L-M contrast sensitivity plane were set to a large range of angles (0°, 40°, 45°, 50°, 90°, 135°, -180°, -140°, -135°, -130°, -90°) and the nFolds = 3

The total number of data points in each simulation was twice nSamples. From the figures shown above, there is little difference once nSamples exceeds 2000. On the other hand, simulation time grows quickly with nSamples, and MATLAB ran out of memory when nSamples exceeded 5000. Therefore, to balance reasonable results against run time, we chose nSamples = 3000 for the following simulations, for a total of 6000 samples.

K Nearest Neighbors

Color contour generated with 1-nearest neighbor (left), color contour generated with 3-nearest neighbors (middle), and color contour generated with 5-nearest neighbors (right). The color directions in the L-M contrast sensitivity plane were set to a large range of angles (0°, 45°, 50°, 60°, 70°, 90°, -180°, -135°, -130°, -120°, -110°, -90°).

In this series of simulations, we fixed all the parameters except the number of nearest neighbors. From the figure above, we found that the number of nearest neighbors did not affect the color contour much. Therefore, we chose the 3-nearest-neighbors model to calculate the respective Delta E.

Decision Tree

Color contour generated with a set of directions (0°, 30°, 60°, 90°, 120°, 150°) (left), and color contour generated with a set of directions (0°, 40°, 45°, 50°, 75°, 90°, 110°, 135°) (right).

In this comparison, we kept all parameters constant but changed the set of directions. As the figure above shows, choosing directions closer to the orientation of the ellipse yields a more sensitive result.

Linear Discriminant Analysis

Color contour generated with color directions (0°, 25°, 55°, 60°, 65°, 125°, -180°, -155°, -125°, -120°, -115°, -55°) (left) and color contour generated with the same color directions but removed the outlier (right).

In the left figure, the outlier (the -120° dot) made the color contour too thin to reflect the typical just-noticeable difference. Therefore, we removed the outlier to draw the right figure above. Even though the shape of the ellipse looked good, the dots did not fit the contour well. Discriminant analysis with regularization might be a good alternative for fitting the data points to the color contour.

K-means Clustering

Color contour generated with color directions (0°, 55°, 60°, 65°, 125°, -180°, -155°, -125°, -120°, -115°) (left) and color contour generated with the same color directions but removed the outlier (right).

K-means clustering faces the same situation as linear discriminant analysis: the outlier made the color contour too thin to reflect the typical just-noticeable difference. Moreover, k-means clustering sometimes could not find centroids that perfectly categorize the data using Euclidean distance. The centroids could not converge because of the randomly assigned initial centroids and the sparse data.

Conclusions

From the results seen above, our center-surround algorithm seems reasonable, even without any optimization of variable parameters such as the standard deviations or weight of the Gaussian filters. It mostly performs as expected when the L, M, and S cone ratios are changed accordingly, and we also learned that there are certain ranges of colors to which dichromats are more sensitive than trichromats. Finally, we accounted for cone type in our center-surround algorithm, and the initial results were reasonable, though there needs to be some exploration of how the parameters change the characteristics of the color discrimination contours.

Orientations, the ratios of pure L contrast over pure M contrast, Delta E values, and simulation times of color discrimination contours generated by different machine learning algorithms.

From the table above, the orientations of the color discrimination contours are between 45° and 56°. Those orientations are reasonable because they are close to the orientation of the color contour in the Delta E plot. By examining the values of Delta E, which should be close to the just-noticeable difference, we can argue that all the classification results are reasonable. All the ratios of pure L contrast over pure M contrast in the table show that the response is roughly twice as sensitive to L contrast as to M contrast, which is similar to the ratio of the number of L cones to the number of M cones in our mosaic (0.6/0.3 = 2). However, there needs to be some exploration of how the ratio of L cones to M cones affects the ratio of L contrast to M contrast. As we can see, support vector machine and linear discriminant analysis take the least time, because support vector machine includes regularized algorithms and linear discriminant analysis performs dimensionality reduction. One possible reason the other three classifiers took longer is that k-nearest neighbors, decision tree, and k-means are sensitive to high-dimensional data.

Future Work

The following ideas are potential future investigations that could be useful in understanding and more accurately modeling the way the human eye interprets color.

  • Better enhancement of the implemented center-surround algorithm.
    • Currently, it is reasonable although not optimized. It would be interesting to see how accounting for cone type or including S cones in our calculations changes or confirms our observation of dichromats being more sensitive to certain color ranges than trichromats.
  • Compare center-surround results with Delta E generated contours.
    • Which method is more reasonable and why.
  • Compare current scene setup with other types of scenes or displays.
    • It would be interesting to see how different displays with different quantization bands would affect the orientation and size of the color discrimination contour.
  • Machine learning algorithms with dimension reduction.
    • Since the data is high dimensional, dimension reduction increases the efficiency of classification. Therefore, classification might perform better if we use a neural network to preprocess the data.
  • Machine learning algorithms with regularization.
    • Regularized discriminant analysis can reduce the dimension and avoid overfitting at the same time.
  • Machine learning algorithms that can handle independent features.
    • Naive Bayes is an alternative algorithm for high-dimensional data, but the built-in MATLAB functions for Naive Bayes take too much time. Implementing Naive Bayes in a different programming language such as Python might be faster.

References

Haomiao Jiang, TA/Mentor

http://cs229.stanford.edu/materials.html

http://sebastianraschka.com/Articles/2014_python_lda.html#principal-component-analysis-vs-linear-discriminant-analysis

http://slac.stanford.edu/cgi-wrap/getdoc/slac-pub-4389.pdf

http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/kmeans.html

http://webvision.med.utah.edu/book/part-ii-anatomy-and-physiology-of-the-retina/photoreceptors/

http://white.stanford.edu/~brian/numbers/node1.html

http://retina.anatomy.upenn.edu/~rob/lance/units_space.html

http://www.ncbi.nlm.nih.gov/pubmed/10376351

https://github.com/isetbio/computationaleyebrain/tree/dev/literature/bradleyJOV2014

https://foundationsofvision.stanford.edu/

http://www.cns.nyu.edu/~david/courses/perception/lecturenotes/ganglion/ganglion.html

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3652807/

Hecht, Eugene, Optics, 2nd Ed., Addison-Wesley, 1987.

Principles of Neural Science 4th Ed. Kandel et al.

Hanggi, Evelyn B.; Ingersoll, Jerry F.; Waggoner, Terrace L. (2007). "Color vision in horses (Equus caballus): Deficiencies identified using a pseudoisochromatic plate test.". Journal of Comparative Psychology 121 (1): 65–72. http://psycnet.apa.org/journals/com/121/1/65.html

Appendix I

MATLAB source code for center-surround algorithm can be downloaded here.

MATLAB source code for machine learning classifiers can be downloaded here.

MATLAB source code for Delta E can be downloaded here.

Appendix II

Work Breakdown

  • Center-Surround Modeling
    • Rachel Liao
    • Hsin-Fang Wu
  • Machine Learning Classifiers
    • Yi-Chen Tsai
    • Kan-Yun Tu
  • Presentation Slides and Wiki Page
    • All