Kinetic Chroma: Helping the color-blind see the full spectrum (David Nissimoff)


Introduction

Kinetic Chroma is a mechanism devised in this project to help the color-blind see the full color spectrum through temporal color transformations (i.e. animated color transforms). The project works by leveraging a dichromat color model to estimate the color perception of a dichromat for any given input image, and by then computing a meaningful color transformation that can be animated in order to convey the portion of the color space that would otherwise not be distinguishable by the dichromat.

The result is a method that enables color-blind individuals to pass the Ishihara test (1) when looking at an Ishihara plate through the proposed application. The animated image below shows the simulated effect, from top to bottom, for a color-normal individual, a protanope, a deuteranope, and a tritanope. The color contrast becomes noticeable in all cases thanks to the mechanism introduced in this project.

Background

Color Blindness

Color-blindness affects up to about 8% of men and 0.5% of women. (2) The term color-blindness, however, can be misleading: most affected individuals can still perceive color, just not all colors.

Color-normal individuals perceive color primarily through the three cone types on the retina, termed the L, M and S cones (for Long, Medium and Short wavelengths). Because of the three cone types, color-normal individuals are also called trichromats. Color-blind individuals, on the other hand, have one (or, in rare cases, more) cone type that is either missing the required photopigment or has some other anomaly that causes it to underperform. The main kinds of dichromacy are categorized by which cone type is affected: protanopes, deuteranopes and tritanopes have affected L, M and S cones, respectively.

Inclusive design tools

Given the substantial number of people affected by some kind of color blindness, it is not at all surprising that a myriad of tools has been developed to help build technology that is inclusive of this population. In the realm of software, this includes tools that help product designers simulate what a dichromat might perceive, and choose color schemes that work better for the entire population. Despite the availability of such tools, product designs nonetheless often neglect this population. More fundamentally, those tools tackle the problem from the perspective of helping product designers come up with better designs (e.g. by choosing appropriate colors for user interaction elements). It is the opinion of the author of this project that this approach is ultimately not scalable:

  • It requires every team designing or building a product to account for these requirements, which cannot possibly be enforced for the totality of products created everywhere in the world
  • It offers no way to improve existing products with closed designs whose color choices already prove inconvenient to dichromats.

Assistive tools and technologies

Existing assistive software tools for color blindness can be divided into two broad categories:

  • Color-naming tools, which provide a textual description of the color under the mouse cursor on a computer screen (e.g. (3))
  • Color-enhancement techniques, which transform an image into a new image with increased color contrast

This project falls within the color-enhancement category, with a fundamental change to existing approaches that enables it to convey the full color spectrum to dichromats as well. While this does not result in the perception of new, previously unseen colors, it does enable a dichromat to distinguish previously indistinguishable colors thanks to the temporal nature of the color transformation applied.

Methods

The proposed mechanism encompasses 3 key steps:

  1. Use a dichromat color model to simulate what a dichromat would perceive from an input image. The goal of this step is to transform an image so as to elicit similar perceptual experiences for dichromats and trichromats alike. If the model is accurate, the dichromat cannot distinguish the original from the transformed image, and trichromats can then evaluate what the image is believed to "look like" to a dichromat.
  2. Compute the perceptual color difference for each pixel between the original and the transformed image. For each pixel, this is a 3D vector that measures the color information the dichromat is missing. Additionally, it is desirable to represent this in a perceptual color space, with several possible choices: CIELAB (4), Opponent colors (5), etc.
  3. Apply a modulation function to each pixel, driven by the corresponding color difference, so that the missing color information is animated over time (see the sketch after this list).
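As a rough illustration, the Octave fragment below applies these three steps to a single pixel. The matrices A (trichromat) and A_D (dichromat) are placeholders for illustration only; the real transforms are derived in the sections and appendix that follow.

% Minimal sketch of the three key steps for one linear-sRGB pixel c.
% A and A_D below are placeholders; the real transforms are derived later.
A   = [1.0 0.2 0.1; 0.1 1.0 0.2; 0.2 0.1 1.0];  % placeholder trichromat transform
A_D = [1 0 0; 0 1 0; 0 0 0] * A;                % placeholder rank-2 dichromat transform
c   = [0.8; 0.2; 0.2];                          % input pixel (linear sRGB)

% Step 1: simulate what the dichromat perceives (round trip through A_D)
c_sim = A \ (A_D * c);

% Step 2: per-pixel perceptual color difference (what the dichromat misses)
delta = A * c - A_D * c;

% Step 3: animate the missing information with a temporal modulation
m0 = 0.5; f = 1.0; t = 0.25;                    % modulation factor, frequency, time
c_out = A \ (A * c + m0 * sin(2 * pi * f * t) * delta);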

Dichromat color model

Several models of human color perception have been developed over the years. These can be organized into two groups as follows:

  • CIELUV, CIELAB, S-CIELAB (6): Empirical models to quantify human perception. They tend to focus on accuracy of prediction, not necessarily on the mechanisms that cause the behavior
  • Opponent, Two Stage (cone adaptation + opponent encoding), ATD, DKL, YIQ, etc.: Mechanistic models that attempt to infer the neural structure behind color perception behavior; many, or perhaps all, take special care to also explain dichromat behavior

For this project, we focus on the Two Stage model (7), which is based on the psychophysical experiments from (5). The main idea in this model is the assumption that the dichromat's perception results from (i) the loss of a cone type; and (ii) a modified opponent color transformation, estimated from measurements. It is an elegant solution that seems to explain naturally observable phenomena and uses only linear transformations (unlike (8), which is also often used but is not linear). The primary motivation for this choice is that the linear model can be implemented efficiently as a Pixel Shader on modern GPUs.

Two Stage linear model for dichromacy

This section briefly explains the method proposed in (7), using slightly different notation that makes the problem more amenable to the standard linear algebra operations used in what follows.

The Two Stage model prescribes that color perception of color-normal individuals is modeled by the figure to the left, where the transform T is taken from (5). It is worth mentioning that, in order to use the numerical values of that transformation as given in (5), it is important to use the same LMS color space definition as was used there, namely the Smith & Pokorny (1975) model with the L, M and S curves normalized to unit peak. More information on this is provided in (9).

For dichromats, the Two Stage model follows the figure on the right, where a cone type is lost, and a new transformation matrix is used. The model is built on empirical data that there are three kinds of emissions that are perceived to be the same color by unilateral dichromats: the equi-energy white, and two pure wavelengths that depend on the kind of dichromacy. The matrix is obtained as the best least squares approximation that results in equal values of A, C1, C2 for the 3 stimuli. In equation form, that is

$$T_D \, M_L \, M \approx T \, M$$

solved in the least squares sense as $T_D = (T \, M)(M_L \, M)^{+}$, where $(\cdot)^{+}$ denotes the pseudoinverse (see the snippet below).

Where:

  • $M_L$ represents the cone loss transformation in LMS space (usually almost an identity matrix but with a zero for the missing cone)
  • $M$ is the matrix whose columns contain the LMS-space values for the white point and the two pure wavelengths
  • $T$ is the trichromat opponent transform from (5), and $T_D$ is the dichromat opponent transform being estimated.
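In Octave, this least squares fit is a one-liner with the pseudoinverse. The sketch below mirrors the dichromat() function in the appendix; T is the opponent transform from (5), M_L is the protanope cone loss, and M is left as a random placeholder (the appendix builds the real M from the D65 white point and the 475 nm / 575 nm isochromes).

% Least squares estimate of T_D such that T_D * M_L * M =~ T * M.
T   = [  0.990, -0.106, -0.094;
        -0.669,  0.742, -0.027;
        -0.212, -0.354,  0.911 ]';   % opponent transform from (5)
M_L = diag([0 1 1]);                 % protanope: L cone is lost
M   = rand(3, 3);                    % placeholder: LMS values of white + 2 isochromes
T_D = (T * M) * pinv(M_L * M);       % best least squares fit
err = norm(T_D * M_L * M - T * M) / norm(T * M)   % relative residual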

Estimating what a dichromat sees

  • Trichromats perceive, in opponent color space: $o = T \, c_{LMS}$
  • In computer graphics, it is more convenient to operate in sRGB space. But sRGB is non-linear because of the transfer "gamma" curve. So we operate in the linear-sRGB space instead: $c_{LMS} = M_{xyz2lms} \, M_{srgb2xyz} \, c$, where $c$ is the linear-sRGB value
  • Putting it all together, trichromats perceive: $o = T \, M_{xyz2lms} \, M_{srgb2xyz} \, c = A \, c$ (1)
  • A dichromat is analogous, but uses (2) instead: $o_D = T_D \, M_L \, M_{xyz2lms} \, M_{srgb2xyz} \, c = A_D \, c$ (2)

While the transformation in (1) is invertible and all matrices involved are full rank, the same does not hold for (2). By construction, $A_D$ is a rank-2 singular matrix, and the missing color information for the dichromat lies in the null space of this transformation. The next step, therefore, is to compute the null space of the dichromat's transform.
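The full matrices in (1) and (2) can be assembled directly from the constants in the appendix. Below is a hedged sketch for the protanope case; T_D here is just a placeholder standing in for the least squares fit above.

% Assemble A (trichromat) and A_D (dichromat) in linear sRGB.
lin_srgb2xyz = [ 0.4124, 0.3576, 0.1805;
                 0.2126, 0.7152, 0.0722;
                 0.0193, 0.1192, 0.9505 ];
xyz2lms = diag([1/1.062, 1, 1/1.7826]) * [  0.15516, 0.54308, 0.03287;
                                           -0.15516, 0.45692, 0.03287;
                                            0,       0,       0.01608 ];
T   = [ 0.990, -0.106, -0.094; -0.669, 0.742, -0.027; -0.212, -0.354, 0.911 ]';
M_L = diag([0 1 1]);              % protanope cone loss
T_D = T;                          % placeholder: use the least squares fit in practice
A   = T   *       xyz2lms * lin_srgb2xyz;   % equation (1), full rank
A_D = T_D * M_L * xyz2lms * lin_srgb2xyz;   % equation (2), rank 2
rank(A_D)                         % prints 2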

Finding the Null space of the dichromat's transform

  • Using the Singular Value Decomposition, we seek to obtain matrices $U$, $S$, $V$, with $S$ diagonal and $U$, $V$ orthogonal, such that: $A_D = U \, S \, V^T$
  • By construction, we know this is a rank-2 matrix (because $M_L$ is rank-2). Furthermore, the first and second columns of V ($v_1$, $v_2$) give an orthonormal basis of the directions in linear-sRGB space that the dichromat does perceive, and the third column of V ($v_3$) gives the direction of the null space. Any change in linear-sRGB space parallel to $v_3$ is not perceived by the dichromat!
  • The reverse transformation from opponent colors to linear sRGB is given by: $A^{-1} = M_{srgb2xyz}^{-1} \, M_{xyz2lms}^{-1} \, T^{-1}$
  • And the full round trip from linear sRGB to the equivalent linear sRGB value observed by the dichromat is: $c_D = \mathrm{Rev} \, V^T \, c$, where $\mathrm{Rev} = A^{-1} \, U \, S$ (computed in the snippet below)
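Continuing the sketch above (reusing the placeholder A and A_D), the SVD step below extracts the invisible direction v3 and the Rev matrix, and checks that moving a color parallel to v3 leaves the dichromat's response unchanged; the appendix performs the same computation with the real matrices.

% SVD of the dichromat transform (A and A_D as assembled in the previous sketch).
[U, S, V] = svd(A_D);
v3  = V(:, 3);            % null-space direction: invisible to the dichromat
Rev = inv(A) * U * S;     % V-basis coordinates -> dichromat's linear sRGB
% Sanity check: a change parallel to v3 does not alter the dichromat response.
c = [0.5; 0.5; 0.5];
assert(norm(A_D * (c + 0.1 * v3) - A_D * c) < 1e-12)
c_D = Rev * (V' * c);     % linear sRGB the dichromat effectively perceives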

Showing the missing dimension through temporal transformation

The following steps are performed over time and for each pixel on the image (an Octave transcription of these steps follows the list):

  • Transform to linear-sRGB space (i.e. undo the sRGB "gamma" curve). Call the input value $c$
  • Project it onto the V basis obtained previously with the SVD: $w = V^T c$
  • Compute how far off the dichromat's perception is by round-tripping in the V basis: $\Delta = w_3 - (V^T \, \mathrm{Rev} \, w)_3$
  • Apply a temporal transformation $w_1 \leftarrow w_1 + m(t) \, \Delta$.
  • For example, $m(t) = m_0 \sin(2 \pi f t)$, where $m_0$ is the modulation factor and $f$ is the modulation frequency. Alternate schemes could also be explored, such as modulating the frequency and / or phase in addition to or instead of modulating the amplitude.
  • Compute the new opponent-color representation: $o' = U \, S \, w$
  • Transform back to linear-sRGB space with the inverse trichromat model: $c' = A^{-1} \, o' = \mathrm{Rev} \, w$
  • Transform back to sRGB space by reapplying the non-linear "gamma" transfer curve.
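For reference, the per-pixel steps above transcribe almost line for line into Octave; the production implementation is the HLSL pixel shader in the appendix, which uses 4x4 matrices to carry the alpha channel, whereas this sketch assumes 3x3 VT and Rev matrices as precomputed in the SVD step.

% One frame of Kinetic Chroma for a single sRGB pixel (values in [0,1]).
function c_out = kinetic_chroma_pixel(c_srgb, VT, Rev, m0, f, t)
    % Standard sRGB transfer functions (linearize / re-apply gamma).
    lin = @(s) (s > 0.04045)   .* ((s + 0.055) / 1.055) .^ 2.4 + (s <= 0.04045)   .* (s / 12.92);
    gam = @(v) (v > 0.0031308) .* (1.055 * max(v, 0) .^ (1 / 2.4) - 0.055) + (v <= 0.0031308) .* (v * 12.92);

    w  = VT * lin(c_srgb);               % project onto the V basis
    ww = VT * (Rev * w);                 % round trip through the dichromat model
    delta = w(3) - ww(3);                % information lost along v3
    w(1) = w(1) + m0 * sin(2 * pi * f * t) * delta;   % temporal modulation
    v = min(max(Rev * w, 0), 1);         % back to linear sRGB, clamped
    c_out = gam(v);
end

Calling this for every pixel of an image over increasing t, with the VT and Rev values printed by the Octave script in the appendix, should reproduce the shader's animation on the CPU.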

Results

The project was implemented as a proof of concept as a Windows 10 application leveraging Direct2D and using a custom Pixel Shader for improved performance. The open source library Win2D (10) was used as the rendering mechanism and to load the custom pixel shader.

The animated GIF below demonstrates the method for a computer-generated rainbow gradient and for the set of 24 Ishihara test images, obtained from (11).

  • The first row shows the original Ishihara test plates.
  • The second row shows what a protanope normally sees, and the third row shows what a protanope sees with the Kinetic Chroma application
  • The fourth row shows what a deuteranope normally sees, and the fifth row shows what a deuteranope sees with the Kinetic Chroma application
  • The sixth row shows what a tritanope normally sees, and the seventh row shows what a tritanope sees with the Kinetic Chroma application

Conclusions

The goals of the project were met: the results clearly showed not only that the dichromat color model works as expected, but also that we are now able to convey the information the dichromat had been missing by modulating the difference signal in the perceptual color space.

By using simple and effective linear models, we were able to tackle the complexity analytically, leveraging the SVD to pre-compute the necessary quantities in the models used. In addition, the design was tailored for efficient implementation with minimal overhead, realized as a custom GPU Pixel Shader that performs the pixel value transformations and animations in real time.

In closing, with this application, color-blind individuals could now pass the Ishihara tests!


References

Appendix

Demo application source code: Here

Octave precomputations

% David Nissimoff
% Kinetic Chroma project
% PSYCH 221, 12/11/2017

clear;

function n=normalize(v)
    n = v / norm(v);
end

xyz475 = normalize([0.142100; 0.112600; 1.041900]);
xyz575 = normalize([0.842500; 0.915400; 0.001800]);
xyz485 = normalize([0.057950; 0.169300; 0.616200]);
xyz660 = normalize([0.164900; 0.061000; 0.000000]);

% See: https://en.wikipedia.org/wiki/SRGB#The_reverse_transformation
global lin_srgb2xyz = [ 0.4124, 0.3576, 0.1805;
                        0.2126, 0.7152, 0.0722;
                        0.0193, 0.1192, 0.9505 ];

% See: https://en.wikipedia.org/wiki/LMS_color_space#CIECAM02
%global xyz2lms = [  0.7328, 0.4296, -0.1624;
%                   -0.7036, 1.6975,  0.0061;
%                    0.0030, 0.0136,  0.9834 ];

% Hunt-Pointer-Estevez transformation matrix, normalized to D65 white point
% See: https://en.wikipedia.org/wiki/LMS_color_space#Hunt.2C_RLAB
%global xyz2lms = [  0.4002, 0.7076, -0.0808;
%                   -0.2263, 1.1653,  0.0457;
%                    0,      0,       0.9182 ];


% See  R. M. Boynton, Human Color Vision (Holt, Rinehart & Winston, New York, 1979)
%   (http://blog.sciencenet.cn/home.php?mod=attachment&filename=Color-1b-CIE.pdf&id=96390)
% This is the conversion used by Smith & Pokorny (1975), as referenced by Poirson & Wandell in
% Appearance of colored patterns: pattern-color separability (1993)
% Note: The left diagonal matrix applies the same normalization as described in the later paper,
% and is used to normalize the peaks of L(\lambda), M(\lambda), S(\lambda) to 1 per the Smith & Pokorny model.
global xyz2lms = [ 1/1.062, 0, 0       ;
                   0,       1, 0       ;
                   0,       0, 1/1.7826 ] * [  0.15516, 0.54308, 0.03287;
                                              -0.15516, 0.45692, 0.03287;
                                               0,       0,       0.01608 ];

% See: http://white.stanford.edu/~brian/papers/color/PoirsonWandell1993.pdf
global lms2opp = [  0.990, -0.106, -0.094;
                   -0.669,  0.742, -0.027;
                   -0.212, -0.354,  0.911 ]';

xyzWhite = [ 0.9504; 1.0000; 1.0888 ]; % D65 white point, normalized to Y = 1

% Apply method based on http://www2.ece.rochester.edu/~gsharma/papers/PardoDichromat2StageModel_IVMSP2011.pdf

% M_L: Matrix that represents the cone loss in LMS space.
%      This is usually close to the identity matrix but with a zero for the missing cone.
% xyzIsochromes: Each column is an XYZ color that is perceived to be the same by the dichromat and by trichromats.
function [VT, Rev] = dichromat(M_L, xyzIsochromes)
    global xyz2lms;
    global lin_srgb2xyz;
    global lms2opp;
    % M: Each column is an LMS color corresponding to how a normal observer sees white and the 2 isochromes
    M = xyz2lms * xyzIsochromes;

    % Now find the best T_D such that:
    %   T_D * M_L * M =~ lms2opp * M
    rhs = lms2opp * M;
    T_D = rhs *  pinv(M_L * M);
    e = norm(T_D * M_L * M - rhs) / norm(rhs);
    printf('    Err: %f\n', e);
    % A_D: Converts from lin_sRGB to opponent colors of the dichromat. It is a rank-2 matrix.
    % By taking the SVD of AD, we can easily find an orthonormal rank-2 basis for its column-space,
    % and the rank-1 basis of its null space.
    A_D = T_D * M_L * xyz2lms * lin_srgb2xyz;
    [U,S,V] = svd(A_D);

    VT = V';
    Rev = inv(lin_srgb2xyz) * inv(xyz2lms) * inv(lms2opp) * U * S;
    printf('    V^T:\n');
    printf('        %.6ff, %.6ff, %.6ff,\n', VT'); % printf consumes the matrix column-major, so transpose to emit rows
    printf('    Rev:\n');
    printf('        %.6ff, %.6ff, %.6ff,\n', Rev'); % printf consumes the matrix column-major, so transpose to emit rows
end

%% Protanope
printf('\n\nPROTANOPE:\n');
[VTp,Revp] = dichromat([ 0 0 0;
            0 1 0;
            0 0 1 ],
          [ xyzWhite, xyz475, xyz575 ]);

%% Deuteranope
printf('\n\nDEUTERANOPE:\n');
[VTd,Revd] = dichromat([ 1 0 0;
            0 0 0;
            0 0 1 ],
          [ xyzWhite, xyz475, xyz575 ]);

%% Tritanope
printf('\n\nTRITANOPE:\n');
[VTt,Revt] = dichromat([ 1 0 0;
            0 1 0;
            0 0 0 ],
          [ xyzWhite, xyz485, xyz660 ]);


Pixel shader source

// David Nissimoff
// Kinetic Chroma project
// PSYCH 221, 12/11/2017
//
// This shader has one input texture:
//
//  - The sRGB image whose color space will be transformed.

#define D2D_INPUT_COUNT 1
#define D2D_INPUT0_SIMPLE

#include "d2d1effecthelpers.hlsli"

matrix<float, 4, 4> VT;
matrix<float, 4, 4> Rev;
float mod;

#define SRGB_GAMMA_A      0.055f
#define SRGB_GAMMA_EXP    2.4f
#define SRGB_GAMMA_SLOPE  12.92f
#define SRGB_GAMMA_THRESH 0.04045f

float undo_sRGB_gamma(const float v)
{
	const float s = saturate(v);
#pragma warning(disable:3571) // warning X3571: pow(f, e) will not work for negative f, use abs(f) or conditionally handle negative values if you expect them
	return (
		s > SRGB_GAMMA_THRESH ?
		pow((s + SRGB_GAMMA_A) / (1 + SRGB_GAMMA_A), SRGB_GAMMA_EXP) :
		(s / SRGB_GAMMA_SLOPE));
#pragma warning(enable:3571)
}

float apply_sRGB_gamma(const float v)
{
	return (
		v > (SRGB_GAMMA_THRESH / SRGB_GAMMA_SLOPE) ?
		((1.0f + SRGB_GAMMA_A) * pow(v, 1.0f / SRGB_GAMMA_EXP) - SRGB_GAMMA_A) :
		v * SRGB_GAMMA_SLOPE);
}

D2D_PS_ENTRY(main)
{
	// Sample the input texture
	float4 v = D2DGetInput(0);

	// Linearize sRGB components
	v.r = undo_sRGB_gamma(v.r);
	v.g = undo_sRGB_gamma(v.g);
	v.b = undo_sRGB_gamma(v.b);

	// Apply VT
	v = mul(VT, v);
	float v3_initial = v.z;

	// Apply Rev
	// TODO: OPTIMIZATION: The matrix multiplies below can be optimized into a single dot product with a pre-computed row vector
	float4 vv = mul(Rev, v);
	vv = mul(VT, vv);
	float v3_final = vv.z;

	// Apply Rev with delta_v3 applied to v1
	v.x += mod * (v3_initial - v3_final);
	v = mul(Rev, v);

	// Re-apply sRGB gamma
	v.r = apply_sRGB_gamma(v.r);
	v.g = apply_sRGB_gamma(v.g);
	v.b = apply_sRGB_gamma(v.b);

	return v;
}