Psych221 Project Suggestions


Below we list project suggestions for Psych 221. We update this page regularly with ideas for projects.

  • We describe how you should create the write-up on the Project Guidelines page.
  • More than one person or group can work on the same project.
  • Just after mid-terms, you will be asked to turn in a short paragraph proposing your project.
  • We want to make sure you are the right person for your proposed project.
  • If you want to work on a project that is not listed, perhaps because it would be helpful for your research, ask us.

See the links to past projects in the bar at the left.


Project Suggestions Fall 2024

Ray Tracing and Neural Networks

Wavefront optics and pupil shape

Initiated by this LinkedIn post

"Goats, cats and some other animals have eyes that are really creepy.

For species that are active both night and day, like domestic cats, slit pupils provide the dynamic range needed to help them see in dim light yet not get blinded by the midday sun."

The explanation in that post is disputed by Roland Fleming, who cites the claim in this research article.

Can we use ISETCam/Bio and the wavefront methods to set the aperture shape and improve on this exchange?
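
As a starting point, here is a minimal, library-free MATLAB sketch of the kind of calculation involved: it compares the diffraction-limited PSF of a circular pupil with that of a vertical slit by taking the squared magnitude of the Fourier transform of the pupil function. The pupil sizes and sampling are arbitrary illustrative values; in the actual project the aperture mask would be fed into the ISETCam/ISETBio wavefront machinery instead.

 % Minimal sketch: compare PSFs of a round pupil and a vertical slit pupil.
 % All numbers here are illustrative, not calibrated to any real eye.
 n = 512;                                            % pupil-plane samples
 [x, y] = meshgrid(linspace(-1, 1, n));
 
 circPupil = double(x.^2 + y.^2 <= 0.5^2);           % round pupil, radius 0.5
 slitPupil = double(abs(x) <= 0.05 & abs(y) <= 0.5); % narrow vertical slit
 
 % Diffraction-limited PSF = |FT(pupil function)|^2 (no aberrations here)
 psfCirc = abs(fftshift(fft2(circPupil))).^2;
 psfSlit = abs(fftshift(fft2(slitPupil))).^2;
 psfCirc = psfCirc / sum(psfCirc(:));                % normalize to unit volume
 psfSlit = psfSlit / sum(psfSlit(:));
 
 figure;
 subplot(1,2,1); imagesc(log10(psfCirc + eps)); axis image; title('Round pupil (log PSF)');
 subplot(1,2,2); imagesc(log10(psfSlit + eps)); axis image; title('Slit pupil (log PSF)');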

Imaging Skin Moles

Dermatologists frequently use smartphone cameras to photograph and monitor skin moles. When these images are properly calibrated and archived, they become invaluable for tracking alterations in a mole's asymmetry (A), border irregularity (B), color variations (C), and diameter (D). In fact, changes in the visual appearance of skin moles are the most important diagnostic indicator for skin cancer. Hence, twenty years ago, the letter E for Evolution was added to the ABCD criteria, yielding the ABCDE criteria for skin cancer monitoring. Since monitoring the change in skin moles over time is critical, it is important to know how differences in lighting, optics, sensors and the image processing in a smartphone change the quality of the images that are captured. The aim of this project is to assess the reliability of monitoring skin moles using commercially available smartphones, as well as with a dermatoscope attachment that illuminates the skin with controlled lighting and magnifies the image at a fixed imaging distance.

This can either be a one-person project or a team project. As a one-person project, the student can collect empirical data by capturing images of the same skin moles using different cameras, as well as a dermatoscope attached to a camera. The dermatoscope, which we will provide, controls the lighting and imaging distance. Hence, the main differences between different smartphone cameras will be optics, sensor resolution, QE, exposure, and image processing. Images that are captured without the dermatoscope will have more variability due to variations in lighting and viewing distances. Smartphone camera images with and without a dermatoscope can be analyzed using methods described in Dugonik et al., 2020.

As a multi-person project, other students could analyze images in the International Skin Imaging Collaboration (ISIC) Archive. This is one of the largest publicly available collections of dermoscopic images of skin lesions that have been used in ISIC challenges over the years. Several researchers have emphasized the need for image metadata specifically for the ISIC database (Caffery et al., 2018; Daneshjou et al., 2022). Given the empirical data obtained by other students and/or an analysis of the ISIC database, write a document describing what image metadata should be obtained and how it should be stored.

References:

Dugonik B, Dugonik A, Marovt M, Golob M. Image Quality Assessment of Digital Image Capturing Devices for Melanoma Detection. Applied Sciences. 2020; 10(8):2876. https://doi.org/10.3390/app10082876

Caffery, L. J., Clunie, D., Curiel-Lewandrowski, C., Malvehy, J., Soyer, H. P., & Halpern, A. C. (2018). Transforming dermatologic imaging for the digital era: metadata and standards. Journal of Digital Imaging, 31, 568-577.

Daneshjou, R., Barata, C., Betz-Stablein, B., Celebi, M. E., Codella, N., Combalia, M., ... & Rotemberg, V. (2022). Checklist for evaluation of image-based AI reports in dermatology: CLEAR derm consensus guidelines from the International Skin Imaging Collaboration artificial intelligence working group. JAMA Dermatology, 158(1), 90.

Project Mentor: Joyce Farrell

Project Suggestions Fall 2023

Color science: Simulations of James Clerk Maxwell's experiments

In a series of papers just prior to 1860, James Clerk Maxwell introduced a version of the color matching experiment, along with a method to analyze the data and derive the color-matching functions (CMFs). His papers contain

* Schematics of the instrument 
* Tables of data
* Methods to calculate the CMFs from the data

I would like to implement a tutorial, using ISETCam and ISETBio, that simulates the apparatus and the expected experimental outcome based on our current knowledge. This project will result in a Matlab LiveScript that serves as a class tutorial and possibly a video about Maxwell's work. A start for this work, along with some advanced algorithms, is in the ISETBio repository isetfundamentals.
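
A hedged sketch of the central calculation: if S holds the cone fundamentals and P the spectral power distributions of three primaries (both sampled on the same wavelength grid), the color-matching functions follow from the matching condition S' * e_lambda = S' * P * w. The cone data call (ieReadSpectra with the 'stockman' file) is my assumption about the ISETCam/ISETBio data naming and may need adjusting; the linear algebra is the part that matters.

 % Sketch of deriving CMFs from cone fundamentals and a choice of primaries.
 % Assumes ISETCam/ISETBio on the path; the 'stockman' data file name is an
 % assumption and may differ in your installation.
 wave = 400:5:700;                            % wavelength samples (nm)
 S = ieReadSpectra('stockman', wave);         % nWave x 3 cone fundamentals
 
 % Three illustrative monochromatic primaries (Maxwell used red, green, blue slits)
 P = zeros(numel(wave), 3);
 P(wave == 450, 1) = 1;                       % blue primary
 P(wave == 530, 2) = 1;                       % green primary
 P(wave == 630, 3) = 1;                       % red primary
 
 % Matching condition: S' * e_lambda = S' * P * w(lambda), so the CMFs
 % (one column per test wavelength) are:
 CMF = (S' * P) \ S';                         % 3 x nWave
 
 plot(wave, CMF'); xlabel('Wavelength (nm)'); ylabel('Primary intensity');
 legend('blue primary', 'green primary', 'red primary');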

  • Mentor: Brian Wandell
  • What you will learn: You will understand color matching and the encoding of wavelength by the human retina.

Medical imaging: Simulations of human tissue reflectance and fluorescence

A number of factors impact the images we can obtain of surface tissue in health and disease. Two factors are the density and oxygenation of the blood, and the presence of fluorophores in cells. Both factors influence the health of the underlying tissue and, in some cases, these two interact.

The goal of this project is to simulate the spectral signals we expect for (a) a specific light source, given (b) skin with different amounts of blood, and (c) different ratios of oxygenated to deoxygenated blood. It will also include simulations of fluorescing tissue.

You will be able to draw from previous class projects, open-source software, and on our advice to simulate new types of light-based medical imaging devices.
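
For orientation, here is a minimal sketch of the kind of forward model involved, using a simple Beer-Lambert approximation rather than a full Monte Carlo simulation: reflectance falls off with the product of path length and a blood absorption coefficient that mixes oxy- and deoxyhemoglobin. The hemoglobin absorption curves below are crude placeholders of my own; tabulated extinction coefficients (e.g., the omlc.org data) would be substituted for real work.

 % Sketch: Beer-Lambert approximation of diffuse skin reflectance as a function
 % of blood volume fraction and oxygen saturation. The hemoglobin absorption
 % curves are placeholders -- substitute tabulated extinction coefficients.
 wave = 450:5:700;                                  % nm
 muaHbO2 = exp(-((wave - 576)/30).^2) + 0.8*exp(-((wave - 542)/20).^2);  % placeholder
 muaHb   = exp(-((wave - 556)/35).^2);                                   % placeholder
 
 bloodFraction = 0.02;        % volume fraction of blood in the sampled tissue
 saturation    = 0.8;         % fraction of hemoglobin that is oxygenated
 pathLength    = 2;           % effective optical path length (arbitrary units)
 
 mua = bloodFraction * (saturation*muaHbO2 + (1 - saturation)*muaHb);
 reflectance = exp(-2 * pathLength * mua);          % light passes in and back out
 
 % Spectral signal at the camera for a given light source (flat source here)
 illuminant = ones(size(wave));
 radiance = illuminant .* reflectance;
 plot(wave, radiance); xlabel('Wavelength (nm)'); ylabel('Relative radiance');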

  • Mentor: Joyce Farrell
  • What you will learn: How to use existing software tools for modeling tissue optics and cameras in order to simulate and evaluate quantitative medical imaging systems capable of estimating the health of biological tissue in-situ.

Physically based simulation: Night time driving

Both autonomous and ADAS driving projects often rely, in part, on simulated scenes. In particular, modern ADAS systems for lane keeping and pedestrian detection rely heavily on cameras and headlights, often first designed and evaluated using simulation. Creating and benchmarking the required physically accurate 3D spectral radiance scenes is possible, but challenging. We are developing tools for simulating cars (with their headlights on!) driving at night. This project involves using computer graphics (PBRT) and Matlab to evaluate camera and headlight design options when used with industry standard benchmark scenarios.

Another version of this project involves seeing whether a "smart" camera could have prevented the recent tragedy that made national news when Google Maps led a family off the end of a broken bridge -- and, if so, under what circumstances and at what speed range it would work.

  • Mentor: Zhenyi Liu and Dave Cardinal
  • What you will learn: How to model lights and sensors using Matlab and ISETAuto, as well as how to conduct experiments with them and evaluate the results.

Simulate an underwater imaging system and explore water absorption and scattering estimation methods

The absorption and scattering properties are fundamental characteristics of a participating medium such as water. Knowledge of these parameters is necessary in order to faithfully model light interactions with such a medium, and hence to simulate, for example, what an image captured by an underwater camera would look like. But measuring these properties for real seawater or other media is difficult.

The goal of this project is to design experiments to characterize a participating medium using camera captures. However, rather than using an inconvenient and constraining laboratory setup, your team of 2-3 people will have the freedom to explore various estimation protocols and techniques using camera simulations and virtual 3D environments. Your project will take advantage of existing modeling tools. You will learn how to use the iset3d toolbox for Matlab to create simple 3D scenes; how to run PBRT for physically based ray tracing through those scenes to produce physically accurate images; and finally how isetcam (a Matlab toolbox) provides accurate camera simulation and modeling. All these tools come with sample scripts demonstrating their basic functionality. When combined, these tools will enable you to simulate realistic camera images for different water conditions, explore ideas for how to measure water properties using different imaging systems, and evaluate how well they perform for different water types. With the virtual experimental setup in place you will be able to focus on a more detailed research question that is particularly appealing to you. For example, you could quantify how well a conventional, three-channel RGB camera performs in comparison to a multi-channel, multispectral system at estimating water absorption.
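
As a concrete example of an estimation protocol you might simulate, the sketch below assumes the simple single-path attenuation model I(d, lambda) = I0(lambda) * exp(-c(lambda) * d); with rendered (or captured) images of the same target at two known distances, the attenuation coefficient follows from a log-ratio. The variable names and the two-distance protocol are illustrative assumptions, not part of the existing toolboxes.

 % Sketch: estimate a spectral attenuation coefficient c(lambda) from two
 % simulated captures of the same target at known distances d1 and d2,
 % assuming simple exponential attenuation I(d) = I0 .* exp(-c * d).
 d1 = 1.0;  d2 = 3.0;                       % camera-to-target distances (m)
 
 % radiance1 and radiance2 would come from piRender/ISETCam simulations of the
 % same target patch at the two distances (one value per wavelength). Synthetic here:
 wave = 400:10:700;
 cTrue = 0.05 + 0.3*((wave - 400)/300).^2;  % made-up "true" water attenuation
 I0 = ones(size(wave));
 radiance1 = I0 .* exp(-cTrue * d1);
 radiance2 = I0 .* exp(-cTrue * d2);
 
 % Log-ratio estimator
 cEst = log(radiance1 ./ radiance2) / (d2 - d1);
 
 plot(wave, cTrue, 'k-', wave, cEst, 'r--');
 legend('true c(\lambda)', 'estimated c(\lambda)');
 xlabel('Wavelength (nm)'); ylabel('Attenuation (1/m)');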

  • Mentor: Henryk Blasinski
  • What you will learn: How to use the iset3d, PBRT and isetcam software tools to simulate an underwater imaging system that can estimate absorption and scattering properties of water.

Optics simulation

The pointspread function is a measure of optical quality. In simulation, the wavefront aberration measured in the exit pupil (encoded as a complex pupil function) predicts the pointspread function (a real-valued function). But inverting the calculation, from pointspread to wavefront, is underconstrained and subject to different types of errors.

It is possible to use the point spread function (PSF) to estimate the wavefront aberrations of a lens when the PSF is fully sampled and the aberrations are not too severe, and when something is known about the possible solutions. For example, we may know that the wavefront aberrations are a smooth function; or that they are dominated by a few low-order Zernike modes.

We know a great deal about the distribution of the Zernike polynomials for the human eye, and we also know a great deal about many common lenses. Regularization may help a lot, and I would like to know how much and when.
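
A hedged sketch of the forward direction using the wavefront functions that ship with ISETCam/ISETBio (wvfCreate, wvfSet, wvfComputePSF, wvfGet); the exact parameter strings are from memory and may need adjusting. The inverse problem, going from the PSF back to the Zernike coefficients, is the open part of the project.

 % Forward calculation: Zernike coefficients -> wavefront -> PSF.
 % Function and parameter names follow my recollection of the ISETCam/ISETBio
 % wavefront toolbox and may need small adjustments.
 wvf = wvfCreate;                                   % default wavefront structure
 wvf = wvfSet(wvf, 'zcoeffs', 0.5, {'defocus'});    % half a micron of defocus
 wvf = wvfComputePSF(wvf);                          % compute pupil function and PSF
 psf = wvfGet(wvf, 'psf');                          % sampled pointspread function
 
 % The inverse problem -- estimating the Zernike coefficients from psf --
 % is what this project would explore (e.g., regularized nonlinear fitting).
 mesh(psf);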

Here are some other examples of when it may be possible to obtain the PSF and estimate the wavefront aberrations:

  • Microscopy: The PSF of a microscope can be used to estimate the wavefront aberrations of the microscope objective lens. This information can be used to correct the aberrations and improve the image quality.
  • Telescopy: The PSF of a telescope can be used to estimate the wavefront aberrations of the telescope optics. This information can be used to correct the aberrations and improve the resolution of the telescope.
  • Ophthalmology: The PSF of the human eye can be used to estimate the wavefront aberrations of the cornea and lens. This information can be used to correct the aberrations and improve the visual acuity of the patient.

An application for me is this: The pointspread often varies with field height (on-axis to off-axis). It is difficult to interpolate the pointspread from samples at different field heights. I have always wondered whether interpolating the wavefront aberrations at different field heights might be a better approach.

  • Mentor: Brian Wandell
  • What you will learn: Basic optics and optical characterization.


Biological Model of Color Encoding

This research group has been proposing a model of how we convert the cone encoding into a local color experience. They are an important group. I don't understand their model yet. This paper is one of a series that explains their idea. If we implement it together, we can understand it and perhaps even test it.

  • Mentor: Brian Wandell
  • What you will learn: Calculating cone absorptions and ganglion cell responses

We will use ISETBio. Below is the Abstract of the paper

  According to classical opponent color theory, hue sensations are mediated by spectrally opponent neurons that are excited by some wavelengths of light and inhibited by others, while black-and-white sensations are mediated by spectrally non-opponent neurons that respond with the same sign to all wavelengths. However, careful consideration of the morphology and physiology of spectrally opponent L vs. M midget retinal ganglion cells (RGCs) in the primate retina indicates that they are ideally suited to mediate black-and-white sensations and poorly suited to mediate color. Here we present a computational model that demonstrates how the cortex could use unsupervised learning to efficiently separate the signals from L vs. M midget RGCs into distinct signals for black and white based only on correlation of activity over time. The model also reveals why it is unlikely that these same ganglion cells could simultaneously mediate our perception of red and green, and shows how, in theory, a separate small population of midget RGCs with input from S, M, and L cones would be ideally suited to mediating hue perception.

@ARTICLE{Rezeanu2022-fx,

 title    = "How We See Black and White: The Role of Midget Ganglion Cells",
 author   = "Rezeanu, Dragos and Neitz, Maureen and Neitz, Jay",
 journal  = "Front. Neuroanat.",
 volume   =  16,
 pages    = "944762",
 month    =  jul,
 year     =  2022,
 url      = "http://dx.doi.org/10.3389/fnana.2022.944762",
 keywords = "black-and-white vision; color vision; computational neuroscience;
             midget ganglion cell; primate retina; retinal ganglion cell (RGC)",
 language = "en"

}

ML/AI-related project

Bogdan Burlacu has expressed interest in doing something in this area (which we've done in past years). I'm happy to mentor, and we haven't picked anything specific yet, but it'd be great if there were a project team of 3. So if you're interested, let him or me know.

-- David

Project Suggestions Fall 2022

Depth sensing image system

Brightway Vision loaned us a test system for 2022. Their system is a special CMOS sensor coupled with a precisely controlled light source. The system measures the time of flight of the photons from the light source back to the sensor to measure images at a series of distance ranges. We have their software for controlling the system, and we have methods for reading the data. We would like one of the class projects to be a calibration of the system. One of the cool features of the system is that - when properly controlled - it can see through fog. And, we have a fog machine!

Designer glasses with color filters

A critical topic we cover in class is the idea of color matching. This topic explains which lights appear the same to the eye, or camera, even though they are physically different. It is possible to use the principles of color matching to create special visual effects. For example, we can design glasses with color filters that make certain objects appear more similar, or that make objects which normally appear similar look more different. This project will have you use your knowledge of color vision to design color filters for different purposes and then validate your choice using the software tools in the class.

Ask us about the Pickleball problem.
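
For the design-and-validate step, here is a minimal sketch of the underlying calculation, assuming cone fundamentals and two surface reflectances on a common wavelength grid: the cone responses through a candidate filter are the filtered illuminant times the reflectance, projected onto the fundamentals, and the distance between the two response vectors tells you whether the filter pulls the surfaces apart or pushes them together. All spectra below are illustrative placeholders.

 % Sketch: how a transmissive color filter changes the discriminability of two
 % surfaces. Placeholder spectra; substitute measured reflectances and real
 % cone fundamentals for the project.
 wave = 400:10:700;  nW = numel(wave);
 S = [exp(-((wave'-440)/30).^2), exp(-((wave'-540)/40).^2), exp(-((wave'-565)/45).^2)];  % placeholder S,M,L
 illuminant = ones(nW, 1);                  % flat light
 refl1 = linspace(0.2, 0.8, nW)';           % two surfaces to compare
 refl2 = linspace(0.25, 0.75, nW)';
 
 filterT = exp(-((wave' - 550)/40).^2);     % candidate bandpass filter transmission
 
 coneResp = @(refl, T) S' * (T .* illuminant .* refl);   % 3x1 cone response vector
 
 dNoFilter   = norm(coneResp(refl1, ones(nW,1)) - coneResp(refl2, ones(nW,1)));
 dWithFilter = norm(coneResp(refl1, filterT)    - coneResp(refl2, filterT));
 
 fprintf('Cone-space separation: %.3f (no filter) vs %.3f (with filter)\n', ...
         dNoFilter, dWithFilter);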

Optics analyses

The software toolboxes in the class include methods for designing and evaluating certain simple lenses. In this project, you will design lenses to magnify and minify images. You will be asked to quantify the lenses you design by illustrating their pointspread functions, optical transfer functions, chromatic aberration, and geometric distortion. Some more complex calculations, such as illustrating the light field and the impact of placing small pieces of metal in the light path, may be part of the advanced projects.

Sensor design experiments

Image sensors have a number of different properties that impact how well they can capture light - these include pixel size, well capacity, various noise characteristics, and their color filter arrays. In this project you will be asked to use the ISETCam tools to design different types of sensors as well as acquisition policies (e.g., burst mode, exposure bracketing), and to evaluate how well the sensors, combined with these policies, perform in different imaging contexts (high dynamic range, low light levels, daytime).
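
A hedged sketch of the basic ISETCam loop for this kind of experiment: create a scene, compute the optical image, build a sensor, vary a property such as exposure time, and compute the sensor response. The function names (sceneCreate, oiCreate, oiCompute, sensorCreate, sensorSet, sensorCompute) are the standard ISETCam calls as I recall them; specific parameter strings may need checking against the current toolbox.

 % Sketch of an ISETCam sensor experiment: same scene and optics, two exposure
 % settings. Parameter strings are from memory and may need adjustment.
 scene  = sceneCreate('macbeth d65');          % standard test chart scene
 oi     = oiCompute(oiCreate, scene);          % optical image through default optics
 
 sensor = sensorCreate;                        % default Bayer sensor
 expTimes = [0.005 0.05];                      % two exposure durations (s)
 for ii = 1:numel(expTimes)
     s = sensorSet(sensor, 'exp time', expTimes(ii));
     s = sensorCompute(s, oi);
     volts{ii} = sensorGet(s, 'volts');        %#ok<SAGROW> raw sensor data
 end
 % Compare volts{1} and volts{2} (noise, clipping, dynamic range), or push each
 % through ipCompute(ipCreate, s) to see the processed images.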

Sensor calibration

These projects will teach you how to assess various camera properties that are needed to evaluate image quality. We collected the data; your role will be to analyze the data to learn about camera calibration. Click on the links below to see one of four distinct projects about camera calibration for color.

  • Color calibration
  • Camera noise
  • Lens shading

Cornell box lighting estimation

Human vision metrics

A section of the course is devoted to understanding human vision, including the optics, cone absorptions, and spatial pattern sensitivity. In this project you will be asked to use engineering metrics (ssim, s-cielab) that are designed to evaluate certain aspects of image quality. You will explore how well these metrics perform as you characterize them with simple test targets and more complex images.
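
As a starting point, MATLAB's built-in ssim can be run directly on a test image and a degraded copy; the S-CIELAB calculation is distributed with ISETCam (the scielab function), though its exact argument list is not pinned down here, so treat that as a pointer rather than a recipe.

 % Sketch: compare an image with a degraded copy using SSIM.
 % (S-CIELAB lives in ISETCam as scielab(); check its arguments in the toolbox.)
 ref   = im2double(imread('peppers.png'));        % any RGB test image
 noisy = imnoise(ref, 'gaussian', 0, 0.005);      % mild Gaussian noise
 
 [ssimVal, ssimMap] = ssim(rgb2gray(noisy), rgb2gray(ref));
 fprintf('SSIM = %.3f\n', ssimVal);
 imagesc(ssimMap); axis image; colorbar; title('Local SSIM map');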

Image processing algorithms including potentially neural networks

Some of you will be interested in image understanding, or computer vision, applications. With the explosion of neural networks to perform various vision tasks, especially convolutional neural networks, there are many opportunities to perform simple experiments. We will list more here in the future - but here are a couple of examples that might interest you now.

Facial Recognition & Deep Fakes

Humans have been obsessed with faces since before we were technically humans. Now, understanding faces has become one of the most-studied problems in AI. The tools and technology are evolving rapidly, so our ideas for projects in this area are very flexible and fluid.

As tools, we have integrated the deepface toolkit and the Faces in the Wild dataset into our isetml repo. It allows experimenting with face detection, face matching, and face verification. Projects could include evaluating how different sensors or lenses affect accuracy or other key metrics. (David and Brian)

Simulating skin reflectance

Light is absorbed and scattered in biological tissue in complex ways. The absorption and scattering are simulated using Monte Carlo methods that depend on the tissue type. Tissue types are defined by parameters that specify components such as blood, water, chromophores (tissue components that absorb light only), and fluorophores (tissue components that absorb and emit light). This project will involve using Monte Carlo simulation software packages (e.g. https://inverselight.github.io/ValoMC/ or https://omlc.org/software/mc/ ) to simulate skin reflectance and visualize the effects of changing the parameters. You will vary parameters (such as the concentration of melanin and blood oxygen saturation) and visualize how the parameters change the spectral reflectance and how these changes impact typical RGB camera sensors.
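
To make the Monte Carlo idea concrete, here is a tiny, self-contained random-walk sketch in the spirit of the omlc codes: photons take exponentially distributed steps, lose weight to absorption at each interaction, and are tallied as diffuse reflectance if they exit back through the surface. It is a toy (isotropic scattering, semi-infinite slab, no refractive-index mismatch), not a replacement for ValoMC or the omlc packages.

 % Toy Monte Carlo photon transport in a semi-infinite, isotropically scattering
 % slab. Tallies the fraction of launched weight that re-emerges (diffuse
 % reflectance). Illustrative only; use ValoMC / omlc codes for real work.
 mua = 0.1;  mus = 10;  mut = mua + mus;       % absorption, scattering (1/mm)
 nPhotons = 1e4;  Rd = 0;
 
 for p = 1:nPhotons
     pos = [0 0 0];  dir = [0 0 1];  w = 1;    % launch downward into the tissue
     while w > 1e-4
         s = -log(rand) / mut;                 % exponential free path
         pos = pos + s * dir;
         if pos(3) < 0                         % escaped back through the surface
             Rd = Rd + w;  break
         end
         w = w * (mus / mut);                  % absorb a fraction mua/mut of the weight
         cosT = 2*rand - 1;                    % isotropic scattering direction
         phi  = 2*pi*rand;
         sinT = sqrt(1 - cosT^2);
         dir  = [sinT*cos(phi), sinT*sin(phi), cosT];
     end
 end
 fprintf('Diffuse reflectance (toy model): %.3f\n', Rd / nPhotons);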

References:

I.V. Meglinski, S.J. Matcher, Computer simulation of the skin reflectance spectra, Computer Methods and Programs in Biomedicine, Volume 70, Issue 2, 2003, Pages 179-186

Aleksi A Leino, Aki Pulkkinen, and Tanja Tarvainen, "ValoMC: a Monte Carlo software and MATLAB toolbox for simulating light transport in biological tissue," OSA Continuum 2, 957-972 (2019)

Cameras for dentists (no longer available)

Alternative title: Predicting (and measuring) the visibility of tissue autofluorescence

The VELscope is an “adjunct device” that some dentists use to increase the visibility of oral lesions that may or may not be cancerous. The device is based on the theory that healthy oral mucosal tissue will fluoresce when illuminated with short wavelength light. The tissue fluorescence is attributed to FAD (flavin adenine dinucleotide), an enzyme that plays an important role in cell metabolism. The device includes a short wavelength light (peak energy at 425 nm) that illuminates the oral cavity and an eyepiece containing a longpass filter to block the reflected light. When dentists view the oral cavity through the Velscope eyepiece, they should observe the autofluorescence of healthy oral mucosal tissue. The expectation is that observation of dark spots in the oral mucosal tissue that do not emit FAD fluorescence indicates the presence and location of unhealthy oral mucosal tissue that should be further investigated.

We have a Velscope in the lab and a calibrated digital camera. The goal of this project is to model the Velscope and camera in order to calculate the expected camera sensor values for different concentrations of FAD. The data and ISETcam scripts that are necessary to predict FAD autofluorescence and to model the camera will be provided. The Velscope is also available to collect calibrated camera image data. A goal of the project is to analyze predicted and measured camera sensor data. An added bonus is to model a camera that has the same spectral sensitivities of the human eye and calculate color difference metrics for different concentrations of FAD.

References:

Joyce Farrell, Zheng Lyu, Zhenyi Liu, Henryk Blasinski, Zhihao Xu, Jian Rong, Feng Xiao, Brian Wandell, "Soft-prototyping imaging systems for oral cancer screening" in Proc. IS&T Int’l. Symp. on Electronic Imaging: Imaging Sensors and Systems, 2020, pp 212-1 - 212-7, https://doi.org/10.2352/ISSN.2470-1173.2020.7.ISS-212

Lyu Z, Jiang H, Xiao F, Rong J, Zhang T, Wandell B, Farrell J. Simulations of fluorescence imaging in the oral cavity. Biomed Opt Express. 2021 Jun 21;12(7):4276-4292. doi: 10.1364/BOE.429995. PMID: 34457414; PMCID: PMC8367257.

Project Suggestions Fall 2021

Hyperspectral camera projects

IMEC devices as an example (whatever happened to that guy Peter?)

Color cross-talk for tiny, crowded patterned pixels

Recovery of spectral estimation from the device response

Optics

Neural network solution for the RTF - explore alternatives to polynomial fitting

Machine learning

Zheng differentiable color metric

Something related to Andrey?

Human ISETBio

Tadashi illusion simulation

Curation of the Artal, Polans, and Thibos data set - with a nice web interface and explanation

Enchroma [1]

Project Suggestions Fall 2020

Overview of Image Systems Simulation Software Tools available for all class projects

Image systems simulation for camera design and evaluation

Several projects below address different parts of the same question: Can we use camera simulation and graphics rendering to create images that are effectively equivalent to real pictures taken by a modern cell phone camera?

There are various aspects of this question we need to evaluate, and the several projects in this section each address some part. Here is a link to a video describing the goal of the projects described below.

Modeling the spatial distribution of light in a 3D scene

Mentors: Zheng and Joyce

Physically-based ray tracing software can be used to model the way light is reflected off surfaces, transmitted through filters and lenses, and impinges on an imaging surface, such as a sensor array in a digital camera or the retina in the human eye. We incorporated PBRT (physically-based ray-tracing) software (developed at Stanford) into an image systems simulation programming environment in order to calculate the optical irradiance image of a 3D scene and predict the sensor images that would be captured by a calibrated digital camera placed in the 3D scene.

In the mid-1980s, researchers at Cornell University compared photographs of a real physical box with computer graphics renderings of the simulated box. The goal of this project is to compare digital camera images of a real physical box with predicted camera images of the simulated box.

A key step in achieving this goal is to model the spectral energy and the spatial distribution of the light illuminating the box and objects in the box.

Students will learn how to use image systems simulation software (specifically, ISET3d) to create different models for the spatial distribution of light within the simulated Cornell Box. These models will be used to predict the amount of light reflected from the walls of the box and from surfaces within the box. Students will compare predicted spectral radiance with measurements of spectral radiance obtained using a spectroradiometer. The digital camera images and the measurement data will be provided to the students.
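
A hedged sketch of the ISET3d rendering loop for this project; piRecipeDefault, piWrite, and piRender are the standard ISET3d entry points as I recall them, but the exact scene name for the Cornell box recipe ('cornell box' below) is an assumption that should be checked against the repository.

 % Sketch: render a Cornell-box style scene with ISET3d/PBRT, then compare a
 % simulated wall radiance with a spectroradiometer measurement.
 % The recipe name below is an assumption; check piRecipeDefault for the
 % actual Cornell box scene name in your ISET3d installation.
 thisR = piRecipeDefault('scene name', 'cornell box');
 
 % Edit the light here (spectrum, position) to test different illumination models
 piWrite(thisR);                       % write out the PBRT files
 scene = piRender(thisR);              % returns an ISETCam scene (spectral radiance)
 sceneWindow(scene);                   % inspect the rendered radiance
 
 % Pull the radiance from a region of the back wall for comparison with the
 % spectroradiometer data (the region coordinates are illustrative):
 photons = sceneGet(scene, 'photons');               % rows x cols x nWave
 roiSpectrum = squeeze(mean(mean(photons(100:120, 100:120, :), 1), 2));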

Building a camera model of the Google Pixel 4A camera

Mentors: Brian, Joyce, Zheng and Dave

In the class we will learn about the electrical characterization of image sensors, as well as means of characterizing the wavelength sensitivity properties. This project will analyze camera images obtained with the Google Pixel 4a to estimate many camera parameters. For example, we will make measurements of relative illumination due to the lens, sensor spectral quantum efficiency, and various types of sensor noise.

The mentors and students will design the experimental measurements. The mentors will acquire the data in the Packard Lab. The students will write scripts to estimate the parameters.
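
One standard piece of this characterization is the photon transfer method: from pairs of flat-field captures at several exposure levels, the difference of each pair isolates temporal noise, and the slope of temporal variance versus mean gives the conversion gain while the intercept gives the read-noise floor. The sketch below assumes you have such raw image pairs loaded as matrices; the variable names are illustrative.

 % Sketch: photon transfer analysis from pairs of raw flat-field captures.
 % flatA{k} and flatB{k} are assumed to be two raw frames taken back-to-back at
 % exposure level k (after subtracting a dark frame / black level).
 nLevels = numel(flatA);
 meanDN = zeros(nLevels,1);  varDN = zeros(nLevels,1);
 for k = 1:nLevels
     d = double(flatA{k}) - double(flatB{k});        % difference removes fixed pattern noise
     meanDN(k) = mean(double(flatA{k}(:)) + double(flatB{k}(:))) / 2;
     varDN(k)  = var(d(:)) / 2;                      % temporal variance per frame
 end
 p = polyfit(meanDN, varDN, 1);                      % variance vs. mean
 conversionGain = 1 / p(1);                          % approx. electrons per DN
 readNoiseDN    = sqrt(max(p(2), 0));                % noise floor in DN
 fprintf('Gain ~ %.2f e-/DN, read noise ~ %.2f DN\n', conversionGain, readNoiseDN);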

Evaluating the accuracy of the sensor simulation: Color and spatial metrics

Mentors: Joyce, Brian

Suppose you have used image systems engineering methods to model a camera. No model will ever be perfect. What methods would you use to test that the simulated camera was 'accurate enough' for practical use in judging the quality of the simulation? We will discuss this with you, suggest some approaches, listen to your approaches, and then implement some of them. One set of quantitative experiments will measure how accurately we capture the color of objects. Another set of experiments will measure how accurately we capture the spatial resolution of the camera.

Evaluating the perceptual accuracy: Visual Psychophysical Experiments on the Web

Mentors: Joyce

Quantitative measurements of the sensor values are a good approach to assess the validity of the simulations. Another important approach is to ask whether the real and simulated images look alike. We can only judge this perceptual similarity by having people perform experiments.

Mechanical Turk The internet provides access to a large and diverse pool of human subjects, but researchers do not typically use the internet to conduct vision experiments due to the inability to calibrate displays, control the stimulus presentation and constrain the viewing conditions. Nonetheless, there have been many attempts to conduct online visual psychophysical experiments, and this project asks you to set up an experiment using Mechanical Turk.

Similarity ratings On each trial the participant looks at a pair of images. The participant provides a number between 1 and 4 that indicates how likely it is that the two images were obtained by the same camera. We will provide pairs of images that can be used in the experiment.

As you design this experiment, consider what you might do to learn about the viewing conditions under which people make their perceptual judgements. For example, is there some way in which you can estimate the viewing distance between the display and the person, or the properties of the display (e.g., gamma, resolution)?

The goal of this project is to implement an experiment on Mechanical Turk. You should be able to demonstrate the experiment, but we do not expect you to collect and/or analyze data from the experiment.

This can be a group project that includes a survey of the literature and software that has already been developed for online vision experiments. Below are links to two relevant references, but there are many more papers and software packages to be found on the web.

  • Lavin, Silverstein and Zhang (1999), "Visual experiment on the Web," Proc. SPIE 3644, Human Vision and Electronic Imaging IV, (19 May 1999); doi: 10.1117/12.348482
  • Li, Jun, Yeatman and Reinecke (2020) “Controlling for Participants’ Viewing Distance in Large-Scale, Psychophysical Online Experiments Using a Virtual Chinrest”, Scientific Reports 10 (1) , doi: 10.1038/s41598-019-57204-1

How does semantic labeling by convolutional neural networks depend on camera parameters?

Link to video describing additional software applications and tools available for this project

Mentors: Dave and Brian

Several famous convolutional neural networks for semantic labeling are implemented in Matlab (including Googlenet and Resnet). David has built tools that enable us to (a) download labeled images from large databases, (b) implement different camera models (varying lenses and sensors) to render these images, and (c) evaluate network performance at labeling the rendered images. These projects ask you to analyze how different camera models influence neural network performance.

We expect you to use the pre-trained networks for this project. But adventurous students - or those of you who are already skilled in networks - can go further and experiment with transfer learning or fine-tuning existing nets to adapt them to a specific camera design and evaluate the results.

As a general principle, you should consider that the camera parameters may have a different impact depending on the task you are trying to achieve.
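
As a starting point, the sketch below runs MATLAB's pretrained GoogLeNet on a test image before and after adding synthetic sensor noise; in the actual project the degraded image would instead come from an ISETCam camera model. googlenet requires the Deep Learning Toolbox and its GoogLeNet support package.

 % Sketch: does added sensor noise change a pretrained network's label?
 % Requires the Deep Learning Toolbox and the GoogLeNet support package.
 net  = googlenet;
 inSz = net.Layers(1).InputSize(1:2);
 
 img   = imresize(imread('peppers.png'), inSz);      % any test image
 noisy = imnoise(img, 'gaussian', 0, 0.01);          % crude stand-in for sensor noise
 
 labelClean = classify(net, img);
 labelNoisy = classify(net, noisy);
 fprintf('Clean: %s   Noisy: %s\n', string(labelClean), string(labelNoisy));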

Spatial variation

Suppose that we change the spatial resolution of the image. Two ways the resolution might change are using diffraction limited lenses with different f-numbers and/or using sensors with different pixel sizes. Can we quantify what types of semantic classification might be influenced as the spatial resolution changes?

Sensor noise

It is expensive to reduce the sensor and pixel noise. And different sensors have different amounts of read noise, fixed pattern noises (DSNU, PRNU). Even for the same sensor, the amount of noise will differ depending on the exposure duration and luminance level. Can you find semantic categories of images for which the noise matters, and some for which the noise does not matter?

Color

The same for color. In some cases classes can be discriminated well without color using monochrome images. In some cases the three color channels are useful. Can you find examples that illustrate this principle, and can you find ways to quantify the value of color in some cases?

Depth

We can recover a lot about the scene just by looking at a depth map. Can these depth map images be used in classifiers? How well will the classifier trained on radiance images perform if we ask it to identify semantic categories based on the depth image?

As a general point, can you learn something about the network from the mistakes it makes? From what it gets right and wrong, what is your speculation about the information the network is using to make the semantic classification? This will depend on the stimuli you use in your experiment. What if your categories are oranges and lemons? What if your categories are letters of the alphabet?

Modeling the human contrast sensitivity function from fovea to periphery

Mentors: Brian

ISETBio is a software system that is comparable to ISETCam, but focused on the human eye. The current ISETBio software effectively calculates the first stage of vision - cone photoreceptor absorptions. Some aspects of visual sensitivity are determined by the photoreceptors. One of the factors that is not really known is how the changing size of the photoreceptors - they are small in the fovea and much larger in the periphery - impacts visual contrast sensitivity. We will write a calculator relevant for human space-color sensitivity.

If you are interested in the biology of the human eye and perceptual image metrics, I can show you ISETBio and some calculations we would like to implement.

Light field camera modeling

Mentors: Brian and Zheng

If you like to program, and you are interested in light fields, then this could be a good project for you.

Using ISETCam and ISET3d, we can simulate sensor data that we would obtain from light field cameras. There are a number of algorithms that people use to refocus and estimate depth with such sensor data. These algorithms are part of Donald Dansereau's software package, Light Field Toolbox. We use some of Don's functions in ISETCam. It would be nice to integrate the simulated data more closely with Don's toolbox and to use more of his algorithms. This project would extend the current light field camera scripts by adding more examples that use Don's toolbox as well as providing documentation.

One part of this project that could be particularly interesting but somewhat advanced: Can we use ISET3d simulations of a dual-pixel autofocus camera to create a depth map? How well can we do? This is a good, but advanced, project.

Project Suggestions Fall 2019

ISETCam whole system validation

Mentors: Brian, Joyce and Zheng

Set up a physical 3D scene based on the Cornell box. Use the spectrophotometer to measure the radiance inside the Ronnie Luo gray box with the diffuse light source. Calibrate a lens and camera that provide raw sensor data. Compare the simulation with the prediction. We could use a calibrated camera (e.g., the Nikon). New features of this project include lens calibration.

Convert scene and oi to Matlab objects/classes.

Cinema 4D experts? Ability to control Cinema 4D programmatically as one can do with Blender

Experimental data collection for simulation comparison

Creating the ISET3D Cornell Box images, large quantities

Optics related, say lens de-centering

ISETBio related

Geometric calibration of a camera

Mentors: Brian and Zheng

There are several online videos and software packages that describe how to measure and correct for camera lens distortion.

This project involves using calibration targets and software (see references below) to estimate the camera’s intrinsic, extrinsic and lens-distortion parameters. In the process of doing this, you will learn what these parameters are and how they are calculated.
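
The core MATLAB workflow from the references below is short; the sketch here assumes a folder of checkerboard images and a known square size (folder name and values are illustrative), and uses the Computer Vision Toolbox calibration functions.

 % Sketch: estimate intrinsics, extrinsics, and lens distortion from
 % checkerboard images (Computer Vision Toolbox). Folder and square size are
 % illustrative assumptions.
 imds = imageDatastore('calibrationImages');                 % your captures
 [imagePoints, boardSize] = detectCheckerboardPoints(imds.Files);
 squareSize  = 25;                                           % mm, measure your target
 worldPoints = generateCheckerboardPoints(boardSize, squareSize);
 
 cameraParams = estimateCameraParameters(imagePoints, worldPoints, ...
     'WorldUnits', 'mm');
 
 % Inspect reprojection error and undistort one image
 showReprojectionErrors(cameraParams);
 I = readimage(imds, 1);
 J = undistortImage(I, cameraParams);
 imshowpair(I, J, 'montage');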

References:

https://www.mathworks.com/help/vision/ug/camera-calibration.html
https://www.mathworks.com/videos/camera-calibration-with-matlab-81233.html
http://ksimek.github.io/2013/08/13/intrinsic/
http://ksimek.github.io/2012/08/22/extrinsic/
https://www.mathworks.com/help/vision/ug/stereo-camera-calibrator-app.html

Image alignment

Mentors: Brian

Basic project: Evaluate different software algorithms for their accuracy on image alignment. We can assess the algorithm performance using test images from the ISET3d software. This software starts with computer graphics models and ends up producing (a) image data, and (b) pixel-level labels that define the object location in each image (ground truth). We can use the ground truth data to measure how well the alignment algorithm performs. We can also use the ISET3d software to generate images simulated from cameras with different types of lenses, different sensors, and different types of motion (global or object-level motion).

Useful for students interested in image processing (alignment algorithms) or computer graphics (ISET3d).

Learning the image processing pipeline

There are a number of papers that describe methods of learning how to map from raw sensor data to jpg values. Several are referenced here, including work from Stanford (L3). We can provide you with raw data and image processed data, and you can experiment with training neural networks to perform the transformation. Also, the DeepISP folks have some images from Samsung.

  • Papers from Milanfar at Google (e.g. Blade).

Designing and evaluating illumination systems for scientific and industrial imaging applications

Mentors: Joyce and Zheng

There are many scientific and industrial imaging applications that combine an imaging sensor with ring light illumination. Ring lights are designed to surround a camera. Depending on where the camera is placed, the illumination can be non-uniform and can produce the least amount of light at the center of the imaging area. It is sometimes possible to calibrate and correct for the non-uniform illumination, but for many ring light configurations the calibration cannot compensate when the central imaging region is not receiving enough light.

This project involves using open-source Matlab code for modeling and comparing different lighting configurations. It may be ideal for someone who is taking the class remotely.

Reference: Fhionnlaoich et al. (2019). Optimising Light Source Positioning for Even and Flux-Efficient Illumination. Journal of Open Source Software, 4(37), 1392. https://doi.org/10.21105/joss.01392

Designing a multispectral light to excite tissue fluorescence

Mentors: Joyce and Zheng

The project will help us build a better light source system for the fluorescence measurements. We want a system that can (a) easily switch between light sources with different wavelengths, and (b) improve the uniformity of the light illuminating the mouth. This could be a good project for someone with a background in mechanical engineering and optics.

Basic project: Set up the system baseline, design the lens/optics components, and validate system performance. The skills will involve (a) getting hands-on experience working and designing on an optical table, (b) using a spectrophotometer, and (c) learning how to engineer an optical path.

Measuring tissue fluorescence

Mentors: Joyce and Zheng

We are acquiring data about the fluorescence arising from different parts of the human mouth. We measure fluorescence by illuminating the mouth using a short wavelength (blue) light, and then making spectral photometric measurements of the light emitted from different locations. Even though the illuminant contains only, say, 400 nm light, the emitted light contains energy at 450 to 600 nm. These wavelengths are the fluorescent, rather than reflectance, signal.

This project will be to help us acquire more data, annotate the data, and place them in a database. Further, it will involve using software we have recently added to ISETCam to separate the light into the parts that are fluorescence and reflectance. We will also try to use statistical methods to characterize differences between people and differences between measurements made from different parts of the mouth. For example, can we use principal components and k-means clustering algorithms to understand more about the data.

Basic project: Collect spectral photometric measurements, interact with the database, and perform the fluorescence estimation. The skills will involve learning how (a) to use a spectrophotometer, (b) work with human participants to obtain measurements, and (c) design and record data for a reproducible experiment.
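
For the statistical part, here is a minimal sketch assuming the measurements are collected into a matrix with one spectrum per row: pca and kmeans (Statistics and Machine Learning Toolbox) give a first look at the dominant components and at whether measurements cluster by mouth location or by person. The variable names and the choice of k are illustrative.

 % Sketch: explore a matrix of measured spectra (one spectrum per row) with PCA
 % and k-means. 'spectra' is assumed to be nMeasurements x nWavelengths.
 [coeff, score, latent] = pca(spectra);        % principal components
 explained = 100 * latent / sum(latent);
 fprintf('First 3 PCs explain %.1f%% of the variance\n', sum(explained(1:3)));
 
 k = 3;                                        % e.g., tongue / cheek / gum
 idx = kmeans(score(:, 1:3), k, 'Replicates', 10);
 
 gscatter(score(:,1), score(:,2), idx);        % clusters in PC space
 xlabel('PC 1'); ylabel('PC 2');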

Simulating and designing a camera to measure tissue fluorescence

Mentors: Joyce and Zheng

A spectrophotometer is an expensive and challenging instrument to use. Moreover, it does not acquire a full image but only measures the light from a small part of the image. It would be desirable to build a camera that measures an entire image and to estimate the fluorescence from such an image. The goal of this project is to design a camera that can estimate tissue reflectance and fluorescence as well as the spectrophotometer.

There are several different aspects to this project.

Simulation: We will teach you how to simulate cameras with different spectral properties and predict their output for different types of spectral inputs. We will also teach you how to compare the performance of a real and simulated camera. You will have opportunities to improve the simulations and the performance evaluation. There are also opportunities for developing and testing algorithms for estimating tissue reflectance and fluorescence from a camera that has only 2 or 3 spectral channels. Simulation has the advantage of having ground truth data. Hence optimization methods and machine learning are both possible.

Software enhancements for a prototype camera: We will show you how to use a prototype camera that includes a UV and a broadband light, customized filters and an imaging sensor. You have opportunities to develop software that improves how the camera captures, transmits and analyzes the camera images.

Mechanical design: There are many opportunities to improve on how the camera is used. For example, one could redesign the form factor so that it is easier to capture images of the mouth. Or, one could design an apparatus that keeps a person’s head fixed while the camera is positioned to capture images of different parts of the mouth.

Eye movements and visual acuity

Mentor: Brian Wandell

ISETBio [2] estimates the photoreceptor current responses in the presence of eye movements. This project uses ISETBio to produce (rectangular grid) photoreceptor responses to a slanted edge pattern. You then run the ISO12233 code on the output to calculate the MTF. The purpose of the project is to vary the amplitude and nature of the eye movements (using the eye movement model in ISETBio) and to show the impact of the eye movements on the MTF.

One group might make this comparison in foveal regions and a second group could do the calculation for peripheral retina, where the cone apertures are much bigger.

Calculate retinal ganglion cell responses

Mentor: EJ Chichilnisky

The ISETBio software is designed to predict responses in the human eye. It works with ISETCam.

Any camera or sensor provides an (imperfect) representation of a scene, from which we often try to reconstruct the scene itself. For example, cameras use three color sensors to represent the spectral properties of the environment: the reconstruction of the spectrum from these three sensors is far from complete, but if the sensors and the reconstruction algorithm are designed well, the reconstruction may be sufficient to represent the scene in a useful way.

The retina is a sensor that transmits to the brain an imperfect representation of the spatial properties of a scene, and a retinal prosthesis provides an even less perfect representation. Recent work attempts to describe how the visual scene may be reconstructed linearly from normal retinal activity (Brackbill et al), how machine learning methods may enhance such a reconstruction (Parthasarathy et al), and how this kind of reconstruction can be helpful in reasoning about how to make an effective retinal prosthesis (Golden et al).

Develop software tools to reconstruct the spatial properties of natural visual scenes from the output of the retina. Simulate retinal output using ISETBio. Perform a linear reconstruction of the scene using the logic of Brackbill et al. Consider whether simple nonlinearities (e.g. a nonlinear lookup table) might improve the reconstruction. Consider whether a different retinal encoding would support more accurate reconstruction. Consider what image metrics should be used to evaluate the quality of the reconstruction. If a retinal prosthesis could activate some of the neurons, but not all, how good would the representation be, and could the reconstruction take into account the missing cells to improve the reconstruction?
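
The linear step in that pipeline is a least-squares regression from simulated retinal responses to stimulus pixels; a minimal sketch, assuming a training matrix of RGC responses and the corresponding stimuli, is below (the variable names are illustrative).

 % Sketch: linear reconstruction of stimuli from simulated retinal responses.
 % R_train: nTrials x nCells   (e.g., spike counts from an ISETBio/RGC simulation)
 % S_train: nTrials x nPixels  (the stimuli that produced them)
 W = pinv([R_train, ones(size(R_train,1),1)]) * S_train;   % affine decoder
 
 % Reconstruct held-out stimuli and measure error
 S_hat = [R_test, ones(size(R_test,1),1)] * W;
 rmse  = sqrt(mean((S_hat(:) - S_test(:)).^2));
 fprintf('Reconstruction RMSE: %.3f\n', rmse);
 
 % A simple static nonlinearity on S_hat (e.g., a lookup table fit on training
 % data) is one of the extensions the project suggests exploring.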

Here are some references of people working on this topic.

  • Brackbill et al (draft paper, incomplete, but OK to share with just this class)[3]
  • Parthasarathy et al [4]
  • Golden et al [5]

Wavefront aberrations and point spread functions

Mentor: Brian Wandell

ISETBio (and ISETCam) have functions that calculate the point spread function from the wavefront aberration of an optical system. The calculation amounts to taking the squared magnitude of the Fourier transform of the pupil function defined by the aberration. In addition, these systems often summarize the wavefront aberration using the Zernike polynomial coefficients. These coefficients define the wavefront aberration on a circular support region.
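
For reference, the forward relation being computed (under the standard Fourier-optics assumptions) is:

 \[
 \mathrm{PSF}(x,y) \;=\; \Bigl|\,\mathcal{F}\bigl\{\,P(u,v)\,e^{\,i\,2\pi W(u,v)/\lambda}\bigr\}\Bigr|^{2},
 \qquad W(u,v) \;=\; \sum_j c_j\, Z_j(u,v),
 \]

where P(u,v) is the pupil aperture function, W(u,v) the wavefront aberration, Z_j the Zernike polynomials, and c_j the Zernike coefficients. The project is about inverting this map: recovering the c_j from a sampled PSF.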

While the forward calculation (from wavefront aberration to PSF) is straightforward, returning from the PSF to the wavefront (in terms of Zernike polynomials) is not straightforward. I would like to be able to get a reliable estimate of the wavefront (as expressed in terms of Zernike polynomials) from a point spread. I have started, but not yet succeeded. If you like working on these kinds of calculations, let's do this project. It will involve laying out the math, writing the code, and testing the code with different examples.

It would be best if the math was straightforward. But if you really want to do this by training a neural network to do it, I could help you with that by producing many examples. But, really, ...

Avian vision simulation

Mentor: Henryk Blasinski

Different species have evolved in different environments, and as a consequence their vision systems have adapted to specific properties of those environments. For example, Tedore and Nilsson (Nature Communications, 2019) postulate that birds developed additional photoreceptors sensitive in the UV wavelength range because these improve their perception of leaves. The authors conducted a large number of simulations predicting the appearance of different forest scenes in different light conditions. These simulations use basic models of light propagation and interactions with objects in a scene.

Basic project: The goal of this project is to reproduce and extend the results from Tedore and Nilsson, but using PBRT and ray tracing. This environment allows you to create geometrically complex scenes and to model many types of interactions between objects. Some of the extensions could include:

1. Creating a 3D model of a forest

2. Varying the reflectance and transmission properties of leaves.

3. Modifying ambient illumination (time of day, season, etc.)

4. Evaluation of different photoreceptor sensitivity curves, and their impact on a vision task.

Reference: Cynthia Tedore, Dan-Eric Nilsson, 'Avian UV vision enhances leaf surface contrasts in forest environments,' Nature Communications, 2019; https://www.nature.com/articles/s41467-018-08142-5

Quantifying subjective judgments about image quality

Mentor: Joyce Farrell

The method of pairwise preference judgments generates reliable and informative data about the relative quality of two images. Several subjects compare two images to each other. The percentage of the time one image is preferred over the other is used as an index of the relative quality of the two images. The disadvantage of this method is that it requires many comparisons, typically ten or so for every pair of images.

Silverstein and Farrell (2001) proposed a method to reduce the number of pairwise preference judgments by selecting a subset of pairwise comparisons. Instead of comparing every pair of images (the complete method), a partial method is used that makes more comparisons between images that have similar quality values than between images that have very different quality values. A sorting algorithm is used to efficiently order the images with paired comparisons, and each comparison is recorded. When the sorting is completed, more trials will have been conducted between images that have similar quality value than images that have very different quality values. Regression is used to scale the resulting comparison matrix into a one dimensional perceptual quality estimate.

The method assumes that the images can be positioned on a one dimensional quality line. Given this assumption, it uses an efficient sorting method to reduce the number of preference judgments necessary to quantify the quality of each image.

This project uses simulations to test the method and the assumptions upon which it relies. An added bonus would be to include a method for testing the assumption that the images can be positioned on a one dimensional quality line (hint: look for violations of transitivity).
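
The scaling step at the end can be as simple as a Thurstone Case V solution: convert the win proportions in the comparison matrix to z-scores and average. A minimal sketch, assuming a matrix of pairwise counts, is below (norminv needs the Statistics and Machine Learning Toolbox); counting intransitive triads in the same matrix is one way to test the one-dimensionality assumption mentioned above.

 % Sketch: Thurstone Case V scaling of a pairwise-comparison count matrix.
 % counts(i,j) = number of trials on which image i was preferred over image j.
 % Untested pairs (zero trials) would need extra handling; ignored here.
 nTrials = counts + counts';                 % trials per pair
 p = counts ./ max(nTrials, 1);              % proportion i preferred over j
 p(1:size(p,1)+1:end) = 0.5;                 % an image vs. itself is a coin flip
 p = min(max(p, 0.01), 0.99);                % avoid infinite z-scores
 z = norminv(p);                             % standard normal inverse
 quality = mean(z, 2);                       % one-dimensional quality estimate
 quality = quality - min(quality);           % anchor the scale at zero
 
 % Transitivity check idea: count triads where i beats j and j beats k (by
 % majority preference) but k beats i; many such triads violate the 1-D assumption.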

All material necessary to accomplish this project is provided in this paper: “Efficient method for paired comparison” by D. A. Silverstein and J. E. Farrell (2001) [6]

Project Suggestions Fall 2018

Oral health camera design

Mentors: Joyce and Zhenyi

We are acquiring data about the fluorescence arising from different parts of the human mouth. We measure fluorescence by illuminating the mouth using a short wavelength (blue) light, and then making spectral photometric measurements of the light emitted from different locations. Even though the illuminant contains only, say, 400 nm light, the emitted light contains energy at 450 to 600 nm. These wavelengths are the fluorescent, rather than reflectance, signal.

This project will be to help us acquire more data, annotate the data, and place them in a database. Further, it will involve using software we have recently added to ISETCam to separate the light into the parts that are fluorescence and reflectance. We will also try to use statistical methods to characterize differences between people and differences between measurements made from different parts of the mouth. For example, can we use principal components and k-means clustering algorithms to understand more about the data.

Basic project: Collect spectral photometric measurements, interact with the database, and perform the fluorescence estimation. The skills will involve learning how (a) to use a spectrophotometer, (b) work with human participants to obtain measurements, and (c) design and record data for a reproducible experiment.

A more advanced aspect of this project: A spectrophotometer is an expensive and challenging instrument to use. Moreover, it does not acquire a full image but only measures the light from a small part of the image. It would be desirable to build a camera that measures an entire image and to estimate the fluorescence from such an image. If we know a lot about the reflected light (see the Basic project), we might be able to design a calibrated camera that acquires enough data to estimate the fluorescence throughout an entire image. The goal of this project is to design and simulate the amount of light, the spectral character of the light and the camera, and the image processing software to embed in such a camera.

Computer graphics asset creation and rendering

Mentors: Zhenyi and Trisha

A great deal is known about the illuminants and reflectance spectra of typical objects within the visible range. Much of this knowledge was obtained because people needed it to design effective consumer cameras. With the increasing use of cameras for machine vision applications, it is becoming increasingly valuable to learn about the reflectance and illumination beyond the visible wavelengths, extending to the band gap of CMOS imagers (about 1000nm). This data could be used to guide a range of automotive and drone applications.

Basic project: Use a spectral photometer to collect spectral reflectance samples of objects that extend across the wavelength range to 950nm or 1000nm. This project involves creating a methodology for (a) acquiring images, (b) acquiring spectral data from identified image locations, and (c) measuring the illumination and reflectance from these locations. We then need a method for storing and retrieving the data using our database.

Advanced part I: Search the web for existing databases with material reflectance that extends into the long-wavelength (near infrared) regions. Create models of the spectra using principal components methods, k-means clustering algorithms, or other data science tools.

Advanced part II: Create computer graphics renderings of the optical image of driving scenes using reflectance data and a camera that is specified all the way into the near infrared.

Exploring rendering algorithms using machine-learning (e.g., L3)

Mentors: Zheng and Brian

This project would be great for anyone interested in training small neural networks. Using ISETCam, we can create a great many sensor images of approximately natural scenes. We are interested in creating these sensor images using different types of sensors (e.g., standard Bayer and a Bayer with a white pixel rather than two green pixels).

For this project, we would like you to try to train a neural network that converts the data from one type of sensor into another.

Basic project: We will provide you with a set of ISETCam scenes to use for this project. You can use ISETCam methods to predict sensor responses from, say, a simple RGB Bayer camera. Then you use these same methods to calculate the predicted responses from a modified version of that camera. A first modification would be to double the spatial resolution of the camera. Because the images are simulated, they will be pixel-wise aligned in the sense that a 4x4 region of the lower resolution camera will correspond to an 8x8 region of the high resolution camera. You can use a tool (e.g., pyTorch or TensorFlow) to find a mapping from the low resolution to the high resolution image. Different methods for designing and building the network - such as autoencoder methods [7] - might be applied.

Many versions of this project might be tried, such as predicting a monochrome sensor response from an RGB response, or - if you dare! - an RGB from a monochrome given some environment (fruits). Or predicting the sensor responses under daylight illumination from the sensor responses under tungsten illumination. Or predicting the sensor responses of an image without camera-shake from the sensor responses of an image with camera-shake. Or predicting the sensor responses for a high illumination capture (bright light, 15 ms exposure) from a capture at low illumination (dim light, 15 ms exposure).

Spatial CIELAB vs ISETBio and only front-end physiological optics

Mentors: Trisha and Brian

CIELAB Delta E is a color difference metric that measures the similarity of two colors to a human observer. Although widely used, the CIELAB metric is only suitable for measuring the color difference of large uniform color targets. Therefore, the spatial CIELAB metric was created to extend the Delta E metric to color images instead of uniform patches. This is necessary because color discrimination and appearance are a function of spatial pattern; spatial CIELAB was therefore designed to take into account the spatial-color sensitivity of the human eye.

We would like someone to use ISETBio to create L,M,S receptor responses to images. We would then calculate ISETBio-CIELAB differences based on these L,M,S values and compare them with Spatial CIELAB differences computed directly from a display screen. Since the L,M,S values calculated through human optics should already take into account part of the spatial-color sensitivity of the human eye, we are interested to see any similarities or differences between these two calculations. The critical aspect of this project is designing test targets.

You will learn: How to use ISETBio to calculate L,M,S values and how to calculate CIELAB values and Spatial CIELAB values.

Advanced Project: Add in eye movements to the calculation and calculate based on the mean response that incorporates eye movements.

Human Optics as a function of eccentricity

Mentors: Trisha

This project would be great for someone who is interested in optics and optical modeling software.

ISET3d is an extension of ISETCam that allows users to simulate 3D scenes and realistic lens prescriptions using ray-tracing and computer graphics. Using ISET3d, we have the ability to simulate a physiological model of human optics, which allows us to predict the optical image after a 3D scene is passed through the optics of the human eye. However, there are many different models of the human eye, which all differ in detail and complexity.

We would be interested in using either ISET3d or other optical modeling software to quantify the off-axis (e.g. wide-angle) performance of several human eye models. We can do this by calculating optical images at different angles away from the center of the retina and quantifying the modulation transfer function or point spread function at each of these locations. We can then compare their performance with known values in the literature.
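As a small illustration of the quantification step, here is a hedged numpy sketch that converts a sampled point spread function (for example, one exported from ISET3d as a 2D array with known sample spacing) into a one-dimensional modulation transfer function. The sampling grid and units are assumptions.

 import numpy as np
 
 def psf_to_mtf(psf, dx_mm):
     """psf: square 2D array of irradiance samples; dx_mm: sample spacing in mm."""
     psf = psf / psf.sum()                                   # normalize so MTF(0) = 1
     otf = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(psf)))
     mtf = np.abs(otf)
     freqs = np.fft.fftshift(np.fft.fftfreq(psf.shape[0], d=dx_mm))  # cycles/mm
     c = psf.shape[0] // 2
     return freqs[c:], mtf[c, c:]          # 1D slice along one frequency axis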

You will learn: How to use ISET3d to model the optics of the human eye.

Advanced Project: Some physiological models of the eye have accommodation (focusing) modeling as well. In other words, we can change the lens prescription to model the human eye focusing near and far. Can we quantify the difference in these accommodation models?

Project Suggestions Fall 2017

Simulation of Cone Responses for Photosensitive Epilepsy

Patients with photosensitive epilepsy can get seizures from being exposed to flashing lights. Certain frequencies and colors are highly epileptogenic, and in 1997 one Pokémon episode resulted in seizures and hospital visits for over 600 children in Japan. Red flickering lights in particular can be very provocative (Takahashi and Tsukahara 1976; Binnie et al., 1984). It is hypothesized that the red flashes are highly epileptogenic because they stimulate only the red (L) cones (Harding 1998). A recent study that investigated effects of age and gender similarly found that red stimuli are much more likely to induce epileptic activity. Can we simulate how the cone responses differ across different colored filters?

This project will use a toolbox called ISETBIO to simulate cone responses. ISETBIO is analogous to ISET, but specifically simulates the human visual system: from a stimulus, through the optics of the eye, onto the retina and photoreceptors, and eventually into the retinal ganglion cells. We can use ISETBIO (1) to set up a simulation with different colored filters and (2) to analyze cone responses to stimuli that cause epilepsy.

Mentor: Dora Hermes

Speeding up lens simulations in a ray-tracer

Instead of using ISET to simulate the optics of an imaging system, we have the option of using a graphics ray-tracer to trace rays through a full optical lens system. This allows us to model full 3D scenes in our simulations instead of the flat 2D scenes used in ISET.

In our work, we use an open-source ray-tracer called PBRT (Physically Based Ray Tracer) that we've modified to trace rays through a given optical system. We shoot a ray from the camera sensor and use Snell's law to refract the ray through each surface in the lens system. However, this type of ray-tracing can be very slow and would benefit greatly from a speed-up. One possibility is to precompute ray paths and load them in at rendering time.
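The core operation repeated for every surface crossing is refraction. The following numpy sketch shows the vector form of Snell's law used in such tracers; the sign conventions and surface representation here are illustrative assumptions, not PBRT's actual code.

 import numpy as np
 
 def refract(d, n, n1, n2):
     """Refract unit direction d at a surface with unit normal n (facing the
     incoming ray), passing from refractive index n1 to n2.
     Returns None on total internal reflection."""
     d = d / np.linalg.norm(d)
     n = n / np.linalg.norm(n)
     eta = n1 / n2
     cos_i = -np.dot(d, n)
     sin2_t = eta**2 * (1.0 - cos_i**2)
     if sin2_t > 1.0:
         return None                       # total internal reflection
     return eta * d + (eta * cos_i - np.sqrt(1.0 - sin2_t)) * n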

In this project, we explore various methods to speed up the lens simulation in a ray-tracer. An ideal student would have some working knowledge in C++ and an interest in computer graphics.

Mentor: Trisha Lian

Modeling a cell phone camera pipeline

The good folks at Google wrote a paper describing how they make high quality images on a cell phone camera. The paper is included on our Canvas web-site.

Burst photography for high dynamic range and low-light imaging on mobile cameras. Hasinoff et al., ACM Trans. Graph. Vol. 35, No 6. Article 192 (2016).

For some of the projects, we can divide up different parts of the image processing pipeline described in this paper and simulate the expected results using the ISET tools. The critical simulation concerns the acquisition of many brief images, alignment of these images, and combining the results into a high quality result. Let’s see how far we can get in doing an assessment of their burst photography design with software simulation tools.
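As a very rough illustration of the align-and-merge idea (not Google's actual pipeline), here is a hedged OpenCV sketch that registers a burst of short-exposure frames to the first frame with phase correlation and averages them to reduce noise.

 import numpy as np
 import cv2
 
 def merge_burst(frames):
     """frames: list of 2D float32 arrays (short-exposure captures of a static scene)."""
     ref = frames[0]
     merged = ref.astype(np.float64)
     h, w = ref.shape
     for frame in frames[1:]:
         (dx, dy), _ = cv2.phaseCorrelate(ref.astype(np.float64), frame.astype(np.float64))
         shift = np.float32([[1, 0, -dx], [0, 1, -dy]])      # undo the estimated shift
         merged += cv2.warpAffine(frame, shift, (w, h)).astype(np.float64)
     return merged / len(frames)                             # noise drops roughly as sqrt(N)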

Camera properties and machine-learning algorithm performance

There are two thoughts about image sensors and machine-learning algorithms. One group of people thinks that the algorithms will run across any type of camera. Another group thinks that changing the camera optics and sensor may have an impact on the algorithm performance.

It is likely that the truth is somewhere in between. Some optics and sensor changes will have an impact on some types of algorithms. But we are not aware of any systematic studies that have examined how changing camera parameters influences the performance of convolutional neural networks (CNNs).

We can use the ISET tools in this class to simulate images obtained by cameras with optics and sensors. Those of you who are interested or skilled in machine-learning for image classification or object detection can create a project to evaluate how well a CNN trained for one camera will generalize to images obtained from a different camera.

Cell phone camera variation

Problem: There is a lot of interest in testing the image quality of smartphones, made especially relevant by the DxOMark rankings, which are often cited by the press and by phone manufacturers as a measure of the image quality of a particular model of smartphone. However, out of necessity, only one or a few examples of each phone can be exhaustively tested. That raises the question of how much unit-to-unit variation affects the scores, and whether there are correlations in that variance related to sensor model, specific lens, or smartphone brand or price.

Suggested Project: Create a crowd-sourced experiment where volunteers (passers-by?) could take a photo of a test target and send in the result. Then analyze the data to attempt to determine, through some combination of data analysis and machine learning, how much variation there is between multiple samples of a particular model, and whether that varies with brand, price, sensor, optics, or some other potentially surprising factor.

Mentor: David Cardinal, http://www.cardinalphoto.com

Sensor Calibration and Simulation

ISET makes it possible to predict the output of an imaging sensor, given a set of sensor simulation parameters. The simulation parameters are derived from a few fundamental measurements that characterize sensor spectral sensitivity and electrical properties including dark current, read noise, dark signal non-uniformity and photoresponse non-uniformity. This project will involve making these measurements and deriving the simulation parameters for a camera that is in our lab. Calibration targets, measurement equipment, and software programs will be provided. There are also ISET scripts that describe the measurement methods and calculate the sensor parameters.

s_sensorEstimation.m illustrates how to measure the spectral response of a digital camera.
s_sensorAnalyzeDarkVoltage illustrates how to measure dark noise.
s_sensorPixelReadNoise.m illustrates how to measure pixel read noise.
s_sensorSpatialNoiseDSNU.m illustrates how to measure the DSNU of a sensor array.
s_sensorSpatialNoisePRNU.m illustrates how to measure PRNU.

You can use the measured and known sensor parameters to predict the RGB camera values for a color calibration target. You can then compare the predicted RGB camera values to the actual RGB camera values of the target, taken from a specific camera.
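The prediction step is essentially a wavelength-by-wavelength integration. Here is a minimal numpy sketch, assuming the sensitivities, illuminant, and target reflectances have already been measured and resampled to a common wavelength grid.

 import numpy as np
 
 def predict_rgb(sensitivities, illuminant, reflectances, gain=1.0):
     """
     sensitivities: (W, 3) R/G/B spectral sensitivities
     illuminant:    (W,)   spectral power distribution of the light source
     reflectances:  (P, W) one row per target patch
     Returns (P, 3) predicted linear camera values (before noise and quantization).
     """
     radiance = reflectances * illuminant        # light reflected by each patch
     return gain * radiance @ sensitivities      # integrate over wavelength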

References: http://scien.stanford.edu/jfsite/Papers/ImageCapture/Farrell_Okincha_Parmar.pdf

Mentor: Joyce Farrell

Geometric calibration of a stereo camera

There are several online videos and software packages that describe how to measure and correct for camera lens distortion, and how to estimate the size and location of objects based on the images the objects project onto two cameras in a stereo configuration.

This project involves using calibration targets and software (see references below) to estimate the camera’s intrinsic, extrinsic and lens-distortion parameters. In the process of doing this, you will learn what these parameters are, how they are calculated, and how the accuracy of the estimated parameter values affects the accuracy of object size and distance predictions.

References: https://www.mathworks.com/help/vision/ug/camera-calibration.html https://www.mathworks.com/videos/camera-calibration-with-matlab-81233.html http://ksimek.github.io/2013/08/13/intrinsic/ http://ksimek.github.io/2012/08/22/extrinsic/ https://www.mathworks.com/help/vision/ug/stereo-camera-calibrator-app.html
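The references above use the MATLAB toolboxes, but an equivalent intrinsic calibration can be sketched with OpenCV. In the sketch below, the checkerboard size and the image folder name are hypothetical.

 import glob
 import numpy as np
 import cv2
 
 pattern = (9, 6)                                  # inner checkerboard corners (assumed)
 objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
 objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)
 
 obj_points, img_points = [], []
 for fname in glob.glob('calib_images/*.jpg'):     # hypothetical image folder
     gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
     found, corners = cv2.findChessboardCorners(gray, pattern)
     if found:
         obj_points.append(objp)
         img_points.append(corners)
 
 err, K, dist, rvecs, tvecs = cv2.calibrateCamera(
     obj_points, img_points, gray.shape[::-1], None, None)
 print('Reprojection error:', err)
 print('Intrinsic matrix:\n', K)
 print('Distortion coefficients:', dist.ravel())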

Mentors: Joyce Farrell and Trisha Lian

Depth from Stereo Images

Database of synthetic stereo images

The Middlebury Stereo dataset is a collection of stereo images with “ground truth” disparities or depth maps. Researchers and students have used datasets from this collection to compare different methods for estimating depth from stereo images. The depth maps are inherently noisy because they are empirically measured using range-sensing devices or structured lighting.

This project will use our lab software to create a new database of synthetic stereo camera images and associated depth maps. You can modify the scene properties of a scene, position the cameras in the scene, modify the baseline distance separating two cameras, and modify properties of the optics and sensors in the two cameras.

Mentor: Trisha Lian

Stereo algorithm assessment

As a related project, a cooperating group might run depth estimation algorithms that are already published on the web (see, for example, functions in OpenCV) and learn about how camera parameters such as baseline separation, optics, and/or sensor resolution affect the accuracy of the depth estimation algorithms.
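As a starting point, here is a minimal OpenCV sketch that computes a block-matching disparity map from a rectified stereo pair and converts it to depth with Z = f * B / d; the matcher parameters are arbitrary defaults, not tuned values.

 import numpy as np
 import cv2
 
 def depth_from_stereo(left_gray, right_gray, focal_px, baseline_m):
     """left_gray/right_gray: rectified 8-bit grayscale images."""
     matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
     disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
     disparity[disparity <= 0] = np.nan            # mark invalid matches
     return focal_px * baseline_m / disparity      # depth in meters, Z = f * B / d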

Project Suggestions Fall 2016

RealSense 3D-imaging

The following are some project ideas that involve the real-time RGB-D imaging technologies using RealSense dev kits. We will provide both RealSense SR300 (short-range depth-camera based on coded-light technique) and LR200 (long-range version based on IR-assisted stereo-3D technique).

Projected Texture Stereo

RealSense LR200 module uses a projected texture stereo system. Measure and model the system’s optical properties and implement techniques for generating high-quality pattern texture projectors, as outlined in published work. Mentor: Leo Keselman

Computational Photography

Using depth maps from either the RealSense SR300 or LR200, create examples of depth-of-field blur, tilt-shift effects, and other post-processing effects. Mentor: Leo Keselman

Stereo Algorithms

RealSense LR200 hardware produces depth maps using stereo matching algorithms. However, it also provides left and right images. Implement, test and design alternative stereo matching algorithms, and compare them with the results produced by the built-in algorithms in the LR200 ASIC and accessed through the API. Mentor: Leo Keselman

Visual Odometry

There exist many techniques for estimating camera position from an image. The RealSense SR300 and LR200 provide both rectified images and depth maps. With these, a wide range of techniques, from ICP to three-point-pose RANSAC, can be used to implement 3D scanning of large environments. Implement such a system. Mentor: Leo Keselman

Image systems simulation

Autonomous vehicle sensors: Forensic analysis of the fatal Tesla car crash

On May 7, 2016, a 40-year-old man was killed when his Tesla crashed in Florida. There are many articles describing the accident and speculating about the cause. For example, Tesla reported that “Neither Autopilot nor the driver noticed the white side of the tractor trailer against a brightly lit sky, so the brake was not applied.”

The Tesla car had a Mobileye system that includes several cameras and an image processing module. There is enough known about the imaging sensors in the Mobileye system to predict the images the sensors would have captured for different types of scenes.

This class project will use the ISET digital camera simulation software to model different scenes and image sensor parameters (e.g. exposure duration and video rate). Extra bonus points if you use machine learning (e.g., an SVM) to determine whether a system can detect the difference between different types of scenes. For example, what type of imaging sensor is required to detect the difference between the “white side of a tractor trailer” and “a brightly lit sky”?

References:

Inside the Self-Driving Tesla Fatal Accident, by Anjali Singhvi and Karl Russell, NYTimes, July 12, 2016
Tesla faults brakes, but not autopilot, in fatal crash, by Neal Boudette, Business Day, July 29, 2016
Mobileye EMP evaluation platform
Fatal crash prompts federal investigation of Tesla self-driving cars, by Sam Thielman, The Guardian, July 13, 2016
Autopilot 2.0 adds more sensors to be better than ever, report says, by Chris Mills, BGR, Aug 11, 2016
Tesla Autopilot 2.0: retrofit to next gen sensors likely to be available for some owners, Fred Lambert, electrek, August 6, 2016
Tesla Autopilot 2.0: next gen Autopilot powered by more radar, new triple camera, some equipment already in production, Fred Lambert, electrek, August 11, 2016
Researchers trick Tesla Model S Autopilot, Brandon Turkus, Autoblog, Aug 4, 2016
Another crash on Tesla autopilot, another driver admits to not paying attention, was cleaning his dash, by Fred Lambert, electrek, August 19, 2016
Tesla Model S, Wikipedia
Understanding the fatal Tesla accident on Autopilot and the NHTSA probe, Fred Lambert, July 1, 2016
“WTF is the deal with driverless car guru George Hotz’s Comma Points?”, by Joe Carmichael, July 7, 2016
Uber and Volvo partner up, robot ride-sharing starts this summer, by Jonathan Gitlin, ARS Technica, Aug 18, 2016

Comma.ai startup in SF; Drive.ai startup in SF; Nauto – startup in Palo Alto

Learning a driving simulator, by Eder Santana and George Hotz

Mentor: Joyce Farrell

360 Camera Capture Simulation

The recent popularity of head mounted displays and VR has increased interest in constructing 360 cameras that can capture and render stereo panoramas. A couple of recent examples include Facebook's Surround360 and Nokia's OZO camera.

With a combination of a customized ray-tracing renderer (PBRT-spectral) and a MATLAB toolbox to control it (RenderToolbox3), we have the ability to simulate 360 cameras in a 3D virtual scene created in a modeling program such as Blender. To do this, we specify the distribution of cameras, their lenses, focus, FOV, etc., and take a "snapshot" of the virtual scene. For example, we can place 6 virtual cameras in a circle with a 1 foot radius, attach wide angle lenses to all cameras, and take images from each camera. Because the scene is virtual, we also have access to the ground-truth depth map and the true panorama.

This project will focus on using these simulation tools to evaluate either the 360 stereo stitching algorithms or the design of the camera itself.

Note: Facebook's stitching code for its Surround360 camera is now open source and on GitHub.

Some potential ideas to start with:

1. Can you design a database of virtual scenes that can help evaluate the effectiveness of 360 stereo stitching algorithms? This would include constructing a variety of scenes with a modeling program (e.g. Blender, Maya), porting the scenes to the simulation software and then taking 360 camera snapshots using the tools described above. Using a combination of the ground truth and the results of a stitching algorithm, can we evaluate how well the algorithm performs?

2. Can you evaluate the design of a 360 camera using the simulation tools above? For example, how would the quality of the panorama change by having 12 cameras in a ring instead of the 16 cameras on the Surround360 camera? This direction may require you to dig into the stitching code and make appropriate adjustments.

C++ and Python skills are necessary for using Facebook's open source stitching code. An understanding of basic Computer Vision would also be helpful.

Mentor: Trisha Lian for using the simulation software

Underwater simulations

The advent of the GoPro camera has made underwater photography much more accessible. Unfortunately, images captured underwater rarely look pleasing: they have washed out colors and low contrast due to scattering. To better understand the impact of water and its different constituents on underwater target appearance, we built a ray-tracing based simulation environment for underwater photography. With this tool we think we can render images of underwater targets that look realistic - or do they?

To have some notion of how water really influences color appearance, we also captured a number of underwater images using a variety of consumer cameras. In this project you will learn about raytracing through water and the different mathematical models used to compute the interactions between lights, targets and water. Ultimately your goal will be to improve the simulation environment to make the simulated and captured images as visually close as possible.

Mentors: Henryk, Trisha, Joyce

Model RealSense camera

In the past few years, color+depth cameras such as the Intel RealSense have become commonly available. Such cameras provide images of the scene together with depth maps, i.e. arrays of numbers describing distances between the camera and points in the scene. Very often the color and depth modules of a particular camera take advantage of fundamentally different physical processes to produce their images. Consequently, substantially different tools are necessary to model how cameras produce color images and depth images.

Color image acquisition can be modelled with computer graphics tools, such as PBRT. PBRT is ray tracing software that accurately simulates how light interacts with different objects in a 3D scene and how the light is projected onto a camera sensor. These rendering tools can be modified to incorporate the behavior of complex lens systems and elaborate camera designs such as light field cameras. In fact, we have a modified version of PBRT that performs precisely such simulations (Spectral PBRT).

A different set of simulation tools is necessary to model depth estimation. One such tool is Blensor, which is a plug-in to Blender, an open source 3D editing tool. Blensor has been designed specifically to model how different types of depth cameras capture their data.

Unfortunately, having two different tools is very inconvenient for modelling purposes. It is easy to lose track of simulation parameters, for example camera poses and positions, scene orientations, etc. Your goal for this project is to create a wrapper around PBRT and Blensor that allows users to easily and seamlessly use both tools. Ideally, a user would define a model of a depth and color camera together with a scene mesh that represents the world. The wrapper would need to handle the different tools and make sure that the color and depth data are consistent with the scene mesh and camera models.

We hope that with the wrapper you create, you will be able to build a good model of a RealSense depth camera.

Mentors: Achin, Henryk, Trisha

Myopia/Hypermetropia VR Experience

Myopia (near-sightedness) and hypermetropia (far-sightedness) are the most common eye problems in the world. With virtual reality, we have the potential to simulate the visual experience of uncorrected myopia or hypermetropia. This project will focus on creating such a VR experience. One possible path is to use Unity to create virtual rooms with interesting features that can highlight the experience of these vision problems. This would involve writing a shader that can blur the scene, as realistically as possible, according to depth and presenting this altered image through the VR goggles. Additional features may include sliders to change severity or to add other effects to increase the realism of the experience.

Students who work on this project may potentially be put in touch with documentary filmmakers interested in creating a piece on myopia.

Mentor: Trisha

Computer vision and computational photography

Reflectance, Fluorescence, and Color Matching

Fluorescence emission is a common property of biological tissues and materials, and it strongly impacts the appearance of surfaces under different illuminants. Its presence makes any color matching task much more difficult. One example of a biological substance for which color matching is important is teeth. Natural enamel fluoresces under short wavelength light, and whenever a dentist fills in a cavity he or she needs to select a filling with a color matching the tooth. However, what appears similar under the dentist's lamp may look very different in broad daylight.

In this project you will perform a set of measurements of how teeth reflect and fluoresce light and then help design the spectral reflectance properties of a better dental filling that will be less visible under different illuminants.

Mentors: Henryk, Joyce

Auto-cropping using Deep Learning

One of the most common post-processing tasks in photography is cropping of images for improved visual impact. This has only gotten more important with the widespread adoption of fixed-focal-length smartphones as the most common cameras in use today. There have been a number of very sophisticated attempts to automate this otherwise labor intensive process using adherence to various rules of composition (see References below). However, they suffer from growing complexity, as each attempt to improve the system requires layering yet more specialized knowledge. This seems like an ideal challenge for a deep learning based solution.

There don’t seem to be any (publicly available, at least) frameworks for solving this problem in its entirety, but there have been several attempts to rate the aesthetics of photographs using deep learning (see References below). So, the project is to see if a similar approach can be used to automatically improve images by cropping them in some fashion. It provides some interesting challenges in the design of the deep learning system. For example, should it be designed to evaluate each image and its possible crops independently, or is there a way to directly measure the success of a crop compared to the original image? The total solution space is extraordinarily large, so a variety of simplifying assumptions (for example, a limited number of potential crops for each image) should be made.

Some references:

http://www.arminsamii.com/research/papers/crop-paper.pdf

https://www.cs.umd.edu/~djacobs/pubs_files/UIST2003.pdf

Optimizing photo composition (refers to above papers) https://people.mpi-inf.mpg.de/~chen/papers/photocompos.pdf

Rating Pictorial Aesthetics using Deep Learning http://infolab.stanford.edu/~wangz/project/imsearch/Aesthetics/ACMMM2014/lu.pdf

Mentor: David Cardinal

Human vision simulations (ISETBIO)

Predicting visual acuity from wavefront aberrations

Andrew B. Watson; Albert J. Ahumada, Jr
Abstract
It is now possible to routinely measure the aberrations of the human eye, but there is as yet no established metric that relates aberrations to visual acuity. A number of metrics have been proposed and evaluated, and some perform well on particular sets of evaluation data. But these metrics are not based on a plausible model of the letter acuity task and may not generalize to other sets of aberrations, other data sets, or to other acuity tasks. Here we provide a model of the acuity task that incorporates optical and neural filtering, neural noise, and an ideal decision rule. The model provides an excellent account of one large set of evaluation data. Several suboptimal rules perform almost as well. A simple metric derived from this model also provides a good account of the data set.

http://jov.arvojournals.org/article.aspx?articleid=2122162

A formula for the mean human optical modulation transfer function as a function of pupil size

Andrew B. Watson

Abstract: We have constructed an analytic formula for the mean radial modulation transfer function of the best-corrected human eye as a function of pupil diameter, based on previously collected wave front aberrations from 200 eyes (Thibos, Hong, Bradley, & Cheng, 2002). This formula will be useful in modeling the early stages of human vision. http://jov.arvojournals.org/article.aspx?articleid=2121488&resultClick=1

A unified formula for light-adapted pupil size

Andrew B. Watson; John I. Yellott

Abstract: The size of the pupil has a large effect on visual function, and pupil size depends mainly on the adapting luminance, modulated by other factors. Over the last century, a number of formulas have been proposed to describe this dependence. Here we review seven published formulas and develop a new unified formula that incorporates the effects of luminance, size of the adapting field, age of the observer, and whether one or both eyes are adapted. We provide interactive demonstrations and software implementations of the unified formula.

http://www.journalofvision.org/content/12/10/12/

Mentor: Wandell

The impact of small eye movements on high frequency resolution of the eye

Simulate the effects described in this paper using ISETBIO

Abstract: Humans and other species explore a visual scene by making rapid eye movements (saccades) two to three times every second. Although the eyes may appear immobile in the brief intervals between saccades, microscopic (fixational) eye movements are always present, even when an observer is attending to a single point. These movements occur during the very periods in which visual information is acquired and processed, and their functions have long been debated. Recent technical advances in controlling retinal stimulation during normal oculomotor activity have shed new light on the visual contributions of fixational eye movements and the degree to which these movements can be controlled. The emerging body of evidence, reviewed in this article, indicates that fixational eye movements are important components of the strategy by which the visual system processes fine spatial details; they enable both precise positioning of the stimulus on the retina and encoding of spatial information into the joint space–time domain.

Control and Functions of Fixational Eye Movements Annual Review of Vision Science Vol. 1: 499-518 (Volume publication date November 2015) First published online as a Review in Advance on October 14, 2015 DOI: 10.1146/annurev-vision-082114-035742

The unsteady eye: an information-processing stage, not a bug [8]

Mentor: Wandell

Effects of age on color appearance

Use ISETBIO to simulate the combined effects of an aging eye - changes in lens opacity, light scatter, pupil size, and so on - on various perceptual phenomena, such as color appearance.

From (Brainard, D. H. & Hurlbert, A. C. (2015). Colour vision: understanding #TheDress. Current Biology, 25, R549–R568, doi: 10.1016/j.cub.2015.05.020).

"There are, in fact, a number of well-documented individual differences in the sensory apparatus that supports colour vision (reviewed in [13,14]). These include differences in pre-retinal filtering of light (for example, by the lens and macular pigment) — which, intriguingly, mostly affect short-wavelength or ‘‘bluish’’ light — differences in the spectral sensitivities of the retina’s cone photoreceptors, and differences in the relative numbers of cones of different classes. This type of front-end difference affects the information extracted from an image by different individuals, and might thus lead to differences in colour constancy. Other individual differences that can be revealed with much simpler stimuli may also be important. For example, as noted above, the stimulus seen as achromatic differs from one person to another, as do the stimuli that are perceived as pure examples of the unique hues (red, green, blue, and yellow) [15]. These differences themselves may be driven by front-end sensory differences, by differences in neural mechanisms that calibrate the colour vision system [16,17], or by an interaction between the two. Lastly, there might be individual differences in higher-order neural processes that specifically mediate colour constancy. A full understanding of the individual differences in how the dress is perceived will ultimately require data that relate, on a person-by-person basis, the perception of the dress to a full set of individual difference measurements of colour vision. The rich dataset of Lafer-Sousa et al. [2] suggests that age and gender do predict, to some extent, the variability in people’s response to the dress. Intriguingly, the density of pre-retinal pigments is also known to vary systematically with age."

Mentor: Wandell

Project Suggestions Fall 2015

A new approach to image processing (L3)

We have developed a new image processing pipeline (L3) for digital cameras based on machine learning and high speed processing with GPUs. L3 (Local, Linear, Learned) automates and customizes the image processing pipeline for a given camera design to speed camera development, leveraging advanced camera simulation and machine learning techniques.

Reference:

[1] Automating the design of image processing pipelines for novel color filter arrays: local, linear, learned (L3) method

[2] Automatically designing an image processing pipeline for a five-band camera prototype using the Local, Linear, Learned (L3) method

Accelerating L3 Processing Pipeline for Cameras with Novel CFAs on NVIDIA® Shield™ Tablets using GPUs

L3 classifies input image patches into categories that are local in space and response, and automatically learns linear operators that transform pixels to the calibrated output space using training data from camera simulation. The local and linear processing of individual pixels makes L3 ideal for parallelization.
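To make the idea concrete, here is a highly simplified numpy sketch (not the actual L3 implementation) that classifies flattened patches by their mean response level and fits one least-squares linear operator per class from simulated training pairs.

 import numpy as np
 
 def learn_l3_filters(patches, targets, n_levels=10):
     """
     patches: (N, P) flattened raw sensor patches from simulation
     targets: (N, C) desired calibrated outputs (e.g., XYZ at the center pixel)
     Returns one (P, C) linear operator per response-level class, plus the level edges.
     """
     means = patches.mean(axis=1)
     edges = np.quantile(means, np.linspace(0, 1, n_levels + 1))
     classes = np.digitize(means, edges[1:-1])     # class index 0 .. n_levels-1
     filters = []
     for c in range(n_levels):
         idx = classes == c
         if not np.any(idx):
             filters.append(np.zeros((patches.shape[1], targets.shape[1])))
             continue
         W, *_ = np.linalg.lstsq(patches[idx], targets[idx], rcond=None)
         filters.append(W)
     return filters, edges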

This project aims to accelerate the L3 pipeline on NVIDIA® Shield™ Tablets using GPUs for real-time rendering of videos. A potential deliverable is a tablet application that demonstrates the fast rendering of the L3 method. The learned linear operators and video data captured by a multispectral camera prototype will be provided. CUDA / C++ (or CUDA / Matlab) code that works on a PC will be provided as a starting point.

Skills preferred: CUDA, Android Programming

Mentor: Haomiao Jiang

High Dynamic Range Video Using the L3 Method

High dynamic range (HDR) imaging has advanced and been translated to consumer products during the last decade. The majority of HDR techniques capture and combine multiple exposures to recover details and contrast simultaneously in dark and bright regions. However, this strategy requires the scene to be still during the multiple captures and is therefore inherently unsuitable for HDR video acquisition. Altering the exposure settings within the CFA is a promising approach for single-shot HDR image and HDR video acquisition, at the cost of spatial resolution. These novel HDR CFAs require time and effort to develop tuned image processing pipelines.

This project aims to explore the feasibility of L3 method on these novel HDR CFAs, particularly for HDR video application. Various HDR CFAs will be compared through the resultant images from the L3 processing pipeline in order to determine the optimal design.

References:

[3] Cheng, CH. et al., "High Dynamic Range image capturing by Spatial Varying Exposed Color Filter Array with specific Demosaicking Algorithm," IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, 2009.

[4] F Yasuma, T Mitsunaga, D Iso, SK Nayar, Generalized assorted pixel camera: postcapture control of resolution, dynamic range, and spectrum, Image Processing, IEEE Transactions on 19 (9), 2241-2253

Mentors: Qiyuan Tian and Steve Lansel

Designing L3 Processing Pipeline for a Camera Testkit with an RGB/W CFA

Clear pixels have been introduced into CFAs to transmit much more light for low light photography (e.g. Aptina’s Clarity+ sensor, OmniVision’s Clear Pixel sensor inside the Moto X, and Sony’s Exmor RS RGB/W sensor). However, it is challenging to develop satisfying image processing pipelines that produce high image quality. In simulation, L3 has been demonstrated to be an effective and efficient processing pipeline for an RGB/W sensor (see the movie comparing L3 processing results for a conventional RGB sensor and an RGBW sensor at a series of light levels, link).

This project aims to design an L3 processing pipeline for a camera testkit with an RGB/W CFA, following the procedures described in Reference [2]. The camera testkit will first be calibrated for camera simulation. An L3 processing pipeline will then be created from the simulation and tested on raw images captured by the testkit.

Mentor: Qiyuan Tian, Haomiao Jiang

Color Matching in Dentistry

When dentists fill a cavity, they must select a composite material. When they replace a tooth or place a crown or veneer on an existing tooth, they design or order a porcelain implant. These decisions require the dentist to compare the color of teeth with the color of the composite or porcelain material. Dentists try to select the color or shade of the material that provides the best color match to the surrounding teeth, but they also complain that this is a difficult task.

By now you have learned how to use the CIELAB color difference metric to predict whether two colors will appear to match under a fixed illumination. You have also learned that these predictions are not invariant with changes in illumination. In other words, if you change the lighting, the colors of two different materials may no longer appear to match. Therefore, the smile that looks so perfect in the dentist’s office under fluorescent lighting might have imperfections under daylight.

This project has three components. First, we will make spectrophotometric measurements of 1) the reflectance of teeth in-situ in different individuals, 2) the reflectance of different composite and porcelain material, and 3) the spectral power of the light that falls on teeth in-situ under different lighting conditions. Second, we will use this data and the CIELAB color difference metric to predict whether people will be able to detect the difference between teeth and composite and porcelain material under different lighting conditions. Third, we will use the data in ISET simulations in order to determine the tradeoffs in color matching accuracy, cost and convenience. More specifically, we will simulate an imaging system based on a cell phone camera with flash/no-flash mode that has the potential of providing dentists with an alternative to the more expensive spectrophotometric devices that are currently on the market.

Mentor: Joyce Farrell (joyce_farrell@stanford.edu) and Henryk Blasinski (hblasins@stanford.edu)

Simulation projects using ISETBIO

ISETBIO is an ISET based Matlab toolbox that can simulate human optics and photoreceptor sampling. With ISETBIO, we can accurately compute the optical irradiance image that impinges on the retina and the number of photons absorbed by human photoreceptors (cones) for a given scene. ISETBIO is capable of simulating human individuals with different optics (myopia, astigmatism, etc.) and cone mosaics (colorblind, density difference, etc.).

Reproduce and Compare with Recent Papers

In this project, you are expected to reproduce the results from one recent paper with ISETBIO. You are expected to work with your mentor to rewrite the computation in ISETBIO and try to explain every difference (if any) from the original code.

Here is a set of papers by Watson that are computational, in Mathematica, and related to Optics and Retina

Modulation Transfer Function and pupil size
Pupil size and light level

Here is a paper related to the human point spread function

Computing human optical point spread functions

Retinal ganglion cell modeling

A formula for human retinal ganglion cell receptive field density as a function of visual field location

Or ganglion cells and behavior

Retina-V1 model of detectability across the visual field. The original code for the paper will be provided.

Skills preferred: Matlab programming

Mentor: Haomiao Jiang, Brian Wandell

Simulate an eccentric camera

Write a simulation of the Foveon sensor.

Or,

Write a simulation of the Light.co camera.

Mentor: Brian Wandell

Monitoring the environment

We have ideas about how to take calibrated underwater images captured by GoPro cameras to monitor the health of coral reefs. There are various components to the project (camera calibration, modeling of light transport through water, and automating image upload, storage and analysis).

Mentor: Henryk Blasinski

An underwater, multispectral light source

Underwater imaging is quickly gaining importance, not only due to its applicability in marine ecosystem monitoring, but also due to the proliferation of inexpensive action cameras such as the GoPro. Unfortunately, the colors in images acquired under water are severely distorted due to scatter and absorption phenomena. One approach to recover more spectral detail is to use active illumination techniques, an approach that has proved to be very useful on the surface. In this project you will design and build an underwater, LED-based multispectral light source that fits a standard GoPro-sized underwater housing. With all the hardware in place, you will have a chance to evaluate the accuracy and performance of active illumination spectral recovery in underwater scenarios. This is a hardware-oriented project; you will be expected to build and integrate the final system, which means that you should be familiar with soldering, PCB design and possibly even some CAD tools.

Skills Preferred: Hardware design experience, OR good with web-site programming.

Mentor: Henryk Blasinski

Oculus

Geometric Camera Calibration

In order to simulate degradations of the human visual system using images captured by a camera, it is necessary to know exactly how those images have been captured. This project uses simple camera models that use efficient and flexible calibration procedures to derive geometric parameters such as focal length, radial distortion and the position and rotation of two cameras. There are well-established techniques that estimate these parameters using a specific calibration target like a checkerboard. The goal of this project is to become familiar with those techniques and use them on real images (OpenCV provides many building blocks which can be used) with an image undistortion procedure and a stereo image rectification procedure.

Skills preferred: Knowledge of C++

Mentor:

Streaming and Augmenting Stereo Camera Images

One of the long term goals is the simulation of certain degradations of the human visual system and the evaluation of computer-aided visual enhancements to counteract those degradations. A crucial ingredient in achieving this is a software pipeline which can stream images from a stereo camera to an augmented or virtual reality device in real-time. Hence, the goal of this project is to build such a pipeline to capture, stream, and feed images from cameras in real-time to an Oculus Rift device.

Skills preferred: Knowledge of C++, Willing to learn Oculus Rift SDK, optionally also OpenGL SL or CUDA

Mentor:

Image Display

Use the Oculus Rift to display images to human subjects that simulate (recreate) the visual sensations that a person with a particular visual condition would see. This could be low vision, a type of color blindness, loss of central vision due to macular degeneration, or the effect of a retinal prosthesis in a blind person. We will help you use ISETBio to create images that simulate one of these conditions. You will render the images on a calibrated Oculus Rift.

Information Display

The goal of this project is to capture and display information so that people can track their movements and navigate in an environment with only visual input from the Oculus Rift. This will be accomplished by interfacing a Project Tango device with an Oculus Rift display. The Project Tango has sensors and software designed to track the 3D motion of the device and create a map of the environment using simultaneous localization and mapping (SLAM) algorithms. The output of the Project Tango is usually rendered on a laptop display. In this project, you will render the output on an Oculus Rift.

3D Projects

Almost anything with RealSense

Depth Sensing With an Endoscope Using Flashes

Depth sensing has been a recent industry trend for many imaging applications. One less explored route is the use of depth sensing for endoscopes, to help identify tumors or other problems. For this project, initially use simulated Scene3D endoscope images to prototype a depth sensing algorithm involving 2 flashes and 2 captures (other capture procedures could be used as well). Prototyping using simulation is a nice, structured way to try out new algorithms quickly. Next, apply this algorithm using a real endoscope and tackle the real-world challenges involved.

Mentor: Steve Lansel

Curved Sensor Simulation

Sony and other imaging companies have recently unveiled curved sensors to improve image quality. Curved sensors bring imaging improvements because of the physics of geometric optics. For simple lenses, usually the focal area is in the shape of the surface of a sphere. However, most sensors are planar, so are only able to capture a small portion of the focal area. Lens engineers usually try to account for this problem using many lens elements and aspheric lenses. However, a curved sensor could potentially be a far simpler, and less expensive solution to obtain high quality images, in a smaller form factor. Instead of using a complex lens to obtain high resolution, imaging engineers could simply use a simple lens and a curved sensor to obtain the same, or even better results.

This project involves using Scene3D, a full pipeline camera simulator, to compare the resolution and chromatic aberration benefits of a curved sensor and a simple lens versus a planar sensor and a complex lens.

Mentor: Brian Wandell

Integration with OpenCV

Do we want to create scenes of some sort (stereo? different illumination? different noise? different optics?) and test OpenCV algorithms for robustness against the range of simulated images?

Integration with Caffe

Simulation environments can be used to produce millions of images with a purpose in mind. We can then use these images to train machine learning algorithms. Is there something we want to ask people to do with, say, RenderToolbox to generate many examples and train on with Caffe?

Multispectral imaging for classification

Image classification is a very hot topic in computer vision. Most algorithms, however, operate on RGB camera channels, as if trying to mimic the human visual system. In reality, spectral information is much more abundant and can possibly be used to enhance classification algorithms. This project aims at investigating how much the accuracy of computer vision tasks can be improved if more spectrally sophisticated cameras are used. Specifically, you will use a five band camera prototype to evaluate fruit and vegetable aging and perform flower classification, and you will compare its performance to the performance of a classical RGB camera.

Skills Preferred: computer vision, machine learning

Gullstrand Eye and ray tracing of human optics

We are building a tool for modeling eyes, including the human eye, from ray tracing fundamentals. There is a famous model eye that we would like to implement.

Gullstrand eye search

We would like you to implement and test the Gullstrand eye with the ray tracing software in the ciset package (a close relative of ISET).

Mentor: Brian Wandell

Project Suggestions 2014

Predicting human performance using ISETBIO

ISETBIO is an ISET based Matlab toolbox that can simulate human optics and photoreceptor sampling. With ISETBIO, we can compute the optical irradiance image that impinges on the retina and the number of photons absorbed by human photoreceptors (cones) for a given scene.

For this project, a tutorial script describing how to calculate cone absorptions will be provided and the students will be responsible for trying to answer one of following questions:

  1. What's the maximum necessary display resolution (ppi) at a certain viewing distance for Vernier acuity?
  2. What's the maximum necessary display resolution (ppi) at a certain viewing distance for contrast sensitivity (CSF)?

To answer these kinds of questions, students are encouraged to build two scenes and use their preferred machine learning algorithm (e.g. SVM, neural network, random forest, etc.) to classify cone absorption data from two identical or two different scenes into "same" or "different" classes. When the classification accuracy for the cone absorption data is greater than a pre-determined value (say, 75%), we would predict that the observer can tell the difference between the two scenes. You can compare these predictions with published data from real human observers.
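A minimal scikit-learn sketch of the classification step, under the simplifying assumption that discriminability is assessed by how well a linear SVM separates noisy cone-absorption samples drawn from the two scenes:

 import numpy as np
 from sklearn.svm import SVC
 from sklearn.model_selection import cross_val_score
 
 def scenes_discriminable(absorptions_a, absorptions_b, threshold=0.75):
     """absorptions_a/b: (N, D) arrays of noisy cone absorptions for scenes A and B."""
     X = np.vstack([absorptions_a, absorptions_b])
     y = np.r_[np.zeros(len(absorptions_a)), np.ones(len(absorptions_b))]
     accuracy = cross_val_score(SVC(kernel='linear'), X, y, cv=5).mean()
     return accuracy > threshold, accuracy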

Preferred Knowledge: familiarity with at least one machine learning algorithm

Mentor: Haomiao Jiang

Hardware project: Build a Multispectral Imaging System

Build a multispectral imaging system based on a rotating color filter wheel and monochrome camera.

If you have experience in design and 3D printing, you can build several necessary parts.

If you have an interest in engineering applications for art history, there is an opportunity to use the system to capture images of paintings in the Cantor Arts museum.

Mentors: Henryk Blasinski and Joyce Farrell

Hardware project: Build an inexpensive spectrophotometer

In this project, you will build a simple spectrophotometer using a clean DVD-R, a USB webcam and stiff black card paper.

Here's a website introducing how to do it: http://publiclab.org/wiki/spectrometer

After building the device, you will need to compare its performance to that of a much more expensive (~$50K) spectrophotometer that we have in the lab.

Mentor: Haomiao Jiang

Camera Image Quality Metrics

The International Standards Organization (ISO) is developing a set of camera image quality metrics to quantify the spatial resolution, noise and color accuracy of digital cameras. http://proceedings.spiedigitallibrary.org/data/Conferences/SPIEP/64097/829302_1.pdf

Many of these metrics have been implemented in ISET.

You can use ISET to calculate these metrics for simulated cameras that have different optical properties, numbers of pixels and image processing methods. You can also use ISET to simulate how each camera captures and processes natural scenes (e.g. faces and landscapes). You can then compare the metrics with the appearance of these images as they are rendered on a display.

In this project, you will use ISET and CPIQ to quantify and illustrate how the metrics and the images change when you decrease the size of camera pixels (and correspondingly increase the number of camera pixels). This method will allow you to analyze how resolution trades off against sensitivity: small pixels make it possible to increase the number of sensor pixels sampling the optical irradiance image, but they also decrease the number of photons a small pixel can capture. Which do you prefer, a high resolution noisy image or a low resolution clear image? How does this depend on the display, viewing distance, etc.?

Mentor: Joyce Farrell

ISET model for real camera

In this project, you will build an accurate ISET model for a physical camera we have. You will take pictures of known scenes, analyze the captured images, and try to build an ISET model.

The goal is for the ISET model of the camera to give approximately the same computational results as the RAW output from the real camera. The similarity could be measured by noise, color, spatial resolution, etc. Analyzing the errors between the model and the real camera will determine the model's accuracy.

If time permits, you can try to implement an image processing pipeline for the camera and evaluate the performance of the processed images.

Mentors: Qiyuan Tian, Steve Lansel, Joyce Farrell

Analysis and Compression of L3 Filters

The L3 algorithm is a learned image processing pipeline for cameras. The algorithm learns optimal linear filters for a given camera based on training data, light level, illumination color, and optics. For a complete camera this may result in many (possibly hundreds) optimized filters. We believe the filters will be closely related for similar camera settings. The goal is to analyze the filters, store a compressed set of filters, and interpolate the needed filters from this compressed set. This way we only need to store a smaller set of filters and can extrapolate to lots of new camera settings. Here is a recent SPIE paper on L3: https://drive.google.com/file/d/0B0Gw85qGqJxhbXJlcmZjbmhOQ2s/edit?usp=sharing.
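One simple compression idea (an assumption, not the required method) is to treat each learned filter as a vector and keep only a few principal components; a minimal numpy sketch:

 import numpy as np
 
 def compress_filters(filter_bank, n_components=8):
     """filter_bank: (M, K) array with one flattened filter per camera setting."""
     mean = filter_bank.mean(axis=0)
     _, _, Vt = np.linalg.svd(filter_bank - mean, full_matrices=False)
     basis = Vt[:n_components]                     # (n_components, K) principal filters
     coeffs = (filter_bank - mean) @ basis.T       # (M, n_components) per-setting weights
     return mean, basis, coeffs
 
 def reconstruct_filters(mean, basis, coeffs):
     return mean + coeffs @ basis                  # approximate original filters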

Mentors: Qiyuan Tian, Steve Lansel, Brian Wandell

ISET model for underwater imaging

With the proliferation of cameras such as GoPro more and more people have started taking underwater images. These usually have large amounts of distortion, both spectral and spatial, originating from the medium in which the image was taken. Rather than experiment in the real world, the impact of different light transport phenomena on RGB images can be understood via simulation environments. In this project you will implement, enhance and integrate with ISET the underwater image simulation system described in the paper below.

Color image simulation for underwater optics

Mentors: Joyce Farrell and Henryk Blasinski

App for Programmable Camera in iOS / Android

We have a prototype programmable camera to be used with iOS or Android devices. The project's goal is to make an app that will run on iOS or Android and uses the camera. Think of an interesting camera app, and we can work together to build it. Prior experience in iOS or Android is needed.

Mentors: Steve Lansel and Munenori Fukunishi

Image classification with a five band camera

Recently, image classification and object recognition have become very popular topics. The large majority of algorithms, if not all, use images acquired with traditional, three channel (RGB) cameras. The goal of the project is to evaluate the performance of state of the art algorithms applied to images captured with a five band camera. Will the recognition/classification performance change, and if so by how much? To get the flavor of the project you can look at the following paper:

Multispectral SIFT for scene category recognition

Mentors: Henryk Blasinski, Steve Lansel

Analysis of a real camera lens

Can we characterize how a lens blurs a point of light (point spread functions or psfs) by analyzing camera images of test targets that are displayed on a color monitor? This project has many possible variations.

  1. Illuminate red, green and blue pixels on a display and capture an image of the display with a camera placed on a tripod a far distance away. Vary the pattern of red, green and blue pixels (e.g. noise pattern).
  2. Estimate the psfs of a real camera with a real lens by analyzing camera images of displayed targets. Use a prosumer digital camera, vary 1/f#, and observe how the estimated psfs change.
  3. Estimate the psfs for different field heights, wavelengths and depths.
  4. Use the estimated psfs to predict camera images of other displayed "natural" images, such as a face. Compare the predicted camera images to actual camera images.

Here are links to papers that describe a method for empirically estimating the psf of a camera lens. The links include code that you can download

  1. http://www.cs.ubc.ca/labs/imager/tr/2013/SimpleLensImaging/
  2. http://www.ipol.im/pub/art/2012/admm-nppsf/

People: Brian Wandell, Andy Lin, Joyce Farrell

Psf analysis and image deblurring using a simulated camera lens

The point spread function (psf) of a lens is an extremely important lens property. One possible application of knowing the psf is image deconvolution (deblurring). Deconvolution can drastically improve image sharpness. The following paper provides a good technique for estimating a psf and deconvolving an image with that psf: http://www.cs.ubc.ca/labs/imager/tr/2013/SimpleLensImaging/

Tasks

  1. Andy Lin will provide simulated camera images of several different types of spatial test targets. Your task will be to use the code from http://www.cs.ubc.ca/labs/imager/tr/2013/SimpleLensImaging/ to estimate psfs from the simulated camera images.
  2. To evaluate how well the psf estimation code works, compare the estimated psfs to the known psfs that Andy used to generate the simulated camera images.
  3. As another evaluation technique, use the estimated and known psfs to "deblur" a blurred image containing a secret message using the deconvolution code downloaded from the same site (see the sketch after this list). The secret message will only be legible after proper deconvolution of the image. Andy will provide this blurred image.
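For the deconvolution step in task 3, a minimal frequency-domain Wiener deconvolution sketch is shown below; it assumes the psf has been padded to the image size and centered, and it is only a stand-in for the downloadable code referenced above.

 import numpy as np
 
 def wiener_deconvolve(blurred, psf, noise_to_signal=1e-3):
     """blurred and psf are 2D arrays of the same size, psf centered in the array."""
     H = np.fft.fft2(np.fft.ifftshift(psf / psf.sum()))
     G = np.fft.fft2(blurred)
     W = np.conj(H) / (np.abs(H)**2 + noise_to_signal)    # regularized inverse filter
     return np.real(np.fft.ifft2(W * G))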

Mentor: Andy Lin

Medical imaging: Super resolution microscopy

http://en.wikipedia.org/wiki/Super_resolution_microscopy

Super resolution microscopy refers to methods that build up a high resolution image of a target by integrating many images of the target, illuminated such that only a small subset of the image points is captured in any one image. Each camera image then samples a subset of the pixels in the high resolution image. The locations of the pixels in many camera images are combined to construct a single full high resolution image of the target. By placing a point at the center of each sampled point, one can get very accurate spatial information about the location (phase) of illuminated points in the target. Because the center of a dot is smaller than the lens psf, some people assert that super-resolution methods beat the limit of lens diffraction. But you know better than that. Diffraction is a limit that no earthly being can beat. Nonetheless, by sampling with stochastic and sparse arrays of pixels, one can do a better job of locating the center of sampled points and hence build up a higher resolution image.

You can write an ISET simulation to test one of these super-resolution methods.

Alternatively, you can test methods for super-resolution imaging using real camera images. For example, take a camera image of a displayed image (such as a face or a high resolution test chart). Then capture a series of images of the display when only a subset of the pixels in the face (or chart) is illuminated. The illuminated pixels in each subset will be far away from each other, such that the optical images of the pixels illuminated in each image do not overlap. You can further experiment by taking a blurry image of a face (say, by setting the camera 1/f# to 12). Then, display subsets of pixels of the face that are widely separated. Find the location of the center of each illuminated pixel and combine the data to create a non-blurred camera image.

Mentor: Brian Wandell, Haomiao Jiang

Eulerian video processing (Bill Freeman thing)

Repeat one of the experiments from Bill Freeman's group. Their published papers can be found at http://people.csail.mit.edu/mrub/vidmag/

Also, you will need to compare the results for cameras with 3 color channels (RGB) and with 5 color channels (a prototype in our lab).

Scene 3D System

The goal of the Scene3D project is to simulate the complete imaging pipeline for 3D scenes, from the scene to the lens, to the sensor, and to the image processing. Simulations of the sensor and image processing are implemented in ISET. The novel part of Scene3D involves using a technique in 3D graphics called ray-tracing, which produces a physically accurate simulation of light rays that are refracted through lenses and toward the sensor. We modified the PBRT ray-tracer to simulate the important effects of diffraction and to be able to handle complex lenses and multispectral inputs and outputs. The end goal of the Scene3D project is to provide an infrastructure for rapid image systems prototyping.

Scene3D project: https://github.com/ydnality/Scene3D

One important aspect of photography involves color balancing. Oftentimes, photographs taken under different illuminant conditions will produce images that don't appear natural. For example, images taken under indoor tungsten lighting will exhibit an unnatural yellow/orange tint. These images must be corrected in order to appear natural.

This class project involves applying the camera pipeline simulation provided by the Scene3D infrastructure for use in designing a color-balancing algorithm.

Tasks

  1. Start with a 3D radiance scene generated by Andy Lin. Modify the parameters of the scene to make different renderings of the scene under different light conditions.
  2. Design and implement an intelligent method for "correcting" (color-balancing) the illuminant (a simple baseline is sketched after this list).
  3. (Challenge/Optional) Design a color balancing method that is able to correct for scenes with 2 or more different illuminants.
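As a simple baseline for the color-balancing step in task 2, here is a gray-world sketch in numpy; the intelligent method asked for above should improve on it.

 import numpy as np
 
 def gray_world(rgb):
     """rgb: (H, W, 3) linear camera image; returns a color-balanced copy."""
     channel_means = rgb.reshape(-1, 3).mean(axis=0)
     gains = channel_means.mean() / channel_means          # push the mean toward neutral
     return np.clip(rgb * gains, 0, None)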

Mentor: Andy Lin

PBRT and Zemax optics modeling

Scene3D uses a combination of PBRT and ISET to simulate the complete imaging pipeline of a digital camera. The unique contribution of Scene3D is that it applies a technique in 3D graphics called ray-tracing to produce a physically accurate simulation of light rays as they are refracted through lenses and toward the sensor. We modified the PBRT ray-tracer to simulate the important effects of diffraction and to be able to handle complex lenses and multispectral inputs and outputs. However, we have yet to verify this pipeline empirically.

One way we plan to evaluate our modifications to PBRT is to compare the point spread functions (PSFs) we generate with PSFs generated by Zemax, a well-established software package used by many optics professionals. We provide a Zemax macro that can be used to generate the PSFs that ISET needs. Although Zemax can produce physically accurate PSFs, it cannot render physically accurate 3D multispectral images the way PBRT can.

Tasks

  1. Take several PBRT multi-element lens models and create equivalent Zemax models.
  2. Use the Zemax-to-ISET interface to generate the data necessary for the ISET simulations.
  3. PSFs computed with the PBRT model will be provided as ISET optical images. We provide a Zemax macro that can be used to generate PSFs for lenses modeled in Zemax.
  4. Compare and analyze the PSFs produced by these two methods at different apertures and distances as verification (a minimal comparison sketch follows this list).
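The sketch below shows one way to carry out step 4: resample both PSFs onto a common grid, normalize them to unit volume, and report simple agreement metrics. The file names and grid size are placeholders, and the PSFs are assumed to be square 2-D arrays saved with np.save.

  # Compare a PBRT-derived PSF with a Zemax-derived PSF on a common grid.
  import numpy as np
  from scipy.ndimage import zoom

  def load_and_normalize(path, out_size=256):
      psf = np.load(path).astype(float)
      psf = zoom(psf, out_size / psf.shape[0])      # resample (assumes a square array)
      psf = np.clip(psf, 0, None)
      return psf / psf.sum()                        # unit volume for a fair comparison

  pbrt_psf = load_and_normalize("psf_pbrt_f4_1m.npy")      # placeholder file names
  zemax_psf = load_and_normalize("psf_zemax_f4_1m.npy")

  rmse = np.sqrt(np.mean((pbrt_psf - zemax_psf) ** 2))
  ncc = np.corrcoef(pbrt_psf.ravel(), zemax_psf.ravel())[0, 1]
  print(f"RMSE = {rmse:.3e}, normalized cross-correlation = {ncc:.4f}")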


Experience with Zemax is preferred.

Mentor: Andy Lin


Gesturing in a Virtual 3D space

The Holografika multi-projector display system creates a 3D light field that people can view without special goggles. Leap Motion is a controller that senses small finger movements using an infrared LED and camera. We linked these two devices so that users can grasp and move virtual objects in the 3D light space created by the Holografika display. We also linked the Leap Motion to a conventional stereoscopic display that uses an LCD with shutter goggles. The goal of this project is to compare how well users can use the hand-gesture controller to move objects in the virtual 3D spaces created by the two different types of displays.

The project has possible variations:

  • Find a suitable OpenGL app or game from the Leap Motion Airspace app store that measures agility, and use it to quantify the learning rates of new users. The objects floating in front of the Holografika display will be aligned to the user's hands in that 3D space, but not so with the flat LCD display. Possibly include a mouse mode in the tests.
  • Using a 3D top-down street-view map of London, test users' skill at finding a location by panning and zooming a holographic 3D map on both kinds of displays using hand gestures. Does the user's self-reported confidence correlate with measured performance, and how does display type affect that? Use the metrics to predict the actual benefit, for different kinds of organizations, of transitioning from mouse control to in-air gesture devices with 2D and holographic displays.

The equipment is calibrated and available in Packard 070.

Here are links to the companies involved: www.holografika.com and www.leapmotion.com

You can watch a video of the talk by the inventor of Holografika (Tibor Balogh) at https://talks.stanford.edu/scien/scien-colloquium-series/

Mentors: Dave Singhal, Harlyn Baker, Peter Kovacs

Project Suggestions 2013

Camera Forensics

You are presented with a digital image and asked to determine whether it has been manipulated and, if so, to localize the manipulation in the image. Color filter array (CFA) interpolation generates a tell-tale signature in a digital image that can be used in a forensic setting: CFA interpolation leads to strong correlations between a specific subset of pixels and their spatial and chromatic neighbors. Build a classifier that takes a digital image as input and automatically detects which parts of the image do and do not exhibit the expected CFA correlations. Begin by generating a synthetic set of test images that have undergone your choice of CFA interpolation. Test your forensic analysis on these uncompressed images, and then quantify the efficacy of your approach on increasingly JPEG-compressed images. Disputes often erupt over the provenance of photos. Consider how you might use your new forensic technique to distinguish between images taken by different types of cameras (e.g., a Canon PowerShot vs. a Nikon D-series).
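A minimal sketch of the underlying idea, under simplifying assumptions (an RGGB Bayer layout and bilinear green-channel interpolation): re-predict the green values at the interpolated Bayer sites from their neighbors and measure the residual block by block. Regions that were pasted in or resampled tend to break this pattern. The full classifiers in the references are considerably more careful.

  import numpy as np

  def cfa_consistency_map(img, block=32):
      """img: (H, W, 3) float RGB array. Returns per-block mean interpolation residual."""
      g = img[:, :, 1].astype(float)
      h, w = g.shape
      # In RGGB, green is *interpolated* at (even, even) and (odd, odd) sites.
      interp_mask = np.zeros((h, w), dtype=bool)
      interp_mask[0::2, 0::2] = True
      interp_mask[1::2, 1::2] = True
      # Bilinear prediction from the four plus-shaped neighbors.
      pred = np.zeros_like(g)
      pred[1:-1, 1:-1] = 0.25 * (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:])
      resid = np.abs(g - pred) * interp_mask
      resid[0, :] = resid[-1, :] = 0                # ignore the image border
      resid[:, 0] = resid[:, -1] = 0

      scores = np.zeros((h // block, w // block))
      for by in range(scores.shape[0]):
          for bx in range(scores.shape[1]):
              tile = resid[by*block:(by+1)*block, bx*block:(bx+1)*block]
              mask = interp_mask[by*block:(by+1)*block, bx*block:(bx+1)*block]
              scores[by, bx] = tile[mask].mean()    # low residual = CFA-consistent block
      return scores                                  # threshold or compare blocks to flag regions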

References

  1. A Survey of Image Forgery Detection
  2. Exposing Digital Forgeries in Color Filter Array Interpolated Images

Tasks

  1. We provide you with training images
  2. You develop the classifier based on the papers
  3. We provide you with test images to see how you did

Image Forensics

You are presented with a JPEG image and asked to determine whether it originated directly from a camera/mobile device or was re-saved one or more times. Multiple compressions at different compression levels leave behind specific statistical artifacts in the distribution of DCT coefficients. These artifacts can be used to distinguish between singly and multiply compressed images. Build a classifier that can distinguish between singly and doubly compressed images (assume that the second compression level is different from the first). Validate your classifier on a large data set of images. Quantify the conditions under which the classifier is and is not effective. Extend your classifier to distinguish between one, two, and three compressions. The expert forger becomes aware of your forensic technique and writes a special-purpose encoder that re-saves a JPEG image with the same compression quality as the original. Consider how you might counter this by detecting multiple compressions made with the same compression setting.
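To make the idea concrete, here is a simplified feature (not a full classifier): histogram one low-frequency DCT coefficient over all 8x8 blocks of the decoded image and measure the periodicity that a second quantization leaves behind. A real detector would pool this kind of feature over many DCT frequencies, as described in the references.

  import numpy as np
  from scipy.fft import dctn

  def dct_histogram_periodicity(gray, freq=(0, 1), hist_range=60):
      """gray: 2-D grayscale array decoded from the JPEG under test."""
      h, w = (np.array(gray.shape) // 8) * 8
      blocks = gray[:h, :w].astype(float).reshape(h // 8, 8, w // 8, 8).transpose(0, 2, 1, 3)
      coefs = dctn(blocks, axes=(2, 3), norm="ortho")[:, :, freq[0], freq[1]]
      coefs = np.round(coefs).astype(int).ravel()
      hist, _ = np.histogram(coefs, bins=np.arange(-hist_range, hist_range + 1))
      spectrum = np.abs(np.fft.rfft(hist - hist.mean()))
      # A strong peak away from DC suggests periodic gaps, i.e. double quantization.
      return spectrum[2:].max() / (spectrum.mean() + 1e-9)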

References

  1. A Survey of Image Forgery Detection
  2. Statistical Tools for Digital Forensics

Tasks

  1. We provide you with training images
  2. You develop the classifier based on the papers
  3. We provide you with test images to see how you did

Turbulence removal

X. Zhu and P. Milanfar, "Removing Atmospheric Turbulence via Space-Invariant Deconvolution" IEEE Trans. on Pattern Analysis and Machine Intelligence vol. 35, no. 1, pp. 157-170, Jan. 2013

Also see related talk and Project page

Options

  1. You obtain by measurement or simulation example images and then use their methods.
  2. You develop a variant of their method, exploring deconvolution, registration, or some other part of the algorithm more deeply than in the original paper (a generic deconvolution sketch follows this list).
  3. You find another approach and compare that approach to this one.
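For option 2, a natural building block is frequency-domain deconvolution. The sketch below is a generic Wiener deconvolution, not the authors' full space-invariant algorithm; it assumes you already have an estimate of the blur kernel for a frame.

  import numpy as np

  def wiener_deconvolve(blurred, kernel, snr=100.0):
      """blurred: 2-D image; kernel: small 2-D PSF estimate; snr: assumed signal-to-noise ratio."""
      pad = np.zeros_like(blurred, dtype=float)
      kh, kw = kernel.shape
      pad[:kh, :kw] = kernel / kernel.sum()
      pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))   # center the kernel at (0, 0)
      H = np.fft.fft2(pad)
      G = np.fft.fft2(blurred)
      W = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)               # Wiener filter
      return np.real(np.fft.ifft2(W * G))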

Photon calculator utility (ISET)

Build a program, perhaps based on the ISET library, that calculates the spectral irradiance at the sensor from the scene radiance and a specification of the optics. Doing this for diffraction-limited optics, specifying only the f/#, is sufficient.

The utility should be backed by a wiki page that illustrates all of the steps in doing that calculation. This project should produce an educational and useful calculator.

  • Doing an implementation that can run on a browser on the Internet is best.
  • Doing a straight Matlab implementation with a nice GUI is also good.
  • Implementing the ISET (Matlab) routines as a Python calculator has value, as well.
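As a starting point for the Python option, here is a minimal sketch of the core calculation under standard simplifying assumptions: an on-axis point, a distant object (magnification near zero), and 100% lens transmission, so that sensor irradiance follows the paraxial camera equation E = pi * L / (4 * N^2). All numerical values are illustrative.

  # Convert spectral scene radiance (W/sr/m^2/nm) to spectral irradiance at the
  # sensor for diffraction-limited optics specified only by its f-number, then
  # to photons per pixel.
  import numpy as np

  h, c = 6.626e-34, 2.998e8                 # Planck's constant (J*s), speed of light (m/s)

  def sensor_photons(wave_nm, radiance, f_number, pixel_m=2e-6, exposure_s=0.01):
      irradiance = np.pi * np.asarray(radiance) / (4.0 * f_number ** 2)   # W/m^2/nm
      energy_per_photon = h * c / (np.asarray(wave_nm) * 1e-9)            # J
      photon_irradiance = irradiance / energy_per_photon                  # photons/s/m^2/nm
      photons_per_pixel = photon_irradiance * pixel_m ** 2 * exposure_s
      return irradiance, photons_per_pixel   # both are spectral (per nm) quantities

  # Example: a flat 0.01 W/sr/m^2/nm radiance viewed through f/4 optics.
  wave = np.arange(400, 701, 10)
  E, n_photons = sensor_photons(wave, 0.01 * np.ones_like(wave, dtype=float), f_number=4)
  print(f"total photons/pixel ~ {(n_photons * 10).sum():.0f}")   # 10 nm wavelength bins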

Updating Wikipedia

Help us make Wikipedia better. There are surprisingly many Wikipedia entries on imaging and human vision that are only a few sentences long. Look up, for example, 'Troland', 'Stiles-Crawford effect', 'Photopic vision', 'Human PSF' or 'Active Pixel Sensor' to see how thin these entries are. Your mission, should you choose to accept it, is to improve these (or other) entries. Think of your work as a paper that is published online rather than as a PDF. Of course, just as with any research paper, your work should start with a thorough literature review; select the relevant pieces of information and write them up in a way that is approachable to a non-expert in the field.

Neuroimaging (special approval)

With the opening of Stanford's Center for Cognitive and Neurobiological Imaging (CNI), we now have access to a large number of MR scans of the human brain. We are also closely connected to the MR hardware and image processing algorithms.

While this course is not specifically about neuroimaging, some of the methods in the course might be usefully applied to the data collected at the CNI. For students already working in MR and interested in such signal processing, we might be able to develop some projects that build on your interest.

Two possible projects are algorithms to:

  • Identify when two MR images are of the same brain (brainprint), even if they were acquired using different contrasts.
  • Evaluate image quality and MR artifacts

Scene database for computer vision testing (special approval)

Build scenes, say using Blender and PBRT, that we can run through the ISET simulation to produce images. Then analyze these calibrated scenes using computer vision algorithms to derive the depth, illumination, and shading. See this example page for folks who created a database from real, rather than simulated, scenes.

Color balancing (special approval)

Color balancing refers to the process of converting camera rgb data into display rgb values. If one simply copies the sensor pixel values into the display values, the resulting image will not generally be a good color representation of the original scene. An important step in the image processing pipeline is to transform camera rgb values to display values such that the display image appears to match the original scene that was captured.

A simple and common approach to color balancing is to make an educated guess about the scene illumination based on an analysis of the camera rgb values. The estimated illuminant is used to select a color transform (typically a 3x3 transform or a look-up table) that maps camera rgb values into human sensor (XYZ) values for an ideal illuminant, such as daylight. The goal of this transform is to render the scene that the camera captured as if the scene were illuminated by daylight.
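The simplest instance of this idea is the gray-world estimate combined with a diagonal (von Kries) correction, a special case of the 3x3 transform described above. The sketch below assumes a linear (not gamma-encoded) RGB image scaled to [0, 1].

  import numpy as np

  def gray_world_balance(img):
      rgb_mean = img.reshape(-1, 3).mean(axis=0)        # estimate of the color cast
      gains = rgb_mean.mean() / (rgb_mean + 1e-9)       # per-channel correction gains
      return np.clip(img * gains, 0.0, 1.0)

  # A full pipeline would instead choose a 3x3 matrix (camera RGB -> XYZ under the
  # estimated illuminant, then XYZ -> display RGB for a D65 rendering) based on the
  # estimated illuminant, rather than a per-channel gain.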

Most camera processing pipelines use a standard illuminant called D65 as the ideal rendering illuminant. As far as we know, no one has tested the assumption that people prefer to view objects illuminated by D65. The preferred rendering illuminant may also depend on the objects that are being rendered.

The project will use hyperspectral data of faces, fruit and vegetables and outdoor scenes, and spectral power distributions of different illuminants to generate images that people will view on calibrated displays. People will be asked to indicate which color renderings they prefer. In this way, we will collect preference data about preferred rendering illuminants. The preference data will provide a useful guide for engineers who are designing color balancing methods.

Hyperspectral video (special approval)

Help us build and evaluate a hyperspectral video system based on LED lights synchronized with a monochrome video camera. Capture hyperspectral video images of human faces and estimate pulse rate from the change in color sensor values over time (see http://people.csail.mit.edu/mrub/papers/vidmag.pdf).
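A rough sketch of the pulse-rate step, assuming you already have a stack of frames from one of the narrowband (LED) channels: average the channel over a face region in each frame, band-pass around plausible heart rates, and read off the dominant frequency. The ROI coordinates are placeholders.

  import numpy as np
  from scipy.signal import butter, filtfilt

  def estimate_pulse_bpm(frames, fps, roi=(slice(40, 120), slice(60, 140))):
      """frames: (n_frames, H, W) array from one channel; fps: capture rate in Hz."""
      trace = frames[:, roi[0], roi[1]].mean(axis=(1, 2))        # mean over the face ROI
      b, a = butter(2, [0.7 / (fps / 2), 4.0 / (fps / 2)], btype="band")
      trace = filtfilt(b, a, trace - trace.mean())               # keep ~42-240 bpm band
      spectrum = np.abs(np.fft.rfft(trace))
      freqs = np.fft.rfftfreq(len(trace), d=1.0 / fps)
      return 60.0 * freqs[np.argmax(spectrum)]                   # beats per minute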

Biology of the mouse eye image formation (special approval)

A huge amount of biology is done in the mouse, and there is a movement to study the mouse retina in particular. To study the retina, we would like to understand how the cornea and lens of the mouse eye blur the retinal image.

Adaptive optics to the rescue: Williams and his colleagues analyzed the optical quality of the mouse eye. Specifically, they measured the wavefront aberrations of 20 wild-type mice. They provide the data in their article.

Optical properties of the mouse eye

Brainard, Hofer and I have written a wavefront toolbox in Matlab that enables us to specify the wavefront aberration and calculate retinal images in ISET. This project is to use our software to reproduce Figures 10 and 11 from the paper.

You can do this! If you do, many people will cite your project, because many people work on the mouse.

Active LED-based illumination (special approval)

These days, LEDs can produce high-intensity light with well-defined spectral properties. We are interested in a hardware system that allows you to control both the on/off times of a set of LEDs and their intensity using a simple Arduino microcontroller. One way to do this, and we have a working prototype (refer to this project), is to use pulse-width-modulated signals to control the duty cycle of an LED. If you operate at a high enough frequency, you will perceive the rapidly flickering LEDs as having lower or higher intensity. In this project, however, we are interested in controlling the LED intensity more directly, so that even at the micro time scale you set the LED intensity itself rather than switching it on and off.

Project Suggestions 2012

Image processing

Hyperspectral Imaging

Analysis of hyperspectral images of paintings by famous artists

Consumer digital cameras capture electromagnetic energy in three different spectral bands. Multispectral and hyperspectral cameras capture electromagnetic energy in many more spectral bands. We used two different hyperspectral cameras to capture images of several paintings in the Cantor Arts Museum. One of the cameras captures images in 160 different spectral bands ranging between 400 and 1000 nm (visible and near-infrared, or VNIR). The other camera captures images in 256 different spectral bands ranging between 1000 and 2500 nm (short-wave infrared, or SWIR). There is a very large literature on hyperspectral imaging of paintings that we will use to guide our analysis of the data we have already collected. (http://www.springerlink.com/content/80342384844k0r21/fulltext.pdf) In particular, we should be able to determine whether there is a drawing beneath the painting (an “underpainting”) and to characterize the paint pigments. This analysis will allow us to determine the history of a painting and assess its originality. We hope that this project will serve as the groundwork for an exhibit at the Cantor Arts Museum. (JEF and TS) Here is a nice website that describes methods used in art forensics: http://www.webexhibits.org/pigments/intro/look.html

Analysis of hyperspectral images of live organs during surgery

Several research labs are investigating the advantages of hyperspectral imaging in robotic surgery. This is because hyperspectral cameras can capture a wider range of spectral data, including electromagnetic energy that the human eye cannot see. One of the challenges is how to map information that is normally invisible to surgeons onto visible images that enhance the ability to discriminate between different tissue types in a meaningful way. We have collected hyperspectral images of organs in a live pig during surgery. This project will analyze this data to determine if information in the invisible regions of the electromagnetic spectrum (> 700nm) can be used to enhance the information that surgeons see during an operation. (see http://www.intechopen.com/source/pdfs/9221/InTech-Hyperspectral_imaging_a_new_modality_in_surgery.pdf ) (JEF and TS)

Colorimetric reproduction of human faces

We collected VNIR (160 narrowband spectral images ranging between 400 and 1000 nm) hyperspectral images of human faces, outdoor scenes, still life (fruit), and paintings. The hyperspectral image data can be used to generate a representation of the spectral reflectance of the objects in a scene and of the spectral power of the scene illumination. These representations can, in turn, be used as input to the ISET digital camera simulation software. ISET can then predict the output of digital cameras with different color channels. For example, one can simulate a digital camera with three or more color channels, and vary the spectral sensitivities of each of the color channels. One can also vary the spatial arrangement of those channels. Finally, one can vary both the demosaicking and the color balancing algorithms in the digital camera. This project provides an excellent tutorial on how a digital camera works and gives you the opportunity to develop your own color image processing algorithms. (JEF and TS)
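The forward step that ISET automates is easy to sketch: camera channel responses are the wavelength-by-wavelength product of reflectance, illuminant, and channel sensitivity, summed over wavelength. The array names, shapes, and Gaussian sensitivities below are purely illustrative stand-ins for the real hyperspectral data and camera curves.

  import numpy as np

  wave = np.arange(400, 1001, 4)                          # nm, 151 samples
  reflectance = np.random.rand(128, 128, wave.size)       # stand-in for the hyperspectral cube
  illuminant = np.ones(wave.size)                         # flat SPD for illustration
  sensitivities = np.stack([                              # three Gaussian color channels
      np.exp(-0.5 * ((wave - mu) / 40.0) ** 2) for mu in (450, 550, 600)
  ], axis=1)                                              # shape (n_wave, n_channels)

  radiance = reflectance * illuminant                     # scene spectral radiance (arbitrary units)
  responses = radiance @ sensitivities                    # (128, 128, n_channels) sensor image
  responses /= responses.max()                            # normalize for display or further processing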

References and Web Links

Novel detectors for RGB and NIR

Applications, devices and algorithms for the separation of visible and near infrared signals in monolithic Si sensors

Analysis of Foveon Multi-Spectral Sensor for Counter-Camouflage, Concealment and Deception Application, by Devon Courtney Nugent

Using NIR to enhance visible data

Several Susstrunk lab papers. Some others.

http://infoscience.epfl.ch/record/148419/files/81_susstrunk_v5.pdf http://infoscience.epfl.ch/record/153994/files/de24567-susstrunk.pdf

http://www.comp.nus.edu.sg/~dfanbo/papers/VisualEnhanceHSI_Kim_PR2011July.pdf

http://gitl.sysu.edu.cn/papers/cvpr-2008-zhang.pdf

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5652900&tag=1

SIFT on visible and NIR


http://www.intechopen.com/source/pdfs/15840/InTech-An_exploration_of_color_fusion_with_multispectral_images_for_night_vision_enhancement.pdf

NIR Flash


http://www.comp.nus.edu.sg/~zhuoshao/NIRFlash/nirflash_icip2010_high.pdf

Photo retouching metric (Kee and Farid)

A perceptual metric for photo retouching Eric Kee and Hany Farid

Department of Computer Science, Dartmouth College, Hanover, NH 03755 October 19, 2011 (received for review July 5, 2011)

In recent years, advertisers and magazine editors have been widely criticized for taking digital photo retouching to an extreme. Impossibly thin, tall, and wrinkle- and blemish-free models are routinely splashed onto billboards, advertisements, and magazine covers. The ubiquity of these unrealistic and highly idealized images has been linked to eating disorders and body image dissatisfaction in men, women, and children. In response, several countries have considered legislating the labeling of retouched photos. We describe a quantitative and perceptually meaningful metric of photo retouching. Photographs are rated on the degree to which they have been digitally altered by explicitly modeling and estimating geometric and photometric changes. This metric correlates well with perceptual judgments of photo retouching and can be used to objectively judge by how much a retouched photo has strayed from reality.

Visibility of movie subtitles

A persistent problem in watching foreign movies is that the subtitles are sometimes illegible. Why? Because the assumed contrast of the default background is wrong, and you end up with white characters on a light background. I assume this is done automatically because it is too expensive to have people judge, frame by frame, whether the text is visible. Need I say more? An automated system that could assess the brightness of the standard region where subtitles are printed, and then adjust the contrast so the text is legible, would be a huge improvement for the industry.

E. Markman, a committed viewer of subtitled films.
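One way to frame the automated check, as a minimal sketch: estimate the relative luminance of the region where subtitles are drawn, compute a WCAG-style contrast ratio against the intended text color, and fall back to a darker text box (or an outline) when the ratio is too low. The 4.5 threshold and the "bottom 15% of the frame" region are assumptions.

  import numpy as np

  def relative_luminance(rgb):
      """rgb: gamma-encoded sRGB values in [0, 1]."""
      lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
      return lin @ np.array([0.2126, 0.7152, 0.0722])

  def needs_backing_box(frame, text_rgb=(1.0, 1.0, 1.0), min_ratio=4.5):
      """frame: (H, W, 3) gamma-encoded RGB frame scaled to [0, 1]."""
      band = frame[int(frame.shape[0] * 0.85):, :, :]           # bottom 15% of the frame
      l_bg = relative_luminance(band.reshape(-1, 3)).mean()
      l_txt = relative_luminance(np.array(text_rgb))
      hi, lo = max(l_bg, l_txt), min(l_bg, l_txt)
      return (hi + 0.05) / (lo + 0.05) < min_ratio              # True: add a dark box or outline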

Image Quality

3D Image Quality Metrics

Develop algorithms for Shooting in 3D and Displaying in 2D. Explore ways in which to improve 2D rendering of 3D content in order to enhance “immersive video”.

Optics

Wavefront Toolbox

(BW)

Advances in adaptive optics now make it possible to measure the wavefront aberrations of the living human eye. Many groups are making these measurements in both control subjects and subjects with different types of optical dysfunctions.

These aberrations are usually specified in a way that is difficult to apply to image processing: The aberrations are specified as the weights on a set of Zernike polynomials. It is a simple matter of programming to convert these polynomial weights to a point spread function that can be applied in image processing algorithms.
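As a hint of how simple that conversion can be, here is a bare-bones sketch for two illustrative terms (defocus and astigmatism): build the pupil function from the wavefront error and take the squared magnitude of its Fourier transform. A real implementation, such as the wavefront toolbox mentioned elsewhere on this page, handles the full Zernike set, normalization, wavelength, and pupil-size bookkeeping.

  import numpy as np

  def zernike_psf(w_defocus_um=0.2, w_astig_um=0.1, wavelength_um=0.55, n=256):
      x = np.linspace(-1, 1, n)
      X, Y = np.meshgrid(x, x)
      rho2, theta = X**2 + Y**2, np.arctan2(Y, X)
      pupil = rho2 <= 1.0
      # Unnormalized Zernike terms over the unit pupil: defocus (2*rho^2 - 1)
      # and astigmatism (rho^2 * cos(2*theta)); weights are in micrometers.
      wavefront_um = w_defocus_um * (2 * rho2 - 1) + w_astig_um * rho2 * np.cos(2 * theta)
      phase = 2 * np.pi * wavefront_um / wavelength_um
      pupil_fn = pupil * np.exp(1j * phase)
      psf = np.abs(np.fft.fftshift(np.fft.fft2(pupil_fn))) ** 2
      return psf / psf.sum()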

We have received software from experts on this topic that implements the conversion. We can probably obtain a large number of measurements from different categories of human eyes. In this project, we would create a website to convert Zernike polynomials to point spread functions and illustrate how those point spread functions would influence the quality of the optical image falling on the retina.

As we accumulate additional summaries of the human measurements, we might look for statistical patterns that might be explained in terms of the biological properties of the human cornea and lens.

See: Autrusseau, F., Thibos, L., & Shevell, S. K. (2011). Chromatic and wavefront aberrations: L-, M- and S-cone stimulation with typical and extreme retinal image quality. Vision Research, 51, 2282–2294.

Integrating 3D Distributed Ray Tracing and Image Quality

(BW), (AL), (JEF)

PBRT

Radiance

RenderToolbox

Implement and test Nayar Generalized Patent

Read the patent and implement tests of the idea.

Reference:



Neuroimaging

(AT), (AM), (RFD), (GS)

With the opening of Stanford's Center for Cognitive and Neurobiological Imaging (CNI), we now have access to a large number of MR scans of the human brain. We are also closely connected to the MR hardware and image processing algorithms.

While this course is not specifically about neuroimaging, some of the methods in the course might be usefully applied to the data collected at the CNI. For students already working in MR and interested in such signal processing, we might be able to develop some projects that build on your interest.

  • Intelligent compression algorithm for multi-channel image data stored in frequency space (p-file compression)
  • Algorithm to classify volumes that contain brains in a database of MR images that includes phantoms, squash, fruits, etc. (brain detector)
  • Algorithm to identify when two MR images are of the same brain (brainprint), even if they were acquired using different contrasts.
  • We can also do another one on MR artifact detection (so many artifacts, so few projects...)

Suggestions and projects from previous years

Web page of Project Ideas for 2011
Web page of Project Ideas for 2010
PDF of Project Ideas in 2008


To see projects from previous years, visit the SCIEN Class Projects Page.