Attenuation-Based 3D Display Using Stacked LCDs

From Psych 221 Image Systems Engineering
Revision as of 15:20, 15 December 2017



Introduction

Unlike traditional 2D displays, attenuation-based 3D displays enable the accurate, high-resolution depiction of motion parallax, occlusion, translucency, and specularity. We have implemented iterative tomographic reconstruction for image synthesis on a stack of spatial light modulators (multiple low-cost iPad LCDs). We illuminate these volumetric attenuators with a backlight to recreate a 4D target light field. Although a five-layer decomposition yields the best tomographic reconstruction, our two-layer display costs less than $100 and requires less computation.


Background

Engineers have promulgated designs for 3D displays, and even automultiscopic displays, since as early as the turn of the 20th century. In particular, we consider four types of 3D display technologies that stand in contrast to what we have produced: parallax barriers, integral imaging, volumetric displays, and holograms. What relates these technologies is their shared ability to replicate disparity, motion parallax, and binocular depth cues without the need for special eyewear.


A performance summary of these comparative technologies, based on the results of [Wetzstein et al. 2011], can be found below:


Multi-layer displays present a fifth class of displays, in which a stack of light-attenuating layers placed over a backlight modulates the rays of a target 4D light field. The transmittance pattern of each layer is computed tomographically as an attenuation map, so that the attenuation accumulated along each ray through the stack approximates the corresponding ray of the target light field.


The benefit of a multi-layered display is that it possesses high resolution and contrast with only moderate trade-offs in brightness and complexity. Since our display relies on only two layers and the reconstructed light fields are precomputed, we mitigate these limitations, although the displayed image is static. Additionally, the stacked LCD configuration relies on multiplicative light absorption rather than additive combination; the benefit is that the display can reproduce occlusion, specularity, and depth without the need for any moving components.
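The multiplicative model can be sketched numerically: each layer scales the backlight by its transmittance, so taking logarithms turns the per-ray product into a sum, which is what makes the tomographic formulation linear. A minimal sketch with illustrative transmittance values:

```python
import numpy as np

# Transmittance of each stacked LCD layer along one ray (illustrative values).
t1, t2 = 0.6, 0.5
I0 = 1.0  # normalized backlight intensity

# Multiplicative model: each layer attenuates the light passing through it.
I = I0 * t1 * t2

# In the log domain the per-layer attenuations simply add.
log_product = np.log(I / I0)
log_sum = np.log(t1) + np.log(t2)

print(I)                                 # 0.3
print(np.isclose(log_product, log_sum))  # True
```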

Methods

Light Field Acquisition

We use a 7x7 array of light field images spanning a 20-degree field of view, collected from the Stanford Light Field Archive.


Layered Attenuation-based Displays

We implemented the computed tomography techniques described in “Layered 3D” by G. Wetzstein, D. Lanman, W. Heidrich, and R. Raskar (SIGGRAPH 2011) to decompose a 2048x1536x7x7 target light field, spanning a 20-degree FOV, into two reconstructed layer images. We solve these layer decompositions ahead of time and paint them as static images to the LCDs.


Tomographic Approximation

We rely on the code released by [Wetzstein et al. 2011] to synthesize an attenuation map that approximates the chosen target light field, using iterative tomographic reconstruction to find the optimal solution in the least-squares sense, and we apply this code to the specific two-layer case.


The 2D volumetric attenuator is defined as a continuously varying attenuation map <math>\mu(x,y)</math>. In this model, <math>\mu(x,y)</math> is computed to obey the Beer-Lambert law, so that <math>I = I_0 \, e^{-\int_c \mu(r)\,dr}</math>
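The Beer-Lambert law can be evaluated numerically by discretizing the line integral along a ray. A minimal sketch, with illustrative attenuation samples:

```python
import numpy as np

# Discretized attenuation samples mu(r) along one ray through the volume,
# with step size dr (illustrative values).
mu = np.array([0.2, 0.5, 0.1, 0.4])
dr = 0.25
I0 = 1.0

# Beer-Lambert: I = I0 * exp(-integral of mu along the ray),
# approximated here with a Riemann sum.
I = I0 * np.exp(-np.sum(mu) * dr)
print(I)  # exp(-0.3), roughly 0.741
```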

By the Weber-Fechner law, however, the human visual system recognizes logarithmic changes in illumination as nearly linear, so the illumination is re-computed as <math>\bar{I} = \ln(I/I_0)</math>.


In the forward model, the Radon transform <math>p(u, a)</math> takes the attenuation map <math>\mu</math> and the width and height of the layered slabs, and encodes all possible line integrals through the attenuation map along each ray <math>(u, a)</math>. Here, the orientation of ray <math>(u, a)</math> is defined by the slope <math>a = s - u = d_r \tan(\theta)</math>, where <math>d_r</math> is the distance of the s-axis from the u-axis. The oblique light field can then be described as <math>\bar{l}(u, a) = -p(u, a)</math> for linear angle <math>a</math>.
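The forward model can be sketched for a discrete stack of layers. This is a toy stand-in for the full Radon transform, assuming integer per-layer ray offsets: a ray entering at position u with slope a crosses layer k at u + k*a, and the line integral is the sum of the attenuations it meets.

```python
import numpy as np

def forward_projection(mu_layers, a):
    """Line integrals p(u, a) through a stack of discrete attenuation
    layers, for the family of rays with integer slope offset a.
    mu_layers: (num_layers, width) array; layer k sits at depth k."""
    num_layers, width = mu_layers.shape
    p = np.zeros(width)
    for k in range(num_layers):
        # A ray entering at position u crosses layer k at u + k*a.
        idx = np.arange(width) + k * a
        valid = (idx >= 0) & (idx < width)
        p[valid] += mu_layers[k, idx[valid]]
    return p

mu = np.array([[0.1, 0.2, 0.3, 0.4],
               [0.4, 0.3, 0.2, 0.1]])
print(forward_projection(mu, 0))  # head-on rays: [0.5 0.5 0.5 0.5]
print(forward_projection(mu, 1))  # oblique rays, shifted one pixel per layer
```

The oblique light field is then the negated projection, <math>\bar{l}(u,a) = -p(u,a)</math>.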

With parallel-beam tomography, an estimate of the attenuation map, <math>\tilde{\mu}(x,y)</math>, is recovered from the projections <math>p(u,a)</math> using the inverse Radon transform. In turn, the filtered backprojection algorithm estimates a volumetric attenuator capable of emitting the target light field.

The attenuation map is expressed as a linear combination of <math>N_b</math> non-negative basis functions with coefficients <math>\alpha</math>; the projection matrix <math>\mathbf{P}</math>, with entries <math>P^{(k)}_{ij}</math>, encodes the line integral through each basis function <math>k</math> along each ray <math>(i,j)</math>.

When the light field is discretized, the attenuation can be modeled as a linear system of equations, expressed in matrix-vector form as <math>\mathbf{P}\alpha = \bar{l} + e</math>, where <math>e</math> is the approximation error.

As a result, we cast attenuation map synthesis as the following non-negative linear least-squares problem:

<math>\alpha^* = \arg\min_{\alpha} \left\| \bar{l} - \mathbf{P}\alpha \right\|^2, \quad \text{for } \alpha \ge 0</math>
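The non-negative least-squares synthesis can be sketched with a projected gradient iteration. This is a minimal numpy stand-in for the iterative (SART-style) solver in the released code, on a tiny illustrative system:

```python
import numpy as np

def nnls_pgd(P, l_bar, iters=500):
    """Non-negative least squares via projected gradient descent:
    minimize ||l_bar - P @ alpha||^2 subject to alpha >= 0."""
    alpha = np.zeros(P.shape[1])
    step = 1.0 / np.linalg.norm(P, 2) ** 2  # safe step from the spectral norm
    for _ in range(iters):
        grad = P.T @ (P @ alpha - l_bar)
        alpha = np.maximum(alpha - step * grad, 0.0)  # project onto alpha >= 0
    return alpha

# Tiny illustrative system: 3 rays, 2 basis coefficients.
P = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
l_bar = P @ np.array([0.3, 0.7])  # target generated from a known solution
alpha = nnls_pgd(P, l_bar)
print(np.round(alpha, 3))  # recovers [0.3, 0.7]
```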

For multi-layered attenuators, the form of the projection matrix <math>\mathbf{P}</math> is modified, now encoding the intersection of every ray with each mask. Thus, a similar optimization solves the inverse problem of constructing an optimal multi-layered attenuator. Practically, however, layers have finite contrast, and the approximation is solved as a constrained least-squares problem.
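With finite layer contrast, the projection step of the same iteration simply clamps to a box rather than to the non-negative orthant. A hedged sketch, where mu_max is an assumed per-layer attenuation limit and the system is illustrative:

```python
import numpy as np

def constrained_lsq(P, l_bar, mu_max, iters=500):
    """Box-constrained least squares: minimize ||l_bar - P @ alpha||^2
    subject to 0 <= alpha <= mu_max, modeling layers of finite contrast."""
    alpha = np.zeros(P.shape[1])
    step = 1.0 / np.linalg.norm(P, 2) ** 2
    for _ in range(iters):
        grad = P.T @ (P @ alpha - l_bar)
        alpha = np.clip(alpha - step * grad, 0.0, mu_max)  # finite-contrast box
    return alpha

P = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
l_bar = P @ np.array([0.3, 0.7])
# With mu_max below 0.7, the second coefficient saturates at the contrast
# limit and the remaining error is redistributed to the first.
print(np.round(constrained_lsq(P, l_bar, mu_max=0.5), 3))  # ~[0.4, 0.5]
```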

Assembly

We designed an enclosure such that the stack of LCDs (2048x1536, 9.7”, IPS, 60 Hz, iPad 3) would be well-fastened and their driver circuitry well-hidden. In this initial prototype, we attached the front LCD with adhesive so that we could manually adjust the display for approximate pixel alignment. The assembly also includes a quarter-wave plate sheet.

Results

Limitations

• Brightness

The most apparent usability issue in our current prototype is the low brightness of the displayed scene. This problem is inherent to the stacked LCD approach, as each LCD transmits only a fraction of the incident backlighting. We measure our two-layer prototype to have a total light transmissivity of roughly 10%, using a fixed light source and an inexpensive lux meter. Some LCD panels achieve higher fill factors and correspondingly higher transmissivity, but for the future of our prototype, we think it simplest to offset this heavy transmission loss by engineering a high-intensity backlight.
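Back-of-envelope arithmetic on the measured number, under the assumption that both panels attenuate equally: 10% total transmissivity implies roughly 32% per layer, and a backlight about 10x brighter to match a single bare panel.

```python
# Back-of-envelope: total stack transmissivity T and what it implies.
T_total = 0.10                   # measured two-layer transmissivity
per_layer = T_total ** 0.5       # assuming both panels attenuate equally
backlight_gain = 1.0 / T_total   # brightness multiplier to compensate

print(round(per_layer, 3))  # 0.316
print(backlight_gain)       # 10.0
```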


• Low Refresh Rate

The original stacked LCD design from Wetzstein et al. called for high-refresh-rate (120+ Hz) LCDs, so that successive frames could multiplex many light rays in time. We want to first characterize exactly how much bearing this multiplexing has on the perceptual quality of the scene and, if it is not absolutely critical, explore partial solutions that avoid perceptible flicker at just 60 Hz.


• Narrow Field of View

Our current field of view limitation comes from the LCD's limited refresh rate, as well as the increased computational cost of rendering more light fields in real time. However, we can reduce the effective field of view requirements by adding a head tracking camera to the system, optimizing for single- or few-viewer cases by rendering narrower fields targeted to the known observers. Put another way, knowing the viewer's position might let us cull many unnecessary light field elements without significantly compromising perceived scene quality.
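The culling idea can be sketched as simple view selection: given an estimated viewer direction, keep only the light field views within a small angular window. A hypothetical sketch; the 7x7 grid and 20-degree span follow the acquisition described above, while the window size is an assumed parameter:

```python
import numpy as np

# 7x7 grid of view angles spanning a 20-degree field of view (degrees).
angles = np.linspace(-10.0, 10.0, 7)
view_u, view_v = np.meshgrid(angles, angles)

def cull_views(viewer_u, viewer_v, window=4.0):
    """Boolean mask of the views within `window` degrees of the tracked
    viewer direction; only these need to be rendered."""
    keep = (np.abs(view_u - viewer_u) <= window) & \
           (np.abs(view_v - viewer_v) <= window)
    return keep

mask = cull_views(viewer_u=0.0, viewer_v=3.3)
print(mask.sum(), "of", mask.size, "views rendered")  # 9 of 49
```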


• Pixel Alignment

Our future prototypes will be designed around a more rigid frame, with better facilities for fine positional adjustments. We think the current prototype suffers from significant misalignment of the two LCDs, resulting in what appears to be a "smeared, blurry" scene.


• Color Crosstalk

Adding more layers to our stack will afford us finer control over the aggregate tomographic mask, helping reduce the crosstalk we currently observe from our lack of an ideally directional backlight.

Performance

• High Contrast Halo Artifacts

• Depth of Field: the upper bound on the spatial cutoff frequency for a two-layer attenuation display is <math>|f_\epsilon| \le \left[ h / (h/2 + |d_o|) \right] f_0</math>, where <math>f_\epsilon</math> is the spatial cutoff frequency, <math>h</math> is the fixed display thickness, <math>d_o</math> is the distance of the depicted plane from the display midplane, and <math>f_0</math> is the cutoff frequency of an individual layer.
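The depth-of-field bound can be evaluated numerically. A sketch with illustrative values for the thickness, distances, and per-layer cutoff, with the depicted plane's distance measured from the display midplane:

```python
def cutoff_frequency(h, d_o, f0):
    """Upper bound |f| <= (h / (h/2 + |d_o|)) * f0 for a two-layer display.
    h: display thickness, d_o: distance of the depicted plane from the
    display midplane, f0: spatial cutoff of a single layer."""
    return (h / (h / 2.0 + abs(d_o))) * f0

h = 1.0    # display thickness (illustrative units)
f0 = 50.0  # per-layer cutoff (cycles per unit, illustrative)
for d_o in [0.0, 0.5, 1.0, 2.0]:
    print(d_o, round(cutoff_frequency(h, d_o, f0), 1))
```

Note that the bound gives 2*f0 at the midplane and falls to f0 at the display surfaces (d_o = h/2), reflecting that two layers can depict up to twice the per-layer resolution near the center of the enclosure.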

• Resolution

Thanks to major advances in LCD technology, we achieve much higher layer/mask resolution of 2048x1536 without significant increases in cost. Stacked LCDs are thus attractive as an economical way to achieve high-fidelity control over light rays.

• PSNR
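A sketch of how the reconstruction PSNR could be computed over the synthesized views, using the standard definition; the arrays here are illustrative stand-ins for the target and reconstructed light fields:

```python
import numpy as np

def psnr(target, recon, peak=1.0):
    """Peak signal-to-noise ratio in dB between a target light field and
    its reconstruction (both scaled to [0, peak])."""
    mse = np.mean((target - recon) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
target = rng.random((7, 7, 64, 64))  # stand-in 4D light field (u, v, s, t)
recon = np.clip(target + 0.01 * rng.standard_normal(target.shape), 0, 1)
print(round(psnr(target, recon), 1), "dB")  # roughly 40 dB for 1% noise
```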


From "Layered 3D":

Conclusions

Improvements

• Dynamic Rendering

Currently, our model is designed for static scenes only, mostly constrained by a lack of compute to render dynamic scenes in real time. Very cursory profiling suggests that significant speedups to the rendering pipeline should be possible, so we intend to write a more optimized, GPU-accelerated renderer targeting 30-60fps scene display.


• High-Refresh Rate (>144Hz)



• Three (or N) Layer Display

Increasing layer count generally affords us better control over light ray steering, reducing artifacts in the light field. Of course, as mentioned above, increasing LCD count also reduces light transmission, which we will initially address with a more advanced, high power lighting scheme.


• Face-Tracking

Having a good estimate of the observer's position allows aggressive rendering optimization, reducing computational cost and potentially improving solutions to the light field. We initially plan to use a standard web camera and OpenCV software running on the render host, but intend to move to a tightly embedded "RealSense" stereo camera that additionally provides depth annotations, letting us further optimize our target light field for the viewer.

Appendix I

[1] G. Wetzstein, D. Lanman, W. Heidrich, R. Raskar. Layered 3D: Tomographic Image Synthesis for Attenuation-based Light Field and High Dynamic Range Displays. Proc. of SIGGRAPH 2011 (ACM Transactions on Graphics 30, 4), 2011.
[2] G. Wetzstein, D. Lanman, D. Gutierrez, M. Hirsch. Computational Displays. ACM SIGGRAPH 2012 Course, 2012.


Matlab Implementation of Tomographic Light Field Synthesis
Real-Time Implementation of Tomographic Light Field Synthesis

Appendix II