An Overview of Hyperspectral Colon Tissue Cell Classification

From Psych 221 Image Systems Engineering
Revision as of 00:08, 22 March 2012 by imported>Psych2012 (Principal component analysis (PCA))

Project Title

An Overview of Hyperspectral Colon Tissue Cell Classification

Introduction

Why Classification?

The colon is the upper part of the large intestine and the rectum is the lower part. In practice, colon cancer and rectal cancer are often treated as separate conditions; colorectal (or bowel) cancer is the collective name for both. It is caused by the uncontrolled growth of tissue cells in either the colon or the rectum, and it is the third most commonly diagnosed cancer after lung and breast cancer.

Yet 80% of colorectal cancer cases can be treated if caught at an early stage. It is therefore important to discriminate between normal and malignant tissue cells of the human colon, so that the malignant cells can be identified and treated.

Hyperspectral sensors

Hyperspectral sensor data

The high spectral resolution of hyperspectral sensors preserves important aspects of the spectrum. Hyperspectral sensors rely on the fact that any body with a temperature above absolute zero emits radiation, and reflects part of the incident energy, in characteristic frequency bands. This makes it possible to segment different materials.

The image data provided by hyperspectral sensors is visualized as a 3D cube, where the face is a function of the spatial coordinates, f(x,y), and the depth is a function of wavelength, d(λ). The data can also be seen as a stack of 2D images: each image represents a range of the electromagnetic spectrum and is known as a spectral band, and each spatial point on the face is characterized by its own spectrum. These images are combined to form a three-dimensional hyperspectral data cube for processing and analysis.
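The two views of the data cube described above can be illustrated with a small array. This is only a sketch: the cube dimensions and the axis order (rows, columns, bands) are assumptions chosen for illustration, not properties of any particular sensor.

```python
import numpy as np

# Hypothetical cube: 4 x 5 spatial points and 6 spectral bands.
rows, cols, bands = 4, 5, 6
cube = np.arange(rows * cols * bands, dtype=float).reshape(rows, cols, bands)

# View 1: each spatial point on the face has its own spectrum,
# a vector with one value per band.
pixel_spectrum = cube[1, 2, :]

# View 2: each spectral band is a 2D image over the spatial coordinates.
band_image = cube[:, :, 3]

# Stacking the band images along the wavelength axis rebuilds the cube.
restacked = np.stack([cube[:, :, b] for b in range(bands)], axis=-1)
```

Slicing along the last axis gives a band image, while fixing the two spatial indices gives the spectrum of one point.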

Pattern recognition (Tissue classification)

The detection of malignant cells can be viewed as a typical pattern recognition problem. Pattern recognition in images consists of three successive steps, which apply to the tissue classification problem as follows:

1. Image segmentation: Objects (tissue cells) contained in the image scene are separated from the background. In this setting, it means separating the constituent parts of the tissue cells.
2. Feature extraction: The characteristics of each object are quantified. These features should contain enough discriminant information to distinguish normal tissue from malignant tissue.
3. Classification: Each tissue cell is assigned to one of the target classes, normal or malignant.
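The three steps above can be sketched as a toy pipeline. Everything in it is a stand-in chosen for illustration and is not the method used in this project: the intensity threshold for segmentation, the mean spectrum as the feature vector, the nearest-reference rule for classification, and all numeric values are assumptions.

```python
import numpy as np

def segment(cube, threshold):
    """Step 1 (toy rule): foreground mask from mean intensity across bands."""
    return cube.mean(axis=2) > threshold

def extract_features(cube, mask):
    """Step 2 (toy rule): mean spectrum of the foreground pixels."""
    return cube[mask].mean(axis=0)

def classify(features, references):
    """Step 3 (toy rule): label of the nearest reference spectrum."""
    return min(references,
               key=lambda label: np.linalg.norm(features - references[label]))

# Synthetic 2 x 2 scene with 3 bands; the values are made up.
cube = np.array([[[0.0, 0.0, 0.0], [0.9, 0.5, 0.1]],
                 [[0.8, 0.6, 0.2], [0.0, 0.1, 0.0]]])
# Hypothetical reference spectra for the two target classes.
references = {"normal": np.array([0.2, 0.6, 0.9]),
              "malignant": np.array([0.9, 0.5, 0.1])}

mask = segment(cube, threshold=0.3)
features = extract_features(cube, mask)
label = classify(features, references)
```

Each function can be replaced independently by a more realistic method without changing the overall structure.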


The rest of this overview is organized around these three steps.

Methods

Dimensionality Reduction

Dimensionality reduction sample

Before the formal segmentation of hyperspectral imagery, an intermediate dimensionality reduction step is often applied. The goal is to eliminate redundancy in the data while preserving the discriminant features needed by segmentation, detection, or classification algorithms. Dimensionality reduction also addresses the high computational cost that the sheer size of hyperspectral image data normally entails. A common approach to dimensionality reduction in data mining is Singular Value Decomposition (SVD); here we discuss PCA and ICA instead.

Reduction methods fall into two categories:
(1) Linear methods: principal component analysis, factor analysis and independent component analysis.
(2) Non-linear methods: curvilinear component analysis, curvilinear distance analysis and multi-dimensional scaling.

Principal component analysis (PCA)

Principal component analysis (PCA) is a multivariate statistical analysis tool that attempts to find the natural coordinate axes of a multidimensional dataset. It represents higher-dimensional data on lower-dimensional orthogonal axes such that the transformed data are decorrelated. This representation can be viewed as a transformation of the original data into a new vector space whose basis vectors are linear combinations of the original data vectors.
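As a minimal sketch of this idea, PCA can be applied to a data cube by treating every pixel as one observation and every band as one variable. The function name `pca_reduce`, the random test cube, and the choice of an eigendecomposition of the band covariance matrix are assumptions made for illustration.

```python
import numpy as np

def pca_reduce(cube, n_components):
    """Project each pixel spectrum onto the leading principal axes."""
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands)               # pixels x bands
    Xc = X - X.mean(axis=0)                   # center every band
    cov = np.cov(Xc, rowvar=False)            # bands x bands covariance
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]            # orthonormal basis vectors
    scores = Xc @ components                  # decorrelated coordinates
    return scores.reshape(rows, cols, n_components), components, eigvals[order]

# Hypothetical cube of random values: 8 x 8 pixels, 10 bands.
rng = np.random.default_rng(0)
cube = rng.normal(size=(8, 8, 10))
reduced, components, variances = pca_reduce(cube, 3)
```

The returned `variances` are the eigenvalues of the kept axes, so they indicate how much of the spectral variance each retained component carries.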

Independent component analysis (ICA)

Segmentation

Similarities in the shape at wavelength between 500 and 600 nm

Abnormal spectra - spleen

Wavelength between 700 and 1000 nm

Classification

Spectral Reference Sample Preparation

Data Acquisition

Data Analysis

A_{xy}(r_i) = \log\left[ R_{xy}(r_i)_{o} / R_{xy}(r_i) \right] \quad (1)
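Equation (1) can be evaluated for every pixel and band at once. The sketch below rests on assumptions: the term marked with "o" is read as a reference reflectance, r_i is read as the band index, the logarithm is taken as base 10, and the numeric values are made up.

```python
import numpy as np

def apparent_absorbance(sample, reference):
    """Equation (1) per pixel and band, assuming a base-10 logarithm."""
    sample = np.asarray(sample, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return np.log10(reference / sample)

# Hypothetical 2 x 2 image with 3 bands.
reference = np.full((2, 2, 3), 1.0)   # assumed reference reflectance
sample = np.full((2, 2, 3), 0.1)      # assumed sample reflectance
sample[0, 0, :] = [1.0, 0.1, 0.01]

A = apparent_absorbance(sample, reference)
```

A pixel that reflects as much as the reference gets an absorbance of 0, and each tenfold drop in reflectance adds 1.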



C_{kj} = S_{k r_i} R_{j r_i} + e_{kj} \quad (2)


S_{k r_i} = (P_{k r_i}^{t} P_{k r_i}) P_{k r_i}^{t} \quad (3)


Tissue Oxygen Saturation Algorithm



SO2-dependent component

s=a2A2 a1A1 a3A3 

Correction for blood volume

V=w1(r11)r2+1K522(wr1r2)K548w+K569 






Result

Algorithm




Conclusions

Describe what you learned. What worked? What didn't? Why? What would you do if you kept working on the project?


Reference