An Overview of Hyperspectral Colon Tissue Cell Classification
Project Title
An Overview of Hyperspectral Colon Tissue Cell Classification
Introduction
Why Classification?
The colon is the upper part of the large intestine, and the rectum is the lower part. In practice,
colon cancer and rectal cancer are treated as separate cancers; colorectal (or bowel) cancer is the collective name
for both. Colorectal cancer is caused by the uncontrolled growth of tissue cells in either the colon or the rectum.
It is the third most commonly diagnosed cancer, after lung and breast cancer.
Yet 80% of colorectal cancer cases can be treated if caught at an early stage.
It is therefore important to discriminate between normal and malignant tissue cells of the human colon. Once the malignant cells have been identified, they can be dealt with.
Hyperspectral sensors

The high spectral resolution of hyperspectral sensors preserves important aspects of the spectrum.
Hyperspectral sensors exploit the fact that any body at a temperature above absolute zero emits radiation, and that materials reflect or absorb incident energy in characteristic frequency bands. This makes it possible to segment different materials.
The image data provided by hyperspectral sensors is visualized as a 3D cube, where the face is a function of the spatial coordinates, f(x, y), and the depth is a function of wavelength, d(λ). The image data can also be seen as a stack of 2D images. Each spatial point on the face is characterized by its own spectrum. Each 2D image represents a range of the electromagnetic spectrum and is known as a spectral band. These images are combined to form the three-dimensional hyperspectral data cube used for processing and analysis.
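The cube layout described above can be sketched as a three-dimensional array. This is a minimal illustration, assuming NumPy and a small synthetic cube; the dimensions and the (rows, cols, bands) axis order are assumptions, not properties of any particular sensor.

```python
import numpy as np

# Hypothetical cube size: 4 x 5 pixels, 3 spectral bands.
rows, cols, bands = 4, 5, 3
rng = np.random.default_rng(0)
cube = rng.random((rows, cols, bands))  # synthetic stand-in for sensor data

# One spectral band is a 2D image: fix the wavelength index.
band_image = cube[:, :, 1]              # shape (rows, cols)

# One pixel's spectrum is a 1D vector: fix the spatial coordinates.
pixel_spectrum = cube[2, 3, :]          # shape (bands,)

# Stacking the per-band 2D images along the last axis rebuilds the cube.
restacked = np.stack([cube[:, :, k] for k in range(bands)], axis=-1)

# Flattening the spatial axes gives a (pixels x bands) matrix,
# with one row per pixel spectrum.
pixels = cube.reshape(-1, bands)
```

The flattened (pixels x bands) matrix is the form that band-wise statistical methods usually operate on.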
Pattern recognition (Tissue classification)
The detection of malignant cells can be viewed as a typical pattern recognition problem.
Pattern recognition in images consists of three successive steps, which can be applied to the
tissue classification problem as follows:
1. Image segmentation: Objects (tissue cells) contained in the image scene are separated from the background, and the constituent parts of the tissue cells are separated from one another.
2. Feature extraction: The characteristics of each object are quantified. These features should contain enough discriminant information to distinguish normal tissue from malignant tissue.
3. Classification: Each tissue cell is assigned to one of the target classes, normal or malignant.
The rest of our study is organized around these three steps.
Methods
Dimensionality Reduction

Before the formal segmentation of hyperspectral imagery, an intermediate step of dimensionality
reduction is often applied. The goal is to eliminate redundancy in the data while preserving the features that segmentation, detection, or classification algorithms need to discriminate between classes.
Dimensionality reduction also lowers the high computational cost that the large size of hyperspectral image data normally carries.
A common approach to dimensionality reduction in data mining is singular value decomposition (SVD). Here, however, we discuss PCA and ICA instead.
Reduction methods fall into two categories:
(1) Linear methods: principal component analysis, factor analysis and independent component analysis.
(2) Non-linear methods: curvilinear component analysis, curvilinear distance analysis and multi-dimensional scaling.
Principal component analysis (PCA)
Principal component analysis (PCA) is a multivariate statistical analysis tool that attempts to find the natural
coordinate axes of a multidimensional dataset. It represents the higher-dimensional data on a smaller set of
orthogonal axes along which the data are decorrelated. This representation can be considered a
transformation of the original data into a new vector space in which each new coordinate is a linear combination
of the original variables (here, the spectral bands). Put briefly, PCA projects the multivariate data onto orthogonal axes that are the eigenvectors of the covariance matrix of the original data.
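The projection onto the eigenvectors of the covariance matrix can be sketched as follows. This is a minimal illustration assuming NumPy; the function name `pca_reduce` and the synthetic cube are hypothetical and are not taken from any dataset used in this project.

```python
import numpy as np

def pca_reduce(cube, n_components):
    """Project a (rows, cols, bands) cube onto its leading principal components."""
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands).astype(float)   # one row per pixel spectrum
    X_centered = X - X.mean(axis=0)             # remove the mean spectrum
    cov = np.cov(X_centered, rowvar=False)      # (bands, bands) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]              # (bands, n_components)
    scores = X_centered @ components            # coordinates on the new axes
    return scores.reshape(rows, cols, n_components), eigvals[order]

rng = np.random.default_rng(0)
# Synthetic cube (an assumption): 2 latent signals mixed into 10 correlated bands.
latent = rng.normal(size=(8 * 8, 2))
mixing = rng.normal(size=(2, 10))
cube = (latent @ mixing + 0.01 * rng.normal(size=(64, 10))).reshape(8, 8, 10)

reduced, variances = pca_reduce(cube, n_components=2)
```

Here `reduced` keeps the spatial layout of the cube but has only two "bands", and `variances` holds the variance captured along each retained axis, in descending order.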
Independent component analysis (ICA)
Independent component analysis (ICA) extends traditional multivariate data analysis techniques to determine the hidden components in the data.
Unlike PCA, ICA does not merely look for a decorrelated lower-dimensional representation of the data; it also attempts to discover statistically independent components.
ICA relies on several assumptions:
(1) The components are mutually independent.
(2) The number of data dimensions is equal to or greater than the number of hidden independent components.
(3) The independent components have non-Gaussian distributions (at most one may be Gaussian).
ICA can also be seen as an extension of PCA and factor analysis. Several algorithms exist for computing it, such as FastICA and FlexICA.
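As an illustration of how independent components can be recovered, here is a simplified NumPy sketch of a symmetric FastICA iteration with a tanh nonlinearity. It is an assumption-laden toy, not a reference implementation: the helper names `whiten` and `fast_ica`, the mixing matrix, and the two synthetic sources are all hypothetical. The sources are chosen to satisfy the three assumptions listed above (independent, non-Gaussian, and no more sources than data dimensions).

```python
import numpy as np

def whiten(X):
    """Center X (samples x dims) and rescale it to identity covariance."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return Xc @ eigvecs / np.sqrt(eigvals)

def fast_ica(X, n_iter=200, tol=1e-8, seed=0):
    """Symmetric FastICA with a tanh nonlinearity; returns estimated sources."""
    Z = whiten(X)
    n, d = Z.shape
    rng = np.random.default_rng(seed)
    W = np.linalg.qr(rng.normal(size=(d, d)))[0]   # random orthogonal start
    for _ in range(n_iter):
        Y = Z @ W.T                                # current source estimates
        g = np.tanh(Y)
        g_prime = 1.0 - g ** 2
        # Fixed-point update for every row w of W:
        #   w <- E[z g(w.z)] - E[g'(w.z)] w
        W_new = (g.T @ Z) / n - g_prime.mean(axis=0)[:, None] * W
        # Symmetric decorrelation W <- (W W^T)^(-1/2) W, computed via SVD
        U, _, Vt = np.linalg.svd(W_new)
        W_new = U @ Vt
        # Converged when each row is parallel (up to sign) to its previous value
        converged = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    return Z @ W.T

rng = np.random.default_rng(1)
n_samples = 5000
# Two independent, non-Gaussian sources (uniform and Laplace).
S = np.column_stack([rng.uniform(-1, 1, n_samples), rng.laplace(size=n_samples)])
A = np.array([[1.0, 0.5], [0.4, 1.0]])             # hypothetical mixing matrix
X = S @ A.T                                        # observed mixtures
recovered = fast_ica(X)
```

ICA recovers the sources only up to order, sign, and scale, so a fair comparison of `recovered` with `S` uses the absolute correlation between columns rather than a direct difference.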
Segmentation
At a microscopic level, human colon tissue can be characterised as having four constituent parts: nuclei, cytoplasm, lamina propria, and lumen. These parts can be labelled before classification. There are two approaches to segmenting the hyperspectral image data:
(1) spatial analysis
(2) spectral analysis
Spatial analysis: Wavelet based segmentation
Spectral analysis: ICA based segmentation
Classification
Spectral Reference Sample Preparation
Data Acquisition
Data Analysis
(1)
(2)
(3)
Tissue Oxygen Saturation Algorithm
SO2-dependent component
Correction for blood volume
Result
Algorithm
Conclusions
Reference