Samir
Investigating the Effects of Motion Correction, Smoothing and ICA on High-Resolution fMRI Data
Summary
This project explores how two popular preprocessing methods, smoothing and ICA, influence high resolution fMRI datasets. Using a dataset focused on the brain's motor regions, the motor cortex, the basal ganglia and the cerebellum, I investigated:
1. The efficacy of FSL's and SPM's motion correction algorithms
2. The effect of smoothing on T-statistics and voxel significance
3. The ability of ICA to delineate task-specific signals from noise
Motivation
The ability to coordinate muscles is fundamental to human motor control, and underlies even simple motions like holding a cup with a hand and flipping a switch with the same arm’s elbow. Past research has not elucidated the brain’s coordination strategy because neurophysiology experiments [1] are confounded by the fact that at a fine spatial scale, motor related neurons correlate with all observable movement parameters [2] , while at a higher spatial scale, functional neuroimaging experiments have been able to delineate fractured somatotopic sensorimotor maps [3], which are hard to interpret.
Probing the motor control regions in a manner that elucidates the structure of the underlying motor controller thus requires novel experiments. High resolution fMRI is ideal for such experiments since it can probe neural correlates of motor activity with a high spatial resolution (a few mm3) and a temporal resolution similar to most motor tasks (a few seconds). The same high resolution fMRI data, however, might be interpreted in multiple ways with minor changes to the processing pipeline. The goal of this project is to understand how three popular preprocessing steps, motion correction, smoothing, and ICA, can influence analysis and statistical results.
Methods
A novel motor control dataset was acquired for this study. The data collection started before the course but continued through its duration.
Subject Details
For the data presented in this project, I am including scans for 3 healthy right-handed volunteers (2 males; 19–28 yr of age) from our set of subjects. All subjects were informed about the experiment's details in advance, and gave their informed written consent to a protocol approved by the Institutional Review Board of Stanford University. The subjects were healthy and did not have any psychiatric or neurological disorders at the time or in the past. Subjects were asked to lie supine on the scanner bed, and their heads were surrounded by cloth to comfort and reduce head movements. In addition, they bit down on a bite bar for the duration of the study. The bite bar was customized to each subject's dental structure with a slow-hardening putty.
Task Details
Subjects were asked to perform writing tasks like drawing a square with a pencil. They were provided visual text stimuli indicating when to plan and execute motions, and when to rest. For instance, one stimulus sequence would be: `Plan : Square' (yellow, 5 sec), `Execute : Square' (green, 8 sec), and `Rest' (red, 3-11 sec, randomized). During the plan phase subjects would plan their motion and possibly adjust grasp position, but would not make any whole arm motions. During the execute phase, subjects would move their entire arm to draw a square, without any finger movement. And during the rest phase, they would place their arm on their torso and rest. Subjects were asked to abandon tasks midway if they could not complete them in time.
The task presentation was randomized and orthogonalized using optseq. A sample design matrix representing one subject's task presentation demonstrates optseq's sequencing (Fig. 1).
Stimulus Presentation
The task description text was displayed on a modified Samsung SyncMaster 305T 30 inch diagonal display (76 cm, 16:10 aspect ratio), built by Resonance Technology (www.mrivideo.com) at a resolution of 1280x800 pixels. Subjects viewed the display through a double mirror at a distance of about 190cm from the head-coil, and the display was inverted so text appeared normal to the subjects. The text was rendered on the screen using VisionEgg, a freely available python based stimulus presentation software.
Scan Sequence Details
We acquired images GE MR 750 3-Tesla scanner and a Nova 32 channel head coil at the Stanford Center for Cognitive and Neurobiological Imaging (CNI). Two series of 510 functional volumes were acquired using a gradient echo, echoplanar sequence with a 134 square matrix, 27 oblique slices, 2mm thick, with a 0.5mm gap between slices. The voxel size was 1.5mm x 1.5mm x 2.0mm, repetition time was 2 sec, echo time was 33ms and flip angle was 75degrees. The slices were adjusted for each subject to include cerebellum, posterior striatum, and motor cortex.
The functional images were overlaid on a co-aligned, high-resolution anatomical scan of the whole brain taken at the end of each session (BRAVO sequence; TR = 7.8 sec; TE = 3.1 ms, flip angle = 12 degrees; matrix, 300x300; 0.8 mm, isotropic).
Estimating Motion and Correcting it
Data
Discussion
Since our study involved major arm motions, a bite bar was necessary to prevent head movement. Once we started using it, subject head motions went down considerably. Combined with ensuring that the subjects were comfortable in the scammer, ensured that we eliminated motion related artifacts in almost all our datasets. We checked that motion did not influence our results by calculating whether the motion was correlated with mean activity changes as well as the task regressors (Fig. 1 shows a representative example).
In addition, we ran motion correction using two different algorithms implemented by FSL and SPM (Fig. 2). Manually checking a video of the time-series, it seemed that FSL corrected motion better. This seems to be reflected in the motion estimates, where SPM seems to underestimate the motion at places. The difference might be due to the fact that FSL's motion correction pipeline removes the skull and smooths the data, while I did not add those steps for SPM. However, the differences were very small in general so the choice of software did not make much difference (also see [6]). I used SPM's motion correction in the end because we preferred SPM's GLM analysis to the alternatives. If motion artifacts are a concern, the best idea is probably to improve the experimental protocol and eliminate them instead of relying on motion correction.
The Effect of Smoothing
Data
Discussion
For the high-resolution data we collected, smoothing substantially decreased the T-statistic across the dataset, and often led to a complete loss of significantly activated voxels in entire brain regions (Fig. 3). While smoothing theoretically increases signal to noise [4], it did not help our analysis much. The only metric where smoothing held a minor advantage, was that it avoided false positives throughout the data (Fig. 4). One reason why smoothing might have decreased the t scores in this analysis, is that we use a rapid randomized task sequence where the elicited BOLD responses might have been at the margin of detection. In such situations, the size of the effect would be small and smoothing would effectively be equivalent to introducing partial-volume effects [Personal communications with Prof. Gary Glover, Stanford University]. This might not be the case for lower resolutions and longer task sequences or blocked designs.
I reconstructed each individual subject's brain surfaces to avoid introducing co-registration and normalization related errors, and created inflated and pial brain models with freesurfer. The projected T-statistics for a subject demonstrate how smoothing dramatically reduced the significance of motor activity (Fig. 5). This effect was conserved across subjects and multiple motion types. Note that projecting on to an inflated brain warps single voxels into circles, making them seem larger than they truly are.
Another important aspect to note is that while anatomical and smoothed functional underlays might display grey matter, CSF related scanner artifacts could have actually created voids at the same position. However, if activation in the void persists after carefully checking the data for motion artifacts and using family-wide error correction you can't reject it. The vasculature is often dense at the boundaries of the cortical surface and if the neural activation in neighboring regions is strongly correlated with the task, then the blood flow in the vasculature might make it seem like the BOLD signal is coming from the artifact region [Personal communications with Prof. Gary Glover, Stanford University].
Data
Discussion
Since smoothing did not seem like a valuable strategy to increase the contrast significance, I next explored whether ICA can delineate noise from task-related activity. The analysis used FSL's standard pipeline, McFLIRT motion correction, brain masking, and 3mm isotropic smoothing. FSL's MELODIC implementation predicted that my data was about 175 dimensional, and produced 175 independent components to describe it (Fig. 6). Quoting the FSL website, "Estimated Component maps were divided by the standard deviation of the residual noise and thresholded by fitting a mixture model to the histogram of intensity values".
Manually sorting through the components, however, was not only grueling, but actually counter productive since many components looked like they might be valid "motor components" but did not contain any task related information. Any template based component identification that only looked at spatial localization could potentially face similar problems.
To automate the process of identifying useful components I computed the correlation of each component with all the regressors, essentially applying my GLM model to each component's averaged time series (Fig. 7) and computing the P values (Fig. 8). Doing so revealed the components that contained task-related information. Verifying the components, I found that well correlated components generally appeared next to motor regions (Fig. 9). Another potential pothole for manually sorting components is when a positive z-score actually involves negative task regressor correlations. Investigating why MELODIC produced such scores might be interesting for anyone who plans to use its z-scores for further analysis.
While ICA seems like a promising approach to identify noisy underlying probability distributions in the data, my preliminary investigation doesn't support using it to identify task-related components for high resolution fMRI studies. First, simple GLM analysis give much finer results than ICA's broadly active regions. Second, the thresholded z-scores can not be interpreted on their own and must be compared to some regression model. Third, varying the number of voxels changed the regions in components and their correlations with the task regressors in an arbitrary manner (analyzed data for 7 and 12 components; email me for the results). And finally, the percentage of variance captured by task-correlated components is very small. Moreover, my investigations did not explore the effects of different smoothing levels on the components, which I am sure will be substantial.
In a nutshell, since I plan to cross validate my data in the future I will probably not use ICA. Primarily because, when it can, it removes noise in an arbitrary and non-determinsitic manner that necessitates manual validation. As such, without further analysis and possibly fundamentally changing the MELODIC algorithm's implementation it would be hard to denoise datasets without already having information about the task regressors. Since cross-validation would obviate this possibility, MELODIC ICA's denoising will probably only cause overfitting and degrade any cross-validation.
Unsmoothed Neural Correlates of Writing Tasks
Data
Discussion
With my results, I decided to process my data without smoothing it or denoising it with ICA. Fortunately, since the experiments were well designed and the scan setup at the CNI is really great, I got nice activation correlates which I converted into t-statistic maps using freesurfer. The t-statistic maps for three subjects measuring motor activity during writing displayed significant neural correlates of writing in motor, pre-motor and somatosensory cortices (Fig. 10), as well as cerebellum (not shown). The locations at which voxels correlated significantly with task regressors changed from subject to subject, whose brains had visibly different shapes.
My conclusion was that it is best to focus one's efforts on experiment design and physically measurable parameters to denoise data. And then to run statistics on minimally post-processed data. Post processing typically adds unpredictable effects to analysis pipelines, which necessitate many time-consuming checks and can possibly degrade the quality of the results.
Future Work
These preliminary results are a part of my investigations to ensure that my experimental protocol achieves the highest possible resolution and sensitivity to task-related BOLD signals. In the future, I plan to systematically explore other data-driven algorithms to identify motor features in my data. I also plan to use biomechanical models and robotic control theory to predict the functional organization of the motor regions. Unfortunately, the study is at an early stage and the details will become clearer as it progresses. Please feel free to email me if you would like to hear more about it.
Acknowledgements and Disclaimer
The experimental protocol was developed in collaboration with Prof. Samuel McClure, Wouter Van Den Bos (Prof. McClure's postodc), Prof. Oussama Khatib and Prof. Kwabena Boahen. Vanessa Sochat introduced me to FSL and helped me set up a MELODIC processing pipeline.
I would like to thank BioX Neuroventures and the CNI for a scan time grant that we used to collect the data.
While the datasets used in this study are not yet publicly available, I will be glad to share some of them if you wish to use them to reproduce the results. Please contact me if you would like to do so.
Finally, the results presented here were created by me within the time-span of the course, and were not shared with any other courses.
Appendix
Scripts
- sMRIUtil Code repository with some of my often used scripts. The code for this project will be uploaded in due time.
Software Used
- SPM 8 For the GLM analysis.
- FSL For ICA.
- Freesurfer For creating inflated and pial brains to render SPM's t-maps upon.
- mrTrix For plotting SPM t-maps overlaid on the anatomical data.
- Optseq For randomizing the stimulus presentation to ensure orthogonal regressors and minimize subject adaptation.
- VisionEgg For displaying text stimuli to subjects.
- Matlab For numerous custom processing scripts.
References
[1] S. Grillner. The motor infrastructure: From ion channels to neuronal networks. Nature Reviews Neuroscience, 4(7):573–586, July 2003
[2] M. Graziano. The organization of behavioral repertoire in motor cortex. Annual Review of Neuroscience, 29:105–34, January 2006
[3] Jeffrey D. Meier, Tyson N. Aflalo, Sabine Kastner, and Michael S. A. Graziano. Complex organization of human primary motor cortex: A high-resolution fmri study. J Neurophysiol, 100(4):1800–1812, 2008
[4] K. J. Friston, O. Josephs, E. Zarahn, A. P. Holmes, S. Rouquette, and J. B. Poline. To Smooth or Not to Smooth?. NeuroImage 12:196 –208, 2000
[5] C.F. Beckmann and S.M. Smith. Probabilistic Independent Component Analysis for Functional Magnetic Resonance Imaging. IEEE Transactions on Medical Imaging 23(2):137-152 2004
[6] T.R. Oakes, T. Johnstone, K.S. Ores Walsh, L.L. Greischar, A.L. Alexander, A.S. Fox, and R.J. Davidson. Comparison of fMRI motion correction software tools. NeuroImage 28:529–543, 2005