Separating Signal From Noise With ICA¶
Written by Luke Chang
In this tutorial we will use ICA to explore which signals in our imaging data might be real signal or artifacts. We cover ICA in more depth in the multivariate decomposition subsection of the connectivity tutorial and also using simulations in the ICA tutorial.
To run this tutorial, we will be working with data that has already been preprocessed. If you are in Psych60, this has already been done for you. If you reading this online, then I recommend reading the preprocessing tutorial, or downloading the data using the Download Data tutorial.
For a brief overview of types of artifacts that might be present in your data, I recommend watching this video by Tor Wager and Martin Lindquist.
Loading Data¶
Ok, let's load a subject and run an ICA to explore signals that are present. Since we have completed preprocessing, our data should be realigned and also normalized to MNI stereotactic space. We will use the nltools package to work with this data in python.
import numpy as np
from numpy.fft import fft, fftfreq
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from nltools.data import Brain_Data
from nilearn.plotting import view_img
from dartbrains_tools.data import get_file, get_tr
/home/runner/work/dartbrains/dartbrains/.venv/lib/python3.13/site-packages/nltools/data/brain_data.py:253: UserWarning: [NiftiMasker.fit] Generation of a mask has been requested (imgs != None) while a mask was given at masker creation. Given mask will be used. self.data = self.nifti_masker.fit_transform(data)
More Preprocessing¶
Even though, we have technically already run most of the preprocessing there are a couple of more steps that will help make the ICA cleaner.
First, we will run a high pass filter to remove any low frequency scanner drift. We will pick a fairly arbitrary filter size of 0.0078hz (1/128s). We will also run spatial smoothing with a 6mm FWHM gaussian kernel to increase a signal to noise ratio at each voxel. These steps are very easy to run using nltools after the data has been loaded.
Independent Component Analysis (ICA)¶
Ok, we are finally ready to run an ICA analysis on our data.
ICA attempts to perform blind source separation by decomposing a multivariate signal into additive subcomponents that are maximally independent.
We will be using the decompose() method on our Brain_Data instance. This runs the FastICA algorithm implemented by scikit-learn. You can choose whether you want to run spatial ICA by setting axis='voxels or temporal ICA by setting axis='images'. We also recommend running the whitening flat whiten=True. By default decompose will estimate the maximum components that are possible given the data. We recommend using a completely arbitrary heuristic of 20-30 components.
with mo.persistent_cache("ica_decompose"):
# 10 components keeps the WASM-mode page snappy: the slider
# range stays small and ICA convergence in Pyodide is faster.
# Cranking back to 20-30 is one number-edit away if the reader
# wants more granularity.
output = data_1.decompose(algorithm='ica', n_components=10, axis='images', whiten='unit-variance')
Viewing Components¶
We will use an interactive component viewer to explore the results of the analysis. Use the Component slider to select which component to view; the plot updates automatically as you move it.
Components have been standardized so we can threshold the brain in terms of standard deviations. We display voxels that load greater or less than 2 standard deviations on the component (a tighter or looser cutoff than 2.0 σ requires running this notebook in marimo edit).
The second plot is the time course of the voxels that load on the component. The x-axis is in TRs, which for this dataset is 2.4 sec.
The third plot is the powerspectrum of the timecourse. There is not a large range of possible values as we can only observe signals at the nyquist frequency, which is half of our sampling frequency of 1/2.4s (approximately 0.21hz) to a lower bound of 0.0078hz based on our high pass filter. There might be systematic oscillatory signals. Remember, that signals that oscillate a faster frequency than the nyquist frequency will be aliased. This includes physiological artifacts such as respiration and cardiac signals.
It is important to note that ICA cannot resolve the sign of the component. So make sure you consider signals that are positive as well as negative.
Exercises¶
For this tutorial, try to guess which components are signal and which are noise. Also, be sure to label the type of noise you think you might be seeing (e.g., head motion, scanner spikes, cardiac, respiration, etc.) Do this for subjects s01 and s02.
What features do you think are important to consider when making this judgment? Does the spatial map provide any useful information? What about the timecourse of the component? Does it map on to the plausible timecourse of the task.What about the power spectrum?