Written by Luke Chang

Many of the imaging tutorials will use open data from the Pinel Localizer task.

The Pinel Localizer task was designed to probe several different types of basic cognitive processes, such as visual perception, finger tapping, language, and math. Several of the tasks are cued by reading text on the screen (i.e., visual modality) and also by hearing auditory instructions (i.e., auditory modality). The trials are randomized across conditions and have been optimized to maximize efficiency for a rapid event-related design. There are 100 trials in total over a 5-minute scanning session. See the original paper for more specific details about the task, and the dataset paper for details about the data.

This dataset is well suited for these tutorials as it is (a) publicly available to anyone in the world, (b) relatively small (only about 5min), and (c) provides many options to create different types of contrasts.

There are a total of 94 subjects available, but we will primarily work with a smaller subset of about 30.

We will use the osfclient package to download the entire dataset. Note that the entire dataset is fairly large (~5.25 GB), so make sure you have space on your computer. At some point, we will make a smaller version available for download for the dartbrains course.

Let's first make sure the osfclient package is installed in our Python environment.

!pip install osfclient

Requirement already satisfied: osfclient in /Users/lukechang/anaconda3/lib/python3.7/site-packages (0.0.3)
Requirement already satisfied: six in /Users/lukechang/anaconda3/lib/python3.7/site-packages (from osfclient) (1.14.0)
Requirement already satisfied: tqdm in /Users/lukechang/anaconda3/lib/python3.7/site-packages (from osfclient) (4.42.1)
Requirement already satisfied: requests in /Users/lukechang/anaconda3/lib/python3.7/site-packages (from osfclient) (2.23.0)
Requirement already satisfied: idna<3,>=2.5 in /Users/lukechang/anaconda3/lib/python3.7/site-packages (from requests->osfclient) (2.8)
Requirement already satisfied: chardet<4,>=3.0.2 in /Users/lukechang/anaconda3/lib/python3.7/site-packages (from requests->osfclient) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /Users/lukechang/anaconda3/lib/python3.7/site-packages (from requests->osfclient) (2019.11.28)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /Users/lukechang/anaconda3/lib/python3.7/site-packages (from requests->osfclient) (1.25.8)


osfclient provides a command-line interface, written in Python, that can help us easily download (and also upload) datasets shared on the Open Science Framework (OSF).

All we need to do is specify the OSF project id and the directory where we would like the data downloaded.

project_id = 'vhtf6'
output_directory = '/Users/lukechang/Dropbox/Dartbrains/Data'

!osf -p {project_id} clone {output_directory}


The dataset has been converted to a standard format known as the Brain Imaging Data Structure, or BIDS for short. BIDS is a specification for organizing imaging datasets in a consistent way across different laboratories. It defines a structured directory layout and file-naming scheme that make it easy to find the information needed to analyze a dataset.
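As a rough illustration, here is a minimal sketch that builds a toy, single-subject BIDS skeleton using the standard naming pattern. The directory and task names below are hypothetical and only illustrate the convention, not the actual contents of the Pinel dataset.

```python
import json
import tempfile
from pathlib import Path

# Build a toy BIDS skeleton following the
# sub-<label>/func/sub-<label>_task-<label>_bold.nii.gz pattern.
root = Path(tempfile.mkdtemp())

# Every BIDS dataset has a dataset_description.json at its root.
(root / 'dataset_description.json').write_text(
    json.dumps({'Name': 'Toy Dataset', 'BIDSVersion': '1.0.2'})
)

# One subject, one functional scan (empty placeholder file).
func = root / 'sub-S01' / 'func'
func.mkdir(parents=True)
(func / 'sub-S01_task-localizer_bold.nii.gz').touch()

# Print the resulting layout.
for p in sorted(root.rglob('*')):
    print(p.relative_to(root).as_posix())
```

Tools like pybids rely on exactly this predictable structure to query a dataset without any custom bookkeeping.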

# Test Data

This example assumes that you are using the docker container associated with the course.

We will use pybids to explore the dataset. It should already be included in the dartbrains docker container. Otherwise, you can install it from PyPI with !pip install pybids.

from bids import BIDSLayout

data_dir = '/home/jovyan/Data'

layout = BIDSLayout(data_dir, derivatives=False)
layout

BIDS Layout: .../home/jovyan/Data | Subjects: 94 | Sessions: 0 | Runs: 0

This shows us that there are 94 subjects, each with a single functional run (the filenames do not include a run entity, so the layout reports 0 runs).

We can query the BIDSLayout object to get all of the file names for each participant's functional data. Let's just return the first 10.

file_list = layout.get(target='subject', suffix='bold', return_type='file', extension='nii.gz')
file_list[:10]

['/home/jovyan/Data/sub-S01/func/sub-S01_task-localizer_bold.nii.gz',
'/home/jovyan/Data/sub-S10/func/sub-S10_task-localizer_bold.nii.gz']

Ok, now let's try to load one of the functional datasets using Brain_Data from the nltools package.

from nltools.data import Brain_Data

data = Brain_Data(file_list[0])

/opt/miniconda-latest/lib/python3.7/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.linear_model.base module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.linear_model. Anything that cannot be imported from sklearn.linear_model is now part of the private API.
warnings.warn(message, FutureWarning)

-----------------------------------------------------------------------
DimensionError                        Traceback (most recent call last)
<ipython-input-11-02e4f9a25936> in <module>
1 from nltools.data import Brain_Data
2
----> 3 data = Brain_Data(file_list[0])

/opt/miniconda-latest/lib/python3.7/site-packages/nltools/data/brain_data.py in __init__(self, data, Y, X, mask, output_file, **kwargs)
132                 else:
135             elif isinstance(data, list):
136                 if isinstance(data[0], Brain_Data):

/opt/miniconda-latest/lib/python3.7/site-packages/nilearn/input_data/base_masker.py in fit_transform(self, X, y, confounds, **fit_params)
205                                 ).transform(X, confounds=confounds)
206             else:
--> 207                 return self.fit(**fit_params).transform(X, confounds=confounds)
208         else:
209             # fit method of arity 2 (supervised transformation)

175         self._check_fitted()
176
--> 177         return self.transform_single_imgs(imgs, confounds)
178
179     def fit_transform(self, X, y=None, confounds=None, **fit_params):

/opt/miniconda-latest/lib/python3.7/site-packages/nilearn/input_data/nifti_masker.py in transform_single_imgs(self, imgs, confounds, copy)
403             confounds=confounds,
404             copy=copy,
--> 405             dtype=self.dtype
406         )
407

/opt/miniconda-latest/lib/python3.7/site-packages/joblib/memory.py in __call__(self, *args, **kwargs)
360
361     def __call__(self, *args, **kwargs):
--> 362         return self.func(*args, **kwargs)
363
364     def call_and_shelve(self, *args, **kwargs):

39                     copy=True,
40                     dtype=None):
---> 41     imgs = _utils.check_niimg(imgs, atleast_4d=True, ensure_ndim=4)
42
43     # Check whether resampling is truly necessary. If so, crop mask

/opt/miniconda-latest/lib/python3.7/site-packages/nilearn/_utils/niimg_conversions.py in check_niimg(niimg, ensure_ndim, atleast_4d, dtype, return_iterator, wildcards)
274
275     if ensure_ndim is not None and len(niimg.shape) != ensure_ndim:
--> 276         raise DimensionError(len(niimg.shape), ensure_ndim)
277
278     if return_iterator:

DimensionError: Input data has incompatible dimensionality: Expected dimension is 4D and you provided a 5D image. See http://nilearn.github.io/manipulating_images/input_output.html.

Uh oh... This simple command isn't working. Here is your first lesson: real data are almost always a little messy and require some debugging.

Let's try to figure out what is going on.

First, let's look at the error and try to see what went wrong.

DimensionError: Input data has incompatible dimensionality: Expected dimension is 4D and you provided a 5D image. See http://nilearn.github.io/manipulating_images/input_output.html.

It looks like the data are being read in as a 5-dimensional image rather than a 4-dimensional image, and Brain_Data can't read this type of data. Perhaps this nifti file was created using an older version of SPM.

Let's test our hypothesis and use nibabel to load the data and inspect the shape of the data file.

import nibabel as nib

dat = nib.load(file_list[0])
dat.shape

(64, 64, 40, 1, 128)

Ok, the first three dimensions correctly describe the spatial dimensions of the data, and the fifth dimension reflects the number of volumes (time points) in the dataset.

Notice that there is an extra dimension of 1 that we need to remove. We can do that with the numpy squeeze function.
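To see what squeeze does, here is a quick sketch on a synthetic array with the same shape that nibabel reported for our image:

```python
import numpy as np

# A toy array matching the reported shape: (x, y, z, 1, time).
arr = np.zeros((64, 64, 40, 1, 128))

# squeeze drops all singleton (size-1) dimensions.
print(arr.squeeze().shape)  # (64, 64, 40, 128)
```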

dat.get_data().squeeze().shape

/opt/miniconda-latest/lib/python3.7/site-packages/ipykernel_launcher.py:1: DeprecationWarning: get_data() is deprecated in favor of get_fdata(), which has a more predictable return type. To obtain get_data() behavior going forward, use numpy.asanyarray(img.dataobj).

* deprecated from version: 3.0
* Will raise <class 'nibabel.deprecator.ExpiredDeprecationError'> as of version: 5.0
"""Entry point for launching an IPython kernel.

(64, 64, 40, 128)

The squeeze method gets rid of that extra dimension. Now we need to create a new nifti image with the corrected data and write it back out to a file that we can use later.

We will initialize a new nibabel nifti instance and write it out to file.

dat_fixed = nib.Nifti1Image(dat.get_data().squeeze(), dat.affine)
nib.save(dat_fixed, file_list[0])

/opt/miniconda-latest/lib/python3.7/site-packages/ipykernel_launcher.py:1: DeprecationWarning: get_data() is deprecated in favor of get_fdata(), which has a more predictable return type. To obtain get_data() behavior going forward, use numpy.asanyarray(img.dataobj).

* deprecated from version: 3.0
* Will raise <class 'nibabel.deprecator.ExpiredDeprecationError'> as of version: 5.0
"""Entry point for launching an IPython kernel.


Let's double check that this worked correctly.

nib.load(file_list[0]).shape

(64, 64, 40, 128)

Ok! Looks like it worked!

Now let's fix the rest of the files so we can work with the data in all of the tutorials.

file_list = layout.get(target='subject', suffix='bold', return_type='file', extension='nii.gz')

for f in file_list:
    dat = nib.load(f)
    if len(dat.shape) > 4:
        dat_fixed = nib.Nifti1Image(dat.get_data().squeeze(), dat.affine)
        nib.save(dat_fixed, f)