Glossary of Python Functions

Written by Luke Chang

Throughout this course we will use a variety of functions from the base Python library as well as from many other libraries in the scientific computing stack. Here we provide a list of all of the functions that are used across the various notebooks. It can be a helpful reference while you are learning Python and want to see what kinds of things each package can do. Remember that in a Jupyter notebook you can always view the docstring for any function by adding a ? to the end of the function name.

Jupyter Cell Magic

Magics are specific to and provided by the IPython kernel. Whether Magics are available on a kernel is a decision that is made by the kernel developer on a per-kernel basis. To work properly, Magics must use a syntax element which is not valid in the underlying language. For example, the IPython kernel uses the % syntax element for Magics as % is not a valid unary operator in Python. However, % might have meaning in other languages.

%conda: Run the conda package manager within the current kernel.

%debug: Activate the interactive debugger. This magic command supports two ways of activating the debugger. The first is to activate the debugger before executing code, which lets you set a breakpoint and step through the code from that point. You use this mode by giving statements to execute and, optionally, a breakpoint. The second is post-mortem mode, which you activate by running %debug without any argument. If an exception has just occurred, this lets you inspect its stack frames interactively. Note that this only ever works on the last traceback that occurred, so you must call it soon after the exception you wish to inspect, because a later exception clobbers the previous one.

%matplotlib: Set up matplotlib to work interactively. Example: %matplotlib inline

This function lets you activate matplotlib interactive support at any point during an IPython session. It does not import anything into the interactive namespace.

%timeit: Time execution of a Python statement or expression using the timeit module. This function can be used both as a line magic and as a cell magic.
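Outside of a notebook, a similar measurement can be made directly with the standard-library timeit module that %timeit relies on. The following is a minimal sketch; the statement being timed and the repetition count are arbitrary choices for illustration.

```python
import timeit

# Run the statement 1,000 times; timeit.timeit returns the total
# elapsed time in seconds for all of those runs together.
n_runs = 1000
total_seconds = timeit.timeit("[i ** 2 for i in range(100)]", number=n_runs)

# Average time for a single run of the statement.
per_run = total_seconds / n_runs
print(f"{per_run * 1e6:.2f} microseconds per run")
```

Unlike this sketch, %timeit chooses the number of runs automatically and reports a mean and standard deviation.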

!: Run a shell command from within the notebook. The related !! form (shorthand for the %sx magic) runs the command and captures its output. Example: !pip.

Base Python Functions

These functions are all bundled with Python, either as built-ins or as part of the standard library.

any: Test if any of the elements are true.

bool: Cast as boolean type

dict: Cast as dictionary type

enumerate: Return an enumerate object. The iterable argument must be a sequence, an iterator, or some other object which supports iteration. The __next__() method of the iterator returned by enumerate() returns a tuple containing a count (from start, which defaults to 0) and the value obtained from iterating over the iterable.
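As a small illustration of the count-and-value tuples described above, here is a sketch using a made-up list of subject labels (the labels are hypothetical, not from the course data).

```python
subjects = ["sub-01", "sub-02", "sub-03"]  # hypothetical labels

# By default the count starts at 0.
pairs = list(enumerate(subjects))

# The start argument changes the first count; no items are skipped.
pairs_from_one = list(enumerate(subjects, start=1))

for count, name in pairs_from_one:
    print(f"{count}: {name}")
```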

float: Return a floating point number constructed from a number or string x.

import: Import a Python module into the current namespace. Note that import is a statement rather than a function.

int: Cast as integer type

len: Return the length (the number of items) of an object. The argument may be a sequence (such as a string, bytes, tuple, list, or range) or a collection (such as a dictionary, set, or frozen set).

glob.glob: The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix shell, although results are returned in arbitrary order. No tilde expansion is done, but *, ?, and character ranges expressed with [] will be correctly matched. This is done by using the os.scandir() and fnmatch.fnmatch() functions in concert, and not by actually invoking a subshell.
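To see the wildcard matching in action, the sketch below creates a few empty files in a temporary directory and matches them with a ? pattern. The file names are invented for the example, and the results are passed through sorted because glob makes no promise about ordering.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create three empty files (hypothetical names) to search through.
    for name in ["run1.csv", "run2.csv", "notes.txt"]:
        with open(os.path.join(tmp_dir, name), "w"):
            pass

    # "?" matches exactly one character, so this finds run1.csv and run2.csv.
    matches = sorted(glob.glob(os.path.join(tmp_dir, "run?.csv")))
    base_names = [os.path.basename(path) for path in matches]

print(base_names)
```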

list: Cast as list type

max: Return the largest item in an iterable or the largest of two or more arguments.

min: Return the smallest item in an iterable or the smallest of two or more arguments.

os.path.basename: Return the base name of pathname path. This is the second element of the pair returned by passing path to the function split(). Note that the result of this function is different from the Unix basename program; where basename for '/foo/bar/' returns 'bar', the basename() function returns an empty string ('').

os.path.join: Join one or more path components intelligently. The return value is the concatenation of path and any members of paths with exactly one directory separator (os.sep) following each non-empty part except the last, meaning that the result will only end in a separator if the last part is empty. If a component is an absolute path, all previous components are thrown away and joining continues from the absolute path component.
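The two edge cases mentioned for os.path.basename and os.path.join (a trailing separator and an absolute component) can be checked with a short sketch. The directory and file names are hypothetical.

```python
import os

# A trailing separator leaves an empty final component.
trailing = os.path.basename("/foo/bar/")  # ''
plain = os.path.basename("/foo/bar")      # 'bar'

# Components are joined with the platform's separator (os.sep).
joined = os.path.join("data", "sub-01", "bold.nii.gz")

# An absolute component discards everything that came before it.
restarted = os.path.join("data", "/scratch", "bold.nii.gz")

print(joined)
print(restarted)
```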

print: Print strings. Recommend using f-strings formatting. Example, print(f'Results: {variable}').

pwd: Return the current working directory. Note that this is an IPython magic (%pwd) rather than a built-in Python function.

sorted: Return a new sorted list from the items in iterable.

str: Cast as string type.

range: Rather than being a function, range is actually an immutable sequence type, as documented in Ranges and Sequence Types — list, tuple, range.

tuple: Cast as tuple type

type: Return the type of the object.

zip: Make an iterator that aggregates elements from each of the iterables.
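Several of the built-ins above are often used together. The sketch below pairs two hypothetical lists with zip, orders the pairs with sorted, and shows that a range behaves like a sequence without being a list.

```python
conditions = ["faces", "houses", "faces"]  # hypothetical labels
onsets = [12.0, 4.0, 20.0]                 # hypothetical onset times

# zip pairs items by position and stops at the shortest input.
events = list(zip(onsets, conditions))

# sorted returns a new list; tuples are compared by their first element first.
events_by_onset = sorted(events)

# range supports len() and indexing without building a list.
indices = range(0, 10, 2)

print(events_by_onset)
print(list(indices), len(indices), max(indices))
```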

Pandas

pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

import pandas as pd

pd.concat: Concatenate pandas objects along a particular axis with optional set logic along the other axes.

pd.DataFrame.isnull: Detect missing values.

pd.DataFrame.mean: Return the mean of the values for the requested axis.

pd.DataFrame.std: Return sample standard deviation over requested axis.

pd.DataFrame.plot: Plot data using matplotlib

pd.Series.map: Map values of a Series according to an input mapping or function.

pd.DataFrame.groupby: Group DataFrame or Series using a mapper or by a Series of columns.

pd.DataFrame.fillna: Fill NA/NaN values using the specified method.

pd.DataFrame.replace: Replace values given in to_replace with value.
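A few of these methods chain together naturally. The following sketch, built on two tiny made-up tables, concatenates them, counts missing values, fills the gap with the column mean, and then summarizes by group.

```python
import numpy as np
import pandas as pd

# Two small hypothetical tables with the same columns.
first = pd.DataFrame({"group": ["a", "b"], "score": [1.0, np.nan]})
second = pd.DataFrame({"group": ["a", "b"], "score": [3.0, 4.0]})

# Stack the rows; ignore_index renumbers the rows of the result.
combined = pd.concat([first, second], axis=0, ignore_index=True)

# Count the missing scores.
n_missing = int(combined["score"].isnull().sum())

# Fill the missing score with the mean of the observed scores.
filled = combined.fillna({"score": combined["score"].mean()})

# Mean score within each group.
group_means = filled.groupby("group")["score"].mean()
print(group_means)
```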

NumPy

NumPy is the fundamental package for scientific computing with Python. It contains among other things:

• a powerful N-dimensional array object
• tools for integrating C/C++ and Fortran code
• useful linear algebra, Fourier transform, and random number capabilities

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.

import numpy as np

np.arange: Return evenly spaced values within a given interval.

np.array: Create an array

np.convolve: Returns the discrete, linear convolution of two one-dimensional sequences.
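A short sketch of np.convolve with two small arrays shows how the mode argument affects the output length. The input values are arbitrary.

```python
import numpy as np

signal = np.array([1.0, 2.0, 3.0])
kernel = np.array([0.0, 1.0, 0.5])

# The default "full" mode returns len(signal) + len(kernel) - 1 samples.
full = np.convolve(signal, kernel)

# "same" mode keeps only the central part, matching the longer input's length.
same = np.convolve(signal, kernel, mode="same")

print(full)  # [0.  1.  2.5 4.  1.5]
print(same)  # [1.  2.5 4. ]
```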

np.cos: Trigonometric cosine element-wise.

np.diag: Extract a diagonal or construct a diagonal array.

np.diag_indices: Return the indices to access the main diagonal of an array.
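The two uses of np.diag (extracting versus constructing) and the index arrays from np.diag_indices can be compared on a small example matrix.

```python
import numpy as np

matrix = np.arange(9).reshape(3, 3)

# Given a 2D array, np.diag extracts the main diagonal.
diagonal = np.diag(matrix)

# Given a 1D array, np.diag builds a matrix with it on the diagonal.
rebuilt = np.diag(diagonal)

# diag_indices returns row and column indices of the main diagonal,
# which can be used to overwrite it (here with NaN).
rows, cols = np.diag_indices(3)
masked = matrix.astype(float)
masked[rows, cols] = np.nan

print(diagonal)
print(rebuilt)
print(masked)
```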

np.dot: Dot product of two arrays.

np.exp: Calculate the exponential of all elements in the input array.

np.fft.fft: Compute the one-dimensional discrete Fourier Transform.

np.fft.ifft: Compute the one-dimensional inverse discrete Fourier Transform.
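As a sketch of how the forward and inverse transforms relate, the example below builds a 5 Hz sine wave, locates its peak in the spectrum, and recovers the original signal. The sampling rate and frequency are arbitrary, and np.fft.fftfreq (not listed in this glossary) is used to label the frequency bins.

```python
import numpy as np

n_samples = 100
sampling_rate = 100.0  # Hz, chosen for illustration
t = np.arange(n_samples) / sampling_rate
signal = np.sin(2 * np.pi * 5 * t)  # 5 Hz sine wave

# Forward transform and the frequency (in Hz) of each output bin.
spectrum = np.fft.fft(signal)
freqs = np.fft.fftfreq(n_samples, d=1 / sampling_rate)

# Look for the largest magnitude among the non-negative frequencies.
half = n_samples // 2
peak_freq = freqs[np.argmax(np.abs(spectrum[:half]))]

# The inverse transform returns complex values; the real part is the signal.
recovered = np.real(np.fft.ifft(spectrum))

print(peak_freq)
```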

np.hstack: Stack arrays in sequence horizontally (column wise).

np.linalg.pinv: Compute the (Moore-Penrose) pseudo-inverse of a matrix.
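One common use of the pseudo-inverse is solving a least-squares regression. The sketch below fits a line to noiseless, made-up data, so the recovered coefficients should match the intercept and slope used to generate it.

```python
import numpy as np

x = np.arange(5, dtype=float)
y = 2.0 + 3.0 * x  # noiseless line with intercept 2 and slope 3

# Design matrix: a column of ones (intercept) next to the predictor.
X = np.vstack([np.ones(len(x)), x]).T

# The least-squares estimate is pinv(X) applied to y.
beta = np.dot(np.linalg.pinv(X), y)

print(beta)
```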

np.mean: Compute the arithmetic mean along the specified axis.

np.nan: IEEE 754 floating point representation of Not a Number (NaN).

np.ones: Return a new array of given shape and type, filled with ones.

np.pi: The constant pi (3.1415926535897932384626433...).

np.random.randint: Return random integers from low (inclusive) to high (exclusive).

np.random.randn: Return a sample (or samples) from the “standard normal” distribution.

np.real: Return the real part of the complex argument.

np.sin: Trigonometric sine, element-wise.

np.sqrt: Return the non-negative square-root of an array, element-wise.

np.squeeze: Remove single-dimensional entries from the shape of an array.

np.std: Compute the standard deviation along the specified axis.

np.vstack: Stack arrays in sequence vertically (row wise).

np.zeros: Return a new array of given shape and type, filled with zeros.

SciPy

SciPy is one of the core packages that make up the SciPy stack. It provides many user-friendly and efficient numerical routines, such as routines for numerical integration, interpolation, optimization, linear algebra, and statistics.

scipy.stats.binom: A binomial discrete random variable.

scipy.signal.butter: Butterworth digital and analog filter design.

scipy.signal.filtfilt: Apply a digital filter forward and backward to a signal.
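The butter and filtfilt entries are typically used as a pair: one designs the filter and the other applies it. This is a minimal sketch on a synthetic signal; the sampling rate, cutoff, filter order, and component frequencies are all arbitrary choices for illustration.

```python
import numpy as np
from scipy import signal

fs = 100.0  # sampling rate in Hz (arbitrary)
t = np.arange(0, 2, 1 / fs)
slow = np.sin(2 * np.pi * 1 * t)          # 1 Hz component to keep
fast = 0.5 * np.sin(2 * np.pi * 30 * t)   # 30 Hz component to remove
noisy = slow + fast

# 4th-order low-pass Butterworth filter with a 5 Hz cutoff.
# The cutoff is expressed as a fraction of the Nyquist frequency (fs / 2).
b, a = signal.butter(4, 5 / (fs / 2), btype="low")

# Filtering forward and then backward cancels the filter's phase shift.
filtered = signal.filtfilt(b, a, noisy)

# Mean squared difference from the slow component, before and after.
error_before = np.mean((noisy - slow) ** 2)
error_after = np.mean((filtered - slow) ** 2)
print(error_before, error_after)
```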

scipy.signal.freqz: Compute the frequency response of a digital filter.

scipy.signal.sosfreqz: Compute the frequency response of a digital filter in SOS format.

scipy.stats.ttest_1samp: Calculate the T-test for the mean of ONE group of scores.
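To make the one-sample t-test concrete, the sketch below runs it on a handful of invented scores and reproduces the t-statistic by hand as the sample mean divided by its standard error.

```python
import numpy as np
from scipy import stats

scores = np.array([2.1, 1.8, 2.5, 2.9, 1.6, 2.2])  # made-up data

# Test whether the mean of the scores differs from 0.
t_stat, p_value = stats.ttest_1samp(scores, popmean=0)

# The same statistic by hand: mean / (sample std / sqrt(n)).
standard_error = scores.std(ddof=1) / np.sqrt(len(scores))
manual_t = scores.mean() / standard_error

print(t_stat, p_value)
```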

Matplotlib

Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four graphical user interface toolkits.

import matplotlib.pyplot as plt

plt.bar: Make a bar plot.

plt.figure: Create a new figure.

plt.hist: Plot a histogram.

plt.imshow: Display an image, i.e. data on a 2D regular raster.

plt.legend: Place a legend on the axes.

plt.savefig: Save the current figure.

plt.scatter: A scatter plot of y vs x with varying marker size and/or color.

plt.subplots: Create a figure and a set of subplots.

ax.axvline: Add a vertical line across the axes.

ax.set_xlabel: Set the label for the x-axis.

ax.set_xlim: Set the x-axis view limits.

ax.set_xticklabels: Set the x-tick labels with list of string labels.

ax.set_ylim: Set the y-axis view limits.

ax.set_yticklabels: Set the y-tick labels with list of string labels.

ax.set_ylabel: Set the label for the y-axis.

ax.set_title: Set a title for the axes.

Seaborn

Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

import seaborn as sns

sns.heatmap: Plot rectangular data as a color-encoded matrix.

sns.catplot: Figure-level interface for drawing categorical plots onto a FacetGrid.

sns.jointplot: Draw a plot of two variables with bivariate and univariate graphs.

sns.regplot: Plot data and a linear regression model fit.

scikit-learn

Scikit-learn is an open source machine learning library that supports supervised and unsupervised learning. It also provides various tools for model fitting, data preprocessing, model selection and evaluation, and many other utilities.

sklearn.metrics.pairwise_distances: This method takes either a vector array or a distance matrix, and returns a distance matrix. If the input is a vector array, the distances are computed. If the input is a distances matrix, it is returned instead.

sklearn.metrics.balanced_accuracy_score: Compute the balanced accuracy. The balanced accuracy is used in binary and multiclass classification problems to deal with imbalanced datasets. It is defined as the average of recall obtained on each class.
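The definition above (average of per-class recall) can be worked through by hand. The sketch below does the arithmetic with NumPy on invented labels rather than calling scikit-learn, so it illustrates the definition and is not the library's implementation.

```python
import numpy as np

# Hypothetical imbalanced labels: four of class 0, two of class 1.
y_true = np.array([0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 1, 1, 0])

# Recall for each class: fraction of that class's items predicted correctly.
recalls = []
for label in np.unique(y_true):
    in_class = y_true == label
    recalls.append(np.mean(y_pred[in_class] == label))

# Balanced accuracy averages the per-class recalls (0.75 and 0.5 here).
balanced_accuracy = np.mean(recalls)

# Plain accuracy weights every item equally instead.
plain_accuracy = np.mean(y_pred == y_true)

print(balanced_accuracy, plain_accuracy)
```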

networkx

NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.

import networkx as nx

nx.degree: Return the degree of a node or nodes. The node degree is the number of edges adjacent to that node.

NiBabel

NiBabel is a package that provides read/write access to some common neuroimaging file formats, including: ANALYZE (plain, SPM99, SPM2 and later), GIFTI, NIfTI1, NIfTI2, CIFTI-2, MINC1, MINC2, AFNI BRIK/HEAD, MGH and ECAT, as well as Philips PAR/REC. It can also read and write FreeSurfer geometry, annotation and morphometry files. There is some very limited support for DICOM. NiBabel is the successor of PyNIfTI.

import nibabel as nib

nib.save: Save an image to file adapting format to filename

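data.get_data: Return image data from image with any necessary scaling applied. Recent NiBabel releases deprecate this method in favor of get_fdata.

data.get_shape: Return shape for image. Recent NiBabel releases deprecate this method in favor of the shape attribute.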

data.header: The header of an image contains the image metadata. The information in the header will differ between different image formats. For example, the header information for a NIfTI1 format file differs from the header information for a MINC format file.

data.affine: Homogeneous affine giving the relationship between voxel coordinates and world coordinates. The affine can also be None. In this case, obj.affine also returns None, and the affine as written to disk will depend on the file format.

NiLearn

nilearn is a Python module for fast and easy statistical learning on NeuroImaging data.

It leverages the scikit-learn Python toolbox for multivariate statistics with applications such as predictive modelling, classification, decoding, or connectivity analysis.

nilearn.plotting.plot_anat: Plot cuts of an anatomical image (by default 3 cuts: Frontal, Axial, and Lateral)

nilearn.plotting.view_img: Interactive HTML viewer of a statistical map, with optional background.

nilearn.plotting.plot_glass_brain: Plot 2d projections of an ROI/mask image (by default 3 projections: Frontal, Axial, and Lateral). The brain glass schematics are added on top of the image.

nilearn.plotting.plot_stat_map: Plot cuts of an ROI/mask image (by default 3 cuts: Frontal, Axial, and Lateral)

nltools

NLTools is a Python package for analyzing neuroimaging data. It is the analysis engine powering neuro-learn. There are tools to perform data manipulation and analyses such as univariate GLMs, predictive multivariate modeling, and representational similarity analyses.

Data Classes

Adjacency

Adjacency is a class to represent adjacency matrices as a vector rather than a 2-dimensional matrix. This makes it easier to perform data manipulation and analyses. This tool is particularly useful for performing Representational Similarity Analyses.

Adjacency.distance_to_similarity: Convert distance matrix to similarity matrix

Adjacency.to_graph: Convert Adjacency into a networkx graph. Only works on single_matrix for now.

Brain_Data

Brain_Data is a class to represent neuroimaging data in Python as a vector rather than a 3-dimensional matrix. This makes it easier to perform data manipulation and analyses. This is the main tool for working with neuroimaging data.

Brain_Data.append: Append data to Brain_Data instance

Brain_Data.copy: Create a copy of a Brain_Data instance.

Brain_Data.decompose: Decompose Brain_Data object

Brain_Data.distance: Calculate distance between images within a Brain_Data() instance.

Brain_Data.find_spikes: Function to identify spikes from Time Series Data

Brain_Data.iplot: Create an interactive brain viewer for the current brain data instance.

Brain_Data.mean: Get mean of each voxel across images.

Brain_Data.plot: Create a quick plot of self.data. Will plot each image separately

Brain_Data.predict: Run prediction

Brain_Data.regress: Run a mass-univariate regression across voxels. Three types of regressions can be run: 1) standard OLS (default); 2) robust OLS (heteroscedasticity and/or auto-correlation robust errors), i.e. OLS with “sandwich estimators”; 3) ARMA (auto-regressive and moving-average lags = 1 by default; experimental).

Brain_Data.shape: Get images by voxels shape.

Brain_Data.similarity: Calculate similarity of Brain_Data() instance with single Brain_Data or Nibabel image

Brain_Data.smooth: Apply spatial smoothing using nilearn smooth_img()

Brain_Data.std: Get standard deviation of each voxel across images.

Brain_Data.threshold: Threshold Brain_Data instance.

Brain_Data.to_nifti: Convert Brain_Data Instance into Nifti Object

Brain_Data.ttest: Calculate one sample t-test across each voxel (two-sided)

Brain_Data.write: Write out Brain_Data object to Nifti or HDF5 File.

Design_Matrix

Design_Matrix is a class to represent design matrices with special methods for data processing (e.g. convolution, upsampling, downsampling) and for intelligent, flexible appending (e.g. automatically keeping certain columns or polynomial terms separated during concatenation). It plays nicely with Brain_Data and can be used to build an experimental design to pass to Brain_Data’s X attribute. It is essentially an enhanced pandas DataFrame with extra attributes and methods. Methods always return a new Design_Matrix instance (a copy). Column names are always string types. It inherits most methods of pandas DataFrames.

Design_Matrix.add_dct_basis: Adds unit scaled cosine basis functions to Design_Matrix columns, based on spm-style discrete cosine transform for use in high-pass filtering. Does not add intercept/constant. Care is recommended if using this along with .add_poly(), as some columns will be highly-correlated.

Design_Matrix.add_poly: Add nth order Legendre polynomial terms as columns to the design matrix. Good for adding a constant/intercept to the model (order = 0) and for accounting for slow-frequency nuisance artifacts such as linear or quadratic drifts. Care is recommended when using this with .add_dct_basis(), as some columns will be highly correlated.

Design_Matrix.clean: Method to fill NaNs in Design Matrix and remove duplicate columns based on data values, NOT names. Columns are dropped if they are correlated >= the requested threshold (default = .95). In this case, only the first instance of that column will be retained and all others will be dropped.

Design_Matrix.convolve: Perform convolution using an arbitrary function.

Design_Matrix.heatmap: Visualize Design Matrix spm style. Use .plot() for typical pandas plotting functionality. Can pass optional keyword args to seaborn heatmap.

Design_Matrix.head: This function returns the first n rows for the object based on position. It is useful for quickly testing if your object has the right type of data in it.

Design_Matrix.info: Print a concise summary of a DataFrame.

Design_Matrix.vif: Compute the variance inflation factor amongst columns of the design matrix, ignoring polynomial terms. Much faster than statsmodels and more reliable too. Uses the same method as Matlab and R (diagonal elements of the inverted correlation matrix).
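The method named in the vif entry (diagonal elements of the inverted correlation matrix) is short enough to sketch directly in NumPy. This is an illustration on simulated columns under that stated definition, not the nltools implementation, and it does not reproduce the handling of polynomial terms.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Three simulated regressors: c is nearly a copy of a, b is unrelated.
a = rng.standard_normal(n)
b = rng.standard_normal(n)
c = a + 0.1 * rng.standard_normal(n)
X = np.vstack([a, b, c]).T

# Correlation matrix between the columns of X.
corr = np.corrcoef(X, rowvar=False)

# Variance inflation factors: the diagonal of the inverse correlation matrix.
vif = np.diag(np.linalg.inv(corr))

print(vif)
```

In this setup the two nearly collinear columns get large values, while the unrelated column stays close to 1.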

Design_Matrix.zscore: Apply nltools.stats.zscore, but ensure that the returned object is a Design_Matrix.

Statistics Functions

stats.fdr: Determine FDR threshold given a p value array and desired false discovery rate q.
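One standard way to determine such a threshold is the Benjamini-Hochberg procedure. The sketch below assumes that procedure; the function name fdr_threshold and the p-values are hypothetical, and the return conventions of the nltools function may differ.

```python
import numpy as np


def fdr_threshold(p_values, q=0.05):
    """Benjamini-Hochberg threshold (hypothetical helper for illustration).

    Returns the largest sorted p-value p(k) with p(k) <= k * q / n,
    or None when no p-value meets the criterion.
    """
    sorted_p = np.sort(np.asarray(p_values, dtype=float))
    n = len(sorted_p)
    # Critical value for the k-th smallest p-value (k = 1..n).
    critical = np.arange(1, n + 1) * q / n
    passing = np.where(sorted_p <= critical)[0]
    if len(passing) == 0:
        return None
    return sorted_p[passing.max()]


p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.60]  # made-up p-values
threshold = fdr_threshold(p_values, q=0.05)
print(threshold)
```

With these six values only the two smallest p-values fall under their critical values, so the threshold is the second smallest.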

stats.find_spikes: Function to identify spikes from fMRI Time Series Data

stats.fisher_r_to_z: Use Fisher transformation to convert correlation to z score
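The Fisher transformation itself is the inverse hyperbolic tangent, z = 0.5 * ln((1 + r) / (1 - r)). A minimal NumPy sketch of the formula, independent of the nltools function, is:

```python
import numpy as np

r = np.array([0.0, 0.5, -0.5])  # example correlations

# Fisher r-to-z is the inverse hyperbolic tangent of r.
z = np.arctanh(r)

# The same value written out with logarithms.
manual = 0.5 * np.log((1 + r) / (1 - r))

# tanh converts z scores back to correlations.
back = np.tanh(z)

print(z)
```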

stats.one_sample_permutation: One sample permutation test using randomization.

stats.threshold: Threshold test image by p-value from p image

stats.regress: This is a flexible function to run several types of regression models provided X and Y numpy arrays. Y can be a 1d numpy array or 2d numpy array. In the latter case, results will be output with shape 1 x Y.shape[1], in other words fitting a separate regression model to each column of Y.

stats.zscore: zscore every column in a pandas dataframe or series.

Miscellaneous Functions

SimulateGrid: A class to simulate signal and noise within a 2D grid.

external.hrf.glover_hrf: Implementation of the Glover hemodynamic response function (HRF) model.