Multivariate Pattern Analysis in Python

Part 7: “When Is The Signal” – Event-related Data Analysis¶

Note

This tutorial part is also available for download as an IPython notebook: [ipynb]

In all previous tutorial parts we have analyzed the same fMRI data. We analyzed it using a number of different strategies, but they all had one thing in common: a sample in each dataset was always a single fMRI volume. Sometimes we limited ourselves to a specific region of interest; sometimes we averaged many fMRI volumes into a single one. In all cases, however, a feature always corresponded to a voxel in the fMRI volume and appeared only once in the dataset.

In this part we are going to extend the analysis beyond the spatial dimensions and will consider time as another aspect of our data. This is a common thing to do, for example, in ERP-analyses of EEG data. Here we are going to employ a similar approach in our well-known example data – this time selecting a subset of ventral temporal regions.

>>> from mvpa2.tutorial_suite import *
>>> ds = get_raw_haxby2001_data(roi=(36,38,39,40))
>>> print ds.shape
(1452, 941)

As we know, this dataset consists of 12 concatenated experiment sessions. Every session had a stimulation block spanning multiple fMRI volumes for each of the eight stimulus categories. Stimulation blocks were separated by rest periods. What we want to do now is to look at the spatio-temporal signal across our region of interest and the full duration of the stimulation blocks. In other words, we want to perform a sensitivity analysis revealing the spatio-temporal distribution of classification-relevant information.

In this kind of analysis, we consider each stimulation block an event and we need to create a representative sample for every one of them. In the context of an event-related fMRI classification analysis the literature offers three principal techniques:

  1. Choose a single representative volume.
  2. Compress all relevant volumes into a single one (averaging or parametric fit).
  3. Consider the full event-related time series.

Obviously, only the third approach can possibly provide us with a temporal sensitivity profile, hence we will choose this path.

Event-related Pre-processing Is Not Event-related¶

For an event-related analysis most of the processing is done on data samples that are somehow derived from a set of events. The rest of the data could be considered irrelevant. However, some preprocessing is only meaningful when performed on the full time series and not on the segmented event samples. An example is the detrending that typically needs to be done on the original, continuous time series.

In its current shape our dataset consists of 1452 samples that represent contiguous fMRI volumes. At this stage we can easily perform linear detrending. Again, we are going to do it for each experiment run separately (the dataset has runs encoded in the chunks sample attribute), since we do not assume a single contiguous linear trend throughout the whole time series.

>>> poly_detrend(ds, polyord=1, chunks_attr='chunks')
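
Conceptually, per-chunk linear detrending fits and subtracts a least-squares line from each feature within each run. The following is a minimal numpy sketch of that idea (a hypothetical helper, not PyMVPA's actual poly_detrend() implementation):

```python
import numpy as np

def detrend_linear_per_chunk(samples, chunks):
    """Remove a least-squares linear trend from each chunk independently.

    samples: (n_samples, n_features) array; chunks: (n_samples,) labels.
    A conceptual sketch of order-1 polynomial detrending.
    """
    out = samples.astype(float).copy()
    for c in np.unique(chunks):
        mask = chunks == c
        t = np.arange(mask.sum())
        # design matrix: intercept + linear term
        X = np.column_stack([np.ones_like(t), t])
        beta = np.linalg.lstsq(X, out[mask], rcond=None)[0]
        out[mask] -= X.dot(beta)
    return out
```

A purely linear signal should be reduced to (numerically) zero by this operation within each chunk.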

Let’s make a copy of the de-trended dataset that we can later on use for some visualization.

>>> orig_ds = ds.copy()

We still need to normalize each feature (i.e. each voxel at this point). In this case we are, again, going to Z-score them, using the mean and standard deviation from the experiment's rest condition. The resulting values might be interpreted as "activation scores". We are, again, doing this for each run separately.

>>> zscore(ds, chunks_attr='chunks', param_est=('targets', 'rest'))
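
The idea behind param_est=('targets', 'rest') is that the normalization parameters (mean and standard deviation) are estimated only from the rest samples of each run, but applied to all samples. A minimal numpy sketch of that logic (a hypothetical helper, not PyMVPA's zscore() code):

```python
import numpy as np

def zscore_by_rest(samples, chunks, targets, rest_label='rest'):
    """Z-score each feature per chunk, estimating mean/std only from
    samples of the rest condition."""
    out = samples.astype(float).copy()
    for c in np.unique(chunks):
        cmask = chunks == c
        rest = cmask & (targets == rest_label)
        mu = out[rest].mean(axis=0)
        sd = out[rest].std(axis=0)
        sd[sd == 0] = 1.0  # guard against zero variance
        # apply rest-derived parameters to ALL samples of this chunk
        out[cmask] = (out[cmask] - mu) / sd
    return out
```

After this transformation the rest samples of each run have zero mean and unit standard deviation per feature, while task samples express deviations from rest.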

From Timeseries To Events¶

After detrending and normalization, we can now segment the time series into a set of events. To achieve this we have to compile a list of event definitions first. An event is defined by onset, duration and potentially a number of additional properties, such as stimulus condition or recording session number.

In this example we will simply convert the block-design setup defined by the samples attributes into a list of events. With find_events(), PyMVPA provides a function to convert sequential attributes into event lists. In our dataset, we have the stimulus conditions of each volume sample available as targets sample attribute.

>>> events = find_events(targets=ds.sa.targets, chunks=ds.sa.chunks)
>>> print len(events)
204
>>> for e in events[:4]:
...    print e
{'chunks': 0.0, 'duration': 6, 'onset': 0, 'targets': 'rest'}
{'chunks': 0.0, 'duration': 9, 'onset': 6, 'targets': 'scissors'}
{'chunks': 0.0, 'duration': 6, 'onset': 15, 'targets': 'rest'}
{'chunks': 0.0, 'duration': 9, 'onset': 21, 'targets': 'face'}

We are not only feeding the targets to the function but also the chunks attribute, since we do not want to have events spanning multiple recording sessions. All that find_events() does is sequentially parse all provided attributes and record an event whenever the value in any of the attributes changes. The generated event definition is a dictionary that contains:

  1. Onset of the event as index in the sequence (in this example this is a volume id)
  2. Duration of the event in “number of sequence elements” (i.e. number of volumes). The duration is determined by counting the number of identical attribute combinations following an event onset.
  3. Attribute combination of this event, i.e. the actual values of all given attributes at the particular position.
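
The parsing logic described above can be sketched in a few lines of plain Python (a hypothetical helper illustrating the principle, not find_events() itself):

```python
def find_block_events(targets, chunks):
    """Record an event whenever the (targets, chunks) combination
    changes; extend the current event's duration otherwise."""
    events = []
    prev = None
    for i, combo in enumerate(zip(targets, chunks)):
        if combo != prev:
            # attribute combination changed -> new event starts here
            events.append({'onset': i, 'duration': 1,
                           'targets': combo[0], 'chunks': combo[1]})
        else:
            # same combination -> current event continues
            events[-1]['duration'] += 1
        prev = combo
    return events
```

Running this on a short toy sequence such as ['rest', 'rest', 'face', 'face', 'face', 'rest'] yields three events with onsets 0, 2 and 5.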

Let’s limit ourselves to face and house stimulation blocks for now. We can easily filter out all other events.

>>> events = [ev for ev in events if ev['targets'] in ['house', 'face']]
>>> print len(events)
24
>>> for e in events[:4]:
...    print e
{'chunks': 0.0, 'duration': 9, 'onset': 21, 'targets': 'face'}
{'chunks': 0.0, 'duration': 9, 'onset': 63, 'targets': 'house'}
{'chunks': 1.0, 'duration': 9, 'onset': 127, 'targets': 'face'}
{'chunks': 1.0, 'duration': 9, 'onset': 213, 'targets': 'house'}
>>> np.unique([e['duration'] for e in events])
array([9])

All of our events are of the same length, 9 consecutive fMRI volumes. Later on we would like to view the temporal sensitivity profile from before until after the stimulation block, hence we should extend the duration of the events a bit.

>>> event_duration = 13
>>> for ev in events:
...     ev['onset'] -= 2
...     ev['duration'] = event_duration

The next and most important step is to actually segment the original time series dataset into event-related samples. PyMVPA offers eventrelated_dataset() as a function to perform this conversion. Let's just do it; it only needs the original dataset and our list of events.

>>> # alt: `evds = load_tutorial_results('ds_haxby2001_blkev_facehouse')`
>>> evds = eventrelated_dataset(ds, events=events)
>>> len(evds) == len(events)
True
>>> evds.nfeatures == ds.nfeatures * event_duration
True

Exercise

Inspect the evds dataset. It has a fairly large number of attributes – both for samples and for features. Look at each of them and think about what it could be useful for.

At this point it is worth looking at the dataset's mapper – in particular at the last two items in the chain mapper that have been added during the conversion into events.

>>> print evds.a.mapper[-2:]
<Chain: <Boxcar: bl=13>-<Flatten>>
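
The Boxcar mapper cuts a window of bl consecutive volumes out of the time series for each event, and the Flatten mapper collapses each window into a single feature vector. A minimal numpy sketch of this two-step conversion (not PyMVPA's implementation):

```python
import numpy as np

def boxcar_flatten(samples, onsets, boxlength):
    """For each onset, cut `boxlength` consecutive samples (Boxcar)
    and flatten them into one row per event (Flatten)."""
    windows = np.array([samples[o:o + boxlength] for o in onsets])
    # (n_events, boxlength, n_features) -> (n_events, boxlength * n_features)
    return windows.reshape(len(onsets), -1)
```

This makes the identity evds.nfeatures == ds.nfeatures * event_duration from above obvious: every voxel contributes one feature per time-point in the window.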

Exercise

Reverse-map a single sample through the last two items in the chain mapper. Inspect the result and make sure it doesn't surprise you. Now reverse-map multiple samples at once and compare the results. Is this what you would expect?

The rest of our analysis is business as usual and is quickly done. We want to perform a cross-validation analysis of an SVM classifier. We are not primarily interested in its performance, but in the weights it assigns to the features. Remember, each feature is now a voxel-time-point, so we get a chance to look at the spatio-temporal profile of classification-relevant information in the data. We will nevertheless enable computing of a confusion matrix, so we can assure ourselves that the classifier is performing reasonably well: only a generalizing model is worth inspecting, since an overfitting model may assign meaningless weights.

>>> sclf = SplitClassifier(LinearCSVMC(),
...                        enable_ca=['stats'])
>>> sensana = sclf.get_sensitivity_analyzer()
>>> sens = sensana(evds)

Exercise

Check that the classifier achieves an acceptable accuracy. Is it enough above chance level to allow for an interpretation of the sensitivities?

Exercise

Using what you have learned in the last tutorial part: Combine the sensitivity maps for all splits into a single map. Project this map into the original dataspace. What is the shape of that space? Store the projected map in a NIfTI file and inspect it using an MRI viewer. The viewer needs to be capable of visualizing time series (hint: in FSLView the time series image has to be opened first)!

A Plotting Example¶

We have inspected the spatio-temporal profile of the sensitivities using some MRI viewer application, but we can also assemble an informative figure right here. Let’s compose a figure that shows the original peri-stimulus time series, the effect of normalization, as well as the corresponding sensitivity profile of the trained SVM classifier. We are going to do that for two example voxels, whose coordinates we might have derived from inspecting the full map.

>>> example_voxels = [(28,25,25), (28,23,25)]

The plotting will be done by the popular matplotlib package.

First, we plot the original signal after initial detrending. To do this, we apply the same time series segmentation to the original detrended dataset and plot the mean signal for all face and house events for both of our example voxels. The code below will create the plot using matplotlib’s pylab interface (imported as pl). If you are familiar with Matlab’s plotting facilities, this shouldn’t be hard to read.

Note

_ = is used in the examples below simply to absorb output of plotting functions. You do not have to swallow output in your interactive sessions.

>>> # linestyles and colors for plotting
>>> vx_lty = ['-', '--']
>>> t_col = ['b', 'r']
>>> # whole figure will have three rows -- this is the first
>>> _ = pl.subplot(311)
>>> # for each of the example voxels
>>> for i, v in enumerate(example_voxels):
...     # get a slicing array matching just to current example voxel
...     slicer = np.array([tuple(idx) == v for idx in ds.fa.voxel_indices])
...     # perform the timeseries segmentation just for this voxel
...     evds_detrend = eventrelated_dataset(orig_ds[:, slicer], events=events)
...     # now plot the mean timeseries and standard error
...     for j, t in enumerate(evds.uniquetargets):
...         l = plot_err_line(evds_detrend[evds_detrend.sa.targets == t].samples,
...                           fmt=t_col[j], linestyle=vx_lty[i])
...         # label this plot for automatic legend generation
...         l[0][0].set_label('Voxel %i: %s' % (i, t))
>>> # y-axis caption
>>> _ = pl.ylabel('Detrended signal')
>>> # visualize zero-level
>>> _ = pl.axhline(linestyle='--', color='0.6')
>>> # put automatic legend
>>> _ = pl.legend()
>>> _ = pl.xlim((0,12))

In the next figure row we do exactly the same again, but this time for the normalized data.

>>> _ = pl.subplot(312)
>>> for i, v in enumerate(example_voxels):
...     slicer = np.array([tuple(idx) == v for idx in ds.fa.voxel_indices])
...     evds_norm = eventrelated_dataset(ds[:, slicer], events=events)
...     for j, t in enumerate(evds.uniquetargets):
...         l = plot_err_line(evds_norm[evds_norm.sa.targets == t].samples,
...                           fmt=t_col[j], linestyle=vx_lty[i])
...         l[0][0].set_label('Voxel %i: %s' % (i, t))
>>> _ = pl.ylabel('Normalized signal')
>>> _ = pl.axhline(linestyle='--', color='0.6')
>>> _ = pl.xlim((0,12))

Finally, we plot the associated SVM weight profile for each peri-stimulus time-point of both voxels. For easier selection we do a little trick and reverse-map the sensitivity profile through the last mapper in the dataset’s chain mapper (look at evds.a.mapper for the whole chain). This will reshape the sensitivities into cross-validation fold x volume x voxel features.

>>> _ = pl.subplot(313)
>>> # L1 normalization of sensitivity maps per split to make them
>>> # comparable
>>> normed = sens.get_mapped(FxMapper(axis='features', fx=l1_normed))
>>> smaps = evds.a.mapper[-1].reverse(normed)
>>> for i, v in enumerate(example_voxels):
...     slicer = np.array([tuple(idx) == v for idx in ds.fa.voxel_indices])
...     smap = smaps.samples[:,:,slicer].squeeze()
...     l = plot_err_line(smap, fmt='ko', linestyle=vx_lty[i], errtype='std')
>>> _ = pl.xlim((0,12))
>>> _ = pl.ylabel('Sensitivity')
>>> _ = pl.axhline(linestyle='--', color='0.6')
>>> _ = pl.xlabel('Peristimulus volumes')
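
The L1 normalization applied above makes the sensitivity maps of the different cross-validation splits comparable before averaging. A minimal sketch, assuming the normalization scales each map (one row per split) so its absolute values sum to one (hypothetical helper, not the l1_normed function itself):

```python
import numpy as np

def l1_normalize_rows(samples):
    """Scale each row so the sum of its absolute values equals 1,
    making per-split sensitivity maps comparable in magnitude."""
    norms = np.abs(samples).sum(axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero maps untouched
    return samples / norms
```

After this step, differences between splits reflect the relative distribution of weights rather than their overall scale.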

That was it. Perhaps the amount of code looks intimidating, but note that it could have been written more compactly; this form makes it easy to plot any other voxel coordinate combination as well. matplotlib can save this figure in SVG format for convenient post-processing in Inkscape, so a publication-quality figure is only minutes away.

[Figure: ex_eventrelated.png] Sensitivity profile for two example voxels for face vs. house classification on event-related fMRI data from ventral temporal cortex.

Exercise

What can we say about the properties of the example voxel’s signal from the peri-stimulus plot?

If That Was Easy...¶

This demo showed an event-related data analysis. Although we have performed it on fMRI data, an analogous analysis can be done for any time series based data in an almost identical fashion. Moreover, if a dataset has information about acquisition time (e.g. like the ones created by fmri_dataset()), eventrelated_dataset() can also handle event definitions given in real time, making it relatively easy to convert experiment design logfiles into event lists. In this case there would be no need to run a function like find_events(); the events could be specified directly and passed to eventrelated_dataset().

At the end of this tutorial part we want to take a little glimpse on the power of PyMVPA for “multi-space” analysis. From the previous tutorial part we know how to do searchlight analyses and it was promised that there is more to it than what we already saw. And here it is:

>>> cvte = CrossValidation(LinearCSVMC(), NFoldPartitioner(),
...                        postproc=mean_sample())
>>> sl = Searchlight(cvte,
...                  IndexQueryEngine(voxel_indices=Sphere(1),
...                                   event_offsetidx=Sphere(2)),
...                  postproc=mean_sample())
>>> res = sl(evds)

Have you been able to deduce what this analysis will do? Clearly, it is some sort of searchlight, but it doesn't use sphere_searchlight(). Instead, it utilizes Searchlight. Yes, you are correct: this is a spatio-temporal searchlight. The searchlight focus travels along all possible locations in our ventral temporal ROI, but at the same time also along the peri-stimulus time segment covered by the events. The spatial searchlight extent is the center voxel and its immediate neighbors, and the temporal dimension comprises two additional time-points in each direction. The result is again a dataset. Its shape is compatible with the mapper of evds, hence it can also be back-projected into the original 4D fMRI brain space.

Searchlight is a powerful class that allows for complex runtime ROI generation. In this case it uses an IndexQueryEngine to look at certain feature attributes in the dataset to compose sphere-shaped ROIs in two spaces at the same time. This approach is very flexible and can be extended with additional query engines to algorithms of almost arbitrary complexity.
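
Conceptually, such a combined query engine builds each ROI as the Cartesian product of a spatial sphere around the center voxel and a temporal window around the center time-point. A minimal sketch of this neighborhood enumeration (a hypothetical helper illustrating the principle, not IndexQueryEngine's code):

```python
from itertools import product

def combined_roi(center_vox, center_time, voxel_radius=1, time_radius=2):
    """Enumerate (voxel, timepoint) coordinates within a spatial sphere
    and a temporal window around a searchlight center."""
    x0, y0, z0 = center_vox
    # spatial sphere: all integer offsets within the given radius
    spatial = [(x0 + dx, y0 + dy, z0 + dz)
               for dx, dy, dz in product(range(-voxel_radius,
                                               voxel_radius + 1), repeat=3)
               if dx * dx + dy * dy + dz * dz <= voxel_radius ** 2]
    # temporal window: all time-points within the given radius
    temporal = [center_time + dt
                for dt in range(-time_radius, time_radius + 1)]
    # ROI is the Cartesian product of the two neighborhoods
    return [(v, t) for v, t in product(spatial, temporal)]
```

With voxel_radius=1 the spatial part contains the center voxel plus its six face neighbors (7 voxels), and with time_radius=2 the temporal part spans 5 time-points, so each searchlight ROI comprises 35 voxel-time-point features.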

>>> ts = res.a.mapper.reverse1(1 - res.samples[0])
>>> ni = nb.Nifti1Image(ts, ds.a.imghdr.get_best_affine(),
...                         ds.a.imghdr).to_filename('ersl.nii')

After all is done¶

After you are done and want to tidy up after yourself, you can easily remove unneeded generated files from within Python:

>>> os.unlink('ersl.nii')

© Copyright 2006-2012, PyMVPA Authors. Created using Sphinx 1.2.1.