trftools.pipeline.FilePredictor

class trftools.pipeline.FilePredictor(resample=None, columns=False, sampling=None)

Predictor stored in files corresponding to specific stimuli

There are two basic ways to represent predictors in files (see the Notes section below for details):

Uniform time series (UTS). An NDVar with time dimension matching the data.

Non-uniform time series (NUTS). A Dataset with columns representing time stamps, event values and optionally event masks.

Warning

When a file in which a predictor is stored is changed, cached results that use that predictor are not deleted automatically. Use TRFExperiment.invalidate() whenever replacing a predictor.

Parameters:

resample (Literal['bin', 'resample']) –
How to resample the predictor. When analyses are done at different sampling rates, it is often convenient to generate predictors at a high sampling rate and then downsample dynamically to match the data.
- bin: averaging the values in time bins
- resample: use appropriate filter followed by decimation
For predictors with non-continuous information, such as impulses, binning is more appropriate. Alternatively, the predictor can be saved as a list of NDVar with all the needed sampling frequencies.
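As a rough illustration of why binning suits impulse predictors, the sketch below averages values in non-overlapping bins using plain Python. The helper name bin_average is hypothetical and this is not the trftools implementation. The comparison with naive decimation (taking every n-th sample without a filter) is only there to show how an impulse can be lost; it is not what the 'resample' option does, since that option filters before decimating.

```python
def bin_average(values, factor):
    """Downsample by averaging non-overlapping bins of `factor` samples.

    Hypothetical helper for illustration; trailing samples that do not
    fill a whole bin are dropped.
    """
    n_bins = len(values) // factor
    return [
        sum(values[i * factor:(i + 1) * factor]) / factor
        for i in range(n_bins)
    ]


# Impulse predictor sampled at a high rate: one event of magnitude 4
impulses = [0, 0, 0, 4, 0, 0, 0, 0]
binned = bin_average(impulses, 4)   # the impulse survives, scaled by 1/4
decimated = impulses[::4]           # naive decimation misses the impulse
```

With binning, every event still contributes to the bin it falls in, although its magnitude is scaled by the bin size.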
columns (bool) – Only applies to NUTS (Dataset) predictors. Use a single file with different columns. The code is interpreted as {name}-{value-column}-{mask-column}. The code {name} alone invokes an intercept, i.e. a value of 1 at each time point.
sampling (Literal['continuous', 'discrete']) – Whether to expect a continuous or a discrete predictor (usually an NDVar or a Dataset, respectively). Used to decide whether to filter this predictor with filter_x='continuous'. Note: 'discrete' predictors with a *-step suffix will always be treated as continuous.

Notes

The file-predictor expects to find a file for each stimulus containing the predictor at:

{root}/derivatives/predictors/{stimulus}~{key}[-{variant}].pickle

Where stimulus refers to the name provided by stim_var, key refers to the predictor’s name (key used in TRFExperiment.predictors), and the optional variant can be used to distinguish different variants of the same predictor.
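The naming pattern above can be sketched as a small path builder. The function predictor_path is a hypothetical helper written for this illustration, not part of trftools; it only mirrors the documented pattern and can be handy for checking that files are saved where the FilePredictor will look for them.

```python
from pathlib import Path


def predictor_path(root, stimulus, key, variant=None):
    """Build the expected predictor file path (hypothetical helper)."""
    name = f"{stimulus}~{key}"
    if variant:
        # The variant is optional and appended after a hyphen
        name += f"-{variant}"
    return Path(root) / "derivatives" / "predictors" / f"{name}.pickle"


# Example with made-up root, stimulus and key names
path = predictor_path("/data", "story", "word")
```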

UTS

UTS predictors are stored as NDVar objects with time dimension matching the data. The -{variant} part of the filename can be used freely to manage multiple predictors with the same FilePredictor instance. Use the resample parameter to determine how the predictor is resampled to match the sampling rate of the data.

NUTS

NUTS predictors are specified as Dataset objects. When loading a predictor, Dataset predictors are converted to uniform time series by placing impulses at time-stamps specified in the datasets.

Without the columns option, the dataset is expected to contain the following columns:

time: Time stamp of the event (impulse) in seconds.

value: Value of the impulse (magnitude).

mask (optional): If present, the (boolean) mask will be applied to value (i.e., value will be set to zero wherever mask is False).

With the columns=True option, the columns containing the value and mask values can be specified dynamically in the variable name, as {key}-{value-column} or {key}-{value-column}-{mask-column}.
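The conversion from time-stamped events to a uniform time series can be pictured with the following plain-Python sketch. It is an illustration under stated assumptions, not the trftools implementation: the function nuts_to_uts is hypothetical, each event is assumed to land on the nearest sample, events outside the grid are ignored, and events that share a sample are summed.

```python
def nuts_to_uts(times, values, n_samples, tstep, tmin=0.0, mask=None):
    """Place impulses on a uniform time grid (illustrative sketch).

    Assumptions: each event goes to the nearest sample, events outside
    the grid are ignored, and events sharing a sample are summed.
    """
    out = [0.0] * n_samples
    if mask is None:
        mask = [True] * len(times)
    for t, v, keep in zip(times, values, mask):
        if not keep:
            continue  # masked events contribute zero
        i = round((t - tmin) / tstep)
        if 0 <= i < n_samples:
            out[i] += v
    return out


# Three events on a 100 Hz grid (tstep = 0.01 s); the second is masked out
uts = nuts_to_uts(
    times=[0.02, 0.05, 0.08],
    values=[1.5, 2.0, 0.5],
    n_samples=10,
    tstep=0.01,
    mask=[True, False, True],
)
```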

Examples

Assume a Dataset stored at predictors/story~word.pickle, etc., with the following columns:

time, indicating the word’s onset time

frequency, the word frequency

surprisal, how surprising the word is in its context

noun, True if the word is a noun, False otherwise

This could be added to the experiment as follows:

from trftools.pipeline import FilePredictor

# Entry for TRFExperiment.predictors
predictors = {
    'word': FilePredictor(columns=True),
}

With this predictor, the following terms could be used for TRF models:

word: Unit size impulse at every word onset

word-frequency: An impulse at each word onset reflecting the word’s frequency

word-frequency-noun: An impulse at each noun’s onset reflecting the noun’s frequency
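To make the mapping from terms to impulse values concrete, here is a small sketch in plain Python. The helpers parse_term and event_values are hypothetical and do not reproduce how trftools parses codes; the sketch assumes column names contain no hyphen, ignores suffixes such as -step, and uses made-up frequency values.

```python
def parse_term(term, key):
    """Split a model term into (value_column, mask_column).

    Hypothetical helper; assumes column names contain no hyphen.
    A missing part is returned as None.
    """
    if term == key:
        return None, None  # intercept: unit impulse at each event
    if not term.startswith(key + "-"):
        raise ValueError(f"{term!r} does not belong to predictor {key!r}")
    parts = term[len(key) + 1:].split("-")
    value_column = parts[0]
    mask_column = parts[1] if len(parts) > 1 else None
    return value_column, mask_column


def event_values(rows, term, key="word"):
    """Impulse magnitude per event for a term, given rows as dicts."""
    value_column, mask_column = parse_term(term, key)
    out = []
    for row in rows:
        value = 1 if value_column is None else row[value_column]
        if mask_column is not None and not row[mask_column]:
            value = 0  # masked events are set to zero
        out.append(value)
    return out


# Two made-up events: a non-noun followed by a noun
rows = [
    {"time": 0.0, "frequency": 3.2, "noun": False},
    {"time": 0.4, "frequency": 5.1, "noun": True},
]
```

For these rows, the term word yields a unit value for both events, word-frequency yields the two frequencies, and word-frequency-noun keeps only the noun's frequency.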

These terms in turn could be used to construct the following model:

experiment.load_trfs(x="word + word-frequency + word-surprisal")

Methods