nimare.dataset.Dataset

class Dataset(source, target='mni152_2mm', mask=None)[source]

Bases: NiMAREBase

Storage container for a coordinate- and/or image-based meta-analytic dataset/database.

Changed in version 0.0.9:

  • [ENH] Add merge method to Dataset class

Changed in version 0.0.8:

  • [FIX] Set nimare.dataset.Dataset.basepath in update_path() using absolute path.

Parameters:
  • source (str or dict) – Path to a JSON file containing a dictionary with database information, or the dictionary itself.

  • target (str, optional) – Desired coordinate space for coordinates. Names follow NIDM convention. Default is ‘mni152_2mm’ (MNI space with 2x2x2 mm voxels). This parameter has no impact on images.

  • mask (str, Nifti1Image, NiftiMasker or similar, or None, optional) – Mask(er) to use. If None, uses the target space image, with all non-zero voxels included in the mask.

Variables:

space (str) – Standard space. Same as target parameter.

Notes

Images loaded into a Dataset are assumed to be in the same space. If images have different resolutions or affines from the Dataset’s masker, then they will be resampled automatically, at the point where they’re used, by Dataset.masker.
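As a rough illustration of the source parameter, the sketch below builds a small dictionary and writes it to a JSON file using only the standard library. The nested layout (study ID, then "contrasts", then "coords" and "metadata") is an assumption based on NiMARE's example datasets and is not specified on this page. The study ID, coordinates, and file name are made up, so check the layout against the NiMARE version you use.

```python
import json
import os
import tempfile

# Hypothetical one-study dataset. The nesting below is an assumed layout,
# not something this page specifies.
source = {
    "study-01": {
        "contrasts": {
            "1": {
                "coords": {
                    "space": "MNI",
                    "x": [-42.0, 38.0],
                    "y": [22.0, -60.0],
                    "z": [10.0, 44.0],
                },
                "metadata": {"sample_sizes": [24]},
            }
        }
    }
}

# Write the dictionary to disk so that either form could serve as `source`.
json_path = os.path.join(tempfile.mkdtemp(), "toy_dataset.json")
with open(json_path, "w") as fobj:
    json.dump(source, fobj, indent=2)

# With NiMARE installed, the intended usage would be one of:
#   from nimare.dataset import Dataset
#   dset = Dataset(source)      # dict form
#   dset = Dataset(json_path)   # JSON-file form

# Read the file back to confirm the JSON form carries the same content.
with open(json_path) as fobj:
    reloaded = json.load(fobj)
```

The dict form is convenient for programmatically generated data, while the JSON form is easier to share alongside image files.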

Methods

copy()

Create a copy of the Dataset.

get(dict_[, drop_invalid])

Retrieve files and/or metadata from the current Dataset.

get_images([ids, imtype])

Get images of a certain type for a subset of studies in the dataset.

get_labels([ids])

Extract list of labels for which studies in Dataset have annotations.

get_metadata([ids, field])

Get metadata from Dataset.

get_params([deep])

Get parameters for this estimator.

get_studies_by_coordinate(xyz[, r])

Extract list of studies with at least one focus within radius of requested coordinates.

get_studies_by_label([labels, label_threshold])

Extract list of studies with a given label.

get_studies_by_mask(mask)

Extract list of studies with at least one coordinate in mask.

get_texts([ids, text_type])

Extract list of texts of a given type for selected IDs.

load(filename[, compressed])

Load a pickled class instance from file.

merge(right)

Merge two Datasets.

save(filename[, compress])

Pickle the class instance to the provided file.

set_params(**params)

Set the parameters of this estimator.

slice(ids)

Create a new dataset with only requested IDs.

update_path(new_path)

Update paths to images.

Properties

annotations

Labels describing studies in the dataset.

coordinates

Coordinates in the dataset.

ids

1D array of identifiers in Dataset.

images

Images in the dataset.

masker

Masker object.

metadata

Metadata describing studies in the dataset.

texts

Texts in the dataset.

property annotations

Labels describing studies in the dataset.

Each study/experiment has its own row. Columns correspond to individual labels (e.g., ‘emotion’), and may be prefixed with a feature group including two underscores (e.g., ‘Neurosynth_TFIDF__emotion’).

Type:

pandas.DataFrame
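The column naming convention described above can be sketched with plain string handling. This is only an illustration of the double-underscore prefix rule; `split_label` is a hypothetical helper, not part of NiMARE.

```python
def split_label(column):
    """Split an annotation column name into (feature_group, label)."""
    group, sep, label = column.rpartition("__")
    # rpartition returns empty strings for group and sep when "__" is absent.
    return (group if sep else None), label


columns = ["emotion", "Neurosynth_TFIDF__emotion"]
parsed = [split_label(col) for col in columns]
print(parsed)  # [(None, 'emotion'), ('Neurosynth_TFIDF', 'emotion')]
```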

property coordinates

Coordinates in the dataset.

Changed in version 0.0.10: The coordinates attribute no longer includes the associated matrix indices (columns ‘i’, ‘j’, and ‘k’). These columns are calculated as needed.

Each study has one row for each peak. Columns include [‘x’, ‘y’, ‘z’] (peak locations in mm) and ‘space’ (Dataset’s space).

Type:

pandas.DataFrame

copy()[source]

Create a copy of the Dataset.

get(dict_, drop_invalid=True)[source]

Retrieve files and/or metadata from the current Dataset.

Parameters:
  • dict_ (dict) – Dictionary specifying images or metadata to collect. Keys should be variables to be used as keys for results dictionary. Values should be tuples with two values: type (e.g., ‘image’ or ‘metadata’) and specific field corresponding to column of type-specific DataFrame (e.g., ‘z’ or ‘sample_sizes’).

  • drop_invalid (bool, optional) – Whether to automatically ignore any studies without the required data. Default is True.

Returns:

results – A dictionary of lists of requested data. Keys correspond to the keys in dict_.

Return type:

dict

Examples

>>> dset.get({'z_maps': ('image', 'z'), 'sample_sizes': ('metadata', 'sample_sizes')})
>>> dset.get({'coordinates': ('coordinates', None)})

get_images(ids=None, imtype=None)[source]

Get images of a certain type for a subset of studies in the dataset.

Parameters:
  • ids (list, optional) – A list of IDs in the Dataset for which to find images. Default is None, in which case all images of requested type are returned.

  • imtype (str, optional) – Type of image to extract. Corresponds to column name in Dataset.images DataFrame. Default is None.

Returns:

images – List of images of requested type for selected IDs.

Return type:

list

get_labels(ids=None)[source]

Extract list of labels for which studies in Dataset have annotations.

Parameters:

ids (list, optional) – A list of IDs in the Dataset for which to find labels. Default is None, in which case all labels are returned.

Returns:

labels – List of labels for which there are annotations in the Dataset.

Return type:

list

get_metadata(ids=None, field=None)[source]

Get metadata from Dataset.

Parameters:
  • ids (list, optional) – A list of IDs in the Dataset for which to find metadata. Default is None, in which case all metadata of requested type are returned.

  • field (str, optional) – Metadata field to extract. Corresponds to column name in Dataset.metadata DataFrame. Default is None.

Returns:

metadata – List of values of requested type for selected IDs.

Return type:

list

get_params(deep=True)[source]

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

get_studies_by_coordinate(xyz, r=20)[source]

Extract list of studies with at least one focus within radius of requested coordinates.

Parameters:
  • xyz ((X x 3) array_like) – List of coordinates against which to find studies.

  • r (float, optional) – Radius (in mm) within which to find studies. Default is 20mm.

Returns:

found_ids – A list of IDs from the Dataset with at least one focus within radius r of requested coordinates.

Return type:

list
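The radius search described here can be pictured with a small standard-library sketch. It is not NiMARE's implementation: the foci are made-up values, `studies_near` is a hypothetical helper, and treating the radius as inclusive (distance <= r) is an assumption.

```python
import math

# Toy foci per study, in mm (made-up values, not from a real Dataset).
foci = {
    "study-01": [(-42.0, 22.0, 10.0), (38.0, -60.0, 44.0)],
    "study-02": [(0.0, 0.0, 0.0)],
    "study-03": [(-40.0, 20.0, 12.0)],
}


def studies_near(foci, xyz, r=20):
    """Return IDs with at least one focus within r mm of any point in xyz."""
    found = []
    for study_id, peaks in foci.items():
        # A study matches as soon as any focus is close to any target point.
        if any(math.dist(peak, target) <= r for peak in peaks for target in xyz):
            found.append(study_id)
    return found


found_ids = studies_near(foci, xyz=[(-40.0, 20.0, 10.0)], r=5)
print(found_ids)  # ['study-01', 'study-03']
```

Because xyz is an (X x 3) array, several target coordinates can be passed at once; a study is returned if it matches any of them.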

get_studies_by_label(labels=None, label_threshold=0.001)[source]

Extract list of studies with a given label.

Changed in version 0.0.10: Fix bug in which all IDs were returned when a label wasn’t present in the Dataset.

Changed in version 0.0.9: Default value for label_threshold changed to 0.001.

Parameters:
  • labels (list, optional) – List of labels to use to search Dataset. If a contrast has all of the labels above the threshold, it will be returned. Default is None.

  • label_threshold (float, optional) – Value that a study’s annotation must exceed for the label to count as present. Default is 0.001.

Returns:

found_ids – A list of IDs from the Dataset found by the search criteria.

Return type:

list
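The selection rule (a study is returned only if all requested labels are above the threshold) can be sketched without NiMARE as follows. The annotation values and the `studies_with_labels` helper are hypothetical, and the sketch does not model how labels missing from the Dataset are handled.

```python
# Toy annotation values per study (made-up numbers).
annotations = {
    "study-01": {"emotion": 0.12, "pain": 0.0},
    "study-02": {"emotion": 0.30, "pain": 0.05},
    "study-03": {"emotion": 0.0005, "pain": 0.20},
}


def studies_with_labels(annotations, labels, label_threshold=0.001):
    """Return IDs whose value exceeds the threshold for every requested label."""
    return [
        study_id
        for study_id, values in annotations.items()
        if all(values[label] > label_threshold for label in labels)
    ]


emotion_ids = studies_with_labels(annotations, ["emotion"])
both_ids = studies_with_labels(annotations, ["emotion", "pain"])
print(emotion_ids)  # ['study-01', 'study-02']
print(both_ids)     # ['study-02']
```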

get_studies_by_mask(mask)[source]

Extract list of studies with at least one coordinate in mask.

Parameters:

mask (img_like) – Mask across which to search for coordinates.

Returns:

found_ids – A list of IDs from the Dataset with at least one focus in the mask.

Return type:

list

get_texts(ids=None, text_type=None)[source]

Extract list of texts of a given type for selected IDs.

Parameters:
  • ids (list, optional) – A list of IDs in the Dataset for which to find texts. Default is None, in which case all texts of requested type are returned.

  • text_type (str, optional) – Type of text to extract. Corresponds to column name in Dataset.texts DataFrame. Default is None.

Returns:

texts – List of texts of requested type for selected IDs.

Return type:

list

property ids

1D array of identifiers in Dataset.

The associated setter for this property is private, as Dataset.ids is immutable.

Type:

numpy.ndarray

property images

Images in the dataset.

Each image type has its own column (e.g., ‘z’) with absolute paths to files and each study has its own row. Additionally, relative paths to image files are stored in columns with the suffix ‘__relative’ (e.g., ‘z__relative’).

Warning

Images are assumed to be in the same space, although they may have different resolutions and affines. Images will be resampled as needed at the point where they are used, via Dataset.masker.

Type:

pandas.DataFrame

classmethod load(filename, compressed=True)[source]

Load a pickled class instance from file.

Parameters:
  • filename (str) – Name of file containing object.

  • compressed (bool, default=True) – If True, the file is assumed to be gzip-compressed and is loaded with gzip. Otherwise, the file is read as an uncompressed pickle.

Returns:

obj – Loaded class object.

Return type:

class object

property masker

Masker object.

Defines the space and location of the area of interest (e.g., ‘brain’).

Type:

nilearn.input_data.NiftiMasker or similar

merge(right)[source]

Merge two Datasets.

New in version 0.0.9.

Parameters:

right (Dataset) – Dataset to merge with.

Returns:

A Dataset of the two merged Datasets.

Return type:

Dataset

property metadata

Metadata describing studies in the dataset.

Each metadata field has its own column (e.g., ‘sample_sizes’) and each study has its own row.

Type:

pandas.DataFrame

save(filename, compress=True)[source]

Pickle the class instance to the provided file.

Parameters:
  • filename (str) – File to which object will be saved.

  • compress (bool, optional) – If True, the file will be compressed with gzip. Otherwise, the uncompressed version will be saved. Default = True.
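The save and load parameter descriptions suggest a gzip-plus-pickle round trip. The sketch below shows that general pattern with the standard library only; it is not NiMARE's code, and `save_obj`, `load_obj`, and the payload are hypothetical stand-ins for a Dataset.

```python
import gzip
import os
import pickle
import tempfile


def save_obj(obj, filename, compress=True):
    """Pickle obj to filename, optionally through gzip."""
    opener = gzip.open if compress else open
    with opener(filename, "wb") as fobj:
        pickle.dump(obj, fobj)


def load_obj(filename, compressed=True):
    """Load a pickled object; `compressed` must match how it was saved."""
    opener = gzip.open if compressed else open
    with opener(filename, "rb") as fobj:
        return pickle.load(fobj)


# Stand-in payload instead of a real Dataset instance.
payload = {"ids": ["study-01-1", "study-02-1"], "space": "mni152_2mm"}
path = os.path.join(tempfile.mkdtemp(), "toy.pkl.gz")
save_obj(payload, path, compress=True)
restored = load_obj(path, compressed=True)
```

The practical point is that the flag used when loading has to match the one used when saving; reading a gzip-compressed file as a plain pickle fails.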

set_params(**params)[source]

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Return type:

self

slice(ids)[source]

Create a new dataset with only requested IDs.

Parameters:

ids (array_like) – List of study IDs to include in the new dataset.

Returns:

new_dset – Reduced Dataset containing only requested studies.

Return type:

Dataset

property texts

Texts in the dataset.

Each text type has its own column (e.g., ‘abstract’) and each study has its own row.

Type:

pandas.DataFrame

update_path(new_path)[source]

Update paths to images.

Prepends new path to the relative path for files in Dataset.images.

Parameters:

new_path (str) – Path to prepend to relative paths of files in Dataset.images.
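The effect of prepending a path can be sketched with plain dictionaries standing in for rows of Dataset.images. This is an illustration of the idea rather than NiMARE's implementation: `prepend_path`, the IDs, and the file names are hypothetical, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath

# Toy stand-in for two rows of Dataset.images (file names are made up).
rows = [
    {"id": "study-01-1", "z__relative": "study-01/z.nii.gz"},
    {"id": "study-02-1", "z__relative": "study-02/z.nii.gz"},
]


def prepend_path(rows, new_path):
    """Fill each absolute column from its '__relative' counterpart."""
    updated = []
    for row in rows:
        new_row = dict(row)
        for column, rel_path in row.items():
            if column.endswith("__relative"):
                # 'z__relative' feeds the absolute-path column 'z'.
                abs_column = column[: -len("__relative")]
                new_row[abs_column] = str(PurePosixPath(new_path) / rel_path)
        updated.append(new_row)
    return updated


updated = prepend_path(rows, "/data/maps")
print(updated[0]["z"])  # /data/maps/study-01/z.nii.gz
```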

Examples using nimare.dataset.Dataset

The NiMARE Dataset object

Use NeuroVault statistical maps in NiMARE

Transform images into coordinates

Using NIMADS with NiMARE

Coordinate-based meta-analysis algorithms

Image-based meta-analysis algorithms

KernelTransformers and CBMA

The Estimator class

The Corrector class

Compare image and coordinate based meta-analyses

Two-sample ALE meta-analysis

Simulate data for coordinate based meta-analysis

Run a coordinate-based meta-analysis (CBMA) workflow

Coordinate-based meta-regression algorithms

Run an image-based meta-analysis (IBMA) workflow

Simple annotation from text

The Cognitive Atlas

LDA topic modeling

GCLDA topic modeling

Discrete functional decoding