nimare.dataset.Dataset

class Dataset(source, target='mni152_2mm', mask=None)

Bases: nimare.base.NiMAREBase

Storage container for a coordinate- and/or image-based meta-analytic dataset/database.

Changed in version 0.0.9:

  • [ENH] Add merge method to Dataset class

Changed in version 0.0.8:

  • [FIX] Set nimare.dataset.Dataset.basepath in update_path() using absolute path.

Parameters
  • source (str or dict) – Path to a JSON file containing a dictionary with database information, or the dictionary itself.

  • target (str, optional) – Desired coordinate space for coordinates. Names follow NIDM convention. Default is ‘mni152_2mm’ (MNI space with 2x2x2 voxels). This parameter has no impact on images.

  • mask (str, nibabel.nifti1.Nifti1Image, nilearn.input_data.NiftiMasker or similar, or None, optional) – Mask(er) to use. If None, uses the target space image, with all non-zero voxels included in the mask.

Notes

Images loaded into a Dataset are assumed to be in the same space. If images have different resolutions or affines from the Dataset’s masker, then they will be resampled automatically, at the point where they’re used, by Dataset.masker.

property annotations

Labels describing studies in the dataset.

Each study/experiment has its own row. Columns correspond to individual labels (e.g., ‘emotion’), and may be prefixed with a feature group name followed by two underscores (e.g., ‘Neurosynth_TFIDF__emotion’).

Type

pandas.DataFrame
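
The column-naming convention described above can be illustrated with a small standalone sketch. The helper `split_label` is hypothetical (it is not part of NiMARE) and assumes the feature group is everything before the first double underscore.

```python
def split_label(column):
    """Split an annotation column name into (feature_group, label).

    Hypothetical helper for illustration only. Assumes the feature group,
    when present, is everything before the first double underscore.
    """
    group, sep, label = column.partition("__")
    if not sep:
        # No double underscore: the column is a bare label with no group.
        return None, column
    return group, label


print(split_label("Neurosynth_TFIDF__emotion"))  # ('Neurosynth_TFIDF', 'emotion')
print(split_label("emotion"))  # (None, 'emotion')
```

The single underscore inside ‘Neurosynth_TFIDF’ is left intact because only the double underscore is treated as the separator.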

property coordinates

Coordinates in the dataset.

Changed in version 0.0.10: The coordinates attribute no longer includes the associated matrix indices (columns ‘i’, ‘j’, and ‘k’). These columns are calculated as needed.

Each study has one row for each peak. Columns include [‘x’, ‘y’, ‘z’] (peak locations in mm) and ‘space’ (Dataset’s space).

Type

pandas.DataFrame

copy()

Create a copy of the Dataset.

get(dict_, drop_invalid=True)

Retrieve files and/or metadata from the current Dataset.

Parameters
  • dict_ (dict) – Dictionary specifying images or metadata to collect. Keys should be variables to be used as keys for results dictionary. Values should be tuples with two values: type (e.g., ‘image’ or ‘metadata’) and specific field corresponding to column of type-specific DataFrame (e.g., ‘z’ or ‘sample_sizes’).

  • drop_invalid (bool, optional) – Whether to automatically ignore any studies that lack the required data. Default is True.

Returns

results (dict) – A dictionary of lists of requested data. Keys correspond to the keys in dict_.

Examples

>>> dset.get({'z_maps': ('image', 'z'), 'sample_sizes': ('metadata', 'sample_sizes')})
>>> dset.get({'coordinates': ('coordinates', None)})
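
To make the shape of dict_ and the effect of drop_invalid concrete, here is a minimal standalone sketch that does not use NiMARE. The function `collect` and the nested-dictionary layout of `tables` are assumptions made for illustration; they are not the library's internals, and the sketch does not cover the ('coordinates', None) form.

```python
def collect(tables, spec, drop_invalid=True):
    """Collect requested fields into a dictionary of lists.

    tables: {type: {field: {study_id: value or None}}}  (hypothetical layout)
    spec:   {result_key: (type, field)}, mirroring the dict_ argument
    """
    # Every study ID that appears anywhere in the tables.
    study_ids = sorted(
        {sid for fields in tables.values() for column in fields.values() for sid in column}
    )

    def value(sid, kind, field):
        return tables[kind][field].get(sid)

    if drop_invalid:
        # Keep only studies that have every requested field.
        study_ids = [
            sid
            for sid in study_ids
            if all(value(sid, kind, field) is not None for kind, field in spec.values())
        ]
    return {
        key: [value(sid, kind, field) for sid in study_ids]
        for key, (kind, field) in spec.items()
    }


tables = {
    "image": {"z": {"study1": "study1_z.nii.gz", "study2": None}},
    "metadata": {"sample_sizes": {"study1": [20], "study2": [15]}},
}
spec = {"z_maps": ("image", "z"), "sample_sizes": ("metadata", "sample_sizes")}

kept = collect(tables, spec)  # study2 has no z map, so it is dropped
everything = collect(tables, spec, drop_invalid=False)
```

In this sketch, every list in the result has the same length and the same study order, which is what makes the returned values usable side by side.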

get_images(ids=None, imtype=None)

Get images of a certain type for a subset of studies in the dataset.

Parameters
  • ids (list, optional) – A list of IDs in the Dataset for which to find images. Default is None, in which case all images of requested type are returned.

  • imtype (str, optional) – Type of image to extract. Corresponds to column name in Dataset.images DataFrame. Default is None.

Returns

images (list) – List of images of requested type for selected IDs.

get_labels(ids=None)

Extract list of labels for which studies in Dataset have annotations.

Parameters

ids (list, optional) – A list of IDs in the Dataset for which to find labels. Default is None, in which case all labels are returned.

Returns

labels (list) – List of labels for which there are annotations in the Dataset.

get_metadata(ids=None, field=None)

Get metadata from Dataset.

Parameters
  • ids (list, optional) – A list of IDs in the Dataset for which to find metadata. Default is None, in which case all metadata of requested type are returned.

  • field (str, optional) – Metadata field to extract. Corresponds to column name in Dataset.metadata DataFrame. Default is None.

Returns

metadata (list) – List of values of requested type for selected IDs.

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params (dict) – Parameter names mapped to their values.

get_studies_by_coordinate(xyz, r=20)

Extract list of studies with at least one focus within radius of requested coordinates.

Parameters
  • xyz ((X x 3) array_like) – List of coordinates against which to find studies.

  • r (float, optional) – Radius (in mm) within which to find studies. Default is 20mm.

Returns

found_ids (list) – A list of IDs from the Dataset with at least one focus within radius r of requested coordinates.
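
The search described above amounts to a distance test between each study's foci and the requested coordinates. The sketch below shows that idea in plain Python. The function `studies_near` and the dictionary layout of `foci` are hypothetical, and treating a focus at exactly r mm as a match is an assumption, not a documented behavior.

```python
import math


def studies_near(foci, xyz, r=20.0):
    """Return IDs of studies with at least one focus within r mm of any query point.

    foci: {study_id: [(x, y, z), ...]} peak locations in mm (hypothetical layout)
    xyz:  list of (x, y, z) query coordinates in mm
    """
    found = []
    for study_id, peaks in foci.items():
        # Euclidean distance between every focus and every query point.
        if any(math.dist(peak, query) <= r for peak in peaks for query in xyz):
            found.append(study_id)
    return sorted(found)


foci = {
    "study1": [(0, 0, 0)],
    "study2": [(40, 0, 0), (10, 10, 10)],
    "study3": [(50, 50, 50)],
}
found_ids = studies_near(foci, [(0, 0, 0)], r=20)
```

Here `study2` is found through its second focus, which lies about 17.3 mm from the origin, while `study3` is more than 86 mm away and is excluded.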

get_studies_by_label(labels=None, label_threshold=0.001)

Extract list of studies with a given label.

Changed in version 0.0.10: Fix bug in which all IDs were returned when a label wasn’t present in the Dataset.

Changed in version 0.0.9: Default value for label_threshold changed to 0.001.

Parameters
  • labels (list, optional) – List of labels to use to search Dataset. If a contrast has all of the labels above the threshold, it will be returned. Default is None.

  • label_threshold (float, optional) – Value a label must exceed for a study to be counted as having that label. Default is 0.001.

Returns

found_ids (list) – A list of IDs from the Dataset found by the search criteria.
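
The selection rule (a study is returned only if all requested labels are above the threshold) can be sketched without NiMARE as follows. The function `studies_with_labels` and the dictionary layout of `annotations` are hypothetical. Treating a label that is absent from the data as never matching is an assumption chosen to agree with the 0.0.10 fix noted above.

```python
def studies_with_labels(annotations, labels, label_threshold=0.001):
    """Return IDs of studies whose values exceed the threshold for every label.

    annotations: {study_id: {label: value}}  (hypothetical layout)
    """
    found = []
    for study_id, values in annotations.items():
        # A missing label is treated as 0.0, so it can never pass the threshold.
        if all(values.get(label, 0.0) > label_threshold for label in labels):
            found.append(study_id)
    return found


annotations = {
    "study1": {"emotion": 0.2, "pain": 0.0},
    "study2": {"emotion": 0.05, "pain": 0.3},
}
both = studies_with_labels(annotations, ["emotion", "pain"])
emotion_only = studies_with_labels(annotations, ["emotion"])
absent = studies_with_labels(annotations, ["memory"])
```

With these inputs, only `study2` has both labels above 0.001, both studies match ‘emotion’ alone, and the absent label ‘memory’ matches nothing.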

get_studies_by_mask(mask)

Extract list of studies with at least one coordinate in mask.

Parameters

mask (img_like) – Mask across which to search for coordinates.

Returns

found_ids (list) – A list of IDs from the Dataset with at least one focus in the mask.

get_texts(ids=None, text_type=None)

Extract list of texts of a given type for selected IDs.

Parameters
  • ids (list, optional) – A list of IDs in the Dataset for which to find texts. Default is None, in which case all texts of requested type are returned.

  • text_type (str, optional) – Type of text to extract. Corresponds to column name in Dataset.texts DataFrame. Default is None.

Returns

texts (list) – List of texts of requested type for selected IDs.

property ids

1D array of identifiers in Dataset.

The associated setter for this property is private, as Dataset.ids is immutable.

Type

numpy.ndarray

property images

Images in the dataset.

Each image type has its own column (e.g., ‘z’) with absolute paths to files and each study has its own row. Additionally, relative paths to image files are stored in columns with the suffix ‘__relative’ (e.g., ‘z__relative’).

Warning

Images are assumed to be in the same space, although they may have different resolutions and affines. Images will be resampled as needed at the point where they are used, via Dataset.masker.

Type

pandas.DataFrame

classmethod load(filename, compressed=True)

Load a pickled class instance from file.

Parameters
  • filename (str) – Name of file containing object.

  • compressed (bool, optional) – If True, the file is assumed to be compressed and gzip will be used to load it. Otherwise, it will assume that the file is not compressed. Default = True.

Returns

obj (class object) – Loaded class object.

property masker

Masker object.

Defines the space and location of the area of interest (e.g., ‘brain’).

Type

nilearn.input_data.NiftiMasker or similar

merge(right)

Merge two Datasets.

New in version 0.0.9.

Parameters

right (nimare.dataset.Dataset) – Dataset to merge with.

Returns

nimare.dataset.Dataset – A Dataset of the two merged Datasets.

property metadata

Metadata describing studies in the dataset.

Each metadata field has its own column (e.g., ‘sample_sizes’) and each study has its own row.

Type

pandas.DataFrame

save(filename, compress=True)

Pickle the class instance to the provided file.

Parameters
  • filename (str) – File to which object will be saved.

  • compress (bool, optional) – If True, the file will be compressed with gzip. Otherwise, the uncompressed version will be saved. Default = True.
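
The save and load descriptions both refer to pickling with optional gzip compression. A generic round trip using only the Python standard library might look like the sketch below. The helpers `save_obj` and `load_obj` are hypothetical stand-ins, not NiMARE's own code, and a plain dictionary is pickled in place of a Dataset.

```python
import gzip
import os
import pickle
import tempfile


def save_obj(obj, filename, compress=True):
    """Pickle obj to filename, through gzip when compress is True."""
    opener = gzip.open if compress else open
    with opener(filename, "wb") as file_obj:
        pickle.dump(obj, file_obj)


def load_obj(filename, compressed=True):
    """Load a pickled object, reading through gzip when compressed is True."""
    opener = gzip.open if compressed else open
    with opener(filename, "rb") as file_obj:
        return pickle.load(file_obj)


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dataset.pkl.gz")
    save_obj({"ids": ["study1-1"]}, path)
    restored = load_obj(path)
```

The flag is spelled compress in the save signature and compressed in the load signature, and the two values need to agree for a file to be read back correctly.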

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns

self

slice(ids)

Create a new dataset with only requested IDs.

Parameters

ids (array_like) – List of study IDs to include in the new dataset.

Returns

new_dset (nimare.dataset.Dataset) – Reduced Dataset containing only requested studies.

property texts

Texts in the dataset.

Each text type has its own column (e.g., ‘abstract’) and each study has its own row.

Type

pandas.DataFrame

update_path(new_path)

Update paths to images.

Prepends new path to the relative path for files in Dataset.images.

Parameters

new_path (str) – Path to prepend to relative paths of files in Dataset.images.
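
Prepending a base path to stored relative paths can be sketched with the standard library alone. The function `prepend_path` is hypothetical and only illustrates the path arithmetic. Resolving new_path to an absolute path first follows the 0.0.8 change note at the top of this page, but the exact steps NiMARE takes are not shown here.

```python
import os.path


def prepend_path(relative_paths, new_path):
    """Join each relative path onto the absolute form of new_path."""
    base = os.path.abspath(new_path)
    return [os.path.join(base, rel) for rel in relative_paths]


relative = ["study1/z.nii.gz", "study2/z.nii.gz"]
absolute = prepend_path(relative, "maps")
```

Because the base is made absolute before joining, the resulting paths stay valid even if the working directory changes later.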
