`nimare.dataset`.Dataset

class Dataset(source, target='mni152_2mm', mask=None)

Bases: NiMAREBase

Storage container for a coordinate- and/or image-based meta-analytic dataset/database.

Changed in version 0.0.9:

[ENH] Add merge method to Dataset class

Changed in version 0.0.8:

[FIX] Set nimare.dataset.Dataset.basepath in update_path() using absolute path.

Parameters:

source (str or dict) – JSON file containing dictionary with database information or the dict() object
target (str, optional) – Desired coordinate space for coordinates. Names follow NIDM convention. Default is ‘mni152_2mm’ (MNI space with 2x2x2 mm voxels). This parameter has no impact on images.
mask (str, Nifti1Image, NiftiMasker or similar, or None, optional) – Mask(er) to use. If None, uses the target space image, with all non-zero voxels included in the mask.

Variables:

space (str) – Standard space. Same as target parameter.

Notes

Warning

Dataset is deprecated and will be removed in a future release. For new workflows, use Studyset instead. If you need a Dataset-compatible tabular execution view, use view().

Images loaded into a Dataset are assumed to be in the same space. If images have different resolutions or affines from the Dataset’s masker, then they will be resampled automatically, at the point where they’re used, by Dataset.masker.
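As a rough illustration of the source parameter, the sketch below builds a small dictionary and writes it to a JSON file. The nesting shown (study ID, then ‘contrasts’, then ‘coords’ and ‘metadata’), the study name, and all values are assumptions made for illustration; this page does not specify the layout, so check the NiMARE JSON examples before relying on it.

```python
import json
import os
import tempfile

# Hypothetical database: one study with one contrast and two foci.
# The key layout here is an assumption, not taken from this page.
source = {
    "study-01": {
        "contrasts": {
            "1": {
                "coords": {
                    "space": "MNI",
                    "x": [-42.0, 38.0],
                    "y": [22.0, -60.0],
                    "z": [8.0, 44.0],
                },
                "metadata": {"sample_sizes": [25]},
            }
        }
    }
}

# The source parameter accepts either the dict itself or the path to a
# JSON file holding the same dictionary, so write one out as well.
json_path = os.path.join(tempfile.mkdtemp(), "example_dataset.json")
with open(json_path, "w") as fobj:
    json.dump(source, fobj, indent=2)

# Read the file back to confirm it holds the same dictionary.
with open(json_path) as fobj:
    reloaded = json.load(fobj)
```

With NiMARE installed, either source or json_path could then be passed as the first argument of Dataset.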

Methods

`copy`()	Create a copy of the Dataset.
`get`(dict_[, drop_invalid])	Retrieve files and/or metadata from the current Dataset.
`get_images`([ids, imtype])	Get images of a certain type for a subset of studies in the dataset.
`get_labels`([ids])	Extract list of labels for which studies in Dataset have annotations.
`get_metadata`([ids, field])	Get metadata from Dataset.
`get_params`([deep])	Get parameters for this estimator.
`get_studies_by_coordinate`(xyz[, r])	Extract list of studies with at least one focus within radius of requested coordinates.
`get_studies_by_label`([labels, label_threshold])	Extract list of studies with a given label.
`get_studies_by_mask`(mask)	Extract list of studies with at least one focus in mask.
`get_texts`([ids, text_type])	Extract list of texts of a given type for selected IDs.
`load`(filename[, compressed])	Load a pickled class instance from file.
`merge`(right)	Merge two Datasets.
`save`(filename[, compress])	Pickle the class instance to the provided file.
`set_params`(**params)	Set the parameters of this estimator.
`slice`(ids)	Create a new dataset with only requested IDs.
`update_path`(new_path)	Update paths to images.

Properties

`annotations`	Labels describing studies in the dataset.
`annotations_df`	Alias for `annotations`.
`coordinates`	Coordinates in the dataset.
`ids`	1D array of identifiers in Dataset.
`images`	Images in the dataset.
`masker`	Masker object.
`metadata`	Metadata describing studies in the dataset.
`texts`	Texts in the dataset.

property annotations

Labels describing studies in the dataset.

Each study/experiment has its own row. Columns correspond to individual labels (e.g., ‘emotion’), and may be prefixed with a feature group including two underscores (e.g., ‘Neurosynth_TFIDF__emotion’).

Type: pandas.DataFrame
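The feature-group prefix described above can be separated from the label with a plain string split. This is a minimal sketch; split_label is a hypothetical helper, not part of NiMARE, and it assumes the first double underscore marks the end of the feature group.

```python
def split_label(column):
    """Split an annotation column name into (feature_group, label).

    Columns without a double underscore have no feature group.
    """
    group, sep, label = column.partition("__")
    if not sep:
        return None, column
    return group, label


prefixed = split_label("Neurosynth_TFIDF__emotion")
plain = split_label("emotion")
```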

property annotations_df

Alias for annotations.

Provides a unified tabular annotation interface shared with Studyset.

Type: pandas.DataFrame

property coordinates

Coordinates in the dataset.

Changed in version 0.0.10: The coordinates attribute no longer includes the associated matrix indices (columns ‘i’, ‘j’, and ‘k’). These columns are calculated as needed.

Each study has one row for each peak. Columns include [‘x’, ‘y’, ‘z’] (peak locations in mm) and ‘space’ (Dataset’s space).

Type: pandas.DataFrame
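Because the matrix indices (‘i’, ‘j’, ‘k’) are computed on demand from the mm coordinates, it may help to see what such a conversion looks like. The sketch below assumes an axis-aligned grid with no rotation or shear; mm_to_voxel, the voxel sizes, and the origin are hypothetical and do not necessarily match the template a Dataset uses.

```python
def mm_to_voxel(xyz, voxel_size, origin):
    """Convert one mm coordinate to matrix indices for an axis-aligned grid.

    Assumes mm = voxel_size * index + origin along each axis, i.e. an
    affine with no rotation or shear.
    """
    return tuple(
        int(round((coord - off) / size))
        for coord, size, off in zip(xyz, voxel_size, origin)
    )


# Hypothetical 2 mm grid; these numbers are illustrative only.
voxel_size = (-2.0, 2.0, 2.0)
origin = (90.0, -126.0, -72.0)

ijk = mm_to_voxel((0.0, 0.0, 0.0), voxel_size, origin)
```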

copy()

Create a copy of the Dataset.

get(dict_, drop_invalid=True)

Retrieve files and/or metadata from the current Dataset.

Warning

This legacy Dataset method will be deprecated in a future release. Prefer get() or get().

Parameters:

dict_ (dict) – Dictionary specifying images or metadata to collect. Keys should be variables to be used as keys for results dictionary. Values should be tuples with two values: type (e.g., ‘image’ or ‘metadata’) and specific field corresponding to column of type-specific DataFrame (e.g., ‘z’ or ‘sample_sizes’).
drop_invalid (bool, optional) – Whether to automatically ignore any studies without the required data or not. Default is True.

Returns:

results – A dictionary of lists of requested data. Keys correspond to the keys in dict_.

Return type:

dict

Examples

>>> dset.get({'z_maps': ('image', 'z'), 'sample_sizes': ('metadata', 'sample_sizes')})
>>> dset.get({'coordinates': ('coordinates', None)})
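To make the request and result shapes concrete, the following stand-in mimics the behavior described above on plain dictionaries. It is a sketch only: collect and the tables layout are hypothetical, and a missing value is represented as None by assumption.

```python
def collect(tables, request, drop_invalid=True):
    """Mimic the request/result shape of Dataset.get() on plain dicts.

    `tables` maps a type name (e.g. 'image', 'metadata') to
    {study_id: {field: value}}. This is a stand-in, not NiMARE code.
    """
    ids = sorted({sid for table in tables.values() for sid in table})
    if drop_invalid:
        # Keep only studies that have every requested value.
        ids = [
            sid for sid in ids
            if all(
                tables[kind].get(sid, {}).get(field) is not None
                for kind, field in request.values()
            )
        ]
    # One list per requested key, aligned on the retained study IDs.
    return {
        key: [tables[kind].get(sid, {}).get(field) for sid in ids]
        for key, (kind, field) in request.items()
    }


tables = {
    "image": {"s1": {"z": "/data/s1_z.nii.gz"}, "s2": {"z": None}},
    "metadata": {"s1": {"sample_sizes": [20]}, "s2": {"sample_sizes": [31]}},
}
request = {"z_maps": ("image", "z"), "sample_sizes": ("metadata", "sample_sizes")}

results = collect(tables, request)
results_all = collect(tables, request, drop_invalid=False)
```

Here study ‘s2’ has no z map, so it is dropped from every list in results but kept, with a None entry, in results_all.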

get_images(ids=None, imtype=None)

Get images of a certain type for a subset of studies in the dataset.

Warning

This legacy Dataset method will be deprecated in a future release. Prefer get_images().

Parameters:

ids (list, optional) – A list of IDs in the Dataset for which to find images. Default is None, in which case all images of requested type are returned.
imtype (str, optional) – Type of image to extract. Corresponds to column name in Dataset.images DataFrame. Default is None.

Returns:

images – List of images of requested type for selected IDs.

Return type:

list

get_labels(ids=None)

Extract list of labels for which studies in Dataset have annotations.

Warning

This legacy Dataset method will be deprecated in a future release. Prefer get_labels() or get_labels().

Parameters: ids (list, optional) – A list of IDs in the Dataset for which to find labels. Default is None, in which case all labels are returned.
Returns: labels – List of labels for which there are annotations in the Dataset.
Return type: list

get_metadata(ids=None, field=None)

Get metadata from Dataset.

Warning

This legacy Dataset method will be deprecated in a future release. Prefer get_metadata().

Parameters:

ids (list, optional) – A list of IDs in the Dataset for which to find metadata. Default is None, in which case all metadata of requested type are returned.
field (str, optional) – Metadata field to extract. Corresponds to column name in Dataset.metadata DataFrame. Default is None.

Returns:

metadata – List of values of requested type for selected IDs.

Return type:

list

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

get_studies_by_coordinate(xyz, r=20)

Extract list of studies with at least one focus within radius of requested coordinates.

Warning

This legacy Dataset method will be deprecated in a future release. Prefer get_analyses_by_coordinate() for slicing-ready analysis IDs, or get_studies_by_coordinate() for the Dataset-style convenience wrapper.

Parameters:

xyz ((X x 3) array_like) – List of coordinates against which to find studies.
r (float, optional) – Radius (in mm) within which to find studies. Default is 20mm.

Returns:

found_ids – A list of IDs from the Dataset with at least one focus within radius r of requested coordinates.

Return type:

list
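The radius search performed by get_studies_by_coordinate can be pictured with a small pure-Python stand-in. This is a sketch under assumptions: studies_near is a hypothetical helper, distances are Euclidean, and a focus exactly r mm away is counted as within the radius.

```python
import math


def studies_near(foci, xyz, r=20):
    """Return sorted IDs with at least one focus within r mm of any query point.

    `foci` is a list of (study_id, x, y, z) rows. The inclusive boundary
    (distance <= r) is an assumption of this sketch.
    """
    found = set()
    for study_id, x, y, z in foci:
        if any(math.dist((x, y, z), point) <= r for point in xyz):
            found.add(study_id)
    return sorted(found)


foci = [
    ("s1", -40.0, 20.0, 10.0),
    ("s1", 40.0, -60.0, 44.0),
    ("s2", 0.0, -90.0, 0.0),
]
found_ids = studies_near(foci, xyz=[(-42.0, 22.0, 8.0)], r=10)
```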

get_studies_by_label(labels=None, label_threshold=0.001)

Extract list of studies with a given label.

Changed in version 0.0.10: Fix bug in which all IDs were returned when a label wasn’t present in the Dataset.

Changed in version 0.0.9: Default value for label_threshold changed to 0.001.

Warning

This legacy Dataset method will be deprecated in a future release. Prefer get_analyses_by_label() for slicing-ready analysis IDs, or get_studies_by_label() for the Dataset-style convenience wrapper.

Parameters:

labels (list, optional) – List of labels to use to search Dataset. If a contrast has all of the labels above the threshold, it will be returned. Default is None.
label_threshold (float, optional) – Threshold a label’s value must exceed for the label to count as present in a study. Default is 0.001.

Returns:

found_ids – A list of IDs from the Dataset found by the search criteria.

Return type:

list
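The selection rule for get_studies_by_label (every requested label must be above the threshold) can be sketched as follows. studies_with_labels is a hypothetical stand-in; the strict comparison and the treatment of an absent label as 0.0 are assumptions of the sketch, and the real method may handle missing labels differently.

```python
def studies_with_labels(annotations, labels, label_threshold=0.001):
    """Return IDs whose value exceeds the threshold for every requested label.

    `annotations` maps study_id -> {label: value}. Treating an absent
    label as 0.0 and using a strict '>' are assumptions of this sketch.
    """
    return [
        study_id
        for study_id, values in annotations.items()
        if all(values.get(label, 0.0) > label_threshold for label in labels)
    ]


annotations = {
    "s1": {"emotion": 0.12, "pain": 0.0},
    "s2": {"emotion": 0.05, "pain": 0.30},
}
both = studies_with_labels(annotations, ["emotion", "pain"])
emotion_only = studies_with_labels(annotations, ["emotion"])
```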

get_studies_by_mask(mask)

Extract list of studies with at least one focus in mask.

Warning

This legacy Dataset method will be deprecated in a future release. Prefer get_analyses_by_mask() for slicing-ready analysis IDs, or get_studies_by_mask() for the Dataset-style convenience wrapper.

Parameters: mask (Nifti1Image) – Mask with which to evaluate coordinates for inclusion.
Returns: found_ids – A list of IDs from the Dataset with at least one focus in the mask.
Return type: list

get_texts(ids=None, text_type=None)

Extract list of texts of a given type for selected IDs.

Warning

This legacy Dataset method will be deprecated in a future release. Prefer get_texts().

Parameters:

ids (list, optional) – A list of IDs in the Dataset for which to find texts. Default is None, in which case all texts of requested type are returned.
text_type (str, optional) – Type of text to extract. Corresponds to column name in Dataset.texts DataFrame. Default is None.

Returns:

texts – List of texts of requested type for selected IDs.

Return type:

list

property ids

1D array of identifiers in Dataset.

The associated setter for this property is private, as Dataset.ids is immutable.

Type: numpy.ndarray

property images

Images in the dataset.

Each image type has its own column (e.g., ‘z’) with absolute paths to files and each study has its own row. Additionally, relative paths to image files are stored in columns with the suffix ‘__relative’ (e.g., ‘z__relative’).

Warning

Images are assumed to be in the same space, although they may have different resolutions and affines. Images will be resampled as needed at the point where they are used, via Dataset.masker.

Type: pandas.DataFrame

classmethod load(filename, compressed=True)

Load a pickled class instance from file.

Parameters:

filename (str) – Name of file containing object.
compressed (bool, default=True) – If True, the file is assumed to be compressed and gzip will be used to load it. Otherwise, the file is assumed to be uncompressed.

Returns:

obj – Loaded class object.

Return type:

class object

property masker

Masker object.

Defines the space and location of the area of interest (e.g., ‘brain’).

Type: nilearn.maskers.NiftiMasker or similar

merge(right)

Merge two Datasets.

Added in version 0.0.9.

Warning

This legacy Dataset method will be deprecated in a future release. Prefer merge().

Parameters: right (Dataset) – Dataset to merge with.
Returns: A Dataset of the two merged Datasets.
Return type: Dataset

property metadata

Metadata describing studies in the dataset.

Each metadata field has its own column (e.g., ‘sample_sizes’) and each study has its own row.

Type: pandas.DataFrame

save(filename, compress=True)

Pickle the class instance to the provided file.

Parameters:

filename (str) – File to which object will be saved.
compress (bool, optional) – If True, the file will be compressed with gzip. Otherwise, the uncompressed version will be saved. Default = True.
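The pairing between save and load (pickle, optionally wrapped in gzip) can be illustrated with the standard library alone. This is a sketch of the general technique, not NiMARE's code; save_obj and load_obj are hypothetical names, and a plain dict stands in for the class instance.

```python
import gzip
import os
import pickle
import tempfile


def save_obj(obj, filename, compress=True):
    """Pickle obj to filename, through gzip when compress is True."""
    opener = gzip.open if compress else open
    with opener(filename, "wb") as fobj:
        pickle.dump(obj, fobj)


def load_obj(filename, compressed=True):
    """Load a pickled object, reading through gzip when compressed is True."""
    opener = gzip.open if compressed else open
    with opener(filename, "rb") as fobj:
        return pickle.load(fobj)


path = os.path.join(tempfile.mkdtemp(), "example.pkl.gz")
payload = {"ids": ["s1", "s2"], "space": "mni152_2mm"}
save_obj(payload, path, compress=True)
restored = load_obj(path, compressed=True)
```

The flag used when loading has to match the one used when saving; a file written with compression cannot be unpickled as if it were uncompressed. As with any pickle, only load files from a trusted source.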

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Return type: self
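The <component>__<parameter> convention can be sketched with two small hypothetical classes. Params, Kernel, and Estimator below are illustrative stand-ins, not NiMARE classes, and unlike a real implementation this sketch does not validate parameter names.

```python
class Params:
    """Minimal base class implementing the double-underscore convention."""

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, sub_key = key.partition("__")
            if sep:
                # '<component>__<parameter>': delegate to the nested object.
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self


class Kernel(Params):
    def __init__(self, fwhm=10):
        self.fwhm = fwhm


class Estimator(Params):
    def __init__(self, kernel, n_iters=100):
        self.kernel = kernel
        self.n_iters = n_iters


# Update a top-level parameter and a parameter of the nested component.
est = Estimator(Kernel()).set_params(n_iters=500, kernel__fwhm=6)
```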

slice(ids)

Create a new dataset with only requested IDs.

Warning

This legacy Dataset method will be deprecated in a future release. Prefer slice() for nested Studysets or slice() for Studyset-backed tabular slicing.

Parameters: ids (array_like) – List of study IDs to include in new dataset.
Returns: new_dset – Reduced Dataset containing only requested studies.
Return type: Dataset
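Conceptually, slicing keeps only the rows belonging to the requested IDs in each table and leaves the original untouched. The stand-in below shows that idea on lists of row dictionaries; slice_tables and the ‘id’ key are assumptions of the sketch, not NiMARE's implementation.

```python
def slice_tables(tables, ids):
    """Return new tables keeping only rows whose 'id' is in ids."""
    wanted = set(ids)
    return {
        name: [dict(row) for row in rows if row["id"] in wanted]
        for name, rows in tables.items()
    }


tables = {
    "coordinates": [
        {"id": "s1", "x": -40.0, "y": 20.0, "z": 10.0},
        {"id": "s2", "x": 0.0, "y": -90.0, "z": 0.0},
    ],
    "metadata": [
        {"id": "s1", "sample_sizes": [20]},
        {"id": "s2", "sample_sizes": [31]},
    ],
}
subset = slice_tables(tables, ["s2"])
```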

property texts

Texts in the dataset.

Each text type has its own column (e.g., ‘abstract’) and each study has its own row.

Type: pandas.DataFrame

update_path(new_path)

Update paths to images.

Warning

This legacy Dataset method will be deprecated in a future release. Prefer update_path().

Prepends new path to the relative path for files in Dataset.images.

Parameters: new_path (str) – Path to prepend to relative paths of files in Dataset.images.
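The effect of prepending a path to the ‘__relative’ columns can be sketched with the standard library. prepend_path is a hypothetical helper, the file names are made up, and leaving missing entries as None is an assumption of the sketch.

```python
import posixpath


def prepend_path(relative_paths, new_path):
    """Join new_path with each relative path; missing entries stay None."""
    return [
        posixpath.join(new_path, rel) if rel is not None else None
        for rel in relative_paths
    ]


# Stand-in for a 'z__relative' column, with one study lacking a z map.
z_relative = ["study-01/z.nii.gz", None, "study-03/z.nii.gz"]
z_absolute = prepend_path(z_relative, "/data/my_project")
```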

Examples using `nimare.dataset.Dataset`

The legacy NiMARE Dataset object

The NiMARE Studyset object

Create a legacy NiMARE Dataset object from a JSON file
