nimare.dataset.Dataset
- class Dataset(source, target='mni152_2mm', mask=None)
Bases: nimare.base.NiMAREBase
Storage container for a coordinate- and/or image-based meta-analytic dataset/database.
Changed in version 0.0.9: [ENH] Add merge method to Dataset class.
Changed in version 0.0.8: [FIX] Set nimare.dataset.Dataset.basepath in update_path() using absolute path.
- Parameters
source (str or dict) – JSON file containing a dictionary with database information, or the dict object itself.
target (str, optional) – Desired coordinate space for coordinates. Names follow NIDM convention. Default is ‘mni152_2mm’ (MNI space with 2x2x2 mm voxels). This parameter has no impact on images.
mask (str, nibabel.nifti1.Nifti1Image, nilearn.input_data.NiftiMasker or similar, or None, optional) – Mask(er) to use. If None, uses the target space image, with all non-zero voxels included in the mask.
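As a rough illustration of the source argument, the sketch below builds a small dictionary and writes it to a JSON file using only the standard library. The nested layout (study ID, then "contrasts", then "coords" and "metadata"), the key names, and the study name are assumptions made for illustration; this page does not specify the schema, so check it against the NiMARE documentation before relying on it.

```python
import json
import os
import tempfile

# Hypothetical layout: study -> "contrasts" -> contrast -> data blocks.
# The key names below are assumptions, not confirmed by this page.
source = {
    "study01": {
        "contrasts": {
            "1": {
                "coords": {
                    "space": "MNI",
                    "x": [-42.0, 38.0],
                    "y": [22.0, -60.0],
                    "z": [4.0, 48.0],
                },
                "metadata": {"sample_sizes": [20]},
            }
        }
    }
}

# Write the dictionary to disk. Either this path or the dict itself
# would then be passed as `source`, e.g. Dataset(json_path).
json_path = os.path.join(tempfile.mkdtemp(), "dataset.json")
with open(json_path, "w") as fobj:
    json.dump(source, fobj, indent=2)

# Reading the file back gives the same dictionary.
with open(json_path) as fobj:
    reloaded = json.load(fobj)
```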
- Variables
ids (1D numpy.ndarray) – Identifiers
masker (nilearn.input_data.NiftiMasker or similar) – Masker object defining the space and location of the area of interest (e.g., ‘brain’).
space (str) – Standard space. Same as the target parameter.
annotations (pandas.DataFrame) – Labels describing studies
coordinates (pandas.DataFrame) – Peak coordinates from studies
images (pandas.DataFrame) – Images from studies
metadata (pandas.DataFrame) – Metadata describing studies
texts (pandas.DataFrame) – Texts associated with studies
Notes
Images loaded into a Dataset are assumed to be in the same space. If images have different resolutions or affines from the Dataset’s masker, then they will be resampled automatically, at the point where they’re used, by Dataset.masker.
- property annotations
Labels describing studies in the dataset.
Each study/experiment has its own row. Columns correspond to individual labels (e.g., ‘emotion’), and may be prefixed with a feature group including two underscores (e.g., ‘Neurosynth_TFIDF__emotion’).
- Type
pandas.DataFrame
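The double-underscore prefix convention for annotation columns can be unpacked with ordinary string handling. The helper below is a hypothetical illustration (split_label is not part of NiMARE); it assumes the feature group is everything before the last double underscore.

```python
def split_label(column):
    """Split an annotation column name into (feature_group, label)."""
    # rpartition splits on the last double underscore, if there is one.
    group, sep, label = column.rpartition("__")
    return (group if sep else None), label


prefixed = split_label("Neurosynth_TFIDF__emotion")
bare = split_label("emotion")
```

Columns without a feature group come back with None as the group, so both naming styles can be handled by the same code path.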
- property coordinates
Coordinates in the dataset.
Changed in version 0.0.10: The coordinates attribute no longer includes the associated matrix indices (columns ‘i’, ‘j’, and ‘k’). These columns are calculated as needed.
Each study has one row for each peak. Columns include [‘x’, ‘y’, ‘z’] (peak locations in mm) and ‘space’ (Dataset’s space).
- Type
pandas.DataFrame
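Since the ‘i’, ‘j’, and ‘k’ columns are now calculated as needed, it may help to see what such a calculation looks like. This is a minimal sketch, assuming the voxel-to-mm affine commonly shipped with 2 mm MNI152 templates; the affine values and the function names are assumptions, and NiMARE's own implementation may differ.

```python
# Assumed voxel-to-mm mapping of a 2 mm MNI152 template:
# x = -2*i + 90, y = 2*j - 126, z = 2*k - 72.
SCALE = (-2.0, 2.0, 2.0)
ORIGIN = (90.0, -126.0, -72.0)


def mm_to_ijk(xyz):
    """Convert an (x, y, z) position in mm to integer matrix indices."""
    return tuple(int(round((c - o) / s)) for c, o, s in zip(xyz, ORIGIN, SCALE))


def ijk_to_mm(ijk):
    """Convert matrix indices back to an (x, y, z) position in mm."""
    return tuple(s * v + o for v, o, s in zip(ijk, ORIGIN, SCALE))


origin_voxel = mm_to_ijk((0.0, 0.0, 0.0))
```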
- get(dict_, drop_invalid=True)
Retrieve files and/or metadata from the current Dataset.
- Parameters
dict_ (dict) – Dictionary specifying images or metadata to collect. Keys are the names to use as keys in the results dictionary. Values are two-element tuples: the data type (e.g., ‘image’ or ‘metadata’) and the specific field, i.e., the column of the type-specific DataFrame (e.g., ‘z’ or ‘sample_sizes’).
drop_invalid (bool, optional) – Whether to automatically ignore any studies without the required data or not. Default is True.
- Returns
results (dict) – A dictionary of lists of requested data. Keys correspond to the keys in dict_.
Examples
>>> dset.get({'z_maps': ('image', 'z'), 'sample_sizes': ('metadata', 'sample_sizes')})
>>> dset.get({'coordinates': ('coordinates', None)})
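To make the shape of dict_ and the effect of drop_invalid concrete, here is a toy stand-in written over plain dictionaries. It is not NiMARE's implementation: toy_get, the tables layout, the study IDs, and the file name are all hypothetical, and the real method works on the Dataset's DataFrames.

```python
def toy_get(tables, ids, dict_, drop_invalid=True):
    """Collect one list per requested key, optionally skipping incomplete studies."""
    keep = list(ids)
    if drop_invalid:
        # Keep only studies that have a value for every requested field.
        keep = [
            id_
            for id_ in ids
            if all(
                tables[type_][field].get(id_) is not None
                for type_, field in dict_.values()
            )
        ]
    return {
        key: [tables[type_][field].get(id_) for id_ in keep]
        for key, (type_, field) in dict_.items()
    }


# Hypothetical data: study "s2-1" has no z map.
tables = {
    "image": {"z": {"s1-1": "s1_z.nii.gz", "s2-1": None}},
    "metadata": {"sample_sizes": {"s1-1": [20], "s2-1": [31]}},
}
spec = {"z_maps": ("image", "z"), "sample_sizes": ("metadata", "sample_sizes")}
results = toy_get(tables, ["s1-1", "s2-1"], spec)
results_all = toy_get(tables, ["s1-1", "s2-1"], spec, drop_invalid=False)
```

In this toy version, dropping invalid studies keeps the returned lists aligned, so the n-th z map and the n-th sample size always belong to the same study.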
- get_images(ids=None, imtype=None)
Get images of a certain type for a subset of studies in the dataset.
- Parameters
- Returns
images (list) – List of images of requested type for selected IDs.
- get_metadata(ids=None, field=None)
Get metadata from Dataset.
- Parameters
- Returns
metadata (list) – List of values of requested type for selected IDs.
- get_studies_by_coordinate(xyz, r=20)
Extract list of studies with at least one focus within radius of requested coordinates.
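The radius search can be pictured as a Euclidean distance test against every focus of every study. The sketch below is a plain-Python illustration under stated assumptions: distances are in mm, a focus exactly at distance r counts as a match, and studies_within_radius and the foci dictionary are hypothetical names, not NiMARE's API.

```python
import math


def studies_within_radius(foci, xyz, r=20):
    """Return IDs of studies with at least one focus within r of xyz."""
    return sorted(
        id_
        for id_, peaks in foci.items()
        if any(math.dist(peak, xyz) <= r for peak in peaks)
    )


# Hypothetical peak coordinates (mm) keyed by study ID.
foci = {
    "s1-1": [(-40.0, 20.0, 4.0), (10.0, -60.0, 50.0)],
    "s2-1": [(30.0, 30.0, 30.0)],
}
found = studies_within_radius(foci, (-42.0, 22.0, 0.0), r=10)
```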
- get_studies_by_label(labels=None, label_threshold=0.001)
Extract list of studies with a given label.
Changed in version 0.0.10: Fix bug in which all IDs were returned when a label wasn’t present in the Dataset.
Changed in version 0.0.9: Default value for label_threshold changed to 0.001.
- Parameters
- Returns
found_ids (list) – A list of IDs from the Dataset found by the search criteria.
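A label search of this kind amounts to thresholding annotation values. The following toy version works on a plain dictionary and makes two assumptions that this page does not confirm: a study matches only if every requested label exceeds label_threshold, and an unknown label yields an empty list (the page only states that returning all IDs in that case was a bug). The function name and data are hypothetical.

```python
def studies_by_label(annotations, labels, label_threshold=0.001):
    """Return IDs whose value exceeds the threshold for every requested label."""
    known = {lab for row in annotations.values() for lab in row}
    if any(lab not in known for lab in labels):
        # Sketch's choice: an unknown label matches nothing.
        return []
    return [
        id_
        for id_, row in annotations.items()
        if all(row.get(lab, 0.0) > label_threshold for lab in labels)
    ]


# Hypothetical annotation weights keyed by study ID.
annotations = {
    "s1-1": {"Neurosynth_TFIDF__emotion": 0.12, "Neurosynth_TFIDF__pain": 0.0},
    "s2-1": {"Neurosynth_TFIDF__emotion": 0.0, "Neurosynth_TFIDF__pain": 0.3},
}
emotion_ids = studies_by_label(annotations, ["Neurosynth_TFIDF__emotion"])
```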
- get_studies_by_mask(mask)
Extract list of studies with at least one coordinate in mask.
- Parameters
mask (img_like) – Mask across which to search for coordinates.
- Returns
found_ids (list) – A list of IDs from the Dataset with at least one focus in the mask.
- get_texts(ids=None, text_type=None)
Extract list of texts of a given type for selected IDs.
- Parameters
- Returns
texts (list) – List of texts of requested type for selected IDs.
- property ids
1D array of identifiers in Dataset.
The associated setter for this property is private, as Dataset.ids is immutable.
- Type
1D numpy.ndarray
- property images
Images in the dataset.
Each image type has its own column (e.g., ‘z’) with absolute paths to files and each study has its own row. Additionally, relative paths to image files are stored in columns with the suffix ‘__relative’ (e.g., ‘z__relative’).
Warning
Images are assumed to be in the same space, although they may have different resolutions and affines. Images will be resampled as needed at the point where they are used, via Dataset.masker.
- Type
pandas.DataFrame
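The pairing of absolute and ‘__relative’ columns suggests how absolute paths can be rebuilt when the files move. The sketch below illustrates that idea for a single row held as a plain dictionary; absolute_columns, the row contents, and the "data" directory are hypothetical, and this is not how NiMARE itself stores or updates paths.

```python
import os.path


def absolute_columns(row, basepath):
    """Rebuild absolute image paths from the '__relative' entries of one row."""
    out = dict(row)
    for column, rel_path in row.items():
        if column.endswith("__relative") and rel_path is not None:
            # 'z__relative' -> 'z'
            imtype = column[: -len("__relative")]
            out[imtype] = os.path.join(os.path.abspath(basepath), rel_path)
    return out


row = {"id": "s1-1", "z__relative": "s1/z.nii.gz", "z": None}
updated = absolute_columns(row, "data")
```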
- property masker
Masker object.
Defines the space and location of the area of interest (e.g., ‘brain’).
- Type
nilearn.input_data.NiftiMasker or similar
- merge(right)
Merge two Datasets.
New in version 0.0.9.
- Parameters
right (nimare.dataset.Dataset) – Dataset to merge with.
- Returns
nimare.dataset.Dataset – The Dataset produced by merging the two Datasets.
- property metadata
Metadata describing studies in the dataset.
Each metadata field has its own column (e.g., ‘sample_sizes’) and each study has its own row.
- Type
pandas.DataFrame
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Returns
self
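The <component>__<parameter> form is easiest to see on a small nested object. The classes below are hypothetical stand-ins (ToyParams, Kernel, and Estimator are not NiMARE classes) showing one way such routing can be written; the actual method may validate parameter names or behave differently in other respects.

```python
class ToyParams:
    """Minimal object with nested parameter setting."""

    def set_params(self, **params):
        for key, value in params.items():
            # "<component>__<parameter>" is routed to the named component.
            name, sep, sub_key = key.partition("__")
            if sep:
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self


class Kernel(ToyParams):
    def __init__(self, r=10):
        self.r = r


class Estimator(ToyParams):
    def __init__(self, kernel, n_iters=100):
        self.kernel = kernel
        self.n_iters = n_iters


# Returning self allows the call to be chained onto construction.
est = Estimator(Kernel()).set_params(n_iters=500, kernel__r=6)
```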
- slice(ids)
Create a new dataset with only requested IDs.
- Parameters
ids (array_like) – List of study IDs to include in new dataset
- Returns
new_dset (nimare.dataset.Dataset) – Reduced Dataset containing only requested studies.
- property texts
Texts in the dataset.
Each text type has its own column (e.g., ‘abstract’) and each study has its own row.
- Type
pandas.DataFrame