nimare.annotate.cogat.extract_cogat

extract_cogat(text_df, id_df=None, text_column='abstract')

Extract Cognitive Atlas terms and count instances using regular expressions.

Parameters:

text_df ((D x 2) pandas.DataFrame) – Pandas DataFrame with at least two columns: ‘id’ and the text column. D is the number of documents.
id_df ((T x 3) pandas.DataFrame) –
Cognitive Atlas ontology DataFrame with one row for each term and at least three columns (T is the number of terms):
- "id": A unique identifier for each term.
- "alias": A natural language expression for each term.
- "name": The preferred name of each term. Currently unused.
text_column (str, optional) – Name of column in text_df that contains text. Default is ‘abstract’.

Returns:

counts_df ((D x T) pandas.DataFrame) – Term counts for documents in the corpus. One row for each document and one column for each term.
rep_text_df ((D x 2) pandas.DataFrame) – An updated version of the text_df DataFrame with terms in the text column replaced with their CogAt IDs.
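
The count-and-replace behaviour described above can be illustrated with a minimal standard-library sketch. This is not NiMARE's implementation: the helper name count_and_replace, the sample documents, and the term IDs are all hypothetical, plain lists of dicts stand in for the DataFrames, and the lowercasing, word-boundary matching, and longest-alias-first ordering are assumptions made for the illustration.

```python
import re

# Hypothetical stand-ins for text_df and id_df (not real Cognitive Atlas data).
documents = [
    {"id": "doc1", "abstract": "Working memory and attention interact; working memory load matters."},
    {"id": "doc2", "abstract": "We studied attention in a visual search task."},
]
terms = [
    {"id": "trm_wm", "alias": "working memory", "name": "working memory"},
    {"id": "trm_att", "alias": "attention", "name": "attention"},
]


def count_and_replace(documents, terms, text_column="abstract"):
    """Count alias matches per document and replace each match with its term ID."""
    counts = {}    # document ID -> {term ID -> count}, analogous to counts_df
    replaced = []  # rows with rewritten text, analogous to rep_text_df
    # Assumption: match longer aliases first so that multi-word expressions
    # are not broken up by shorter aliases they happen to contain.
    ordered = sorted(terms, key=lambda t: len(t["alias"]), reverse=True)
    for doc in documents:
        text = doc[text_column].lower()
        doc_counts = {term["id"]: 0 for term in terms}
        for term in ordered:
            pattern = r"\b" + re.escape(term["alias"].lower()) + r"\b"
            # re.subn returns the rewritten text and the number of substitutions.
            text, n_matches = re.subn(pattern, term["id"], text)
            doc_counts[term["id"]] += n_matches
        counts[doc["id"]] = doc_counts
        replaced.append({"id": doc["id"], text_column: text})
    return counts, replaced


counts, replaced = count_and_replace(documents, terms)
print(counts)
print(replaced[0]["abstract"])
```

In this sketch each document contributes one row of counts and one rewritten text entry, mirroring the (D x T) and (D x 2) shapes of the two return values.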

Notes

The Cognitive Atlas [1] is an ontology for describing cognitive neuroscience concepts and tasks.

References
