nimare.annotate.cogat.extract_cogat

extract_cogat(text_df, id_df=None, text_column='abstract')

Extract Cognitive Atlas terms and count instances using regular expressions.

Parameters:
  • text_df ((D x 2) pandas.DataFrame) – Pandas DataFrame with at least two columns: ‘id’ and the text column named by text_column. D is the number of documents.

  • id_df ((T x 3) pandas.DataFrame) –

    Cognitive Atlas ontology DataFrame with one row for each term (T is the number of terms) and at least three columns:

    • "id": A unique identifier for each term.

    • "alias": A natural language expression for each term.

    • "name": The preferred name of each term. Currently unused.

  • text_column (str, optional) – Name of column in text_df that contains text. Default is ‘abstract’.

Returns:

  • counts_df ((D x T) pandas.DataFrame) – Term counts for documents in the corpus. One row for each document and one column for each term.

  • rep_text_df ((D x 2) pandas.DataFrame) – An updated version of the text_df DataFrame with terms in the text column replaced with their CogAt IDs.
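
To make the two return values concrete, here is a minimal standard-library sketch of the general idea: counting alias matches with regular expressions and replacing them with term IDs. It is not NiMARE's implementation. The toy ontology, the IDs (term_001, term_002), the helper count_and_replace, and the matching rules (case-insensitive, whole-word, longest alias first) are all assumptions made for illustration, and plain dicts stand in for the DataFrames.

```python
import re
from collections import Counter

# Hypothetical ontology rows: (id, alias), standing in for the "id" and
# "alias" columns of id_df. These IDs are made up for this sketch.
ontology = [
    ("term_001", "working memory"),
    ("term_002", "attention"),
]

# Hypothetical corpus: document id -> text, standing in for text_df.
documents = {
    "doc1": "Working memory and attention interact; working memory load matters.",
    "doc2": "We studied visual attention.",
}


def count_and_replace(text, ontology):
    """Count alias matches in text and replace each match with its term ID."""
    counts = Counter()
    # Assumption: handle longer aliases first so that a multi-word alias is
    # matched before any shorter alias it might contain.
    ordered = sorted(ontology, key=lambda row: len(row[1]), reverse=True)
    for term_id, alias in ordered:
        # Assumption: whole-word, case-insensitive matching.
        pattern = re.compile(r"\b" + re.escape(alias) + r"\b", re.IGNORECASE)
        text, n_matches = pattern.subn(term_id, text)
        if n_matches:
            counts[term_id] += n_matches
    return counts, text


counts = {}    # plays the role of counts_df: per-document term counts
replaced = {}  # plays the role of rep_text_df: text with aliases replaced by IDs
for doc_id, text in documents.items():
    counts[doc_id], replaced[doc_id] = count_and_replace(text, ontology)
```

In this sketch, counts maps each document to a count per term and replaced holds the rewritten text, which mirrors the shapes described for counts_df (D x T) and rep_text_df (D x 2).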

Notes

The Cognitive Atlas [1] is an ontology for describing cognitive neuroscience concepts and tasks.

References

See also

nimare.extract.download_cognitive_atlas

This function will be called automatically if id_df is not provided.