nimare.annotate.text.generate_cooccurrence

generate_cooccurrence(text_df, text_column='abstract', vocabulary=None, window=5)

Build a co-occurrence matrix from documents. This is not the same approach as the one used by the GloVe model.

Parameters
  • text_df ((D x 2) pandas.DataFrame) – A DataFrame with one row per document (D = number of documents) and two columns: ‘id’ and the text column named by text_column.

  • text_column (str, optional) – Name of the column in text_df that contains the text. Default is ‘abstract’.

  • vocabulary (list, optional) – List of words in vocabulary to extract from text.

  • window (int, optional) – Window size for co-occurrence. Two words co-occur if they appear within window words of one another. Default is 5.

Returns

df (multi-indexed pandas.DataFrame) – A DataFrame with a three-level index (id, first_term, and second_term) and one column (cooccurrence_count).
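
To make the windowing idea concrete, the following is a minimal sketch of window-based co-occurrence counting in plain Python. It is not NiMARE's implementation: the helper name count_cooccurrences is hypothetical, and the sketch assumes lowercase whitespace tokenization, ordered pairs (first_term precedes second_term in the text), and window positions measured over all tokens before the vocabulary filter is applied. The actual function may differ on any of these points.

```python
from collections import Counter


def count_cooccurrences(docs, vocabulary=None, window=5):
    """Count ordered word pairs that fall within `window` tokens of each other.

    Hypothetical helper for illustration only, not part of NiMARE.
    `docs` maps a document id to its text. The returned Counter is keyed by
    (id, first_term, second_term), mirroring the three index levels above.
    """
    vocab = set(vocabulary) if vocabulary is not None else None
    counts = Counter()
    for doc_id, text in docs.items():
        # Assumption: simple lowercase whitespace tokenization.
        tokens = text.lower().split()
        for i, first in enumerate(tokens):
            if vocab is not None and first not in vocab:
                continue
            # Look ahead at most `window` tokens after position i.
            for second in tokens[i + 1 : i + 1 + window]:
                if vocab is not None and second not in vocab:
                    continue
                counts[(doc_id, first, second)] += 1
    return counts


docs = {"doc1": "pain activates insula and pain activates cingulate"}
counts = count_cooccurrences(
    docs, vocabulary=["pain", "insula", "cingulate"], window=3
)
for key, value in sorted(counts.items()):
    print(key, value)
```

With window=3, "pain" and "insula" are counted as co-occurring because they are two tokens apart, while "insula" and "cingulate" (four tokens apart) are not. A dictionary keyed this way could be loaded into a multi-indexed pandas.DataFrame if the tabular form described under Returns is needed.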