LDA topic modeling

This example trains a latent Dirichlet allocation (LDA) model with scikit-learn on abstracts from Neurosynth.

import os

import pandas as pd

from nimare import annotate
from nimare.nimads import Studyset
from nimare.utils import get_resource_path

Load Studyset with abstracts

studyset = Studyset(
    os.path.join(get_resource_path(), "neurosynth_laird_studyset.json"),
    target="mni152_2mm",
)

Initialize LDA model

model = annotate.lda.LDAModel(n_topics=5, max_iter=1000, text_column="abstract")

Run model

new_studyset = model.fit(studyset)

View results

The annotations DataFrame is very large, so only its first ten rows and ten columns are shown.

   id          study_id  contrast_id  Neurosynth_TFIDF__001  Neurosynth_TFIDF__01  Neurosynth_TFIDF__05  Neurosynth_TFIDF__10  Neurosynth_TFIDF__100  Neurosynth_TFIDF__11  Neurosynth_TFIDF__12
0  17029760-1  17029760  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
1  18760263-1  18760263  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
2  19162389-1  19162389  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.176321              0.0
3  19603407-1  19603407  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
4  20197097-1  20197097  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
5  22569543-1  22569543  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
6  22659444-1  22659444  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
7  23042731-1  23042731  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
8  23702412-1  23702412  1            0.0                    0.0                   0.0                   0.061006              0.0                    0.000000              0.0
9  24681401-1  24681401  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0


The topic-by-term weights DataFrame is very wide (one column per term), so we transpose it and display only the first ten rows.

model.distributions_["p_topic_g_word_df"].T.head(10)
                     LDA5__1_functional_connectivity_networks  LDA5__2_functional_cbp_literature  LDA5__3_connectivity_functional_motor  LDA5__4_cortex_lateral_prefrontal  LDA5__5_connectivity_functional_human
10                   0.001000                                  1.001304                           0.001000                               0.001000                           1.000512
abstract             0.001000                                  0.001000                           1.001121                               0.001000                           1.000727
action               0.001000                                  0.001000                           0.001000                               0.001000                           2.000910
active               1.231645                                  0.001000                           0.001000                               2.770283                           0.001000
addition             1.000832                                  2.000774                           0.001000                               2.001274                           0.001000
additionally         2.000895                                  0.001000                           0.001000                               0.001000                           0.001000
affective            0.001000                                  1.001082                           1.196614                               0.001000                           3.805148
affective processes  0.001000                                  0.001000                           0.001000                               0.001000                           2.000874
ale                  0.001000                                  1.000998                           1.000925                               0.001000                           0.001000
altered              1.000440                                  0.001000                           3.001489                               0.001000                           0.001000
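
Several weights in the table above exceed 1, so they are not probabilities as displayed. If per-topic probabilities over tokens are wanted, one option is to divide each topic's column by its sum. The sketch below does this for three tokens and two topics copied from the table; the short column names and the restriction to three tokens are assumptions made to keep the example small, and a real normalization would run over the full vocabulary.

```python
import pandas as pd

# Token-by-topic weights for three tokens, copied from the table above.
weights = pd.DataFrame(
    {
        "topic_1": [0.001000, 1.231645, 1.000832],
        "topic_5": [2.000910, 0.001000, 0.001000],
    },
    index=["action", "active", "addition"],
)

# Divide every column by its own sum so each topic's weights sum to one.
probs = weights.div(weights.sum(axis=0), axis=1)
print(probs.round(3))
```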


       LDA5__1_functional_connectivity_networks  LDA5__2_functional_cbp_literature  LDA5__3_connectivity_functional_motor  LDA5__4_cortex_lateral_prefrontal  LDA5__5_connectivity_functional_human
Token
0      functional                                functional                         connectivity                           cortex                             connectivity
1      connectivity                              cbp                                functional                             lateral                            functional
2      networks                                  literature                         motor                                  prefrontal                         human
3      functional networks                       clusters                           functional connectivity                identified                         social
4      anterior                                  parcellation                       cognitive                              prefrontal cortex                  posterior
5      insula                                    higher                             functions                              stimulation                        functional connectivity
6      memory                                    frontal                            cortex                                 medial                             macm
7      language                                  talairach                          method                                 published                          seed
8      approaches                                frontal pole                       function                               cognition                          task
9      involved                                  pole                               modeling                               systems                            cognitive
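
A table like the one above lists, for each topic, its tokens ranked from highest to lowest weight. The following is a minimal sketch of one way to build such a ranking with pandas. The weight values, topic labels, and variable names are made up for illustration and are not the code that produced the table.

```python
import pandas as pd

# Hypothetical topic-by-token weights; the numbers are invented for this example.
weights = pd.DataFrame(
    {
        "functional": [5.0, 0.1, 2.0],
        "connectivity": [4.0, 0.2, 3.0],
        "motor": [0.1, 0.3, 6.0],
        "memory": [3.0, 4.0, 0.1],
    },
    index=["topic_1", "topic_2", "topic_3"],
)

n_top_terms = 2

# For each topic (row), sort tokens by descending weight and keep the first few.
top_terms = pd.DataFrame(
    {
        topic: row.sort_values(ascending=False).index[:n_top_terms].tolist()
        for topic, row in weights.iterrows()
    }
)
top_terms.index.name = "Token"
print(top_terms)
```

Each column of top_terms then holds one topic's highest-weighted tokens, with the row index giving the rank.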

