LDA topic modeling

This example trains a latent Dirichlet allocation (LDA) model with scikit-learn using abstracts from Neurosynth.

import os

import pandas as pd

from nimare import annotate
from nimare.nimads import Studyset
from nimare.utils import get_resource_path

Load Studyset with abstracts

studyset = Studyset(
    os.path.join(get_resource_path(), "neurosynth_laird_studyset.json"),
    target="mni152_2mm",
)

Initialize LDA model

model = annotate.lda.LDAModel(n_topics=5, max_iter=1000, text_column="abstract")

Run model

new_studyset = model.fit(studyset)

View results

This DataFrame is very large, so we will only show a slice of it.

   id          study_id  contrast_id  Neurosynth_TFIDF__001  Neurosynth_TFIDF__01  Neurosynth_TFIDF__05  Neurosynth_TFIDF__10  Neurosynth_TFIDF__100  Neurosynth_TFIDF__11  Neurosynth_TFIDF__12
0  17029760-1  17029760  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
1  18760263-1  18760263  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
2  19162389-1  19162389  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.176321              0.0
3  19603407-1  19603407  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
4  20197097-1  20197097  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
5  22569543-1  22569543  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
6  22659444-1  22659444  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
7  23042731-1  23042731  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
8  23702412-1  23702412  1            0.0                    0.0                   0.0                   0.061006              0.0                    0.000000              0.0
9  24681401-1  24681401  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0


Because this DataFrame is very wide (one column per term), we transpose it and show only the first 10 rows.

model.distributions_["p_topic_g_word_df"].T.head(10)
                     LDA5__1_connectivity_functional_functional connectivity  LDA5__2_motor_literature_cortex  LDA5__3_social_cortex_identified  LDA5__4_connectivity_human_maps  LDA5__5_connectivity_functional_networks
10                   0.001000                                                 0.001000                         0.001000                          2.000940                         0.001000
abstract             1.000741                                                 0.001000                         1.001137                          0.001000                         0.001000
action               0.001000                                                 0.001000                         0.001000                          2.000961                         0.001000
active               0.001000                                                 0.001000                         1.000919                          2.001159                         1.000795
addition             0.999690                                                 1.101626                         1.901855                          0.001000                         1.000638
additionally         0.001000                                                 0.001000                         0.001000                          0.001000                         2.000907
affective            3.467438                                                 0.001000                         1.231868                          1.303545                         0.001000
affective processes  2.000898                                                 0.001000                         0.001000                          0.001000                         0.001000
ale                  1.000467                                                 1.001431                         0.001000                          0.001000                         0.001000
altered              3.001357                                                 0.001000                         0.001000                          0.001000                         1.000521
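
The table of top terms per topic that follows can be derived from a term-by-topic weight table like the one above by sorting each topic column in descending order and keeping the leading index labels. The sketch below does this with pandas on a small made-up table; the `weights` values, the topic names, and `n_top_terms` are hypothetical and are not taken from the fitted model.

```python
import pandas as pd

# Hypothetical term-by-topic weights (rows: terms, columns: topics).
weights = pd.DataFrame(
    {
        "topic_a": [0.1, 2.0, 1.0, 3.0],
        "topic_b": [2.5, 0.1, 1.5, 0.2],
    },
    index=["motor", "social", "cortex", "connectivity"],
)

n_top_terms = 3  # number of terms to keep per topic

# For each topic, sort terms by weight and keep the top-ranked labels.
top_terms = pd.DataFrame(
    {
        col: weights[col].sort_values(ascending=False).index[:n_top_terms].tolist()
        for col in weights.columns
    }
)
top_terms.index.name = "Token"
print(top_terms)
```

Row 0 of the result holds the highest-weighted term for each topic, row 1 the second highest, and so on.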


       LDA5__1_connectivity_functional_functional connectivity  LDA5__2_motor_literature_cortex  LDA5__3_social_cortex_identified  LDA5__4_connectivity_human_maps  LDA5__5_connectivity_functional_networks
Token
0      connectivity                                             motor                            social                            connectivity                     connectivity
1      functional                                               literature                       cortex                            human                            functional
2      functional connectivity                                  cortex                           identified                        maps                             networks
3      cognitive                                                talairach                        lateral                           frontal                          approaches
4      posterior                                                functional                       systems                           cortex                           functional networks
5      functions                                                task                             network                           macm                             anterior
6      anterior                                                 published                        prefrontal                        cognition                        insula
7      human                                                    likelihood                       stimulation                       methods                          structural
8      analytic                                                 estimation                       published                         lobe                             memory
9      using                                                    likelihood estimation            task                              pole                             human

