LDA topic modeling

Trains a latent Dirichlet allocation model with scikit-learn using abstracts from Neurosynth.

import os

import pandas as pd

from nimare import annotate
from nimare.dataset import Dataset
from nimare.utils import get_resource_path

Load dataset with abstracts

dset = Dataset(os.path.join(get_resource_path(), "neurosynth_laird_studies.json"))

Initialize LDA model

model = annotate.lda.LDAModel(n_topics=5, max_iter=1000, text_column="abstract")

Run model

new_dset = model.fit(dset)

View results

The annotations DataFrame is very large, so we will only show a slice of it: the first ten rows and the first ten columns.

   id          study_id  contrast_id  Neurosynth_TFIDF__001  Neurosynth_TFIDF__01  Neurosynth_TFIDF__05  Neurosynth_TFIDF__10  Neurosynth_TFIDF__100  Neurosynth_TFIDF__11  Neurosynth_TFIDF__12
0  17029760-1  17029760  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
1  18760263-1  18760263  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
2  19162389-1  19162389  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.176321              0.0
3  19603407-1  19603407  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
4  20197097-1  20197097  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
5  22569543-1  22569543  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
6  22659444-1  22659444  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
7  23042731-1  23042731  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
8  23702412-1  23702412  1            0.0                    0.0                   0.0                   0.061006              0.0                    0.000000              0.0
9  24681401-1  24681401  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
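The slice above shows ten rows and ten columns, but the code that produced it is not included. A slice of this kind can be taken with plain pandas indexing. The sketch below uses a small stand-in DataFrame (its column names and values are made up) rather than the fitted dataset.

```python
import pandas as pd

# Stand-in for a wide table: one id column plus many term columns.
# Names and values are hypothetical, chosen only for this sketch.
n_rows, n_terms = 15, 30
data = {"id": [f"study{i}-1" for i in range(n_rows)]}
for j in range(n_terms):
    data[f"term_{j:02d}"] = [0.0] * n_rows
wide_df = pd.DataFrame(data)

# Keep the first ten columns, then the first ten rows.
preview = wide_df[wide_df.columns[:10]].head(10)
print(preview.shape)
```

Applied to the fitted dataset, the same selection would presumably be run on its annotations table instead of wide_df.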


The p_topic_g_word_df DataFrame in model.distributions_ is very wide (one column per term), so we transpose it before displaying the first ten rows.

model.distributions_["p_topic_g_word_df"].T.head(10)
                     LDA5__1_human_functional_cognitive  LDA5__2_identified_stimulation_brainmap  LDA5__3_connectivity_functional_posterior  LDA5__4_connectivity_macm_method  LDA5__5_connectivity_functional_networks
10                   2.001000                            0.001000                                 0.001000                                   0.001000                          0.001000
abstract             1.001236                            0.001000                                 1.000764                                   0.001000                          0.001000
action               0.001000                            0.001000                                 0.001000                                   2.001000                          0.001000
active               0.001000                            1.001431                                 0.001000                                   2.001002                          1.000567
addition             1.001061                            1.001843                                 1.000260                                   0.001000                          2.000837
additionally         0.001000                            0.001000                                 1.001098                                   0.001000                          1.000902
affective            1.925169                            0.001000                                 3.001129                                   0.001000                          1.076702
affective processes  0.001000                            0.001000                                 2.001000                                   0.001000                          0.001000
ale                  0.001000                            0.001000                                 0.001000                                   0.001000                          2.001000
altered              0.001000                            0.001000                                 0.001000                                   0.001000                          4.001000


       LDA5__1_human_functional_cognitive  LDA5__2_identified_stimulation_brainmap  LDA5__3_connectivity_functional_posterior  LDA5__4_connectivity_macm_method  LDA5__5_connectivity_functional_networks
Token
0      human                               identified                               connectivity                               connectivity                      connectivity
1      functional                          stimulation                              functional                                 macm                              functional
2      cognitive                           brainmap                                 posterior                                  method                            networks
3      cbp                                 published                                task                                       functional                        approaches
4      connectivity                        using                                    anterior                                   methods                           structural
5      functions                           prefrontal                               cortex                                     connections                       functional networks
6      frontal                             resting                                  motor                                      behavioral                        insula
7      cortex                              use                                      social                                     prefrontal cortex                 anterior
8      maps                                processes                                functional connectivity                    connectivity patterns             correlations
9      parcellation                        brodmann                                 cognitive                                  mapping                           macm
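The table above lists ten tokens per topic, and the topic names suggest they are ranked by weight. The code that built it is not shown. One way to derive such a table from a token-by-topic weight DataFrame is sketched below. The toy weights, topic names, and the top_terms variable are assumptions for illustration, not output from the fitted model.

```python
import pandas as pd

# Toy token-by-topic weights (rows: tokens, columns: topics); values are made up.
weights = pd.DataFrame(
    {
        "topic_a": [3.0, 0.1, 2.0, 0.5],
        "topic_b": [0.2, 4.0, 0.3, 1.5],
    },
    index=["memory", "motor", "recall", "hand"],
)

n_top_terms = 3

# For each topic, sort tokens by descending weight and keep the top few.
top_terms = pd.DataFrame(
    {
        topic: weights[topic].sort_values(ascending=False).index[:n_top_terms].tolist()
        for topic in weights.columns
    }
)
top_terms.index.name = "Token"
print(top_terms)
```

With the fitted model, the transposed p_topic_g_word_df DataFrame would take the place of weights, and n_top_terms would be 10.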


