LDA topic modeling

This example trains a latent Dirichlet allocation (LDA) topic model with scikit-learn, using abstracts from Neurosynth.

import os

import pandas as pd

from nimare import annotate
from nimare.dataset import Dataset
from nimare.utils import get_resource_path

Load dataset with abstracts

dset = Dataset(os.path.join(get_resource_path(), "neurosynth_laird_studies.json"))

Initialize LDA model

model = annotate.lda.LDAModel(n_topics=5, max_iter=1000, text_column="abstract")

Run model

new_dset = model.fit(dset)
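
For a rough sense of what a fit like this involves, the following is a minimal sketch in plain scikit-learn. It is not NiMARE's own implementation. It assumes a bag-of-words count matrix over unigrams and bigrams (the two-word terms in the output further down suggest something similar) and uses four made-up toy documents in place of real abstracts; `toy_abstracts` and all parameter values are illustrative assumptions.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy stand-ins for abstracts (hypothetical text, not Neurosynth data).
toy_abstracts = [
    "functional connectivity of the prefrontal cortex",
    "motor cortex activation during action",
    "resting state functional connectivity networks",
    "motor control and action selection in cortex",
]

# Count unigrams and bigrams in each document, dropping English stop words.
vectorizer = CountVectorizer(ngram_range=(1, 2), stop_words="english")
counts = vectorizer.fit_transform(toy_abstracts)

# Fit a two-topic model; random_state makes the run repeatable.
lda = LatentDirichletAllocation(n_components=2, max_iter=50, random_state=0)
doc_topic = lda.fit_transform(counts)  # one row per document, one column per topic

print(doc_topic.shape)        # (n_documents, n_topics)
print(lda.components_.shape)  # (n_topics, n_terms)
```

Each row of `doc_topic` is a distribution over topics for one document, while `lda.components_` holds the unnormalized topic-by-term weights.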

View results

The resulting annotations DataFrame is very large, so we show only a slice of it: the first ten rows and ten columns.

id study_id contrast_id Neurosynth_TFIDF__001 Neurosynth_TFIDF__01 Neurosynth_TFIDF__05 Neurosynth_TFIDF__10 Neurosynth_TFIDF__100 Neurosynth_TFIDF__11 Neurosynth_TFIDF__12
0 17029760-1 17029760 1 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0
1 18760263-1 18760263 1 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0
2 19162389-1 19162389 1 0.0 0.0 0.0 0.000000 0.0 0.176321 0.0
3 19603407-1 19603407 1 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0
4 20197097-1 20197097 1 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0
5 22569543-1 22569543 1 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0
6 22659444-1 22659444 1 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0
7 23042731-1 23042731 1 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0
8 23702412-1 23702412 1 0.0 0.0 0.0 0.061006 0.0 0.000000 0.0
9 24681401-1 24681401 1 0.0 0.0 0.0 0.000000 0.0 0.000000 0.0
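
A slice like the one above can be produced by selecting the first few columns and then the first few rows. The sketch below shows the pattern on a small made-up DataFrame. Here `annotations` is a hypothetical stand-in, and applying the same indexing to the annotations of the fitted dataset (presumably `new_dset.annotations`) is an assumption rather than something shown in this script.

```python
import pandas as pd

# Hypothetical stand-in for a wide annotations table (25 term columns).
annotations = pd.DataFrame(
    {f"term_{i:02d}": [0.0, 0.1 * i, 0.0] for i in range(25)}
)
annotations.insert(0, "id", ["a-1", "b-1", "c-1"])

# Keep the first 10 columns, then the first 2 rows.
n_cols, n_rows = 10, 2
preview = annotations[annotations.columns[:n_cols]].head(n_rows)
print(preview.shape)
```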


The DataFrame of topic-by-term weights is very wide (one column per term), so we transpose it before displaying it.

model.distributions_["p_topic_g_word_df"].T.head(10)
LDA5__1_network_cortex_control LDA5__2_human_cortex_motor LDA5__3_prefrontal_identified_stimulation LDA5__4_connectivity_functional_functional connectivity LDA5__5_connectivity_functional_social
10 0.001000 1.001202 0.001000 1.000798 0.001000
abstract 0.001000 0.001000 0.001000 0.001000 2.001000
action 0.001000 1.001199 0.001000 1.000801 0.001000
active 0.001000 0.001000 3.002109 0.001000 0.999891
addition 2.001659 0.001000 1.001618 0.001000 1.999723
additionally 1.001188 0.001000 0.001000 0.001000 1.000812
affective 0.001000 1.001117 0.001000 2.000833 3.001050
affective processes 0.001000 0.001000 0.001000 1.001006 1.000994
ale 0.001000 1.001105 0.001000 0.001000 1.000895
altered 1.000964 0.001000 0.001000 0.001000 3.001036
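
The values above are not probabilities: many sit at a small floor of 0.001 and others exceed 1, which is consistent with unnormalized topic-word weights (pseudo-counts) such as scikit-learn's `components_` attribute. Assuming that interpretation, dividing each topic's column by its sum gives a distribution over terms for that topic. The sketch below uses three rows copied from the table and shortened, hypothetical column names, so the resulting numbers are illustrative only; a real normalization would sum over all terms.

```python
import pandas as pd

# Three rows copied from the transposed table above; column names shortened.
weights = pd.DataFrame(
    {
        "topic_1": [0.001000, 2.001659, 1.000964],
        "topic_5": [2.001000, 1.999723, 3.001036],
    },
    index=["abstract", "addition", "altered"],
)

# Divide each column by its total so every topic sums to 1 over terms.
p_word_g_topic = weights / weights.sum(axis=0)
print(p_word_g_topic.round(3))
```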


LDA5__1_network_cortex_control LDA5__2_human_cortex_motor LDA5__3_prefrontal_identified_stimulation LDA5__4_connectivity_functional_functional connectivity LDA5__5_connectivity_functional_social
Token
0 network human prefrontal connectivity connectivity
1 cortex cortex identified functional functional
2 control motor stimulation functional connectivity social
3 error literature cortex macm networks
4 functional functional active posterior anterior
5 lateral maps prefrontal cortex human insula
6 indicated talairach published seed functional networks
7 cortices probabilistic lobe method cognitive
8 cognition functional segregation resting cognitive modeling
9 thalamus segregation use connections processes
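
The table above lists, for each topic, the highest-weighted terms in descending order of weight. A minimal sketch of one way to build such a table with pandas follows, using a small made-up weight table. The names `term_weights` and `n_top_terms` and all values are assumptions for illustration, not the code that produced the output above.

```python
import pandas as pd

# Hypothetical term-by-topic weights (rows are terms, columns are topics).
term_weights = pd.DataFrame(
    {
        "topic_a": [3.0, 0.1, 2.0, 1.0],
        "topic_b": [0.1, 4.0, 1.0, 2.5],
    },
    index=["network", "motor", "cortex", "human"],
)

n_top_terms = 3

# For each topic, pick the n_top_terms highest-weighted terms.
# nlargest returns them already sorted from largest to smallest weight.
top_terms = pd.DataFrame(
    {
        topic: term_weights[topic].nlargest(n_top_terms).index.tolist()
        for topic in term_weights.columns
    }
)
top_terms.index.name = "Token"
print(top_terms)
```

With the real weights, `term_weights` would be the transposed topic-by-term DataFrame shown earlier and `n_top_terms` would be 10.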


