LDA topic modeling

This example trains a latent Dirichlet allocation (LDA) model with scikit-learn using abstracts from Neurosynth.

import os

import pandas as pd

from nimare import annotate
from nimare.dataset import Dataset
from nimare.utils import get_resource_path

Load dataset with abstracts

dset = Dataset(os.path.join(get_resource_path(), "neurosynth_laird_studies.json"))

Initialize LDA model

model = annotate.lda.LDAModel(n_topics=5, max_iter=1000, text_column="abstract")

Run model

new_dset = model.fit(dset)

View results

The resulting annotations DataFrame is very large, so we will only show a slice of it: the first ten rows and the first ten columns.

           id  study_id  contrast_id  Neurosynth_TFIDF__001  Neurosynth_TFIDF__01  Neurosynth_TFIDF__05  Neurosynth_TFIDF__10  Neurosynth_TFIDF__100  Neurosynth_TFIDF__11  Neurosynth_TFIDF__12
0  17029760-1  17029760            1                    0.0                   0.0                   0.0              0.000000                    0.0              0.000000                   0.0
1  18760263-1  18760263            1                    0.0                   0.0                   0.0              0.000000                    0.0              0.000000                   0.0
2  19162389-1  19162389            1                    0.0                   0.0                   0.0              0.000000                    0.0              0.176321                   0.0
3  19603407-1  19603407            1                    0.0                   0.0                   0.0              0.000000                    0.0              0.000000                   0.0
4  20197097-1  20197097            1                    0.0                   0.0                   0.0              0.000000                    0.0              0.000000                   0.0
5  22569543-1  22569543            1                    0.0                   0.0                   0.0              0.000000                    0.0              0.000000                   0.0
6  22659444-1  22659444            1                    0.0                   0.0                   0.0              0.000000                    0.0              0.000000                   0.0
7  23042731-1  23042731            1                    0.0                   0.0                   0.0              0.000000                    0.0              0.000000                   0.0
8  23702412-1  23702412            1                    0.0                   0.0                   0.0              0.061006                    0.0              0.000000                   0.0
9  24681401-1  24681401            1                    0.0                   0.0                   0.0              0.000000                    0.0              0.000000                   0.0
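A slice like the one above can be taken by selecting the leading columns and then the leading rows. The sketch below shows the pattern on a small stand-in table; the `annotations` variable and its values are made up for illustration, and in this example the wide table would presumably be the `annotations` attribute of `new_dset` (an assumption).

```python
import pandas as pd

# Stand-in for a wide annotations table (values are made up).
annotations = pd.DataFrame(
    {
        "id": ["a-1", "b-1", "c-1"],
        "study_id": ["a", "b", "c"],
        "contrast_id": ["1", "1", "1"],
        "term_x": [0.0, 0.2, 0.0],
        "term_y": [0.1, 0.0, 0.0],
    }
)

# Keep the first four columns, then the first two rows.
preview = annotations[annotations.columns[:4]].head(2)
print(preview)
```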


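The DataFrame of topic-by-term weights stored in `model.distributions_` is very wide (one column per term), so we transpose it before presenting it. Terms then appear as rows and topics as columns.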

model.distributions_["p_topic_g_word_df"].T.head(10)
                     LDA5__1_connectivity_functional_macm  LDA5__2_human_cortex_lateral  LDA5__3_connectivity_functional_anterior  LDA5__4_connectivity_function_functionally  LDA5__5_motor_functional_literature
10                                                 0.0010                      1.001268                                  1.000732                                    0.001000                             0.001000
abstract                                           0.0010                      0.001000                                  1.000670                                    0.001000                             1.001330
action                                             1.0009                      1.001100                                  0.001000                                    0.001000                             0.001000
active                                             0.0010                      0.001000                                  4.001000                                    0.001000                             0.001000
addition                                           0.0010                      0.934466                                  1.319269                                    0.001000                             2.749265
additionally                                       0.0010                      0.001000                                  1.000864                                    0.001000                             1.001136
affective                                          0.0010                      0.001000                                  3.764053                                    2.237947                             0.001000
affective processes                                0.0010                      0.001000                                  2.001000                                    0.001000                             0.001000
ale                                                0.0010                      0.001000                                  0.001000                                    1.001225                             1.000775
altered                                            0.0010                      0.001000                                  0.001000                                    3.002233                             0.999767
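Several of the weights above exceed 1, so they are not probabilities as displayed. If they are unnormalized topic-term weights of the kind scikit-learn stores in `components_` (an assumption), dividing each topic's weights by their sum yields a distribution over terms for that topic. A minimal sketch on a made-up topic-by-term matrix:

```python
import pandas as pd

# Toy topic-by-term weights (made-up values, two topics, three terms).
weights = pd.DataFrame(
    [[4.0, 1.0, 0.0], [0.5, 0.5, 1.0]],
    index=["topic_1", "topic_2"],
    columns=["active", "affective", "ale"],
)

# Divide each row by its sum so that every topic's weights add up to 1.
p_word_g_topic = weights.div(weights.sum(axis=1), axis=0)
print(p_word_g_topic)
```

Here `topic_1` becomes 0.8, 0.2, 0.0 and `topic_2` becomes 0.25, 0.25, 0.5. The name `p_word_g_topic` is chosen for this sketch only.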


       LDA5__1_connectivity_functional_macm  LDA5__2_human_cortex_lateral  LDA5__3_connectivity_functional_anterior  LDA5__4_connectivity_function_functionally  LDA5__5_motor_functional_literature
Token
0      connectivity                          human                         connectivity                              connectivity                                motor
1      functional                            cortex                        functional                                function                                    functional
2      macm                                  lateral                       anterior                                  functionally                                literature
3      functional connectivity               cognition                     social                                    functional                                  connectivity
4      method                                maps                          posterior                                 maps                                        error
5      connections                           functional                    cognitive                                 structural                                  talairach
6      methods                               segregation                   task                                      altered                                     cortex
7      human                                 functional segregation        processes                                 structure function                          cognitive
8      structural                            frontal                       insula                                    structure                                   functions
9      seed                                  new                           networks                                  candidate                                   magnetic
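A table of top tokens per topic, like the one above, can be derived from a term-by-topic weight table by sorting each topic's column in descending order and keeping the first few terms. The sketch below assumes a small made-up weight table; `weights`, `n_top_terms`, and `top_terms` are illustrative names, and this is not necessarily the code that produced the table above.

```python
import pandas as pd

# Toy term-by-topic weights (made-up values): rows are terms, columns are topics.
weights = pd.DataFrame(
    {"topic_1": [0.1, 3.0, 2.0, 0.5], "topic_2": [4.0, 0.2, 0.3, 1.5]},
    index=["motor", "connectivity", "functional", "cortex"],
)

n_top_terms = 3

# For each topic, sort terms by descending weight and keep the first few.
top_terms = pd.DataFrame(
    {
        topic: weights[topic].sort_values(ascending=False).index[:n_top_terms].tolist()
        for topic in weights.columns
    }
)
top_terms.index.name = "Token"
print(top_terms)
```

With these toy values, `topic_1` lists connectivity, functional, cortex and `topic_2` lists motor, cortex, functional.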

