LDA topic modeling

Trains a latent Dirichlet allocation (LDA) topic model on abstracts from Neurosynth, using NiMARE's LDAModel, which relies on scikit-learn.

import os

import pandas as pd

from nimare import annotate
from nimare.dataset import Dataset
from nimare.utils import get_resource_path

Load dataset with abstracts

dset = Dataset(os.path.join(get_resource_path(), "neurosynth_laird_studies.json"))

Initialize LDA model

model = annotate.lda.LDAModel(n_topics=5, max_iter=1000, text_column="abstract")

Run model

new_dset = model.fit(dset)

View results

The annotations DataFrame of the new Dataset is very large, so we show only its first ten columns and rows.

   id          study_id  contrast_id  Neurosynth_TFIDF__001  Neurosynth_TFIDF__01  Neurosynth_TFIDF__05  Neurosynth_TFIDF__10  Neurosynth_TFIDF__100  Neurosynth_TFIDF__11  Neurosynth_TFIDF__12
0  17029760-1  17029760  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
1  18760263-1  18760263  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
2  19162389-1  19162389  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.176321              0.0
3  19603407-1  19603407  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
4  20197097-1  20197097  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
5  22569543-1  22569543  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
6  22659444-1  22659444  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
7  23042731-1  23042731  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
8  23702412-1  23702412  1            0.0                    0.0                   0.0                   0.061006              0.0                    0.000000              0.0
9  24681401-1  24681401  1            0.0                    0.0                   0.0                   0.000000              0.0                    0.000000              0.0
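
A slice like the one above can be taken with ordinary pandas indexing. The following is a minimal sketch on a toy DataFrame standing in for the annotations table; the label column names, study IDs, and zero values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-in for an annotations table: identifier columns plus many label columns.
n_studies, n_labels = 12, 50
annotations = pd.DataFrame(
    np.zeros((n_studies, n_labels)),
    columns=[f"label_{i:02d}" for i in range(n_labels)],
)
annotations.insert(0, "contrast_id", "1")
annotations.insert(0, "study_id", [str(1000 + i) for i in range(n_studies)])
annotations.insert(0, "id", annotations["study_id"] + "-" + annotations["contrast_id"])

# Keep the first ten columns, then the first ten rows.
preview = annotations[annotations.columns[:10]].head(10)
print(preview.shape)
```

Positional indexing with annotations.iloc[:10, :10] selects the same block.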


The topic-by-term weights DataFrame is very wide (one column per term), so we transpose it and show only the first ten terms.

model.distributions_["p_topic_g_word_df"].T.head(10)
                     LDA5__1_connectivity_functional_macm  LDA5__2_connectivity_functional_networks  LDA5__3_motor_literature_cortex  LDA5__4_functional_connectivity_human  LDA5__5_cortex_network_control
10                   0.001                                 0.001000                                  0.001000                         2.001000                               0.001000
abstract             0.001                                 0.001000                                  0.001000                         2.001000                               0.001000
action               0.001                                 1.000983                                  0.001000                         1.001017                               0.001000
active               0.001                                 2.001246                                  0.001000                         0.001000                               2.000754
addition             0.001                                 2.001093                                  2.001402                         0.001000                               1.000505
additionally         0.001                                 0.001000                                  0.001000                         1.000875                               1.001125
affective            0.001                                 0.001000                                  0.001000                         5.001372                               1.000628
affective processes  0.001                                 0.001000                                  0.001000                         2.001000                               0.001000
ale                  0.001                                 0.001000                                  1.001180                         0.001000                               1.000820
altered              0.001                                 0.001000                                  0.001000                         0.001000                               4.001000
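
The table that follows lists the highest-weighted tokens for each topic. One way to derive such a table is to sort each column of the transposed DataFrame and keep the leading index labels. This is a minimal sketch on a toy stand-in for the topic-by-term weights; the topic names, tokens, and weights are made up, and it is not necessarily the code that produced the table.

```python
import pandas as pd

# Toy topic-by-term weights (rows: topics, columns: tokens); values are made up.
topic_by_word = pd.DataFrame(
    [[5.0, 1.0, 3.0, 0.1], [0.2, 4.0, 0.5, 2.0]],
    index=["topic_1", "topic_2"],
    columns=["connectivity", "motor", "functional", "cortex"],
)

n_top_terms = 3
word_by_topic = topic_by_word.T  # rows: tokens, columns: topics

# For each topic, sort tokens by descending weight and keep the top few.
top_term_df = pd.DataFrame(
    {
        topic: word_by_topic[topic]
        .sort_values(ascending=False)
        .index[:n_top_terms]
        .tolist()
        for topic in word_by_topic.columns
    }
)
top_term_df.index.name = "Token"
print(top_term_df)
```

Each column then reads top to bottom from the most to the least heavily weighted token for that topic.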


Token  LDA5__1_connectivity_functional_macm  LDA5__2_connectivity_functional_networks  LDA5__3_motor_literature_cortex  LDA5__4_functional_connectivity_human  LDA5__5_cortex_network_control
0      connectivity                          connectivity                              motor                            functional                             cortex
1      functional                            functional                                literature                       connectivity                           network
2      macm                                  networks                                  cortex                           human                                  control
3      functional connectivity               insula                                    talairach                        cognitive                              altered
4      structural                            anterior                                  dorsal                           social                                 functional
5      method                                stimulation                               degree                           posterior                              cognition
6      connections                           functional networks                       modalities                       functions                              prefrontal cortex
7      correlations                          identified                                functional                       anterior                               parietal
8      structure                             approaches                                multimodal                       task                                   prefrontal
9      function                              resting                                   published                        cortex                                 lateral


Total running time of the script: ( 0 minutes 3.345 seconds)
