Note
Click here to download the full example code
Train an LDA model and use it¶
This example trains a latent Dirichlet allocation model with MALLET using abstracts from Neurosynth.
import os
import nimare
from nimare import annotate
from nimare.tests.utils import get_test_data_path
Load dataset with abstracts¶
dset = nimare.dataset.Dataset.load(
os.path.join(get_test_data_path(), 'neurosynth_laird_studies.pkl.gz'))
Download MALLET¶
MALLET is a Java toolbox for natural language processing. While LDA is implemented in some Python libraries, like scikit-learn, MALLET appears to do a better job at LDA than other tools. LDAModel will download MALLET automatically, but it’s included here for clarity.
mallet_dir = nimare.extract.download_mallet()
Run model¶
This may take some time, so we won’t run it in the gallery.
model = annotate.lda.LDAModel(dset.texts, text_column='abstract', n_iters=5)
model.fit()
model.save('lda_model.pkl.gz')
# Let's remove the model now that you know how to generate it.
os.remove('lda_model.pkl.gz')
Total running time of the script: ( 0 minutes 0.000 seconds)