`nimare.annotate.lda`.LDAModel

class LDAModel(text_df, text_column='abstract', n_topics=50, n_iters=1000, alpha='auto', beta=0.001)

Perform topic modeling using Latent Dirichlet Allocation (LDA).

Build an LDA [1] topic model with the Java toolbox MALLET [2], as performed in [3].

Parameters

text_df (pandas.DataFrame) – A pandas DataFrame with two columns: ‘id’ and text_column, where text_column contains the article text.
text_column (str, optional) – Name of column in text_df that contains text. Default is ‘abstract’.
n_topics (int, optional) – Number of topics to generate. Default is 50.
n_iters (int, optional) – Number of iterations to run when training the topic model. Default is 1000.
alpha (float or ‘auto’, optional) – The Dirichlet prior on the per-document topic distributions. Default is ‘auto’, which sets alpha to 50 / n_topics, based on Poldrack et al. (2012).
beta (float, optional) – The Dirichlet prior on the per-topic word distributions. Default is 0.001, based on Poldrack et al. (2012).
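The ‘auto’ rule for alpha can be illustrated with a minimal sketch. The helper name `resolve_alpha` is hypothetical and is not part of NiMARE; it only mirrors the 50 / n_topics rule described above and is not the library's actual implementation.

```python
def resolve_alpha(alpha, n_topics):
    """Hypothetical helper (not a NiMARE function).

    Returns the numeric Dirichlet prior implied by the documented rule:
    'auto' maps to 50 / n_topics; any other value is used as given.
    """
    if n_topics <= 0:
        raise ValueError("n_topics must be a positive integer")
    if alpha == "auto":
        return 50.0 / n_topics
    return float(alpha)


# With the default n_topics=50, 'auto' resolves to 1.0.
print(resolve_alpha("auto", 50))   # 1.0
# Doubling the number of topics halves the prior.
print(resolve_alpha("auto", 100))  # 0.5
# An explicit float is passed through unchanged.
print(resolve_alpha(0.1, 50))      # 0.1
```

Under this rule the prior shrinks as n_topics grows, so models with more topics assume each document draws on a smaller share of them.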

Variables

commands_ (list of str) – List of MALLET commands called to fit the model.

References

1: Blei, David M., Andrew Y. Ng, and Michael I. Jordan. “Latent Dirichlet allocation.” Journal of Machine Learning Research 3 (2003): 993-1022.
2: McCallum, Andrew Kachites. “Mallet: A machine learning for language toolkit.” (2002).
3: Poldrack, Russell A., et al. “Discovering relations between mind, brain, and mental disorders using topic mapping.” PLoS computational biology 8.10 (2012): e1002707. https://doi.org/10.1371/journal.pcbi.1002707

Examples using `nimare.annotate.lda.LDAModel`