Topic Modeling | Towards Data Science https://towardsdatascience.com/tag/topic-modeling/ Publish AI, ML & data-science insights to a global community of data professionals. Mon, 14 Jul 2025 23:44:33 +0000

Topic Model Labelling with LLMs https://towardsdatascience.com/topic-model-labelling-with-llms/ Mon, 14 Jul 2025 23:44:18 +0000 https://towardsdatascience.com/?p=606581 Python tutorial for reproducible labeling of cutting-edge topic models with GPT-4o mini.

The post Topic Model Labelling with LLMs appeared first on Towards Data Science.

A Practical Guide to BERTopic for Transformer-Based Topic Modeling https://towardsdatascience.com/a-practical-guide-to-bertopic-for-transformer-based-topic-modeling/ Thu, 08 May 2025 05:08:43 +0000 https://towardsdatascience.com/?p=605946 A deep dive into BERTopic’s 6 modules to transform financial news into insightful topics

Choose the Right One: Evaluating Topic Models for Business Intelligence https://towardsdatascience.com/choose-the-right-one-evaluating-topic-models-for-business-intelligence/ Thu, 24 Apr 2025 19:50:50 +0000 https://towardsdatascience.com/?p=605801 Python tutorial for evaluating top-notch bigram topic models in customer email classification

Contextual Topic Modelling in Chinese Corpora with KeyNMF https://towardsdatascience.com/contextual-topic-modelling-in-chinese-corpora-with-keynmf-9a1d02f02648/ Mon, 13 Jan 2025 18:47:24 +0000 https://towardsdatascience.com/contextual-topic-modelling-in-chinese-corpora-with-keynmf-9a1d02f02648/ A comprehensive guide on getting the most out of your Chinese topic models, from preprocessing to interpretation.

Topic Modelling Your Personal Data https://towardsdatascience.com/topic-modelling-your-personal-data-9561e25a042e/ Sat, 21 Sep 2024 01:09:23 +0000 https://towardsdatascience.com/topic-modelling-your-personal-data-9561e25a042e/ Using Traditional and Transformer Models to Explore Personal Data Stored by Brokers

Case-Study: Multilingual LLM for Questionnaire Summarization https://towardsdatascience.com/case-study-multilingual-llm-for-questionnaire-summarization-edf7acdcb37c/ Tue, 30 Jul 2024 13:43:24 +0000 https://towardsdatascience.com/case-study-multilingual-llm-for-questionnaire-summarization-edf7acdcb37c/ An LLM Approach to Summarizing Students' Responses for Course Questionnaires in Hebrew, Arabic and English

Topic Modeling Open-Source Research with the OpenAlex API https://towardsdatascience.com/topic-modeling-open-source-research-with-the-openalex-api-5191c7db9156/ Mon, 15 Jul 2024 22:59:37 +0000 https://towardsdatascience.com/topic-modeling-open-source-research-with-the-openalex-api-5191c7db9156/ An overview of topic modeling global research through the OpenAlex API and visualizing results

Topic Modelling with BERTtopic in Python https://towardsdatascience.com/topic-modelling-with-berttopic-in-python-8a80d529de34/ Mon, 01 Apr 2024 12:25:04 +0000 https://towardsdatascience.com/topic-modelling-with-berttopic-in-python-8a80d529de34/ Hands-on tutorial on modeling political statements with a state-of-the-art transformer-based topic model

Semantic Signal Separation https://towardsdatascience.com/semantic-signal-separation-769f43b46779/ Sun, 11 Feb 2024 13:19:52 +0000 https://towardsdatascience.com/semantic-signal-separation-769f43b46779/ Understand Semantic Structures with Transformers and Topic Modeling

Topic Modelling using ChatGPT API https://towardsdatascience.com/topic-modelling-using-chatgpt-api-8775b0891d16/ Wed, 04 Oct 2023 22:42:51 +0000 https://towardsdatascience.com/topic-modelling-using-chatgpt-api-8775b0891d16/ Comprehensive guide to ChatGPT API for newbies

Topics per Class Using BERTopic https://towardsdatascience.com/topics-per-class-using-bertopic-252314f2640/ Sat, 09 Sep 2023 00:24:21 +0000 https://towardsdatascience.com/topics-per-class-using-bertopic-252314f2640/ How to understand the differences in texts by categories

Cᵥ Topic Coherence Explained https://towardsdatascience.com/c%e1%b5%a5-topic-coherence-explained-fc70e2a85227/ Thu, 12 Jan 2023 22:11:46 +0000 https://towardsdatascience.com/c%e1%b5%a5-topic-coherence-explained-fc70e2a85227/ Understanding the coherence metric that correlates most strongly with human judgment.

Let us Extract some Topics from Text Data – Part IV: BERTopic https://towardsdatascience.com/let-us-extract-some-topics-from-text-data-part-iv-bertopic-46ddf3c91622/ Mon, 19 Dec 2022 20:47:13 +0000 https://towardsdatascience.com/let-us-extract-some-topics-from-text-data-part-iv-bertopic-46ddf3c91622/ Learn more about the BERT family member for topic modelling

Let us Extract some Topics from Text Data - Part III: Non-Negative Matrix Factorization (NMF) https://towardsdatascience.com/let-us-extract-some-topics-from-text-data-part-iii-non-negative-matrix-factorization-nmf-8eba8c8edada/ Wed, 14 Dec 2022 21:03:56 +0000 https://towardsdatascience.com/let-us-extract-some-topics-from-text-data-part-iii-non-negative-matrix-factorization-nmf-8eba8c8edada/ Topic modeling is a Natural Language Processing (NLP) task that uses unsupervised learning methods to extract the main topics from text data. The word "Unsupervised" here means that there are no training […]

Let us Extract some Topics from Text Data – Part I: Latent Dirichlet Allocation (LDA) https://towardsdatascience.com/let-us-extract-some-topics-from-text-data-part-i-latent-dirichlet-allocation-lda-e335ee3e5fa4/ Thu, 03 Nov 2022 04:09:19 +0000 https://towardsdatascience.com/let-us-extract-some-topics-from-text-data-part-i-latent-dirichlet-allocation-lda-e335ee3e5fa4/ Learn what topic modelling entails and its implementation using Python's nltk, gensim, sklearn, and pyLDAvis packages

An Introduction to Topic-Noise Models https://towardsdatascience.com/an-introduction-to-topic-noise-models-c48fe77e32a6/ Fri, 21 Oct 2022 13:47:24 +0000 https://towardsdatascience.com/an-introduction-to-topic-noise-models-c48fe77e32a6/ Learn how to use topic-noise models (1/3). Words matter. And these days, it can be hard to cut through the noise to find the words that matter the most. In this series of articles, we will introduce a new type of model, the topic-noise model, and show you how to use these models on […]

Understanding Outliers in Text Data with Transformers, Cleanlab, and Topic Modeling https://towardsdatascience.com/understanding-outliers-in-text-data-with-transformers-cleanlab-and-topic-modeling-db3585415a19/ Thu, 06 Oct 2022 18:42:26 +0000 https://towardsdatascience.com/understanding-outliers-in-text-data-with-transformers-cleanlab-and-topic-modeling-db3585415a19/ An open-source Python workflow to audit text datasets. Many text corpora contain heterogeneous documents, some of which may be anomalous and worth understanding more. For deployed ML systems, in particular, we may want to automatically flag test documents that do not stem from the […]

Topic Modeling with LSA, pLSA, LDA, NMF, BERTopic, Top2Vec: a Comparison https://towardsdatascience.com/topic-modeling-with-lsa-plsa-lda-nmf-bertopic-top2vec-a-comparison-5e6ce4b1e4a5/ Mon, 19 Sep 2022 17:46:59 +0000 https://towardsdatascience.com/topic-modeling-with-lsa-plsa-lda-nmf-bertopic-top2vec-a-comparison-5e6ce4b1e4a5/ A comparison between different topic modeling strategies including practical Python examples

Seeded Topic Models as a Yard Stick: Implement them in R with keyATM https://towardsdatascience.com/why-to-use-seeded-topic-models-in-your-next-project-and-how-to-implement-them-in-r-8502d15d6e8d/ Wed, 07 Sep 2022 13:04:15 +0000 https://towardsdatascience.com/why-to-use-seeded-topic-models-in-your-next-project-and-how-to-implement-them-in-r-8502d15d6e8d/ Modelling geopolitical risk based on UK parliament transcripts

LDA Topic Modeling – A Case Study with Chinese Tweets Data https://towardsdatascience.com/lda-topic-modeling-a-case-study-with-chinese-tweets-data-2d08ad25b08c/ Fri, 08 Jul 2022 20:49:21 +0000 https://towardsdatascience.com/lda-topic-modeling-a-case-study-with-chinese-tweets-data-2d08ad25b08c/ A data-driven approach to understanding the trending topics on Twitter
