Topic modeling is an unsupervised machine learning technique that analyzes text data to determine word clusters for a set of documents. The clusters of similar words are called topics.
Given the text collection and the number of topics, we want to infer the actual topics and the topic distribution for each document.
Topics are just word distributions. Interpreting topics, making sense of words and generating labels is subjective.
Latent Dirichlet Allocation (LDA) is one of the most popular topic modeling methods.