
What exactly is this topic coherence pipeline thing? Why is it even important? Moreover, what is the advantage of having this pipeline at all? In this post I will try to answer those questions in language that is as non-technical as possible. It is meant for the general reader as much as the technical one, so I will try to engage your imagination more and your maths skills less.

Imagine that you get water from a lot of places. The way you test this water is by giving it to a lot of people and then collecting their reviews. If most of the reviews are bad, you say the water is bad, and vice versa. So basically all your evaluations are based on reviews that rate the water as good or bad. If someone asks you exactly how good (or bad) the water is, you blend in your personal opinion. But this doesn't assign a number to the quality of the water, so it is only a qualitative analysis. Hence it can't be used to compare two different sources of water in a definitive manner.

Topic coherence score install

Since you are a lazy person and strive to assign a quantity to the quality, you install four different pipes at the end of the water source and design a meter that tells you the exact quality of the water by assigning a number to it. While doing this you receive help from a lot of wonderful people around you, and so you succeed in installing it. Now you don't need to gather a hundred different people to get their opinion on the quality of the water. You can get it straight from the meter, and this value is always in accordance with human opinion.

The water here is the topics from some topic modelling algorithm. Earlier, the topics coming out of these algorithms were tested for human interpretability by presenting them to humans and taking their input. This was qualitative, not quantitative. The meter and the pipes combined (yes, you guessed it right) are the topic coherence pipeline.

Segmentation: where the water is partitioned into several glasses, assuming that the quality of water in each glass is different.
Probability Estimation: where the quantity of water in each glass is measured.
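
For readers who would like to see the segmentation and probability estimation pipes outside the water analogy, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy `documents`, the example `topic`, and the helper names `segment` and `probability` are hypothetical and do not come from any library. Pairing up every two words and counting the documents that contain them is just one simple way these two stages could be realised.

```python
from itertools import combinations

# Hypothetical toy corpus: each document is reduced to the set of words it contains.
documents = [
    {"water", "glass", "pipe", "meter"},
    {"water", "glass", "quality"},
    {"topic", "model", "coherence"},
    {"water", "pipe", "quality"},
]

# Hypothetical topic: the top words some topic modelling algorithm might return.
topic = ["water", "glass", "pipe"]


def segment(top_words):
    """Segmentation: split the topic's top words into word pairs (the 'glasses')."""
    return list(combinations(top_words, 2))


def probability(words, docs):
    """Probability estimation: fraction of documents that contain all the given words."""
    hits = sum(1 for doc in docs if all(word in doc for word in words))
    return hits / len(docs)


pairs = segment(topic)
estimates = {pair: probability(pair, documents) for pair in pairs}

for pair, estimate in estimates.items():
    print(pair, estimate)
```

Each pair plays the role of one glass, and its estimate is the quantity measured in that glass. The later stages of the pipeline would turn these quantities into a single number for the topic.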
