Personas FAQ
How does the Personas chart calculate clusters?
- It doesn't handle outliers well, so behaviors with large frequency ranges could skew the clusters toward representing unusual patterns of engagement. As a result, the clusters could fail to capture the nuance in more typical rates of behavior.
- It doesn't handle "high-dimensional" data well, so when customers have many different event types, the clusters could fail to represent groups of users who were truly similar in behavior.
What is Non-Negative Matrix Factorization (NMF)?
Amplitude uses NMF to calculate clusters. Given a data set, clustering algorithms look for ways to partition the set that maximize similarity within each partition while minimizing similarity between different partitions.
To escape the curse of dimensionality in the original "event space," NMF explicitly carries out a mathematical dimension reduction to arrive at a more comprehensible "behavior space." The method also reduces outlier effects by weighting events based on their frequencies and by normalizing each user's event counts. Once projected to the simpler behavior space, users who are similar along certain behavioral dimensions cluster together easily.
The number of dimensions in the behavior space matches the number of clusters specified. Because of this built-in connection, NMF clusters tend to be hierarchical.
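The projection from event space to behavior space can be illustrated with a small, self-contained sketch. It is not Amplitude's implementation: the `nmf` helper, the `counts` matrix, and the choice of the standard multiplicative update rule are all assumptions made for illustration, and only the per-user normalization step is shown (frequency-based event weighting is left out). Each user is assigned to the behavior dimension with the largest weight, so the number of clusters equals the number of dimensions `k`.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, k, iterations=500, seed=0, eps=1e-9):
    """Approximate non-negative v (users x events) as w (users x k) times h (k x events)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start; multiplicative updates keep entries non-negative.
    w = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]
    h = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    # Rescale so each behavior (row of h) sums to 1, making the columns of w comparable.
    for j in range(k):
        s = sum(h[j]) or 1.0
        h[j] = [x / s for x in h[j]]
        for i in range(n):
            w[i][j] *= s
    return w, h


# Hypothetical event counts: rows are users, columns are four event types.
# The third user is far more active than the rest but has the same behavior mix
# as the first two.
counts = [
    [12, 8, 0, 1],
    [9, 11, 1, 0],
    [600, 450, 10, 5],
    [0, 1, 14, 9],
    [1, 0, 7, 12],
    [2, 1, 20, 25],
]

# Normalize each user's counts so raw volume does not dominate the factorization.
normalized = [[c / sum(row) for c in row] for row in counts]

w, h = nmf(normalized, k=2)

# Each user's cluster is the behavior dimension with the largest weight.
labels = [max(range(len(row)), key=row.__getitem__) for row in w]
print(labels)
```

In this sketch the rows of `h` describe each behavior as a mix of event types, and the rows of `w` describe each user as a mix of behaviors. Because counts are normalized per user first, the very active third user is grouped by the shape of their activity rather than its volume.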
To learn more about how NMF works, refer to the NMF clustering paper.