Instead of the random initialization currently used, use "Sequential Prediction" as described in Sec. 5.1 of https://cs.stanford.edu/~pliang/papers/permutation-dp-icml2007.pdf:
sequential prediction: a clustering is sampled according to the CRP but weighting the probability of choosing a cluster by the predictive likelihood of the new point given the existing points in the cluster.
This can be seen as a sort of 0th iteration in which class/group membership grows incrementally as each data point is introduced to the sampler. Wang notes that "this works well empirically".
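
A minimal sketch of what such an initializer might look like, for illustration only. It assumes 1-D data, a Gaussian likelihood with known standard deviation `sigma`, and a conjugate `N(mu0, tau0^2)` prior on each cluster mean; those modelling choices, the function names `predictive_logpdf` and `sequential_prediction_init`, and all default parameter values are hypothetical and not taken from the paper or from this project's code.

```python
import math
import random


def predictive_logpdf(x, n, total, mu0=0.0, tau0=10.0, sigma=1.0):
    """Log predictive density of x for a cluster holding n points that sum to total.

    Assumed model (illustrative): Gaussian likelihood with known std sigma and
    a conjugate N(mu0, tau0^2) prior on the cluster mean. With n == 0 this is
    the prior predictive used for opening a new cluster.
    """
    precision = 1.0 / tau0**2 + n / sigma**2
    post_mean = (mu0 / tau0**2 + total / sigma**2) / precision
    pred_var = 1.0 / precision + sigma**2
    return -0.5 * (math.log(2 * math.pi * pred_var) + (x - post_mean) ** 2 / pred_var)


def sequential_prediction_init(data, alpha=1.0, rng=None):
    """Assign points one at a time: CRP weight times predictive likelihood."""
    rng = rng or random.Random()
    assignments = []
    counts, totals = [], []  # per-cluster sufficient statistics
    for x in data:
        # Existing clusters: log(size) + log predictive likelihood of x.
        log_w = [
            math.log(n) + predictive_logpdf(x, n, s)
            for n, s in zip(counts, totals)
        ]
        # New cluster: log(alpha) + log prior predictive of x.
        log_w.append(math.log(alpha) + predictive_logpdf(x, 0, 0.0))
        # Normalize in log space to avoid underflow before sampling.
        m = max(log_w)
        weights = [math.exp(lw - m) for lw in log_w]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(0)
            totals.append(0.0)
        counts[k] += 1
        totals[k] += x
        assignments.append(k)
    return assignments
```

The returned list of cluster labels could then replace the random assignment as the sampler's starting state, for example `z0 = sequential_prediction_init(points, alpha=1.0, rng=random.Random(0))`. Because points are processed in order, the result depends on the data ordering; shuffling the data first is one option if that matters.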