a question about different topics from same corpus #12

yanbo68 · 2012-05-04T08:58:29Z

Hi

   I ran train mode on a corpus of about 1 GB. I ran it twice, each time with 500 topics and 500 iterations, but I got two quite different results: the two "lad.topToWor.txt" files from the two training runs differ substantially. I compared the words in each topic (ignoring the weights), treating two topics as similar when more than 10 of their words match. By that measure, only 250 topics in the two files are similar.
   This means that random initialization gives me quite different results. Is there a way to get a stable result?

Should I increase the number of topics, or the number of iterations? I tried 1000 iterations, but it made no big difference.

  Thanks!

Yanbo
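
For reference, the comparison described above can be sketched roughly as follows. This is only a minimal illustration in Python, not the tool's own code: the function name `count_similar_topics`, the `min_shared` parameter, and the in-memory list-of-word-lists input are all assumptions, and parsing the topic files is left out because their exact layout is not shown here.

```python
def count_similar_topics(run_a, run_b, min_shared=10):
    """Count topics in run_a that share more than min_shared words
    with at least one topic in run_b (weights are ignored).

    run_a, run_b: lists of topics, each topic a list of its top words.
    Hypothetical helper for illustration only.
    """
    # Convert each topic of the second run to a set once, for fast overlap checks.
    sets_b = [set(words) for words in run_b]
    matched = 0
    for words in run_a:
        top = set(words)
        # A topic counts as "similar" if any topic in the other run
        # has more than min_shared words in common with it.
        if any(len(top & other) > min_shared for other in sets_b):
            matched += 1
    return matched


# Toy data with a lowered threshold, just to show the call.
toy_a = [["cat", "dog", "pet"], ["stock", "bank", "money"]]
toy_b = [["bank", "money", "loan"], ["sun", "moon", "star"]]
similar = count_similar_topics(toy_a, toy_b, min_shared=1)
```

With the real data, each run would contribute 500 word lists and `min_shared` would stay at 10, matching the "more than 10 words" rule above.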
