Comparing k-means results with a labeled database to find the precision of the algorithm

Hi, I have 2000 articles (2000 .txt files) on 20 subjects (20 folders). This is my database.
I clustered them with the k-means algorithm. (The "idx" output of kmeans shows which cluster each article belongs to.)
Now, how can I compare the k-means result with the database to find the precision of the algorithm?
It's hard to check 2000 files by eye!

Answers (1)

Image Analyst on 23 Oct 2014
This is typically done with a "confusion matrix", which is an N-by-N table (N classes by N classes) that shows what class each sample got classified as versus what its "true" class is. An ideal classification would yield a confusion matrix with nonzero counts only along the diagonal. The more counts fall off the diagonal, the less accurate your classification algorithm is.
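One caveat for clustering: k-means numbers its clusters arbitrarily, so cluster 3 need not correspond to folder 3. Here is a minimal sketch of the idea, written in plain Python rather than MATLAB so that it is self-contained (in MATLAB, `confusionmat` from the Statistics and Machine Learning Toolbox builds the same table). It assumes zero-based integer labels, and it maps each cluster to the subject that occurs most often in it, which is only one possible matching rule. The function names and the toy data are made up for illustration.

```python
def confusion_matrix(true_labels, cluster_ids, n_classes, n_clusters):
    """Rows are true classes, columns are clusters; entry [i][j] is a sample count."""
    table = [[0] * n_clusters for _ in range(n_classes)]
    for t, c in zip(true_labels, cluster_ids):
        table[t][c] += 1
    return table


def majority_mapping(table):
    """Map each cluster (column) to the true class that occurs most often in it."""
    n_classes = len(table)
    n_clusters = len(table[0])
    return [max(range(n_classes), key=lambda i: table[i][j])
            for j in range(n_clusters)]


def accuracy(true_labels, cluster_ids, mapping):
    """Fraction of samples whose mapped cluster equals their true class."""
    hits = sum(1 for t, c in zip(true_labels, cluster_ids) if mapping[c] == t)
    return hits / len(true_labels)


# Toy data (hypothetical): 9 articles, 3 subjects, 3 clusters.
true_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]   # folder each article came from
cluster_ids = [2, 2, 1, 0, 0, 0, 1, 1, 2]   # cluster assigned by k-means

table = confusion_matrix(true_labels, cluster_ids, 3, 3)
mapping = majority_mapping(table)
acc = accuracy(true_labels, cluster_ids, mapping)

for row in table:
    print(row)
print("cluster -> class:", mapping)
print("accuracy:", acc)
```

With majority mapping, two clusters can end up assigned to the same subject. If you need a one-to-one matching, an assignment method such as the Hungarian algorithm is the usual alternative.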
You can also use ROC curves (http://en.wikipedia.org/wiki/Receiver_operating_characteristic), which plot the true positive rate against the false positive rate as a decision threshold varies. ROC curves are especially common in clinical studies.
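An ROC curve needs a continuous score per sample, not just a hard cluster label, so for k-means you would have to pick one subject at a time (one versus the rest) and choose a score yourself, for example the negative distance to that subject's centroid. That choice is an assumption, not something k-means prescribes. The sketch below, in plain Python with made-up labels and scores, only shows how the curve's points are computed once you have such scores. It assumes all scores are distinct and that both classes are present.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs from sweeping a threshold down through the scores.

    labels: 1 for the positive class, 0 for the negative class.
    Assumes distinct scores and at least one sample of each class.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Visit samples from highest score to lowest; each one moves the curve
    # up (true positive) or right (false positive).
    for _, label in sorted(zip(scores, labels), reverse=True):
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points ordered by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical data: 1 = article belongs to the subject of interest.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
auc = area_under_curve(points)
print(points)
print("AUC:", auc)
```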
