Top Algorithms Used in Data Mining

I am trying to compile a comprehensive list of data mining algorithms, and while doing so I found that a top 10 list can be put together in several ways.

According to a research paper, the IEEE International Conference on Data Mining (ICDM) identified the following top 10 data mining algorithms in December 2006. They are among the most influential data mining algorithms in the research community:

  1. C4.5
  2. k-Means
  3. SVM
  4. Apriori
  5. EM
  6. PageRank
  7. AdaBoost
  8. kNN
  9. Naive Bayes
  10. CART

Public Voting:

  1. Decision Trees/Rules
  2. Regression
  3. Clustering
  4. Statistics (descriptive)
  5. Visualization
  6. Time series/Sequence analysis
  7. Support Vector Machines (SVM)
  8. Association rules
  9. Ensemble methods
  10. Text Mining
  11. Neural Nets
  12. Boosting
  13. Bayesian
  14. Bagging
  15. Factor Analysis
  16. Anomaly/Deviation detection
  17. Social Network Analysis
  18. Survival Analysis
  19. Genetic algorithms
  20. Uplift modeling

Based on voting on the Mahout user mailing list:

  1. Matrix factorization (SVD)
  2. k-means
  3. Naive Bayes
  4. Dirichlet Process Clustering
  5. Matrix Factorization
  6. Frequent Pattern Matching
  7. LDA
  8. Expectation Maximization
  9. SVM
  10. Decision Trees
  11. Logistic Regression
  12. Random Forest



One thought on this post:

  1. Thanks for compiling these lists. It would be great if each algorithm had a one-sentence description or a link to its Wikipedia page for the curious (and lazy).

