
 

Data mining labs. Spring 2009.

University of Victoria, Canada

 

Presentations, sample data sets, and Python implementations of various data mining algorithms, prepared for and used in the tutorials of the data mining course.

 

Lab 1. Decision trees.

Presentation: Lab1_decisiontree.pdf

Sample data sets: data.zip

Code starters: code.zip

Solutions: solutions.zip

 

Lab 2. Using the map-reduce framework to compute Attribute-Value-Class (AVC) sets for decision trees on massive data.

Presentation: Lab2AVCSetsClusterPython.pdf

Sample data sets: data.zip

Code starters: code.zip

Solutions: solutions.zip
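
To give a feel for the Lab 2 topic, here is a minimal single-machine sketch of computing AVC sets in a map-reduce style. It is not the lab's starter code or solution. The function names (`map_record`, `reduce_counts`, `avc_sets`) and the toy weather-like records are assumptions made for illustration only. The map step emits one count per (attribute, value, class) triple, and the reduce step sums the counts for each triple.

```python
from collections import defaultdict
from itertools import groupby


def map_record(record, class_attr):
    """Map step: emit ((attribute, value, class_label), 1) for each non-class attribute."""
    label = record[class_attr]
    for attr, value in record.items():
        if attr != class_attr:
            yield (attr, value, label), 1


def reduce_counts(key, counts):
    """Reduce step: sum all counts that share the same key."""
    return key, sum(counts)


def avc_sets(records, class_attr):
    """Simulate map, shuffle (sort by key), and reduce on one machine."""
    pairs = [kv for record in records for kv in map_record(record, class_attr)]
    pairs.sort(key=lambda kv: kv[0])  # shuffle: bring equal keys together
    result = defaultdict(dict)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        (attr, value, label), total = reduce_counts(key, (c for _, c in group))
        result[attr][(value, label)] = total
    return dict(result)


# Hypothetical toy records, invented for this sketch.
records = [
    {"outlook": "sunny", "windy": "false", "play": "no"},
    {"outlook": "sunny", "windy": "true", "play": "no"},
    {"outlook": "overcast", "windy": "false", "play": "yes"},
    {"outlook": "rainy", "windy": "false", "play": "yes"},
]
avc = avc_sets(records, "play")
```

Here `avc["outlook"]` maps each (value, class) pair to its count, which is the information a decision tree learner needs to evaluate a split on that attribute without rescanning the raw records.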

 

Lab 3. Classifiers tutorial: decision trees, association rules and naïve Bayes. Classifiers with WEKA.

Presentations: Lab3_assrulesexample.pdf, Lab3_decisiontreeexample.pdf, Lab3_naivebayesxample.pdf, Lab3_classifiersWithWEKA.pdf

Sample data sets: data.zip

Lab 4. Bayesian networks.

Presentation: Lab4_Bayesiannetworks.pdf

Sample data set: weather.nominal.arff

Lab 5. ROC curves with WEKA.

Presentation: Lab5_ROC_weka.pdf

Sample data set (zipped): adult_income.zip, conversionsofeducation.txt

Lab 6. Optimization with genetic algorithms.

Presentation: Lab6_genetic_algorithm.pdf

Sample data (to download): schedule.txt

Code starters: code.zip

Solutions: solutions.zip

Lab 7. Frequent itemsets tutorial: apriori and FP-trees.

Presentations: Lab7_apriori.pdf, Lab7_fpree.pdf

Sample data sets: data.zip

Code starters (includes fimi06b): code.zip

Solutions: solutions.zip

 

Lab 8. Clustering tutorial.

Presentation: Lab8_distances_clustering.pdf

Sample data sets: data.zip

Code for demonstration: code.zip

Lab 9. Web ranking tutorial: link analysis and LSA.

Presentation: Lab9_rankingWEB.pdf

Code for demonstration: PageRankCode