Abstract
The use of e-learning systems has increased steadily in recent years, and they have become more and more prominent both in universities and in self-study. This is because the amount of information is growing rapidly and people often need it to be available a click away. Such plentiful information can be a true blessing, but it can also be a real hindrance: when the information is unclassified, the user is left without any coherent knowledge of the subject searched for. Data mining can be seen as a solution to this kind of problem; it is already used in research areas such as medicine, business, and market research, where immense amounts of data are available. Simply put, data mining techniques can be used to "mine" knowledge from e-learning systems by analyzing information and detecting patterns in the behavior of teachers and students and, perhaps most importantly, by learning the students' assimilation process and learning patterns, information that cannot easily be seen with the naked eye. The results can then be applied to make the learning process more effective, by adding functionality such as a personalized learning process, feedback for professors on the didactic content, or intrusion detection tools. Ultimately, data mining classification algorithms can be used to predict and classify students (for example, by student success or grades). Students can also be grouped, and their response to different teaching techniques can be predicted or analyzed. The objective of this study is to improve the performance of one data mining classification algorithm by using another one, keeping in mind their strengths and weaknesses. Performance was tested with the same training sets for all algorithms, first testing the "clean" algorithms taken separately and then the combinations of the algorithms. The algorithms used in the study are decision trees and one instance-based algorithm, k-NN.
We used the C4.5 decision tree implementation (both pruned and unpruned trees) and the 1-NN and 3-NN algorithms for the instance-based one. The study examines whether the combined algorithms performed better than the individual algorithms, and which combination of algorithms outperforms the others.
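As a rough illustration of the instance-based side of the comparison, the following minimal sketch shows how 1-NN and 3-NN can disagree on the same training set. It is not the implementation used in the study: the function name `knn_predict`, the Euclidean distance, the two features, and the toy student records are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_tuple, label) pairs. Euclidean distance is
    an assumption of this sketch; other distance measures are possible.
    """
    # Keep the k training examples closest to the query (stable sort).
    neighbours = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    # On a tied vote, the label encountered first (the nearer one) wins.
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (weekly study hours, quiz average) -> outcome.
train = [
    ((1.0, 2.0), "fail"),
    ((2.0, 1.0), "fail"),
    ((1.5, 1.5), "fail"),
    ((6.0, 7.0), "pass"),
    ((7.0, 6.0), "pass"),
    ((3.2, 3.2), "pass"),
]

query = (3.0, 3.0)
print("1-NN:", knn_predict(train, query, 1))  # nearest neighbour only
print("3-NN:", knn_predict(train, query, 3))  # majority of three neighbours
```

On this toy data the single nearest neighbour is labeled "pass", while two of the three nearest neighbours are labeled "fail", so the two settings of k give different predictions for the same student. This sensitivity to individual training instances is one reason the choice of k matters when comparing classifiers.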