    Abstract

    We live in an era of big data. Data is everywhere in daily life, people deal with it constantly in work and study, and almost everything that ends up being processed or analyzed involves data. A matter that calls for a decision or an analysis may, however, consist of a large volume of data that is discontinuous or even irregular, so conventional methods struggle to find patterns in it or to process it. Data mining has developed steadily in response, and dedicated algorithms now exist for processing and analyzing big data. They make it possible to analyze matters that are otherwise hard to predict, in areas such as bank loan risk, clinical decision-making, and manufacturing.

    The decision tree algorithm is a classification algorithm: for an event that requires a decision or an analysis, the structure of the tree is determined by the attributes of the factors that may affect the outcome. The C4.5 decision tree algorithm chooses each node of the tree by computing the information gain ratio of those attributes. The attribute with the largest information gain ratio becomes the root node; the gain ratio is then recomputed on each branch to determine the nodes below it, which yields a top-down decision tree. A data record is tested at successive nodes until it reaches the class it belongs to.

    Student grades are the best standard a school has for evaluating its students, and they are also the basis for teaching students according to their aptitude. Many factors affect grades, though: a student's grades fluctuate, and exam scores alone cannot establish a student's level. A student's learning status therefore has to be judged from behavior in class, study habits, and similar indicators, which means analyzing all aspects of the student's performance together. With data mining, a student's learning status can be inferred from performance across these aspects.

    Keywords: data mining; big data; decision tree algorithm; C4.5 algorithm
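
    The node-selection rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the thesis's implementation: the student records, the attribute names (`attendance`, `homework`, `result`) and the helper functions are all hypothetical, and the sketch handles only categorical attributes, leaving out C4.5's treatment of continuous values, missing values, and pruning.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, attr, target):
    """Information gain ratio of categorical attribute `attr` w.r.t. `target`."""
    total = len(rows)
    # Group the target labels by the value each row takes on `attr`.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attr], []).append(row[target])
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attr] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(rows, attrs, target):
    """Attribute with the largest gain ratio, i.e. the node to split on."""
    return max(attrs, key=lambda a: gain_ratio(rows, a, target))


# Hypothetical toy records, invented purely for illustration.
ROWS = [
    {"attendance": "high", "homework": "done",   "result": "pass"},
    {"attendance": "high", "homework": "done",   "result": "pass"},
    {"attendance": "high", "homework": "missed", "result": "pass"},
    {"attendance": "low",  "homework": "done",   "result": "fail"},
    {"attendance": "low",  "homework": "missed", "result": "fail"},
    {"attendance": "low",  "homework": "missed", "result": "fail"},
]

root = best_attribute(ROWS, ["attendance", "homework"], "result")
print(root)
```

    In this toy data `attendance` separates the two classes perfectly, so it has the larger gain ratio and would become the root. A full tree would repeat the same selection on the subset of records along each branch until the records in a branch share one class.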

    Contents

    1. Introduction

    2. Algorithm Overview

