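Reference [11] Classification and prediction are widely used in data mining. Because data sets almost always contain some degree of missing data, the predictive accuracy of classification models suffers. This work studies the effect of missing data on classification algorithms mainly through sensitivity analysis. Experiments on six classifiers show that when more than 20% of the data in a data set is missing, the predictive accuracy of the classification model is substantially degraded, and that the size of the effect differs across data sets with different characteristics. Of the six classifiers, the naive Bayes classifier is the least sensitive to missing data. For the currently popular approach to handling missing data, namely using a predictive model to predict and fill in the missing values, the naive Bayes classifier is therefore a good choice.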
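The kind of sensitivity experiment described in this abstract can be sketched as follows. This is a minimal illustration, not the procedure used in [11]: the class `NaiveBayes`, the helper `inject_missing`, the use of `None` as the missing-value marker, and Laplace smoothing are all assumptions made for the example. It shows one common reason naive Bayes tolerates missing data: a missing attribute is simply left out of the product of conditional probabilities.

```python
import math
import random
from collections import Counter, defaultdict

MISSING = None  # marker for a missing attribute value (assumption of this sketch)


def inject_missing(rows, rate, seed=0):
    """Return a copy of rows where each cell is blanked with probability `rate`."""
    rng = random.Random(seed)
    return [tuple(MISSING if rng.random() < rate else v for v in row)
            for row in rows]


class NaiveBayes:
    """Categorical naive Bayes that skips missing values in fit and predict."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.n = len(labels)
        self.value_counts = defaultdict(Counter)  # (feature, class) -> value counts
        self.values = defaultdict(set)            # feature -> observed values
        for row, y in zip(rows, labels):
            for j, v in enumerate(row):
                if v is MISSING:
                    continue  # a missing cell contributes no evidence
                self.value_counts[(j, y)][v] += 1
                self.values[j].add(v)
        return self

    def predict(self, row):
        best, best_score = None, -math.inf
        for y, cy in self.class_counts.items():
            score = math.log(cy / self.n)  # log prior
            for j, v in enumerate(row):
                if v is MISSING:
                    continue  # drop the factor for a missing attribute
                counts = self.value_counts[(j, y)]
                total = sum(counts.values())
                k = max(len(self.values[j]), 1)
                # Laplace-smoothed estimate of P(value | class)
                score += math.log((counts[v] + 1) / (total + k))
            if score > best_score:
                best, best_score = y, score
        return best


rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("sunny", None), (None, "cool")]
labels = ["no", "no", "yes", "yes", "no", "yes"]
model = NaiveBayes().fit(rows, labels)
```

To approximate the sensitivity analysis, one could call `inject_missing` with increasing rates (for example 0.1, 0.2, 0.3), retrain, and compare accuracy on a held-out set.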
Reference [12] Various Bayesian network classifier learning algorithms are implemented in Weka [10]. This note provides some user documentation and implementation details.
Summary of main capabilities:
- Structure learning of Bayesian networks using various hill climbing (K2, B, etc.) and general purpose (simulated annealing, tabu search) algorithms.
- Local score metrics implemented: Bayes, BDe, MDL, entropy, AIC.
- Global score metrics implemented: leave-one-out cv, k-fold cv and cumulative cv.
- Conditional independence based causal recovery algorithm available.
- Parameter estimation using direct estimates and Bayesian model averaging.
- GUI for easy inspection of Bayesian networks.
- Part of Weka, allowing systematic experiments to compare Bayes net performance with general purpose classifiers like C4.5, nearest neighbor, support vector, etc.
- Source code available under GPL allows for integration in other systems and makes it easy to extend.
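Reference [13] With massive, high-dimensional data, naive Bayes classification may face both the excessive cost of obtaining large numbers of class-labeled samples and the inability of the current classification rules to adapt to changes in the data. The authors therefore propose a Bayesian algorithm based on rough set (RS) dynamic reduction that works from a small training set: rough set theory is used to dynamically reduce the attributes of the decision table, mining the condition attributes most useful for classification (the minimal attributes), which then serve as the input from which the naive Bayes inference (NBC) method learns and classifies. The method combines the strengths of Bayesian inference with those of dynamic reduction, which partitions a large database by sampling. Experiments demonstrate the feasibility of the algorithm.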
Reference [14] We investigate why discretization can be effective in naive-Bayes learning. We prove a theorem that identifies particular conditions under which discretization will result in naive-Bayes classifiers delivering the same probability estimates as would be obtained if the correct probability density functions were employed. We discuss the factors that might affect naive-Bayes classification error under discretization. We suggest that the use of different discretization techniques can affect the classification bias and variance of the generated classifiers. We argue that by properly managing discretization bias and variance, we can effectively reduce naive-Bayes classification error.
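As a rough illustration of what discretization means for naive Bayes, the sketch below replaces a numeric attribute with interval indices and estimates the class-conditional probability of each interval by smoothed frequency counts, in place of a density function. Equal-width binning is only one possible technique and is chosen here for brevity; it is not claimed to be the method analyzed in [14]. The function names `equal_width_bins` and `interval_probabilities` are hypothetical.

```python
from collections import Counter, defaultdict


def equal_width_bins(values, k):
    """Return a function mapping a number to one of k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # avoid zero width when all values are equal

    def to_bin(x):
        i = int((x - lo) / width)
        return min(max(i, 0), k - 1)  # clamp values at or beyond the range edges

    return to_bin


def interval_probabilities(values, labels, k):
    """Estimate P(interval | class) with Laplace smoothing."""
    to_bin = equal_width_bins(values, k)
    counts = defaultdict(Counter)
    for x, y in zip(values, labels):
        counts[y][to_bin(x)] += 1
    probs = {}
    for y, c in counts.items():
        total = sum(c.values())
        probs[y] = [(c[b] + 1) / (total + k) for b in range(k)]
    return to_bin, probs


values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = ["a", "a", "a", "b", "b", "b"]
to_bin, probs = interval_probabilities(values, labels, k=2)
```

The number of intervals `k` is the knob behind the trade-off the abstract mentions: fewer, wider intervals give more samples per estimate (lower variance) but blur the underlying density (higher bias), and more, narrower intervals do the opposite.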
Reference [15] The Naïve Bayes classifier is a simple and accurate classifier. This paper shows that assuming the Naïve Bayes classifier model and applying Bayesian model averaging and the principle of indifference, an equally simple, more accurate and theoretically well founded classifier can be obtained.
Reference [16] This paper presents a generalization of the Naive Bayes Classifier. The method is specifically designed for binary classification problems commonly found in credit scoring and marketing applications. The Generalized Naive Bayes Classifier turns out to be a powerful tool for both exploratory and predictive analysis. It can generate accurate predictions through a flexible, non-parametric fitting procedure, while being able to uncover hidden patterns in the data. In this paper, the Generalized Naive Bayes Classifier and the original Bayes Classifier will be demonstrated. Also, important ties to logistic regression, the Generalized Additive Model