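Abstract
Cluster analysis, an unsupervised learning method, is an important data-processing tool in data mining. Clustering follows the principle of maximizing similarity among samples within a class and minimizing similarity between classes, without relying on any prior knowledge. Research on cluster analysis concentrates on clustering algorithms; many have been proposed, and fuzzy clustering is currently an active research topic. This work studies the fuzzy C-means (FCM) clustering algorithm as applied to image segmentation and finds that traditional FCM segments images with little noise well but performs poorly on heavily noisy images. The cause is that the Euclidean distance used by the algorithm is sensitive to noise, and the clustering result is affected by outliers and by unbalanced sample distributions. To address these problems, this thesis improves traditional FCM through its objective function, in the following ways:
(1) The traditional Euclidean distance is replaced with a non-Euclidean distance, which reduces the sensitivity to noise. On this basis, the normalization condition (the memberships of each sample across all classes sum to 1) is relaxed so that only the total membership of all samples across all classes is constrained to a fixed value, which reduces the influence of outliers and unbalanced sample distributions on the clustering result.
(2) A Gaussian kernel function is introduced to map samples from the input space into a high-dimensional feature space, so that a problem that is not linearly separable in the original space becomes linearly separable. The mapping also optimizes the sample features, and the final clustering is carried out in the feature space.
(3) The non-Euclidean distance, the relaxed normalization condition, and the Gaussian kernel function are combined to form the improved algorithm proposed in this work.
Finally, comparative image segmentation simulation experiments verify the effectiveness of the improved algorithm.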
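For reference, the quantities that the three points modify can be written out. The first line is the standard FCM objective with its per-sample normalization constraint; the second is the well-known identity for the distance induced by a Gaussian kernel. The notation (n samples x_k, c cluster centres v_i, fuzzifier m, kernel width σ, feature map Φ) is assumed here for illustration. The abstract does not state which non-Euclidean distance the improved algorithm uses, so the kernel-induced distance is shown only as one common choice.

```latex
J_m(U,V) = \sum_{i=1}^{c}\sum_{k=1}^{n} u_{ik}^{m}\,\lVert x_k - v_i\rVert^{2},
\qquad \text{s.t. } \sum_{i=1}^{c} u_{ik} = 1 \quad (k = 1,\dots,n)

K(x,v) = \exp\!\left(-\frac{\lVert x - v\rVert^{2}}{\sigma^{2}}\right),
\qquad
\lVert \Phi(x) - \Phi(v)\rVert^{2} = K(x,x) - 2K(x,v) + K(v,v) = 2\bigl(1 - K(x,v)\bigr)
```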
Keywords: FCM clustering algorithm; non-Euclidean distance; membership; Gaussian kernel function; feature space
Abstract
Cluster analysis, an unsupervised learning method, is an important data-processing tool in data mining. Clustering follows the principle of maximizing similarity among samples within a class and minimizing similarity between classes, without relying on any prior knowledge. Research on cluster analysis concentrates on clustering algorithms. Researchers have proposed many such algorithms, and fuzzy clustering is currently an active research topic. This work studies the Fuzzy C-Means (FCM) clustering algorithm as applied to image segmentation and finds that traditional FCM segments images with little noise well but performs poorly on heavily noisy images. The reason is that the Euclidean distance is sensitive to noise, and the clustering result is affected by outliers and by unbalanced sample distributions. To address these problems, traditional FCM is improved through its objective function, in the following ways:
(1) The traditional Euclidean distance is replaced with a non-Euclidean distance, which reduces the sensitivity to noise. On this basis, the normalization condition (the memberships of each sample across all classes sum to 1) is relaxed so that only the total membership of all samples across all classes is constrained to a fixed value, which reduces the influence of outliers and unbalanced sample distributions on the clustering result.
(2) A Gaussian kernel function is introduced to map samples from the input space into a high-dimensional feature space, so that a problem that is not linearly separable in the original space becomes linearly separable. The mapping also optimizes the sample features, and the final clustering is carried out in the feature space.
(3) The non-Euclidean distance, the relaxed normalization condition, and the Gaussian kernel function are combined to form the improved algorithm proposed in this work.
Finally, the effectiveness of the improved algorithm is verified by comparative image segmentation simulation experiments.
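The three points above can be illustrated with a minimal Python sketch of a Gaussian-kernel FCM. It is not the thesis's implementation: the abstract gives neither the exact non-Euclidean distance nor the value of the relaxed total, so the sketch assumes the common kernel-induced distance 1 - K(x, v) and exposes the total as a hypothetical parameter `total` (passing the number of samples is one plausible choice). The names `kernel_fcm`, `memberships`, `sigma`, and `v0` are likewise illustrative.

```python
import numpy as np

def gaussian_kernel(x, v, sigma):
    """K(x_k, v_i) = exp(-||x_k - v_i||^2 / sigma^2), shape (n, c)."""
    sq = ((x[:, None, :] - v[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / sigma ** 2)

def memberships(k, m, total=None):
    """Memberships from kernel values k (n, c).

    total=None : standard constraint, each row of u sums to 1.
    total=T    : relaxed constraint (assumption), all entries of u sum to T.
    """
    d = np.maximum(1.0 - k, 1e-12)      # kernel-induced distance, clamped
    w = d ** (-1.0 / (m - 1.0))
    if total is None:
        return w / w.sum(axis=1, keepdims=True)
    return total * w / w.sum()

def kernel_fcm(x, c, m=2.0, sigma=1.0, total=None, v0=None,
               iters=100, tol=1e-6, seed=0):
    """Alternate membership and centre updates until the centres settle."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    if v0 is None:
        # Assumed initialisation: c distinct samples chosen at random.
        rng = np.random.default_rng(seed)
        v = x[rng.choice(n, size=c, replace=False)].copy()
    else:
        v = np.asarray(v0, dtype=float).copy()
    for _ in range(iters):
        k = gaussian_kernel(x, v, sigma)
        u = memberships(k, m, total)
        g = (u ** m) * k                # weights u_ik^m * K(x_k, v_i)
        v_new = (g.T @ x) / g.sum(axis=0)[:, None]
        shift = np.abs(v_new - v).max()
        v = v_new
        if shift < tol:
            break
    # Recompute memberships so that u matches the returned centres.
    u = memberships(gaussian_kernel(x, v, sigma), m, total)
    return u, v
```

For image segmentation, `x` would typically hold one row per pixel (for example grey levels reshaped to a column), and `u.argmax(axis=1)` reshaped to the image size gives a label map. The kernel width `sigma` has to be chosen relative to the data scale, since a very small value makes all kernel values underflow toward zero.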
Key words: FCM clustering algorithm; Non-Euclidean distance; Membership; Gaussian kernel function; Feature space