
Yishuo School District (36) | SPSS Statistical Analysis (46): K-Means Clustering

LearningYard Academy

"Share interests, spread happiness, broaden horizons, and leave good memories!" Hello everyone, this is your editor. Welcome back to LearningYard Academy. We will do our best to bring you more and better content.

In the last issue, we covered the basics of two-step cluster analysis. In this issue, we turn to K-means clustering. K-means clustering is an iterative clustering method for large samples in which the user specifies the number of clusters. It starts from an initial partition of the data and adjusts it step by step until it reaches the final classification. When the number of clusters is known in advance, K-means clustering runs quickly and uses little computer memory.

The general steps of K-means clustering are: 1. specify the number of clusters K; 2. choose K initial cluster centers; 3. assign each case to the nearest cluster center; 4. recompute the K cluster centers; 5. repeat steps 3 and 4 iteratively.
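The five steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, not SPSS's implementation: the function name `kmeans`, the use of Euclidean distance, the stopping rule (no case changes cluster), and the toy points are all assumptions made for the example.

```python
import math


def kmeans(points, initial_centers, max_iter=10):
    """Plain K-means: returns (centers, labels) for a list of numeric tuples."""
    # Steps 1 and 2: K is the number of initial centers supplied by the caller
    centers = [tuple(c) for c in initial_centers]
    labels = [None] * len(points)
    for _ in range(max_iter):                      # step 5: iterate
        # Step 3: assign every point to its nearest center (Euclidean distance)
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:                   # no point changed cluster: stop
            break
        labels = new_labels
        # Step 4: recompute each center as the mean of its members
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:                            # an empty cluster keeps its old center
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


# Made-up two-dimensional points, NOT the students' data from the article
toy_points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, labels = kmeans(toy_points, initial_centers=[toy_points[0], toy_points[3]])
print(labels)
print(centers)
```

Here the initial centers are simply two of the toy points picked by hand. Statistical packages usually pick the initial centers automatically when none are given.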

Let's work through a practical example. The psychological state and learning outcomes of 12 university students in an "Advanced Mathematics" course were measured on four factors: learning motivation, learning attitude, self-perception, and learning outcome. The data are shown in the figure below. The task is to divide the 12 students into 3 clusters in order to analyze the learning outcomes of students in different psychological states.

Step 1: analyze and organize the data. Because the number of clusters is known to be 3, K-means clustering is appropriate. Organize the data as shown in the figure, set the data type of the ID ("编号") variable to string, and use it as the case-labeling variable.

Step 2: configure the K-means cluster analysis. Apply the settings shown in the figure below.

Step 3: main results and analysis. As the figure below shows, because no initial cluster centers were specified, the output lists the initial cluster centers chosen by the system. Comparing them with the original data shows that they are cases 1, 6, and 7.

After the first iteration, the centers of the 3 clusters moved by 8.193, 9.889, and 13.472, respectively. In total, 10 iterations were performed, at which point the clustering requirements were met and the analysis ended.
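To make the idea of a center "changing" by some amount concrete, here is a small Python sketch that records how far each center moves in every iteration. It assumes the change is measured as the Euclidean distance between a center's old and new positions, and that iteration stops once no center moves more than a tolerance `tol` or a maximum number of iterations is reached. The function name `iteration_history`, its parameters, and the toy values are illustrative assumptions, not SPSS's code or the article's data.

```python
import math


def iteration_history(points, centers, max_iter=10, tol=0.0):
    """Run K-means and record how far each center moves in every iteration."""
    centers = [tuple(c) for c in centers]
    history = []
    for _ in range(max_iter):
        # Assign each point to the index of its nearest center
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Recompute centers as member means (an empty cluster keeps its old center)
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            new_centers.append(
                tuple(sum(col) / len(members) for col in zip(*members)) if members else old
            )
        # Change per center = Euclidean distance between old and new position
        shifts = [math.dist(old, new) for old, new in zip(centers, new_centers)]
        history.append(shifts)
        centers = new_centers
        if max(shifts) <= tol:       # stop once no center moves more than tol
            break
    return centers, history


# Made-up one-dimensional values, NOT the article's data
toy = [(0.0,), (2.0,), (10.0,), (14.0,)]
final_centers, history = iteration_history(toy, centers=[(0.0,), (10.0,)])
for i, shifts in enumerate(history, start=1):
    print(i, [round(s, 3) for s in shifts])
```

Each row printed corresponds to one iteration, with one change value per cluster. A row of zeros means the centers have stopped moving.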

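Preview of the next issue: in this issue, we covered the theory and basic application of K-means clustering. In the next issue, we will cover the theory and a worked example of hierarchical clustering.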

That's all for today. If you have thoughts of your own about this article, please leave us a message. See you tomorrow, and have a happy day!

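References: Baidu Baike (百度百科); 《SPSS 23 统计分析实用教程》 (A Practical Tutorial on SPSS 23 Statistical Analysis)

Translation: Baidu Translate (百度翻译)

This article is an original work by learningyard新学苑. Some of the text and images come from other sources. If there is any infringement, please contact us for removal.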
