Chapter 3. Description of Systems and Modeling

3.5.3 Principal Component Analysis (PCA)

(First review the relevant matrix operations.)

Find a small number of mutually uncorrelated composite factors to represent the many original variables, that is, reduce many descriptive indicators (parameters) to a few composite indicators.

Example: Let a dependent variable $W$ depend on two independent variables (descriptive indicators) $x_1$ and $x_2$, and suppose $n$ sets of data $x_{1j}, x_{2j}$ ($j = 1, \dots, n$) are available, namely

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \end{bmatrix}$$

Plot these $n$ data points in the $(x_1, x_2)$ coordinate system. Rotating the coordinate axes by an angle $\theta$ gives a new coordinate system $(y_1, y_2)$, in which the coordinates of the points are

$$Y = U X, \qquad U = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$$

where $U$ is an orthogonal transformation matrix, $U^T = U^{-1}$, $U U^T = I$. That is,

$$y_{1j} = x_{1j}\cos\theta + x_{2j}\sin\theta, \qquad y_{2j} = -x_{1j}\sin\theta + x_{2j}\cos\theta \qquad (j = 1, \dots, n)$$

If $y_1$ is taken along the major axis of the ellipse formed by the data points, the new coordinates have the following properties:

1) The $y_1$ and $y_2$ coordinates of the $n$ points $y_{1j}, y_{2j}$ ($j = 1, \dots, n$) are almost uncorrelated.

2) Most of the variance of the $n$ points comes from the component along the $y_1$ axis; the component along the $y_2$ axis contributes much less.

$y_1$ and $y_2$ are linear combinations of $x_1$ and $x_2$ and are called composite variables.

Because the variance along $y_1$ is larger, the differences among the data points show up mainly in this direction. If the $y_1$ coordinate alone is used to represent the original data, the loss of information is smallest, so $y_1$ is called the first principal component.

$y_2$ is orthogonal to $y_1$ and has the smaller variance; it is called the second principal component.

Suppose there are $p$ original indicators (descriptive variables) $x_1, x_2, \dots, x_p$ and $n$ samples (sets); after the principal component transformation
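
The two-variable rotation example can be checked numerically. The following is a minimal sketch, not the method prescribed by these notes: it assumes NumPy, uses synthetic data, and obtains the rotation from the eigenvectors of the sample covariance matrix of x1 and x2 (one common way to find the major-axis direction). The function name `rotate_to_principal_axes`, the variable `theta`, and the data-generating coefficients are illustrative assumptions.

```python
import numpy as np

def rotate_to_principal_axes(X):
    """Rotate 2-D data so that y1 lies along the direction of largest variance.

    X has shape (2, n): row 0 holds x1j, row 1 holds x2j.
    Returns the rotation matrix U, the rotated data Y = U @ X,
    and the rotation angle theta (radians).
    """
    C = np.cov(X)                          # 2x2 sample covariance of x1, x2
    eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]      # largest variance first
    V = eigvecs[:, order]                  # columns are the new axis directions
    if np.linalg.det(V) < 0:               # keep a proper rotation (det = +1)
        V[:, 1] = -V[:, 1]
    U = V.T                                # rows of U are the y1 and y2 directions
    Y = U @ X                              # coordinates in the rotated system
    theta = np.arctan2(U[0, 1], U[0, 0])   # U = [[cos, sin], [-sin, cos]]
    return U, Y, theta

# Synthetic, correlated data (assumed purely for illustration).
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.3 * rng.normal(size=n)
X = np.vstack([x1, x2])

U, Y, theta = rotate_to_principal_axes(X)

print("rotation angle (degrees):", np.degrees(theta))
print("U @ U.T:\n", U @ U.T)
print("correlation of y1 and y2:", np.corrcoef(Y)[0, 1])
print("variance along y1, y2:", Y[0].var(ddof=1), Y[1].var(ddof=1))
```

With the axes chosen this way, the printed correlation between y1 and y2 should be zero up to rounding error, and the variance along y1 should exceed the variance along y2, which corresponds to properties 1) and 2).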


