Translation request: no machine translation, please. Thanks.

Source: Baidu Zhidao  Editor: UC Zhidao  Time: 2024/06/20 05:29:32
PSYCHOMETRIKA—VOL. 73, NO. 1, 125–144
MARCH 2008
DOI: 10.1007/S11336-007-9019-Y

SELECTION OF VARIABLES IN CLUSTER ANALYSIS: AN EMPIRICAL COMPARISON OF EIGHT PROCEDURES

DOUGLAS STEINLEY
UNIVERSITY OF MISSOURI-COLUMBIA

MICHAEL J. BRUSCO
FLORIDA STATE UNIVERSITY

Eight different variable selection techniques for model-based and non-model-based clustering are evaluated across a wide range of cluster structures. It is shown that several methods have difficulties when non-informative variables (i.e., random noise) are included in the model. Furthermore, the distribution of the random noise greatly impacts the performance of nearly all of the variable selection procedures. Overall, a variable selection technique based on a variance-to-range weighting procedure coupled with the largest decreases in within-cluster sums of squares error performed the best. On the other hand, variable selection methods used in conjunction with fi

Psychometrika, Vol. 73, No. 1, 125-144
March 2008
DOI: 10.1007/s11336-007-9019-y

Selection of Variables in Cluster Analysis: An Empirical Comparison of Eight Procedures

Douglas Steinley
University of Missouri-Columbia

Michael J. Brusco
Florida State University

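Eight different variable selection techniques, for model-based and non-model-based clustering, are evaluated across a wide variety of cluster structures. The results show that several methods run into difficulty when non-informative variables (i.e., random noise) are included in the model. Moreover, the distribution of the random noise greatly affects the performance of nearly all of the variable selection procedures. Overall, a variable selection technique based on a variance-to-range weighting procedure, combined with the largest decreases in the within-cluster sum of squares error, performed best. On the other hand, variable selection methods used in conjunction with finite mixture models performed worst. Key words: cluster analysis, variable selection.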