Database of Simulated Test Data

The CEP Series offers two extensive files of simulated test data sets that have been presented and analyzed in numerous articles.

These simulated data sets vary in several properties, including level of heterogeneity, number of underlying gradients, noise level, sample number and distribution, and the presence of partial or complete disjunction and of outliers of two types.

The test data sets are deliberately varied to reveal how multivariate methods perform under a wide range of circumstances, which facilitates a balanced assessment of the merits of new methods.

A file with 24 data sets is described in the documentation for ORDIFLEX and is included with both the mainframe and microcomputer versions of that program.

A larger group of 30 data sets is included at no charge with any order for the machine-readable source of the FORTRAN-IV versions of the CEP programs. For PC users and other interested individuals, these data sets are also available separately.

These data sets are of great interest for understanding how multivariate analyses treat data of known underlying structure.

Dr. Richard E. Furnas
