College Common Data Set

Not all of this data is needed on a regular basis, and whatever is not needed should be filtered out of the query tables. For example, when building a classification or prediction model, it may be adequate to sample the table first and then mine the sample. This is usually faster and less expensive than mining the full table.
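The sample-then-mine idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table is held in memory as a list of records, a simple random sample without replacement is adequate, and the names `sample_rows`, `table`, and the `id`/`score` fields are hypothetical rather than taken from any particular system.

```python
import random


def sample_rows(rows, fraction, seed=0):
    """Return a simple random sample of `rows`, drawn without replacement."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not rows:
        return []
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    # Keep at least one row, and never ask for more rows than exist.
    k = min(len(rows), max(1, round(len(rows) * fraction)))
    return rng.sample(rows, k)


# Hypothetical table: 10,000 records with made-up fields.
table = [{"id": i, "score": i % 100} for i in range(10_000)]

# Mine a 5% sample instead of the full table.
sample = sample_rows(table, fraction=0.05)
print(len(sample), "of", len(table), "rows selected")
```

For a table that lives in a database, the same effect would normally be achieved in the query itself, so that only the sampled rows are ever transferred.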

There are several potential sources of data that may be useful in a data mining application:

  • Census data
  • Sales records
  • Mailing lists
  • Demographic databases

These sources should all be explored before the data is analyzed. The selected data may be spread across multiple tables. Developing a sound model involves combining the relevant parts of these separate tables into a single database for mining purposes.
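As a rough sketch of combining parts of separate tables into one table for mining, the following Python example uses the standard-library `sqlite3` module with an in-memory database. The table names (`customers`, `sales`, `mining_input`), the columns, and the rows are all hypothetical and exist only to illustrate the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical source tables, e.g. a demographic table and sales records.
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'North'), (2, 'South'), (3, 'East');
INSERT INTO sales VALUES (10, 1, 20.0), (11, 1, 5.0), (12, 2, 7.5);
""")

# Combine only the needed columns into a single table, one row per customer.
# LEFT JOIN keeps customers that have no sales records.
conn.execute("""
CREATE TABLE mining_input AS
SELECT c.customer_id,
       c.region,
       COUNT(s.sale_id)             AS n_sales,
       COALESCE(SUM(s.amount), 0.0) AS total_amount
FROM customers AS c
LEFT JOIN sales AS s ON s.customer_id = c.customer_id
GROUP BY c.customer_id, c.region
""")

rows = conn.execute(
    "SELECT customer_id, region, n_sales, total_amount "
    "FROM mining_input ORDER BY customer_id"
).fetchall()
for row in rows:
    print(row)
```

Aggregating to one row per customer is a design choice for this illustration; other mining tasks may need one row per transaction instead.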