Professor John McCarthy
Father of AI

Phenomenal Data Mining: From Data to Phenomena

Phenomenal data mining concerns establishing relations among the phenomena underlying data. This is essentially the same article as Phenomenal Data Mining, updated for inclusion in SIGKDD Explorations.

Phenomenal data mining finds relations between the data and the phenomena that give rise to data rather than just relations among the data.

For example, suppose supermarket cash register data does not identify cash customers. Nevertheless, there really are customers, and these customers are characterized by sex, age, ethnicity, tastes, income distribution, and sensitivity to price changes. A data mining program might be able to identify which baskets of purchases are likely to have been made by the same customers. In this example, the receipts are the data, and the customers are phenomena not directly represented in the data. Once the "baskets" of purchases are grouped by customer, the way is open to infer further phenomena about the customers, e.g. their sex, age, etc.
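As a rough illustration of grouping baskets by likely customer, the sketch below assigns each receipt to the first group whose accumulated items overlap it enough, measured by Jaccard similarity. This is an assumption made for illustration, not the method of the article: the function names, the threshold of 0.5, and the sample receipts are all hypothetical, and a realistic program would use far richer evidence (time of day, store, quantities, prices).

```python
from typing import AbstractSet, List, Set


def jaccard(a: AbstractSet[str], b: AbstractSet[str]) -> float:
    """Overlap between two item sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_baskets(baskets: List[AbstractSet[str]], threshold: float = 0.5) -> List[int]:
    """Assign each basket a hypothetical customer id (greedy, order-dependent).

    A basket joins the first existing group whose accumulated item set is
    similar enough; otherwise it starts a new group.
    """
    profiles: List[Set[str]] = []  # union of items seen so far per inferred customer
    labels: List[int] = []
    for basket in baskets:
        for cid, profile in enumerate(profiles):
            if jaccard(basket, profile) >= threshold:
                profile |= basket
                labels.append(cid)
                break
        else:
            # No existing group was similar enough: start a new one.
            profiles.append(set(basket))
            labels.append(len(profiles) - 1)
    return labels


# Hypothetical receipts; the customers themselves never appear in the data.
receipts = [
    frozenset({"oat milk", "tofu", "rice"}),
    frozenset({"beer", "chips", "salsa"}),
    frozenset({"oat milk", "tofu", "noodles"}),
    frozenset({"beer", "chips"}),
]
customer_ids = group_baskets(receipts)
```

Here `customer_ids` plays the role of the inferred phenomenon: a label per basket that the original receipts did not contain.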

This article concerns what can be inferred by programs about phenomena from data and what facts are relevant to doing this. We work mainly with the supermarket example, but the idea is general.

In order to infer phenomena from data, facts about their relations must be supplied. Sometimes these facts can be implicit in the programs that look for the phenomena, but more generality is achieved if the facts are represented as sentences of logic in a knowledge base used by the programs.
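To make the contrast concrete, here is a minimal sketch in which the facts relating data to phenomena live in a small knowledge base rather than inside the mining code. It assumes a drastic simplification, propositional Horn rules with naive forward chaining, whereas the article speaks of sentences of logic in general. The rule contents and all names are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

# (premises, conclusion): a propositional Horn clause.
Rule = Tuple[FrozenSet[str], str]

# Hypothetical domain facts, kept as data rather than hard-coded in the miner.
KNOWLEDGE_BASE: List[Rule] = [
    (frozenset({"buys_diapers"}), "has_infant"),
    (frozenset({"has_infant", "buys_baby_food"}), "household_with_children"),
]


def forward_chain(observations: Set[str], rules: List[Rule]) -> Set[str]:
    """Apply the rules repeatedly until no new conclusion can be added."""
    known = set(observations)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


inferred = forward_chain({"buys_diapers", "buys_baby_food"}, KNOWLEDGE_BASE)
```

Because the rules are data, the same `forward_chain` function could be reused with a different knowledge base for a different domain, which is the kind of generality the paragraph above points to.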

The result of phenomenal data mining might include an extended database, with additional fields on existing relations as well as new relations. Thus the relations describing supermarket baskets might be extended with a customer field, and new relations about customers and their properties might be introduced.
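The extension described above can be sketched with an in-memory SQLite database. The schema, column names, and values are hypothetical, and the basket-to-customer assignment is assumed to come from some earlier inference step; only the shape of the change (a new field on an existing relation plus a new relation) follows the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original data: baskets with no notion of a customer.
conn.execute("CREATE TABLE baskets (basket_id INTEGER PRIMARY KEY, total REAL)")
conn.executemany(
    "INSERT INTO baskets VALUES (?, ?)",
    [(1, 23.5), (2, 8.0), (3, 19.25)],
)

# Extend the existing relation with a field for the inferred phenomenon.
conn.execute("ALTER TABLE baskets ADD COLUMN customer_id INTEGER")

# Hypothetical output of an earlier grouping step: basket_id -> customer_id.
inferred_owner = {1: 100, 2: 101, 3: 100}
conn.executemany(
    "UPDATE baskets SET customer_id = ? WHERE basket_id = ?",
    [(cid, bid) for bid, cid in inferred_owner.items()],
)

# Introduce a new relation about customers and an inferred property.
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, price_sensitive INTEGER)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(100, 1), (101, 0)])

# The extended database now supports per-customer queries.
rows = conn.execute(
    "SELECT c.customer_id, COUNT(*), SUM(b.total) "
    "FROM baskets b JOIN customers c ON c.customer_id = b.customer_id "
    "GROUP BY c.customer_id ORDER BY c.customer_id"
).fetchall()
```

Each row of `rows` holds a customer id, that customer's basket count, and the summed basket totals, a query that could not be expressed against the original receipts alone.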

Download the article in PDF.