Anonymous Microsoft Web Data Data Set

Abstract: Log of anonymous users of the Microsoft web site; the task is to predict which areas of the web site a user visited, based on data about the other areas that user visited.

Jack S. Breese, David Heckerman, Carl M. Kadie
Microsoft Research, Redmond WA, 98052-6399, USA
breese '@', heckerma '@', carlk '@'


Breese, Heckerman, & Kadie

Data Set Information:

We created the data by sampling and processing the web site's logs. The data records the use of the site by 38,000 anonymous, randomly selected users. For each user, the data lists all the areas of the web site (Vroots) that the user visited in a one-week timeframe.

Users are identified only by a sequential number, for example, User #14988, User #14989, etc. The file contains no personally identifiable information. The 294 Vroots are identified by their title (e.g. "NetShow for PowerPoint") and URL (e.g. "/stream"). The data comes from one week in February, 1998.
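
As a rough illustration of the structure described above, the records can be pictured as a table of Vroots plus a mapping from user number to the Vroots that user visited. This is a minimal Python sketch, not the layout of the data files: the `Vroot` class, the `titles_visited` helper, the "/example" entry, and the visit sets are all hypothetical and chosen only for illustration. The title "NetShow for PowerPoint", the URL "/stream", and the user numbers come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vroot:
    """One area of the web site, identified by title and URL."""
    title: str
    url: str


# Vroots keyed by URL. The first entry is the example from the description;
# the second is a hypothetical placeholder.
vroots = {
    "/stream": Vroot("NetShow for PowerPoint", "/stream"),
    "/example": Vroot("Example Area", "/example"),
}

# Users are identified only by a sequential number.
# The visit sets below are hypothetical, not taken from the data.
visits = {
    14988: {"/stream"},
    14989: {"/stream", "/example"},
}


def titles_visited(user_id):
    """Return the sorted titles of the Vroots a user visited."""
    return sorted(vroots[url].title for url in visits[user_id])


print(titles_visited(14989))
```

Keying the Vroot table by URL is a design choice for this sketch only; any stable identifier would serve equally well.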

Attribute Information:

Each attribute is an area ("vroot") of the web site.

The datasets record which Vroots each user visited in a one-week timeframe in February 1998.
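
One way to picture these attributes is as a binary vector per user, with one position per Vroot. The sketch below builds such a vector and then ranks unvisited Vroots by how often they co-occur with Vroots the user is known to have visited. This is a simple illustrative baseline under stated assumptions, not one of the algorithms evaluated in the paper: the URLs, user numbers, visit sets, and the function names `to_binary_row` and `rank_unvisited` are all hypothetical.

```python
from collections import Counter

# Hypothetical Vroot URLs and visit sets, for illustration only.
vroot_urls = ["/a", "/b", "/c", "/d"]
visits = {
    1: {"/a", "/b"},
    2: {"/a", "/b", "/c"},
    3: {"/b", "/c"},
    4: {"/d"},
}


def to_binary_row(visited):
    """One attribute per Vroot: 1 if the user visited it, else 0."""
    return [1 if url in visited else 0 for url in vroot_urls]


def rank_unvisited(known, training):
    """Rank Vroots outside `known` by co-occurrence with known Vroots.

    A Vroot's score is the number of training users who visited it
    together with at least one Vroot in `known`. Ties are broken by URL.
    """
    scores = Counter()
    for visited in training.values():
        if known & visited:
            for url in visited - known:
                scores[url] += 1
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [url for url, _ in ranked]


print(to_binary_row({"/a", "/c"}))
print(rank_unvisited({"/a"}, visits))
```

With 294 Vroots and most users visiting only a few of them, the binary rows would be mostly zeros, so a set-based representation like `visits` is usually the more compact choice.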

Relevant Papers:

J. Breese, D. Heckerman, C. Kadie. _Empirical Analysis of Predictive Algorithms for Collaborative Filtering_. Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, Madison, WI, July 1998.

Also expanded as Microsoft Research Technical Report MSR-TR-98-12. The papers are available online at: [Web Link]

