Keynote Speakers

  1. Dr. Rakesh Agrawal, IBM Almaden Lab., USA

    Rakesh Agrawal leads the Quest project at the IBM Almaden Research Center, which pioneered key data mining concepts and technologies. IBM's commercial data mining product, Intelligent Miner, grew out of this work. His research has been incorporated into other IBM products, including DB2 Mining Extender, DB2 OLAP Server and WebSphere Commerce Server. His technical contributions have also influenced several external commercial and academic products, prototypes and applications. Rakesh has published more than 100 research papers and has been granted 45 patents. He is the recipient of the ACM-SIGKDD First Innovations Award as well as the ACM-SIGMOD 2000 Innovation Award. He is an IBM Fellow and also a Fellow of IEEE.

    Keynote: Privacy Aware Data Management and Analytics
    Abstract:
      The explosive progress in networking, storage, and processor technologies is resulting in an unprecedented amount of digitization of information. In concert with this dramatic increase in digital data, concerns about the privacy of personal information have emerged globally. The concerns over massive collection of data are naturally extending to analytic tools applied to data. Data mining, with its promise to efficiently discover valuable, non-obvious information from large databases, is particularly vulnerable to misuse.

      Inspired by the privacy tenet of the Hippocratic Oath, we argue that future database systems must include responsibility for the privacy of the data they manage as a founding tenet. We enunciate the key principles for such Hippocratic database systems, distilled from the principles behind current privacy legislation and guidelines. We identify the technical challenges and problems in designing Hippocratic databases, and also outline some solution approaches.

      One way of preserving the privacy of individual data records would be to perturb them. Since the primary task in data mining is the development of models about aggregated data, we explore whether we can develop accurate models without access to precise information in individual data records. We consider the concrete case of building a decision-tree classifier from perturbed data. While it is not possible to accurately estimate original values in individual data records, we describe a reconstruction procedure to accurately estimate the distribution of original data values. By using these reconstructed distributions, we are able to build classifiers whose accuracy is comparable to the accuracy of classifiers built with the original data.
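      The idea of perturbing records and then recovering only the aggregate distribution can be illustrated with a minimal sketch. It is not the procedure presented in the talk: the Gaussian noise model, the fixed set of candidate values, the iterative Bayes-style update, and the names perturb and reconstruct_distribution are all assumptions made for this illustration.

```python
import math
import random


def perturb(values, noise_sd, rng):
    """Hide each original value by adding independent Gaussian noise (assumed noise model)."""
    return [v + rng.gauss(0.0, noise_sd) for v in values]


def reconstruct_distribution(perturbed, candidates, noise_sd, iterations=50):
    """Estimate how often each candidate value occurred, using perturbed values only.

    Hypothetical helper: starts from a uniform guess and repeatedly applies
    Bayes' rule to each perturbed value, then averages the posteriors.
    """
    probs = [1.0 / len(candidates)] * len(candidates)
    for _ in range(iterations):
        totals = [0.0] * len(candidates)
        for w in perturbed:
            # Likelihood of seeing w if the original value was c, times the
            # current estimate for c. The Gaussian normalising constant is the
            # same for every candidate, so it cancels and is omitted.
            weights = [
                p * math.exp(-((w - c) ** 2) / (2.0 * noise_sd ** 2))
                for c, p in zip(candidates, probs)
            ]
            norm = sum(weights)
            if norm == 0.0:
                continue  # value too far from every candidate to be informative
            for i, wt in enumerate(weights):
                totals[i] += wt / norm
        total = sum(totals)
        probs = [t / total for t in totals]
    return probs


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data for the demo only: 70% of records hold 30, 30% hold 50.
    original = [30] * 2800 + [50] * 1200
    noisy = perturb(original, noise_sd=5.0, rng=rng)
    print(reconstruct_distribution(noisy, [20, 30, 40, 50, 60], noise_sd=5.0))
```

      Individual records stay hidden behind the noise, yet the estimated frequencies can then feed a model that needs only distributions, such as the split selection in a decision-tree learner.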

      We will conclude by pointing out some open research problems.

  2. Dr. Jaideep Srivastava, Univ. of Minnesota, USA

    Dr. Jaideep Srivastava is a professor at the University of Minnesota. Between 1999 and 2001 he took a two-year leave, during which he spent time at Amazon.com and at Yodlee Inc. This industry experience has given him a distinctive perspective on the application of computer science technologies to many kinds of Web-based services. As a researcher, educator, consultant, and invited speaker in the areas of data mining, databases, artificial intelligence, and multimedia for over 15 years, Dr. Srivastava continues his active collaboration with the technology industry, both for research and technology transfer. An often-invited participant in technical and technology strategy forums, Dr. Srivastava has presented at a multitude of industry, academic and government meetings. He has been involved in the organization of a number of conferences, and serves on the editorial boards of various journals. The federal government has solicited his opinion on computer science research as an expert witness. He has also served in an advisory role to the governments of India and Chile on various software technologies. Dr. Srivastava received his B.Tech. in Computer Science from the Indian Institute of Technology - Kanpur, and his M.S. and Ph.D. in Computer Science from the University of California - Berkeley.

    Keynote: Web Mining - Accomplishments & Future Directions
    Abstract:
      From its very beginning, the potential of extracting valuable knowledge from the Web has been quite evident. Web mining - i.e. the application of data mining techniques to extract knowledge from Web content, structure, and usage - is the collection of technologies to fulfill this potential. Interest in Web mining has grown rapidly in its short existence, both in the research and practitioner communities. A number of new concepts, e.g. PageRank, hubs & authorities, web communities, and web interestingness measures, and techniques to compute them have been developed. In addition, a wide variety of commercial enterprises, e.g. Amazon, Yahoo, and Google, regularly use Web mining in their daily operations. This talk provides an overview of the accomplishments of the field - both in terms of technologies and applications - and outlines key future research directions.

PAKDD2003 Homepage : http://aitrc.kaist.ac.kr/~pakdd03