The Pacific-Asia Conference on Knowledge Discovery and Data Mining

May 23-26, 2017, Jeju, South Korea



• Tutorial 1


Text Similarity Search within Homogenous and Heterogenous Document Collections


Adam Jatowt, Yating Zhang, Zhenglu Yang


Text similarity search, such as similarity search for short texts, has applications in many diverse areas. It can be used for query correction, query substitution, paraphrasing, analogy detection, data cleaning, question answering, and so on. Typically, a word or a sentence is taken as input, and the goal is to return similar words or sentences from a large pool of candidates. Searching for similar texts can be carried out within a single domain, such as one homogeneous document collection, or across domains. The latter case represents search over heterogeneous document collections, such as collections of documents published in different time periods, or documents originating from (or related to) diverse geographic areas, cultures, or scientific domains. For example, a user may wish to find corresponding entities across time by searching within temporal document archives, or to find analogous objects in diverse geographical spaces. Searching across different domains is particularly difficult because many aspects differ between domains, including vocabulary, context, and relationships.

In this tutorial, we will begin with a general overview of methods for text similarity search within homogeneous collections, along with their applications and importance. In particular, we will introduce effective and efficient approaches for searching for similar strings, words, and sentences in different kinds of text-based applications. We will then explain several techniques for finding semantically corresponding terms by analysing temporal document archives and other types of heterogeneous document collections. We also plan to review recent achievements in analogy detection and analogical search.

The goal of this tutorial is to give participants an overview of the different theories and techniques relevant to this field and to demonstrate the possibilities of novel search scenarios and functionalities.


Adam Jatowt is an Associate Professor at the Graduate School of Informatics at Kyoto University. He received his Ph.D. in Information Science and Technology from the University of Tokyo, Japan, in 2005. He has published more than 80 papers on text and data mining in international conferences and journals, including WWW, WSDM, ACL, KDD, TKDE, Ubicomp, and CIKM. He has served as a PC co-chair of JCDL2017, ICADL2014, and SocInfo2013, and as a tutorial co-chair at SIGIR2017. He has also been a PC member of over 100 conferences and workshops, including SIGIR, WSDM, WWW, CIKM, AAAI, and IJCAI. He has co-organized 10 international workshops on topics in text mining and information search.

Yating Zhang is a Postdoctoral Researcher at Riken AIP Center/NAIST. She received her Master's degree in Information System and Management from Carnegie Mellon University in 2012 and her Ph.D. in Social Informatics from Kyoto University, Japan, in 2016. She has published papers at international conferences and in journals such as ACL, WWW, IEEE BigData, TKDE, and TOIS. Her research interests include information retrieval, web archive search and mining, query suggestion, and Q&A.

Zhenglu Yang is a professor at the Institute of Computer and Control Engineering at Nankai University, China. He received his Ph.D. degree in Information Science and Technology from the University of Tokyo, Japan, in 2008. From 2008 to 2014, he worked as a faculty member at the Institute of Industrial Science, the University of Tokyo. He has published more than 50 papers on data mining, artificial intelligence, and natural language processing in conferences and journals including ACL, AAAI, IJCAI, WWW, CIKM, and TKDE.

• Tutorial 2


Advances in Recommender System


Guandong Xu


Recommender systems, an active area of information retrieval, focus on modelling users' preferences for, or ratings of, items they may be interested in. The suggestions provided are aimed at improving user experience and loyalty, facilitating decision-making for users, and creating more revenue for online businesses and merchants. The development of recommender systems is a multi-disciplinary effort that involves experts from various fields, such as Artificial Intelligence, Human Computer Interaction, Information Technology, Data Mining, Statistics, Adaptive User Interfaces, Decision Support Systems, Marketing, and Consumer Behaviour.

In this tutorial, we will systematically address the basic but practical concepts and techniques of recommender systems, covering traditional recommender systems as well as state-of-the-art techniques, e.g., content-based, sequence-based, and graph-based systems. Specifically, we will introduce matrix factorization and learning-to-rank techniques in the section on traditional recommender systems. Side information representation based on deep learning will then be shown in the content-based section, followed by recommender systems based on sequential information. Finally, some graph-based methods will be presented, with point-of-interest recommendation as a case study.

The tutorial targets audiences who are conducting research or pursuing research degrees in related areas, as well as those working in engineering domains such as e-commerce, e-marketing, and business analytics and intelligence. The tutorial is expected to last two and a half hours.


Guandong Xu received his Ph.D. degree in Computer Science from Victoria University, Australia. He is currently an Associate Professor (Reader) in the School of Software and the Advanced Analytics Institute at the University of Technology Sydney. He has authored three monographs with Springer and CRC Press, and 100+ journal and conference papers. His current research interests include data science and data analytics, Web data mining, behaviour analytics, recommender systems, predictive analytics, and social network analysis. His research has received grant funding from the Australian and Chinese governments, e.g., ARC and NSFC grants, as well as funding from industry projects. In the last decade, he has had over 100 publications in venues including TOIS, TNNLS, TIFS, TSC, Inf Sci, IEEE-IS, IJCAI, AAAI, WWW, ICDE, ICDM, and CIKM. He is the Assistant EiC of the WWW Journal and serves on the editorial boards of, or as a guest editor for, several international journals. He received the Australian BigInsight Data Analytics Award in December 2016 for his significant impact on Best Customer Insights.

• Tutorial 3


Large Scale Tensor Analysis


U Kang


Many real-world data are naturally represented as tensors, or multi-dimensional arrays. Tensor analysis is an important tool for various applications such as latent concept discovery, trend analysis, clustering, and anomaly detection. In this tutorial, I will describe theories for tensor analysis. Next, I will explain algorithms for tensor analysis, focusing on scalability. Finally, I will discuss applications of tensor analysis.


U Kang is an assistant professor in the Department of Computer Science and Engineering at Seoul National University. He received his Ph.D. in Computer Science from Carnegie Mellon University, after receiving his B.S. in Computer Science and Engineering from Seoul National University. He won the 2013 SIGKDD Doctoral Dissertation Award, the 2013 New Faculty Award from Microsoft Research Asia, the 2016 Korean Young Information Scientist Award, and two best paper awards. He has published over 50 refereed articles in major data mining and database venues. He holds four U.S. patents. His research interests include big data mining.
