The 19th Pacific-Asia Conference on

Knowledge Discovery and Data Mining

Session Details

Announced on April 27, 2015

Workshop Day

May 19, 2015 (Tuesday)

08:30 - 09:30

BigPMA: The 2nd Workshop on Pattern Mining and Application of Big Data

Location: Sunflower Ballroom A

 

Session 1

Chair: Shih-Hao Chang

08:30

08:50

From Cluster-based Outlier Detection to Time Series Discord Discovery

Nguyen Huy Kha, Duong Tuan Anh (Ho Chi Minh City University of Technology)

08:50

09:10

ProbitUCB: A Novel Method for Review Ranking

Wanying Ding, Yue Shang (Drexel University), Dae Hoon Park (University of Illinois at Urbana-Champaign), Lifan Guo (TCL Research America), Xiaohua Hu (Drexel University)

09:10

09:30

Web site audience segmentation using hybrid alignment techniques

Vinh-Trung Luu, Germain Forestier, Frédéric Fondement, Pierre-Alain Muller (Université de Haute-Alsace)

VLSP: The 3rd Workshop on Vietnamese Language and Speech Processing

Location: Sunflower Ballroom B

08:30

09:00

Author Profiling for Vietnamese Forum Posts
Duc Tran Duong, Son Bao Pham and Hanh Tan

09:00

09:30

Syntax-Based Statistical machine translation approach for solving the diacritization problem
Nguyen Minh Tuan and Nguyen Minh Hang

PAISI: Pacific Asia Workshop on Intelligence and Security Informatics

Location: Daisy Room

08:30 09:30

Opening and Keynote Speech

TBC

QIMIE: Workshop on Quality Issues, Measures of Interestingness and Evaluation of Data Mining Models

Location: Lotus Room

08:30  08:35

Opening

08:35  09:30

Keynote: Classifier Evaluation Under Changing Scenarios

Nitesh Chawla

10:00 - 12:00

BigPMA: The 2nd Workshop on Pattern Mining and Application of Big Data

Location: Sunflower Ballroom A

 

Session 1 (cont.)

Chair: Shih-Hao Chang

10:00

10:20

Mining Massive-Scale Spatiotemporal Trajectories in Parallel: A Survey
Pengtao Huang, Bo Yuan (Tsinghua University)

10:20

10:40

Manifold Regularized Symmetric Joint Link Model for Overlapping Community Detection
Hao Chen, Xianchao Zhang, Wenxin Liang, Feng Ding (Dalian University of Technology)

10:40

11:00

High Dimensional Explicit Feature Biased Matrix Factorization Recommendation
Weibin Sun, Xianchao Zhang, Wenxin Liang, Zengyou He (Dalian University of Technology)

11:00

11:20

A Simhash-based Generalized Framework for Citation Matching in MapReduce
Pengsen Wang, Bin Wu, Xiaoming Li, Lin Wang, Bai Wang (University of Posts and Telecommunications)

11:20

11:40

A Dynamic Feature Selection Based LDA Approach to Baseball Pitch Prediction
Phuong Hoang (North Carolina State University), Michael Hamilton (Columbia University), Joseph Murray (ZeroFOX Company), Corey Stafford (Columbia University), Hien Tran (North Carolina State University)

VLSP: The 3rd Workshop on Vietnamese Language and Speech Processing

Location: Sunflower Ballroom B

10:00

10:30

Utilizing Vietnamese Sentiment Analysis for Online Reputation Management Platform
Cong Thanh Le, Manh Hung Nguyen, Hoang Phuong Dinh and Viet Cuong Nguyen

10:30

11:00

An improved method for Vietnamese speech segmentation based on method of waveform image representation
Long Hoang Nguyen and Lung Duc Vu

11:00

11:30

Modeling Vietnamese speech prosody: A step-by-step approach towards an expressive speech synthesis system
Dang-Khoa Mac and Do-Dat Tran

11:30

12:00

Hybrid Deep Neural Network-Hidden Markov Model for Vietnamese Large Vocabulary Continuous Speech Recognition System
Quoc Bao Nguyen and Tat Thang Vu

PAISI: Pacific Asia Workshop on Intelligence and Security Informatics

Location: Daisy Room

 

Session 1: Social Media Intelligence

Chair: TBC

10:00

10:30

Media REVEALr: A Social Multimedia Monitoring and Intelligence System for Web Multimedia Verification

Katerina Andreadou, Symeon Papadopoulos, Lazaros Apostolidis, Anastasia Krithara, and Yiannis Kompatsiaris

10:30

11:00

Geotagging Social Media Content with a Refined Language Modelling Approach
Giorgos Kordopatis-Zilos, Symeon Papadopoulos, and Yiannis Kompatsiaris

11:00

11:30

Predicting Vehicle Recalls with User-Generated Contents: A Text Mining Approach

Xuan Zhang, Shuo Niu, Da Zhang, G. Alan Wang, and Weiguo Fan

11:30

12:00

GCM: A Greedy-based Cross-Matching Algorithm for Identifying Users across Multiple Online Social Networks

Wenxin Liang, Bo Meng, Xiaosong He, and Xianchao Zhang

QIMIE: Workshop on Quality Issues, Measures of Interestingness and Evaluation of Data Mining Models

Location: Tulip Room

 

Session 1: Network and Community Analysis

Chair: TBC

10:00  10:30

Analyzing User Behaviors Based on Temporal Patterns of Sequential Pattern Evaluation Indices on Twitter

Hidenao Abe

10:30

11:00

Evaluation of Community Mining Algorithms in the Presence of Attributes

Reihaneh Rabbany and Osmar Zaïane

 

Session 2: Clustering

Chair: TBC

11:00  11:30

Internal Clustering Evaluation of Data Streams

Marwan Hassani and Thomas Seidl

11:30

12:00

Feature maximization based clustering quality evaluation: a promising approach

Jean-Charles Lamirel, Shadi Al Shehabi

13:30 - 15:00

BigPMA: The 2nd Workshop on Pattern Mining and Application of Big Data

Location: Sunflower Ballroom A

 

Session 2

Chair: Yi-Cheng Chen

13:30

13:50

A Cloud Based Diabetes Lifestyle Management System
Shih-Hao Chang (Tamkang University)

13:50

14:10

Big Data Generation: Application of Mobile Healthcare
Lin Hui, Huan-Chao Keh (TamKang University), Nan-Ching Huang (National Yang-Ming University), Chiung-Tzu Chang, Yi-Fan Yang (TamKang University)

14:10

14:30

Construction of a prediction model for nephropathy among obese patients using genetic and clinical features
Guan-Mau Huang (Yuan Ze University), Yi-Cheng Chen (Tamkang University), Julia Tzu-Ya Weng (Yuan Ze University)

VLSP: The 3rd Workshop on Vietnamese Language and Speech Processing

Location: Sunflower Ballroom B

13:30

14:00

Fast Dependency Parsing using Distributed Word Representations
Phuong Le-Hong, Thi-Minh-Huyen Nguyen and Thi-Luong Nguyen

14:00

14:30

Building a LTAG syntax-semantic interface for Vietnamese
Minh Hai Nguyen, Thi Huyen Nguyen, Thi Minh Huyen Nguyen and The Quyen Ngo

PAISI: Pacific Asia Workshop on Intelligence and Security Informatics

Location: Daisy Room

 

Session 2: Fraud Detection

Chair: TBC

13:30

14:00

P2P Lending Fraud Detection: A Big Data Approach

Jennifer Xu, Yong Lu, and Michael Chau

14:00

14:30

Drug Anti-forgery and Tracing System Based on Lightweight Asymmetric Identities

Shenghui Su, Na Li, and Shuwang Lu

QIMIE: Workshop on Quality Issues, Measures of Interestingness and Evaluation of Data Mining Models

Location: Tulip Room

 

Session 3: Interestingness Measures

Chair: TBC

13:30  14:00

A Study of Interestingness Measures for Associative Classification on Imbalanced Data

Guangfei Yang and Xuejiao Cui

14:00

14:30

Model Selection of Symbolic Regression to Improve the Accuracy of PM2.5 Concentration Prediction

Guangfei Yang and Jian Huang

14:30

15:00

Leveraging the Common Cause of Errors for Constraint-based Data Cleansing

Ayako Hoshino, Hiroki Nakayama, Chihiro Ito, Kyota Kanno and Kenshi Nishimura

15:00 - 18:10

VLSP: The 3rd Workshop on Vietnamese Language and Speech Processing

Location: Sunflower Ballroom B

15:00

15:30

Semantic Role Labeling in Bilingual English-Vietnamese Corpus
Huynh Quang Duc and Tran Le Tam Linh

15:30

16:00

Named Entity Alignment in an English-Vietnamese Bilingual Corpus
Long Hong Buu Nguyen and Dien Dinh

16:00

17:00

General Discussion - Closing

PAISI: Pacific Asia Workshop on Intelligence and Security Informatics

Location: Daisy Room

 

Session 3: Text Mining

Chair: TBC

15:00

15:30

Chinese Word POS Tagging with Markov Logic

Zhihua Liao

15:30

16:00

In Search of Plagiarism Behaviors: An Empirical Study of Online Reviews

Zhuolan Bao and Michael Chau

DAEBH: Workshop on Data Analytics for Evidence-Based Healthcare

Location: Tulip Room

15:00

15:10

Workshop Opening

Dr. Xujuan Zhou

15:10

16:10

Keynote I: Evidence Mining Systems

Dr. Guy Tsafnat (Chair: Dr. Wei Liu)

16:10

17:10

 

Keynote II: From Big Data Analytics in Healthcare to New Generic Data Mining Approaches

Prof. Osmar Zaiane (Chair: Dr. Xujuan Zhou)

 

Presentation Session (Chair: Dr. Wei Liu)

17:10

17:30

Learning Entry Profiles of Children with ASD from Multivariate Treatment Information using Restricted Boltzmann Machines

Pratibha Vellanki, Dinh Phung, Thi Duong and Svetha Venkatesh

17:30

17:50

Citation Enrichment Improves Deduplication of Primary Evidence

Miew Keen Choong, Sarah Thorning and Guy Tsafnat

17:50

18:10

Integrating Content Centric Networking and Web Content Mining: A Future Efficient Internet Architecture for Healthcare

Rabia Bashir and Sajjad Akbar



Main Conference

Notes:
- Each L paper is allocated 20 minutes for presentation plus 5 minutes for Q&A.
- Each R paper is allocated 15 minutes for presentation plus 5 minutes for Q&A.
- No-shows will be handled strictly in accordance with PAKDD policy.
 

May 20, 2015 (Wednesday)

 

09:00 - 10:00

 

KEYNOTE I: Online and Batch Learning with Interventions

Thorsten Joachims
Session Chair: Jaideep Srivastava
Location: Lotus Ballroom

 

10:20 - 12:30

 

10:20

12:30

TUTORIAL I: Crowdsourcing for Big Data Analytics

Presenters: Hisashi Kashima, Satoshi Oyama, and Yukino Baba

Location: Tulip Room

 

10:20

12:30

TUTORIAL II: Differential Privacy and Its Applications

Presenters: Gang Li, Tianqing Zhu, and Wanlei Zhou

Location: Daisy Room

 

SESSION 1A: Social Networks and Social Media
Session Chair: Kyuseok Shim
Location: Lotus Ballroom A

 

10:20

10:45

Maximizing Friend-Making Likelihood for Social Activity Organization (L)

Chih-Ya Shen, De-Nian Yang, Wang-Chien Lee, and Ming-Syan Chen

 

10:45

11:10

What Is New in Our City? A Framework for Event Extraction Using Social Media Posts (L)

Chaolun Xia, Jun Hu, Yan Zhu, and Mor Naaman

 

11:10

11:35

Link Prediction in Aligned Heterogeneous Networks (L)

Fangbing Liu and Shu-Tao Xia

 

11:35

11:55

Scale-Adaptive Group Optimization for Social Activity Planning (R)

Hong-Han Shuai, De-Nian Yang, Philip S. Yu, and Ming-Syan Chen

 

11:55

12:15

Influence Maximization across Partially Aligned Heterogenous Social Networks (R)

Qianyi Zhan, Jiawei Zhang, Senzhang Wang, Philip S. Yu, and Junyuan Xie

 

SESSION 1B: Classification
Session Chair: Latifur Khan 
Location: Lotus Ballroom B

 

10:20

10:40

Double Ramp Loss Based Reject Option Classifier (R)

Naresh Manwani, Kalpit Desai, Sanand Sasidharan, and Ramasubramanian Sundararajan

 

10:40 11:00

Efficient Methods for Multi-label Classification (R)

Chonglin Sun, Chunting Zhou, Bo Jin, and Francis C.M. Lau

 

11:00  11:20

A Coupled k-Nearest Neighbor Algorithm for Multi-label Classification (R)

Chunming Liu and Longbing Cao

 

11:20 11:40

Learning Topic-Oriented Word Embedding for Query Classification (R)

Hebin Yang, Qinmin Hu, and Liang He

 

11:40 12:00

Reliable Early Classification on Multivariate Time Series with Numerical and Categorical Attributes (R)

Yu-Feng Lin, Hsuan-Hsu Chen, Vincent S. Tseng, and Jian Pei

 

12:00 12:20

Document Classification based on Distributed Document Representation: A Supervised Deep Learning Framework (R)

Rumeng Li and Hiroyuki Shindo

 

SESSION 1C: Machine Learning
Session Chair: Hoai-An Le-Thi
Location: Sunflower Ballroom A

 

10:20 10:45

Collaborating Differently on Different Topics: A Multi-Relational Approach to Multi-Task Learning (L)

Sunil Kumar Gupta, Santu Rana, Dinh Phung, and Svetha Venkatesh

 

10:45  11:05

A Bayesian Nonparametric Approach to Multilevel Regression (R)

Vu Nguyen, Dinh Phung, Svetha Venkatesh, and Hung H. Bui

 

11:05  11:25

Learning Conditional Latent Structures from Multiple Data Sources (R)

Viet Huynh, Dinh Phung, Long Nguyen, Svetha Venkatesh, and Hung H. Bui

 

11:25  11:45

Collaborative Multi-view Learning with Active Discriminative Prior for Recommendation (R)

Qing Zhang and Houfeng Wang

 

11:45  12:05

Online and Stochastic Universal Gradient Methods for Minimizing Regularized Holder Continuous Finite Sums in Machine Learning (R)

Ziqiang Shi and Rujie Liu

 

12:05  12:25

Multi-Task Metric Learning on Network Data (R)

Chen Fang and Daniel N. Rockmore

 

SESSION 1D: Applications
Session Chair: Xiaoli Li
Location: Sunflower Ballroom B

 

10:20  10:45

On Damage Identification in Civil Structures Using Tensor Analysis (L)

Nguyen Lu Dang Khoa, Bang Zhang, Yang Wang, Wei Liu, Fang Chen, Samir Mustapha, and Peter Runcie

 

10:45  11:10

Predicting Smartphone Adoption in Social Networks (L)

Le Wu, Yin Zhu, Nicholas Jing Yuan, Enhong Chen, Xing Xie, and Yong Rui

 

11:10  11:30

Discovering the Impact of Urban Traffic Interventions Using Contrast Mining on Vehicle Trajectory Data (R)

Xiaoting Wang, Christopher Leckie, Hairuo Xie, and Tharshan Vaithianathan

 

11:30  11:50

Locating Self-collection Points for Last-mile Logistics using Public Transport Data (R)

Huayu Wu, Dongxu Shao, and Wee Siong Ng

 

11:50  12:10

A Stochastic Framework for Short-term Solar Irradiance Forecasting (R)

Jin Xu, Shinjae Yoo, Dantong Yu, Hao Huang, Dong Huang, John Heiser, and Paul Kalb

 

12:10  12:30

Online Prediction of Chess Match Result (R)

Mohammad M. Masud, Ameera Al-Shehhi, Eiman Al-Shamsi, Shamma Al-Hassani, Asmaa Al-Hamoudi, and Latifur Khan

 

14:00 - 16:10

 

14:00

16:10

TUTORIAL III: Behavior Computing: Deep Behavior Analytics and Active Behavior Management

Presenter: Longbing Cao

Location: Tulip Room

 

CONTEST: Gender Prediction Based on E-commerce Data

Chairs: Hung Son Nguyen, Nitesh Chawla, and Nguyen Duc Dung

Location: Daisy Room

 

14:00

Welcome and Introduction

Hung Son Nguyen, Poland

 

14:00 14:20

FRDC's approach at PAKDD’15 Data Mining Competition

Miao Qingliang, Fujitsu Research & Development Center Co., Ltd., China

 

14:20 14:40

The combination of supervised and unsupervised approach

Xia Yingju, Fujitsu Research & Development Center Co., Ltd., China

 

14:40 15:00

Random forest based classification on heterogeneous generated features.

Jan Kralj, Jozef Stefan Institute, Slovenia

 

15:00 15:20

TTI's Gender Prediction System using Bootstrapping and Identical-Hierarchy

Mohammad Golam Sohrab, Toyota Technological Institute, Japan

 

15:20 15:40

Factor Models for Gender Prediction Based on E-commerce Data

Immanuel Bayer, University of Konstanz, Germany

 

15:40 16:00

A Granular Classifier for PAKDD 2015 Data Mining Competition.

Wojtek Swieboda, University of Warsaw, Poland

 

16:00 16:20

Gender Prediction based on counting with weight method

Pham Ngoc An, FPT University, Vietnam

 

16:20 16:25

Summary and closing remarks

 

 

SESSION 2A: Opinion Mining and Sentiment Analysis
Session Chair: Xujuan Zhou
Location: Lotus Ballroom A

 

14:00

14:20

Emotion Cause Detection for Chinese Micro-blogs based on ECOCC Model (R)

Kai Gao, Hua Xu, and Jiushuo Wang

 

14:20

14:40

Parallel Recursive Deep Model for Sentiment Analysis (R)

Changliang Li, Bo Xu, Gaowei Wu, Saike He, Guanhua Tian, and Yujun Zhou

 

14:40  

15:00

Sentiment Analysis in Transcribed Utterances (R)

Nir Ofek, Gilad Katz, Bracha Shapira, and Yedidya Bar-Zev

 

15:00

15:20

HierRating: A Hierarchical Generative Bayesian Model for Entity Latent Aspect Rating Analysis (R)

Xun Wang, Katsuhito Sudoh, and Masaaki Nagata

 

15:20

15:40

Sentiment Analysis on Microblogging by Integrating Text and Image Features (R)

Yaowen Zhang, Lin Shang, and Xiuyi Jia

 

15:40

16:00

TSum4act: A Framework for Retrieving and Summarizing Actionable Tweets during a Disaster for Reaction (R)

Minh-Tien Nguyen, Asanobu Kitamoto, and Tri-Thanh Nguyen

 

SESSION 2B: Clustering
Session Chair: Dat Tran
Location: Lotus Ballroom B

 

14:00

14:25

Evolving Chinese Restaurant Process for Modeling Evolutionary Trace in Temporal Data (L)

Peng Wang, Chuan Zhou, Peng Zhang, Weiwei Feng, Li Guo, and Binxing Fang

 

14:25

14:50

Small-Variance Asymptotics for Bayesian Nonparametric Models with Constraints (L)

Cheng Li, Santu Rana, Dinh Phung, and Svetha Venkatesh

 

14:50

15:10

Spectral Clustering for Large-Scale Social Networks via a Pre-Coarsening Sampling based Nyström Method (R)

Ying Kang, Bo Yu, Weiping Wang, and Dan Meng

 

15:10

15:30

pcStream: A Stream Clustering Algorithm for Dynamically Detecting and Managing Temporal Contexts (R)

Yisroel Mirsky, Bracha Shapira, Lior Rokach, and Yuval Elovici

 

15:30

15:50

Clustering over Data Streams based on Growing Neural Gas (R)

Mohammed Ghesmoune, Mustapha Lebbah, and Hanene Azzag

 

15:50

16:10

ClustCube Cubes: A Novel OLAP-based Mining Structure for Clustering Complex Database Objects (R)

Alfredo Cuzzocrea

 

SESSION 2C: Novel Methods and Algorithms
Session Chair: Tuyen Huynh
Location: Sunflower Ballroom A

 

14:00

14:25

Principal Sensitivity Analysis (L)

Sotetsu Koyamada, Masanori Koyama, Ken Nakae, and Shin Ishii

 

14:25

14:50

SocNL: Bayesian Label Propagation with Confidence (L)

Yuto Yamaguchi, Christos Faloutsos, and Hiroyuki Kitagawa

 

14:50

15:15

An Incremental Local Distribution Network for Unsupervised Learning (L)

Youlu Xing, Tongyi Cao, Ke Zhou, Furao Shen, and Jinxi Zhao

 

15:15

15:35

Trend-based Citation Count Prediction for Research Articles (R)

Cheng-Te Li, Yu-Jen Lin, Rui Yan, and Mi-Yen Yeh

 

15:35

15:55

Mining text enriched heterogeneous citation networks (R)

Jan Kralj, Anita Valmarska, Marko Robnik-Šikonja, and Nada Lavrač

 

SESSION 2D: Outlier and Anomaly Detection

Session Chair: Chih-Ya Shen
Location: Sunflower Ballroom B

 

14:00 14:25

Contextual Anomaly Detection Using Log-linear Tensor Factorization (L)

Alpa Jayesh Shah, Christian Desrosiers, and Robert Sabourin

 

14:25 14:50

A Semi-supervised Framework for Social Spammer Detection (L)

Zhaoxing Li, Xianchao Zhang, Hua Shen, Wenxin Liang, and Zengyou He

 

14:50 15:10

Fast One-Class Support Vector Machine for Novelty Detection (R)

Trung Le, Dinh Phung, Khanh Nguyen, and Svetha Venkatesh

 

15:10 15:30

ND-SYNC: Detecting Synchronized Fraud Activities (R)

Maria Giatsoglou, Despoina Chatzakou, Neil Shah, Alex Beutel, Christos Faloutsos, and Athena Vakali

 

15:30 15:50

An Embedding Scheme for Detecting Anomalous Block Structured Graphs (R)

Lida Rashidi, Sutharshan Rajasegarar, and Christopher Leckie

 

15:50 16:10

A Core-attach Based Method for Identifying Protein Complexes in Dynamic PPI Networks (R)

Jiawei Luo, Chengchen Liu, and Hoang Tu Nguyen

 
   

May 21, 2015 (Thursday)

 

09:00 - 10:00

 

KEYNOTE II: Topic Modeling with More Confidence - A Theory and Some Algorithms

Xuan Long Nguyen
Session Chair: Dinh Phung
Location: Lotus Ballroom

 

10:20 - 12:30

 

SESSION 3A: Social Networks and Social Media
Session Chair: Jun Luo
Location: Lotus Ballroom A

 

10:20

10:45

Multiple Factors-Aware Diffusion in Social Networks (L)

Chung-Kuang Chou and Ming-Syan Chen

 

10:45

 11:05

Understanding Community Effects on Information Diffusion (R)

Shuyang Lin, Qingbo Hu, Guan Wang, and Philip S. Yu

 

11:05

11:25

On Burst Detection and Prediction in Retweeting Sequence (R)

Zhilin Luo, Yue Wang, Xintao Wu, Wandong Cai, and Ting Chen

 

11:25

11:45

Few Things About Idioms: Understanding Idioms and its users in the Twitter Online Social Network (R)

Koustav Rudra, Abhijnan Chakraborty, Manav Sethi, Shreyasi Das, Niloy Ganguly, and Saptarshi Ghosh

 

11:45

12:05

Retweeting activity on Twitter: Signs of Deception (R)

Maria Giatsoglou, Despoina Chatzakou, Neil Shah, Christos Faloutsos, and Athena Vakali

 

12:05

 12:25

Resampling-based Gap Analysis for Detecting Nodes with High Centrality on Large Social Network (R)

Kouzou Ohara, Kazumi Saito, Masahiro Kimura, and Hiroshi Motoda

 

SESSION 3B: Classification
Session Chair: Graham Williams
Location: Lotus Ballroom B

 

10:20

10:40

Prediction of Emergency Events: A Multi-task Multi-label learning Approach (R)

Budhaditya Saha, Sunil Kumar Gupta, and Svetha Venkatesh

 

10:40

11:00

Nearest Neighbor Method Based on Local Distribution for Classification (R)

Chengsheng Mao, Bin Hu, Philip Moore, Yun Su, and Manman Wang

 

11:00

11:20

Immune Centroids Over-Sampling Method for Multi-Class Classification (R)

Xusheng Ai, Jian Wu, Victor S. Sheng, Pengpeng Zhao, Yufeng Yao, and Zhiming Cui

 

11:20

11:40

Optimizing Classifiers for Hypothetical Scenarios (R)

Reid A. Johnson, Troy Raeder, and Nitesh V. Chawla

 

11:40

12:00

Repulsive-SVDD Classification (R)

Phuoc Nguyen and Dat Tran

 

12:00

12:20

Centroid-Means-Embedding: an Approach to Infusing Word Embeddings into Features for Text Classification (R)

Mohammad Golam Sohrab, Makoto Miwa, and Yutaka Sasaki

 

SESSION 3C: Machine Learning
Session Chair: Takashi Washio
Location: Sunflower Ballroom A

 

10:20

10:45

Context-aware Detection of Sneaky Vandalism on Wikipedia across Multiple Languages (L)

Khoi-Nguyen Tran, Peter Christen, Scott Sanner, and Lexing Xie

 

10:45

11:05

Uncover the Latent Structures of Crowd Labeling (R)

Tian Tian and Jun Zhu

 

11:05

11:25

Use Correlation Coefficients in Gaussian Process to Train Stable ELM Models (R)

Yulin He, Joshua Zhexue Huang, Xizhao Wang, and Rana Aamir Raza

 

11:25

11:45

Local Adaptive and Incremental Gaussian Mixture for Online Density Estimation (R)

Tianyu Qiu, Furao Shen, and Jinxi Zhao

 

11:45

12:05

Latent Space Tracking from Heterogeneous Data with an Application for Anomaly Detection (R)

Jiaji Huang and Xia Ning

 

12:05

 12:25

A Learning-rate Schedule for Stochastic Gradient Methods to Matrix Factorization (R)

Wei-Sheng Chin, Yong Zhuang, Yu-Chin Juan, and Chih-Jen Lin

 

SESSION 3D: Applications
Session Chair: Vincent Tseng
Location: Sunflower Ballroom B

 

10:20

10:45

Learning of Performance Measures from Crowd-sourced Data with Application to Ranking of Investments (L)

Greg Harris, Anand Panangadan, and Viktor K. Prasanna

 

10:45

11:10

Hierarchical Dirichlet Process for Tracking Complex Topical Structure Evolution and Its Application to Autism Research Literature (L)

Adham Beykikhoshk, Ognjen Arandjelović, Svetha Venkatesh, and Dinh Phung

 

11:10

11:30

Automated Detection for Probable Homologous Foodborne Disease Outbreaks (R)

Xiao Xiao, Yong Ge, Yunchang Guo, Danhuai Guo, Yi Shen, Yuanchun Zhou, and Jianhui Li

 

11:30

11:50

Identifying Hesitant and Interested Customers for Targeted Social Marketing (R)

Guowei Ma, Qi Liu, Le Wu, and Enhong Chen

 

11:50

12:10

Activity-Partner Recommendation (R)

Wenting Tu, David W. Cheung, Nikos Mamoulis, Min Yang, and Ziyu Lu

 

12:10

12:30

Iterative Use of Weighted Voronoi Diagrams to Improve Scalability in Recommender Systems (R)

Joydeep Das, Subhashis Majumder, Debarshi Dutta, and Prosenjit Gupta

 

14:00 - 15:00

 

KEYNOTE III: Direct Change Detection without Identification

Masashi Sugiyama
Session Chair: Hiroshi Motoda
Location: Lotus Ballroom

 

15:20 - 17:30 

 

SESSION 4A: Mining Uncertain and Imprecise Data
Session Chair: Philippe Lenca
Location: Lotus Ballroom A

 

15:20 15:45

Mining Uncertain Sequential Patterns in iterative MapReduce (L)

Jiaqi Ge, Yuni Xia, and Jian Wang

 

15:45 16:10

Quality Control for Crowdsourced POI Collection (L)

Shunsuke Kajimura, Yukino Baba, Hiroshi Kajino, and Hisashi Kashima

 

16:10 16:30

Towards Efficient Sequential Pattern Mining in Temporal Uncertain Databases (R)

Jiaqi Ge, Yuni Xia, and Jian Wang

 

16:30 16:50

Preference-based top-k representative skyline queries on uncertain databases (R)

Ha Thanh Huynh Nguyen and Jinli Cao

 

16:50 17:10

Cluster Sequence Mining: Causal Inference with Time and Space Proximity under Uncertainty (R)

Yoshiyuki Okada, Ken-ichi Fukui, Koichi Moriyama, and Masayuki Numao

 

17:10 17:30

Achieving Accuracy Guarantee for Answering Batch Queries with Differential Privacy (R)

Dong Huang, Shuguo Han, and Xiaoli Li

 

SESSION 4B: Mining Temporal and Spatial Data
Session Chair: Marwan Hassani
Location: Lotus Ballroom B

 

15:20 15:45

Automated Classification of Passing in Football (L)

Michael Horton, Joachim Gudmundsson, Sanjay Chawla, and Joël Estephan

 

15:45 16:10

Stabilizing Sparse Cox Model using Statistic and Semantic Structures in Electronic Medical Records (L)

Shivapratap Gopakumar, Tu Dinh Nguyen, Truyen Tran, Dinh Phung, and Svetha Venkatesh

 

16:10

16:30

Semi Supervised Adaptive Framework for Classifying Evolving Data Stream (R)

Ahsanul Haque, Latifur Khan, and Michael Baron

 

16:30 16:50

Predicting Next Locations with Object Clustering and Trajectory Clustering (R)

Meng Chen, Yang Liu, and Xiaohui Yu

 

16:50 17:10

A Plane Moving Average Algorithm for Short-Term Traffic Flow Prediction (R)

Lei Lv, Meng Chen, Yang Liu, and Xiaohui Yu

 

17:10

17:30

Recommending Profitable Taxi Travel Routes based on Big Taxi Trajectories Data (R)

Wenxin Yang, Xin Wang, Seyyed Mohammadreza Rahimi, and Jun Luo

 
     

SESSION 4C: Novel Methods and Algorithms
Session Chair: Hung Son Nguyen
Location: Sunflower Ballroom A

 

15:20

15:45

Boosting via Approaching Optimal Margin Distribution (L)

Chuan Liu and Shizhong Liao

 

15:45 16:05

o-HETM: An Online Hierarchical Entity Topic Model for News Streams (R)

Linmei Hu, Juanzi Li, Jing Zhang, and Chao Shao

 

16:05 16:25

Modeling User Interest and Community Interest in Microbloggings: An Integrated Approach (R)

Tuan-Anh Hoang

 

16:25 16:45

Minimal Jumping Emerging Patterns: Computation and Practical Assessment (R)

Bamba Kane, Bertrand Cuissart, and Bruno Crémilleux

 

16:45 17:05

Rank matrix factorisation (R)

Thanh Le Van, Matthijs van Leeuwen, Siegfried Nijssen, and Luc De Raedt

 

17:05 17:25

An Empirical Study of Personal Factors and Social Effects on Rating Prediction (R)

Zhijin Wang, Yan Yang, Qinmin Hu, and Liang He

 

SESSION 4D: Feature Extraction and Selection
Session Chair: Vinh Nguyen
Location: Sunflower Ballroom B

 

15:20 15:45

Cost-sensitive Feature Selection on Heterogeneous Data (L)

Wenbin Qian, Wenhao Shu, Jun Yang, and Yinglong Wang

 

15:45 16:05

A Feature Extraction Method for Multivariate Time Series Classification Using Temporal Patterns (R)

Pei-Yuan Zhou and Keith C.C. Chan

 

16:05

16:25

Scalable Outlying-Inlying Aspects Discovery via Features Ranking (R)

Nguyen Xuan Vinh, Jeffrey Chan, James Bailey, Christopher Leckie, Kotagiri Ramamohanarao, and Jian Pei

 

16:25 16:45

A DC Programming Approach for Sparse Optimal Scoring (R)

Hoai An Le Thi and Duy Nhat Phan

 

16:45

17:05

Graph Based Relational Features for Collective Classification (R)

Immanuel Bayer, Uwe Nagel, and Steffen Rendle

 

17:05 17:25

A New Feature Sampling Method in Random Forests for Predicting High Dimensional Data (R)

Thanh-Tung Nguyen, He Zhao, Joshua Zhexue Huang, Thuy Thi Nguyen, and Mark Junjie Li

 
   

May 22, 2015 (Friday)

 

09:00 - 11:05

 

SESSION 5A: Mining Heterogeneous, High Dimensional, and Sequential Data
Session Chair: Alfredo Cuzzocrea
Location: Lotus Ballroom A

 

09:00

09:20

Seamlessly Integrating Effective Links with Attributes for Networked Data Classification (R)

Yangyang Zhao, Zhengya Sun, Changsheng Xu, and Hongwei Hao

 

09:20

09:40

Clustering on Multi-source Incomplete Data via Tensor Modeling and Factorization (R)

Weixiang Shao, Lifang He, and Philip S. Yu

 

09:40 10:00

Locally Optimized Hashing for Nearest Neighbor Search (R)

Seiya Tokui, Issei Sato, and Hiroshi Nakagawa

 

10:00 10:20

Do-Rank: DCG Optimization for Learning-to-Rank in Tag-based Item Recommendation Systems (R)

Noor Ifada and Richi Nayak

 

10:20 10:40

Efficient Discovery of Recurrent Routine Behaviours in Smart Meter Time Series by Growing Subsequences (R)

Jin Wang, Rachel Cardell-Oliver, and Wei Liu

 

10:40 11:00

Convolutional Nonlinear Neighbourhood Components Analysis for Time Series Classification (R)

Yi Zheng, Qi Liu, Enhong Chen, J. Leon Zhao, Liang He, and Guangyi Lv

 

SESSION 5B: Entity Resolution and Topic Modelling
Session Chair: Minh-Huyen Nguyen-Thi
Location: Lotus Ballroom B

 

09:00 09:20

Clustering-based Scalable Indexing for Multi-party Privacy-preserving Record Linkage (L)

Thilina Ranbaduge, Dinusha Vatsalan, and Peter Christen

 

09:20 09:40

Efficient Interactive Training Selection for Large-scale Entity Resolution (R)

Qing Wang, Dinusha Vatsalan, and Peter Christen

 

09:40 10:00

Unsupervised Blocking Key Selection for Real-Time Entity Resolution (R)

Banda Ramadan and Peter Christen

 

10:00 10:20

Incorporating Probabilistic Knowledge into Topic Models (R)

Liang Yao, Yin Zhang, Baogang Wei, Hongze Qian, and Yibing Wang

 

10:20 10:40

Learning Focused Hierarchical Topic Models with Semi-Supervision in Microblogs (R)

Anton Slutsky, Xiaohua Hu, and Yuan An

 

10:40 11:00

Predicting Future Links Between Disjoint Research Areas Using Heterogeneous Bibliographic Information Network (R)

Yakub Sebastian, Eu-Gene Siew, and Sylvester Olubolu Orimaye

 

SESSION 5C: Itemset and High Performance Data Mining
Session Chair: Philippe Fournier-Viger
Location: Sunflower Ballroom A

 

09:00 09:20

CPT+: Decreasing the time/space complexity of the Compact Prediction Tree (L)

Ted Gueniche, Philippe Fournier-Viger, Rajeev Raman, and Vincent S. Tseng

 

09:20 09:40

Mining Association Rules in Graphs based on Frequent Cohesive Itemsets (R)

Tayena Hendrickx, Boris Cule, Pieter Meysman, Stefan Naulaerts, Kris Laukens, and Bart Goethals

 

09:40 10:00

Mining High Utility Itemsets in Big Data (R)

Ying Chun Lin, Cheng-Wei Wu, and Vincent S. Tseng

 

10:00 10:20

Decomposition Based SAT Encodings for Itemset Mining Problems (R)

Said Jabbour, Lakhdar Sais, and Yakoub Salhi

 

10:20 10:40

A Comparative Study on Parallel LDA Algorithms in MapReduce Framework (R)

Yang Gao, Zhenlong Sun, Yi Wang, Xiaosheng Liu, Jianfeng Yan, and Jia Zeng

 

10:40 11:00

Distributed Newton Methods for Regularized Logistic Regression (R)

Yong Zhuang, Wei-Sheng Chin, Yu-Chin Juan, and Chih-Jen Lin

 

SESSION 5D: Recommendation
Session Chair: Xin Wang
Location: Sunflower Ballroom B

 

09:00 09:25

Coupled Matrix Factorization within Non-IID Context (L)

Fangfang Li, Guandong Xu, and Longbing Cao

 

09:25 09:50

Complementary usage of Tips and Reviews for Location Recommendation in Yelp (L)

Saurabh Gupta, Sayan Pathak, and Bivas Mitra

 

09:50 10:15

Coupling Multiple Views of Relations for Recommendation (L)

Bin Fu, Guandong Xu, Longbing Cao, Zhihai Wang, and Zhiang Wu

 

10:15 10:35

Pairwise one class recommendation algorithm (R)

Huimin Qiu, Chunhong Zhang, and Jiansong Miao

 

10:35 10:55

RIT: Enhancing Recommendation with Inferred Trust (R)

Guo Yan, Yuan Yao, Feng Xu, and Jian Lu