Long, regular, and short paper presentations are denoted by [pink],
[violet], and [green], respectively. Each long paper presentation is
allocated 30 minutes (25 minutes for the presentation and 5 minutes for
questions); each regular paper presentation is allocated 20 minutes
(17 minutes for the presentation and 3 minutes for questions); and each
short paper presentation is allocated 15 minutes (12 minutes for the
presentation and 3 minutes for questions). |
|
|
|
|
|
May 21 |
8:30 - 9:00 |
Opening |
|
|
|
9:00 - 10:00 |
Keynote Speech |
|
|
|
|
Chair: Einoshin Suzuki |
|
|
|
|
Christos Faloutsos; Graph Mining: Laws, Generators and Tools |
|
|
|
(Room A,B,C) |
|
|
|
10:00 - 10:20 |
Coffee Break |
|
|
|
10:20 - 11:50 |
Session 1A: Privacy Preserving Data Mining |
Session 1B: Web Mining |
Session 1C: Clustering 1 |
Session 1D: Network Mining |
|
(Room A) |
(Room B) |
(Room C) |
(Room D) |
|
Chair: Mei Kobayashi |
Chair: Vincent S. Tseng |
Chair: Saso Dzeroski |
Chair: Yifeng Zeng |
|
Protecting Privacy in Incremental Maintenance for Distributed
Association Rule Mining |
SEM: Mining Spatial Events from the Web |
A Clustering-Oriented Star Coordinate Translation Method for Reliable
Clustering Parameterization |
Mining Bulletin Board Systems Using Community Generation |
|
Wai Kit Wong, David Wai Lok Cheung, Edward Hung, and Huan Liu |
Kaifeng Xu, Rui Li, Shenghua Bao, Dingyi Han, and Yong Yu |
Chieh-Yuan Tsai and Chuang-Cheng Chiu |
Ming Li, Zhongfei (Mark) Zhang, and Zhi-Hua Zhou |
|
On Addressing Accuracy Concerns in Privacy Preserving Association Rule
Mining |
Person Name Disambiguation in Web Pages using Social Network, Compound
Words and Latent Topics |
Constrained Clustering for Gene Expression Data Mining |
Mining Changes in Patent Trends for Competitive Intelligence |
|
Ling Guo, Songtao Guo, and Xintao Wu |
Shingo Ono, Issei Sato, Minoru Yoshida, and Hiroshi Nakagawa |
Vincent S. Tseng, Lien-Chin Chen, and Ching-Pin Kao |
Meng-Jung Shih, Duen-Ren Liu, and Ming-Li Hsu |
|
On Privacy in Time Series Data Mining |
Fighting WebSpam: Detecting spam on the Graph via Content and Link
Features |
G-TREACLE: A New Grid-based and Tree-alike Pattern Clustering Technique
for Large Databases |
Structure-based hierarchical transformations for interactive visual
exploration of social networks |
|
Ye Zhu, Yongjian Fu, and Huirong Fu |
Yu-Jiu Yang, Shuang-Hong Yang, and Bao-Gang Hu |
Cheng-Fa Tsai and Chia-Chen Yen |
Lisa Singh, Mitchell Beard, Brian Gopalan, and Gregory Nelson |
|
|
Using Ontology-Based User Preferences to Aggregate Rank Lists in Web
Search |
R-map: Mapping Categorical Data for Clustering and Visualization based
on Reference Sets |
Analyzing the Propagation of Influence and Concept Evolution in
Enterprise Social Networks Through Centrality and Latent Semantic
Analysis |
|
|
Lin Li, Zhenglu Yang, and Masaru Kitsuregawa |
Zhi-Yong Shen, Ming Li, Yi-Dong Shen, and Jun Sun |
Weizhong Zhu, Chaomei Chen, and Robert B. Allen |
|
|
|
Efficient Joint Clustering Algorithms in Optimization and Geography
Domains |
A Framework for Discovering Spatio-Temporal Cohesive Networks |
|
|
|
Chia-Hao Lo and Wen-Chih Peng |
Jin Soung Yoo and Joengmin Hwang |
11:50 - 13:20 |
Lunch |
|
|
|
13:20 - 14:20 |
Invited Talk |
|
|
|
|
Chair: David Cheung |
|
|
|
|
Hiroki Arimura; Efficient Algorithms for Mining Frequent and Closed
Patterns from Semi-structured Data |
|
|
(Room A,B,C) |
|
|
|
14:20 - 14:40 |
Coffee Break |
|
|
|
14:40 - 16:10 |
Session 2A: Feature Selection and Construction |
Session 2B: Clustering 2 |
Session 2C: Frequent Itemset 1 |
Session 2D: Sequence Data Mining 1 |
|
(Room A) |
(Room B) |
(Room C) |
(Room D) |
|
Chair: Michel Verleysen |
Chair: Xintao Wu |
Chair: Chi Ming Kao |
Chair: Takeaki Uno |
|
Feature Selection by Nonparametric Bayes Error Minimization |
Large-scale k-means Clustering with User-Centric Privacy Preservation |
LCM over ZBDDs: Fast Generation of Very Large-Scale Frequent Itemsets
Using a Compact Graph-Based Representation |
A Framework for Modeling Positive Class Expansion with Single Snapshot |
|
Shuang-Hong Yang and Bao-Gang Hu |
Jun Sakuma and Shigenobu Kobayashi |
Shin-ichi Minato, Takeaki Uno, and Hiroki Arimura |
Yang Yu and Zhi-Hua Zhou |
|
Feature construction based on closedness properties is not that simple |
Scaling Record Linkage to Non-Uniform Distributed Class Sizes |
Efficient Mining of High Utility Itemsets from Large Datasets |
Concept Lattice Based Mutation Control for Reactive Motifs Discovery |
|
Dominique Gay, Nazha Selmaoui, and Jean-Francois Boulicaut |
Steffen Rendle and Lars Schmidt-Thieme |
Alva Erwin, Raj P. Gopalan, and Narasimaha Achuthan |
Kitsana Waiyamai, Peera Liewlom, Thanapat Kangkachit, and Thanawin
Rakthanmanon |
|
Generation of Globally Relevant Continuous Features for Classification |
Clustering Transaction Datasets Using Seeds |
FIsViz: A Frequent Itemset Visualizer |
A Simple Characterization on Serially Constructible Episodes |
|
Sylvain Letourneau, Stan Matwin, and A. Fazel Famili |
Yun Sing Koh and Russel Pears |
Carson Kai-Sang Leung, Pourang P. Irani, and Christopher L. Carmichael |
Takashi Katoh and Kouichi Hirata |
|
|
I/O Scalable Bregman Co-clustering |
A Tree-Based Approach for Frequent Pattern Mining from Uncertain Data |
Semantic Video Annotation by Mining Association Patterns from Visual and
Speech Features |
|
|
Kuo-Wei Hsu, Arindam Banerjee, and Jaideep Srivastava |
Carson Kai-Sang Leung, Mark Anthony F. Mateo, and Dale A. Brajczuk |
Vincent S. Tseng, Ja-Hwung Su, Jhih-Hong Huang, and Chih-Jen Chen |
16:10 - 16:30 |
Coffee Break |
|
|
|
16:30 - 18:00 |
Session 3A: Frequent Itemset 2 |
Session 3B: Subspace Clustering |
Session 3C: Decision Tree and Class Imbalance Problem |
Session 3D: Relational and Network Mining |
|
(Room A) |
(Room B) |
(Room C) |
(Room D) |
|
Chair: Carson K. Leung |
Chair: Peer Kröger |
Chair: Zhi-Hua Zhou |
Chair: Akihiro Yamamoto |
|
Ambiguous Frequent Itemset Mining and Polynomial Delay Enumeration |
Mining Quality-Aware Subspace Clusters |
BOAI: Fast Alternating Decision Tree Induction based on Bottom-up
Evaluation |
Tracking Topic Evolution in On-line Postings: 2006 IBM Innovation Jam
data |
|
Takeaki Uno and Hiroki Arimura |
Ying-Ju Chen, Yi-Hong Chu, and Ming-Syan Chen |
Bishan Yang, Tengjiao Wang, Dongqing Yang, and Lei Chang |
Mei Kobayashi and Raylene Yung |
|
A Decremental Approach for Mining Frequent Itemsets from Uncertain Data |
SubClass: Classification of Multidimensional Noisy Data Using Subspace
Clusters |
A comparison of different off-centered entropies to deal with class
imbalance for decision trees |
Entity Network Prediction using Multitype Topic Models |
|
Chun-Kit Chui and Ben Kao |
Ira Assent, Ralph Krieger, Petra Welter, Jorg Herbers, and Thomas Seidl |
Philippe Lenca, Stephane Lallich, Thanh-Nghi Do, and Nguyen-Khang Pham |
Hitohiro Shiozaki, Koji Eguchi, and Takenao Ohkawa |
|
A Cluster-Based Genetic-Fuzzy Mining Approach for Items with Multiple
Minimum Supports |
A Creditable Subspace Labeling Method based on D-S Evidence Theory |
Analyzing PETs on Imbalanced Datasets when Training and Testing Class
Distributions Differ |
Relational pattern mining based on equivalent classes of properties
extracted from samples |
|
Chun-Hao Chen, Tzung-Pei Hong, and Vincent S. Tseng |
Yu Zong, Xianchao Zhang, He Jiang, and Mingchu Li |
David Cieslak and Nitesh Chawla |
Nobuhiro Inuzuka, Jun-ichi Motoyama, Shinpei Urazawa, and Tomofumi
Nakano |
|
CP-tree: A Tree Structure for Single-Pass Frequent Pattern Mining |
|
A New Credit Scoring Method Based on Rough Sets and Decision Tree |
Exploiting Propositionalization based on Random Relational Rules for
Semi-Supervised Learning |
|
Syed Khairuzzaman Tanbeer, Chowdhury Farhan Ahmed, Byeong-Soo Jeong, and
Young-Koo Lee |
|
XiYue Zhou, DeFu Zhang, and Yi Jiang |
Grant Anderson and Bernhard Pfahringer |
|
|
|
|
|
May 22 |
9:00 - 10:00 |
Invited Talk |
|
|
|
|
Chair: Graham Williams |
|
|
|
|
Michael R. Berthold; Supporting Creativity: Towards Associative
Discovery of New Insights |
|
|
(Room A,B,C) |
|
|
|
10:00 - 10:20 |
Coffee Break |
|
|
|
10:20 - 11:55 |
Session 4A: Outlier Detection |
Session 4B: SVM and Regression |
Session 4C: Rule Discovery |
Session 4D: Feature and Instance Selection |
|
(Room A) |
(Room B) |
(Room C) |
(Room D) |
|
Chair: Arthur Zimek |
Chair: Masashi Sugiyama |
Chair: Bernhard Pfahringer |
Chair: Md Rafiul Hassan |
|
Unusual Pattern Detection in High Dimensions |
Extreme Support Vector Machine |
Minimum Variance Associations --- Discovering Relationships in Numerical
Data |
Automatic Training Example Selection for Unsupervised Record Linkage |
|
Minh Nguyen, Leo Mark, and Edward Omiecinski |
Qiuge Liu, Qing He, and Zhongzhi Shi |
Szymon Jaroszewicz |
Peter Christen |
|
Unsupervised Change Analysis using Supervised Learning |
A Minimal Description Length Scheme for Polynomial Regression |
Mining a Complete Set of both Positive and Negative Association Rules
from Large Databases |
Sparse Kernel-based Feature Weighting |
|
Shohei Hido, Tsuyoshi Ide, Hisashi Kashima, Harunobu Kubo, and Hirofumi
Matsuzawa |
Aleksandar Peckov, Saso Dzeroski, and Ljupco Todorovski |
Hao Wang, Xing Zhang, and Guoqing Chen |
Shuang-Hong Yang, Yu-Jiu Yang, and Bao-Gang Hu |
|
Improving the Robustness to Outliers of Mixtures of Probabilistic PCAs |
Bootstrap based Pattern Selection for Support Vector Regression |
Combined Association Rule Mining |
A More Topologically Stable Locally Linear Embedding Algorithm Based on
R*-Tree |
|
Nicolas Delannay, Cedric Archambeau, and Michel Verleysen |
Dongil Kim and Sungzoon Cho |
Huaifeng Zhang, Yanchang Zhao, Longbing Cao, and Chengqi Zhang |
Tian Xia, Jintao Li, Yongdong Zhang, and Sheng Tang |
|
Cell-based Outlier Detection Algorithm: A Fast Outlier Detection
Algorithm for Large Datasets |
Customer Churn Time Prediction in Mobile Telecommunication Industry
using Ordinal Regression |
Mining Non-Coincidental Rules Without A User Defined Support Threshold |
Locally Linear Online Mapping for Mining Low-Dimensional Data Manifolds |
|
You Wan and Fuling Bian |
Rupesh Gopal and Saroj Meher |
Yun Sing Koh |
Huicheng Zheng, Wei Shen, Qionghai Dai, and Sanqing Hu |
|
|
|
Rule Extraction with Rough-Fuzzy Hybridization Method |
A Selective Classifier for Incomplete Data |
|
|
|
Nan-Chen Hsieh |
Jingnian Chen, Houkuan Huang, Fengzhan Tian, and Shengfeng Tian |
11:55 - 13:00 |
Lunch |
|
|
|
13:00 - 18:00 |
Excursion |
|
|
|
|
|
|
|
|
18:30 - 22:00 |
Banquet |
|
|
|
|
|
|
|
|
May 23 |
9:00 - 10:00 |
Invited Talk |
|
|
|
|
Chair: Huan Liu |
|
|
|
|
Genshiro Kitagawa; Prospective Scientific Methodology in Knowledge
Society |
|
|
|
(Room A,B,C) |
|
|
|
10:00 - 10:20 |
Coffee Break |
|
|
|
10:20 - 11:50 |
Session 5A: Spatial and Image Data Mining |
Session 5B: Sequence Data Mining 2 |
Session 5C: Semi-Supervised Learning |
Session 5D: Application 1 |
|
(Room A) |
(Room B) |
(Room C) |
(Room D) |
|
Chair: Jin Soung Yoo |
Chair: Shin-ichi Minato |
Chair: Nitesh V. Chawla |
Chair: Masayuki Numao |
|
Towards Region Discovery in Spatial Datasets |
An Efficient Algorithm for Finding Similar Short Substrings from Large
Scale String Data |
Semi-Supervised Local Fisher Discriminant Analysis for Dimensionality
Reduction |
Data-Aware Clustering Hierarchy for Wireless Sensor Networks |
|
Wei Ding, Rachsuda Jiamthapthaksin, Rachana Parmar, Dan Jiang, Tomasz
Stepinski, and Christoph Eick |
Takeaki Uno |
Masashi Sugiyama, Tsuyoshi Ide, Shinichi Nakajima, and Jun Sese |
Xiaochen Wu, Peng Wang, Wei Wang, and Baile Shi |
|
ANEMI: An Adaptive Neighborhood Expectation-Maximization Algorithm with
Spatial Augmented Initialization |
Accurate and Efficient Retrieval of Multimedia Time Series Data under
Uniform Scaling and Time Warping |
Using Supervised and Unsupervised Techniques to Determine Groups of
Patients with Different Continuity of Care |
Learning User Purchase Intent From User-Centric Data |
|
Tianming Hu, Hui Xiong, Xueqing Gong, and Sam Yuan Sung |
Waiyawuth Euachongprasit and Chotirat Ann Ratanamahatana |
Eu-Gene Siew, Leonid Churilov, Kate A. Smith-Miles, and Joachim P.
Sturmberg |
Rajan Lukose, Jiye Li, Jing Zhou, and Satyanarayana Raju Penmetsa |
|
A New Model for Image Annotation |
Characteristic-based Descriptors for Motion Sequence Recognition |
Forward Semi-Supervised Feature Selection |
Exploratory Hot Spot Profile Analysis using an Interactive Visual
Drill-Down Self-Organizing Maps |
|
Sanparith Marukatat |
Liang Wang, Xiaozhe Wang, Christopher Leckie, and Ramamohanarao Kotagiri |
Jiangtao Ren, Zhengyuan Qiu, Wei Fan, Hong Cheng, and Philip S. Yu |
Denny, Graham Williams, and Peter Christen |
|
Jumping Emerging Patterns with Occurrence Count in Image Classification |
|
Active Learning with Misclassification Sampling Using Diverse Ensembles
Enhanced by Unlabeled Instances |
Discovering New Orders of the Chemical Elements through Genetic
Algorithms |
|
Lukasz Kobylinski and Krzysztof Walczak |
|
Jun Long, Jianping Yin, En Zhu, and Wentao Zhao |
Alexandre Blansche and Shuichi Iwata |
|
|
|
|
Combining Context and Existing Knowledge When Recognizing Biological
Entities -- Early results |
|
|
|
|
Mika Timonen and Antti Pesonen |
11:50 - 13:20 |
Lunch |
|
|
|
13:20 - 14:20 |
Invited Talk |
|
|
|
|
Chair: Kai Ming Ting |
|
|
|
|
Robert C. Holte; Cost-sensitive Classifier Evaluation using Cost Curves |
|
|
|
(Room A,B,C) |
|
|
|
14:20 - 14:40 |
Coffee Break |
|
|
|
14:40 - 16:10 |
Session 6A: Classification 1 |
Session 6B: Graph and Network Mining |
Session 6C: Statistical Methods |
Session 6D: Text Mining 1 |
|
(Room A) |
(Room B) |
(Room C) |
(Room D) |
|
Chair: Ulf Johansson |
Chair: Michael Berthold |
Chair: Szymon Jaroszewicz |
Chair: Manabu Okumura |
|
Multi-Class Named Entity Recognition via Bootstrapping with Dependency
Tree-based Patterns |
A Mixture Model for Expert Finding |
A Decomposition Algorithm for Learning Bayesian Network Structures from
Data |
Applying Latent Semantic Indexing in Frequent Itemset Mining for
Document Relation Discovery |
|
Van Dang and Akiko Aizawa |
Jing Zhang, Jie Tang, Liu Liu, and Juanzi Li |
Yifeng Zeng and Jorge Cordero Hernandez |
Thanaruk Theeramunkong, Kritsada Sriphaew, and Manabu Okumura |
|
An efficient unordered tree kernel and its application to glycan
classification |
Mining Correlated Subgraphs in Graph Databases |
Tradeoff Analysis of Different Markov Blanket Local Learning Approaches |
Enriching WordNet with Folksonomies |
|
Tetsuji Kuboyama, Kouichi Hirata, and Kiyoko F. Aoki-Kinoshita |
Tomonobu Ozaki and Takenao Ohkawa |
Shunkai Fu and Michel C. Desmarais |
Hao Zheng, Xian Wu, and Yong Yu |
|
Learning Rules for Multiple Target Classification |
Efficient Mining of Minimal Distinguishing Subgraph Patterns from Graph
Databases |
Query expansion for the language modelling framework using the naive
Bayes assumption |
Automatic Extraction of Basis Expressions that Indicate Economic Trends |
|
Bernard Zenko and Saso Dzeroski |
Zhiping Zeng, Jianyong Wang, and Lizhu Zhou |
Laurence Park and Kotagiri Ramamohanarao |
Hiroki Sakaji, Hiroyuki Sakai, and Shigeru Masuyama |
|
|
What is Frequent in a Single Graph |
On Discrete Data Modeling |
Seeing several stars: a rating inference task for a document containing
several evaluation criteria |
|
|
Bjoern Bringmann and Siegfried Nijssen |
Nizar Bouguila and Walid Elguebaly |
Kazutaka Shimada and Tsutomu Endo |
16:10 - 16:30 |
Coffee Break |
|
|
|
16:30 - 18:00 |
Session 7A: Classification 2 |
Session 7B: Stream Mining |
Session 7C: Application 2 |
Session 7D: Text Mining 2 |
|
(Room A) |
(Room B) |
(Room C) |
(Room D) |
|
Chair: Robert C. Holte |
Chair: Hiroki Arimura |
Chair: Takashi Okada |
Chair: Dirk E. Van den Poel |
|
Privacy-Preserving Linear Fisher Discriminant Analysis |
Handling Numeric Attributes in Hoeffding Trees |
Designing a system for a process parameter determined through modified
PSO and fuzzy neural network |
Term Committee Based Event Identification Within News Topics |
|
Shuguo Han and Wee Keong Ng |
Bernhard Pfahringer, Geoff Holmes, and Richard Kirkby |
Jui-Tsung Wong, Kuei-Hsien Chen, and Chwen-Tzeng Su |
Kuo Zhang, JuanZi Li, Gang Wu, and KeHong Wang |
|
Evaluating Standard Techniques for Implicit Diversity |
Maintaining Optimal Multi-way Splits for Numerical Attributes in Data
Streams |
Forecasting Urban Air Pollution Using HMM-fuzzy Model |
A New Framework for Taxonomy Discovery from Text |
|
Ulf Johansson, Tuve Lofstrom, and Lars Niklasson |
Tapio Elomaa and Petri Lehtinen |
M. Maruf Hossain, Md. Rafiul Hassan, and Michael Kirley |
Ahmad El Sayed, Hakim Hacid, and Djamel Zighed |
|
Local Projection in Jumping Emerging Patterns Discovery in Transaction
Databases |
Connectivity Based Stream Clustering Using Localised Density Exemplars |
PAID: Packet Analysis for Anomaly Intrusion Detection |
Detecting Near-Duplicates in Large-Scale Short Text Databases |
|
Pawel Terlecki and Krzysztof Walczak |
Sebastian Luhr and Mihai Lazarescu |
Kuo-Chen Lee, Jason Chang, and Ming-Syan Chen |
Caichun Gong, Yulan Huang, Xueqi Cheng, and Shuo Bai |
|
Fast k Most Similar Neighbor Classifier for Mixed Data based on an
Approximation and Elimination algorithm |
Fast on-line estimation of the joint probability distribution |
Unmixed Spectrum Clustering for Template Composition in Lung Sound
Classification |
Text Categorization of Multilingual Web Pages on Specific Domain |
|
Selene Hernandez Rodriguez, J. Ariel Carrasco-Ochoa, and J. Fco.
Martinez-Trinidad |
Jan Peter Patist |
Tomonari Masada, Senya Kiyasu, and Sueharu Miyahara |
Jicheng Liu and Chunyan Liang |
|
|
|
The Application of Echo State Network in Stock Data Mining |
|
|
|
|
Xiaowei Lin, Zehong Yang, and Yixu Song |
|