Organized by

Sirindhorn International Institute of Technology (SIIT)
Thammasat University
Chulalongkorn University
Asian Institute of Technology (AIT)

Sponsored by

National Electronics and Computer Technology Center (NECTEC), Thailand
ECTI, Thailand
Thailand Convention and Exhibition Bureau (TCEB)
The Air Force Office of Scientific Research, Asian Office of Aerospace Research and Development (AFOSR/AOARD)

PAKDD 2009 Workshop Program

Workshop Program Summary

April 27, 2009
Room | Time | Workshop Details
Queen’s Park 4 (2nd floor) | 08:30 – 17:30 | PAISI’09
Queen’s Park 5 (2nd floor) | 08:30 – 12:30 | ICEC’09
Queen’s Park 5 (2nd floor) | 13:30 – 17:30 | QIMIE’09
Queen’s Park 6 (2nd floor) | 08:30 – 12:00 | AIBDM’09
Queen’s Park 6 (2nd floor) | 13:00 – 18:30 | OSDM’09
(37th floor) | 19:00 – 22:00 | Workshop Reception

Pacific Asia Workshop on Intelligence and Security Informatics (PAISI’09)

8:30 – 9:10

Opening and Keynote Speech
Session Chair: Hsinchun Chen

Building a Geosocial Semantic Web for Military Stabilization and Reconstruction Operations
Bhavani Thuraisingham

9:10 – 10:15

Session 1: Terrorism Informatics and Crime Analysis
Session Chair: Michael Chau

Criminal Cross Correlation Mining and Visualization
Peter Phillips, Ickjai Lee

A Cybercrime Forensic Method for Chinese Web Information Authorship Analysis
Jianbin Ma, Guifa Teng, Yuxin Zhang, Yueli Li, Ying Li

Prediction of Unsolved Terrorist Attacks Using Group Detection Algorithms *
Fatih Ozgul, Zeki Erdem, Chris Bowerman

10:15 – 10:30 Coffee Break

10:30 – 11:40

Session 2: Enterprise Risk Management
Session Chair: Shu-Shing Li

Exploring Financial Reporting Fraud with GHSOM
Rua-Huan Tsaih, Wan-Ying Lin, Shin-Ying Huang

Identifying Firm-Specific Risk Statements in News Articles
Hsin-Min Lu, Wan-Hsin Huang, Zhu Zhang, Tsai-Jyh Chen

Predicting Future Earnings Change Using Numeric and Textual Information in Financial Reports *
Kuo-Tay Chen, JuChun Yen, Tsai-Jyh Chen

11:40 – 12:30

Session 3: Emergency Response and Surveillance
Session Chair: Peter Phillips

When Generalized Voronoi Diagrams Meet GeoWeb for Emergency Management
Christopher Torpelund-Bruin, Ickjai Lee

E3TP: A Novel Trajectory Prediction Algorithm in Moving Objects Databases
Teng Long, Shaojie Qiao, Changjie Tang, Liangxu Liu, Taiyong Li, Jiang Wu

12:30 – 14:00 Lunch

14:00 – 15:10

Session 4: Information Access and Security
Session Chair: Paul Kwan

A User-Centered Framework for Adaptive Fingerprint Identification
Paul Kwan, Junbin Gao, Graham Leedham

Design of a Passport Anti-forgery System Based on Digital Signature Schemes
Lei Shi, Shenghui Su, Zhengrong Xiang

A Chronological Evaluation of Unknown Malcode Detection *
Robert Moskovitch, Clint Feher, Yuval Elovici

15:10 – 15:30 Coffee Break

15:30 – 16:40

Session 5: Data and Text Mining
Session Chair: Michael Chau

Relation Discovery from Thai News Articles Using Association Rule Mining
Nichnan Kittiphattanabawon, Thanaruk Theeramunkong

Discovering Compatible Top-K Theme Patterns from Text based on Users' Preferences
Yongxin Tong, Shilong Ma, Dan Yu, Yuanyuan Zhang, Li Zhao, Ke Xu

Juicer: Scalable Extraction for Thread Meta-Information of Web Forum *
Yan Guo, Yu Wang, Guodong Ding, Donglin Cao, Gang Zhang, Yi Lv

16:40 – 17:40

Session 6: Data and Text Mining
Session Chair: Thanaruk Theeramunkong

A Feature-based Approach for Relation Extraction from Thai News Documents *
Nattapong Tongtep, Thanaruk Theeramunkong

An Incremental-Learning Method for Supervised Anomaly Detection by Cascading Service Classifier and ITI Decision Tree Methods *
Wei-Yi Yu, Hahn-Ming Lee

Quantifying News Reports to Proxy Other Information in ERC Models *
Kuo-Tay Chen, Jian-Shuen Lian, Yu-Ting Hsieh

*: Short papers

Data Mining When Classes are Imbalanced and Errors Have Costs (ICEC’09)

8:00 – 8:35

Opening Remarks
Nitesh Chawla and Zhi-Hua Zhou

8:35 – 9:25

Keynote Talk
Charles Ling

9:25 – 10:10

Session 1: Theoretical and Empirical Studies

9:25 – 9:40 Data Complexity Analysis for Imbalanced Datasets
Cheng G. Weng and Josiah Poon
9:40 – 9:55 Dealing with Severely Imbalanced Data
William Klement, Szymon Wilk, Wojtek Michalowski, and Stan Matwin
9:55 – 10:10 Learning with Cost Intervals
Xu-Ying Liu and Zhi-Hua Zhou
10:10 – 10:30

Coffee Break

10:30 – 11:30

Session 2: Applications in Imbalance

10:30 – 10:45 An Empirical Study of Applying Ensembles of Heterogeneous Classifiers on Imperfect Data
Kuo-Wei Hsu and Jaideep Srivastava
10:45 – 11:00 SVM based Credit Card Fraud Detection with Reject Cost and Class-dependent Error Cost
En-hui Zheng, Chao Zou, Jian Sun, and Le Chen
11:00 – 11:15 Imbalanced Training Sets and Induction from Multi-Label Text-Categorization Domains
Sareewan Dendamrongvit and Miroslav Kubat
11:15 – 11:30 Outlier Detection in Hyperspectral Imaging Based on Conditional Subspace Models
Edisanter Lo
11:30 – 12:00

Session 3: Mining Streams of Imbalanced Data

11:30 – 11:45 Adaptive Methods for Classification in Arbitrarily Imbalanced and Drifting Data Streams
Ryan Lichtenwalter and Nitesh Chawla
11:45 – 12:00 Classifier Ensemble for Mining Data Streams with Skewed Distribution
Yi Wang, Yang Zhang, and Yong Wang
12:00 – 12:30

Discussion and Conclusion
Nitesh Chawla and Zhi-Hua Zhou

Quality Issues, Measures of Interestingness and Evaluation of Data Mining Models (QIMIE’09)

13:30 – 13:35 Opening remarks
13:35 – 14:20

Chair: Kitsana Waiyamai

Interestingness Measures - Limits, Desiderata, and Recent Results.
Invited talk by Prof. Einoshin Suzuki (Kyushu University, Japan)

14:20 – 14:45

Chair: Russel Pears

Confidence Width: An Objective Measure for Association Rule Novelty
José L. Balcázar

14:45 – 15:10

Chair: Russel Pears

DCR: Discretization using Class Information to Reduce Number of Intervals
Prachya Pongaksorn, Thanawin Rakthanmanon, and Kitsana Waiyamai

15:10 – 15:30 Coffee Break

15:30 – 16:15

Chair: Hiroshi Motoda

A framework for monitoring classifiers’ performance: when and why failure occurs.
Invited talk by Prof. Nitesh V. Chawla (University of Notre Dame, USA)

16:15 – 16:40

Chair: José L. Balcázar

A fair will for seeing and believing: A feature cardinality driven distance measure to uninformative distributions
Joan Garriga

16:40 – 17:05

Chair: José L. Balcázar

Enhancing Rule Importance Measure Using Concept Hierarchy
Jiye Li, Nick Cercone, Serene W. H. Wong, and Lisa Yan

17:05 – 17:30

Chair: José L. Balcázar

False neighbourhoods and tears are the main mapping defaults. How to avoid it? How to exhibit remaining ones?
Sylvain Lespinats, Michael Aupetit

17:30 – 17:35 Closing remarks & Conclusion

Advances and Issues in Biomedical Data Mining (AIBDM’09)

8:55 – 10:15

Opening Remarks

Session 1
Chaired by Paul Kwan

9:00 – 9:25 Using Component Extraction Association Rules for Sensor Data
Di Dong
9:25 – 9:50 Traditional Chinese medicine clinical data mining: experiences and issues
Xuezhong Zhou
9:50 – 10:15 Mining Protein Interactions from Text using Convolution Kernels
Ramanathan Narayanan
10:15 – 10:35

Coffee Break

10:35 – 12:00

Session 2
Chaired by Josiah Poon

10:35 – 11:00 Comparison of Feature Selection Methods for Classification of Brain Computer Interface Data
Irena Koprinska
11:00 – 11:25 Multi-Relational Topic Model for TCM clinical knowledge discovery
Xiaoping Zhang
11:25 – 11:50 Multi-class Discriminant Analysis with Sparse Penalty for Gene Selection and Cancer Classification Using Gene Expression Data
Dao-Qing Dai
11:50 – 12:00

Closing Remarks
by AIBDM09 Chair (A/Prof Junbin Gao)

The first Open Source in Data Mining workshop (OSDM’09)

13:00 – 13:10

Opening and Welcome

13:10 – 14:10

Invited keynote

The WEKA open source data mining system
Dr. Mark Hall

14:10 – 15:10

Technical papers (part 1)

OpenSubspace: An Open Source Framework for Evaluation and Exploration of Subspace Clustering Algorithms in WEKA.
Emmanuel Mueller, Ira Assent, Stephan Guennemann, Timm Jansen, Thomas Seidl

The open source library iZi for pattern mining problems.
Frederic Flouvat, Fabien De Marchi, Jean-Marc Petit

15:10 – 15:30 Coffee Break

15:30 – 16:30

Technical papers (part 2)

The Konstanz Information Miner 2.0.
Thorsten Meinl, Nicolas Cebron, Thomas Gabriel, Kilian Thiel, Bernd Wiswedel, Michael Berthold, Fabian Dill, Peter Ohl, Tobias Koetter

Cougar2: An Open Source Machine Learning and Data Mining Development Platform.
Abraham Bagherjeiran, Oner Celepcikay, Rachsuda Jiamthapthaksin, Chunsheng Chen, Vadeerat Rinsurongkawong, Seungchan Lee, Justin Thomas, Christoph Eick

16:30 – 18:00

Demonstration session

The following tools will be demonstrated (30 minutes each):
- Rattle

18:00 – 18:30

Panel discussion

Topic will be one (or both) of:
- Why open source for data mining research and education?
- Why publish your data mining tool as open source software?


This will be an informal session where workshop participants are encouraged to participate.


Confirmed panel participants:
- Dr. Mark Hall, Pentaho, New Zealand (WEKA developer)
- Dr. Graham Williams, Togaware, Australia (Rattle developer)
- Dr. Nicolas Cebron, University of Konstanz, Germany (KNIME developer)
