Reading work
The papers listed below were selected from the list of papers accepted at
the 2018 ACM KDD conference.
Form groups of at most 3 people. Each group chooses
a unique paper. You will have one hour to read and discuss the
paper in order to answer the questions that follow. Each group will
give a brief presentation in the last half hour of the class.
1. ACM KDD 2018 list of accepted papers:
- Discovering Non-Redundant K-means Clusterings in Optimal Subspaces Dominik Mautz (Ludwig Maximilian University of Munich); Wei Ye (Ludwig Maximilian University of Munich); Claudia Plant (University of Vienna); Christian B
- Why should I trust you?
- Unlearn What You Have Learned: Adaptive Crowd Teaching with Exponentially Decayed Memory Learners Yao Zhou (Arizona State University); Arun Reddy Nelakurthi (Arizona State University); Jingrui He (Arizona State University)
- Scalable k-Means Clustering via Lightweight Coresets Olivier Bachem (ETH Zurich); Mario Lucic (Google); Andreas Krause (ETH Zurich)
- TextTruth: An Unsupervised Approach to Discover Trustworthy Information from Multi-Sourced Text Data Hengtong Zhang (SUNY at Buffalo); Yaliang Li (Baidu Research); Fenglong Ma (SUNY Buffalo); Jing Gao (University at Buffalo); Lu Su (The State University of New York at Buffalo)
- Graph Classification using Structural Attention John Boaz Lee (WPI); Ryan Rossi (Adobe Research); Xiangnan Kong (WPI)
- You Are How You Drive: Peer and Temporal-Aware Representation Learning for Driving Behavior Analysis Pengyang Wang (Missouri University of Science and Technology); Yanjie Fu (Missouri University of Science and Technology); Jiawei Zhang (Florida State University); Pengfei Wang (CNIC, Chinese Academy of Sciences); Yu Zheng (Urban Computing Business Unit, JD Finance); Charu Aggarwal (IBM)
- MiSoSouP: Mining Interesting Subgroups with Sampling and Pseudodimension Matteo Riondato (Two Sigma Investments, LP); Fabio Vandin (University of Padova)
- Towards Mitigating the Class-Imbalance Problem for Partial Label Learning Jing Wang (Southeast University); Min-Ling Zhang (Southeast University)
- Complex Object Classification: A Multi-Modal Multi-Instance Multi-Label Deep Network with Optimal Tr Yang Yang (Nanjing University); Yi-Feng Wu (LAMDA Group, Nanjing University); De-Chuan Zhan (Nanjing University); Zhi-Bin Liu (Tencent); Yuan Jiang (Nanjing University)
- TruePIE: Discovering Reliable Patterns in Pattern-Based Information Extraction Qi Li (University of Illinois at Urbana-Champaign); Meng Jiang (University of Notre Dame); Xikun Zhang (University of Illinois at Urbana-Champaign); Meng Qu (University of Illinois at Urbana-Champaign); Timothy Hanratty (US Army Research Laboratory); Jing Gao (University at Buffalo); Jiawei Han (University of Illinois at Urbana-Champaign)
- Risk Prediction on Electronic Healthcare Records with Prior Medical Knowledge Fenglong Ma (SUNY Buffalo); Jing Gao (SUNY Buffalo); Qiuling Suo (SUNY Buffalo); Quanzeng You (Microsoft AI & Research); Jing Zhou (Eheath Inc); Aidong Zhang (SUNY Buffalo)
- Algorithms for Hiring and Outsourcing in the Online Labor Market Aris Anagnostopoulos (Sapienza University of Rome); Carlos Castillo (Universitat Pompeu Fabra); Adriano Fazzone (Sapienza University of Rome); Stefano Leonardi (Sapienza University of Rome); Evimaria Terzi (Boston University)
- PCA by Determinant Optimization has no Spurious Local Optima Raphael Hauser (University of Oxford); Armin Eftekhari (Alan Turing Institute); Heinrich Matzinger (Georgia Institute of Technology)
- Efficient Large-Scale Fleet Management via Multi-Agent Deep Reinforcement Learning Kaixiang Lin (Michigan State University); Renyu Zhao (AI Labs, Didi Chuxing); Zhe Xu (AI Labs, Didi Chuxing); Jiayu Zhou (Michigan State University)
- TaxoGen: Constructing Topical Concept Taxonomy by Adaptive Term Embedding and Clustering Chao Zhang (University of Illinois at Urbana-Champaign); Fangbo Tao (Facebook); Xiusi Chen (University of Illinois at Urbana-Champaign); Jiaming Shen (University of Illinois at Urbana-Champaign); Meng Jiang (University of Notre Dame); Brian Sadler (U.S. Army Research Lab); Michelle Vanni (U.S. Army Research Lab); Jiawei Han (University of Illinois at Urbana-Champaign)
- IntelliLight: a Reinforcement Learning Approach for Intelligent Traffic Light Control Hua Wei (The Pennsylvania State University); Guanjie Zheng (The Pennsylvania State University); Huaxiu Yao (The Pennsylvania State University); Zhenhui Li (The Pennsylvania State University)
- Generalized Score Functions for Causal Discovery Biwei Huang (Carnegie Mellon University); Kun Zhang (Carnegie Mellon University); Yizhu Lin (Carnegie Mellon University); Bernhard Schölkopf (Max-Planck Institute for Intelligent Systems); Clark Glymour (Carnegie Mellon University)
- XiaoIce Band: A Melody and Arrangement Generation Framework for Pop Music Hongyuan Zhu (USTC); Qi Liu (USTC); Nicholas Jing Yuan (Microsoft); Chuan Qin (USTC); Jiawei Li (Soochow University); Kun Zhang (USTC); Guang Zhou (Microsoft); Furu Wei (Microsoft); Yuanchun Xu (Microsoft); Enhong Chen (USTC)
- Training Big Random Forests with Little Resources Fabian Gieseke (University of Copenhagen); Christian Igel (University of Copenhagen)
2. ACM KDD 2018 list of accepted posters:
Extremely Fast Decision Tree Chaitanya Manapragada (Monash University); Geoffrey Webb (Monash University); Mahsa Salehi (Monash University)
On the Generative Discovery of Structured Medical Knowledge Chenwei Zhang (University of Illinois at Chicago); Yaliang Li (Baidu Research Big Data Lab); Nan Du (Tencent Medical AI Lab); Wei Fan (Tencent Medical AI Lab); Philip S. Yu (University of Illinois at Chicago)
3. Questions:
- What is the data mining or learning task in the paper?
- What is the data mining/machine learning method used?
- What are the characteristics of the data used? (dimension, types
of variables, imbalanced? missing? etc.)
- What evaluation metrics are used?
- What is the validation method used?
- Is the method applied to various datasets or just one?
- Is the method compared with other methods?
- Can you conclude that the experimental methodology is sound?
- Can you conclude that the experimental results are good?
- Are the models applied in practice? (would the model generalize? Is the model biased?)
- Using only the information contained in the paper, would you be
able to reproduce the results?