EE448 Big Data Mining 2019 

Weinan Zhang, Assistant Professor

John Hopcroft Center for Computer Science
Shanghai Jiao Tong University

Email: wnzhang [AT]

Big-data-driven techniques have been revolutionizing various aspects of our daily life. Big data means not only large volume but also high dimensionality and diversity. How to collect, represent, process, and compute on such data so as to mine valuable patterns and derive benefit from it is a fundamental challenge for both academia and industry.

This course provides a comprehensive introduction to the fundamental problems and methodologies of big data mining. The course is organized around applications, which helps SEIEE students get familiar with various data mining tasks and their basic solutions. Through lectures, hands-on coursework, and poster presentations, students are expected to acquire the basic theory, algorithms, and some practical experience of big data mining techniques. The course should also help students find research topics that interest them, which could benefit their later graduate study and industrial practice.


Please submit the poster and coursework reports on time!

Course Works

Course work 1: Question-Answer Algorithms on Paper Reading
Select the better of the two answers given for each abstract and question pair.
Apr. 3 - May 14, 2019.

Course work 2: Node Classification on Academic Network
To solve a multi-label classification problem in an academic network.
Apr. 10 - May 29, 2019.


Lecture 1: Introduction to Big Data Mining
Basic concepts, history and some examples of data mining.
Feb. 27, 2019

Lecture 2: Know Your Data
Data representation, visualization and proximity measures.
Mar. 6, 2019

Lecture 3: Fundamental Data Mining Algorithms
Frequent patterns, association rules, Apriori, FPGrowth, KNN.
Mar. 13, 2019

Lecture 4: Supervised Learning (Part I)
Intro to machine learning, linear regression and logistic regression.
Mar. 20 and 27, 2019

Lecture 5: Supervised Learning (Part II)
Support Vector Machines, Neural Networks.
Apr. 3, 2019

Lecture 6: Supervised Learning (Part III)
Tree models, Ensemble Methods
Apr. 3 and 10, 2019

Lecture 7: Unsupervised Learning
K-means Clustering, PCA, Gaussian Mixture Models, EM Methods
Apr. 17, 2019

Lecture 8: Search Engines
Information retrieval, inverted index, retrieval model, relevance model
May 8, 2019

Lecture 9: Learning to Rank
Ranking problem, pairwise/listwise ranking, LambdaRank
May 15, 2019

Lecture 10: Recommender Systems
Information filtering, collaborative filtering, matrix factorization
May 15-22, 2019

Lecture 11: Computational Ads
Computational advertising, auctions, sponsored search, contextual ads
May 22-29, 2019

Lecture 12: Behavioral Targeting
Display advertising, RTB, fraud detection
Jun. 5, 2019

Related Readings

Teaching Assistants

Haiwen Wang, 2018 Ph.D. student at IIoT
Research on data mining, graph deep learning
Email: wanghaiwencn [at]

Zhaorun Han, 2018 M.S. student at IIoT
Research on big data analysis
Email: hanzhaorun [at]

Past Course

EE448 Big Data Mining 2018
Compared to EE448 2018, EE448 2019 will cover more data mining scenarios and more advanced data mining research.


Mar. 1, 2019
Haiwen Wang and Zhaorun Han appointed as the TAs of this course.

Feb. 26, 2019
Web site created!