Advanced Topics in Artificial Intelligence
Program
- Graphical Systems (Probabilistic)
- Logic-Based Approaches
- Other representations
- Integrated Systems
Instructor
- [Vítor Santos Costa](https://www.dcc.fc.up.pt/~vsc)
- Email: vsc@dcc.fc.up.pt
Slides
Probabilistic Systems
Main slides; also check the quick-reading slides:
- Adrian Weller, MLSALT4 graphical model
- Marc Toussaint, University of Stuttgart, Summer 2015, Machine Learning, Graphical Models
- For detailed information, try the Daphne Koller Open Class Slides
Not-so-quick Reading:
a. Daphne Koller and Nir Friedman, Probabilistic Graphical Models: everything you wanted to know, and everything you didn't
b. Kevin P. Murphy, Machine Learning: A Probabilistic Perspective: a probabilistic view of the world.
c. David Barber, Bayesian Reasoning and Machine Learning: PDF available from the author.
Propositional Inference
The following slides discuss the connection between SAT solvers, BDDs, and trees:
- Binary Decision Diagrams are one of the most widely used tools in CS. Their application to BNs was proposed by Minato et al., but they are not a very popular approach for compiling BNs. Model counting is also not widely used.
- The ProbLog language was initially implemented on BDDs. The ProbLog2 system can use BDDs, model counting, or trees.
- Arithmetic Circuits are a very efficient approach to Bayesian inference; the Darwiche group recently proposed an extension, SDDs.
- The connection to constraint solving is discussed by Dechter.
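To make the BDD connection concrete, here is a minimal sketch of weighted model counting over a hand-built BDD. The formula (x1 OR x2), the weights, and the node layout are illustrative assumptions, not taken from any particular system:

```python
# Terminals are the booleans True/False; internal nodes are tuples
# (variable_name, low_child, high_child), where low means "variable is false".
x2 = ("x2", False, True)
bdd = ("x1", x2, True)          # represents the formula: x1 OR x2

# Weight of each variable being true; the weight of false is 1 - w.
# Because w + (1 - w) = 1, variables skipped along a path contribute factor 1.
weights = {"x1": 0.3, "x2": 0.6}

def wmc(node):
    """Weighted model count: sums the weight of every satisfying assignment."""
    if node is True:
        return 1.0
    if node is False:
        return 0.0
    var, low, high = node
    w = weights[var]
    return (1.0 - w) * wmc(low) + w * wmc(high)

print(wmc(bdd))   # P(x1 or x2) = 1 - 0.7 * 0.4 = 0.72
```

With independent variable weights this is exactly the probability of the formula, which is why BDD compilation supports probabilistic inference.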
Optimisation
Optimisation is a fundamental tool for modern machine learning and other areas of AI. Techniques are often based on work from the Operations Research and Constraint communities.
- Discrete domains are used in areas such as planning and game playing. Solvers usually alternate between search and propagation, and are often implemented with hand-crafted tools. Constraint solvers, such as ECLiPSe and Gecode, and SAT solvers can also be used.
- Gradient-based optimisers are widely used in Machine Learning. Often, they assume the problem can be represented by convex functions. Pure gradient descent is the starting point; stochastic and second-order refinements are covered later in the course.
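As a minimal sketch of pure gradient descent on a convex function (the function, learning rate, and iteration count are illustrative choices):

```python
def grad_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient of the objective."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Minimise f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = grad_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))   # converges towards 3.0
```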
Classes
Aula I - Graphical Models [14/02]
- Probabilities: conditional probability, chain rule;
- Bayesian Networks
- Naive Bayes
- Ref:
- MLSALT4 graphical model
- Machine Learning, Graphical Models
- Murphy Book: CH 1 -> background material
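A small numeric sketch of the chain rule and Bayes' rule, the building blocks of this lecture. The disease/test probabilities below are made-up assumptions for illustration:

```python
# P(disease), P(positive test | disease), P(positive test | no disease)
p_d = 0.01
p_pos_given_d = 0.9
p_pos_given_not_d = 0.05

# Chain rule: P(pos, d) = P(pos | d) * P(d)
p_pos_and_d = p_pos_given_d * p_d

# Total probability: P(pos)
p_pos = p_pos_and_d + p_pos_given_not_d * (1 - p_d)

# Bayes' rule: P(d | pos)
p_d_given_pos = p_pos_and_d / p_pos
print(round(p_d_given_pos, 4))   # about 0.1538
```

Note how a 90%-accurate test still yields a low posterior when the prior is small.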
Aula II - Graphical Models [21/02]
- Log-likelihood
- Markov Models/HMMs
- Parameter optimisation
- Priors - Dirichlet, m-estimate
- Ref:
- MLSALT4 graphical model
- Machine Learning, Graphical Models
- Murphy Book: CH 10, 17.1-17.3
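A minimal sketch of the m-estimate (a Dirichlet-style prior) for smoothing maximum-likelihood parameter estimates. The counts, prior probability p, and equivalent sample size m below are illustrative assumptions:

```python
def m_estimate(n_success, n_total, p, m):
    """Smoothed estimate: (n_success + m * p) / (n_total + m)."""
    return (n_success + m * p) / (n_total + m)

# 3 heads in 4 tosses, prior p = 0.5 with equivalent sample size m = 2:
print(m_estimate(3, 4, p=0.5, m=2))   # (3 + 1) / (4 + 2) = 0.666...
```

With m = 0 this reduces to the maximum-likelihood estimate; larger m pulls the estimate towards the prior.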
Aula III - Graphical Models [28/02]
- Undirected models
- Inference: VE
- Ref:
- MLSALT4 graphical model
- Machine Learning, Graphical Models
- Murphy Book: CH 20.3 -> background material
Aula IV - Graphical Models [28/02]
- VE step by step
- Inference: MCMC, Gibbs, Sampling
- Ref:
- MLSALT4 graphical model
- Machine Learning, Graphical Models
- Murphy Book: CH 24.2 -> background material
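A minimal sketch of Gibbs sampling on a two-node network A -> B with evidence B = true. The CPT numbers are illustrative assumptions; with a single non-evidence variable the full conditional is exact, but the resampling loop is the general Gibbs scheme:

```python
import random

random.seed(0)
p_a = 0.3                      # P(A = true)
p_b = {True: 0.9, False: 0.2}  # P(B = true | A)

def conditional_a_given_b_true():
    """P(A = true | B = true): the 'full conditional' of A given the rest."""
    num = p_a * p_b[True]
    den = num + (1 - p_a) * p_b[False]
    return num / den

samples = []
a = True                       # arbitrary initial state
for _ in range(10000):
    # Resample A from its conditional given everything else (here: B = true).
    a = random.random() < conditional_a_given_b_true()
    samples.append(a)

estimate = sum(samples) / len(samples)
print(round(estimate, 2))      # close to the exact 0.27 / 0.41 ~= 0.66
```

In a larger network the same loop would cycle over every non-evidence variable, each resampled from its conditional given its Markov blanket.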
Aula V - Graphical Models vs Logic [28/02]
- Binary Decision Diagrams
- Compilation to BDDs
- Alternatives
- Refs
- [Slides](slides.pdf)
Aula VI - Graphical Models vs Logic [28/02]
- BDDs vs BN
- ProbLog
- Parameter Inference
- Refs
- [Slides](slides.pdf)
- ProbLog: Probabilistic Programming (DTAI, KU Leuven) - Introduction
Aula VII - Learning as Optimisation [28/02]
- Gradient descent
- Application: logistic regression
- The sigmoid function
- Refs
- Murphy Book: CH 8 to 8.3.4
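A minimal sketch of logistic regression trained by gradient descent, illustrating the sigmoid. The toy 1-D dataset, learning rate, and iteration count are illustrative assumptions:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# 1-D points with binary labels: positives sit at larger x.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

w, b, lr = 0.0, 0.0, 0.5
for _ in range(500):
    gw = gb = 0.0
    for x, y in data:
        err = sigmoid(w * x + b) - y   # gradient of the log-loss w.r.t. z
        gw += err * x
        gb += err
    w -= lr * gw / len(data)
    b -= lr * gb / len(data)

# The trained model should separate the two classes:
print(sigmoid(w * 2.0 + b) > 0.5, sigmoid(w * -2.0 + b) < 0.5)   # True True
```

The gradient has the same simple `(prediction - label) * input` form for every feature, which is what makes logistic regression a good first example of learning as optimisation.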
Aula VIII - Improving on Logistic Regression [28/02]
- The Hessian
- Stochastic Methods
- Using the past
- Refs
- Slides Hinton 6
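A minimal sketch of "using the past": gradient descent with momentum, where each update mixes the current gradient with an exponentially decaying average of previous ones. The function and hyperparameters are illustrative choices:

```python
def momentum_descent(grad, x0, lr=0.1, beta=0.9, steps=200):
    """Heavy-ball update: velocity accumulates past gradients."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v + grad(x)   # blend the new gradient into the velocity
        x = x - lr * v
    return x

# Minimise f(x) = (x - 3)^2 again; the gradient is 2 * (x - 3).
x_min = momentum_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 3))   # converges towards 3.0
```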
Aula IX - Neural Networks [28/02]
Aula X - TensorFlow
Aula XI - Deep Networks [28/02]
Grading
- Mini-Projects: 4x3 points, or 2x3
- Project: 6 points, or none
- Exam: 8 points
Mini-Projects
Submission:
To submit, you must either:
- give a brief presentation in class; or
- send the source + a small report + a run log (e.g., a Jupyter notebook) to vscosta AT fc.up.pt, subject TAIA.
Deadlines:
- VE: Apr 25th
- Datasets: May 9
- TensorFlow: May 23
- Tutorial/Presentation: last week
Presentation
Short presentation on your favorite AI topic.
Support Material
Tutorials
Why attend a tutorial? To introduce, explain, and comment on the material in an AI tutorial. Look for good tutorials at:
- IJCAI
- AAAI
- ECAI
- ICML
- NIPS
- ECML
- KDD
- ICDM
Graphical Models
- Implement variable elimination for Bayesian networks:
- Read the network using a standard format. I advise writing a small parser for the BIF format; you can reuse code from existing parsers on GitHub.
- Implement the core algorithm. The user should specify a query variable, and you should return a list of numbers, e.g.:
Pr? smoke
[0.8,0.2]
Please avoid copying existing code.
- Implement evidence. The interaction should be something like:
Pr? smoke|coughing
[0.9,0.1]
- Compare selection strategies, i.e., how to choose the variable to eliminate first. Example strategies are greedy search on the least number of variables, or on the smallest table size.
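A minimal sketch of one such greedy strategy: repeatedly eliminate the variable with the fewest neighbours in the interaction graph, connecting its remaining neighbours (fill-in). The graph below is an illustrative assumption, not one of the benchmark networks:

```python
graph = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "D"},
    "D": {"B", "C"},
}

def greedy_order(g):
    """Min-degree elimination ordering over an undirected interaction graph."""
    g = {v: set(ns) for v, ns in g.items()}   # work on a copy
    order = []
    while g:
        # Pick the variable with the fewest neighbours (min-degree).
        v = min(g, key=lambda u: len(g[u]))
        order.append(v)
        neighbours = g.pop(v)
        for u in neighbours:
            g[u].discard(v)
        # Connect the former neighbours pairwise (fill-in edges).
        for u in neighbours:
            for w in neighbours:
                if u != w and u in g and w in g:
                    g[u].add(w)
    return order

print(greedy_order(graph))   # ['A', 'B', 'C', 'D']
```

A smallest-table-size strategy is the same loop with the `min` key replaced by the product of the neighbouring variables' domain sizes.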
Evaluation:
Programs will be evaluated based on a short report and interaction with the authors. The parameters are:
- Running time: compare selection functions and/or against other packages;
- Memory size;
- Try different network sizes.
Minimum: you should at least run on the Asia and Cancer networks.
Machine Learning
Given two of the following datasets, plus a dataset of your choosing, compare the following machine learning models in terms of accuracy and running times:
- Naive Bayes
- Logistic Regression
- other of your choosing
TensorFlow
Given two of the following datasets, plus a dataset of your choosing, compare TensorFlow neural network models in terms of accuracy and running times. One of the models should be a model for sequences, using LSTM or GRU nodes. The report should include:
- A pictorial representation of each model
- A comparison with non-DNN models.
Datasets:
- Congressional Voting Records (UCI dataset repository): the task is to predict a vote based on the other votes.
- European Soccer dataset, found at Kaggle (you must specify your task from the data).
Evaluation:
- Which metrics?
- How to test?
- What are the most sensitive parameters?
Typical Exam Questions
I include a number of questions that would be typical of the exam. The first question should be worth 2 points (or more). The other questions will be worth circa 1 point:
- Given the Bayesian network (not depicted here), and assuming that you have evidence on variables X1 and X2, give the sequence of steps that would be taken by the VE algorithm to compute the posterior probability of Y. For each step, please describe the operation it would perform, and report both the input and output sizes.
- Explain the difference between generative and discriminative models by comparing the Naive Bayes classifier with another model.
- Of the two trees above, which one is a proper BDD? Why?
- Fig. ... shows the structure of a GAN. What is a GAN for, and how does it work?
- Stochastic gradient descent is widely combined with mini-batches. What is a mini-batch, and why is this done?
- TensorFlow is often used through the Keras API. Give an example of setting up Keras for a neural network with 2 internal layers.
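As a hedged sketch of an answer to the last question, using the tf.keras Sequential API; the layer sizes, activations, input shape, and optimiser are illustrative choices, not the only correct answer:

```python
import tensorflow as tf

# A classifier with 2 internal (hidden) layers; sizes are arbitrary choices.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),  # internal layer 1
    tf.keras.layers.Dense(32, activation="relu"),                     # internal layer 2
    tf.keras.layers.Dense(1, activation="sigmoid"),                   # output layer
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```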