Advanced Topics in Artificial Intelligence
Program
- Graphical Systems (Probabilistic)
- Logic-Based Approaches
- Other representations
- Integrated Systems
Instructor
- [Vítor Santos Costa](https://www.dcc.fc.up.pt/~vsc)
- Email: vsc@dcc.fc.up.pt
Slides
Probabilistic Systems
Main slides; also check the quick-reading slides:
- Adrian Weller, MLSALT4 graphical model
- Marc Toussaint, University of Stuttgart, Summer 2015, Machine Learning, Graphical Models
- For detailed information, try the Daphne Koller Open Class Slides
Not-so-quick Reading:
a. Daphne Koller and Nir Friedman, Probabilistic Graphical Models: everything you wanted to know, and everything you didn't
b. Kevin P. Murphy, Machine Learning: A Probabilistic Perspective: a probabilistic view of the world.
c. David Barber, Bayesian Reasoning and Machine Learning: PDF available from the author.
Propositional Inference
The following slides discuss the connection between SAT solvers, BDDs, and trees:
- Binary Decision Diagrams are one of the most widely used tools in CS. Their application to BNs was proposed by Minato et al., but they are not a very popular approach for compiling BNs. Model counting is also not widely used.
- The ProbLog language was initially implemented on BDDs. The ProbLog2 system can use BDDs, model counting, or trees.
- Arithmetic Circuits are a very efficient approach to Bayesian inference; the Darwiche group recently proposed an extension, SDDs.
- The connection to constraint solving is discussed by Dechter.
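To make the BDD connection concrete, here is a minimal sketch of weighted model counting over a hand-built BDD. The formula (x1 OR x2), the weights, and the node layout are illustrative assumptions, not taken from any particular system:

```python
# Terminals are the booleans True/False; internal nodes are tuples
# (variable_name, low_child, high_child), where low means "variable is false".
x2 = ("x2", False, True)
bdd = ("x1", x2, True)          # represents the formula: x1 OR x2

# Weight of each variable being true; the weight of false is 1 - w.
# Because w + (1 - w) = 1, variables skipped along a path contribute factor 1.
weights = {"x1": 0.3, "x2": 0.6}

def wmc(node):
    """Weighted model count: sums the weight of every satisfying assignment."""
    if node is True:
        return 1.0
    if node is False:
        return 0.0
    var, low, high = node
    w = weights[var]
    return (1.0 - w) * wmc(low) + w * wmc(high)

print(wmc(bdd))   # P(x1 or x2) = 1 - 0.7 * 0.4 = 0.72
```

With independent variable weights this is exactly the probability of the formula, which is why BDD compilation supports probabilistic inference.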
Optimisation
Optimisation is a fundamental tool for modern machine learning and other areas of AI. Techniques are often based on work from the Operations Research and Constraint communities.
- Discrete domains are used in areas such as planning and game playing. Solvers usually alternate between search and propagation, and are often implemented with hand-crafted tools. Constraint solvers, such as ECLiPSe and Gecode, and SAT solvers can also be used.
- Gradient-based optimisers are widely used in Machine Learning. Often, they assume the problem can be represented by convex functions. Pure gradient descent is the starting point; stochastic and second-order refinements are covered later in the course.
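As a minimal sketch of pure gradient descent on a convex function (the function, learning rate, and iteration count are illustrative choices):

```python
def grad_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient of the objective."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Minimise f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = grad_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))   # converges towards 3.0
```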
Classes
Aula I - Graphical Models [14/02]
- Probabilities: conditional probability, chain rule;
- Bayesian Networks
- Naive Bayes
- Ref:
- MLSALT4 graphical model
- Machine Learning, Graphical Models
- Murphy Book: CH 1 -> background material
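A small numeric sketch of the chain rule and Bayes' rule, the building blocks of this lecture. The disease/test probabilities below are made-up assumptions for illustration:

```python
# P(disease), P(positive test | disease), P(positive test | no disease)
p_d = 0.01
p_pos_given_d = 0.9
p_pos_given_not_d = 0.05

# Chain rule: P(pos, d) = P(pos | d) * P(d)
p_pos_and_d = p_pos_given_d * p_d

# Total probability: P(pos)
p_pos = p_pos_and_d + p_pos_given_not_d * (1 - p_d)

# Bayes' rule: P(d | pos)
p_d_given_pos = p_pos_and_d / p_pos
print(round(p_d_given_pos, 4))   # about 0.1538
```

Note how a 90%-accurate test still yields a low posterior when the prior is small.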
Aula II - Graphical Models [21/02]
- Log-likelihood
- Markov Models/HMMs
- Parameter optimisation
- Priors - Dirichlet, m-estimate
- Ref:
- MLSALT4 graphical model
- Machine Learning, Graphical Models
- Murphy Book: CH 10, 17.1-17.3
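A minimal sketch of the m-estimate (a Dirichlet-style prior) for smoothing maximum-likelihood parameter estimates. The counts, prior probability p, and equivalent sample size m below are illustrative assumptions:

```python
def m_estimate(n_success, n_total, p, m):
    """Smoothed estimate: (n_success + m * p) / (n_total + m)."""
    return (n_success + m * p) / (n_total + m)

# 3 heads in 4 tosses, prior p = 0.5 with equivalent sample size m = 2:
print(m_estimate(3, 4, p=0.5, m=2))   # (3 + 1) / (4 + 2) = 0.666...
```

With m = 0 this reduces to the maximum-likelihood estimate; larger m pulls the estimate towards the prior.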
Aula III - Graphical Models [28/02]
- Undirected models
- Inference: VE
- Ref:
- MLSALT4 graphical model
- Machine Learning, Graphical Models
- Murphy Book: CH 20.3 -> background material
Aula IV - Graphical Models [28/02]
- VE step by step
- Inference: MCMC, Gibbs, Sampling
- Ref:
- MLSALT4 graphical model
- Machine Learning, Graphical Models
- Murphy Book: CH 24.2 -> background material
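A minimal sketch of Gibbs sampling on a two-node network A -> B with evidence B = true. The CPT numbers are illustrative assumptions; with a single non-evidence variable the full conditional is exact, but the resampling loop is the general Gibbs scheme:

```python
import random

random.seed(0)
p_a = 0.3                      # P(A = true)
p_b = {True: 0.9, False: 0.2}  # P(B = true | A)

def conditional_a_given_b_true():
    """P(A = true | B = true): the 'full conditional' of A given the rest."""
    num = p_a * p_b[True]
    den = num + (1 - p_a) * p_b[False]
    return num / den

samples = []
a = True                       # arbitrary initial state
for _ in range(10000):
    # Resample A from its conditional given everything else (here: B = true).
    a = random.random() < conditional_a_given_b_true()
    samples.append(a)

estimate = sum(samples) / len(samples)
print(round(estimate, 2))      # close to the exact 0.27 / 0.41 ~= 0.66
```

In a larger network the same loop would cycle over every non-evidence variable, each resampled from its conditional given its Markov blanket.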
Aula V - Graphical Models vs Logic [28/02]
- Binary Decision Diagrams
- Compilation to BDDs
- Alternatives
- Refs
- [Slides](slides.pdf)
Aula VI - Graphical Models vs Logic [28/02]
- BDDs vs BN
- ProbLog
- Parameter Inference
- Refs
- [Slides](slides.pdf)
- ProbLog: Probabilistic Programming (DTAI, KU Leuven) - Introduction
Aula VII - Learning as Optimisation [28/02]
- Gradient descent
- Application: logistic regression
- The sigmoid function
- Refs
- Murphy Book: CH 8 to 8.3.4
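A minimal sketch of logistic regression trained by gradient descent, illustrating the sigmoid. The toy 1-D dataset, learning rate, and iteration count are illustrative assumptions:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# 1-D points with binary labels: positives sit at larger x.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

w, b, lr = 0.0, 0.0, 0.5
for _ in range(500):
    gw = gb = 0.0
    for x, y in data:
        err = sigmoid(w * x + b) - y   # gradient of the log-loss w.r.t. z
        gw += err * x
        gb += err
    w -= lr * gw / len(data)
    b -= lr * gb / len(data)

# The trained model should separate the two classes:
print(sigmoid(w * 2.0 + b) > 0.5, sigmoid(w * -2.0 + b) < 0.5)   # True True
```

The gradient has the same simple `(prediction - label) * input` form for every feature, which is what makes logistic regression a good first example of learning as optimisation.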
Aula VIII - Improving on Logistic Regression [28/02]
- The Hessian
- Stochastic Methods
- Using the past
- Refs
- Slides Hinton 6
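A minimal sketch of "using the past": gradient descent with momentum, where each update mixes the current gradient with an exponentially decaying average of previous ones. The function and hyperparameters are illustrative choices:

```python
def momentum_descent(grad, x0, lr=0.1, beta=0.9, steps=200):
    """Heavy-ball update: velocity accumulates past gradients."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v + grad(x)   # blend the new gradient into the velocity
        x = x - lr * v
    return x

# Minimise f(x) = (x - 3)^2 again; the gradient is 2 * (x - 3).
x_min = momentum_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 3))   # converges towards 3.0
```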
Aula IX - Neural Networks [28/02]
Aula X - TensorFlow
Aula XI - Deep Networks [28/02]
Grading
- Mini-Projects: 4x3 points, or 2x3
- Project: 6 points, or none
- Exam: 8 points
Mini-Projects
Submission:
To submit, you must either:
- give a brief presentation in class; or
- send the source + a small report + a run log (e.g., a Jupyter notebook) to vscosta AT fc.up.pt, subject TAIA.
Deadlines:
- VE: Apr 25th
- Datasets: May 9
- TensorFlow: May 23
- Tutorial/Presentation: last week
Presentation
Short presentation on your favorite AI topic.
Support Material
Tutorials
Why attend a tutorial? To introduce, explain, and comment on the material in an AI tutorial. Look for good tutorials at:
- IJCAI
- AAAI
- ECAI
- ICML
- NIPS
- ECML
- KDD
- ICDM
Graphical Models
- Implement variable elimination for Bayesian networks:
- Read the network using a standard format. I advise writing a small parser for the BIF format; you can reuse code from existing parsers on GitHub.
- Implement the core algorithm. The user should specify a query variable, and you should return a list of numbers, e.g.:
Pr? smoke
[0.8,0.2]
Please avoid copying existing code.
- Implement evidence. The interaction should be something like:
Pr? smoke|coughing
[0.9,0.1]
- Compare selection strategies, i.e., how to choose the variable to eliminate first. Example strategies are greedy search on the least number of variables, or on the smallest table size.
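A minimal sketch of one such greedy strategy: repeatedly eliminate the variable with the fewest neighbours in the interaction graph, connecting its remaining neighbours (fill-in). The graph below is an illustrative assumption, not one of the benchmark networks:

```python
graph = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "D"},
    "D": {"B", "C"},
}

def greedy_order(g):
    """Min-degree elimination ordering over an undirected interaction graph."""
    g = {v: set(ns) for v, ns in g.items()}   # work on a copy
    order = []
    while g:
        # Pick the variable with the fewest neighbours (min-degree).
        v = min(g, key=lambda u: len(g[u]))
        order.append(v)
        neighbours = g.pop(v)
        for u in neighbours:
            g[u].discard(v)
        # Connect the former neighbours pairwise (fill-in edges).
        for u in neighbours:
            for w in neighbours:
                if u != w and u in g and w in g:
                    g[u].add(w)
    return order

print(greedy_order(graph))   # ['A', 'B', 'C', 'D']
```

A smallest-table-size strategy is the same loop with the `min` key replaced by the product of the neighbouring variables' domain sizes.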
Evaluation:
Programs will be evaluated based on a short report and interaction with the authors. The parameters are:
- Running time: compare selection functions and/or against other packages;
- Memory size;
- Try different network sizes.
Minimum: you should at least run on the Asia and Cancer networks.
Machine Learning
Given two of the following datasets, plus a dataset of your choosing, compare the following machine learning models in terms of accuracy and running times:
- Naive Bayes
- Logistic Regression
- other of your choosing
TensorFlow
Given two of the following datasets, plus a dataset of your choosing, compare TensorFlow neural network models in terms of accuracy and running times. One of the models should be a model for sequences, using LSTM or GRU nodes. The report should include:
- A pictorial representation of each model
- A comparison with non-DNN models.
Datasets:
- Congressional Voting Records (UCI dataset repository): the task is to predict a vote based on the other votes.
- European Soccer dataset, found at Kaggle (you must specify your task from the data).
Evaluation:
- Which metrics?
- How to test?
- What are the most sensitive parameters?
Typical Exam Questions
I include a number of questions that would be typical of the exam. The first question should be worth 2 points (or more). The other questions will be worth circa 1 point:
- Given the Bayesian network (not depicted here), and assuming that you have evidence on variables X1 and X2, give the sequence of steps that would be taken by the VE algorithm to compute the posterior probability of Y. For each step, please describe the operation it would perform, and report both the input and output sizes.
- Explain the difference between generative and discriminative models by comparing the Naive Bayes classifier with another model.
- Of the two trees above, which one is a proper BDD? Why?
- Fig. ... shows the structure of a GAN. What is a GAN for, and how does it work?
- Stochastic gradient descent is widely combined with mini-batches. What is a mini-batch, and why is this done?
- TensorFlow is often used through the Keras API. Give an example of setting up Keras for a neural network with 2 internal layers.
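As a hedged sketch of an answer to the last question, using the tf.keras Sequential API; the layer sizes, activations, input shape, and optimiser are illustrative choices, not the only correct answer:

```python
import tensorflow as tf

# A classifier with 2 internal (hidden) layers; sizes are arbitrary choices.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),  # internal layer 1
    tf.keras.layers.Dense(32, activation="relu"),                     # internal layer 2
    tf.keras.layers.Dense(1, activation="sigmoid"),                   # output layer
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```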