Course 4541.676, 132.650
Probabilistic Graphical Models (Artificial Neural Networks, Studies in Artificial Intelligence and Cognitive Process)

 

 

School of Computer Science and Engineering,
Seoul National University

Instructor

   Prof. Byoung-Tak Zhang

TA

    (Tel: 02-880-5890, Rm. 409, Bldg. 138)

Classroom

    302-209

Time

Mon & Wed, 13:00-14:15

Objectives

- To study the information theory and statistical physics of computational learning
- To understand the architectures and principles of probabilistic graphical models
- To study the algorithms for learning and inference in graphical models
- To learn how to use graphical models for neural and cognitive modeling
- To practice the use of the hypernetwork models for language and vision computing

Textbook

[1] Beckerman, M., Adaptive Cooperative Systems, Wiley, 1997. (Chs. 2, 3, 4 & 6).

[2] Bishop, C. M., Pattern Recognition and Machine Learning, Springer, 2006. (Chs. 8, 9 & 11).

[3] Haykin, S., Neural Networks: A Comprehensive Foundation, Prentice Hall, 1999. (Chs. 10 & 11). 

[4] MacKay, D. J. C., Information Theory, Inference, and Learning Algorithms, Cambridge University Press, 2003. (Chs. 42 & 43).

[5] Zhang, B.-T., Hypernetworks: A Molecular Evolutionary Architecture for Cognitive Learning and Memory, IEEE Computational Intelligence Magazine, 3(3):49-63, 2008. [PDF]

References

[1] Geman, S. and Geman, D., Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans. on Patt. Anal. and Mach. Int., 6(6): 721-741, 1984. [PDF]

[2] Hinton, G. E., Dayan, P., Frey, B. J., and Neal, R. M., The wake-sleep algorithm for unsupervised neural networks, Science 268(5214): 1158-1161, 1995 [PDF].

[3] Hopfield, J. J., Neural networks and physical systems with emergent collective computational abilities, Proc. Natl. Acad. Sci. USA, 79: 2554-2558, 1982. [PDF]

[4] Kirkpatrick, S., Gelatt Jr, C. D., and Vecchi, M. P., Optimization by simulated annealing, Science, 220: 671-680, 1983. [PDF]

[5] Solomonoff, R. J., A formal theory of inductive inference, Information and Control, 7: 1-22, 1964 [PDF1][PDF2].

[6] Valiant, L. G., A theory of the learnable, Comm. ACM, 27(11): 1134-1142, 1984. [PDF]

Evaluation

- Two exams (60%, revised to 70%)
- Term project (20%, revised to 10%: mini report on software practice)
- Essay (10%)
- Participation in discussion (10%)

 

Notice

[Mid-term score] [Link]

[Software practice]

    1. Practice material
    2. Project report (max: 3 pages), due Dec. 13 (23:59).
       - Late submissions accepted until Dec. 14 (13:00); report files received after 13:00 will be ignored, so please send your report as soon as possible.
       - Email your report to the TA.
       - Supplementary materials:
         a. Zhang, B.-T., Hypernetworks: A Molecular Evolutionary Architecture for Cognitive Learning and Memory, IEEE Computational Intelligence Magazine, 3(3):49-63, 2008. [PDF] - New!
         b. Hypernetworks for Human-level Machine Learning [PDF] - New!
         c. Evolving Hypernetworks for Language Modeling [PDF] - New!
       - Unsubmitted report: 20457

[Evaluation criteria change]

 

 

 

Course Schedule

Date | Topic | Lecture Notes

Week 1

Computational Learning Theory

   - Learning as a lifelong process

   - Inductive inference

   - Statistical learning theory and the VC dimension

   - Computational learning theory and the PAC model

   - Towards self-teaching cognitive agents

[PDF1]
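
For the PAC model above, a standard textbook sample-complexity bound for a finite hypothesis class H (stated here for reference; the lecture notes may use a different form): with probability at least 1 - \delta, every hypothesis consistent with m i.i.d. examples has true error at most \epsilon provided

    m \ge \frac{1}{\epsilon} \left( \ln|H| + \ln\frac{1}{\delta} \right).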

Week 2

Information Theory and Statistical Mechanics

   - Probability, information, and entropy

   - Lagrange multipliers

   - Maximum entropy methods

   - Cross entropy and KL divergence

   - Mutual information

   - Bayesian inference

   - Errors, likelihoods, and risks

   - ML and MAP estimations

[PDF2]
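
For reference, the central quantities of this unit in standard notation (definitions only):

    H(X) = -\sum_x p(x)\log p(x), \qquad
    D_{\mathrm{KL}}(p\,\|\,q) = \sum_x p(x)\log\frac{p(x)}{q(x)}, \qquad
    I(X;Y) = D_{\mathrm{KL}}\bigl(p(x,y)\,\|\,p(x)\,p(y)\bigr).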

Week 3

Ising Models and Hopfield Networks

   - Mean-field theory

   - Lattice gas models

   - Renormalization group methods

   - Hopfield networks

   - Lyapunov functions

   - Associative memory

   - Cohen-Grossberg Theorem

[PDF3]
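
A minimal companion sketch for the Hopfield-network topics above: Hebbian storage, asynchronous recall, and the Lyapunov energy. This is an illustrative toy (pattern count, network size, and seeds are arbitrary choices), not course code.

    import numpy as np

    def train_hebbian(patterns):
        """Hebbian rule: W = (1/N) * sum_p x_p x_p^T, with zero diagonal."""
        n = patterns.shape[1]
        W = patterns.T @ patterns / n
        np.fill_diagonal(W, 0.0)
        return W

    def energy(W, s):
        """Lyapunov function E = -(1/2) s^T W s; non-increasing under updates."""
        return -0.5 * s @ W @ s

    def recall(W, state, n_steps, rng):
        """Asynchronous updates: one randomly chosen unit at a time."""
        s = state.copy()
        for _ in range(n_steps):
            i = rng.integers(len(s))
            s[i] = 1 if W[i] @ s >= 0 else -1
        return s

    if __name__ == "__main__":
        rng = np.random.default_rng(42)
        patterns = rng.choice([-1, 1], size=(3, 64))    # three stored memories
        W = train_hebbian(patterns)
        noisy = patterns[0] * rng.choice([1, -1], size=64, p=[0.9, 0.1])
        s = recall(W, noisy, 500, rng)
        print("energy before/after:", energy(W, noisy), energy(W, s))
        print("overlap with stored pattern:", s @ patterns[0] / 64)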

Week 4

Boltzmann Machines and Helmholtz Machines

   - Basic architecture

   - Simulated annealing

   - Learning in Boltzmann machines

   - Helmholtz machines

   - Wake-sleep algorithms
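
The simulated-annealing item above, as a minimal sketch on a toy double-well energy (the Metropolis acceptance rule and geometric cooling are standard; the energy function and schedule constants are illustrative assumptions):

    import math, random

    def anneal(energy, propose, x0, T0=2.0, cooling=0.995, n_steps=5000):
        """Minimize `energy` by Metropolis moves under a falling temperature."""
        x, T = x0, T0
        for _ in range(n_steps):
            x_new = propose(x)
            dE = energy(x_new) - energy(x)
            # Always accept downhill moves; uphill with probability exp(-dE/T).
            if dE <= 0 or random.random() < math.exp(-dE / T):
                x = x_new
            T *= cooling
        return x

    if __name__ == "__main__":
        random.seed(0)
        E = lambda x: (x * x - 1) ** 2 + 0.3 * x   # global minimum near x = -1
        step = lambda x: x + random.gauss(0, 0.5)
        print("found minimum near:", round(anneal(E, step, x0=2.0), 3))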

Week 5

Sampling Algorithms

   - Markov chain Monte Carlo (MCMC)

   - Metropolis algorithms

   - Gibbs sampling

   - Importance sampling

   - Evolutionary computation

   - Estimation of distribution algorithms (EDAs)

   - Sampling importance resampling (SIR)

   - Evolutionary MCMC

[Sampling algorithms]
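
The Metropolis algorithm above in its simplest form: a random-walk sampler for an unnormalized target density, here exp(-x^2/2) (a standard normal). The target, proposal width, and sample count are illustrative choices.

    import math, random

    def log_p(x):
        """Log of the target density, up to an additive constant."""
        return -0.5 * x * x

    def metropolis(n_samples, step=1.0, x0=0.0):
        x, samples = x0, []
        for _ in range(n_samples):
            x_new = x + random.gauss(0, step)
            # Symmetric proposal, so the acceptance ratio is just p(x')/p(x).
            if random.random() < math.exp(min(0.0, log_p(x_new) - log_p(x))):
                x = x_new
            samples.append(x)
        return samples

    if __name__ == "__main__":
        random.seed(0)
        s = metropolis(20000)
        mean = sum(s) / len(s)
        var = sum((v - mean) ** 2 for v in s) / len(s)
        print(f"sample mean {mean:.2f}, variance {var:.2f} (target: 0, 1)")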

Week 6

Exam 1 and discussion

Week 7-8

Markov Random Fields

   - Random fields on graphs

   - Markov random fields

   - Hammersley-Clifford Theorem

   - Coupled Markov random fields

   - Conditional random fields

   - Markov logic

[Graphical Models1]

[Graphical Models2]

[MRF-Lecture Notes]
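
The Hammersley-Clifford theorem listed above says that a strictly positive distribution satisfies the Markov properties of an undirected graph if and only if it factorizes over the graph's maximal cliques \mathcal{C}:

    p(x) = \frac{1}{Z}\prod_{c\in\mathcal{C}} \psi_c(x_c), \qquad
    Z = \sum_x \prod_{c\in\mathcal{C}} \psi_c(x_c).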

Week 9

Mixture Models

    - Mixtures of Gaussians

    - Maximum likelihood

    - Expectation maximization (EM)

    - Relation to k-means

[Latent Variable Models]
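
The EM-for-mixtures topic above as a minimal sketch: a two-component 1-D Gaussian mixture fitted by alternating E-steps (responsibilities) and M-steps (weighted re-estimation). Data, initial values, and iteration counts are illustrative.

    import math, random

    def normal_pdf(x, mu, var):
        return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

    def em(data, n_iters=50):
        mu, var, pi = [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5]   # initial guesses
        for _ in range(n_iters):
            # E-step: responsibility of each component for each data point.
            resp = []
            for x in data:
                w = [pi[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
                z = sum(w)
                resp.append([wk / z for wk in w])
            # M-step: re-estimate parameters from weighted statistics.
            for k in range(2):
                nk = sum(r[k] for r in resp)
                mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
                var[k] = sum(r[k] * (x - mu[k]) ** 2
                             for r, x in zip(resp, data)) / nk + 1e-6
                pi[k] = nk / len(data)
        return mu, var, pi

    if __name__ == "__main__":
        random.seed(0)
        data = ([random.gauss(-2, 0.5) for _ in range(200)] +
                [random.gauss(3, 1.0) for _ in range(200)])
        mu, var, pi = em(data)
        print("estimated means:", [round(m, 2) for m in mu])   # roughly -2 and 3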

Week 10

Bayesian Networks

   - Directed graphical models

   - Conditional independence

   - Learning Bayesian networks

   - Belief propagation algorithms

 

[Bayesian Networks]
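
A minimal sketch of the directed-model ideas above: exact inference by enumeration in the classic rain/sprinkler/wet-grass network. The graph is the textbook example; all probability values here are illustrative.

    # Rain -> Sprinkler, and (Sprinkler, Rain) -> WetGrass.
    P_rain = {True: 0.2, False: 0.8}
    P_sprinkler = {True: {True: 0.01, False: 0.99},    # P(S | R)
                   False: {True: 0.4, False: 0.6}}
    P_wet = {(True, True): 0.99, (True, False): 0.9,   # P(W=True | S, R)
             (False, True): 0.8, (False, False): 0.0}

    def p_rain_given_wet():
        """P(R | W=True): sum the joint over the hidden variable S, normalize."""
        joint = {}
        for r in (True, False):
            joint[r] = sum(P_rain[r] * P_sprinkler[r][s] * P_wet[(s, r)]
                           for s in (True, False))
        z = sum(joint.values())
        return joint[True] / z

    print("P(Rain | WetGrass) =", round(p_rain_given_wet(), 3))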

Week 11

Dynamic Bayesian Networks

   - Temporal sequence learning

   - Hidden Markov models

   - Learning in HMM

   - Inference in HMM

[Hidden Markov Models]
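
The forward algorithm for the HMM inference topic above, as a minimal sketch: dynamic programming over alpha values to obtain the observation likelihood. The 2-state model below is an illustrative assumption.

    def forward(obs, init, trans, emit):
        """alpha[j] after step t equals P(o_1..o_t, state_t = j)."""
        n = len(init)
        alpha = [init[j] * emit[j][obs[0]] for j in range(n)]
        for t in range(1, len(obs)):
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                     for j in range(n)]
        return sum(alpha)

    if __name__ == "__main__":
        init  = [0.6, 0.4]                  # P(state_1)
        trans = [[0.7, 0.3], [0.4, 0.6]]    # P(state_{t+1} | state_t)
        emit  = [[0.9, 0.1], [0.2, 0.8]]    # P(observation | state)
        print("P(observations) =", forward([0, 1, 1, 0], init, trans, emit))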

Week 12

Exam 2 and discussion / Term project assignment

Week 13

Hypernetworks

    - Chemical reactions and hypergraphs

    - Random fields on molecular hypergraphs

    - Directed/undirected hypernetworks

    - Learning in hypernetworks

    - Inference in hypernetworks

[PDF] - New!
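
A hedged sketch of the hypernetwork idea in the reading above (Zhang, 2008): memory is a large population of low-order hyperedges, each a small set of (feature, value) pairs tagged with a class label, sampled from training examples; classification matches stored hyperedges against a query and votes. The order k, copy count, and majority-vote rule here are simplifying assumptions, not the paper's exact procedure.

    import random

    def learn(examples, k=3, copies=50, rng=None):
        """Sample `copies` order-k hyperedges from each (features, label) pair."""
        rng = rng or random.Random(0)
        library = []
        for features, label in examples:
            idx = list(range(len(features)))
            for _ in range(copies):
                chosen = rng.sample(idx, k)
                edge = frozenset((i, features[i]) for i in chosen)
                library.append((edge, label))
        return library

    def classify(library, features):
        """Vote over all stored hyperedges contained in the query pattern."""
        query = set(enumerate(features))
        votes = {}
        for edge, label in library:
            if edge <= query:
                votes[label] = votes.get(label, 0) + 1
        return max(votes, key=votes.get) if votes else None

    if __name__ == "__main__":
        rng = random.Random(1)
        # Toy binary task: label is 1 iff the first two features agree.
        data = [(f, int(f[0] == f[1]))
                for f in [tuple(rng.randint(0, 1) for _ in range(6))
                          for _ in range(100)]]
        library = learn(data, rng=rng)
        print("predicted label:", classify(library, (1, 1, 0, 1, 0, 0)))  # likely 1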

Week 14

Cognitive Hypernetworks

   - Language learning and generation

   - Music learning and composition

   - Crossmodal retrieval and recommendation

   - Lifelog modeling

[PDF] - New!

Week 15

Advanced Topics in Hypernetworks

   - Lifelong learning with hypernets

   - Dynamic hypernetworks

   - Hierarchical hypernetworks

   - Coupled hypernetworks

 

Week 16

Term project presentation
