Deep Learning Course
The following are last-minute news items you should be aware of ;-)
- 21/02/2018: Added slides on Word Embedding, Attention Mechanisms, and Deep Convolutional Architectures
- 16/02/2018: Added slides about Neural Networks Training and Recurrent Neural Networks
- 12/02/2018: Added slides about Course Intro, Deep Learning Intro, and Feedforward Neural Networks
- 12/02/2018: The course starts today!
- 09/02/2018: First version of this website ...
Course Aim & Organization
Deep Learning is becoming the dominant approach for building cognitive systems able to recognize patterns in data (e.g., images, text, and sounds) or to perform end-to-end learning of complex behaviors. This doctoral course will introduce the basics of deep learning as well as its applications, with a hands-on approach in which students will be challenged with the practical issues of data collection, model design, model training, and performance evaluation.
Starting from the foundations of Neural Networks and Deep Learning, the course will introduce the most successful models, algorithms, and tools for image understanding, sequence prediction, and sequence-to-sequence translation through deep neural models.
In the first part of the course we will provide attendees with the theory and the computational tools to approach deep learning through hands-on experience. In this part we will cover the theory of neural networks, including feed-forward and recurrent models, with a specific focus on Feed Forward Neural Networks (FFNN), Convolutional Neural Networks (CNNs), and Long Short-Term Memory networks (LSTMs). These models will be presented both from a theoretical and from a practical point of view, with simple self-contained examples of applications. In this context two deep learning frameworks will be introduced, i.e., TensorFlow and PyTorch, in practical sessions with a special emphasis on the TensorFlow library.
In the second part we will focus on different application domains, presenting a selection of state-of-the-art results obtained by applying deep learning techniques to domains such as (but not limited to) pattern recognition, speech recognition and language modeling, and geometric reasoning.
For the final evaluation, groups of 3 to 5 students (depending on the number of attendees) will be required to implement one of the models from the papers analyzed during the course, using the TensorFlow library.
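As a reference point for the final project, here is a minimal sketch of what defining and training a small feed-forward network in TensorFlow can look like. It assumes the Keras API bundled with recent TensorFlow releases; the random stand-in data, layer sizes, and hyperparameters are illustrative assumptions, not course requirements.

    # Minimal feed-forward network in TensorFlow (Keras API).
    # Data, sizes, and hyperparameters are illustrative assumptions.
    import numpy as np
    import tensorflow as tf

    # Random stand-in data: 1000 samples, 20 features, 3 classes.
    x_train = np.random.randn(1000, 20).astype(np.float32)
    y_train = np.random.randint(0, 3, size=(1000,))

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(20,)),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(3, activation='softmax'),
    ])

    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    # For the project you would plug in a real dataset
    # and a proper train/validation split.
    model.fit(x_train, y_train, epochs=5, batch_size=32)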
Course Program
To avoid overlap between the "Deep Learning, Theory Techniques and Applications" (DL) PhD course and the "Image Classification: modern approaches" (IC) PhD course, and to leverage the specific competencies of the teachers, topics presented in DL classes will not be covered by the IC classes; similarly, topics in IC won't be covered in DL. Please refer to the detailed program and the course logistics to see which classes from the Image Classification course are mandatory to attend.
The topics which will be covered by the course are:
- Introduction to neural networks and the feed-forward architectures (a perceptron sketch follows this list)
- Backpropagation, training and overfitting in neural networks
- Recurrent Neural Networks and other classical architectures, e.g., Radial Basis Functions, Neural Autoencoders, etc.
- Introduction and basics of image handling
- Basic approaches to image classification
- Data-driven features: Convolutional Neural Networks
- TensorFlow and PyTorch introduction with examples
- Structural learning, Long Short-Term Memories, and applications to text and speech
- Extended problems in image classification
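For the first topic in the list above, here is a small self-contained sketch of the classic perceptron learning rule in plain NumPy; the toy data, learning rate, and epoch budget are made-up assumptions for illustration.

    # Perceptron learning rule on a toy linearly separable problem.
    # Data, learning rate, and epoch count are illustrative assumptions.
    import numpy as np

    rng = np.random.RandomState(0)

    # Toy 2D data: class +1 above the line x1 + x2 = 0, class -1 below.
    X = rng.randn(100, 2)
    y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

    w = np.zeros(2)   # weights
    b = 0.0           # bias
    eta = 0.1         # learning rate

    for epoch in range(20):
        errors = 0
        for xi, target in zip(X, y):
            if target * (np.dot(w, xi) + b) <= 0:   # misclassified
                w += eta * target * xi               # perceptron update
                b += eta * target
                errors += 1
        if errors == 0:                              # converged
            break

    print("weights:", w, "bias:", b, "epochs used:", epoch + 1)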
Teachers
The course is composed of a blend of theoretical lectures, practical exercises, and seminars:
- Matteo Matteucci: the official teacher for the Deep Learning course
- Marco Ciccone: co-teacher for the Deep Learning course
- Giacomo Boracchi: the co-teacher from the Image Classification course (shared course code)
- Alessandro Giusti: the co-teacher from the Image Classification course (shared course code)
- Luigi Malagò: Guest Speaker from the Machine Learning and Optimization Group at the Romanian Institute of Science and Technology (RIST)
- Jonathan Masci: Guest Speaker from NNAISENSE
- Francesco Visin: Guest Speaker from DeepMind
- ...
Websites
Please refer to the following websites for specific course materials and detailed calendars:
- DL: http://chrome.ws.dei.polimi.it/index.php/Deep_Learning_Course
- IC: http://home.deib.polimi.it/boracchi/teaching/ImageClassification.htm
Detailed course schedule
A detailed schedule of the course can be found below; topics are indicative, while days and teachers are correct up to last-minute changes. Please note that on some days you have a lecture both in the morning and in the afternoon.
Please remember that the Deep Learning PhD course shares some lectures with the Image Classification PhD course. Students are required to attend the shared lectures listed below, since those topics will not be covered by the Deep Learning classes although they are part of the course program. You will also find the [Optional] lectures from the Image Classification course, which you might want to attend either because you are also enrolled in that course or out of personal interest.
Note: Lecture timetable interpretation
- In the morning, lectures start at 9:30 sharp and end at 13:00
- In the afternoon, lectures start at 14:15 sharp and end at 17:45
Date | Day | Time | Room | Teacher | Topic |
12/02/2018 | Monday | 09:30 - 13:00 | S.0.2 - Building 3 | Matteo Matteucci | Course introduction, Machine Learning vs. Deep Learning introduction, the perceptron, the feed forward neural network architecture, backpropagation and gradient descent, error measures for regression and classification. |
12/02/2018 | Monday | 14:15 - 17:45 | S.0.2 - Building 3 | Giacomo Boracchi | Introduction and basics of image handling in Python |
14/02/2018 | Wednesday | 09:30 - 13:00 | S.0.2 - Building 3 | Matteo Matteucci | Overfitting, Early Stopping, Weight Decay, Regularization |
14/02/2018 [OPTIONAL] | Wednesday | 14:15 - 17:45 | S.0.2 - Building 3 | Giacomo Boracchi | Hand-crafted features for image classification |
16/02/2018 | Friday | 09:30 - 13:00 | S.0.5 - Building 3 | Matteo Matteucci | Recurrent Neural Networks, Backpropagation through time, Vanishing Gradient, Long Short-Term Memories, Gated Recurrent Units |
16/02/2018 [OPTIONAL] | Friday | 14:15 - 17:45 | S.0.5 - Building 3 | Giacomo Boracchi | Computer Vision features for image classification |
19/02/2018 | Monday | 09:30 - 13:00 | S.0.5 - Building 3 | Marco Ciccone | TensorFlow and PyTorch Tutorial |
19/02/2018 | Monday | 14:15 - 17:45 | S.0.5 - Building 3 | Alessandro Giusti | Data-driven features: Convolutional Neural Networks |
21/02/2018 | Wednesday | 09:30 - 10:30 | S.0.2 - Building 3 | Marco Ciccone | Common deep architectures for image classification: LeNet, AlexNet, GoogLeNet, ResNet, ... |
21/02/2018 | Wednesday | 10:45 - 11:45 | S.0.2 - Building 3 | Matteo Matteucci | Word Embedding |
21/02/2018 | Wednesday | 12:00 - 13:00 | S.0.2 - Building 3 | Alberto Mario Pirovano | Attention Mechanisms |
21/02/2018 | Wednesday | 14:15 - 17:45 | S.0.2 - Building 3 | Alessandro Giusti | Advanced CNNs and Best practices in image classification |
23/02/2018 | Friday | 09:30 - 10:30 | C.I.1 - Building 6 | Luigi Malagò | Variational Autoencoders |
23/02/2018 | Friday | 10:45 - 11:45 | C.I.1 - Building 6 | Jonathan Masci | Deep Learning on graphs and manifolds: going beyond Euclidean data |
23/02/2018 | Friday | 12:00 - 13:00 | C.I.1 - Building 6 | Francesco Visin | Graph Networks |
23/02/2018 [OPTIONAL] | Friday | 14:10 - 17:45 | S.0.5 - Building 3 | Alessandro Giusti | An overview on extended problems in image classification |
Course Evaluation
Coming soon ...
Course Logistics
We would like to thank all the students for their enthusiastic participation. It has made course logistics quite challenging, but we have followed the motto "no student left behind" and we have done our best to accommodate all of you.
Classrooms
Both DL and IC will be held in the following rooms, so take note and spread the word in case you know people who are going to attend and might not have received this announcement.
- February 12th, Aula S.0.2. Ed 3, 260 seats
- February 14th, Aula S.0.2. Ed 3, 260 seats
- February 16th, Aula S.0.5. Ed 3, 174 seats
- February 19th, Aula S.0.5. Ed 3, 174 seats
- February 21st, Aula S.0.2. Ed 3, 260 seats
- February 23rd, Aula N.1.2. Ed 2, 168 seats
These classrooms should be able to fit all of you; if not, we will look for alternative rooms and notify you.
Course hours
Starting hours are sharp, i.e., they already include the “academic quarter”:
- DL will be in the morning: from 9.30 to 13.00
- IC will be in the afternoon: from 14.15 to 17.45
Attendance
These courses are open to Master students, PhD students, and a few external participants. Everybody who is interested in getting credits (CFU) or an attendance certificate will be asked to sign an attendance list at each class (a class is a block of hours in the morning or in the afternoon).
In particular, DL students have to attend IC on February 12th, February 19th, and February 21st. This makes the DL schedule:
- February 12th Morning
- February 12th Afternoon
- February 14th Morning
- February 16th Morning
- February 19th Morning
- February 19th Afternoon
- February 21st Morning
- February 21st Afternoon
- February 23rd Morning
Even if you are attending just one of the courses (e.g., because as a Master student you can get credits for only one of them), we warmly suggest attending all the lectures from both courses.
Programming Environment
The reference programming language for both courses is Python. In both DL and IC there will be sessions where you'll be asked to implement a few lines of code yourself, and we may give some simple assignments. For the first lecture, bring your laptop and make sure you have Python 3.6 installed; install the Miniconda or Anaconda framework from conda.io ... more information will be provided later on regarding additional tools and facilities.
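As a quick sanity check before the first lecture, a snippet along these lines can verify your setup; the package list below is an assumption, so adapt it to what we actually ask you to install.

    # Quick environment sanity check; the package list is an assumption.
    import sys

    print("Python:", sys.version.split()[0])
    assert sys.version_info[:2] >= (3, 6), "the course expects Python 3.6+"

    for pkg in ("numpy", "tensorflow", "torch"):
        try:
            mod = __import__(pkg)
            print(pkg, getattr(mod, "__version__", "unknown version"))
        except ImportError:
            print(pkg, "NOT installed")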
Teaching Material
Lectures will be mostly based on presentations from the teachers and invited speakers. These slides are taken from several sources, tutorials, summer schools, papers and so on. In case you are interested in a reference book, you can read:
- Deep Learning. Ian Goodfellow, Yoshua Bengio, and Aaron Courville, MIT Press, 2016.
Teachers' Slides
In the following you can find the lecture slides used by the course teachers during classes.
Deep Learning Classes
These are the slides presented during Matteo Matteucci's lectures:
- Course introduction: Introduction to the course with details about the logistics, the grading, the topics, the teachers, and so on ...
- Deep Learning introduction: Introduction to Deep Learning and learning data representation from data.
- From Perceptrons to Neural Networks: The Perceptron and its learning algorithm, Feed Forward Neural Networks and Backpropagation.
- Neural Networks Training: Dealing with overfitting and optimization in Feed Forward Neural Networks
- Recurrent Neural Networks: Vanishing and exploding gradients, Long Short-Term Memory cells (a minimal cell sketch follows this list)
- Word Embedding: Deep Unsupervised Learning, Embedding, Language Models, and word2vec.
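As a companion to the Recurrent Neural Networks slides above, here is a minimal NumPy sketch of a single forward step of a standard LSTM cell (input, forget, and output gates); the dimensions and random initialization are illustrative assumptions.

    # One forward step of a standard LSTM cell in NumPy.
    # Sizes and random initialization are illustrative assumptions.
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    n_in, n_hid = 4, 8
    rng = np.random.RandomState(0)

    # One weight matrix and bias per gate, acting on [h_prev, x].
    Wf, Wi, Wo, Wc = (rng.randn(n_hid, n_hid + n_in) * 0.1 for _ in range(4))
    bf, bi, bo, bc = (np.zeros(n_hid) for _ in range(4))

    def lstm_step(x, h_prev, c_prev):
        z = np.concatenate([h_prev, x])   # stacked input [h_{t-1}, x_t]
        f = sigmoid(Wf @ z + bf)          # forget gate
        i = sigmoid(Wi @ z + bi)          # input gate
        o = sigmoid(Wo @ z + bo)          # output gate
        c_tilde = np.tanh(Wc @ z + bc)    # candidate cell state
        c = f * c_prev + i * c_tilde      # new cell state
        h = o * np.tanh(c)                # new hidden state
        return h, c

    h, c = np.zeros(n_hid), np.zeros(n_hid)
    h, c = lstm_step(rng.randn(n_in), h, c)
    print(h.shape, c.shape)   # (8,) (8,)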
Slides for the tutorials by Marco Ciccone are published here:
- TensorFlow 101: Tutorial on TensorFlow by Marco Ciccone
- PyTorch 101: Tutorial on PyTorch by Marco Ciccone
Image Classification Classes
These are the slides presented during Giacomo Boracchi's and Alessandro Giusti's lectures:
- ...
- ...
- ...
Seminars
These are the presentations given by course special guests:
- Advanced CNN Architectures: Seminar on the importance of depth in convolutional neural networks by Marco Ciccone
- Attention Mechanisms: Seminar on attention mechanism in sequence to sequence learning by Alberto Mario Pirovano
- Deep Learning on Graphs and Manifolds: Seminar on deep learning beyond Euclidean data by Jonathan Masci
- Variational Autoencoders: Seminar on variational autoencoders principles and perspectives by Luigi Malagò
- Graph Networks: Seminar on deep learning and graphs, i.e., graph networks, by Francesco Visin
Additional Resources
Papers and links useful to supplement the slides from lecturers and guests:
- Computation Beyond the Turing Limit. Hava T. Siegelmann, Science, 268:545-548, 1995.
F.A.Q.
About attendance
I heard you take signatures and attendance is mandatory, but I have an exam and I will miss one or two lessons. Do I still get the credits?
- PhD students are required to attend 70% of the lectures, i.e., at least (9 days x 4 hours x 0.7) ~= 25 hours. This means that, with 4-hour blocks, you have to attend at least 6 of the 9 foreseen blocks. For Master students we are a little more flexible in case they have exams ... I suggest they do not push me into writing a number here, since then it stays ;-)