MACHINE LEARNING FOR PHYSICS
Academic Year 2021/2022 - 1st Year - Curriculum APPLIED PHYSICS and Curriculum THEORETICAL PHYSICS
Credit Value: 6
Scientific field: FIS/01 - Experimental physics
Taught classes: 35 hours
Laboratories: 15 hours
Term / Semester: 2nd
Learning Objectives
The specific objectives of this course are:
- Convey the basic principles of Machine Learning.
- Show, through practical examples, some ways of using Machine Learning.
- Understand the steps that lead from the problem to be solved, or the phenomenon to be analyzed and/or simulated, to its implementation.
- Acquire the ability to identify the most satisfactory among several candidate solutions.
- Acquire the ability to correctly analyze experimental data.
Furthermore, with reference to the so-called Dublin Descriptors, this course helps to acquire the following soft skills:
Knowledge and understanding:
The primary objective of the course is for students to grasp the various "philosophies" that underlie the many different techniques in the field of Machine Learning.
Ability to apply knowledge and understanding:
It is intended to provide students with the following skills:
- Given a problem, frame it in the appropriate Machine Learning setting.
- Design, describe and implement classifiers and/or regressors.
- Properly prepare the data to be processed.
Making judgments:
Through the examination of examples drawn from and applied to physics, and a substantial practical component, the learner will be able, both autonomously and cooperatively, to analyze problems and to design and implement the related solutions.
Communication skills:
The student will acquire the necessary communication skills and the ability to use technical language appropriately.
Learning skills:
The course aims to provide the learner with the theoretical and practical methodologies needed in research and professional contexts, with particular attention to physics.
Course Structure
Lectures and practical exercises.
Detailed Course Content
- Types of machine learning algorithms
* Supervised Learning
* Unsupervised Learning
* Reinforcement Learning
- Regression or Classification: what are they? Are they really different?
- Cost functions and their importance
* Typical regression error measures and their shortcomings.
* Classifier evaluation: Sensitivity, Specificity, Accuracy, ROC, AUC, etc.
- Datasets and Machine Learning: the first and most important step
* Statistical validation
* Missing values
* Raw data: when should we use them, and when not?
* Preprocessing
* Feature Extraction
* Feature Selection
* Feature Reduction
* Curse of dimensionality
- Model complexity
* Underfitting and overfitting
* Occam's razor principle
* Many parameters and the importance of regularization
- Singular Value Decomposition (SVD)
* Linear modeling is often enough
* Lowering machine learning algorithm complexity
* SVD/PCA as feature reduction, and cases where it fails
- Neural Networks
* The biological neuron
* The artificial neuron
* Network topology
* The Multilayer Perceptron
* The universal approximation theorem
* Fixed, Self-Adaptive, and Stochastic Gradient Descent as a general technique for parameter estimation
* Backpropagation
* Radial Basis Functions
* Deep Learning
- Cluster Analysis and Vector Quantization
* K-means/LBG algorithm
= Serial improvements
> Escaping from local minima: the Enhanced LBG Algorithm (ELBG)
> From target error to clusters: Fully Automatic Clustering system (FACS)
= Big data and parallel clustering
> Parallel algorithms for unsupervised learning (PAUL)
> Vector quantization of very large data sets (LBGS)
* Other clustering approaches: Hierarchical Clustering and Fuzzy Clustering
- Global optimization inspired by biological evolution
* From Monte Carlo methods to Evolutionary Computation
= The Population: A set of candidate solutions as individuals
= Selection among individuals: Roulette, Ordering, Tournament.
= Generation of new solutions: offspring
> Recombination/crossover
> Mutation
> Hill-climbing
= Multi-objective optimization: The Fitness Function
* Evolutionary techniques, some examples
= Genetic Algorithms and Holland's schema theorem
= Genetic Programming
= Parallel/Distributed Genetic Programming for Mathematical Modelling: The Brain Project
* Case study: Find the minimum of the function $y=\sum_{i=1}^{1000} (x_i-1000/i)^2$ with $x_i \in [0,2]$
- Fuzzy logic: from classical Boolean logic to many-valued logic
* Fuzzy sets and membership functions.
* Operations on Fuzzy sets.
* Fuzzy relations, rules, propositions, implications and inferences.
* Defuzzification techniques.
* Fuzzy logic controller design.
- Hybridization is often the way to get better results
- Case studies in physics
* Data Analysis of Gravitational Wave time series
* Track recognition in Nuclear Physics Collisions
* Structure of the proton using contemporary methods of artificial intelligence
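As an illustration of the evolutionary-computation case study listed above (minimizing the sum over i = 1..1000 of (x_i - 1000/i)^2 with each x_i in [0, 2]), the following is a minimal sketch of a genetic algorithm with tournament selection, uniform crossover, Gaussian mutation and elitism. It is not the code developed in class: the function names (`cost`, `analytic_minimum`, `tournament`, `evolve`) and all parameter values (population size, mutation rate, `sigma`) are assumptions chosen for illustration. Because the objective is separable, the constrained minimizer is known in closed form, x_i = min(2, 1000/i), which gives a reference against which the evolved solution can be compared.

```python
import random

N = 1000  # number of variables in the case study


def cost(x):
    """Objective: sum over i (1-based) of (x_i - 1000/i)^2."""
    return sum((xi - 1000.0 / i) ** 2 for i, xi in enumerate(x, start=1))


def analytic_minimum(n=N):
    """Closed-form minimizer on [0, 2]^n: each target 1000/i is clipped to 2."""
    return [min(2.0, 1000.0 / i) for i in range(1, n + 1)]


def tournament(pop, costs, rng, k=3):
    """Tournament selection: pick k random individuals, return the fittest."""
    winner = min(rng.sample(range(len(pop)), k), key=lambda j: costs[j])
    return pop[winner]


def evolve(n=N, pop_size=30, generations=200, sigma=0.1, p_mut=0.01, seed=0):
    """Illustrative GA (all parameter values are assumptions, not tuned)."""
    rng = random.Random(seed)
    # Initial population: individuals drawn uniformly from the box [0, 2]^n.
    pop = [[rng.uniform(0.0, 2.0) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        costs = [cost(ind) for ind in pop]
        # Elitism: the best individual survives unchanged.
        elite = pop[min(range(pop_size), key=lambda j: costs[j])]
        children = [elite[:]]
        while len(children) < pop_size:
            a = tournament(pop, costs, rng)
            b = tournament(pop, costs, rng)
            # Uniform crossover: each gene comes from either parent.
            child = [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]
            # Gaussian mutation, clipped so genes stay inside [0, 2].
            for j in range(n):
                if rng.random() < p_mut:
                    child[j] = min(2.0, max(0.0, child[j] + rng.gauss(0.0, sigma)))
            children.append(child)
        pop = children
    costs = [cost(ind) for ind in pop]
    best = min(range(pop_size), key=lambda j: costs[j])
    return pop[best], costs[best]
```

Calling `evolve()` returns the best individual found and its cost; comparing that cost with `cost(analytic_minimum())` shows how far the population still is from the known optimum. Elitism guarantees that the best cost never increases from one generation to the next.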
Textbook Information
Notes provided in class. These notes, the code developed in class and any other material useful for the course will be available on the teacher's website: superpippo.ct.infn.it/~marco/didattica.