# MACHINE LEARNING FOR PHYSICS

Academic Year 2022/2023 - Teacher: Marco RUSSO

## Expected Learning Outcomes

The specific objectives of this course are:

• Convey the basic principles of Machine Learning.
• Show, through practical examples, some ways of using Machine Learning.
• Understand all the steps that lead from the problem to be solved, or from the phenomenon to be analyzed (and/or simulated), to its implementation.
• Acquire the ability to identify the most satisfactory among several solutions.
• Acquire the ability to correctly analyze experimental data.

Furthermore, with reference to the so-called Dublin Descriptors, this course helps to acquire the following soft skills:

Knowledge and understanding:

The primary objective of the course is for students to acquire the various "philosophies" that underlie the many different techniques in the field of Machine Learning.

Ability to apply knowledge and understanding:

The course is intended to provide students with the following skills:

- Frame a given problem in the appropriate Machine Learning setting;

- Design, describe and implement classifiers and/or regressors;

- Properly prepare the data to be processed.

Making judgments:

Through the examination of examples drawn from and applied to physics, and a substantial practical component, the learner will be able, both autonomously and cooperatively, to analyze problems and to design and implement the related solutions.

Communication skills:

The student will acquire the necessary communication skills and the ability to use technical language appropriately.

Learning skills:

The course aims to provide the learner with the theoretical and practical methodologies needed in research and professional contexts, with particular attention to the field of physics.

## Course Structure

Lectures and practical exercises.

Should the circumstances require online or blended teaching, appropriate modifications to what is hereby stated may be introduced, in order to achieve the main objectives of the course.

## Required Prerequisites

In-depth knowledge of the C language and, possibly, MATLAB.

It is also highly recommended that everyone, and in particular those wishing to develop new Machine Learning techniques such as Genetic Programming, take the optional course "Object-oriented and Big Data Programming" of the Bachelor of Physics.

## Attendance of Lessons

Attendance at the course is usually compulsory (consult the Academic Regulations of the Course of Studies).

## Detailed Course Content

- Types of machine learning algorithms

* Supervised Learning
* Unsupervised Learning
* Reinforcement Learning

- Regression or Classification: what are they? Are they really different?

- Cost functions and their importance
* Typical regression error measures and their shortcomings.
* Classifier evaluation: Sensitivity, Specificity, Accuracy, ROC, AUC, etc.

- Datasets and Machine Learning: the first and most important step
* Statistical validation
* Missing values
* Raw data: when should we use them, and when not?
* Preprocessing
* Feature Extraction
* Feature Selection
* Feature Reduction
* Curse of dimensionality

- Model complexity
* Underfitting and overfitting
* Occam's razor principle
* Many parameters and the importance of regularization

- Singular Value Decomposition (SVD)
* Linear modeling is often enough
* Lowering machine learning algorithm complexity
* SVD/PCA as feature reduction, and why it sometimes fails

- Neural Networks
* The biological neuron
* The artificial neuron
* Network topology
* The Multilayer Perceptron
* The universal approximation theorem
* Fixed, Self-Adaptive, and Stochastic Gradient Descent as a general technique for parameter estimation
* Backpropagation
* Deep Learning

- Cluster Analysis and Vector Quantization
* K-means/LBG algorithm
= Serial improvements
> Escaping from local minima: the Enhanced LBG Algorithm (ELBG)
> From target error to clusters: Fully Automatic Clustering system (FACS)
= Big data and parallel clustering
> Parallel algorithms for unsupervised learning (PAUL)
> Very large data sets vector quantization (LBGS)
* Other clustering approaches: Hierarchical Clustering and Fuzzy Clustering

- Global optimization inspired by biological evolution
* From Monte Carlo methods to Evolutionary Computation
= The Population: A set of candidate solutions as individuals
= Selection among individuals: Roulette, Ordering, Tournament.
= Generation of new solutions: offspring
> Recombination/crossover
> Mutation
> Hill-climbing
= Multi-objective optimization: The Fitness Function
* Evolutionary techniques, some examples
= Genetic Algorithms and Holland's schema theorem
= Genetic Programming
= Parallel/Distributed Genetic Programming for Mathematical Modelling: The Brain Project
* Case study: Find the minimum of the function $y=\sum_{i=1}^{1000} (x_i-1000/i)^2$ with $x_i \in [0,2]$
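
As an illustration of the case-study function above, the following is a minimal Python sketch, not course material. The names `cost`, `analytic_optimum`, `mutate` and `evolve`, and the mutation settings `sigma` and `rate`, are assumptions made for this example. Because the sum is separable, each term can be minimised on its own by clipping 1000/i to [0, 2], which gives a reference value against which a simple mutation-only evolutionary search can be compared.

```python
import random

N = 1000                 # number of variables in the case study
LOW, HIGH = 0.0, 2.0     # bounds on every x_i


def cost(x):
    """Sum over i = 1..N of (x_i - N/i)^2."""
    return sum((xi - N / i) ** 2 for i, xi in enumerate(x, start=1))


def analytic_optimum():
    """Closed-form minimiser: clip N/i to [LOW, HIGH] for each i."""
    return [min(max(N / i, LOW), HIGH) for i in range(1, N + 1)]


def mutate(x, rng, sigma=0.1, rate=0.01):
    """Add Gaussian noise to a random subset of genes, then clip to the bounds.

    sigma and rate are illustrative values, not tuned settings.
    """
    return [
        min(max(xi + rng.gauss(0.0, sigma), LOW), HIGH)
        if rng.random() < rate else xi
        for xi in x
    ]


def evolve(generations=2000, seed=0):
    """Mutation-only search: one parent, one child, keep the better of the two."""
    rng = random.Random(seed)
    parent = [rng.uniform(LOW, HIGH) for _ in range(N)]
    parent_cost = cost(parent)
    for _ in range(generations):
        child = mutate(parent, rng)
        child_cost = cost(child)
        if child_cost <= parent_cost:   # accept only non-worsening children
            parent, parent_cost = child, child_cost
    return parent, parent_cost


if __name__ == "__main__":
    reference = cost(analytic_optimum())
    _, found = evolve()
    print("analytic minimum:", reference)
    print("evolutionary search:", found)
```

The gap between the two printed values indicates how far the search still is from the exact minimum; a population, crossover, or a different selection scheme from the list above could be substituted into `evolve` for comparison.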

- Fuzzy logic: from classical Boolean logic to many-valued logic
* Fuzzy sets and membership functions.
* Operations on Fuzzy sets.
* Fuzzy relations, rules, propositions, implications and inferences.
* Defuzzification techniques.
* Fuzzy logic controller design.

- Hybridization is often the way to get better results

- Case studies in physics
* Data Analysis of Gravitational Wave time series
* Track recognition in Nuclear Physics Collisions
* Structure of the proton using contemporary methods of artificial intelligence

## Textbook Information

Notes provided in class. These notes, the code developed in class and any other material useful for the course will be available on the teacher's website: superpippo.ct.infn.it/~marco/didattica.

## Course Planning

| | Subjects | Text References |
|---|---|---|
| 1 | All | Notes and handouts provided by the teacher |

LEARNING ASSESSMENT

## Learning Assessment Procedures

Practical thesis carried out in agreement with the teacher. Once the topic has been agreed, appropriate software or hardware is developed. During the lessons, the work is carried out together with the teacher. After the lessons end, the student completes the work in full autonomy. Together with the software/hardware, the student is expected to deliver a written report, preferably in LaTeX. Both components are considered in the final evaluation. Since the course focuses precisely on the topic chosen at the beginning of the course itself, the learner must master all the teaching material in order to pass the exam.

## Examples of frequently asked questions and / or exercises

http://superpippo.ct.infn.it/~marco/didattica