\begin{block}{Autonomous cars}
\begin{center} \includegraphics[scale=.20]{Images/acar.jpeg} \end{center}
\end{block}

\begin{block}{Robotics}
\begin{center} \includegraphics[scale=.10]{Images/nurse-robot-italy.jpg} \end{center}
\end{block}

\begin{block}{Smart Homes}
\begin{center} \includegraphics[scale=.20]{Images/smarthome.jpeg} \end{center}
\end{block}

\begin{block}{Chat Bots}
\begin{center} \includegraphics[scale=.20]{Images/chatbot.jpeg} \end{center}
\end{block}

\begin{block}{}
\begin{center} \ldots and many more \ldots

AI urgently needs verification: safety, security, robustness to changing conditions and to adversarial attacks, \ldots
\end{center}
\end{block}

\frametitle{Perception and Reasoning}

\begin{center} AI methods divide into:
\end{center}

Perception tasks:

\begin{block}{Computer Vision}
\begin{center} \includegraphics[scale=.20]{Images/cv.jpeg} \end{center}
\end{block}

\begin{block}{Natural language understanding}
\begin{center} \includegraphics[scale=.20]{Images/nlu.jpeg} \end{center}
\end{block}

Reasoning tasks:

\begin{block}{Planning}
\begin{center} \includegraphics[scale=.20]{Images/route.jpeg} \end{center}
\end{block}

\begin{block}{(Logical) reasoning}
\begin{center} \includegraphics[scale=.20]{Images/chatbot.jpeg} \end{center}
\end{block}
PPDP'20.}
\end{thebibliography}}}

\end{frame}

%\subsection{Why is it important?}

\begin{frame}
  \frametitle{Neural Networks...}

  \begin{columns}
    \column{.4\textwidth}

    \begin{center}
      \includegraphics[scale=0.3]{Images/NN}
    \end{center}

    \begin{block}{take care of \alert{\textbf{perception}} tasks:}
      \begin{itemize}
        \item[] computer vision
        \item[] speech recognition
        \item[] pattern recognition
        \item[] ...
      \end{itemize}
    \end{block}

    \pause

    \column{.4\textwidth}

    \begin{block}{In:}
      \begin{itemize}
        \item[] autonomous cars
        \item[] robots
        %\item security applications
        %\item financial applications
        \item[] medical applications
        \item[] chatbots
        %\item Google bot on mobile phones
        \item[] mobile phone apps
        \item[] $\ldots$
      \end{itemize}
    \end{block}

  \end{columns}

\end{frame}

% \subsection{What is it?}

\section{Why Verify Neural Networks?}

\begin{frame}
  \frametitle{Neural network is}

  \begin{block}{... a function }
    $$N: \Real^n \rightarrow \Real^m$$
    % where $n$ is the size (or \emph{dimension}) of inputs, and $m$ -- the number of \emph{classes}.
  \end{block}

  %By abuse of terminology, a \emph{training} algorithm that computes this function exactly (via a loss minimisation algorithm such as \emph{gradient descent})
  %is also often called a \emph{neural network}.

  % \pause

  % \begin{center}
  %Ignore the learning functions for now...
  % \small{We will ignore the training algorithm for now, and will look at neural networks as functions. }
  %\end{center}

\end{frame}

\begin{frame}
  \frametitle{Neural network is}
  %\begin{block}{}
  ... a function that separates inputs (data points) into classes
  %\end{block}
  \pause

  \begin{alertblock}{Suppose we have four data points}
    \begin{center}\begin{tabular}{l|ll c |}
      \hline
        & $x_1$ & $x_2$ & $y$ \\ \hline
      1 & 1 & 1 & 1 \\
      2 & 1 & 0 & 0 \\
      3 & 0 & 1 & 0 \\
      4 & 0 & 0 & 0 \\
      \hline
    \end{tabular}
    \end{center}
  \end{alertblock}

  \pause

  \begin{block}{We may look for a \alert{linear} function:}
    $$
    \begin{array}{l}
      \neuron : (x_1:\Real) \to (x_2: \Real) \to (y: \Real)\\
      \neuron \; x_1 \; x_2 = b + w_{x_1} \times x_1 + w_{x_2} \times x_2
    \end{array}
    $$
  \end{block}

\end{frame}

\begin{frame}
  \frametitle{}

  \begin{alertblock}{Plotting these four data points in 3-dimensional space:}
    \begin{center}
      \begin{tikzpicture}
        \begin{axis}[
          xlabel={$x_1$},
          ylabel={$x_2$},
          zlabel={$y$},
          xmin=0, xmax=1,
          ymin=0, ymax=1,
          zmin=0, zmax=1,
          xtick={0,1},
          ytick={0,1},
          ztick={0,1},
          legend pos=north west,
          ymajorgrids=false,
          grid style=dashed,
        ]
        \addplot3[
          only marks,
          color=blue,
          scatter,
          mark=halfcircle*,
          mark size=2.9pt
        ]
        coordinates {
          (0,0,0)(1,0,0)(0,1,0)(1,1,1)
        };
        % \addplot3[
        % mesh,
        % samples=10,
        % domain=0:1,
        % ]
        % {(0.5*x+0.5*y+0)};
        \end{axis}
      \end{tikzpicture}
    \end{center}
  \end{alertblock}

\end{frame}

\begin{frame}
  \frametitle{Neural network is}
  \begin{alertblock}{... a separating linear function:}
    \begin{center}
      \begin{tikzpicture}
        \begin{axis}[
          xlabel={$x_1$},
          ylabel={$x_2$},
          zlabel={class},
          xmin=0, xmax=1,
          ymin=0, ymax=1,
          zmin=0, zmax=1,
          xtick={0,1},
          ytick={0,1},
          ztick={0,1},
          legend pos=north west,
          ymajorgrids=false,
          grid style=dashed,
        ]
        \addplot3[
          only marks,
          color=blue,
          scatter,
          mark=halfcircle*,
          mark size=2.9pt
        ]
        coordinates {
          (0,0,0)(1,0,0)(0,1,0)(1,1,1)
        };
\begin{array}{l}
\neuron : (x_1:\Real) \to (x_2: \Real) \to (y: \Real)\\
\neuron \; x_1 \; x_2 = b + w_{x_1} \times x_1 + w_{x_2} \times x_2
\end{array}

The problem of opaque semantics

$Program: \mathcal{A} \rightarrow \mathcal{B}$
$$NeuralNet: \Real^n \rightarrow \Real^m$$

But normally, programs have semantically meaningful parts, which allows us to verify the components that matter.

The problem of opaque semantics

$Program: \mathcal{A} \rightarrow \mathcal{B}$
$$NeuralNet: \Real^n \rightarrow \Real^m$$

For neural nets, the input and the output are the only semantically meaningful parts (and even that is somewhat blurry).

let _ = forall (x:vector real 784). (|sample_in - x| <. 0.01R)
          ==> (|sample_out - (run network x)| <. 0.1R))
\end{alertblock}

\end{frame}

\end{document}

@parameter
epsilon : Rat

boundedByEpsilon : Image -> Bool
boundedByEpsilon x = forall i j . -epsilon <= x ! i ! j <= epsilon

robustAround : Image -> Label -> Bool
robustAround image label = forall perturbation .
  let perturbedImage = image - perturbation in
  boundedByEpsilon perturbation and validImage perturbedImage =>
    advises perturbedImage label

@dataset
trainingImages : Vector Image n

@dataset
trainingLabels : Vector Label n

@property
robust : Vector Bool n
robust = foreach i . robustAround (trainingImages ! i) (trainingLabels ! i)

We may later use notation $f(\mathbf{x}) = \mathbf{y}$

coordinates {
(0,0,0)(1,0,0)(0,1,0)(1,1,1)
};

\neuron : (x_1:\Real) \to (x_2: \Real) \to (y: \Real)\\
\neuron \; x_1 \; x_2 = b + w_{x_1} \times x_1 + w_{x_2} \times x_2

\neuron : (x_1:\Real) \to (x_2: \Real) \to (y: \Real\; \{y=0 \lor y=1\})\\
\neuron \; x_1 \; x_2 = S \; (-0.9 + 0.5 x_1 + 0.5 x_2)

where
$$
\begin{array}{l}
S~x =
\begin{cases}
1, & \text{if } x\geq 0\\
0, & \text{otherwise}
\end{cases}
\end{array}
$$

\neuron : (x_1:\Real) \to (x_2: \Real) \to (y: \Real\;\{y=0 \lor y=1\})\\
\neuron \; x_1 \; x_2 = S \; (-0.9 + 0.5 x_1 + 0.5 x_2)
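
As a quick sanity check (a worked evaluation added for illustration, not part of the original slides), applying this definition to the four data points from the table gives:
$$
\begin{array}{lclcl}
\neuron \; 1 \; 1 & = & S \; (-0.9 + 0.5 + 0.5) = S \; (0.1) & = & 1\\
\neuron \; 1 \; 0 & = & S \; (-0.9 + 0.5 + 0) = S \; (-0.4) & = & 0\\
\neuron \; 0 \; 1 & = & S \; (-0.9 + 0 + 0.5) = S \; (-0.4) & = & 0\\
\neuron \; 0 \; 0 & = & S \; (-0.9 + 0 + 0) = S \; (-0.9) & = & 0
\end{array}
$$
so, with $b = -0.9$ and $w_{x_1} = w_{x_2} = 0.5$, the neuron reproduces the $y$ column exactly.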

Verify
$$
\setlength{\arraycolsep}{2pt}
\begin{array}{rcl}
\neurontest & : & (x_1: \Real\;\{\truthy\;x_1\})
\to (x_2: \Real\;\{\truthy\;x_2\})
\to (y: \Real\;\{y=1\})\\
\neurontest & = & \neuron
\end{array}
$$
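
The predicate $\truthy$ is not spelled out at this point. Purely as an illustration, assume (hypothetically) that $\truthy\;x$ means $0.9 \leq x \leq 1$. Under that assumption the refinement type above is inhabited by $\neuron$, because for all such $x_1$ and $x_2$
$$
-0.9 + 0.5 x_1 + 0.5 x_2 \;\geq\; -0.9 + 0.45 + 0.45 \;=\; 0
\quad\text{and hence}\quad
\neuron \; x_1 \; x_2 = S \; (-0.9 + 0.5 x_1 + 0.5 x_2) = 1 .
$$
The choice of threshold matters: if $\truthy\;x$ only required $x \geq 0.8$, the check would fail at $x_1 = x_2 = 0.8$, where $-0.9 + 0.4 + 0.4 = -0.1 < 0$ and the neuron returns $0$.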

It may be useful to look into types of neural nets!

Data Augmentation

Suppose we are given a data set $\mathcal{D} = \{(\x_1, \y_1), \ldots , (\x_n, \y_n)\}$.
Prior to training, generate new training samples close to the existing data and label each of them with the same output as the sample it was derived from.

C. Shorten, T.M. Khoshgoftaar: A survey on image data augmentation for deep learning. J. Big Data 6, 60 (2019)

Solutions Involving Loss Functions

Given a data set $\mathcal{D}$, a function ${f: \Real^n \rightarrow \Real^m}$, and a penalty function $\lfloor \cdot , \cdot \rfloor: \Real^m \rightarrow \Real^m \rightarrow \Real$, a loss function is defined as
\begin{equation}\label{eqn:loss}
\loss{\x, \y} = \lfloor \y, f(\x) \rfloor
\end{equation}

\begin{example}[Cross Entropy Loss Function]
\label{eq:cross-entropy}
Given a function ${f: \Real^n \rightarrow [0,1]^m}$, the cross-entropy loss is defined as
\begin{equation}\label{eq:ce}
\losssymbol_{ce}(\x, \y) = - \sum_{i=1}^{m} \y_i \; \log(f(\x)_i)
\end{equation}
where $\y_i$ is the true probability for class $i$ and $f(\x)_i$ the probability for class $i$ as predicted by $f$ when applied to $\x$.
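
For instance (illustrative numbers, and assuming the natural logarithm, which the definition leaves unspecified): with $m = 3$, a one-hot label $\y = (0, 1, 0)$ and a prediction $f(\x) = (0.2, 0.7, 0.1)$,
\begin{equation*}
\losssymbol_{ce}(\x, \y) = - \left( 0 \cdot \log 0.2 + 1 \cdot \log 0.7 + 0 \cdot \log 0.1 \right) = - \log 0.7 \approx 0.357
\end{equation*}
Only the true class contributes to the sum, and the loss tends to $0$ as $f(\x)_2$ approaches $1$.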
\end{example}

Adversarial Training for Robustness

Standard training minimises the loss $\lossfn(\xt, \y)$ between the predicted value $f(\xt)$ and the true value $\y$, for each entry $(\xt, \y)$ in $\mathcal{D}$.
Adversarial training instead minimises the loss with respect to the worst-case perturbation of each sample in $\mathcal{D}$.
It replaces the standard training objective with:
$\max_{\xs : \distance{\xs}{\xt} \leq \epsilon} \lossfn(\xs, \y)$.
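
Written out over the whole data set, this becomes a min-max optimisation problem. The following is a sketch in which $\theta$, standing for the trainable parameters of $f$, is notation assumed here rather than taken from the slides:
\begin{equation*}
\min_{\theta} \sum_{(\xt, \y) \in \mathcal{D}} \; \max_{\xs : \distance{\xs}{\xt} \leq \epsilon} \lossfn(\xs, \y)
\end{equation*}
The inner maximisation is usually only approximated. The fast gradient sign method of Goodfellow et al., for example, takes a single step $\xs = \xt + \epsilon \cdot \mathrm{sign}(\nabla_{\xt} \lossfn(\xt, \y))$, which stays inside the $\epsilon$-ball when the distance is the $L_\infty$ norm.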

I.J. Goodfellow, J. Shlens, C. Szegedy: Explaining and harnessing adversarial examples. 3rd International Conference on Learning Representations,
ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings (2015)

Lipschitz Continuity

Optimise for:

\begin{equation*}
\label{eq:lipschitz-robustness}
\forall \xs: \distance{\xs}{\xt} \leq \epsilon \Rightarrow \distance{f(\xs)}{f(\xt)} \leq L \distance{\xs}{\xt}
\end{equation*}
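
As a small worked instance (an illustrative example, not taken from the slides), consider $f(x) = 2x + 1$ on $\Real$ with the absolute difference as the distance. For any $\xs$ and $\xt$,
\begin{equation*}
\distance{f(\xs)}{f(\xt)} = |(2\xs + 1) - (2\xt + 1)| = 2 \, |\xs - \xt| = 2 \, \distance{\xs}{\xt}
\end{equation*}
so the property holds with $L = 2$ for every $\epsilon$, and it fails for any $L < 2$ as soon as $\xs \neq \xt$. A smaller $L$ therefore demands a function whose output changes more slowly under input perturbations.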

P. Pauli, A. Koch, J. Berberich, P. Kohler, F. Allgöwer: Training robust neural networks using Lipschitz bounds. IEEE Control Systems Letters (2021)

H. Gouk, E. Frank, B. Pfahringer, M.J. Cree: Regularisation of neural networks by enforcing Lipschitz continuity. Machine Learning 110(2), 393–416 (2021)

State of the Art

Problems:
\begin{itemize}
\item ``Robustness property'' is informally understood
\item Two parts are not very well-connected
\item The whole process is very error prone...
\item ... and unfriendly to the user
\end{itemize}

N. Ślusarz, E. Komendantskaya, M. Daggitt and R. Stewart:
Differentiable Logics for Neural Network Verification. FOMLAS 2022

Type-driven Program Synthesis

We understood which properties adversarial learning algorithms optimise for (and rendered them as refinement types of networks);

We can convert these properties into loss functions in a systematic way, and use them in training;

We understood mathematical properties of the resulting loss functions.

So, overall, we advanced our understanding and our formalisation. How about programming practices?

Tier 3 can bring better...
\begin{itemize}
\item understanding (from a theoretical point of view)
\item modelling methods (for programmers)
\item productivity (for Tiers 1, 2)
\end{itemize}

Could they be modelled by the same tools?
\begin{itemize}
\item only if the ``control properties'' fit the syntax of the search engine
\item incorporating ``control properties'' into search would significantly increase its complexity
\item ``control properties'' may be orthogonal to searching (optimality vs abstraction): e.g. the problem of optimal taxi allocation vs law compliance
\item automated solvers struggle with corner cases in more abstract domains, such as law interpretation.
\end{itemize}