Introduction to Machine Learning
A machine learning algorithm is an algorithm that is able to learn from data.
But what do we mean by learning?
Tom Mitchell provides a succinct deﬁnition:
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
The Task, T
Machine learning enables us to tackle tasks that are too diﬃcult to solve with ﬁxed programs written and designed by human beings. From a scientiﬁc and philosophical point of view, machine learning is interesting because developing our understanding of it entails developing our understanding of the principles that underlie intelligence.
In this relatively formal deﬁnition of the word “task,” the process of learning itself is not the task. Learning is our means of attaining the ability to perform the task.
Many kinds of tasks can be solved with machine learning. Some of the most common machine learning tasks include the following:
- Classification with missing inputs
- Machine translation
- Anomaly detection
Of course, many other tasks and types of tasks are possible. The types of tasks we list here are intended only to provide examples of what machine learning can do, not to deﬁne a rigid taxonomy of tasks.
The Performance Measure, P
To evaluate the abilities of a machine learning algorithm, we must design a quantitative measure of its performance. Usually this performance measure P is speciﬁc to the task T being carried out by the system.
The Experience, E
In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.
Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables.
Though unsupervised learning and supervised learning are not completely formal or distinct concepts, they do help roughly categorize some of the things we do with machine learning algorithms. Traditionally, people refer to regression, classiﬁcationand structured output problems as supervised learning. Density estimation in support of other tasks is usually considered unsupervised learning.