Machine Learning History: The Complete Timeline

September 9, 2022

A momentous event that put a spotlight on machine learning came in 2016, when Google DeepMind’s AlphaGo AI made history by defeating one of the best Go players in the world. Because Go is a complex board game that requires strong intuition and abstract thinking, a lot of people were shocked to see a machine seemingly think like a human.

Machine learning algorithms have become ubiquitous today, powering everything from search engines to self-driving cars. But how did it all start?

In this blog, we’ll explore the complete timeline of the history of machine learning. Let’s get started!

1943 – The First Mathematical Model of a Biological Neuron

Walter Pitts and Warren McCulloch created the first mathematical model of neural networks in 1943. Their scientific paper, “A Logical Calculus of the Ideas Immanent in Nervous Activity,” was used to create algorithms that mimic human thought processes.

The McCulloch-Pitts neuron has very limited capability and no learning mechanism. Yet it was the real starting point for the modern discipline of machine learning, and it later paved the way for deep learning and quantum machine learning.
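To make the idea concrete, here is a minimal sketch of a McCulloch-Pitts style threshold unit. The function name `mcp_neuron` and the way inhibitory inputs are passed in are illustrative assumptions, not notation from the 1943 paper. Notice that the thresholds are set by hand: nothing here is learned.

```python
def mcp_neuron(inputs, threshold, inhibitory=()):
    """Fire (return 1) when no inhibitory input is active and the number of
    active excitatory inputs reaches the threshold. Illustrative sketch only."""
    # Any active inhibitory input vetoes firing outright.
    if any(inputs[i] for i in inhibitory):
        return 0
    excitatory_sum = sum(x for i, x in enumerate(inputs) if i not in inhibitory)
    return 1 if excitatory_sum >= threshold else 0


# Logic gates fall out of the threshold choice alone.
def AND(a, b):
    return mcp_neuron([a, b], threshold=2)


def OR(a, b):
    return mcp_neuron([a, b], threshold=1)


def NOT(a):
    # The single input is inhibitory; with threshold 0 the unit fires unless vetoed.
    return mcp_neuron([a], threshold=0, inhibitory=(0,))
```

Because every threshold is fixed by the designer, changing the behaviour means rewiring the unit by hand, which is exactly the limitation described above.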

1949 – The Hebb Synapse

In 1949, Canadian psychologist Donald O. Hebb published his book “The Organization of Behavior: A Neuropsychological Theory.” In it, Hebb theorizes about neuron excitation and communication between neurons, ideas that influenced how psychologists view stimulus processing in the mind.

The first use of the concept was in studying how brains learn. It also paved the way for the development of computational machines mimicking natural neurological processes, such as machine learning.

1950 – The Turing Test

In 1950, Alan Turing, an English computer scientist, proposed the Turing Test as a measure of a computer’s intelligence. If a person can’t tell whether they’re talking to another person or to a computer, then the computer is considered intelligent.

The Turing Test has been criticized on the grounds that a fair and accurate test is difficult to design, and that this test alone does not adequately measure intelligence. However, it remains an essential milestone in the history of artificial intelligence research.

1952 – Machine Learning and the Game of Checkers

American computer scientist Arthur Samuel created a learning program for playing championship-level checkers on the IBM 701. It used an early form of alpha-beta pruning together with a scoring function that measures each side’s chances of winning.

The program chooses its next move using a minimax algorithm, which picks the move that maximizes the player’s minimum guaranteed gain, assuming the opponent always replies with the move that minimizes the player’s gain.
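The minimax idea can be sketched in a few lines. This is a generic illustration on a hand-made game tree, not Samuel’s checkers code; the tree, the scores, and the function names `minimax` and `best_move` are all made up for the example.

```python
def minimax(node, maximizing):
    """Return the best score reachable from `node`.

    A node is either a number (a leaf score from the maximizing player's
    point of view) or a list of child nodes."""
    if isinstance(node, (int, float)):
        return node
    # Players alternate, so the children are scored from the other side's turn.
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)


def best_move(children):
    """Index of the move the maximizing player should pick at the root."""
    return max(range(len(children)), key=lambda i: minimax(children[i], False))


# Hypothetical two-ply tree: three possible moves, each with two opponent replies.
tree = [[3, 12], [2, 8], [14, 1]]

print(minimax(tree, True))  # 3
print(best_move(tree))      # 0
```

The third move looks tempting because it contains the highest score (14), but the opponent would answer with the reply worth 1, so minimax prefers the first move and its guaranteed 3.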

Arthur Samuel is also credited with coining and popularizing the term “machine learning” in 1959.

1956 – The Birthplace of Artificial Intelligence

In machine learning history, the Dartmouth Workshop in 1956 is widely considered to be the founding event of artificial intelligence as a field. Computer scientist John McCarthy invited well-known mathematicians, scientists, and researchers to a six to eight-week workshop. They gathered at Dartmouth College to establish and brainstorm the AI and ML research fields.

1958 – The Perceptron

The psychologist Frank Rosenblatt attempted to build “the first machine capable of producing an original idea” and subsequently designed the Perceptron, the first neural network ever produced.

He combined Donald Hebb’s model of brain cell interaction with Arthur Samuel’s machine learning efforts. The Perceptron was fed a series of punch cards and, after 50 tries, learned to tell cards with markings on the left from cards with markings on the right.

Despite its promise, the perceptron could not identify many kinds of visual patterns, which frustrated researchers. It would be several years before investors’ and funding agencies’ frustrations faded away.
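For a feel of how a perceptron learns, here is a minimal sketch of the classic perceptron update rule on toy “cards.” The four-pixel cards, the labels, the learning rate, and the function names are assumptions made for illustration; they are not Rosenblatt’s original setup.

```python
def predict(weights, bias, x):
    """Step activation: output 1 when the weighted sum is positive."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=50, lr=1.0):
    """Perceptron rule: nudge the weights only when a prediction is wrong."""
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)  # -1, 0, or +1
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Hypothetical cards: 4 "pixels" per card; label 1 = mark on the left half,
# label 0 = mark on the right half.
cards = [
    ([1, 0, 0, 0], 1), ([0, 1, 0, 0], 1), ([1, 1, 0, 0], 1),
    ([0, 0, 1, 0], 0), ([0, 0, 0, 1], 0), ([0, 0, 1, 1], 0),
]

weights, bias = train_perceptron(cards)
print([predict(weights, bias, x) for x, _ in cards])  # [1, 1, 1, 0, 0, 0]
```

This works because left and right cards can be separated by a single straight line in input space. Patterns that cannot be separated that way are out of reach for a single perceptron, which is one reason it struggled with many visual patterns.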

1963 – A Game of Tic Tac Toe

Computer scientist Donald Michie designed the Machine Educable Noughts And Crosses Engine (MENACE), a large pile of matchboxes filled with beads that used reinforcement learning to play tic-tac-toe.

MENACE works a little like a neural network. It plays randomly at first, but after a few games it adjusts to favor the moves that led to wins in each situation.

You can compete with MENACE here.

1965 – The Multilayer Neural Networks Presented

Alexey (Oleksii) Ivakhnenko and Valentin Lapa are scientists who worked together to develop the first-ever multi-layer perceptron. It’s a hierarchical representation of a neural network that uses a polynomial activation function and is trained using the Group Method of Data Handling (GMDH).

Ivakhnenko is often considered the father of deep learning.

1967 – The Nearest Neighbor Algorithm

Thomas Cover and Peter Hart published their paper “Nearest Neighbor Pattern Classification” in 1967. It laid a foundation for pattern recognition and regression in machine learning.

The Nearest Neighbor algorithm is a very basic method that was developed to let computers perform rudimentary pattern recognition. It works by comparing a new item with the existing data and assigning it the class of its nearest neighbor, meaning the most similar item in memory. The same nearest-neighbor idea can also help a traveling salesman plan a route through the cities he has to visit.
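A minimal sketch of the nearest-neighbor rule, assuming Euclidean distance as the similarity measure. The data points, labels, and function name are invented for the example.

```python
import math


def nearest_neighbor(query, examples):
    """Return the label of the stored example closest to `query`."""
    _closest_point, closest_label = min(
        examples, key=lambda example: math.dist(query, example[0])
    )
    return closest_label


# Hypothetical labelled points held "in memory".
examples = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.5), "large"),
]

print(nearest_neighbor((2.0, 1.5), examples))  # small
print(nearest_neighbor((7.5, 8.0), examples))  # large
```

There is no training step at all: the method simply remembers every example and does its work at query time.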

1973 – 20th Century AI Winter

The Lighthill report by James Lighthill in 1973 presented a very pessimistic forecast for the development of core aspects of AI research, stating, “In no part of the field have the discoveries made so far produced the major impact that was then promised.” As a result, the British government cut the funding for AI research in all but two universities. This period is known in the history of machine learning as the AI winter.

1979 – Neocognitron and The Stanford Cart

Japanese computer scientist Kunihiko Fukushima published his work on the Neocognitron, a hierarchical multilayered network used to detect patterns. It later inspired the convolutional neural networks used for analyzing images, and it sparked a revolution in what we now call AI.

In the same year, a group of researchers from Stanford University created a robot called the Cart. It was a decades-long endeavor that evolved in various forms from 1960 to 1980. Created initially as a remote-controlled, television-equipped mobile robot, it became a machine radio-linked to a large mainframe computer that could independently navigate obstacles in a room.

The invention was state-of-the-art at the time, and it helped establish machine learning as a likely tool for building, and eventually reviving interest in, autonomous vehicles.

1981 – Explanation-Based Learning

By 1981, machine learning had already come a long way since its inception. That year, Gerald DeJong introduced the concept of Explanation-Based Learning (EBL), in which a computer analyses training data and creates a general rule it can follow by discarding unimportant data. For example, if the software is instructed to concentrate on the queen in chess, it will discard all pieces that have no immediate effect on her. This laid the foundation for modern supervised learning techniques.

1982 – The Hopfield Network

In 1982, American scientist John Hopfield created the Hopfield network, a form of recurrent neural network. It’s a special kind whose response differs from that of other neural networks.

The Hopfield network is an associative memory, which means it can store and recall patterns. It serves as a content-addressable memory system and would be instrumental for further RNN models of the modern deep learning era.
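Here is a minimal sketch of that store-and-recall behaviour, assuming the common textbook formulation with +1/-1 units and a Hebbian (outer-product) storage rule. The eight-unit pattern and the function names are illustrative choices.

```python
def train_hopfield(patterns):
    """Hebbian storage: strengthen the link between units that agree."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def recall(weights, state, sweeps=5):
    """Update units one at a time until the state settles on a stored memory."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            field = sum(w * s for w, s in zip(weights[i], state))
            state[i] = 1 if field >= 0 else -1
    return state


# Hypothetical 8-unit memory and a corrupted cue with the last two units flipped.
stored = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([stored])
noisy = [1, -1, 1, -1, 1, -1, -1, 1]

print(recall(weights, noisy) == stored)  # True
```

The network is addressed by content rather than by location: a partial or damaged version of a pattern is enough to retrieve the whole thing.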

1985 – NETtalk

In the mid-1980s, Terrence Sejnowski and Charles R. Rosenberg developed NETtalk. It was created with the goal of constructing simplified models that might shed light on human learning.

Following a knowledge-driven approach, it learned to pronounce written English text by being shown text as input along with the matching phonetic transcriptions for comparison. By simplifying models of human cognitive operations, it could learn to produce human-like pronunciation in a way that resembles how a baby learns to speak.

1986 – Restricted Boltzmann Machine

Initially introduced as the Harmonium, the Restricted Boltzmann Machine (RBM) was invented by cognitive scientist Paul Smolensky in 1986. It rose to prominence after University of Toronto scientist Geoffrey Hinton and collaborators invented fast learning algorithms for it in the mid-2000s.

The RBM is faster to train than the traditional Boltzmann Machine because it “restricts” connections between nodes: nodes within the same layer are not connected to each other. It’s an algorithm useful for dimensionality reduction, classification, regression, collaborative filtering, feature learning, and topic modeling.

1989 – Boosting for Machine Learning

The concept of boosting was first presented in a 1990 paper titled “The Strength of Weak Learnability” by Robert Schapire, who later developed the idea further together with Yoav Freund. It marked a necessary development for the evolution of machine learning.

As Schapire states, “A set of weak learners can create a single strong learner.” It simply translates to producing numerous weaker models and combining their predictions into a single powerful model.
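To see weak learners adding up to a strong one, here is a compact sketch. Note the swap: it uses AdaBoost, the later and better-known boosting algorithm by Freund and Schapire, rather than the construction from the 1990 paper. The weak learners are one-threshold “stumps,” and the ten-point dataset and all function names are assumptions for illustration.

```python
import math


def stump_predict(stump, x):
    """A stump answers `polarity` above its threshold and `-polarity` below."""
    threshold, polarity = stump
    return polarity if x > threshold else -polarity


def best_stump(xs, ys, sample_weights):
    """Weak learner: pick the stump with the lowest weighted error."""
    best, best_error = None, float("inf")
    for value in sorted(set(xs)):
        for polarity in (1, -1):
            stump = (value - 0.5, polarity)
            error = sum(w for x, y, w in zip(xs, ys, sample_weights)
                        if stump_predict(stump, x) != y)
            if error < best_error:
                best, best_error = stump, error
    return best, best_error


def adaboost(xs, ys, rounds):
    n = len(xs)
    sample_weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, error = best_stump(xs, ys, sample_weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # this stump's vote strength
        ensemble.append((alpha, stump))
        # Misclassified points gain weight, so the next stump focuses on them.
        sample_weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                          for x, y, w in zip(xs, ys, sample_weights)]
        total = sum(sample_weights)
        sample_weights = [w / total for w in sample_weights]
    return ensemble


def ensemble_predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical 1-D data that no single threshold can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost(xs, ys, rounds=3)
print([ensemble_predict(ensemble, x) for x in xs] == ys)  # True
```

On this toy data the best single stump still gets three of the ten points wrong, while the weighted vote of three stumps classifies all ten correctly.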

1991 – The Vanishing Gradient Problem

Although the start of the 1990s popularised methods such as support vector machines, there were still challenges along the way. One of them, the vanishing gradient problem, was first identified by Sepp Hochreiter in 1991. It was a major obstacle in machine learning development, specifically for deep neural networks.

As the number of layers in a network increases, the gradient that is passed back to the early layers during training shrinks until it eventually vanishes altogether. This can make the learning process extremely slow and difficult to manage.
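A rough numeric sketch shows why. It assumes sigmoid activations, the same weight on every layer, and a pre-activation of 0 everywhere; these are simplifying choices made for illustration, and the function names are made up. During backpropagation the gradient is multiplied by one local derivative per layer, and a sigmoid’s derivative is never larger than 0.25.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient_through_layers(depth, weight=1.0, pre_activation=0.0):
    """Size of the gradient after passing back through `depth` identical layers."""
    s = sigmoid(pre_activation)
    local_derivative = weight * s * (1.0 - s)  # at most 0.25 * |weight|
    return local_derivative ** depth


for depth in (1, 5, 10, 20):
    print(depth, gradient_through_layers(depth))
# depth 1 gives 0.25, while depth 10 is already below one millionth
```

Each extra layer multiplies the signal by a number below one, so the early layers of a deep network receive almost no information about how to improve.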

For years to come, this issue would continue to trouble the community.

1992 – Playing Backgammon

Researcher Gerald Tesauro created a program based on an artificial neural network capable of playing backgammon with abilities that matched top human players. The backgammon-playing software, called TD-Gammon, could play at a high level after just a few hours of training, and it continued to improve as it played more games.

The program’s success was a significant milestone in artificial intelligence and the history of machine learning, as it showed that neural networks could be used to create programs that could learn and improve through experience.

1997 – Deep Blue and the Milestone of LSTM

In 1997, IBM’s Deep Blue became the first computer chess-playing system to defeat a reigning world chess champion when it beat Garry Kasparov.


It’s also the year when Sepp Hochreiter and J