History of Machine Learning: The Complete Timeline [UPDATED]

A momentous event that put machine learning in the spotlight came in 2016, when Google DeepMind’s AlphaGo AI made history by defeating one of the best Go players in the world. Because Go is a complex board game that demands strong intuition and abstract thinking, many people were shocked that a machine could play it at that level.

Machine learning algorithms have become ubiquitous today, powering everything from search engines to self-driving cars. But how did it all start?

In this blog, we’ll explore the complete timeline of the history of machine learning. Let’s get started!

1943 – The First Mathematical Model of a Biological Neuron

Walter Pitts and Warren McCulloch created the first mathematical model of neural networks in 1943. Their scientific paper, “A Logical Calculus of the Ideas Immanent in Nervous Activity,” was used to create algorithms that mimic human thought processes.

The McCulloch-Pitts neuron has very limited capability and no learning mechanism. Yet it was the real starting point for the modern discipline of machine learning and later paved the way for deep learning and quantum machine learning.

1949 – The Hebb Synapse

Canadian psychologist Donald O. Hebb published his book “The Organization of Behavior: A Neuropsychological Theory.” In it, Hebb presented theories about neuron excitation and communication between neurons that influenced how psychologists view stimulus processing in the mind.

The concept was first used to study how brains learn. It also paved the way for computational systems that mimic natural neurological processes, including machine learning.

1950 – The Turing Test

In 1950, Alan Turing, an English mathematician and computer scientist, proposed the Turing Test as a measure of a computer’s intelligence. If a person can’t tell whether they’re talking to another person or a computer, the computer is considered intelligent.

The Turing Test has been criticized because it is difficult to design a fair and accurate version of it, and because it does not adequately measure intelligence on its own. However, it remains an essential milestone in the history of artificial intelligence research.

1952 – Machine Learning and the Game of Checkers

American computer scientist Arthur Samuel wrote a self-learning program for playing checkers, designed to run on the IBM 701. He gave it a scoring function that measures each side’s chances of winning and used an early form of alpha-beta pruning to limit the number of positions it had to search.

This computer program chooses its next move using a minimax algorithm, which calculates the best possible move for a player in a game by minimizing the opponent’s maximum gain and maximizing the player’s minimum gain.
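
To make the minimax idea concrete, here is a minimal sketch in Python. It is not Samuel’s checkers program: the nested-list game tree and its leaf scores are made up for illustration, and the function names are our own.

```python
def minimax(node, maximizing):
    """Return the minimax value of a game tree.

    A leaf is a number (the score from the maximizing player's point of view);
    an internal node is a list of child nodes.
    """
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_move(children):
    """Index of the branch the maximizing player should choose."""
    # After our move it is the opponent's turn, so each branch is minimized.
    scores = [minimax(child, maximizing=False) for child in children]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical two-ply game: we pick a branch, then the opponent replies.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, maximizing=True))  # 3
print(best_move(tree))                 # 0
```

The opponent would answer each branch with its lowest score (3, 2, and 2), so the first branch is the safest choice for the player.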

Arthur Samuel was the first person to coin and popularize the term “machine learning,” which he did in 1959.

1956 – The Birthplace of Artificial Intelligence

In machine learning history, the Dartmouth Workshop in 1956 is widely considered to be the founding event of artificial intelligence as a field. Computer scientist John McCarthy invited well-known mathematicians, scientists, and researchers to a six-to-eight-week workshop. They gathered at Dartmouth College to brainstorm and establish AI and ML as research fields.

1958 – The Perceptron

The psychologist Frank Rosenblatt attempted to build “the first machine capable of producing an original idea” and subsequently designed the Perceptron, the first neural network ever produced.

He combined Donald Hebb’s model of brain cell interaction with Arthur Samuel’s machine learning efforts. The Perceptron was fed a series of punch cards and, after 50 tries, learned to distinguish cards marked on the left from cards marked on the right.

Despite its promise, the perceptron could not identify many kinds of visual patterns, causing researchers to become frustrated. It would be several years before investors’ and funding agencies’ frustrations faded away.

1963 – A Game of Tic Tac Toe

Computer scientist Donald Michie designed the Matchbox Educable Noughts And Crosses Engine (MENACE), a large collection of matchboxes filled with colored beads that used reinforcement learning to play tic-tac-toe.

MENACE works a little like a neural network. It plays randomly at first, but after a few games, it adjusts to favor the moves that led to wins in each situation.
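
The bead mechanism can be sketched in a few lines of Python. This is a simplified illustration rather than a reconstruction of Michie’s machine: the class name, the starting bead count, the reward value, and the rule of always keeping at least one bead are all assumptions made for the example.

```python
import random


class MatchboxLearner:
    """Bead-counting learner in the spirit of MENACE (simplified sketch)."""

    def __init__(self, initial_beads=3, seed=None):
        self.initial_beads = initial_beads   # assumed starting count per move
        self.boxes = {}                      # state -> {move: bead count}
        self.rng = random.Random(seed)

    def choose(self, state, legal_moves):
        """Draw a move with probability proportional to its bead count."""
        box = self.boxes.setdefault(
            state, {move: self.initial_beads for move in legal_moves})
        moves = list(box)
        weights = [box[move] for move in moves]
        return self.rng.choices(moves, weights=weights)[0]

    def reinforce(self, history, reward):
        """Add (or remove) beads for every (state, move) played in a game."""
        for state, move in history:
            box = self.boxes[state]
            # Assumption: never let a move drop below one bead.
            box[move] = max(1, box[move] + reward)


learner = MatchboxLearner(seed=0)
move = learner.choose("empty board", ["corner", "edge", "centre"])
learner.reinforce([("empty board", move)], reward=3)  # assumed bonus for a win
print(learner.boxes["empty board"][move])  # 6
```

Moves that keep leading to wins collect more beads, so they are drawn more often in later games.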

1965 – The Multilayer Neural Networks Presented

Scientists Alexey (Oleksii) Ivakhnenko and Valentin Lapa worked together to develop the first-ever multi-layer perceptron. It’s a hierarchical representation of a neural network that uses a polynomial activation function and is trained using the Group Method of Data Handling (GMDH).

Ivakhnenko is often considered the father of deep learning.

1967 – The Nearest Neighbor Algorithm

Thomas Cover and Peter Hart published their paper “Nearest Neighbor Pattern Classification” in 1967. It laid a foundation for pattern recognition and regression in machine learning.

The nearest neighbor algorithm is a very basic method of pattern recognition. It classifies a new item by comparing it with the examples already stored in memory and assigning it the label of its nearest neighbor, meaning the most similar stored item. A related nearest-neighbor heuristic can also plan a route for a traveling salesman who starts in a random city and always moves on to the closest unvisited one.
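
Here is a minimal sketch of nearest neighbor classification in Python. The data points, labels, and function name are made up for illustration, and Euclidean distance is an assumed choice of similarity measure.

```python
import math
from collections import Counter


def knn_classify(query, examples, k=1):
    """Label `query` by majority vote among its k nearest labeled examples.

    `examples` is a list of (point, label) pairs; points are equal-length tuples.
    """
    by_distance = sorted(examples, key=lambda ex: math.dist(query, ex[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up 2-D data for illustration.
examples = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
    ((5.0, 5.0), "B"), ((5.5, 4.5), "B"),
]
print(knn_classify((1.1, 0.9), examples))       # A
print(knn_classify((4.8, 5.2), examples, k=3))  # B
```

With `k=1` this is the plain nearest neighbor rule; larger values of `k` let several neighbors vote, which makes the result less sensitive to a single noisy example.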

1973 – 20th Century AI Winter

The Lighthill report by James Lighthill in 1973 presented a very pessimistic forecast for the development of core aspects in AI research, stating, “In no part of the field have the discoveries made so far produced the major impact that was then promised.” This led to reduced AI research funding in all but two British universities, marking a period in machine learning history known as the AI winter.

1979 – Neocognitron and The Stanford Cart

Japanese computer scientist Kunihiko Fukushima published his work on the Neocognitron, a hierarchical multilayered network for detecting patterns that went on to inspire the convolutional neural networks now used for analyzing images. It sparked a revolution in what we now call AI.

In the same year, a group of researchers from Stanford University reached a milestone with a robot called the Cart. The project was a decades-long endeavor that evolved in various forms from 1960 to 1980. Initially a remote-controlled, television-equipped mobile robot, it became a machine radio-linked to a large mainframe computer that could independently navigate around obstacles in a room.

The invention was state-of-the-art at the time, and it pointed to machine learning as a likely tool for building autonomous vehicles.

1981 – Explanation-Based Learning

In 1981, Gerald DeJong introduced the concept of Explanation-Based Learning (EBL), in which a computer analyzes training data and creates a general rule it can follow by discarding unimportant data. For example, if the software is instructed to concentrate on the queen in chess, it will discard all pieces that have no immediate effect. This is often described as an early step toward modern supervised learning techniques.

1982 – The Hopfield Network

In 1982, American scientist John Hopfield created the Hopfield Network, a special kind of recurrent neural network in which the units are connected to one another and settle into stable states.

The Hopfield network is an associative memory, which means it can store and recall patterns. It serves as a content-addressable memory system and would be instrumental for further RNN models of the modern deep learning era.
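
The store-and-recall behavior can be sketched in plain Python. This is a minimal illustration, assuming patterns of +1/-1 values, a Hebbian outer-product rule for the weights, and one-unit-at-a-time updates. The function names and the example pattern are our own.

```python
def train_hopfield(patterns):
    """Build a weight matrix from +1/-1 patterns with a Hebbian outer-product rule."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, sweeps=5):
    """Update one unit at a time until the state stops changing."""
    state = list(state)
    for _ in range(sweeps):
        changed = False
        for i, row in enumerate(weights):
            total = sum(w * s for w, s in zip(row, state))
            new_value = 1 if total >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


stored = [1, 1, 1, 1, -1, -1, -1, -1]
weights = train_hopfield([stored])
noisy = [-1, 1, 1, 1, -1, -1, -1, -1]    # first value flipped
print(recall(weights, noisy) == stored)  # True
```

Given a corrupted copy of the stored pattern, the network settles back onto the original, which is what “content-addressable” means here: the content itself is the lookup key.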

1985 – NETtalk

In the mid-1980s, Terrence Sejnowski and Charles R. Rosenberg developed NETtalk. It was created with the goal of constructing simplified models that might shed light on human learning.

NETtalk learns to pronounce written English text by being shown text as input along with matching phonetic transcriptions for comparison. By simplifying models of human cognitive operations, it could produce human-like speech, improving with practice much as a baby learns to talk.

1986 – Restricted Boltzmann Machine

Initially introduced as Harmonium, the Restricted Boltzmann Machine (RBM) was invented by cognitive scientist Paul Smolensky in 1986. It rose to prominence after University of Toronto scientist Geoffrey Hinton and collaborators invented fast learning algorithms for it in the mid-2000s.

RBM is faster than the traditional Boltzmann Machine because it “restricts” connections between nodes. It’s an algorithm useful for dimensionality reduction, classification, regression, collaborative filtering, feature learning, and topic modeling.

1989 – Boosting for Machine Learning

The concept of boosting was first presented in a 1990 paper titled “The Strength of Weak Learnability” by Robert Schapire. Yoav Freund and Schapire later built on the idea to create the AdaBoost algorithm. Boosting marked a necessary development in the evolution of machine learning.

As Schapire states, “A set of weak learners can create a single strong learner.” In practice, this means producing numerous weak models and combining their predictions into a single powerful model.
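
To show weak learners combining into a strong one, here is a minimal sketch of AdaBoost, the later boosting algorithm by Freund and Schapire, rather than the construction from the 1990 paper. The weak learners are one-feature threshold rules (“decision stumps”), and the tiny data set and function names are made up for illustration.

```python
import math


def stump_predict(x, threshold, polarity):
    """A weak learner: a one-feature threshold rule returning +1 or -1."""
    return polarity if x > threshold else -polarity


def best_stump(xs, ys, weights):
    """Pick the threshold/polarity pair with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for x, y, w in zip(xs, ys, weights)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds=3):
    """Return a list of (alpha, threshold, polarity) weak learners."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of misclassified points, lower the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all weak learners."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


# Made-up data that no single threshold rule can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs] == ys)  # True
```

Each stump on its own gets at least two of the six points wrong, but the weighted vote of three stumps labels every point correctly.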

1991 – The Vanishing Gradient Problem

Although the start of the 1990s popularised methods such as support vector machines, there were still challenges along the way. In 1991, Sepp Hochreiter first identified the vanishing gradient problem, a challenge in training deep neural networks.

As the number of layers in a network increases, the gradient passed back to the early layers shrinks until it effectively vanishes. This can make the learning process extremely slow and difficult to manage.

For years to come, this issue would continue to trouble the community.
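
A quick calculation shows how fast the gradient shrinks. The sketch below assumes sigmoid activations, a single path through the network, and the same weight and input at every layer. Those are simplifications chosen for illustration, and the function names are our own.

```python
import math


def sigmoid_derivative(z):
    """Derivative of the sigmoid function; it is never larger than 0.25."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)


def gradient_scale(num_layers, z=0.0, weight=1.0):
    """Product of the per-layer factors weight * sigmoid'(z) along one path."""
    return (weight * sigmoid_derivative(z)) ** num_layers


for depth in (1, 5, 10, 20):
    print(depth, gradient_scale(depth))
```

Even in the best case for the sigmoid, each layer multiplies the gradient by 0.25, so after ten layers less than one millionth of the original signal is left.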

1992 – Playing Backgammon

Researcher Gerald Tesauro created a program based on an artificial neural network capable of playing backgammon with abilities that matched top human players. The backgammon-playing software is called TD-Gammon. It could play at a high level after just a few hours of training, and it continued to improve as it played more games.

The program’s success was a significant milestone in artificial intelligence and the history of machine learning, as it showed that neural networks could be used to create programs that could learn and improve through experience.

1997 – Deep Blue and the Milestone of LSTM

In 1997, IBM’s Deep Blue became the first computer chess-playing system to defeat a reigning world chess champion when it beat Garry Kasparov.

It’s also the year when Sepp Hochreiter and Jürgen Schmidhuber published a groundbreaking paper on “Long Short-Term Memory” (LSTM), a recurrent neural network architecture that would revolutionize deep learning in the decades that followed.

2002 – The Release of Torch

In 2002, the open-source machine learning library Torch was released. This library allowed for more flexibility and customizability than other libraries at the time and quickly became popular among researchers.

2006 – Deep Belief Network

This year marks a remarkable time in the history of machine learning because Geoffrey Hinton and his collaborators introduced a fast learning algorithm for deep networks, the kind of models that help computers distinguish objects and text in images and videos.

Together with Simon Osindero and Yee-Whye Teh, Hinton published the paper “A fast learning algorithm for deep belief nets,” in which they stacked multiple RBMs together in layers and called them Deep Belief Networks. The training process is much more efficient for large amounts of data.

2009 – ImageNet

Fei-Fei Li, a professor at Stanford, launched ImageNet in 2009, a database that has grown to more than 14 million labeled images. It became a benchmark for deep learning researchers, who competed each year in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC).

The Economist described the creation of this database as an exceptional event for popularizing AI throughout the tech community, marking a new era of deep learning history.

2010 – Microsoft’s Kinect

The release of Kinect, a motion-sensing input device for the Xbox 360 gaming console, made 2010 a remarkable year in machine learning history. It can track 20 different human features 30 times per second.

2011 – IBM’s Watson and Google Brain

Watson is a cognitive system developed by IBM powered by artificial intelligence and natural language processing. In 2011, Watson competed on the game show Jeopardy! against two human competitors and won. This made it the first computer system ever to win a quiz show against humans.

During the same year, Google’s X Lab team launched Google Brain, a deep learning project. The aim was to create a deep neural network that could learn on its own to recognize cats in images taken from YouTube videos, much as the human brain does.

The team presented its paper, “Building high-level features using large scale unsupervised learning,” which showed that it is possible to train a face detector without labeling images as containing a face or not. It was a significant breakthrough in the history of machine learning, especially in image processing.

2012 – ImageNet Classification

In 2012, Alex Krizhevsky, Geoffrey Hinton, and Ilya Sutskever published a research paper detailing a model that can reduce the error rate in image recognition systems by leaps and bounds.

AlexNet, a GPU-based CNN model created by Alex Krizhevsky, won ImageNet’s image classification contest with an accuracy of 84%, a significant improvement over the 75 percent success rate of prior models. This victory started a deep learning revolution that would span the globe.

2014 – Facebook’s DeepFace and Google’s Sibyl

Facebook developed DeepFace, a deep learning facial recognition algorithm that can recognize and verify individuals in photos with near-human accuracy. It’s one of the advanced computer algorithms that can identify human faces with an accuracy of 97.35%. This landmark achievement in facial recognition was expected to have a profound impact on Facebook’s ability to keep user data secure and fight crime.

Another milestone in the history of machine learning is the release of Google’s Sibyl, a large-scale machine learning system, to the public. The system also includes numerous sophisticated algorithms for predicting user behavior.

2015 – Platform for Machine Learning Algorithms and Toolkit

Amazon launched its own machine learning platform, making machine learning accessible to anyone with an Amazon Web Services (AWS) account. The platform provides a set of tools and algorithms for data scientists to build and train models.

Microsoft also released the Distributed Machine Learning Toolkit, which allowed machine learning problems to be distributed efficiently across multiple computers.

2016 – AlphaGo Algorithm and Face2Face

Go is an ancient Chinese board game with so many possible moves at each step that future positions are hard to predict. In March 2016, the AlphaGo algorithm shocked the world by defeating one of the best Go players, Lee Sedol.

Also in 2016, a team of scientists unveiled Face2Face at the Conference on Computer Vision and Pattern Recognition. Much of today’s “DeepFake” software builds on similar techniques.

2017 – Waymo

Waymo became the first self-driving car company to operate vehicles on public roads without a human at the wheel. The company’s vehicles have driven over 5 million miles on public roads, with human drivers intervening only when necessary during testing. The launch of Waymo’s self-driving taxi service marked a major milestone for the company, which is now working towards expanding its fleet of vehicles and services. Later the same year, it introduced completely autonomous taxis in the city of Phoenix.

2018 – DeepMind’s AlphaFold

After creating AlphaGo, the DeepMind team took a first step toward algorithms for scientific problems such as protein folding. AlphaFold was built to predict the 3D shapes of proteins, the fundamental molecules of life. The team trained a neural network on thousands of known proteins until it could predict 3D structures from amino acid sequences on its own. The network predicts the distances between pairs of amino acids and the angles between the chemical bonds that connect them.

2020 – GPT-3 and the Rise of No-Code AI

When the world was grappling with the pandemic in 2020, OpenAI created an artificial intelligence algorithm, GPT-3, that could generate human-like text. At the time, it was widely regarded as the most advanced language model in the world, with 175 billion parameters, and it was trained on Microsoft Azure’s AI supercomputer.

Aside from that, Zapier reported an enormous rise in the use of no-code or low-code AI tools from the beginning of 2020. Some popular no-code AI platforms include Google’s AutoML, Amazon’s SageMaker, and Microsoft’s Azure ML. These platforms allow users with little or no coding experience to train and deploy machine learning models. The movement is driven by businesses’ demand to produce AI applications quickly and at low cost.

2021 – TrustML and OpenAI’s DALL-E

Indian American computer scientist Himabindu “Hima” Lakkaraju not only co-founded the Trustworthy ML Initiative (TrustML), but she also leads the AI4LIFE research group at Harvard. Her goal is to make machine learning more attainable for laypeople while continuing studies on how to make these models interpretable, fair, private, and secure.

Introduced in January 2021, DALL-E is a variant of GPT-3, a language-processing model from OpenAI. It generates images from text descriptions, adding a whole new dimension to language processing. Powered by the transformer neural network, DALL-E is reshaping how we interact with AI technology.

2022 – ChatGPT’s Debut, DeepMind’s AlphaTensor, and More T2I Models

OpenAI unveiled an early ChatGPT demo on November 30, 2022. The chatbot went viral on social media, showcasing its versatility. From travel planning to writing fables and coding, users marveled at its capabilities.

Within five days, it amassed over a million users.

In October, DeepMind introduced AlphaTensor. According to a DeepMind blog, AlphaTensor extends AlphaZero, which excelled in chess and Go. This new work progresses from games to tackling unsolved mathematical problems.

DALL-E 2 was also released this year and was named one of TIME magazine’s best inventions. Midjourney released its v1 as well, and Stable Diffusion arrived on the scene, setting the stage for text-to-image (T2I) models.

2023 – LLMs and Computer Vision Reign the Scene

In 2023, we saw the rise of LLMs or Large Language Models, with GPT-4 being launched on March 14, 2023.

We also saw the evolution of LLMs into multimodal systems, known as Multimodal LLMs (MLLMs). Notable MLLMs include OpenAI’s GPT-4 Vision and Google DeepMind’s Gemini. These allow users to interact with the system using text, images, and speech.

Additionally, computer vision continued to make significant progress. Models built on the Vision Transformer (ViT), a deep learning architecture that Google researchers introduced in 2020, matched or outperformed earlier methods on image recognition tasks. ViT uses self-attention mechanisms similar to those used in language models, bringing the worlds of natural language processing and computer vision closer together.

2024 and Beyond

In the future, we can expect machine learning improvements in:

Quantum Machine Learning (QML)

Quantum computers could lead to faster processing of data, enhancing an algorithm’s ability to analyze data sets and draw meaningful insights from them.

Machine Learning Operationalization Management (MLOps)

This helps machine learning algorithms deployed in production to perform optimally and reliably.

Automated Machine Learning (AutoML)

AutoML will make the process of training models easier, helping with data labeling and reducing human error in operations.

Robotic Process Automation (RPA)

An RPA bot needs well-prepared data before it can process a task, and machine learning will help it produce fewer errors.

Discover Machine Learning with Us!

Machine learning is vital to enterprises because it helps entrepreneurs understand both customer behavior and how their business operates. If you want to discover how machine learning can help your business, contact us!

About the author: Andrea Jacinto - Content Writer
