Every day we come across news of new ideas in machine learning and artificial intelligence. As these innovations keep evolving, many of the older or more traditional methods are becoming obsolete.
Flipping through a machine learning book from the 80s, we see that not all techniques lasted, often because of limitations or scalability issues. While plenty of useful machine learning methods exist, only a handful are put into practice. In this article, we look at machine learning ideas that have since become redundant.
In the 1960s, Stanford University professor Bernard Widrow and his doctoral student Ted Hoff developed the Adaptive Linear Neuron, or ADALINE, a single-layer artificial neural network built with memistors. The network consists of weights, a bias and a summation function.
ADALINE proved very useful in telecommunications and engineering, but its impact on cognitive science was limited: it can only handle linearly separable problems, and its training diverges (the weight updates blow up) when too large a learning rate is chosen.
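To make that structure concrete, here is a minimal sketch of an ADALINE-style unit trained with the Widrow-Hoff (LMS) rule; the toy data, variable names and learning rate are illustrative assumptions rather than part of the original design.

```python
import numpy as np

# Minimal ADALINE-style sketch: a single linear unit, output = w . x + b,
# trained with the Widrow-Hoff (LMS) delta rule on the raw linear output.

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))               # toy inputs (assumed data)
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)  # linearly separable targets

w = np.zeros(2)
b = 0.0
lr = 0.01   # too large a learning rate makes these updates diverge

for _ in range(50):
    for x_i, t in zip(X, y):
        out = np.dot(w, x_i) + b     # weighted sum plus bias
        error = t - out              # LMS error on the linear output
        w += lr * error * x_i        # Widrow-Hoff update
        b += lr * error

pred = np.where(X @ w + b > 0, 1, -1)
print("training accuracy:", (pred == y).mean())
```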
In 1985, Geoffrey Hinton, then at the computer science department of Carnegie Mellon University, developed the Boltzmann machine, a two-layer neural network whose components later became building blocks of deep networks. Essentially, the machine is a stochastic recurrent neural network whose units are divided into visible (V) and hidden (H) units; each node makes binary decisions and carries its own bias. It is an unsupervised learning model.
In the restricted variant of the machine, nodes in the visible and hidden layers are connected across layers, but no two nodes within the same layer are linked, so there is no intra-layer communication.
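A minimal sketch of that layout, assuming a restricted Boltzmann machine with made-up sizes and random weights (training by contrastive divergence is omitted): because there are no intra-layer connections, each layer can be sampled in a single step given the other.

```python
import numpy as np

# Restricted-Boltzmann-machine sketch: binary visible and hidden units,
# a weight matrix W between the layers, and per-unit biases a (visible) and b (hidden).

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
a = np.zeros(n_visible)   # visible biases
b = np.zeros(n_hidden)    # hidden biases

v = rng.integers(0, 2, size=n_visible).astype(float)  # a binary visible vector

# One Gibbs sampling step: sample hidden given visible, then visible given hidden.
p_h = sigmoid(v @ W + b)
h = (rng.random(n_hidden) < p_h).astype(float)
p_v = sigmoid(h @ W.T + a)
v_new = (rng.random(n_visible) < p_v).astype(float)

print("hidden activation probabilities:", p_h)
print("reconstructed visible sample:", v_new)
```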
In 1994, building on the Boltzmann machine, Hinton and Peter Dayan at the University of Toronto developed the Helmholtz machine. Named after Hermann von Helmholtz, the machine was created to treat the human perceptual system as a statistical inference engine that infers the probable causes of noisy sensory input. It consists of two networks: a recognition network that takes the input data and produces a distribution over the hidden variables, and a generative network that generates values of the hidden variables and, from them, the data.
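The rough sketch below shows only that two-network structure, with assumed sizes and random weights; the wake-sleep procedure actually used to train a Helmholtz machine is left out.

```python
import numpy as np

# Two-network structure of a Helmholtz machine (illustrative only):
# a recognition network infers hidden causes, a generative network produces data.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_visible, n_hidden = 8, 4

R = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # recognition weights
r_bias = np.zeros(n_hidden)
G = rng.normal(scale=0.1, size=(n_hidden, n_visible))  # generative weights
g_bias = np.zeros(n_visible)

x = rng.integers(0, 2, size=n_visible).astype(float)   # a toy binary input

q_h = sigmoid(x @ R + r_bias)                    # recognition: distribution over hidden causes
h = (rng.random(n_hidden) < q_h).astype(float)   # sample a hidden explanation
p_x = sigmoid(h @ G + g_bias)                    # generation: distribution over the data

print("inferred hidden probabilities:", q_h)
print("generated data probabilities:", p_x)
```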
Like ADALINE, the Helmholtz and Boltzmann machines are artificial neural networks, but instead of a single layer they work across two layers of units, and the Helmholtz machine couples two networks. In recent times, with the introduction of GANs and VAEs in various machine learning applications, Boltzmann machines have become more or less obsolete.
With the rise of Neural Radiance Fields (NeRF), there has been a resurgence of the kernel method, which peaked around 2010. The method uses basis functions to map the input data into a different, typically higher-dimensional, feature space. Models can then be trained on this newly created feature space instead of the input space, often resulting in better performance.
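A small illustration of that idea, assuming scikit-learn and a toy concentric-circles dataset (both are choices made here, not from the original text): the same classifier struggles in the raw input space but does well once an RBF kernel implicitly maps the data into a richer feature space.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Toy data that is not linearly separable in the input space.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

linear_svm = SVC(kernel="linear").fit(X, y)   # works directly in the input space
rbf_svm = SVC(kernel="rbf").fit(X, y)         # implicit basis-function (kernel) mapping

print("linear kernel accuracy:", linear_svm.score(X, y))
print("RBF kernel accuracy:", rbf_svm.score(X, y))
```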
In 2014, Google DeepMind's Alex Graves developed the Neural Turing Machine (NTM). The architecture contains two components: a neural network controller and a memory bank. What differentiates NTM from other architectures is that its controller also interacts with an external memory matrix through read and write operations. Building on NTM, Graves and his DeepMind colleagues later created the differentiable neural computer (DNC), which adds a dynamic external memory to address the memory limitations of NTM.
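To show how the controller "interacts with the memory matrix", here is a minimal sketch of NTM-style content-based addressing with made-up sizes: a key emitted by the controller is compared with every memory row by cosine similarity, and a softmax turns the similarities into read weights.

```python
import numpy as np

# Content-based addressing over an external memory matrix (illustrative sizes).

rng = np.random.default_rng(0)
memory = rng.normal(size=(16, 8))   # 16 memory slots, each of width 8
key = rng.normal(size=8)            # key vector emitted by the controller
beta = 2.0                          # key strength, sharpens the focus

cos_sim = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
weights = np.exp(beta * cos_sim)
weights /= weights.sum()            # softmax over memory locations

read_vector = weights @ memory      # weighted read from the memory matrix
print("read weights:", weights.round(3))
print("read vector:", read_vector.round(3))
```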
Several traditional machine learning techniques have died out because they were not improved alongside evolving technology, while others have been revived as better-performing models. Many older models could perform inference on data but lacked the ability to store information, or use memory, to speed up the process.
Prior to the deep learning revolution, neural networks themselves were considered dead. They were revived largely through the restricted Boltzmann machine, which was itself regarded as an obscure algorithm at the time.
However, some traditional algorithms might yet prove revolutionary with a different implementation on modern hardware, larger datasets, and more. Many revolutionary ideas are, after all, old ideas approached from a different perspective. “Reviving old techniques and coupling them with the latest understanding is a classic PhD thesis topic,” says Daniel VanderMeer.