Long Short-Term Memory Networks

LSTM networks represented a milestone in addressing the vanishing gradient problem that plagued the training of deep neural architectures. This problem occurs when gradients diminish exponentially as they are back-propagated through many layers or time steps, leaving the earliest layers with almost no learning signal. LSTMs counter this with a gated memory cell: forget, input, and output gates control what the cell state discards, stores, and exposes, and the cell's additive update gives gradients a path that resists vanishing over long sequences.
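As a concrete illustration, here is a minimal NumPy sketch of a single LSTM step following the standard gate equations. The parameter layout (one stacked weight matrix W over the concatenated input and hidden state, with a matching stacked bias b) and all names and sizes are illustrative choices, not anything prescribed by the original papers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W stacks the four gate weight blocks
    (forget, input, candidate, output) over [x, h_prev]."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[0:H])        # forget gate: what to erase from the cell
    i = sigmoid(z[H:2*H])      # input gate: how much new content to write
    g = np.tanh(z[2*H:3*H])    # candidate cell update
    o = sigmoid(z[3*H:4*H])    # output gate: how much of the cell to expose
    c = f * c_prev + i * g     # additive update: the gradient-friendly path
    h = o * np.tanh(c)
    return h, c

# toy usage with random parameters (sizes are arbitrary)
rng = np.random.default_rng(0)
X, H = 3, 4
W = rng.normal(size=(4 * H, X + H)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(5):
    h, c = lstm_step(rng.normal(size=X), h, c, W, b)
print(h)
```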

Conv Nets by Yann LeCun

The researchers demonstrated the application of backpropagation in neural networks for recognizing handwritten digits, specifically focusing on zip code digits provided by the U.S. Postal Service. The network architecture was deliberately constrained for the task: local receptive fields, shared weights, and subsampling kept the number of free parameters small and built in tolerance to small shifts and distortions of the input.
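The central constraint is weight sharing: one small kernel is scanned across the whole image, so a feature detector costs a handful of weights rather than one weight per input pixel per unit. The NumPy sketch below shows that idea in isolation; the 16x16 input, 5x5 kernel, and stride of 2 are illustrative values, not the paper's exact architecture.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Valid 2-D convolution with a single shared kernel: the same
    weights scan every position, which is the weight-sharing
    constraint that keeps the parameter count small."""
    kh, kw = kernel.shape
    H = (image.shape[0] - kh) // stride + 1
    W = (image.shape[1] - kw) // stride + 1
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# toy 16x16 "digit" and a 5x5 kernel (random stand-in for learned weights)
rng = np.random.default_rng(1)
digit = rng.normal(size=(16, 16))
kernel = rng.normal(size=(5, 5)) * 0.1
feature_map = np.tanh(conv2d(digit, kernel, stride=2))
print(feature_map.shape)  # (6, 6): 25 shared weights cover the whole image
```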

A New Training Algorithm for Madaline Networks

The 1988 paper “MADALINE RULE II: A Training Algorithm for Neural Networks,” authored by B. Widrow et al. and presented at the International Conference on Neural Networks (ICNN), introduces an advanced training algorithm known as MADALINE RULE II. The rule trains multilayer networks of hard-threshold adalines by the minimal-disturbance principle: the adaline whose analog sum is closest to zero is trial-adapted first, and a weight change is kept only if it reduces the network's output error, so each training pattern perturbs the network as little as possible.
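The following is a deliberately simplified NumPy sketch of that accept/reject trial structure, under several assumptions of mine: a single output, one trial adaline per pattern, a fixed vote combiner v, and an LMS-style weight nudge. The paper's full procedure is richer (it also trials pairs of units, for example), and this toy run is not guaranteed to converge; the point is the minimal-disturbance loop.

```python
import numpy as np

def hard(z):
    """Hard threshold into {-1, +1}."""
    return np.where(z >= 0.0, 1.0, -1.0)

def madaline_output(x, W, v):
    """Two-layer Madaline: hard-threshold adalines whose +/-1
    outputs are combined by a fixed vote vector v."""
    return hard(hard(x @ W) @ v)

def mrii_epoch(X, y, W, v, lr=0.1):
    for x, target in zip(X, y):
        if madaline_output(x, W, v) == target:
            continue                      # correct already: disturb nothing
        sums = x @ W
        k = int(np.argmin(np.abs(sums)))  # least-confident adaline first
        desired = -hard(sums[k])          # trial: flip its decision
        W_trial = W.copy()
        W_trial[:, k] += lr * (desired - sums[k]) * x  # LMS-style nudge
        if madaline_output(x, W_trial, v) == target:   # keep only if it helps
            W = W_trial
    return W

# toy usage: 2 adalines, inputs in {-1, +1} with a bias column
rng = np.random.default_rng(2)
X = np.array([[1, -1, 1], [-1, 1, 1], [1, 1, 1], [-1, -1, 1]], dtype=float)
y = np.array([1.0, 1.0, -1.0, -1.0])
W = rng.normal(size=(3, 2))
v = np.ones(2)
for _ in range(20):
    W = mrii_epoch(X, y, W, v)
print([float(madaline_output(x, W, v)) for x in X])
```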

Boltzmann Machines

Boltzmann machines are a type of stochastic recurrent neural network invented in 1983 by Geoffrey Hinton and Terry Sejnowski, and later popularized in cognitive science by Hinton, Sejnowski, and Yann LeCun. Inspired by energy-based models in statistical physics, they assign an energy to every configuration of their binary units and sample states from the corresponding Boltzmann distribution, so that low-energy configurations, which encode good hypotheses about the data, are visited most often.
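A minimal sketch of that sampling dynamic, assuming the standard energy E(s) = -½ sᵀWs - bᵀs over binary states and sigmoid unit updates; the annealing schedule and network size here are arbitrary choices for illustration, not the paper's learning procedure.

```python
import numpy as np

def energy(s, W, b):
    """Energy of a binary state s in {0,1}^n:
    E(s) = -0.5 * s^T W s - b^T s  (W symmetric, zero diagonal)."""
    return -0.5 * s @ W @ s - b @ s

def gibbs_sweep(s, W, b, T, rng):
    """One sweep of stochastic updates: each unit turns on with
    probability sigmoid(net / T), so low-energy states become more
    probable as the temperature T is lowered."""
    for i in range(len(s)):
        net = W[i] @ s + b[i]
        p_on = 1.0 / (1.0 + np.exp(-net / T))
        s[i] = 1.0 if rng.random() < p_on else 0.0
    return s

# toy usage: anneal a small random network toward low-energy states
rng = np.random.default_rng(3)
n = 6
W = rng.normal(size=(n, n))
W = (W + W.T) / 2.0
np.fill_diagonal(W, 0.0)
b = rng.normal(size=n)
s = rng.integers(0, 2, size=n).astype(float)
for T in np.linspace(2.0, 0.1, 50):   # simulated-annealing schedule
    s = gibbs_sweep(s, W, b, T, rng)
print(s, energy(s, W, b))
```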

First Convolutional Neural Networks

K. Fukushima's 1980 paper, “Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position,” published in Biological Cybernetics, introduces a groundbreaking neural network model designed to recognize visual patterns regardless of where they appear in the input. The network stacks alternating layers of feature-extracting S-cells and position-tolerant C-cells, and it organizes itself through unsupervised learning, without a teacher, acquiring shift-tolerant feature detectors from repeated exposure to patterns.
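To make the S-cell/C-cell division of labor concrete, here is a small NumPy sketch: the S-layer responds where a local patch matches a feature template, and the C-layer pools the strongest nearby S-response so a small shift of the input barely changes the output. The thresholded correlation, pooling size, and toy input are my simplifications, not the Neocognitron's actual cell equations.

```python
import numpy as np

def s_layer(image, template):
    """S-cells: each unit responds when its local patch correlates
    with a feature template (thresholded at zero here)."""
    kh, kw = template.shape
    H, W = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = max(0.0, np.sum(image[i:i+kh, j:j+kw] * template))
    return out

def c_layer(fmap, pool=2):
    """C-cells: keep the strongest S-cell response in each small
    neighborhood, so the feature survives a small positional shift."""
    H, W = fmap.shape[0] // pool, fmap.shape[1] // pool
    return np.array([[fmap[i*pool:(i+1)*pool, j*pool:(j+1)*pool].max()
                      for j in range(W)] for i in range(H)])

# a one-pixel shift of the input barely changes the pooled response
img = np.zeros((8, 8))
img[2:5, 2:5] = 1.0
template = np.ones((3, 3)) / 9.0
print(c_layer(s_layer(img, template)))
print(c_layer(s_layer(np.roll(img, 1, axis=1), template)))
```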

GPT-4

The GPT-4 Technical Report by OpenAI details the development of GPT-4, a multimodal model that processes both text and image inputs to produce text outputs. GPT-4 shows human-level performance on various professional and academic benchmarks, including scoring around the top 10% of test takers on a simulated bar exam. The report also describes the post-training alignment process, which improves the model's factuality and adherence to desired behavior, and an infrastructure built for predictable scaling, which allowed some aspects of GPT-4's performance to be forecast from models trained with far less compute.