GPT-3

GPT-3, the successor to GPT-2, represents a significant leap forward in natural language processing with its 175 billion parameters. This massive model achieved state-of-the-art performance across a variety of NLP tasks by effectively leveraging few-shot, one-shot, and zero-shot learning, in which a task is specified through examples or instructions in the prompt rather than through gradient updates.
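The few-shot setting can be illustrated by how a prompt is assembled: the task is specified entirely in text, with a handful of worked examples followed by a new query for the model to complete. A minimal sketch (the sentiment-classification task and labels here are hypothetical examples, not from the GPT-3 paper):

```python
# Build a few-shot prompt: demonstrations are given in-context, and the
# model is expected to continue the pattern with no weight updates.
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and a final unanswered query."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")  # the model's completion supplies the answer
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Classify the sentiment of each movie review as Positive or Negative.",
    [("A delightful, moving film.", "Positive"),
     ("Two hours I will never get back.", "Negative")],
    "An absolute triumph of storytelling.",
)
print(prompt)
```

With zero examples the same format becomes a zero-shot prompt, and with exactly one it becomes one-shot; the model itself is unchanged across all three settings.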

GPT-2

GPT-2, the successor to GPT-1, significantly advanced the capabilities of large-scale language models. It demonstrated that such models can perform a variety of tasks without explicit supervision by leveraging vast amounts of text data for training. This breakthrough established large-scale unsupervised pre-training as a viable path toward general-purpose language models.

Generative Adversarial Networks

GANs consist of two neural networks that operate in tandem, engaging in a game-like competition. The architecture includes a generator network that creates data samples (such as images) and a discriminator network that evaluates these samples, distinguishing between real samples drawn from the training data and synthetic samples produced by the generator. The two networks are trained jointly: the generator learns to fool the discriminator, while the discriminator learns to detect the fakes.
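The adversarial game can be sketched on a toy problem: a two-parameter generator learns to match a 1-D Gaussian while a logistic-regression discriminator tries to tell real samples from fake. This is a minimal numpy sketch with hand-derived gradients, not a practical GAN (real implementations use deep networks and an autodiff framework):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda u: 1.0 / (1.0 + np.exp(-u))

# Real data: samples from N(4, 1).  Generator: g(z) = a*z + b, z ~ N(0, 1).
# Discriminator: d(x) = sigmoid(w*x + c), the probability that x is real.
a, b = 1.0, 0.0          # generator parameters
w, c = 0.1, 0.0          # discriminator parameters
lr, batch = 0.03, 64

for step in range(3000):
    x_real = rng.normal(4.0, 1.0, batch)
    z = rng.normal(0.0, 1.0, batch)
    x_fake = a * z + b

    # --- Discriminator: ascent on log d(x_real) + log(1 - d(x_fake)) ---
    d_real, d_fake = sigmoid(w * x_real + c), sigmoid(w * x_fake + c)
    grad_w = np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake)
    grad_c = np.mean(1 - d_real) - np.mean(d_fake)
    w += lr * grad_w
    c += lr * grad_c

    # --- Generator: ascent on log d(g(z)) (the non-saturating loss) ---
    d_fake = sigmoid(w * x_fake + c)
    upstream = (1 - d_fake) * w        # d/dg of log d(g)
    a += lr * np.mean(upstream * z)    # dg/da = z
    b += lr * np.mean(upstream)        # dg/db = 1

# The generated distribution (mean b, std |a|) typically drifts toward
# the real data's mean (4) and std (1) as the game plays out.
print(f"generated mean ~= {b:.2f}, std ~= {abs(a):.2f}")
```

The alternating updates implement the minimax objective directly; in practice the two players oscillate around the equilibrium rather than settling exactly on it.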

First CNN on GPU

The 2011 paper “Flexible, High Performance Convolutional Neural Networks for Image Classification” by Dan C. Cireşan, Ueli Meier, Jonathan Masci, Luca M. Gambardella, and Jürgen Schmidhuber was a pioneering work in the field of deep learning. It was among the first to show that convolutional neural networks trained entirely on GPUs could reach state-of-the-art accuracy on image classification benchmarks while training orders of magnitude faster than CPU implementations.

Feature extraction by neural networks

The authors demonstrated that neural networks with multiple hidden layers can effectively learn compact, meaningful representations of high-dimensional data. These learned representations, or embeddings, can capture the essential structure of the input data, enabling dimensionality reduction and downstream tasks such as visualization and retrieval, often outperforming classical linear methods such as principal component analysis.
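The idea can be seen in a tiny autoencoder: a network is trained to reconstruct its input through a narrow hidden layer, forcing it to learn a compact representation. A minimal numpy sketch with a linear encoder/decoder trained by gradient descent (illustrative only; the dimensions and data here are made up, and deep autoencoders use nonlinear layers):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic 10-D data that secretly lives near a 2-D subspace.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

# Linear autoencoder: encode to 2 dims, decode back to 10.
W_enc = rng.normal(scale=0.1, size=(10, 2))
W_dec = rng.normal(scale=0.1, size=(2, 10))
lr = 0.05

def loss(X, W_enc, W_dec):
    """Mean-squared reconstruction error."""
    return np.mean((X - X @ W_enc @ W_dec) ** 2)

initial = loss(X, W_enc, W_dec)
for _ in range(1000):
    H = X @ W_enc                  # embeddings: the compact representation
    err = H @ W_dec - X            # reconstruction error
    # Gradients of the mean-squared reconstruction loss.
    grad_dec = 2 * H.T @ err / X.size
    grad_enc = 2 * X.T @ (err @ W_dec.T) / X.size
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

final = loss(X, W_enc, W_dec)
print(f"reconstruction MSE: {initial:.4f} -> {final:.4f}")
```

Because the data truly has 2-D structure, the 2-D bottleneck is enough to reconstruct it well; the hidden activations `H` are the learned embedding.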

LeNet

An early instance of a successful gradient-based learning technique is detailed in the paper “Gradient-Based Learning Applied to Document Recognition” by Y. LeCun et al., published in the Proceedings of the IEEE in 1998. This work exemplifies the practical application of convolutional neural networks, introducing the LeNet-5 architecture and demonstrating its effectiveness on handwritten digit recognition, including commercial deployment for reading bank checks.
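The core building block of LeNet-5 is a convolutional layer followed by a subsampling (pooling) layer. The forward pass of one such stage can be sketched in numpy (a minimal illustration matching LeNet-5's first-stage dimensions, not the full architecture, which stacks several stages, uses tanh activations, and ends in fully connected layers):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation, as used in CNN layers."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def avg_pool(feature_map, size=2):
    """Non-overlapping average pooling (LeNet-5 used 2x2 subsampling)."""
    h, w = feature_map.shape
    fm = feature_map[:h - h % size, :w - w % size]
    return fm.reshape(h // size, size, w // size, size).mean(axis=(1, 3))

# A 32x32 input and a 5x5 kernel, the dimensions of LeNet-5's first layer
# (MNIST's 28x28 digits were padded to 32x32).
image = np.random.default_rng(0).random((32, 32))
kernel = np.ones((5, 5)) / 25.0           # a simple averaging filter for illustration

features = conv2d(image, kernel)          # -> 28x28 feature map
pooled = avg_pool(features)               # -> 14x14 after 2x2 subsampling
print(features.shape, pooled.shape)
```

Weight sharing is visible here: one 5x5 kernel is slid over the whole image, so the layer has 25 weights regardless of image size. In a trained network the kernel values are learned by the gradient-based procedure the paper describes.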