What is Transformer Architecture? | AI Glossary Definition

Detailed Explanation

Introduced by Google researchers in the 2017 paper 'Attention Is All You Need', the Transformer is a deep learning architecture that revolutionized Natural Language Processing. It relies on a mechanism called 'self-attention', which lets the model weigh how relevant every other word in a sentence is to each word, no matter how far apart they are. Because self-attention handles all positions at once instead of stepping through the sequence one word at a time as recurrent networks do, training parallelizes well on modern hardware. The Transformer is the foundation for models such as GPT, BERT, and Gemini.
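To make the self-attention idea concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is an illustration under simplifying assumptions, not the implementation from the paper: the function names (softmax, self_attention) and the toy token vectors are made up for this example, and the tokens are used directly as queries, keys, and values, whereas a real Transformer derives them through learned projection matrices, runs several attention heads, and adds positional encodings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    Returns (outputs, weights), where weights[i][j] is how much
    position i attends to position j.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Compare this position's query against every key, near or far.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        row = softmax(scores)
        weights.append(row)
        # The output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(w * v[i] for w, v in zip(row, values))
             for i in range(len(values[0]))]
        )
    return outputs, weights


# Toy input: three token vectors (hypothetical values, not real embeddings).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Simplification: reuse the tokens as queries, keys, and values.
outputs, weights = self_attention(tokens, tokens, tokens)

for i, row in enumerate(weights):
    print(f"token {i} attends with weights {[round(w, 3) for w in row]}")
```

Each row of weights sums to 1, so every output vector is a weighted blend of all the input vectors. Each pass through the loop depends only on its own query, which is why the same computation can be carried out for all positions in parallel as matrix multiplications.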

Related Terms

Large Language Model (LLM)