RAG (Retrieval-Augmented Generation) was introduced in 2020 by Patrick Lewis et al.; the original publication is available at https://arxiv.org/pdf/2005.11401

As per the paper, RAG aims to make language models more accurate by augmenting their parametric memory (the weights of a pre-trained seq2seq model) with non-parametric memory. With RAG, a user's query first retrieves relevant passages from non-parametric data, such as parsed and indexed documents, and those passages are then passed to the language model together with the query.
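
A minimal sketch of this retrieve-then-generate flow, assuming a toy TF-IDF retriever from scikit-learn (the paper itself uses a dense DPR retriever over Wikipedia) and a hypothetical `generate` placeholder standing in for the seq2seq model:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Non-parametric memory: a small corpus of parsed, indexed documents.
documents = [
    "RAG was introduced by Lewis et al. in 2020.",
    "Dense Passage Retrieval encodes queries and passages into vectors.",
    "Seq2seq models map an input sequence to an output sequence.",
]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)  # index the corpus once

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vector = vectorizer.transform([query])
    scores = cosine_similarity(query_vector, doc_vectors)[0]
    top = scores.argsort()[::-1][:k]
    return [documents[i] for i in top]

def generate(prompt: str) -> str:
    # Placeholder for the parametric model (BART in the paper);
    # swap in any LLM call here.
    return f"[model output conditioned on]\n{prompt}"

query = "Who introduced RAG?"
context = "\n".join(retrieve(query))
print(generate(f"Context:\n{context}\n\nQuestion: {query}"))
```

The key design point is the separation of concerns: the index can be updated or swapped without touching the model's weights, since retrieval happens outside the model.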

RAG is meant to help LLMs deliver better accuracy on knowledge-intensive tasks such as open-domain question answering, for example in educational settings.

RAG can serve as an alternative to fine-tuning for industry-specific use cases of language models: it avoids retraining a model on domain datasets to produce a new parametric language model.

Information retrieval itself is a mature technology, dating back to the 1970s.