This post was co-written with Rafael Guedes.
Meta has released three major versions of Llama, a large language model (LLM), along with a minor (if you can call it that) update (version 3.1). The initial release of Llama in early 2023 marked a significant step forward for the natural language processing (NLP) open source community, to which Meta has consistently contributed by sharing its latest LLM versions.
To be accurate, we need to distinguish between open LLMs and open source LLMs. Open source software traditionally publishes its source code under a specific license that permits public use and modification. In the context of LLMs, open LLMs usually publish the model weights and initial code, whereas open source LLMs share the entire training process, including the training data, under a permissive license. Most models today, including Meta’s Llama, do not publish the datasets used for training and therefore fall into the open LLM category.
Llama has gone through three major architectural iterations. Version 1 introduced several improvements to the original Transformer architecture. Version 2 implemented Grouped-Query Attention (GQA) for larger models. Version 3 extended GQA to smaller models and introduced a more efficient…

