Meta announces Llama 2 Long, a large-scale natural language model
Meta has recently unveiled ‘Llama 2 Long’, a large-scale natural language model built to handle very long texts. With a context window of 32,768 tokens and 70 billion parameters, the model reportedly outperforms GPT-3.5-Turbo-16K, which offers a comparably long context.
Llama 2 Long’s strength lies in its handling of long texts: it can track relationships across distant parts of a document, which supports more complex and varied AI applications, from sustained chatbot conversations to in-depth analysis of lengthy documents.
Until now, large language models capable of processing long texts have mostly been commercial offerings. Meta takes a different path with Llama 2 Long: it builds on the open-source Llama 2 and is released in the same open form, putting it in the hands of a wider range of researchers and developers.
For training, Llama 2 serves as the starting point and is continually pre-trained on an additional 400 billion tokens. These tokens are split into sequences whose length depends on model size: the 7-billion- and 13-billion-parameter models are trained on sequences of 32,768 tokens, while the larger 34-billion- and 70-billion-parameter models use sequences of 16,384 tokens.
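The split described above can be sketched as a small configuration table. This is an illustrative sketch only: the names are hypothetical, and the figures (400 billion additional training tokens, sequence lengths of 32,768 and 16,384) are those stated in the article, not Meta's actual training code.

```python
# Hypothetical sketch of the training-sequence setup described in the article.
# All names here are illustrative, not from Meta's codebase.

ADDITIONAL_PRETRAIN_TOKENS = 400_000_000_000  # 400B tokens of continual pre-training

# Model size -> training sequence length in tokens (per the article:
# smaller models use longer sequences, the 70B model uses shorter ones).
SEQ_LEN_BY_MODEL = {
    "7B": 32_768,
    "13B": 32_768,
    "70B": 16_384,
}

def num_training_sequences(model: str,
                           total_tokens: int = ADDITIONAL_PRETRAIN_TOKENS) -> int:
    """Rough number of sequences the token budget is split into for a model."""
    return total_tokens // SEQ_LEN_BY_MODEL[model]

for model in SEQ_LEN_BY_MODEL:
    print(model, SEQ_LEN_BY_MODEL[model], num_training_sequences(model))
```

The trade-off sketched here is that a fixed token budget yields fewer, longer sequences for the small models and more, shorter sequences for the 70-billion-parameter model.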
This approach gives Llama 2 Long strong context recognition over long texts: even as content grows longer, the model continues to track context, making it well suited to complex programming tasks, content analysis, and more involved conversational interactions. It also opens the door to training large natural language models at comparatively lower cost.