Llama 2 Overview: Meta’s open source large language model (LLM)
About Llama 2
Llama 2 is an updated version of the Llama language model from Meta AI. It is open source and can be downloaded and run locally. Llama 2 is free for both personal and commercial use and brings many improvements over the previous iteration. The model is available in the following sizes and variants:
Model | Download |
---|---|
Llama 2 7B | Source – HF – GPTQ – ggml |
Llama 2 7B Chat | Source – HF – GPTQ – ggml |
Llama 2 13B | Source – HF – GPTQ – ggml |
Llama 2 13B Chat | Source – HF – GPTQ – ggml |
Llama 2 70B | Source – HF – GPTQ |
Llama 2 70B Chat | Source – GPTQ |
Hardware Requirements
Which model you can run depends on your hardware. For good results, you should have at least 10GB of VRAM for the 7B model, though 8GB is sometimes enough. The 13B model can run on GPUs like the RTX 3090 and RTX 4090. The largest model, however, requires very powerful hardware like an A100 80GB.
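As a rough way to reason about these figures, the memory needed for the weights alone can be approximated as parameter count times bytes per parameter. The sketch below is only a back-of-the-envelope estimate: the function name `weight_memory_gb` and the precision table are illustrative assumptions, the parameter counts are the nominal 7B, 13B, and 70B, and activations, the KV cache, and framework overhead are ignored, so real usage will be higher.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: nominal parameter counts; activations, KV cache, and
# framework overhead are NOT included, so real usage is higher.

# Bytes used to store one parameter at each (assumed) precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Return approximate memory for the weights alone, in GB (10^9 bytes)."""
    # (params_billions * 1e9 params) * bytes / 1e9 bytes-per-GB simplifies to:
    return params_billions * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    for size in (7, 13, 70):
        row = ", ".join(
            f"{p}: {weight_memory_gb(size, p):.1f} GB" for p in BYTES_PER_PARAM
        )
        print(f"Llama 2 {size}B -> {row}")
```

Under these assumptions, the 7B weights take about 14 GB at 16-bit precision and about 3.5 GB at 4-bit, which suggests that cards with 8GB to 10GB of VRAM are workable mainly with quantized builds.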
Installing the Llama 2 Model
To download the Llama 2 model, you need to complete the registration form on the Meta AI page. Meta will then send you an activation key, which is required during installation. The Llama models work with most LLM interfaces, such as Text Generation Web UI. We will be writing a full tutorial for installing the model soon, but in short: within Text Generation Web UI, navigate to the models page and paste the Hugging Face repository name of the Llama model you wish to use (7B is recommended unless you have an extremely powerful GPU). For example, to install the 7B base model, you would paste “meta-llama/Llama-2-7b-hf”; for the 7B chat model, use “meta-llama/Llama-2-7b-chat-hf”. Keep in mind that the download and installation might take a few minutes.
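For readers who prefer to fetch the files from a script rather than through the web UI, here is a minimal sketch. It assumes the `huggingface_hub` Python package is installed and that your Hugging Face account has been granted access to the Meta repositories. The helper names `llama2_repo_id` and `download_llama2` are hypothetical and exist only for illustration.

```python
from typing import Optional


def llama2_repo_id(size: str = "7b", chat: bool = False) -> str:
    """Build the Hugging Face repository name for a Llama 2 checkpoint."""
    if size not in {"7b", "13b", "70b"}:
        raise ValueError(f"unknown Llama 2 size: {size!r}")
    variant = "-chat" if chat else ""
    return f"meta-llama/Llama-2-{size}{variant}-hf"


def download_llama2(
    size: str = "7b", chat: bool = False, token: Optional[str] = None
) -> str:
    """Download the checkpoint and return the local directory path.

    Assumption: `token` is a Hugging Face access token for an account
    that has already been approved for the Llama 2 repositories.
    """
    # Imported lazily so llama2_repo_id works without the library installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=llama2_repo_id(size, chat), token=token)


if __name__ == "__main__":
    # Prints the repository names only; call download_llama2(...) to fetch files.
    print(llama2_repo_id("7b"))
    print(llama2_repo_id("7b", chat=True))
```

The name-building helper is kept separate from the download call so the repository name can be checked, or pasted into the web UI, without starting a multi-gigabyte download.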
The next generation of our open source large language model
Llama 2 is available for free for research and commercial use.
More model details
Llama 2 was pretrained on publicly available online data sources.
The fine-tuned model, Llama Chat, leverages publicly available instruction datasets and over 1 million human annotations.
Code Llama is a code generation model built on Llama 2 and trained on 500B tokens of code. It supports many of the most common programming languages in use today, including Python, C++, Java, PHP, TypeScript (JavaScript), C#, and Bash.
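Building on large-scale Chinese data, the Llama 2 model's Chinese-language capabilities are being continuously iterated on and upgraded, starting from pretraining.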