I'm trying to understand whether Llama 3 (or other open-source models) is fully open source. Specifically, I would like to know:
- Is the source code for Llama 3 (including the tokenizer, the transformer model implementation, and other components) available under an open-source license?
- Does the available source code provide everything necessary to train a large language model (LLM) on custom data using Llama 3's architecture?

For example, if I wanted to train my own model like Llama 3, would I have access to all the underlying code (for tokenization, model architecture, etc.), or are there any proprietary components involved?