Questions tagged [mistral]
3 questions
1 vote · 0 answers
Mistral7B fine-tuning, large spike in loss curve
I fine-tuned mistral-7b for a text classification task with a LoRA adapter. While testing different hyperparameter combinations, I got these two loss-curve charts:
Chart 1
Chart 2
Charts 1 and 2 use the exact same training and evaluation datasets and have…
JC Chen
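For context on the setup this question describes: LoRA keeps the base weight matrix frozen and trains only a low-rank update, so the effective weight is W + (alpha / r) · B · A. A minimal pure-Python sketch with toy dimensions (illustrative only, not the questioner's actual configuration):

```python
# Toy sketch of a LoRA update: W_eff = W + (alpha / r) * (B @ A).
# Dimensions and values are illustrative, not from the question.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_effective_weight(W, A, B, alpha, r):
    """Frozen base weight W plus the scaled low-rank update B @ A."""
    delta = matmul(B, A)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# d_out = 2, d_in = 3, rank r = 1: only 2 + 3 = 5 trainable numbers
# instead of the 6 entries of W.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
B = [[1.0],
     [2.0]]              # d_out x r
A = [[0.5, 0.5, 0.5]]    # r x d_in
W_eff = lora_effective_weight(W, A, B, alpha=2.0, r=1)
print(W_eff)  # [[2.0, 1.0, 1.0], [2.0, 3.0, 2.0]]
```

Because only A and B are trained, different hyperparameter runs (like the two charts above) share identical frozen weights; the loss behavior comes entirely from the small update and the optimizer settings.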
1 vote · 0 answers
Why are Mistral LLMs branded as enabling RAG-QA (Retrieval-Augmented Generation for Question Answering)?
I read on https://mistral.ai/news/mistral-large/:
Mistral Small benefits from the same innovation as Mistral Large regarding RAG-enablement and function calling.
What do they mean by RAG-enablement? I'm familiar with RAG-QA, but any LLM can be…
Franck Dernoncourt
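The questioner's point that "any LLM can be" used this way holds at the pattern level: RAG just retrieves relevant text and prepends it to the prompt, independent of the model. A toy keyword-overlap sketch of that pattern (illustrative only; what "RAG-enablement" means in Mistral's announcement is exactly what the question asks):

```python
# Toy retrieval-augmented prompt construction. Any LLM could consume the
# resulting prompt; the retrieval step itself is model-agnostic.

def retrieve(query, docs):
    """Return the doc sharing the most words with the query (toy scorer)."""
    q_words = set(query.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def build_rag_prompt(query, docs):
    """Prepend the best-matching doc as context for any downstream LLM."""
    context = retrieve(query, docs)
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

docs = [
    "mistral large supports function calling and long context windows.",
    "lora adds low-rank trainable matrices to frozen weights.",
]
print(build_rag_prompt("what does mistral large support?", docs))
```

A real deployment would replace the word-overlap scorer with embedding search; the point is only that the pipeline wraps the model rather than requiring anything model-specific.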
0 votes · 0 answers
Is a quantized Mistral LLM slower than a non-quantized Mistral LLM?
For reference, I've been playing around with Mistral 7B v0.1 and v0.3, but I didn't like being limited by A100 availability on Google Colab, so I wanted to try the 4-bit and 8-bit quantized models. But they are drastically slower.
At first I thought…
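One common explanation for the slowdown the question describes: quantization shrinks memory, but weights must be dequantized (or run through specialized kernels) at inference time, adding work per forward pass. A minimal sketch of a symmetric int8 round-trip (illustrative only, not the actual bitsandbytes kernels):

```python
# Illustrative symmetric int8 quantize/dequantize round-trip.
# The dequantize step runs at inference time, which is part of why a
# quantized model can be slower than fp16 despite using less memory.

def quantize_int8(values):
    """Map floats to int8 range [-127, 127] with one symmetric scale."""
    scale = max(abs(v) for v in values) / 127 or 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate floats; an extra step every forward pass."""
    return [qi * scale for qi in q]

weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
print(q)         # [64, -127, 32, 0]
print(restored)  # close to the original weights, within one scale step
```

Whether quantized inference ends up faster or slower in practice depends on whether the runtime has fused low-bit kernels for the hardware; on a GPU with fast fp16 paths, naive dequantize-then-matmul can easily lose.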