Questions tagged [mistral]

3 questions
1 vote, 0 answers

Mistral7B fine-tuning, large spike in loss curve

I fine-tuned Mistral-7B for a text classification task with a LoRA adapter. When testing different hyperparameter combinations, I got these two loss curve charts (Chart 1, Chart 2). Chart 1 and 2 use the exact same train and evaluation dataset, have…
JC Chen • 11 • 1
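As background to the question above: a LoRA adapter freezes the base weight W and learns only a low-rank update, so the effective weight is W + (α/r)·B·A. A minimal NumPy sketch of that forward path (all shapes, names, and values here are illustrative, not taken from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 8, 8, 2, 16  # illustrative sizes; rank r << d

W = rng.normal(size=(d_out, d_in))      # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-initialized

def lora_forward(x):
    # base path plus scaled low-rank path; only A and B receive gradients
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# with B zero-initialized, the adapter starts out as an exact no-op
assert np.allclose(lora_forward(x), W @ x)
```

Because B starts at zero, training begins from the base model's behavior; loss spikes like the one described would then come from the optimizer/hyperparameters rather than from the adapter's initialization.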
1 vote, 0 answers

Why are Mistral LLMs branded as enabling RAG-QA (Retrieval-Augmented Generation for Question Answering)?

I read on https://mistral.ai/news/mistral-large/: "Mistral Small benefits from the same innovation as Mistral Large regarding RAG-enablement and function calling." What do they mean by RAG-enablement? I'm familiar with RAG-QA, but any LLM can be…
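For readers unfamiliar with the pattern the question references: RAG-QA retrieves documents relevant to a query and prepends them to the prompt, so any instruction-following LLM can be the generator. A toy retrieval step using bag-of-words cosine similarity (the documents and query are made up for illustration; real systems use embedding models):

```python
import math
from collections import Counter

docs = [
    "Mistral Large supports function calling.",
    "Retrieval-augmented generation grounds answers in retrieved text.",
    "LoRA adapters add low-rank trainable matrices to frozen weights.",
]

def bow(text):
    # bag-of-words term counts as a stand-in for a real embedding
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

def retrieve(query, k=1):
    q = bow(query)
    return sorted(docs, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

context = retrieve("what is retrieval-augmented generation?")[0]
prompt = f"Answer using only the context.\nContext: {context}\nQuestion: ..."
```

"RAG-enablement" in vendor marketing usually refers to the generation side of this loop (long context, faithfulness to provided passages), since the retrieval side is model-agnostic, as the question suspects.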
0 votes, 0 answers

Is a quantized Mistral LLM slower than non-quantized Mistral LLM?

For reference, I've been playing around with Mistral 7B v0.1 and v0.3, but did not like that I was limited by A100 availability on Google Colab, so I wanted to try 4-bit and 8-bit quantized models. But they are drastically slow. At first I thought…
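One common reason for the slowdown the question describes: quantization shrinks memory, but without fused kernels the weights must be dequantized back to floating point at every forward pass, adding work per matmul. A NumPy sketch of a symmetric int8 round-trip (illustrative only, not the scheme bitsandbytes actually uses):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4)).astype(np.float32)

# symmetric per-tensor int8 quantization: one float scale, 1 byte per weight
scale = np.abs(W).max() / 127.0
W_q = np.round(W / scale).astype(np.int8)

def matmul_dequant(x):
    # extra per-call work: dequantize before the matmul -- this overhead
    # is what can make naive quantized inference slower than fp16/fp32
    W_deq = W_q.astype(np.float32) * scale
    return W_deq @ x

x = rng.normal(size=(4,)).astype(np.float32)
err = np.abs(matmul_dequant(x) - W @ x).max()  # small round-trip error
```

On GPUs, optimized kernels (e.g. fused int4/int8 matmuls) avoid the explicit dequantize step; falling back to a generic path, or running on unsupported hardware, gives exactly the "drastically slow" behavior reported.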