First test I tried to run a random taxation question through it Output: https://...

jmorgan · on April 17, 2024

The `mixtral:8x22b` tag still points to the text completion model – instruct is on the way, sorry!

Update: mixtral:8x22b now points to the instruct model:

  ollama pull mixtral:8x22b
  ollama run mixtral:8x22b

Zuiii · on April 18, 2024

Wait. Isn't it a breaking change to change the underlying model like this? Wouldn't people start running into consistency issues in production? (given ollama appears to be oriented towards backend use)

orra · on April 18, 2024

Sure, in theory. But if you move so fast that you already are running the base 8x22B model from last week, you can easily fix this.

I've long thought that if you want reproducibility and reliability, you need to pin your deps.

So, IMO, the change is very much worth it to reduce confusion going forward.

orost · on April 17, 2024

That's not the model this post is about. You used the base model, not trained for tasks. (The instruct model is probably not on ollama yet.)

mysteria · on April 17, 2024

Yeah this is exactly what happens when you ask a base model a question. It'll just attempt to continue what you already wrote based off its training set, so if you say have it continue a story you've written it may wrap up the story and then ask you to subscribe for part 2, followed by a bunch of social media comments with reviews.

Sohcahtoa82 · on April 18, 2024

It can be fun, though, to prompt a text completion with something like "I'm thinking about" and just seeing what random thing it completes it with.

byteknight · on April 17, 2024

I absolutely did not:

ollama run mixtral:8x22b

EDIT: I like how you ninja-editted your comment ;)

orost · on April 17, 2024

Considering "mixtral:8x22b" on ollama was last updated yesterday, and Mixtral-8x22B-Instruct-v0.1 (the topic of this post) was released about 2 hours ago, they are not the same model.

byteknight · on April 17, 2024

Are we looking at the same page?

https://imgur.com/a/y6XfpBl

And even the direct tag page: https://ollama.com/library/mixtral:8x22b shows 40-something minutes ago: https://imgur.com/a/WNhv70B

orost · on April 17, 2024

Let me clarify.

Mixtral-8x22B-v0.1 was released a couple days ago. The "mixtral:8x22b" tag on ollama currently refers to it, so it's what you got when you did "ollama run mixtral:8x22b". It's a base model only capable of text completion, not any other tasks, which is why you got a terrible result when you gave it instructions.

Mixtral-8x22B-Instruct-v0.1 is an instruction-following model based on Mixtral-8x22B-v0.1. It was released two hours ago and it's what this post is about.

(The last updated 44 minutes ago refers to the entire "mixtral" collection.)

gliptic · on April 17, 2024

And where does it say that's the instruct model?

belter · on April 17, 2024

I get:

ollama run mixtral:8x22b

Error: exception create_tensor: tensor 'blk.0.ffn_gate.0.weight' not found

Me1000 · on April 17, 2024

You need to update ollama to 0.1.32.

belter · on April 17, 2024

Thanks. That did it.

renewiltord · on April 17, 2024

Not instruct tuned. You're (actually) "holding it wrong".

woadwarrior01 · on April 17, 2024

Looks like an issue with the quantization that ollama (i.e llama.cpp) uses and not the model itself. It's common knowledge from Mixtral 8x7B that quantizing the MoE gates is pernicious to model perplexity. And yet they continue to do it. :)

cjbprime · on April 17, 2024

No, it's unrelated to quantization, they just weren't using the instruct model.