
jyap 23 days ago \| on: Granite 4.1: IBM's 8B Model Matching 32B MoE

I run it with llama.cpp on my RTX 3090, using the same Unsloth model. My config is similar to: https://github.com/noonghunna/club-3090/blob/master/docs/eng... I still need to try some of the other setups mentioned in this repo to increase TPS.