The report does not detail hardware -- though it does state that SDXL has 2.6B parameters in its UNet component, compared to 860M for SD 1.4/1.5 and 865M for SD 2.0/2.1. So SDXL's UNet is roughly 3x larger.
In January, MosaicML claimed that a model comparable to Stable Diffusion v2 could be trained in 13 days using 79,000 A100-hours.
Some rough inference about SDXL's training cost can be made from this; I'd be interested to hear someone with more insight provide perspective here.
bitsandbytes is only used during training with these models, though (for the 8-bit AdamW optimizer). Naively quantizing both the weights and the activations to a range of 256 values, when the model ultimately needs to output a range of 256 values, creates noticeable artifacts, since the quantized values are not going to map 1-to-1.
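To make the point concrete, here is a toy sketch (mine, not from either project) of naive per-tensor uniform 8-bit quantization. Mapping FP values onto only 256 evenly spaced levels bounds the error per weight at half a quantization step, but that error is nonzero everywhere and accumulates through the network:

```python
import numpy as np

# Illustrative sketch: naive per-tensor uniform 8-bit quantization
# round-trip on toy "weights". Values and sizes are made up.
rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=4096).astype(np.float32)

scale = (w.max() - w.min()) / 255.0
q = np.round((w - w.min()) / scale).astype(np.uint8)  # quantize to 0..255
w_hat = q.astype(np.float32) * scale + w.min()        # dequantize

# Per-weight error is bounded by half a quantization step, but every
# weight is perturbed, and the perturbations compound layer by layer.
max_err = np.abs(w - w_hat).max()
print(max_err, scale / 2)
```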
Draw Things recently released an 8-bit quantized SD model with output comparable to the FP16 version. It uses a k-means-based LUT and separates the weights into blocks to minimize quantization error.
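A minimal sketch of that general approach (my own illustration -- the block size, cluster count, and k-means details are assumptions, not Draw Things' actual settings): split the weights into blocks, run 1-D k-means per block so the centroids become a small lookup table, and store each weight as an index into its block's LUT:

```python
import numpy as np

def kmeans_1d(x, k=16, iters=20):
    # Tiny 1-D k-means; the final centroids are the block's LUT entries.
    centroids = np.quantile(x, np.linspace(0, 1, k))
    for _ in range(iters):
        idx = np.abs(x[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(idx == j):
                centroids[j] = x[idx == j].mean()
    return centroids, idx

def quantize_blockwise(w, block=256, k=16):
    # Per-block LUTs adapt to the local weight distribution, which is
    # what keeps the error lower than one global uniform grid.
    luts, codes = [], []
    for start in range(0, len(w), block):
        lut, idx = kmeans_1d(w[start:start + block], k)
        luts.append(lut)
        codes.append(idx.astype(np.uint8))
    return luts, codes

def dequantize_blockwise(luts, codes):
    return np.concatenate([lut[idx] for lut, idx in zip(luts, codes)])

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=1024).astype(np.float32)
luts, codes = quantize_blockwise(w)
w_hat = dequantize_blockwise(luts, codes)
print(np.abs(w - w_hat).max())
```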
I was going to search the internet about it, but then I realized you are the author (and I don't think there is anything online). I imagine the activations are left in FP16 and the weights are converted to FP16 during inference, right?
Yes, computation is carried out in FP16, so there are no compute-efficiency gains -- though there could be latency reductions from memory-bandwidth savings. Those savings aren't realized yet, because no custom kernels have been introduced yet.
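Roughly, the inference path described above looks like this (a sketch under my own assumptions, not the actual Draw Things kernels): weights are stored as uint8 LUT indices, gathered back to FP16 right before the matmul, so the arithmetic itself is pure FP16 and the win is in how little memory the stored weights occupy:

```python
import numpy as np

# Toy 256-entry LUT and a small uint8-coded weight matrix (made up).
lut = np.linspace(-0.05, 0.05, 256).astype(np.float16)
codes = np.random.default_rng(0).integers(0, 256, size=(64, 64), dtype=np.uint8)

x = np.ones((1, 64), dtype=np.float16)  # activations stay FP16
w_fp16 = lut[codes]                     # gather: uint8 indices -> FP16 weights
y = x @ w_fp16                          # the matmul runs entirely in FP16
print(y.dtype)
```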