One interesting next step for this idea is to suggest a continuity of meaning across the edits. For example, after each randomly selected gif, require the next gif to share at least one tag with it. The loosely connected gifs then form a randomly generated narrative, held together by the tags they have in common.
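A minimal sketch of that tag-chaining rule, assuming the gifs are available as a mapping from file name to a set of tags. The names `GIFS` and `chain_by_shared_tag` and the sample tags are made up for illustration. The choice to never repeat a gif, and to stop early when no unused gif shares a tag, is also an assumption.

```python
import random

# Hypothetical library: gif file name -> set of tags (illustrative data only).
GIFS = {
    "cat_jump.gif": {"cat", "jump"},
    "dog_jump.gif": {"dog", "jump"},
    "dog_rain.gif": {"dog", "rain"},
    "rain_city.gif": {"rain", "city"},
    "city_night.gif": {"city", "night"},
}


def chain_by_shared_tag(library, length, rng=None):
    """Pick up to `length` gifs so each shares a tag with the one before it."""
    rng = rng or random.Random()
    names = sorted(library)  # sorted so a seeded rng gives repeatable results
    current = rng.choice(names)
    sequence = [current]
    while len(sequence) < length:
        # Unused gifs whose tag set overlaps the current gif's tag set.
        candidates = [
            name
            for name in names
            if name not in sequence and library[name] & library[current]
        ]
        if not candidates:
            break  # dead end: no unused gif shares a tag with the current one
        current = rng.choice(candidates)
        sequence.append(current)
    return sequence


if __name__ == "__main__":
    print(chain_by_shared_tag(GIFS, 4))
```

Because only consecutive gifs have to overlap, the sequence can drift a long way from where it started, which is what makes the result read like a loose narrative rather than a single theme.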
It should be possible to be even more prescriptive. For example, write a "screenplay" (or even let GPT-3 generate one), which is basically just a progression of tags, and then pick a random video that matches each tag in turn.
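A rough sketch of the screenplay approach, assuming a screenplay is simply an ordered list of tags and the videos are again a mapping from file name to a set of tags. `VIDEOS`, `cast_screenplay`, and the sample data are hypothetical. Returning `None` for a tag with no matching video is one possible choice, not the only one.

```python
import random

# Hypothetical library: video file name -> set of tags (illustrative data only).
VIDEOS = {
    "cat_jump.gif": {"cat", "jump"},
    "dog_jump.gif": {"dog", "jump"},
    "dog_rain.gif": {"dog", "rain"},
    "rain_city.gif": {"rain", "city"},
}


def cast_screenplay(screenplay, library, rng=None):
    """For each tag in the screenplay, pick a random video carrying that tag."""
    rng = rng or random.Random()
    shots = []
    for tag in screenplay:
        # All videos tagged with the current beat of the screenplay.
        matches = sorted(name for name, tags in library.items() if tag in tags)
        # None marks a beat that no video in the library can fill.
        shots.append(rng.choice(matches) if matches else None)
    return shots


if __name__ == "__main__":
    screenplay = ["cat", "jump", "rain", "city"]
    print(cast_screenplay(screenplay, VIDEOS))
```

The screenplay fixes the order of meanings while the footage stays random, so the same list of tags yields a different cut on every run.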