How are these companies building video/image generation tools? From scratch, fine-tuning Llama, or something else?

There’s an enormous number of LLM-based tools popping up lately, especially in video/image generation, each tied to a different company. Meanwhile, we only see a handful of really good open-source LLMs available.

So, my question is: How are these companies creating their video/image/avatar-generation tools? Are they building these models entirely from scratch, or are they leveraging existing LLMs like Llama, GPT, or something else?

If they are leveraging a model, are they simply using an API to interact with it, or are they actually fine-tuning it on new data they collected for their specific use case?

If you’re guessing the answer, please let me know you’re guessing, as I’d like to hear from those with first-hand experience as well.

Here are some companies I’m referring to:

submitted by /u/conlake