Does this count? https://www.marktechpost.com/2024/04/08/researchers-from-kaust-and-harvard-introduce-minigpt4-video-a-multimodal-large-language-model-llm-designed-specifically-for-video-understanding/
"Researchers from KAUST and Harvard Introduce MiniGPT4-Video: A Multimodal Large Language Model (LLM) Designed Specifically for Video Understanding"
https://vision-cair.github.io/MiniGPT4-video/
"MiniGPT4-video does not only consider visual content but also incorporates textual conversations, allowing the model to effectively answer queries involving both visual and text components. The proposed model outperforms existing state-of-the-art methods, registering gains of 4.22%, 1.13%, 20.82%, and 13.1% on the MSVD, MSRVTT, TGIF, and TVQA benchmarks respectively. "
@RobertCousineau would you care to suggest those metrics? Otherwise, I would resolve yes if news reports and a publicly accessible version indicate success.
I will resolve yes if there is a version that rivals, or is described in relevant publications as having the potential to rival, the likes of OpenAI's ChatGPT regionally. It will have to be publicly accessible.
@Undox Yes, and paired with the supercomputer KAUST already has, things are looking quite promising.