2025-01-22: LOKI is accepted to ICLR 2025. 2024-11: The source code and datasets are released. Our evaluation framework supports more than 20 mainstream foundation models. Please see here for full model ...
This repository implements real-time image captioning using the BLIP (Bootstrapping Language-Image Pre-training) model. The system captures live video from your webcam, generates descriptive captions ...