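Nvidia has released LocateAnything, a vision-language localization model. I just gave it a quick test and it is extremely fast and accurate; some people have even exclaimed: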

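I uploaded a photo of bamboo skewers that I had previously tried to count with other methods (most of which got it wrong).
This model labeled them right away, and very accurately. Really impressive.
If you're interested, go give it a try: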
huggingface.co
LocateAnything - a Hugging Face Space by nvidia
Upload a photo or video and specify the object names you want to locate. The app finds those objects, draws colored boxes or points with labels on the visual, and returns the annotated image (or se...