Paper Title

PiggyBack: Pretrained Visual Question Answering Environment for Backing up Non-deep Learning Professionals

Authors

Zhihao Zhang, Siwen Luo, Junyi Chen, Sijia Lai, Siqu Long, Hyunsuk Chung, Soyeon Caren Han

Abstract

We propose PiggyBack, a Visual Question Answering platform that allows users to easily apply state-of-the-art visual-language pretrained models. PiggyBack supports the full stack of visual question answering tasks, specifically data processing, model fine-tuning, and result visualisation. We integrate visual-language models pretrained by HuggingFace, an open-source API platform for deep learning technologies; however, these models cannot be run without programming skills or an understanding of deep learning. PiggyBack therefore provides an easy-to-use browser-based user interface with several deep learning visual-language pretrained models for general users and domain experts. PiggyBack offers the following benefits: free availability under the MIT License; portability, since it is web-based and thus runs on almost any platform; a comprehensive data creation and processing technique; and ease of use for deep learning-based visual-language pretrained models. The demo video is available on YouTube at https://youtu.be/iz44RZ1lF4s.
