Paper Title


Efficient Neural Net Approaches in Metal Casting Defect Detection

Paper Authors

Rohit Lal, Bharath Kumar Bolla, Sabeesh Ethiraj

Paper Abstract


One of the most pressing challenges in the steel manufacturing industry is the identification of surface defects. Early identification of casting defects can help boost performance, including streamlining production processes. Though deep learning models have helped bridge this gap and automate most of these processes, there is a dire need for lightweight models that can be deployed easily and run with faster inference times. This research proposes a lightweight architecture that is efficient in terms of accuracy and inference time compared with sophisticated pre-trained CNN architectures such as MobileNet, Inception, and ResNet, as well as Vision Transformers. Methodologies to minimize computational requirements, such as depth-wise separable convolution and a global average pooling (GAP) layer, along with techniques that improve architectural efficiency and augmentations, have been experimented with. Our results indicate that a custom model of 590K parameters with depth-wise separable convolutions outperformed pre-trained architectures such as ResNet and Vision Transformers in terms of accuracy (81.87%) and comfortably outdid architectures such as ResNet, Inception, and Vision Transformers in terms of inference time (12 ms). BlurPool outperformed the other techniques, with an accuracy of 83.98%. Augmentations had a paradoxical effect on model performance. No direct correlation was found between depth-wise or 3x3 convolutions and inference time; however, they played a direct role in improving model efficiency by enabling the networks to go deeper and by decreasing the number of trainable parameters. Our work sheds light on the fact that custom networks with efficient architectures and faster inference times can be built without relying on pre-trained architectures.
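The abstract attributes the custom model's small parameter budget to depth-wise separable convolutions. A minimal sketch of why this factorization shrinks the trainable-parameter count, using illustrative channel sizes (not the paper's exact layer configuration):

```python
# Hypothetical parameter-count comparison: a standard 3x3 convolution
# versus a depth-wise separable convolution (depth-wise 3x3 + point-wise 1x1).
# Channel sizes below are illustrative, not taken from the paper's model.

def standard_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights of a standard k x k convolution (bias terms omitted)."""
    return k * k * c_in * c_out

def separable_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Depth-wise k x k (one filter per input channel) plus a 1x1 point-wise mix."""
    depthwise = k * k * c_in   # one k x k kernel per input channel
    pointwise = c_in * c_out   # 1x1 convolution mixing channels
    return depthwise + pointwise

if __name__ == "__main__":
    c_in, c_out = 128, 256
    std = standard_conv_params(c_in, c_out)   # 294912 weights
    sep = separable_conv_params(c_in, c_out)  # 33920 weights
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.1f}x")
```

For a 3x3 kernel the separable form is roughly `k*k = 9` times cheaper when `c_out` is large, which is the mechanism the abstract credits for letting the network go deeper within a 590K-parameter budget.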
