Paper Title

GMP*: Well-Tuned Gradual Magnitude Pruning Can Outperform Most BERT-Pruning Methods

Paper Authors

Eldar Kurtic, Dan Alistarh

Paper Abstract

We revisit the performance of the classic gradual magnitude pruning (GMP) baseline for large language models, focusing on the classic BERT benchmark on various popular tasks. Despite existing evidence in the literature that GMP performs poorly, we show that a simple and general variant, which we call GMP*, can match and sometimes outperform more complex state-of-the-art methods. Our results provide a simple yet strong baseline for future work, highlight the importance of parameter tuning for baselines, and even improve the performance of the state-of-the-art second-order pruning method in this setting.
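As background, gradual magnitude pruning (GMP) repeatedly zeroes out the smallest-magnitude weights while fine-tuning continues, following a sparsity schedule that ramps from dense up to a target sparsity. The sketch below illustrates this idea in PyTorch with the commonly used cubic schedule; it is a minimal illustrative example, not the authors' exact GMP* recipe, and the layer selection, schedule, and hyperparameter names are assumptions.

```python
# Illustrative sketch of gradual magnitude pruning (GMP); not the authors' GMP* recipe.
# Assumes a PyTorch model; hyperparameter names are placeholders.
import torch


def target_sparsity(step, start_step, end_step, final_sparsity, initial_sparsity=0.0):
    """Cubic schedule: sparsity rises from initial to final between start_step and end_step."""
    if step < start_step:
        return initial_sparsity
    if step >= end_step:
        return final_sparsity
    progress = (step - start_step) / (end_step - start_step)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** 3


def magnitude_prune(model, sparsity):
    """Zero out the smallest-magnitude weights of each Linear layer (layer-wise pruning)."""
    for module in model.modules():
        if isinstance(module, torch.nn.Linear):
            weight = module.weight.data
            k = int(sparsity * weight.numel())  # number of weights to prune in this layer
            if k == 0:
                continue
            threshold = weight.abs().flatten().kthvalue(k).values
            mask = weight.abs() > threshold  # keep only weights above the magnitude threshold
            weight.mul_(mask)
```

In a training loop, one would call `magnitude_prune(model, target_sparsity(step, ...))` at a fixed pruning frequency and keep the resulting zeros masked between pruning steps, so the remaining weights continue to be fine-tuned while sparsity gradually increases.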
