arxiv:2502.07408

Maximal Brain Damage Without Data or Optimization: Disrupting Neural Networks via Sign-Bit Flips

Published on Apr 16
Submitted by Moshe kimhi on Apr 20
#1 Paper of the day

Abstract

Deep neural networks exhibit catastrophic vulnerability to minimal parameter bit flips across multiple domains; this vulnerability can be identified and mitigated through targeted protection strategies.

AI-generated summary

Deep Neural Networks (DNNs) can be catastrophically disrupted by flipping only a handful of parameter bits. We introduce Deep Neural Lesion (DNL), a data-free and optimization-free method that locates critical parameters, and an enhanced single-pass variant, 1P-DNL, that refines this selection with one forward and backward pass on random inputs. We show that this vulnerability spans multiple domains, including image classification, object detection, instance segmentation, and reasoning large language models. In image classification, flipping just two sign bits in ResNet-50 on ImageNet reduces accuracy by 99.8%. In object detection and instance segmentation, one or two sign flips in the backbone collapse COCO detection and mask AP for Mask R-CNN and YOLOv8-seg models. In language modeling, two sign flips in different experts reduce Qwen3-30B-A3B-Thinking from 78% to 0% accuracy. We also show that selectively protecting a small fraction of vulnerable sign bits provides a practical defense against such attacks.
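The attack surface described above is the sign bit of a model parameter: in the IEEE-754 binary32 encoding used for float32 weights, the most significant bit determines the sign, so a single bit flip negates the weight. This is an illustrative sketch of that mechanism (not the paper's DNL selection method, which chooses *which* parameters to flip):

```python
import struct

def flip_sign_bit(x: float) -> float:
    """Negate a float by toggling its IEEE-754 binary32 sign bit."""
    # Reinterpret the float's 32-bit pattern as an unsigned integer.
    bits, = struct.unpack("<I", struct.pack("<f", x))
    # XOR the most significant bit (the sign bit) and reinterpret back.
    return struct.unpack("<f", struct.pack("<I", bits ^ 0x8000_0000))[0]

print(flip_sign_bit(0.75))   # -0.75: one bit flip negates the weight
print(flip_sign_bit(-2.0))   # 2.0
```

A single such flip in a high-magnitude, heavily-used weight (e.g. in an early backbone layer) is what the paper exploits; the defense correspondingly only needs to guard these few sign bits rather than the whole parameter tensor.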



Get this paper in your agent:

hf papers read 2502.07408
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash
