| label | remark |
|---|---|
| Adversarial example generation algorithms | Common adversarial example generation algorithms |
| PGD | Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks |
| FGSM | Explaining and Harnessing Adversarial Examples |
| LLM Self Defense | LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked |
| LLM (universal attack) | Universal and Transferable Adversarial Attacks on Aligned Language Models |