{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,7]],"date-time":"2025-07-07T20:10:09Z","timestamp":1751919009741,"version":"3.41.2"},"reference-count":183,"publisher":"MIT Press","license":[{"start":{"date-parts":[[2025,7,7]],"date-time":"2025-07-07T00:00:00Z","timestamp":1751846400000},"content-version":"vor","delay-in-days":187,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,7,3]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>As NLP models are used by a growing number of end-users, an area of increasing importance is NLP Security (NLPSec): assessing the vulnerability of models to malicious attacks and developing comprehensive countermeasures against them. While work at the intersection of NLP and cybersecurity has the potential to create safer NLP for all, accidental oversights can result in tangible harm (e.g., breaches of privacy or proliferation of malicious models). In this emerging field, however, the research ethics of NLP have not yet faced many of the long-standing conundrums pertinent to cybersecurity, until now. We thus examine contemporary works across NLPSec, and explore their engagement with cybersecurity\u2019s ethical norms. We identify trends across the literature, ultimately finding alarming gaps on topics like harm minimization and responsible disclosure. To alleviate these concerns, we provide concrete recommendations to help NLP researchers navigate this space more ethically, bridging the gap between traditional cybersecurity and NLP ethics, which we frame as \u201cwhite hat NLP\u201d. 
The goal of this work is to help cultivate an intentional culture of ethical research for those working in NLP Security.<\/jats:p>","DOI":"10.1162\/tacl_a_00762","type":"journal-article","created":{"date-parts":[[2025,7,7]],"date-time":"2025-07-07T19:37:06Z","timestamp":1751917026000},"page":"709-743","update-policy":"https:\/\/doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":0,"title":["NLP Security and Ethics, in the Wild"],"prefix":"10.1162","volume":"13","author":[{"given":"Heather","family":"Lent","sequence":"first","affiliation":[{"name":"Aalborg University, Denmark. hcle@cs.aau.dk"}]},{"given":"Erick","family":"Galinkin","sequence":"additional","affiliation":[{"name":"NVIDIA Corporation, USA"}]},{"given":"Yiyi","family":"Chen","sequence":"additional","affiliation":[{"name":"Aalborg University, Denmark"}]},{"given":"Jens Myrup","family":"Pedersen","sequence":"additional","affiliation":[{"name":"Aalborg University, Denmark"}]},{"given":"Leon","family":"Derczynski","sequence":"additional","affiliation":[{"name":"NVIDIA Corporation, USA"},{"name":"IT University of Copenhagen, Denmark"}]},{"given":"Johannes","family":"Bjerva","sequence":"additional","affiliation":[{"name":"Aalborg University, Denmark. 
jbjerva@cs.aau.dk"}]}],"member":"281","published-online":{"date-parts":[[2025,7,3]]},"reference":[{"key":"2025070715370487600_bib1","doi-asserted-by":"crossref","DOI":"10.1145\/3461702.3462624","article-title":"Persistent anti-muslim bias in large language models","author":"Abid","year":"2021"},{"key":"2025070715370487600_bib2","unstructured":"OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mo Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, Madelaine Boyd, Anna-Luisa Brakman, Greg Brockman, Tim Brooks, Miles Brundage, Kevin Button, Trevor Cai, Rosie Campbell, Andrew Cann, Brittany Carey, Chelsea Carlson, Rory Carmichael, Brooke Chan, Che Chang, Fotis Chantzis, Derek Chen, Sully Chen, Ruby Chen, Jason Chen, Mark Chen, Benjamin Chess, Chester Cho, Casey Chu, Hyung Won Chung, Dave Cummings, Jeremiah Currier, Yunxing Dai, Cory Decareaux, Thomas Degry, Noah Deutsch, Damien Deville, Arka Dhar, David Dohan, Steve Dowling, Sheila Dunning, Adrien Ecoffet, Atty Eleti, Tyna Eloundou, David Farhi, Liam Fedus, Niko Felix, Sim\u00f3n Posada Fishman, Juston Forte, Isabella Fulford, Leo Gao, Elie Georges, Christian Gibson, Vik Goel, Tarun Gogineni, Gabriel Goh, Raphael Gontijo-Lopes, Jonathan Gordon, Morgan Grafstein, Scott Gray, Ryan Greene, Joshua Gross, Shixiang Shane Gu, Yufei Guo, Chris Hallacy, Jesse Han, Jeff Harris, Yuchen He, Mike Heaton, Johannes Heidecke, Chris Hesse, Alan Hickey, Wade Hickey, Peter Hoeschele, Brandon Houghton, Kenny Hsu, Shengli Hu, Xin Hu, Joost Huizinga, Shantanu Jain, Shawn Jain, Joanne Jang, Angela Jiang, Roger Jiang, Haozhun Jin, Denny Jin, Shino Jomoto, Billie Jonn, Heewoo Jun, Tomer Kaftan, Lukasz Kaiser, Ali Kamali, Ingmar Kanitscheider, Nitish Shirish Keskar, Tabarak Khan, Logan Kilpatrick, Jong Wook Kim, Christina Kim, Yongjik Kim, 
Hendrik Kirchner, Jamie Ryan Kiros, Matthew Knight, Daniel Kokotajlo, Lukasz Kondraciuk, Andrew Kondrich, Aris Konstantinidis, Kyle Kosic, Gretchen Krueger, Vishal Kuo, Michael Lampe, Ikai Lan, Teddy Lee, Jan Leike, Jade Leung, Daniel Levy, Chak Ming Li, Rachel Lim, Molly Lin, Stephanie Lin, Mateusz Litwin, Theresa Lopez, Ryan Lowe, Patricia Lue, Anna Makanju, Kim Malfacini, Sam Manning, Todor Markov, Yaniv Markovski, Bianca Martin, Katie Mayer, Andrew Mayne, Bob McGrew, Scott Mayer McKinney, Christine McLeavey, Paul McMillan, Jake McNeil, David Medina, Aalok Mehta, Jacob Menick, Luke Metz, Andrey Mishchenko, Pamela Mishkin, Vinnie Monaco, Evan Morikawa, Daniel P. Mossing, Tong Mu, Mira Murati, Oleg Murk, David M\u00e9ly, Ashvin Nair, Reiichiro Nakano, Rajeev Nayak, Arvind Neelakantan, Richard Ngo, Hyeonwoo Noh, Long Ouyang, Cullen O\u2019Keefe, Jakub W. Pachocki, Alex Paino, Joe Palermo, Ashley Pantuliano, Giambattista Parascandolo, Joel Parish, Emy Parparita, Alexandre Passos, Mikhail Pavlov, Andrew Peng, Adam Perelman, Filipe de Avila Belbute Peres, Michael Petrov, Henrique Pond\u00e9 de Oliveira Pinto, Michael Pokorny, Michelle Pokrass, Vitchyr H. Pong, Tolly Powell, Alethea Power, Boris Power, Elizabeth Proehl, Raul Puri, Alec Radford, Jack W. Rae, Aditya Ramesh, Cameron Raymond, Francis Real, Kendra Rimbach, Carl Ross, Bob Rotsted, Henri Roussez, Nick Ryder, Mario D. Saltarelli, Ted Sanders, Shibani Santurkar, Girish Sastry, Heather Schmidt, David Schnurr, John Schulman, Daniel Selsam, Kyla Sheppard, Toki Sherbakov, Jessica Shieh, Sarah Shoker, Pranav Shyam, Szymon Sidor, Eric Sigler, Maddie Simens, Jordan Sitkin, Katarina Slama, Ian Sohl, Benjamin D. Sokolowsky, Yang Song, Natalie Staudacher, Felipe Petroski Such, Natalie Summers, Ilya Sutskever, Jie Tang, Nikolas A. Tezak, Madeleine Thompson, Phil Tillet, Amin Tootoonchian, Elizabeth Tseng, Preston Tuggle, Nick Turley, Jerry Tworek, Juan Felipe Cer\u00f3n Uribe, Andrea Vallone, Arun Vijayvergiya, Chelsea Voss, Carroll L. Wainwright, Justin Jay Wang, Alvin Wang, Ben Wang, Jonathan Ward, Jason Wei, C. 
J. Weinmann, Akila Welihinda, Peter Welinder, Jiayi Weng, Lilian Weng, Matt Wiethoff, Dave Willner, Clemens Winter, Samuel Wolrich, Hannah Wong, Lauren Workman, Sherwin Wu, Jeff Wu, Michael Wu, Kai Xiao, Tao Xu, Sarah Yoo, Kevin Yu, Qiming Yuan, Wojciech Zaremba, Rowan Zellers, Chong Zhang, Marvin Zhang, Shengjia Zhao, Tianhao Zheng, Juntang Zhuang, William Zhuk, and Barret Zoph. 2023. Gpt-4 technical report."},{"key":"2025070715370487600_bib3","article-title":"Regina couple says possible ai voice scam nearly cost them $9,400","author":"Ackerman","year":"2022","journal-title":"Regina Leader Post"},{"key":"2025070715370487600_bib4","first-page":"137","article-title":"Arabic synonym BERT-based adversarial examples for text classification","volume-title":"Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop","author":"Alshahrani","year":"2024"},{"issue":"4","key":"2025070715370487600_bib5","first-page":"15","article-title":"Machine ethics: Creating an ethical intelligent agent","volume":"28","author":"Anderson","year":"2007","journal-title":"AI Magazine"},{"issue":"2","key":"2025070715370487600_bib6","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1109\/MSP.2012.52","article-title":"The menlo report","volume":"10","author":"Bailey","year":"2012","journal-title":"IEEE Security and Privacy"},{"key":"2025070715370487600_bib7","doi-asserted-by":"publisher","first-page":"3248","DOI":"10.18653\/v1\/2021.findings-acl.287","article-title":"Defending pre-trained language models from adversarial word substitution without performance sacrifice","volume-title":"Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021","author":"Bao","year":"2021"},{"key":"2025070715370487600_bib8","doi-asserted-by":"publisher","first-page":"6","DOI":"10.18653\/v1\/2020.acl-tutorials.2","article-title":"Integrating ethics into the NLP curriculum","volume-title":"Proceedings of the 58th Annual Meeting of the 
Association for Computational Linguistics: Tutorial Abstracts","author":"Bender","year":"2020"},{"key":"2025070715370487600_bib9","doi-asserted-by":"publisher","first-page":"3504","DOI":"10.18653\/v1\/2020.coling-main.313","article-title":"Decolonising speech and language technology","volume-title":"Proceedings of the 28th International Conference on Computational Linguistics","author":"Bird","year":"2020"},{"key":"2025070715370487600_bib10","doi-asserted-by":"publisher","first-page":"7817","DOI":"10.18653\/v1\/2022.acl-long.539","article-title":"Local languages, third spaces, and other high-resource scenarios","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Bird","year":"2022"},{"key":"2025070715370487600_bib11","article-title":"Into the laions den: Investigating hate in multimodal datasets","author":"Birhane","year":"2023"},{"key":"2025070715370487600_bib12","article-title":"Multimodal datasets: Misogyny, pornography, and malignant stereotypes","author":"Birhane","year":"2021"},{"key":"2025070715370487600_bib13","article-title":"A case study-based cybersecurity ethics curriculum","volume-title":"ASE @ USENIX Security Symposium","author":"Blanken-Webb","year":"2018"},{"key":"2025070715370487600_bib14","doi-asserted-by":"publisher","first-page":"5486","DOI":"10.18653\/v1\/2022.acl-long.376","article-title":"Systematic inequalities in language technology performance across the world\u2019s languages","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Blasi","year":"2022"},{"key":"2025070715370487600_bib15","article-title":"Language models are few-shot learners","author":"Brown","year":"2020"},{"key":"2025070715370487600_bib16","first-page":"1","article-title":"Conducting cybersecurity research legally and 
ethically.","volume":"8","author":"Burstein","year":"2008","journal-title":"LEET"},{"key":"2025070715370487600_bib17","doi-asserted-by":"publisher","first-page":"35","DOI":"10.18653\/v1\/2023.trustnlp-1.4","article-title":"Pay attention to the robustness of Chinese minority language models! Syllable-level textual adversarial attack on Tibetan script","volume-title":"Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing (TrustNLP 2023)","author":"Xi","year":"2023"},{"key":"2025070715370487600_bib18","article-title":"Stealing part of a production language model","author":"Carlini","year":"2024"},{"key":"2025070715370487600_bib19","article-title":"Police use of facial recognition technology in canada and the way forward","author":"Canadian Civil Liberties Association CCLA","year":"2001"},{"key":"2025070715370487600_bib20","first-page":"11","article-title":"Context-aware adversarial attack on named entity recognition","volume-title":"Proceedings of the Ninth Workshop on Noisy and User-generated Text (W-NUT 2024)","author":"Chen","year":"2024"},{"key":"2025070715370487600_bib21","doi-asserted-by":"publisher","first-page":"668","DOI":"10.18653\/v1\/2022.findings-emnlp.47","article-title":"Expose backdoors on the way: A feature-based efficient defense against textual backdoor attacks","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2022","author":"Chen","year":"2022"},{"key":"2025070715370487600_bib22","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v39i22.34533","article-title":"Against all odds: Overcoming typology, script, and language confusion in multilingual embedding inversion attacks","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Chen","year":"2025"},{"key":"2025070715370487600_bib23","doi-asserted-by":"publisher","first-page":"7808","DOI":"10.18653\/v1\/2024.acl-long.422","article-title":"Text embedding inversion security for multilingual language 
models","volume-title":"Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Chen","year":"2024"},{"key":"2025070715370487600_bib24","doi-asserted-by":"publisher","first-page":"11215","DOI":"10.18653\/v1\/2022.emnlp-main.770","article-title":"Textual backdoor attacks can be more harmful via two simple tricks","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Chen","year":"2022"},{"key":"2025070715370487600_bib25","doi-asserted-by":"publisher","first-page":"4511","DOI":"10.18653\/v1\/2021.emnlp-main.371","article-title":"Multi-granularity textual adversarial attack with behavior cloning","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Chen","year":"2021"},{"key":"2025070715370487600_bib26","doi-asserted-by":"publisher","first-page":"5490","DOI":"10.18653\/v1\/2022.emnlp-main.369","article-title":"TABS: Efficient textual adversarial attack for pre-trained NL code model using semantic beam search","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Choi","year":"2022"},{"key":"2025070715370487600_bib27","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-29053-5","volume-title":"The Ethics of Cybersecurity","author":"Christen","year":"2020"},{"key":"2025070715370487600_bib28","article-title":"Horizon 2020 work programme 2014\u20132015","author":"EC-European Commission","year":"2013","journal-title":"Science with and for Society"},{"key":"2025070715370487600_bib29","article-title":"The dark side of security by obscurity and cloning MiFare classic rail and building passes anywhere, anytime","author":"Courtois","year":"2009"},{"key":"2025070715370487600_bib30","first-page":"351","article-title":"When hal kills, who\u2019s to blame? 
Computer ethics","author":"Dennett","year":"1997","journal-title":"HAL\u2019s Legacy: 2001\u2019s Computer as Dream and Reality"},{"key":"2025070715370487600_bib31","article-title":"Bert: Pre-training of deep bidirectional transformers for language understanding","volume-title":"North American Chapter of the Association for Computational Linguistics","author":"Devlin","year":"2019"},{"key":"2025070715370487600_bib32","first-page":"4171","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","volume-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)","author":"Devlin","year":"2019"},{"key":"2025070715370487600_bib33","doi-asserted-by":"publisher","first-page":"7865","DOI":"10.1287\/opre.2021.0562","article-title":"UOR: Universal backdoor attacks on pre-trained language models","volume-title":"Findings of the Association for Computational Linguistics: ACL 2024","author":"Wei","year":"2024"},{"key":"2025070715370487600_bib34","article-title":"China\u2019s ubiquitous facial recognition tech sparks privacy backlash","author":"Dudley","year":"2020"},{"key":"2025070715370487600_bib35","article-title":"H2020 programme guidance: How to complete your ethics self-assessment","author":"European Commission Directorate-General for Research Innovation ECDGRI","year":"2016"},{"key":"2025070715370487600_bib36","doi-asserted-by":"publisher","first-page":"143","DOI":"10.18653\/v1\/2024.privatenlp-1.15","article-title":"Deconstructing classifiers: Towards a data reconstruction attack against text classification models","volume-title":"Proceedings of the Fifth Workshop on Privacy in Natural Language Processing","author":"Elmahdy","year":"2024"},{"key":"2025070715370487600_bib37","unstructured":"European Parliament and Council of the European Union. \n          2016. 
Regulation (EU) 2016\/679 of the European Parliament and of the Council."},{"key":"2025070715370487600_bib38","article-title":"Gray hat hacking: Morally black and white","author":"Falk","year":"2004"},{"key":"2025070715370487600_bib39","doi-asserted-by":"publisher","first-page":"7322","DOI":"10.18653\/v1\/2023.findings-acl.461","article-title":"Modeling adversarial attack on pre-trained language models as sequential decision making","volume-title":"Findings of the Association for Computational Linguistics: ACL 2023","author":"Fang","year":"2023"},{"key":"2025070715370487600_bib40","article-title":"Towards a responsible AI development lifecycle: Lessons from information security","author":"Galinkin","year":"2022","journal-title":"ArXiv"},{"key":"2025070715370487600_bib41","doi-asserted-by":"publisher","first-page":"2942","DOI":"10.18653\/v1\/2022.naacl-main.214","article-title":"Triggerless backdoor attack for NLP tasks with clean labels","volume-title":"Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Gan","year":"2022"},{"key":"2025070715370487600_bib42","doi-asserted-by":"publisher","first-page":"202","DOI":"10.18653\/v1\/2024.trustnlp-1.17","article-title":"Semantic-preserving adversarial example attack against BERT","volume-title":"Proceedings of the 4th Workshop on Trustworthy Natural Language Processing (TrustNLP 2024)","author":"Gao","year":"2024"},{"key":"2025070715370487600_bib43","unstructured":"US Government Accountability Office GAO. 2023. 
Facial recognition services: Federal law enforcement agencies should take actions to implement training, and policies for civil liberties."},{"key":"2025070715370487600_bib44","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s43681-021-00069-w","article-title":"Ethical funding for trustworthy AI: Proposals to address the responsibilities of funders to ensure that projects adhere to trustworthy AI practice","author":"Gardner","year":"2022","journal-title":"AI and Ethics"},{"key":"2025070715370487600_bib45","article-title":"The AI assault on women: What iran\u2019s tech enabled morality laws indicate for women\u2019s rights movements","author":"George","year":"2023"},{"key":"2025070715370487600_bib46","article-title":"Generative language models and automated influence operations: Emerging threats and potential mitigations","author":"Goldstein","year":"2023"},{"issue":"1","key":"2025070715370487600_bib47","doi-asserted-by":"publisher","first-page":"132","DOI":"10.1093\/ojlr\/rwaa015","article-title":"From the Tree of Knowledge and the Golem of Prague to Kosher autonomous cars: The ethics of artificial intelligence through Jewish eyes","volume":"9","author":"Goltz","year":"2020","journal-title":"Oxford Journal of Law and Religion"},{"key":"2025070715370487600_bib48","doi-asserted-by":"publisher","first-page":"706","DOI":"10.18653\/v1\/2024.naacl-long.40","article-title":"Two heads are better than one: Nested PoE for robust defense against multi-backdoors","volume-title":"Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)","author":"Graf","year":"2024"},{"key":"2025070715370487600_bib49","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1109\/ICDM.2018.00029","article-title":"Defending against adversarial samples without security through obscurity","volume-title":"2018 IEEE International Conference on Data Mining 
(ICDM)","author":"Guo","year":"2018"},{"issue":"4","key":"2025070715370487600_bib50","doi-asserted-by":"publisher","DOI":"10.1016\/j.patter.2022.100462","article-title":"The role of the african value of ubuntu in global AI inclusion discourse: A normative ethics perspective","volume":"3","author":"Gwagwa","year":"2022","journal-title":"Patterns"},{"key":"2025070715370487600_bib51","doi-asserted-by":"publisher","first-page":"2327","DOI":"10.18653\/v1\/2020.emnlp-main.182","article-title":"Adversarial attack and defense of structured prediction models","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Han","year":"2020"},{"key":"2025070715370487600_bib52","doi-asserted-by":"publisher","first-page":"5009","DOI":"10.18653\/v1\/2022.findings-emnlp.368","article-title":"Invernet: An inversion attack framework to infer fine-tuning datasets through word embeddings","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2022","author":"Hayet","year":"2022"},{"key":"2025070715370487600_bib53","article-title":"Spear phishing with large language models","author":"Hazell","year":"2023","journal-title":"arXiv preprint arXiv:2305.06972"},{"key":"2025070715370487600_bib54","doi-asserted-by":"publisher","first-page":"287","DOI":"10.18653\/v1\/2023.trustnlp-1.25","article-title":"IMBERT: Making BERT immune to insertion-based backdoor attacks","volume-title":"Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing (TrustNLP 2023)","author":"He","year":"2023"},{"key":"2025070715370487600_bib55","article-title":"Transferring troubles: Cross-lingual transferability of backdoor attacks in LLMs with instruction tuning","author":"He","year":"2024"},{"key":"2025070715370487600_bib56","doi-asserted-by":"publisher","first-page":"953","DOI":"10.18653\/v1\/2023.emnlp-main.60","article-title":"Mitigating backdoor poisoning attacks through the lens of spurious 
correlation","volume-title":"Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing","author":"He","year":"2023"},{"key":"2025070715370487600_bib57","doi-asserted-by":"publisher","first-page":"287","DOI":"10.1145\/3600211.3604690","article-title":"Self-destructing models: Increasing the costs of harmful dual uses of foundation models","volume-title":"Proceedings of the 2023 AAAI\/ACM Conference on AI, Ethics, and Society","author":"Henderson","year":"2023"},{"key":"2025070715370487600_bib58","doi-asserted-by":"crossref","DOI":"10.1002\/9780470281819","volume-title":"The handbook of information and computer ethics","author":"Himma","year":"2008"},{"key":"2025070715370487600_bib59","article-title":"What buddhism can do for AI ethics","author":"Hongladarom","year":"2021"},{"key":"2025070715370487600_bib60","doi-asserted-by":"publisher","first-page":"591","DOI":"10.18653\/v1\/P16-2096","article-title":"The social impact of natural language processing","volume-title":"Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)","author":"Hovy","year":"2016"},{"key":"2025070715370487600_bib61","doi-asserted-by":"publisher","first-page":"1459","DOI":"10.18653\/v1\/2024.findings-naacl.94","article-title":"Composite backdoor attacks against large language models","volume-title":"Findings of the Association for Computational Linguistics: NAACL 2024","author":"Huang","year":"2024"},{"key":"2025070715370487600_bib62","doi-asserted-by":"publisher","first-page":"1368","DOI":"10.18653\/v1\/2020.findings-emnlp.123","article-title":"TextHide: Tackling data privacy in language understanding tasks","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Huang","year":"2020"},{"key":"2025070715370487600_bib63","doi-asserted-by":"publisher","first-page":"4193","DOI":"10.18653\/v1\/2024.acl-long.230","article-title":"Transferable embedding inversion attack: 
Uncovering privacy risks in text embeddings without model queries","volume-title":"Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Huang","year":"2024"},{"key":"2025070715370487600_bib64","first-page":"69","article-title":"Compassionate AI and selfless robots: A buddhist approach","author":"Hughes","year":"2012","journal-title":"Robot ethics: The ethical and social implications of robotics"},{"issue":"1","key":"2025070715370487600_bib65","doi-asserted-by":"publisher","first-page":"e0314658","DOI":"10.1371\/journal.pone.0314658","article-title":"Summon a demon and bind it: A grounded theory of LLM red teaming","volume":"20","author":"Inie","year":"2025","journal-title":"PLOS One"},{"key":"2025070715370487600_bib66","unstructured":"ISO 29147:2018. 2018. Information technology \u2014 security techniques \u2014 vulnerability disclosure. Standard, International Organization for Standardization, Geneva, CH."},{"key":"2025070715370487600_bib67","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s43545-021-00092-y","article-title":"A sociological analysis of tibetan language policy issues in china","volume":"1","author":"Jia","year":"2021","journal-title":"SN Social Sciences"},{"key":"2025070715370487600_bib68","article-title":"Mistral 7b","author":"Jiang","year":"2023"},{"key":"2025070715370487600_bib69","doi-asserted-by":"publisher","first-page":"11614","DOI":"10.18653\/v1\/2022.emnlp-main.798","article-title":"WeDef: Weakly supervised backdoor defense for text classification","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Jin","year":"2022"},{"key":"2025070715370487600_bib70","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s42256-019-0088-2","article-title":"The global landscape of AI ethics guidelines","author":"Jobin","year":"2019","journal-title":"Nature Machine 
Intelligence"},{"key":"2025070715370487600_bib71","doi-asserted-by":"publisher","first-page":"13977","DOI":"10.18653\/v1\/2023.findings-emnlp.932","article-title":"Thorny roses: Investigating the dual use dilemma in natural language processing","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2023","author":"Kaffee","year":"2023"},{"key":"2025070715370487600_bib72","doi-asserted-by":"publisher","first-page":"1616","DOI":"10.18653\/v1\/2021.findings-acl.141","article-title":"BERT-defense: A probabilistic model based on BERT to combat cognitively inspired orthographic adversarial attacks","volume-title":"Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021","author":"Keller","year":"2021"},{"key":"2025070715370487600_bib73","article-title":"Alignment of language agents","author":"Kenton","year":"2021"},{"key":"2025070715370487600_bib74","doi-asserted-by":"publisher","first-page":"25","DOI":"10.18653\/v1\/2022.finnlp-1.4","article-title":"Toward privacy-preserving text embedding similarity with homomorphic encryption","volume-title":"Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)","author":"Kim","year":"2022"},{"key":"2025070715370487600_bib75","article-title":"The semantic scholar open data platform","author":"Kinney","year":"2023","journal-title":"ArXiv"},{"key":"2025070715370487600_bib76","article-title":"On tables with numbers, with numbers","author":"Kogkalidis","year":"2024"},{"key":"2025070715370487600_bib77","first-page":"5145","article-title":"Ethical frameworks and computer security trolley problems: Foundations for conversations","volume-title":"32nd USENIX Security Symposium (USENIX Security 23)","author":"Kohno","year":"2023"},{"key":"2025070715370487600_bib78","volume-title":"Proceedings of the RaPID Workshop - Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from people with various forms of 
cognitive\/psychiatric\/developmental impairments - within the 13th Language Resources and Evaluation Conference","author":"Kokkinakis","year":"2022"},{"key":"2025070715370487600_bib79","doi-asserted-by":"publisher","first-page":"202","DOI":"10.1007\/978-3-031-04036-8_9","article-title":"Ethics in cybersecurity. What are the challenges we need to be aware of and how to handle them?","volume-title":"Cybersecurity of Digital Service Chains: Challenges, Methodologies, and Tools","author":"Kozhuharova","year":"2022"},{"key":"2025070715370487600_bib80","unstructured":"LAION.ai. 2023. Safety review for laion-5b. https:\/\/laion.ai\/notes\/laion-maintenance\/. Accessed: 2024-03-05."},{"key":"2025070715370487600_bib81","doi-asserted-by":"publisher","first-page":"85","DOI":"10.1007\/978-3-030-63672-2_4","article-title":"AI in the EU: Ethical guidelines as a governance tool","author":"Larsson","year":"2021","journal-title":"The European Union and the Technology Shift"},{"key":"2025070715370487600_bib82","doi-asserted-by":"publisher","first-page":"1095","DOI":"10.18653\/v1\/2022.findings-naacl.83","article-title":"Phrase-level textual adversarial attack with label preservation","volume-title":"Findings of the Association for Computational Linguistics: NAACL 2022","author":"Lei","year":"2022"},{"key":"2025070715370487600_bib83","doi-asserted-by":"publisher","first-page":"30","DOI":"10.18653\/v1\/W17-1604","article-title":"Ethical by design: Ethics best practices for natural language processing","volume-title":"Proceedings of the First ACL Workshop on Ethics in Natural Language Processing","author":"Leidner","year":"2017"},{"key":"2025070715370487600_bib84","first-page":"6439","article-title":"What a creole wants, what a creole needs","volume-title":"Proceedings of the Thirteenth Language Resources and Evaluation 
Conference","author":"Lent","year":"2022"},{"key":"2025070715370487600_bib85","doi-asserted-by":"publisher","first-page":"950","DOI":"10.1162\/tacl_a_00682","article-title":"CreoleVal: Multilingual multitask benchmarks for creoles","volume":"12","author":"Lent","year":"2024","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2025070715370487600_bib86","article-title":"Media manipulation and disinformation online","author":"Lewis","year":"2017"},{"key":"2025070715370487600_bib87","first-page":"5053","article-title":"Contextualized perturbation for textual adversarial attack","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Li","year":"2021"},{"key":"2025070715370487600_bib88","doi-asserted-by":"publisher","first-page":"14022","DOI":"10.18653\/v1\/2023.findings-acl.881","article-title":"Sentence embedding leaks more information than you expect: Generative embedding inversion attack to recover the whole sentence","volume-title":"Findings of the Association for Computational Linguistics: ACL 2023","author":"Li","year":"2023"},{"key":"2025070715370487600_bib89","doi-asserted-by":"publisher","first-page":"8818","DOI":"10.18653\/v1\/2023.findings-acl.561","article-title":"Defending against insertion-based textual backdoor attacks via attribution","volume-title":"Findings of the Association for Computational Linguistics: ACL 2023","author":"Li","year":"2023"},{"key":"2025070715370487600_bib90","doi-asserted-by":"publisher","first-page":"2985","DOI":"10.18653\/v1\/2024.naacl-long.165","article-title":"ChatGPT as an attack tool: Stealthy textual backdoor attack via blackbox generative model trigger","volume-title":"Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long 
Papers)","author":"Li","year":"2024"},{"key":"2025070715370487600_bib91","doi-asserted-by":"publisher","first-page":"6193","DOI":"10.18653\/v1\/2020.emnlp-main.500","article-title":"BERT-ATTACK: Adversarial attack against BERT using BERT","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Li","year":"2020"},{"key":"2025070715370487600_bib92","doi-asserted-by":"publisher","first-page":"3023","DOI":"10.18653\/v1\/2021.emnlp-main.241","article-title":"Backdoor attacks on pre-trained models by layerwise weight poisoning","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Li","year":"2021"},{"key":"2025070715370487600_bib93","doi-asserted-by":"publisher","first-page":"338","DOI":"10.18653\/v1\/2023.acl-long.20","article-title":"Text adversarial purification as defense against adversarial attacks","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Li","year":"2023"},{"key":"2025070715370487600_bib94","doi-asserted-by":"publisher","first-page":"7236","DOI":"10.18653\/v1\/2023.acl-long.399","article-title":"Multi-target backdoor attacks for code pre-trained models","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Li","year":"2023"},{"key":"2025070715370487600_bib95","doi-asserted-by":"publisher","first-page":"444","DOI":"10.18653\/v1\/2021.findings-emnlp.40","article-title":"BFClass: A backdoor-free text classification framework","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2021","author":"Li","year":"2021"},{"key":"2025070715370487600_bib96","doi-asserted-by":"publisher","first-page":"2100","DOI":"10.18653\/v1\/2024.emnlp-main.126","article-title":"An inversion attack against obfuscated embedding matrix in language model 
inference","volume-title":"Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing","author":"Lin","year":"2024"},{"key":"2025070715370487600_bib97","article-title":"RoBERTa: A robustly optimized BERT pretraining approach","author":"Liu","year":"2019","journal-title":"ArXiv"},{"key":"2025070715370487600_bib98","doi-asserted-by":"publisher","first-page":"3850","DOI":"10.18653\/v1\/2023.findings-acl.237","article-title":"Maximum entropy loss, the silver bullet targeting backdoor attacks in pre-trained language models","volume-title":"Findings of the Association for Computational Linguistics: ACL 2023","author":"Liu","year":"2023"},{"key":"2025070715370487600_bib99","doi-asserted-by":"publisher","first-page":"48","DOI":"10.18653\/v1\/2021.nuse-1.5","article-title":"Gender and representation bias in GPT-3 generated stories","volume-title":"Proceedings of the Third Workshop on Narrative Understanding","author":"Li","year":"2021"},{"key":"2025070715370487600_bib100","doi-asserted-by":"publisher","first-page":"4727","DOI":"10.18653\/v1\/2022.naacl-main.348","article-title":"A study of the attention abnormality in trojaned BERTs","volume-title":"Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Lyu","year":"2022"},{"key":"2025070715370487600_bib101","doi-asserted-by":"publisher","first-page":"101382","DOI":"10.1016\/j.techsoc.2020.101382","article-title":"Ethics in cybersecurity research and practice","volume":"63","author":"Macnish","year":"2020","journal-title":"Technology in Society"},{"key":"2025070715370487600_bib102","doi-asserted-by":"publisher","first-page":"4871","DOI":"10.18653\/v1\/2023.acl-long.268","article-title":"Ethical considerations for machine translation of indigenous languages: Giving a voice to the speakers","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 
1: Long Papers)","author":"Mager","year":"2023"},{"issue":"1","key":"2025070715370487600_bib103","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1145\/3043955","article-title":"Taking the high road white hat, black hat: The ethics of cybersecurity","volume":"8","author":"Dianne Martin","year":"2017","journal-title":"ACM Inroads"},{"issue":"2","key":"2025070715370487600_bib104","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1109\/MSP.2010.67","article-title":"Ethics in security vulnerability research","volume":"8","author":"Matwyshyn","year":"2010","journal-title":"IEEE Security & Privacy"},{"key":"2025070715370487600_bib105","doi-asserted-by":"publisher","DOI":"10.1515\/9781614518792","volume-title":"Loss and Renewal: Australian Languages since Colonisation","author":"Meakins","year":"2016"},{"key":"2025070715370487600_bib106","article-title":"Tree of attacks: Jailbreaking black-box llms automatically","author":"Mehrotra","year":"2023","journal-title":"arXiv preprint arXiv:2312.02119"},{"key":"2025070715370487600_bib107","doi-asserted-by":"publisher","first-page":"15551","DOI":"10.18653\/v1\/2023.acl-long.867","article-title":"NOTABLE: Transferable backdoor attacks against prompt-based NLP models","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Mei","year":"2023"},{"key":"2025070715370487600_bib108","article-title":"Efficient estimation of word representations in vector space","volume-title":"International Conference on Learning Representations","author":"Mikolov","year":"2013"},{"key":"2025070715370487600_bib109","article-title":"Distributed representations of words and phrases and their compositionality","volume-title":"Neural Information Processing Systems","author":"Mikolov","year":"2013"},{"key":"2025070715370487600_bib110","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2996758.2996764","article-title":"Sherlock vs moriarty: A 
smartphone dataset for cybersecurity research","volume-title":"Proceedings of the 2016 ACM workshop on Artificial intelligence and security","author":"Mirsky","year":"2016"},{"issue":"4","key":"2025070715370487600_bib111","doi-asserted-by":"publisher","DOI":"10.7249\/MR964","article-title":"The legitimization of strategic information warfare: Ethical considerations","volume":"11","author":"Molander","year":"1998","journal-title":"AAAS Professional Ethics Report"},{"issue":"4","key":"2025070715370487600_bib112","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1109\/MIS.2006.80","article-title":"The nature, importance, and difficulty of machine ethics","volume":"21","author":"Moor","year":"2006","journal-title":"IEEE Intelligent Systems"},{"key":"2025070715370487600_bib113","doi-asserted-by":"publisher","first-page":"12448","DOI":"10.18653\/v1\/2023.emnlp-main.765","article-title":"Text embeddings reveal (almost) as much as text","volume-title":"Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing","author":"Morris","year":"2023"},{"key":"2025070715370487600_bib114","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1145\/3630106.3658546","article-title":"\u201ci searched for a religious song in amharic and got sexual content instead\u201d: Investigating online harm in low-resourced languages on youtube","volume-title":"The 2024 ACM Conference on Fairness, Accountability, and Transparency","author":"Nigatu","year":"2024"},{"key":"2025070715370487600_bib115","doi-asserted-by":"publisher","first-page":"552","DOI":"10.18653\/v1\/2022.acl-short.61","article-title":"Canary extraction in natural language understanding models","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short 
Papers)","author":"Parikh","year":"2022"},{"issue":"1","key":"2025070715370487600_bib116","doi-asserted-by":"publisher","first-page":"34","DOI":"10.1109\/TTS.2020.2974991","article-title":"Responsible AI\u2014Two frameworks for ethical design practice","volume":"1","author":"Peters","year":"2020","journal-title":"IEEE Transactions on Technology and Society"},{"key":"2025070715370487600_bib117","article-title":"A principled framework for evaluating on typologically diverse languages","author":"Ploeger","year":"2024"},{"key":"2025070715370487600_bib118","doi-asserted-by":"publisher","first-page":"9558","DOI":"10.18653\/v1\/2021.emnlp-main.752","article-title":"ONION: A simple and effective defense against textual backdoor attacks","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Qi","year":"2021"},{"key":"2025070715370487600_bib119","doi-asserted-by":"publisher","first-page":"4569","DOI":"10.18653\/v1\/2021.emnlp-main.374","article-title":"Mind the style of text! 
Adversarial and backdoor attacks based on text style transfer","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Qi","year":"2021"},{"key":"2025070715370487600_bib120","doi-asserted-by":"publisher","first-page":"443","DOI":"10.18653\/v1\/2021.acl-long.37","article-title":"Hidden killer: Invisible textual backdoor attacks with syntactic trigger","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Qi","year":"2021"},{"key":"2025070715370487600_bib121","doi-asserted-by":"publisher","first-page":"4873","DOI":"10.18653\/v1\/2021.acl-long.377","article-title":"Turn the combination lock: Learnable textual backdoor attacks via word substitution","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Qi","year":"2021"},{"key":"2025070715370487600_bib122","first-page":"140:1\u2013140:67","article-title":"Exploring the limits of transfer learning with a unified text-to-text transformer","volume":"21","author":"Raffel","year":"2019","journal-title":"Journal of Machine Learning Research"},{"key":"2025070715370487600_bib123","doi-asserted-by":"publisher","first-page":"3836","DOI":"10.18653\/v1\/2022.naacl-main.281","article-title":"Residue-based natural language adversarial attack detection","volume-title":"Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Raina","year":"2022"},{"key":"2025070715370487600_bib124","doi-asserted-by":"publisher","first-page":"823","DOI":"10.18653\/v1\/2022.aacl-main.62","article-title":"Some languages are more equal than others: Probing deeper into the linguistic 
disparity in the NLP world","volume-title":"Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Ranathunga","year":"2022"},{"key":"2025070715370487600_bib125","unstructured":"Rapid7. 2022. 2022 vulnerability intelligence report. Technical report, Rapid7, Boston, MA."},{"key":"2025070715370487600_bib126","doi-asserted-by":"publisher","DOI":"10.1007\/s44163-022-00028-2","article-title":"Islamic virtue-based ethics for artificial intelligence","volume":"2","author":"Raquib","year":"2022","journal-title":"Discover Artificial Intelligence"},{"key":"2025070715370487600_bib127","doi-asserted-by":"publisher","first-page":"1085","DOI":"10.18653\/v1\/P19-1103","article-title":"Generating natural language adversarial examples through probability weighted word saliency","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics","author":"Ren","year":"2019"},{"issue":"3","key":"2025070715370487600_bib128","doi-asserted-by":"publisher","first-page":"697","DOI":"10.1093\/ejil\/chad039","article-title":"Unmasking the term \u2018dual use\u2019 in EU spyware export control","volume":"34","author":"Riecke","year":"2023","journal-title":"European Journal of International Law"},{"key":"2025070715370487600_bib129","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1515\/ijsl-2017-0001","article-title":"Introduction: The transformation of tibet\u2019s language ecology in the twenty-first century","volume":"2017","author":"Roche","year":"2017","journal-title":"International Journal of the Sociology of Language"},{"key":"2025070715370487600_bib130","doi-asserted-by":"publisher","first-page":"417","DOI":"10.4324\/9781315561271-53","article-title":"Language revitalization of tibetan 1","volume-title":"The Routledge Handbook of Language 
Revitalization","author":"Roche","year":"2018"},{"key":"2025070715370487600_bib131","first-page":"1160","article-title":"A classification-guided approach for adversarial attacks against neural machine translation","volume-title":"Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Sadrizadeh","year":"2024"},{"issue":"3429","key":"2025070715370487600_bib132","doi-asserted-by":"publisher","first-page":"741","DOI":"10.1126\/science.132.3429.741","article-title":"Some moral and technical consequences of automation\u2014A refutation","volume":"132","author":"Samuel","year":"1960","journal-title":"Science"},{"key":"2025070715370487600_bib133","article-title":"Distilbert, a distilled version of bert: Smaller, faster, cheaper and lighter","author":"Sanh","year":"2019","journal-title":"ArXiv"},{"issue":"CSCW2","key":"2025070715370487600_bib134","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3555556","article-title":"Privacy research with marginalized groups: What we know, what\u2019s needed, and what\u2019s next","volume":"6","author":"Sannon","year":"2022","journal-title":"Proceedings of the ACM on Human-Computer Interaction"},{"key":"2025070715370487600_bib135","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.295","article-title":"Beyond fair pay: Ethical implications of NLP crowdsourcing","volume-title":"North American Chapter of the Association for Computational Linguistics","author":"Shmueli","year":"2021"},{"key":"2025070715370487600_bib136","doi-asserted-by":"publisher","first-page":"1","DOI":"10.23919\/TMA58422.2023.10199005","article-title":"An analysis of war impact on ukrainian critical infrastructure through network measurements","volume-title":"2023 7th Network Traffic Measurement and Analysis Conference 
(TMA)","author":"Singla","year":"2023"},{"key":"2025070715370487600_bib137","doi-asserted-by":"publisher","DOI":"10.4135\/9781412990127.n26","article-title":"Science, technology, and the military","author":"Smit","year":"1995","journal-title":"Handbook of Science and Technology Studies. Thousand Oaks (Ca.): Sage"},{"key":"2025070715370487600_bib138","doi-asserted-by":"publisher","first-page":"3724","DOI":"10.18653\/v1\/2021.naacl-main.291","article-title":"Universal adversarial attacks with natural triggers for text classification","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Song","year":"2021"},{"key":"2025070715370487600_bib139","first-page":"519","article-title":"Using random perturbations to mitigate adversarial attacks on sentiment analysis models","volume-title":"Proceedings of the 18th International Conference on Natural Language Processing (ICON)","author":"Swenor","year":"2021"},{"key":"2025070715370487600_bib140","article-title":"Investigation finds AI image generation models trained on child abuse","author":"Thiel","year":"2023"},{"key":"2025070715370487600_bib141","article-title":"The devil\u2019s triangle: Ethical considerations on developing bot detection methods","volume-title":"2016 AAAI Spring Symposium Series","author":"Thieltges","year":"2016"},{"key":"2025070715370487600_bib142","doi-asserted-by":"publisher","DOI":"10.1515\/9783110310832.105","article-title":"The tibetic languages and their classification","author":"Tournadre","year":"2013"},{"key":"2025070715370487600_bib143","article-title":"Llama: Open and efficient foundation language models","author":"Touvron","year":"2023"},{"key":"2025070715370487600_bib144","article-title":"Llama 2: Open foundation and fine-tuned chat 
models","author":"Touvron","year":"2023"},{"key":"2025070715370487600_bib145","doi-asserted-by":"publisher","first-page":"129","DOI":"10.18653\/v1\/2023.findings-acl.10","article-title":"Layerwise universal adversarial attack on NLP models","volume-title":"Findings of the Association for Computational Linguistics: ACL 2023","author":"Tsymboi","year":"2023"},{"key":"2025070715370487600_bib146","article-title":"Recommendation on the ethics of artificial intelligence","author":"UNESCO","year":"2021"},{"key":"2025070715370487600_bib147","article-title":"Department of justice announces new policy for charging cases under the computer fraud and abuse act","author":"U. S. Department of Justice","year":"2022"},{"key":"2025070715370487600_bib148","doi-asserted-by":"publisher","first-page":"2153","DOI":"10.18653\/v1\/D19-1221","article-title":"Universal adversarial triggers for attacking and analyzing NLP","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)","author":"Wallace","year":"2019"},{"key":"2025070715370487600_bib149","doi-asserted-by":"publisher","first-page":"4515","DOI":"10.18653\/v1\/2024.naacl-long.254","article-title":"Backdoor attacks on multilingual machine translation","volume-title":"Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)","author":"Wang","year":"2024"},{"key":"2025070715370487600_bib150","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.naacl-long.254","article-title":"Backdoor attack on multilingual machine translation","author":"Wang","year":"2024"},{"key":"2025070715370487600_bib151","doi-asserted-by":"publisher","first-page":"1808","DOI":"10.18653\/v1\/2024.emnlp-main.107","article-title":"DA3: A distribution-aware adversarial attack against language 
models","volume-title":"Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing","author":"Wang","year":"2024"},{"key":"2025070715370487600_bib152","doi-asserted-by":"publisher","first-page":"2757","DOI":"10.18653\/v1\/2023.acl-long.155","article-title":"RMLM: A flexible defense framework for proactively mitigating word-level adversarial attacks","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Wang","year":"2023"},{"key":"2025070715370487600_bib153","article-title":"Ethical and social risks of harm from language models","author":"Weidinger","year":"2021"},{"issue":"3410","key":"2025070715370487600_bib154","doi-asserted-by":"publisher","first-page":"1355","DOI":"10.1126\/science.131.3410.1355","article-title":"Some moral and technical consequences of automation: As machines learn they may develop unforeseen strategies at rates that baffle their programmers.","volume":"131","author":"Wiener","year":"1960","journal-title":"Science"},{"key":"2025070715370487600_bib155","article-title":"Bloom: A 176b-parameter open-access multilingual language model","author":"Workshop","year":"2023"},{"key":"2025070715370487600_bib156","doi-asserted-by":"publisher","first-page":"8116","DOI":"10.18653\/v1\/2024.acl-long.441","article-title":"Acquiring clean language models from backdoor poisoned datasets by downscaling frequency space","volume-title":"Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Zongru","year":"2024"},{"key":"2025070715370487600_bib157","doi-asserted-by":"publisher","first-page":"2038","DOI":"10.18653\/v1\/2021.emnlp-main.154","article-title":"Reconstruction attack on instance encoding for language understanding","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language 
Processing","author":"Xie","year":"2021"},{"key":"2025070715370487600_bib158","doi-asserted-by":"publisher","first-page":"172","DOI":"10.18653\/v1\/2022.naacl-srw.22","article-title":"Differentially private instance encoding against privacy attacks","volume-title":"Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop","author":"Xie","year":"2022"},{"key":"2025070715370487600_bib159","doi-asserted-by":"publisher","first-page":"587","DOI":"10.18653\/v1\/2022.naacl-main.43","article-title":"A word is worth a thousand dollars: Adversarial attack on tweets fools stock prediction","volume-title":"Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Xie","year":"2022"},{"key":"2025070715370487600_bib160","doi-asserted-by":"publisher","first-page":"7054","DOI":"10.18653\/v1\/2022.findings-emnlp.523","article-title":"Weight perturbation as defense against adversarial word substitutions","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2022","author":"Jianhan","year":"2022"},{"key":"2025070715370487600_bib161","doi-asserted-by":"publisher","first-page":"6473","DOI":"10.18653\/v1\/2024.naacl-long.360","article-title":"LinkPrompt: Natural and universal adversarial attacks on prompt-based language models","volume-title":"Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)","author":"Yue","year":"2024"},{"key":"2025070715370487600_bib162","doi-asserted-by":"publisher","first-page":"4078","DOI":"10.18653\/v1\/2021.naacl-main.321","article-title":"Grey-box adversarial attack and defence for sentiment classification","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for 
Computational Linguistics: Human Language Technologies","author":"Ying","year":"2021"},{"key":"2025070715370487600_bib163","doi-asserted-by":"crossref","first-page":"12951","DOI":"10.18653\/v1\/2023.acl-long.725","article-title":"BITE: Textual backdoor attacks with iterative trigger injection","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Yan","year":"2023"},{"key":"2025070715370487600_bib164","doi-asserted-by":"publisher","first-page":"8365","DOI":"10.18653\/v1\/2021.emnlp-main.659","article-title":"RAP: Robustness-Aware Perturbations for defending against backdoor attacks on NLP models","volume-title":"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing","author":"Yang","year":"2021"},{"key":"2025070715370487600_bib165","doi-asserted-by":"publisher","first-page":"5543","DOI":"10.18653\/v1\/2021.acl-long.431","article-title":"Rethinking stealthiness of backdoor attack against NLP models","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Yang","year":"2021"},{"key":"2025070715370487600_bib166","first-page":"3937","article-title":"CINO: A Chinese minority pre-trained language model","volume-title":"Proceedings of the 29th International Conference on Computational Linguistics","author":"Yang","year":"2022"},{"key":"2025070715370487600_bib167","doi-asserted-by":"publisher","first-page":"5339","DOI":"10.18653\/v1\/2024.findings-acl.317","article-title":"BadActs: A universal backdoor defense in the activation space","volume-title":"Findings of the Association for Computational Linguistics: ACL 2024","author":"Yi","year":"2024"},{"key":"2025070715370487600_bib168","article-title":"Low-resource languages jailbreak 
gpt-4","author":"Yong","year":"2024"},{"key":"2025070715370487600_bib169","doi-asserted-by":"publisher","first-page":"72","DOI":"10.18653\/v1\/2022.emnlp-main.6","article-title":"Backdoor attacks in federated learning by rare embeddings and gradient ensembling","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Ki","year":"2022"},{"key":"2025070715370487600_bib170","doi-asserted-by":"publisher","first-page":"12499","DOI":"10.18653\/v1\/2023.findings-emnlp.833","article-title":"Large language models are better adversaries: Exploring generative clean-label backdoor attacks against text classifiers","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2023","author":"You","year":"2023"},{"key":"2025070715370487600_bib171","doi-asserted-by":"publisher","first-page":"556","DOI":"10.18653\/v1\/2024.naacl-long.31","article-title":"Query-efficient textual adversarial example generation for black-box attacks","volume-title":"Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)","author":"Zhen","year":"2024"},{"key":"2025070715370487600_bib172","doi-asserted-by":"publisher","first-page":"6066","DOI":"10.18653\/v1\/2020.acl-main.540","article-title":"Word-level textual adversarial attacking as combinatorial optimization","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Zang","year":"2020"},{"key":"2025070715370487600_bib173","doi-asserted-by":"publisher","first-page":"363","DOI":"10.18653\/v1\/2021.acl-demo.43","article-title":"OpenAttack: An open-source textual adversarial attack toolkit","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System 
Demonstrations","author":"Zeng","year":"2021"},{"key":"2025070715370487600_bib174","doi-asserted-by":"publisher","first-page":"13189","DOI":"10.18653\/v1\/2024.emnlp-main.732","article-title":"BEEAR: Embedding-based adversarial removal of safety backdoors in instruction-tuned language models","volume-title":"Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing","author":"Yi","year":"2024"},{"key":"2025070715370487600_bib175","doi-asserted-by":"publisher","first-page":"454","DOI":"10.18653\/v1\/2021.acl-short.58","article-title":"An empirical study on adversarial attack on NMT: Languages and positions matter","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)","author":"Zeng","year":"2021"},{"key":"2025070715370487600_bib176","doi-asserted-by":"publisher","first-page":"355","DOI":"10.18653\/v1\/2022.findings-emnlp.26","article-title":"Fine-mixing: Mitigating backdoors in fine-tuned language models","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2022","author":"Zhang","year":"2022"},{"key":"2025070715370487600_bib177","doi-asserted-by":"publisher","first-page":"339","DOI":"10.18653\/v1\/2022.findings-emnlp.25","article-title":"Dim-krum: Backdoor-resistant federated learning for NLP with dimension-wise krum-based aggregation","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2022","author":"Zhang","year":"2022"},{"key":"2025070715370487600_bib178","doi-asserted-by":"publisher","first-page":"9963","DOI":"10.18653\/v1\/2023.findings-acl.632","article-title":"FedPETuning: When federated learning meets the parameter-efficient tuning methods of pre-trained language models","volume-title":"Findings of the Association for Computational Linguistics: ACL 
2023","author":"Zhang","year":"2023"},{"key":"2025070715370487600_bib179","first-page":"1251","article-title":"Random smooth-based certified defense against text adversarial attack","volume-title":"Findings of the Association for Computational Linguistics: EACL 2024","author":"Zhang","year":"2024"},{"key":"2025070715370487600_bib180","doi-asserted-by":"publisher","first-page":"12303","DOI":"10.18653\/v1\/2023.emnlp-main.757","article-title":"Prompt as triggers for backdoor attack: Examining the vulnerability in language models","volume-title":"Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing","author":"Zhao","year":"2023"},{"key":"2025070715370487600_bib181","doi-asserted-by":"publisher","first-page":"5459","DOI":"10.18653\/v1\/2023.findings-acl.337","article-title":"TextObfuscator: Making pre-trained language model a privacy protector via obfuscating word representations","volume-title":"Findings of the Association for Computational Linguistics: ACL 2023","author":"Zhou","year":"2023"},{"key":"2025070715370487600_bib182","doi-asserted-by":"publisher","first-page":"5482","DOI":"10.18653\/v1\/2021.acl-long.426","article-title":"Defense against synonym substitution-based adversarial attacks via Dirichlet neighborhood ensemble","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Yi","year":"2021"},{"key":"2025070715370487600_bib183","article-title":"Universal and transferable adversarial attacks on aligned language models","author":"Zou","year":"2023","journal-title":"arXiv preprint arXiv:2307.15043"}],"container-title":["Transactions of the Association for Computational 
Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00762\/2535130\/tacl_a_00762.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00762\/2535130\/tacl_a_00762.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,7]],"date-time":"2025-07-07T19:37:19Z","timestamp":1751917039000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/doi\/10.1162\/tacl_a_00762\/131583\/NLP-Security-and-Ethics-in-the-Wild"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025]]},"references-count":183,"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00762","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025]]},"published":{"date-parts":[[2025]]}}}