Examination of Large Language Model "red-teaming" defines it as a non-malicious team-effort activity to seek LLMs' limits and identifies 35 different techniques used to test them

01.15.25 | PLOS


Article URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0314658

Article title: Summon a demon and bind it: A grounded theory of LLM red teaming

Author countries: US, Denmark

Funding: VILLUM Foundation, grant No. 37176: ATTiKA: Adaptive Tools for Technical Knowledge Acquisition. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Journal: PLOS ONE

DOI: 10.1371/journal.pone.0314658

The authors have declared that no competing interests exist.

Contact Information

Hanna Abdallah
PLOS
onepress@plos.org

How to Cite This Article

APA:
PLOS. (2025, January 15). Examination of Large Language Model "red-teaming" defines it as a non-malicious team-effort activity to seek LLMs' limits and identifies 35 different techniques used to test them. Brightsurf News. https://www.brightsurf.com/news/1EOJNROL/examination-of-large-language-model-red-teaming-defines-it-as-a-non-malicious-team-effort-activity-to-seek-llms-limits-and-identifies-35-different-techniques-used-to-test-them.html
MLA:
"Examination of Large Language Model 'red-teaming' defines it as a non-malicious team-effort activity to seek LLMs' limits and identifies 35 different techniques used to test them." Brightsurf News, 15 Jan. 2025, https://www.brightsurf.com/news/1EOJNROL/examination-of-large-language-model-red-teaming-defines-it-as-a-non-malicious-team-effort-activity-to-seek-llms-limits-and-identifies-35-different-techniques-used-to-test-them.html.