BRIGHTSURF

Examination of Large Language Model "red-teaming" defines it as a non-malicious team-effort activity to seek LLMs' limits and identifies 35 different techniques used to test them

01.15.25 | PLOS

Summon a demon and bind it: A grounded theory of LLM red teaming

Naming the activity with an image. Answer to the question “What do you call this activity?” (Promptmancer, “A portrait of a promptmancer in the Lab” by feddie xtzeth, https://objkt.com/asset/KT1EEMp7Z2Dk2vKGYLYuJJiJgTdNSzsnGUyd/0). Promptmancer shows a character whose face resembles a black skull with red eyes, sitting at a table with slightly raised hands and seemingly manipulating abstract shapes and figures on the wall in front of them without physical touch. The piece has a distinct science fanta…

Credit: Inie et al., 2025, PLOS One, CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

Article URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0314658

Article title: Summon a demon and bind it: A grounded theory of LLM red teaming

Author countries: US, Denmark

Funding: VILLUM Foundation, grant No. 37176: ATTiKA: Adaptive Tools for Technical Knowledge Acquisition. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Keywords

Applied Sciences and Engineering, Information Technology, Technology

Article Information

Journal

PLOS ONE

DOI

10.1371/journal.pone.0314658

Article Publication Date

2025-01-15

Article Title

Summon a demon and bind it: A grounded theory of LLM red teaming

COI Statement

The authors have declared that no competing interests exist.

Contact Information

Hanna Abdallah

PLOS

onepress@plos.org

How to Cite This Article

APA:

PLOS. (2025, January 15). Examination of Large Language Model "red-teaming" defines it as a non-malicious team-effort activity to seek LLMs' limits and identifies 35 different techniques used to test them. Brightsurf News. https://www.brightsurf.com/news/1EOJNROL/examination-of-large-language-model-red-teaming-defines-it-as-a-non-malicious-team-effort-activity-to-seek-llms-limits-and-identifies-35-different-techniques-used-to-test-them.html

MLA:

"Examination of Large Language Model "red-teaming" defines it as a non-malicious team-effort activity to seek LLMs' limits and identifies 35 different techniques used to test them." Brightsurf News, Jan. 15 2025, https://www.brightsurf.com/news/1EOJNROL/examination-of-large-language-model-red-teaming-defines-it-as-a-non-malicious-team-effort-activity-to-seek-llms-limits-and-identifies-35-different-techniques-used-to-test-them.html.
