New Jailbreak Attacks Uncovered in LLM chatbots like ChatGPT

by admin | Jul 19, 2023 | News

LLMs have reshaped content generation, yet understanding jailbreak attacks and the techniques used to prevent them remains challenging. Surprisingly, there is little public disclosure of the countermeasures employed in commercial LLM-based chatbot services.

To bridge these knowledge gaps, cybersecurity analysts from the following universities conducted a practical study that comprehensively examines jailbreak mechanisms across diverse LLM chatbots and assesses the effectiveness of existing jailbreak attacks:

  • Nanyang Technological University
  • University of New South Wales
  • Huazhong University of Science and Technology
  • Virginia Tech 

The experts evaluate popular LLM chatbots (ChatGPT, Bing Chat, and Bard) by testing their responses to previously researched jailbreak prompts. The study reveals that OpenAI’s chatbots are vulnerable to existing jailbreak prompts, while Bard and Bing Chat exhibit greater resistance.
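
As a rough illustration of this kind of assessment, the sketch below feeds a set of previously published jailbreak prompts to several chatbot clients and records how often the reply is not a refusal. The query callables and the keyword-based refusal check are assumptions made for illustration, not the study's actual tooling.

```python
# Minimal sketch of a jailbreak-evaluation harness (illustrative only).
# Each chatbot is represented by a hypothetical callable that takes a prompt
# and returns the service's reply; the refusal check is a crude keyword heuristic.
from typing import Callable, Dict, List

REFUSAL_MARKERS = ["i'm sorry", "i cannot", "i can't", "as an ai"]

def looks_like_refusal(response: str) -> bool:
    """Treat common refusal phrases as a blocked attempt."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

def evaluate(prompts: List[str],
             chatbots: Dict[str, Callable[[str], str]]) -> Dict[str, float]:
    """Return, per chatbot, the fraction of prompts that were NOT refused."""
    results = {}
    for name, query in chatbots.items():
        bypassed = sum(not looks_like_refusal(query(p)) for p in prompts)
        results[name] = bypassed / len(prompts)
    return results
```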

LLM Jailbreak

To fortify jailbreak defenses in LLMs, the security researchers recommend the following measures:

  • Augmenting ethical and policy-based measures
  • Refining moderation systems
  • Incorporating contextual analysis
  • Implementing automated stress testing

Their contributions can be summarized as follows:

  • Reverse-Engineering Undisclosed Defenses
  • Bypassing LLM Defenses
  • Automated Jailbreak Generation
  • Jailbreak Generalization Across Patterns and LLMs

[Figure: A jailbreak attack]

A jailbreak exploits prompt manipulation to bypass the usage-policy measures of an LLM chatbot, enabling it to generate responses and malicious content that violate the chatbot’s own policies.

Jailbreaking a chatbot involves crafting a prompt that conceals a malicious question and pushes the model past its protection boundaries. By framing the request as a simulated experiment, for example, a jailbreak prompt can manipulate the LLM into generating responses that could potentially aid in malware creation and distribution.

Time-based LLM Testing

The experts conduct a comprehensive analysis by abstracting LLM chatbot services into a structured model comprising an LLM-based generator and a content moderator. This practical abstraction captures the essential dynamics without requiring in-depth knowledge of the service internals.

[Figure: Abstraction of an LLM chatbot]
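
A minimal sketch of that abstraction, assuming the service can be treated as a generator plus a moderator that may screen the input question, the generated output, or both; all class and method names here are hypothetical:

```python
# Sketch of the generator + content-moderator abstraction (hypothetical names).
from dataclasses import dataclass
from typing import Callable

@dataclass
class ChatbotService:
    generate: Callable[[str], str]          # the underlying LLM
    moderate_input: Callable[[str], bool]   # True if the question is blocked
    moderate_output: Callable[[str], bool]  # True if the answer is blocked

    def answer(self, question: str) -> str:
        if self.moderate_input(question):
            return "REFUSED (input filter)"
        response = self.generate(question)
        if self.moderate_output(response):
            return "REFUSED (output filter)"
        return response
```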

Uncertainties remain in the abstracted black-box system, including:

  • Content moderator’s input question monitoring
  • LLM-generated data stream monitoring
  • Post-generation output checks
  • Content moderator mechanisms

[Figure: The proposed LLM time-based testing strategy]
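
The time-based idea can be illustrated roughly as follows: if a refusal comes back much faster than a normal answer, the block likely happened at the input check, whereas a refusal that takes about as long as full generation points to a post-generation check. The sketch below, built around a hypothetical chatbot callable, shows that comparison; it is a simplification, not the researchers' exact procedure.

```python
# Rough sketch of time-based probing of a black-box moderator (hypothetical API).
import time
from typing import Callable, Tuple

def timed_query(chatbot: Callable[[str], str], prompt: str) -> Tuple[str, float]:
    """Send a prompt and measure the wall-clock time until the reply arrives."""
    start = time.monotonic()
    reply = chatbot(prompt)
    return reply, time.monotonic() - start

def guess_filter_stage(chatbot: Callable[[str], str],
                       benign_prompt: str,
                       probe_prompt: str) -> str:
    """Compare the refusal latency of a probe against a benign baseline answer."""
    _, baseline = timed_query(chatbot, benign_prompt)
    reply, elapsed = timed_query(chatbot, probe_prompt)
    if "refus" not in reply.lower() and "sorry" not in reply.lower():
        return "not blocked"
    if elapsed < 0.5 * baseline:
        return "likely input-side filter (fast refusal)"
    return "likely post-generation filter (refusal after full generation time)"
```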

Workflow

The security analysts’ workflow emphasizes preserving the original semantics of the initial jailbreak prompt in each transformed variant, reflecting the design rationale.

[Figure: Overall workflow]

The complete methodology consists of the following phases:

  • Dataset Building and Augmentation
  • Continuous Pretraining and Task Tuning
  • Reward Ranked Fine Tuning

The analysts leverage LLMs to automatically generate successful jailbreak prompts using a methodology based on text-style transfer in NLP.

Utilizing a fine-tuned LLM, their automated pipeline expands the range of prompt variants by infusing domain-specific jailbreaking knowledge.
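
In spirit, such a pipeline can be sketched as a loop that asks a rewriter model for semantics-preserving rephrasings of seed prompts, keeps the variants that get past the target chatbots, and feeds those successes into the next round (and, in the study's framing, into further fine-tuning). The rewrite and is_bypassed helpers below are hypothetical placeholders, not the authors' implementation.

```python
# Conceptual sketch of the automated variant-generation loop (hypothetical helpers).
from typing import Callable, List

def expand_prompts(seeds: List[str],
                   rewrite: Callable[[str], List[str]],
                   is_bypassed: Callable[[str], bool],
                   rounds: int = 3) -> List[str]:
    """Iteratively rewrite seed prompts and keep the variants that succeed.

    rewrite(): a fine-tuned LLM producing style-transferred rephrasings
               that preserve the seed prompt's semantics.
    is_bypassed(): True if a target chatbot answered instead of refusing.
    """
    successful: List[str] = []
    pool = list(seeds)
    for _ in range(rounds):
        variants = [v for p in pool for v in rewrite(p)]
        winners = [v for v in variants if is_bypassed(v)]
        successful.extend(winners)
        # Successful variants seed the next round; in the study's spirit they
        # would also serve as training data for further fine-tuning the rewriter.
        pool = winners or pool
    return successful
```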

In addition, the cybersecurity researchers mainly used GPT-3.5, GPT-4, and Vicuna (an open-source chatbot billed as “impressing GPT-4”) as benchmarks in this analysis.

The analysis evaluates mainstream LLM chatbot services and highlights their vulnerability to jailbreak attacks. It introduces JAILBREAKER, a novel framework that analyzes defenses and generates universal jailbreak prompts with a 21.58% success rate.

The findings and recommendations were responsibly shared with the providers, enabling more robust safeguards against the abuse of LLMs.

Read the full article here
