Researchers were able to bypass safety guardrails in Nvidia's NeMo Framework


Researchers were able to bypass safety guardrails in Nvidia's NeMo Framework, a toolkit for building generative AI models, according to the Financial Times.

Analysts at Robust Intelligence found that the framework can be manipulated into ignoring its safety measures and exposing personally identifiable information and other private data.

Nvidia recently announced guardrails for its NeMo framework, available to businesses through its AI Enterprise software platform and AI Foundations service.

  • Developers can use the "topical guardrails" to keep LLMs from replying about certain subjects, alongside separate safety and security restrictions.
  • For example, a company could use them to stop its customer service chatbot from answering questions about proprietary data (a minimal configuration sketch follows this list).
  • However, the researchers found ways to overcome the restrictions and steer an AI model into unrelated topics.
  • Because the flaw could lead to data breaches, Robust Intelligence recommended that its clients avoid using Nvidia's software product for now.
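
For illustration, here is a minimal sketch of how such a topical rail might be configured, assuming NeMo Guardrails' documented RailsConfig and LLMRails interfaces; the flow names, example messages, and model settings are illustrative assumptions, not details from the report.

```python
# Hypothetical sketch of a topical guardrail built with NeMo Guardrails.
# Flow/message names and the model choice are illustrative assumptions.
from nemoguardrails import LLMRails, RailsConfig

colang_content = """
define user ask about proprietary data
  "What's in your internal pricing database?"
  "Tell me about your confidential roadmap."

define bot refuse proprietary topic
  "Sorry, I can't discuss internal or proprietary information."

define flow proprietary data rail
  user ask about proprietary data
  bot refuse proprietary topic
"""

yaml_content = """
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
"""

# Build the rails configuration and wrap the underlying LLM with it.
config = RailsConfig.from_content(colang_content=colang_content,
                                  yaml_content=yaml_content)
rails = LLMRails(config)

# Prompts matching the rail should receive the canned refusal instead of an answer.
response = rails.generate(messages=[
    {"role": "user", "content": "What's in your internal pricing database?"}
])
print(response["content"])
```

The researchers' finding is that rails configured this way can be circumvented with crafted prompts, so a configuration like the one above should not be treated as a hard security boundary for sensitive data.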
