The state of GenAI safety.
Enterprises are racing generative AI into customer-facing roles. The security model it brings is unlike anything traditional software taught us. Here's what we see across the field — and what credible assurance actually requires.
Why GenAI breaks differently
Traditional software is deterministic: the same input gives the same output, so one good test is reassuring. Generative AI is probabilistic — the same prompt can be handled safely in one run and produce a breach in the next. An attack that looks blocked on the first attempt can succeed on the third, fifth or tenth. Single-try validation creates false confidence, and that is precisely where most teams are exposed.
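To make the arithmetic behind that concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that every attempt is independent and has the same per-attempt success probability; real systems will not match this exactly. The function name and the 10% figure are hypothetical, not measured values.

```python
def p_at_least_one_success(p_single: float, attempts: int) -> float:
    """Chance that an attack lands at least once in `attempts` tries.

    Assumption (illustrative only): attempts are independent and each
    succeeds with the same probability `p_single`.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # Probability that every attempt is blocked, subtracted from 1.
    return 1.0 - (1.0 - p_single) ** attempts


if __name__ == "__main__":
    # Hypothetical 10% per-attempt success rate.
    for n in (1, 3, 5, 10):
        print(f"{n:>2} attempts: {p_at_least_one_success(0.10, n):.1%}")
```

Under these assumptions, an attack that is blocked nine times out of ten on any single try still gets through at least once in roughly 65% of ten-attempt runs, which is why a single passing test says little.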
Where it fails most
- Prompt injection & jailbreaks — user input overrides the system's instructions and intended role.
- System-prompt & secret leakage — the model reveals internal instructions, keys or configuration.
- Privacy & personal-data exfiltration — among the most effective attack classes; models can be coaxed into disclosing sensitive data.
- Out-of-policy generation — the assistant is steered into content or actions it should refuse.
- The multilingual gap — guardrails tuned and benchmarked in English routinely underperform in other languages, where many users actually are.
What credible assurance looks like
Adversarial
Real attacks against the live system, not checklists or self-assessment.
Repeated & sampled
Each attack is run many times, because risk is probabilistic, not one-shot.
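As a rough sketch of what repeated sampling can look like in practice, the Python below runs one attack many times and reports the observed breach rate together with a 95% Wilson upper bound. Everything here is an assumption made for illustration: `attack` is a hypothetical callable that returns True when a breach occurs, and `make_flaky_target` is a stand-in for a live system, not a real client.

```python
import math
import random
from typing import Callable


def wilson_upper(breaches: int, trials: int, z: float = 1.96) -> float:
    """Upper end of the Wilson score interval for the breach rate."""
    p_hat = breaches / trials
    denom = 1 + z * z / trials
    center = (p_hat + z * z / (2 * trials)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)
    ) / denom
    return min(1.0, center + half)


def run_attack(attack: Callable[[], bool], trials: int) -> dict:
    """Run `attack` `trials` times and summarise how often it breached."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    breaches = sum(1 for _ in range(trials) if attack())
    return {
        "trials": trials,
        "breaches": breaches,
        "rate": breaches / trials,
        "upper_95": wilson_upper(breaches, trials),
    }


def make_flaky_target(breach_probability: float, seed: int) -> Callable[[], bool]:
    """Hypothetical stand-in for a live system that breaches at random."""
    rng = random.Random(seed)
    return lambda: rng.random() < breach_probability


if __name__ == "__main__":
    print(run_attack(make_flaky_target(0.1, seed=7), trials=50))
```

The upper bound matters most when no breach is observed: under this sketch, zero breaches in 20 runs still leaves a 95% upper bound of about 16% on the true breach rate, so a clean small sample is weak evidence.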
Multilingual
Tests authored natively in every language you serve, not just English.
Mapped & continuous
Findings are mapped to the OWASP Top 10 for LLM Applications, the EU AI Act and the NIST AI RMF, and the tests are re-run on every change.
This page reflects NexusFinLabs' own view of publicly discussed GenAI security research and our engagement experience. It is general guidance, not legal advice.