• corbin@awful.systems
    link
    fedilink
    English
    arrow-up
    0
    ·
    13 days ago

    On a theoretical basis, this family of text-smuggling attacks can’t be prevented. Indeed, the writeup for the Copilot version, which Microsoft appears to have mitigated, suggested that some filtering of forbidden Unicode would be much easier than some fundamental fix. The underlying confusable deputy is still there and core to the product as advertised. On one hand, Google is right; it’s only exploitable via social engineering or capability misuse. On the other hand, social engineering and capability misuse are big problems!

    This sort of confused-deputy attack is really common in distributed applications whenever an automatic process is doing something on behalf of a human. The delegation of any capability to a chatbot is always going to lead to possible misuse because of one of the central maxims of capability security: the ability to invoke a capability is equivalent to the permission to invoke it. Also, in terms of linguistics and narremes, it is well-known that merely mentioning that a capability exists will greatly raise the probability that the chatbot chooses to invoke it, not unlike how a point-and-click game might provoke a player into trying every item at every opportunity. I’ll close with a quote from that Copilot writeup:

    Automatic Tool Invocation is problematic as long as there are no fixes for prompt injection as an adversary can invoke tools that way and (1) bring sensitive information into the prompt context and (2) probably also invoke actions.