How do language fashions deal with conflicting directions in a immediate?
Cognitive dissonance is a psychological time period describing the psychological discomfort skilled by a person holding two or extra contradictory beliefs. For instance, should youโre on the grocery retailer and see a checkout lane for โ10 gadgets or fewerโ however everybody within the line has 10 or extra gadgets, what are you alleged to do?
Throughout the context AI, I needed to understand how massive language fashions (LLMs) cope with cognitive dissonance within the type of contradictory directions (for instance, prompting an LLM to translate from English to Korean, however giving examples of English to French translations).
On this article, I conduct experiments by offering LLMs with contradictory info to determine which of the contradictory info LLMs usually tend to align with.
As a person, you possibly can inform an LLM what to do in one in all 3 ways:
- Instantly describing the duty within the system message
- Instantly describing the duty within the regularโฆ