Anthropic Study Highlights AI Models Can ‘Pretend’ to Have Different Views During Training
2024-12-19 12:37:00 Anthropic has published a new study finding that artificial intelligence (AI) models can pretend to hold different views during training while retaining their original preferences. On…