Anthropic regularly publishes research papers on the subject and details different methods they use to prevent misalignment/jailbreaks/etc. And it's not even about fear of being sued, but needing to deliver some level of resilience and stability for real enterprise use cases. I think there's a pretty clear profit incentive for safer models.
Anthropic is investing, conservatively, $100+ billion in AI infrastructure and development. A 20-person research team could put out several papers a year. That would cost them what, $5 million a year, or about five thousandths of one percent (0.005%) of that total? They don't have to spend much to get that kind of output.
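As a quick sanity check on that ratio, here is a minimal sketch of the arithmetic. The $5 million and $100 billion figures are the rough guesses from the comment above, not reported numbers, and `share_percent` is just a helper name made up for illustration.

```python
def share_percent(part: float, total: float) -> float:
    """Return `part` as a percentage of `total`."""
    return part / total * 100


# Rough figures from the comment above (guesses, not reported numbers).
research_cost = 5e6        # ~$5 million per year for a 20-person team
total_investment = 100e9   # ~$100 billion in infrastructure and development

# Cost per researcher implied by those guesses.
per_researcher = research_cost / 20

print(f"Share of total: {share_percent(research_cost, total_investment):.3f}%")
print(f"Implied cost per researcher: ${per_researcher:,.0f}")
```

Note that this compares a yearly cost against a total investment figure, so it is only an order-of-magnitude comparison.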
Not to be cynical about it BUT a few safety papers a year, with proper support, is totally within the capabilities of a single PhD student, and it costs about $100-150k a year to fund one through a university. Not saying that’s what Anthropic does, I’m just saying it's chump change for those companies.
Sometimes I think people misunderstand how hard a problem AI safety actually is. It's politics and mathematics wrapped up in a black box of interactions we barely understand.
Moreover, we train them on human behavior, and humans have a lot of rather unstable behaviors.
> You are very off (unfortunately) about how little PhD students are being paid
All-in costs for a PhD student include university overheads and tuition fees. The total probably doesn't hit $150k, but it is 2-3x the stipend that the student receives.
Someone currently working in academia might have current figures to hand.
Worth mentioning that numbers for the US are unlikely to be representative when discussing PhD funding as a whole, though they might be relevant to this specific case.
In the UK, the all-in cost of a PhD student starts somewhere around £45k once you include overheads, I believe. If you need expensive lab support then it probably goes up from there.
So about $75k for the bottom end? The quoted numbers sound about right in PPP terms in that case.
What do you base this on?
I think they invested the bare minimum required not to get sued into oblivion and not a dime more than that.