This post is not specifically about LLMs, though?
That’s what people have been pointing out. The 60 hours of training should have been a dead giveaway.
I hope the neurons use a logistic activation function. If it’s a saturating linear one, the result will still be full of surprises.
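For anyone unfamiliar with the two functions, here is a rough sketch of the difference. It assumes "saturating linear" means the clamp-to-[0, 1] function (as in MATLAB's `satlin`); the function names and sample inputs are mine, purely for illustration, and nothing here comes from the post itself.

```python
import math


def logistic(x: float) -> float:
    """Smooth sigmoid 1 / (1 + e^-x); output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def saturating_linear(x: float) -> float:
    """Piecewise-linear clamp: 0 below 0, x on [0, 1], 1 above 1 (assumed definition)."""
    return min(1.0, max(0.0, x))


def slope(f, x: float, h: float = 1e-6) -> float:
    """Central-difference estimate of the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    # Compare values and slopes at a few hypothetical inputs.
    for x in (-3.0, 0.5, 3.0):
        print(
            f"x={x:+.1f}  "
            f"logistic={logistic(x):.4f} (slope {slope(logistic, x):.4f})  "
            f"satlin={saturating_linear(x):.4f} (slope {slope(saturating_linear, x):.4f})"
        )
```

The logistic keeps a small nonzero slope everywhere, while the clamp goes completely flat once the input leaves [0, 1].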