pera on Feb 17, 2023 \| on: We Found an Neuron in GPT-2

I don't understand why your conclusion is that "the model must be thinking beyond the next token": the model doesn't need to do that to generate a well-formed sentence because it's not constrained by the size of the sentence.