Tree Search Distillation for Language Models Using PPO