Hello, I saw in the paper that different instruction nodes can be fused using self-attention. Can you explain the process in detail?