rdtand
v5: max-not-sum sibling aggregation, kernel shape mask, joint input_global; validator: ppl=4.16, mean_NLL=1.43, MTP P0=89.5%
09de726 verified