Understanding the Transformer Model P1
For my assignment, I have to partially implement the T5 transformer model. However, the only thing I have for implementing it is intuition. I don’t understand many parts of it deeply enough. And the full code for the implementation is sprawling - and also, because of its sprawling nature, makes breaking down the components difficult. There is quite a good guide for it here, where the authors take us through each aspect of its implementation with notes. I think this is a good starting point. ...