Recurrent Models: Decoding Faster with Lower Latency and Higher Throughput
:::info
Authors:
(1) Soham De, Google DeepMind, with equal contributions;
(2) Samuel L. Smith, Google DeepMind, with equal contributions;
(3) Anushan Fernando, Google DeepMind, with equal con...