tf.distribute 101: Training Keras on Multiple Devices and Machines
Content Overview
Introduction
Setup
Single-host, multi-device synchronous training
Using callbacks to ensure fault tolerance
tf.data performance tips
Multi-worker distributed synchronous training
Exa...