How to deal with highly imbalanced data?

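  1. Data:
    • undersampling the majority class
    • oversampling the minority class
    • SMOTE (Synthetic Minority Over-sampling Technique), which interpolates new minority samples between nearest neighbours
    • other synthetically generated minority samples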
  2. Model:
    • class weights inversely proportional to the number of samples in each class
    • large batches, so that each batch contains at least a few positive samples
    • monitor precision and recall, not accuracy
    • focal loss, which down-weights easy, well-classified examples
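
A few of the items above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names, the 95/5 toy label split, and the defaults gamma=2.0 and alpha=0.25 are assumptions chosen for the example. The weight formula n_samples / (n_classes * class_count) is one common "balanced" heuristic, and the oversampler simply duplicates minority samples at random (it is not SMOTE).

```python
import math
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def random_oversample(samples, labels, seed=0):
    """Duplicate minority samples at random until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(xs + extra)
        out_y.extend([y] * target)
    return out_x, out_y


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one example: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p = min(max(p, eps), 1.0 - eps)          # clip to avoid log(0)
    p_t = p if y == 1 else 1.0 - p           # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive class (0.0 when undefined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (assumed for illustration): 95 negatives, 5 positives.
labels = [0] * 95 + [1] * 5
weights = inverse_frequency_weights(labels)   # the rare class gets the larger weight

# A model that always predicts the majority class is 95% accurate
# but never finds a positive, which precision and recall expose.
always_negative = [0] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, always_negative)) / len(labels)
precision, recall = precision_recall(labels, always_negative)
```

On this toy split the positive class ends up with a weight about 19 times that of the negative class, and the always-negative predictor scores high accuracy with zero recall, which is why accuracy alone is a misleading metric here.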
