Yayun Jin, Ph.D., ML Engineer at Reddit | Ex-Microsoft & Workday | Mentoring 200+ Engineers into ML Roles

EDA

In this segment, we explore the importance of Exploratory Data Analysis (EDA) in machine learning interviews, especially during notebook-based exercises. This is a key moment to distinguish ourselves by demonstrating analytical depth and real-world intuition.

  • We move beyond checking for missing values and printing basic stats; stopping there signals a surface-level approach.
  • Instead, we analyze feature distributions, identifying skewness, outliers, and unusual patterns that may affect modeling.
  • We examine relationships and interactions between variables, going beyond pairwise correlations to explore potential dependencies and dynamics.
  • We assess data quality, including duplicates, inconsistent categorical values, and signs of data leakage.
  • This deeper EDA shows that we can interrogate the data critically, a trait that interviewers associate with strong, production-ready ML practitioners.
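
The checks listed above can be sketched in a few lines of pandas. This is a minimal illustration, not a prescribed interview answer: the `eda_report` helper, the sample columns (`income`, `city`, `refund_issued`, `churned`), the 1.5 × IQR outlier rule, and the 0.95 correlation cutoff for leakage suspects are all assumptions chosen for the example, and it assumes a numeric target.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def eda_report(df: pd.DataFrame, target: str) -> dict:
    """Collect a few EDA signals beyond missing-value counts (hypothetical helper)."""
    # Numeric features only, with the target column set aside.
    numeric = df.select_dtypes(include="number").drop(columns=[target], errors="ignore")

    # Distribution shape: large absolute skew often suggests a transform.
    skew = numeric.skew()

    # Outlier counts per feature using the 1.5 * IQR rule (an assumed convention).
    q1, q3 = numeric.quantile(0.25), numeric.quantile(0.75)
    iqr = q3 - q1
    outlier_counts = ((numeric < q1 - 1.5 * iqr) | (numeric > q3 + 1.5 * iqr)).sum()

    # Data quality: exact duplicate rows.
    n_duplicates = int(df.duplicated().sum())

    # Data quality: non-numeric columns whose distinct values collapse after
    # trimming whitespace and lowercasing, e.g. "NYC" vs "nyc ".
    inconsistent = {}
    for col in df.columns:
        if is_numeric_dtype(df[col]):
            continue
        raw = df[col].dropna().astype(str)
        normalized = raw.str.strip().str.lower()
        if normalized.nunique() < raw.nunique():
            inconsistent[col] = sorted(raw.unique())

    # Leakage hint: features almost perfectly correlated with the target.
    # The 0.95 cutoff is an arbitrary threshold for this sketch.
    corr = numeric.corrwith(df[target]).abs()
    leakage_suspects = corr[corr > 0.95].index.tolist()

    return {
        "skew": skew,
        "outlier_counts": outlier_counts,
        "n_duplicates": n_duplicates,
        "inconsistent_categoricals": inconsistent,
        "leakage_suspects": leakage_suspects,
    }


# Made-up sample data, for illustration only.
df = pd.DataFrame({
    "income": [40, 42, 45, 41, 43, 44, 46, 400, 42, 42],
    "city": ["NYC", "nyc ", "Boston", "Boston", "NYC",
             "Boston", "NYC", "Boston", "nyc ", "nyc "],
    "refund_issued": [0, 1, 0, 1, 0, 1, 0, 1, 1, 1],
    "churned": [0, 1, 0, 1, 0, 1, 0, 1, 1, 1],
})

report = eda_report(df, target="churned")
for name, value in report.items():
    print(f"{name}:\n{value}\n")
```

In an interview notebook, the value lies less in the numbers themselves than in what we say about them: why a skewed feature might need a log transform, whether an outlier is an error or a real customer, and why a feature that tracks the target almost perfectly probably would not be available at prediction time.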
