Flight Delay Prediction

This project predicts flight departure delays using the Bureau of Transportation Statistics on-time performance dataset and weather data provided by NOAA. 11 gigabytes of data were cleaned, explored, and feature-engineered with Apache Spark to build a gradient-boosted tree model that predicts departure delay with a precision of 92% and a recall of 86%.

Full Project Notebook
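
For readers less familiar with the two reported metrics, here is a minimal sketch of how precision and recall are derived from confusion-matrix counts, and how they combine into an F1 score. It assumes "delayed" is the positive class, which the summary does not state explicitly. The function names are hypothetical and are not taken from the project notebook; only the 92% and 86% figures come from the summary above.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from confusion-matrix counts.

    tp: flights predicted delayed that were delayed
    fp: flights predicted delayed that were on time
    fn: flights predicted on time that were delayed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# F1 implied by the reported precision (0.92) and recall (0.86).
reported_f1 = f1_score(0.92, 0.86)
print(f"F1 implied by the reported metrics: {reported_f1:.3f}")
```

Read this way, a precision of 92% means that 92 of every 100 flights flagged as delayed were actually delayed, and a recall of 86% means that the model caught 86 of every 100 flights that were actually delayed.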