Predicting Flight Delays Between BWI and EWR: A Machine Learning Analysis of Weather, Network Effects, and Operational Factors
1 Abstract
Flight delays remain one of the most persistent operational challenges in modern aviation, affecting airline efficiency, passenger satisfaction, and the broader transportation network. In this study, we examine flight delays within the BWI–EWR market pair using machine learning, exploratory data analysis, and engineered operational features.
Using Bureau of Transportation Statistics (BTS) flight records enriched with weather and temporal features, we construct predictors related to schedule timing, prior aircraft movement, weather conditions, and airport-level activity. We evaluate multiple supervised learning approaches, including logistic regression, ridge-penalized logistic regression, classification trees, random forests, and K-nearest neighbors.
Our findings suggest that delay behavior is shaped by a combination of operational and environmental effects, especially upstream aircraft delay propagation, airport congestion patterns, and weather-related variables. Tree-based approaches and penalized regression provide the most stable predictive performance, while simpler baseline models remain valuable for interpretation.
2 Introduction & Background
2.1 Motivation
Flight delays represent one of the most visible operational challenges in modern aviation. A single disruption can ripple across aircraft rotations, crews, passengers, and airports, creating cascading effects throughout the network.
Understanding the causes of delay propagation is critical for improving airline efficiency and passenger reliability.
2.2 Data and Methods
Aircraft registry features were initially considered, but because only a 2025 FAA registry snapshot was available while the flight dataset represents 2019 operations, these features could introduce temporal inconsistencies. Therefore, the analysis focuses primarily on operational variables derived directly from the flight dataset, such as aircraft rotation features, turnaround time, and previous-flight delays, which more directly capture delay propagation mechanisms.
2.3 Why the BWI–EWR Market Pair
The BWI–EWR corridor sits within one of the most operationally complex airspaces in the United States. Newark Liberty International Airport operates within the highly congested New York airspace, while BWI serves as a key mid-Atlantic hub.
Flights between these airports are vulnerable to multiple sources of disruption:
- upstream aircraft delays
- weather disruptions
- airspace congestion
- operational scheduling constraints
2.4 What This Research Contributes
This study contributes a route-focused machine learning framework for airline delay analysis. By combining exploratory analysis, feature engineering, classification modeling, and model comparison, we provide a systematic approach for identifying the drivers of flight delays.
3 Data
3.1 Dataset Overview
This analysis uses flight operations data from the Bureau of Transportation Statistics (BTS). These records contain detailed flight-level information including:
- scheduled departure and arrival times
- actual departure and arrival times
- delay categories
- carrier information
- origin and destination airports
The dataset is enriched with weather variables and engineered temporal features.
3.2 Limitations
Several limitations should be noted:
- BTS delay categories are aggregated and sometimes coarse
- weather observations may not perfectly represent local flight conditions
- route-level analysis cannot capture all national network effects
- airline operational decisions such as crew scheduling are not directly observed
Despite these limitations, the dataset provides a strong foundation for studying delay patterns.
4 Feature Engineering
4.1 Delay Indicator
To simplify classification, we define a binary delay variable:
\[ Delayed_i = \begin{cases} 1 & \text{if } DepartureDelay_i > 15 \text{ minutes} \\ 0 & \text{otherwise} \end{cases} \]
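A minimal sketch of this indicator on toy data follows. The column name `DepDelayMinutes` and the sample values are assumptions made for illustration; they are not taken from the study's dataset.

```r
# Toy data; DepDelayMinutes is an assumed column holding departure delay in minutes.
df <- data.frame(DepDelayMinutes = c(0, 5, 15, 16, 42, NA))

# Delayed = 1 when the departure delay exceeds 15 minutes, 0 otherwise.
# Missing delays stay NA rather than being silently coded as on time.
df$Delayed <- as.integer(df$DepDelayMinutes > 15)

print(df)
```

Note that a delay of exactly 15 minutes is coded as 0 under the strict inequality used above.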
4.2 Prior Aircraft Delay
Upstream delays are captured by the arrival delay of the same aircraft's previous flight:
\[ PriorDelay_i = ArrivalDelay_{previous} \]
4.3 Turnaround Time
Turnaround time captures the buffer between flights:
\[ Turnaround_i = DepartureTime_i - PreviousArrivalTime_i \]
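The two rotation features above could be derived by ordering flights within each aircraft and lagging by one leg. The sketch below uses base R and toy data; the column names (`Tail_Number`, `DepTime`, `ArrTime`, `ArrDelay`) and the helper `lag1` are assumptions for illustration, and turnaround is expressed in minutes.

```r
# Toy rotation data for two aircraft (all values are made up for illustration).
flights <- data.frame(
  Tail_Number = c("N101", "N101", "N101", "N202", "N202"),
  DepTime = as.POSIXct(c("2019-03-01 06:00", "2019-03-01 09:10", "2019-03-01 13:00",
                         "2019-03-01 07:30", "2019-03-01 11:00"), tz = "UTC"),
  ArrTime = as.POSIXct(c("2019-03-01 08:20", "2019-03-01 11:45", "2019-03-01 15:30",
                         "2019-03-01 09:40", "2019-03-01 13:05"), tz = "UTC"),
  ArrDelay = c(25, 10, -5, 0, 40)
)

# Sort by aircraft, then chronologically within each aircraft.
flights <- flights[order(flights$Tail_Number, flights$DepTime), ]

# Hypothetical helper: shift a numeric vector down by one position.
lag1 <- function(x) c(NA, head(x, -1))

# PriorDelay: arrival delay of the same aircraft's previous leg.
flights$PriorDelay <- ave(flights$ArrDelay, flights$Tail_Number, FUN = lag1)

# Turnaround: minutes between the previous arrival and this departure.
prev_arr <- ave(as.numeric(flights$ArrTime), flights$Tail_Number, FUN = lag1)
flights$Turnaround <- (as.numeric(flights$DepTime) - prev_arr) / 60

print(flights[, c("Tail_Number", "PriorDelay", "Turnaround")])
```

The first leg of each aircraft has no predecessor in the data, so both features are `NA` for that row; how such rows are handled (dropped or imputed) is a modeling choice.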
4.4 Weather Severity
Weather features combine several meteorological indicators:
\[ WeatherSeverity = f(wind, visibility, precipitation, storm\ indicators) \]
4.5 Temporal Features
Time-based indicators include:
- departure hour
- day of week
- month
- holiday flags
These help capture cyclical delay patterns.
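As a rough illustration, the indicators listed above can be extracted from a scheduled departure timestamp with base R. The timestamps, the `holidays` vector (a deliberately partial, hypothetical list), and the output column names are assumptions.

```r
# Toy scheduled departure times, stored as UTC for simplicity.
dep <- as.POSIXct(c("2019-07-04 06:15", "2019-07-05 17:40", "2019-11-28 12:05"),
                  tz = "UTC")

# Hypothetical, partial holiday list used only for this example.
holidays <- as.Date(c("2019-07-04", "2019-11-28"))

feat <- data.frame(
  DepHour   = as.integer(format(dep, "%H")),          # hour of day, 0-23
  DayOfWeek = as.POSIXlt(dep)$wday,                   # 0 = Sunday, ..., 6 = Saturday
  Month     = as.integer(format(dep, "%m")),          # 1-12
  IsHoliday = as.integer(as.Date(dep) %in% holidays)  # 1 if the date is in the list
)

print(feat)
```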
5 Exploratory Data Analysis
5.1 Distribution of Delays
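A histogram is a natural first look at the delay distribution. The sketch below uses base R on simulated, right-skewed delays; the data are synthetic and the column name `DepDelayMinutes` is assumed.

```r
set.seed(1)

# Simulated right-skewed delays in minutes (synthetic, not the BTS data).
df <- data.frame(DepDelayMinutes = rexp(500, rate = 1 / 20))

# Base R histogram with roughly 50 bins.
h <- hist(df$DepDelayMinutes, breaks = 50,
          main = "Distribution of departure delays",
          xlab = "Departure delay (minutes)")

# With ggplot2 installed, a comparable plot would be:
# ggplot(df, aes(x = DepDelayMinutes)) + geom_histogram(bins = 50)
```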
5.2 Delays by Time of Day
# Delay frequency by departure hour
5.3 Weather vs Delay
# Weather severity vs delay outcome
5.4 Delay Propagation
# Previous arrival delay vs next departure delay
6 Research Questions
This study seeks to answer the following questions:
- Which operational and weather variables most strongly predict flight delays?
- How strongly do delays propagate from prior flights?
- Which machine learning models best classify delayed flights?
- Can unsupervised methods reveal underlying structure in delay patterns?
7 Building Classifiers
7.1 Logistic Regression
We begin with a baseline logistic regression model.
\[ P(Y_i = 1|X_i) = \frac{1}{1 + e^{-\eta_i}} \]
\[ \eta_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip} \]
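A baseline of this form can be fit with `glm` and a binomial family. The example below is a sketch on simulated data: the predictors (`PriorDelay`, `DepHour`) and the coefficients used to generate the outcome are assumptions chosen for illustration, not estimates from the study.

```r
set.seed(42)
n <- 400

# Simulated predictors (hypothetical names and distributions).
sim <- data.frame(
  PriorDelay = pmax(rnorm(n, mean = 10, sd = 20), 0),
  DepHour    = sample(5:22, n, replace = TRUE)
)

# Simulated outcome: the linear predictor eta rises with prior delay and hour.
# plogis(eta) is the logistic function 1 / (1 + exp(-eta)).
eta <- -3 + 0.05 * sim$PriorDelay + 0.08 * sim$DepHour
sim$Delayed <- rbinom(n, 1, plogis(eta))

# Fit the logistic regression and recover fitted delay probabilities.
fit <- glm(Delayed ~ PriorDelay + DepHour, data = sim, family = binomial)
p_hat <- predict(fit, type = "response")

print(coef(fit))
```

Each coefficient is a change in the log-odds of delay per unit of the predictor, which is what makes this baseline easy to interpret.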
# logistic regression model
7.2 Ridge Logistic Regression
To address multicollinearity and stabilize the coefficient estimates, we apply ridge regularization, which adds an L2 penalty on the coefficients to the logistic log-likelihood.
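To make the penalty concrete, the sketch below minimizes the penalized negative log-likelihood directly with base R's `optim`. This is an illustrative implementation, not necessarily the one used in the study (the `glmnet` package with `alpha = 0` is a common alternative). The function `ridge_logit`, the feature names, and the simulated data are all assumptions.

```r
set.seed(7)
n <- 300

# Two nearly collinear simulated predictors (hypothetical names).
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)
X <- cbind(PriorDelay = x1, PrevDepDelay = x2)
y <- rbinom(n, 1, plogis(-0.5 + x1))

# Hypothetical helper: ridge-penalized logistic regression via BFGS.
# Features are standardized; the intercept is left unpenalized.
ridge_logit <- function(X, y, lambda) {
  Xs <- cbind(1, scale(X))
  nll <- function(b) {
    eta <- drop(Xs %*% b)
    -sum(y * eta - log1p(exp(eta))) + lambda * sum(b[-1]^2)
  }
  grad <- function(b) {
    p <- plogis(drop(Xs %*% b))
    drop(crossprod(Xs, p - y)) + 2 * lambda * c(0, b[-1])
  }
  optim(rep(0, ncol(Xs)), nll, grad, method = "BFGS")$par
}

b_mle   <- ridge_logit(X, y, lambda = 0)   # no penalty
b_ridge <- ridge_logit(X, y, lambda = 10)  # L2 penalty shrinks the slopes

print(rbind(unpenalized = b_mle, ridge = b_ridge))
```

With collinear predictors the unpenalized slopes can be large and unstable; the penalty pulls them toward zero. In practice `lambda` would be chosen by cross-validation.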
# ridge logistic regression
7.3 Classification Tree
Decision trees provide interpretable classification rules.
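As a sketch of what such rules look like, the example below fits a tree with the `rpart` package (a recommended package shipped with standard R distributions) to simulated data. The features, the rule used to simulate delays, and the class labels are assumptions for illustration.

```r
library(rpart)

set.seed(3)
n <- 500

# Simulated features (hypothetical names and ranges, in minutes).
sim <- data.frame(
  PriorDelay = runif(n, 0, 90),
  Turnaround = runif(n, 30, 180)
)

# Simulated rule: a late inbound aircraft with a short turnaround is usually delayed.
risky <- sim$PriorDelay > 30 & sim$Turnaround < 60
is_delayed <- rbinom(n, 1, ifelse(risky, 0.9, 0.1)) == 1
sim$Delayed <- factor(ifelse(is_delayed, "delayed", "on_time"))

# Fit a classification tree; the printed splits read as if-then rules.
tree <- rpart(Delayed ~ PriorDelay + Turnaround, data = sim, method = "class")
print(tree)

# In-sample accuracy (optimistic; a held-out set would be used in practice).
pred <- predict(tree, sim, type = "class")
acc <- mean(pred == sim$Delayed)
print(acc)
```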
# classification tree
7.4 Random Forest
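Random forests average many trees grown on bootstrap samples, each split considering a random subset of features. This typically reduces variance and improves predictive accuracy relative to a single tree, at some cost to interpretability.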
# random forest model
7.5 K-Nearest Neighbors
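KNN classifies each flight by majority vote among its k nearest observations in feature space. Because the method is distance-based, features should be standardized first.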
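A minimal sketch using `knn` from the `class` package (a recommended package shipped with standard R distributions) is shown below. The simulated data, feature names, the 200/100 train-test split, and `k = 5` are assumptions for illustration.

```r
library(class)

set.seed(11)
n <- 300

# Simulated features and outcome (hypothetical names and relationship).
X <- data.frame(
  PriorDelay = runif(n, 0, 90),
  DepHour    = sample(5:22, n, replace = TRUE)
)
score <- X$PriorDelay + 2 * X$DepHour + rnorm(n, sd = 10)
y <- factor(ifelse(score > 70, "delayed", "on_time"))

# Hold out 100 flights for testing.
train_id <- sample(n, 200)

# Standardize with training-set means and standard deviations only,
# so no information from the test set leaks into the scaling.
mu  <- colMeans(X[train_id, ])
sdv <- apply(X[train_id, ], 2, sd)
Xs  <- scale(X, center = mu, scale = sdv)

# Classify each test flight by majority vote among its 5 nearest neighbors.
pred <- knn(train = Xs[train_id, ], test = Xs[-train_id, ],
            cl = y[train_id], k = 5)

acc <- mean(pred == y[-train_id])
print(acc)
```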
# knn model
8 Unsupervised Analysis
8.1 Principal Component Analysis
PCA projects the standardized features onto orthogonal directions of maximal variance, identifying the dominant dimensions of the feature space.
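The sketch below runs PCA with base R's `prcomp` on simulated features. The feature names and their correlation structure are assumptions; the point is only to show how variance shares and loadings are read off.

```r
set.seed(5)
n <- 200

# Simulated features; two delay variables are deliberately correlated.
prior <- rexp(n, rate = 1 / 20)
X <- data.frame(
  PriorDelay   = prior,
  PrevDepDelay = 0.8 * prior + rnorm(n, sd = 5),
  Turnaround   = runif(n, 30, 180),
  WindSpeed    = rnorm(n, mean = 10, sd = 3)
)

# Center and scale so that no feature dominates because of its units.
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Share of total variance captured by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Loadings of the first two components on the original features.
print(pca$rotation[, 1:2])
```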
# PCA analysis
8.2 K-Means Clustering
K-means partitions flights into k clusters by minimizing the within-cluster sum of squares, which can reveal natural groupings among flight observations.
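As an illustration, the sketch below applies base R's `kmeans` to two simulated groups of flights. The data, the feature names, and the choice of `centers = 2` are assumptions; with real data the number of clusters would need to be chosen, for example by inspecting within-cluster variance across several values of k.

```r
set.seed(9)

# Two simulated groups: small prior delay with long turnaround,
# and large prior delay with short turnaround (synthetic values).
X <- rbind(
  cbind(PriorDelay = rnorm(100, 5, 3),   Turnaround = rnorm(100, 120, 15)),
  cbind(PriorDelay = rnorm(100, 60, 10), Turnaround = rnorm(100, 45, 10))
)

# Standardize, then cluster; nstart repeats the random initialization
# and keeps the best solution.
km <- kmeans(scale(X), centers = 2, nstart = 20)

print(table(km$cluster))

# Cluster means on the original scale, for interpretation.
print(aggregate(as.data.frame(X), by = list(cluster = km$cluster), FUN = mean))
```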
# kmeans clustering
9 Model Comparison
We compare models using metrics such as:
- accuracy
- precision
- recall
- F1 score
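These four metrics all follow from the confusion matrix. The helper below, `classification_metrics`, is a hypothetical function written for illustration, and the label vectors are toy values rather than model output from the study.

```r
# Hypothetical helper: metrics for binary labels coded 1 (delayed) and 0 (on time).
classification_metrics <- function(actual, predicted) {
  tp <- sum(actual == 1 & predicted == 1)  # delayed, predicted delayed
  fp <- sum(actual == 0 & predicted == 1)  # on time, predicted delayed
  fn <- sum(actual == 1 & predicted == 0)  # delayed, predicted on time
  tn <- sum(actual == 0 & predicted == 0)  # on time, predicted on time

  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)

  c(accuracy  = (tp + tn) / length(actual),
    precision = precision,
    recall    = recall,
    f1        = 2 * precision * recall / (precision + recall))
}

# Toy labels for eight flights.
actual    <- c(1, 1, 1, 0, 0, 0, 0, 1)
predicted <- c(1, 0, 1, 0, 1, 0, 0, 1)

m <- classification_metrics(actual, predicted)
print(m)
```

When delayed flights are a minority class, accuracy alone can look strong for a model that rarely predicts a delay, which is why precision, recall, and F1 are reported alongside it.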
# model comparison table
10 Results
10.1 Key Findings
The analysis suggests that delay outcomes are influenced by several interacting factors:
- upstream aircraft delays
- airport congestion patterns
- weather conditions
- temporal scheduling effects
10.2 Operational Interpretation
Predictive models can help identify flights at elevated risk of delay before departure, enabling airlines and planners to anticipate disruptions.
11 Discussion & Limitations
Several limitations should be considered:
- missing operational factors such as crew and maintenance constraints
- imperfect weather measurements
- route-specific findings that may not generalize
Despite these challenges, the framework provides a replicable approach for studying delay behavior.
12 Conclusion
This study demonstrates how machine learning methods can be applied to understand delay dynamics in the BWI–EWR market pair.
Tree-based methods and penalized regression provide the best balance of predictive performance and interpretability. The analysis highlights the importance of upstream delay propagation and temporal operational patterns.
Future work could extend this framework to larger airline networks and incorporate time-series modeling approaches.