Predicting Flight Delays Between BWI and EWR: A Machine Learning Analysis of Weather, Network Effects, and Operational Factors

Machine Learning
Aviation
Flight Delays
Authors

Jonathan Wilson

Karan [Last Name]

Irena [Last Name]

Val [Last Name]

Published

March 13, 2026

1 Abstract

Flight delays remain one of the most persistent operational challenges in modern aviation, affecting airline efficiency, passenger satisfaction, and the broader transportation network. In this study, we examine flight delays within the BWI–EWR market pair using machine learning, exploratory data analysis, and engineered operational features.

Using Bureau of Transportation Statistics (BTS) flight records enriched with weather and temporal features, we construct predictors related to schedule timing, prior aircraft movement, weather conditions, and airport-level activity. We evaluate multiple supervised learning approaches, including logistic regression, ridge-penalized logistic regression, classification trees, random forests, and K-nearest neighbors.

Our findings suggest that delay behavior is shaped by a combination of operational and environmental effects, especially upstream aircraft delay propagation, airport congestion patterns, and weather-related variables. Tree-based approaches and penalized regression provide the most stable predictive performance, while simpler baseline models remain valuable for interpretation.


2 Introduction & Background

2.1 Motivation

Flight delays represent one of the most visible operational challenges in modern aviation. A single disruption can ripple across aircraft rotations, crews, passengers, and airports, creating cascading effects throughout the network.

Understanding how and why delays propagate is critical for improving airline efficiency and schedule reliability for passengers.

2.2 Data and Methods

Aircraft registry features were initially considered. However, only a 2025 FAA registry snapshot was available, while the flight dataset covers 2019 operations, so registry attributes might not reflect the state of each aircraft at the time of the flight. The analysis therefore focuses on operational variables derived directly from the flight dataset, such as aircraft rotation features, turnaround time, and previous-flight delays, which more directly capture delay propagation mechanisms.

2.3 Why the BWI–EWR Market Pair

The BWI–EWR corridor sits within one of the most operationally complex airspaces in the United States. Newark Liberty International Airport (EWR) operates within the highly congested New York airspace, while Baltimore/Washington International Thurgood Marshall Airport (BWI) serves as a key mid-Atlantic hub.

Flights between these airports are vulnerable to multiple sources of disruption:

  • upstream aircraft delays
  • weather disruptions
  • airspace congestion
  • operational scheduling constraints

2.4 What This Research Contributes

This study contributes a route-focused machine learning framework for airline delay analysis. By combining exploratory analysis, feature engineering, classification modeling, and model comparison, we provide a systematic approach for identifying the drivers of flight delays.


3 Data

3.1 Dataset Overview

This analysis uses flight operations data from the Bureau of Transportation Statistics (BTS). These records contain detailed flight-level information including:

  • scheduled departure and arrival times
  • actual departure and arrival times
  • delay categories
  • carrier information
  • origin and destination airports

The dataset is enriched with weather variables and engineered temporal features.

3.2 Limitations

Several limitations should be noted:

  • BTS delay categories are aggregated and sometimes coarse
  • weather observations may not perfectly represent local flight conditions
  • route-level analysis cannot capture all national network effects
  • airline operational decisions such as crew scheduling are not directly observed

Despite these limitations, the dataset provides a strong foundation for studying delay patterns.


4 Feature Engineering

4.1 Delay Indicator

To simplify classification, we define a binary delay variable:

\[ \text{Delayed}_i = \begin{cases} 1 & \text{if } \text{DepartureDelay}_i > 15 \text{ minutes} \\ 0 & \text{otherwise} \end{cases} \]
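
As a minimal sketch of this labeling rule, the Python snippet below marks a flight as delayed when its departure delay exceeds 15 minutes. The function name `is_delayed`, the handling of missing values, and the sample delays are illustrative assumptions rather than part of the study's pipeline.

```python
from typing import Optional

DELAY_THRESHOLD_MIN = 15  # threshold from the definition above, in minutes


def is_delayed(departure_delay_min: Optional[float]) -> Optional[int]:
    """Return 1 if the delay exceeds the threshold, 0 otherwise.

    Missing delays (for example, cancelled flights) are returned as None
    rather than silently coded as on time. This handling is an assumption.
    """
    if departure_delay_min is None:
        return None
    return 1 if departure_delay_min > DELAY_THRESHOLD_MIN else 0


# Hypothetical departure delays in minutes (negative means early).
sample_delays = [-4.0, 0.0, 15.0, 16.0, 92.0, None]
labels = [is_delayed(d) for d in sample_delays]
print(labels)  # [0, 0, 0, 1, 1, None]
```

Under the strict inequality, a flight that departs exactly 15 minutes late is labeled 0.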

4.2 Prior Aircraft Delay

Upstream delays are captured by the arrival delay of the same aircraft's previous flight:

\[ \text{PriorDelay}_i = \text{ArrivalDelay}_{\text{previous}} \]
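
One way to construct this feature is to sort each aircraft's flights chronologically and carry the previous leg's arrival delay forward. The sketch below assumes each record carries a tail number, a scheduled departure timestamp, and an arrival delay in minutes; the field names (`tail`, `sched_dep`, `arr_delay`) and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical flight records; field names are illustrative assumptions.
flights = [
    {"tail": "N101", "sched_dep": datetime(2019, 3, 1, 6, 0), "arr_delay": 5.0},
    {"tail": "N101", "sched_dep": datetime(2019, 3, 1, 9, 30), "arr_delay": 42.0},
    {"tail": "N101", "sched_dep": datetime(2019, 3, 1, 13, 0), "arr_delay": 20.0},
    {"tail": "N202", "sched_dep": datetime(2019, 3, 1, 7, 15), "arr_delay": -3.0},
]


def add_prior_delay(records):
    """Attach `prior_delay`: the arrival delay of the same aircraft's previous leg."""
    by_tail = defaultdict(list)
    for rec in records:
        by_tail[rec["tail"]].append(rec)

    out = []
    for legs in by_tail.values():
        legs.sort(key=lambda r: r["sched_dep"])  # chronological order per aircraft
        previous = None
        for rec in legs:
            # The first observed leg of an aircraft has no upstream flight.
            prior = previous["arr_delay"] if previous is not None else None
            out.append({**rec, "prior_delay": prior})
            previous = rec
    return out


enriched = add_prior_delay(flights)
for rec in enriched:
    print(rec["tail"], rec["sched_dep"].strftime("%H:%M"), rec["prior_delay"])
```

The first leg observed for each aircraft has no upstream flight, so its prior delay is left missing rather than set to zero; how such rows are treated downstream is a separate modeling choice.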

4.3 Turnaround Time

Turnaround time captures the ground-time buffer between an aircraft's previous arrival and its next departure:

\[ \text{Turnaround}_i = \text{DepartureTime}_i - \text{PreviousArrivalTime}_i \]
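
The subtraction above can be sketched with Python's standard `datetime` module. The function name and the timestamps below are hypothetical; the sketch assumes full date-and-time values at the same airport, so that overnight stays are handled by the date component.

```python
from datetime import datetime


def turnaround_minutes(prev_arrival: datetime, departure: datetime) -> float:
    """Minutes on the ground between the previous arrival and this departure."""
    return (departure - prev_arrival).total_seconds() / 60.0


# Hypothetical times at the same airport.
prev_arrival = datetime(2019, 3, 1, 8, 50)
departure = datetime(2019, 3, 1, 9, 35)
same_day = turnaround_minutes(prev_arrival, departure)
print(same_day)  # 45.0

# An overnight stay produces a long turnaround instead of a negative value,
# because full timestamps (not clock times alone) are subtracted.
overnight = turnaround_minutes(datetime(2019, 3, 1, 22, 40), datetime(2019, 3, 2, 6, 10))
print(overnight)  # 450.0
```

If only clock times (such as 2240 and 0610) were subtracted, the overnight case would come out negative, so a date rollover check would be needed.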

4.4 Weather Severity

Weather features combine several meteorological indicators:

\[ \text{WeatherSeverity}_i = f(\text{wind}_i, \text{visibility}_i, \text{precipitation}_i, \text{storm indicators}_i) \]

4.5 Temporal Features

Time-based indicators include:

  • departure hour
  • day of week
  • month
  • holiday flags

These help capture cyclical delay patterns.
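
Because hour of day and month wrap around (hour 23 is adjacent to hour 0), one common option is to encode them as sine and cosine pairs so that distance-based models see the wrap-around. The sketch below is an illustrative assumption, not necessarily the encoding used in this study; the function names are hypothetical.

```python
import math


def cyclical_encode(value: float, period: float) -> tuple:
    """Map a periodic value onto the unit circle as (sin, cos)."""
    angle = 2.0 * math.pi * value / period
    return (math.sin(angle), math.cos(angle))


def circle_distance(a: float, b: float, period: float) -> float:
    """Euclidean distance between two encoded values."""
    return math.dist(cyclical_encode(a, period), cyclical_encode(b, period))


# Hours 23 and 0 are one hour apart, and so are hours 11 and 12.
late_vs_midnight = circle_distance(23, 0, period=24)
noon_neighbors = circle_distance(11, 12, period=24)
print(round(late_vs_midnight, 4), round(noon_neighbors, 4))
```

With the raw hour value, 23 and 0 would appear 23 units apart; after encoding, both one-hour gaps are the same distance.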


5 Exploratory Data Analysis

5.1 Distribution of Delays

# Histogram of departure delays
# ggplot(df, aes(x = DepDelayMinutes)) +
#   geom_histogram(bins = 50)

5.2 Delays by Time of Day

# Delay frequency by departure hour

5.3 Weather vs Delay

# Weather severity vs delay outcome

5.4 Delay Propagation

# Previous arrival delay vs next departure delay

6 Research Questions

This study seeks to answer the following questions:

  1. Which operational and weather variables most strongly predict flight delays?
  2. How strongly do delays propagate from prior flights?
  3. Which machine learning models best classify delayed flights?
  4. Can unsupervised methods reveal underlying structure in delay patterns?

7 Building Classifiers

7.1 Logistic Regression

We begin with a baseline logistic regression model.

\[ P(Y_i = 1|X_i) = \frac{1}{1 + e^{-\eta_i}} \]

\[ \eta_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip} \]

# logistic regression model
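
To make the two equations concrete, the snippet below evaluates the linear predictor and the logistic function for a single flight. The coefficient values and the choice of features are made up for illustration and are not estimates from this study.

```python
import math


def predict_probability(intercept, coefs, features):
    """P(Y = 1 | X) for a logistic model: sigmoid of the linear predictor."""
    eta = intercept + sum(b * x for b, x in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients for (prior delay in minutes, turnaround in minutes).
intercept = -1.0
coefs = [0.03, -0.01]

late_inbound_short_turn = predict_probability(intercept, coefs, [60.0, 35.0])
on_time_inbound_long_turn = predict_probability(intercept, coefs, [0.0, 120.0])
print(round(late_inbound_short_turn, 3), round(on_time_inbound_long_turn, 3))
```

With these illustrative coefficients, a late inbound aircraft and a short turnaround raise the predicted delay probability, while a long turnaround lowers it.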

7.2 Ridge Logistic Regression

To address multicollinearity and improve coefficient stability, we apply ridge (L2) regularization to the logistic regression model.

# ridge logistic regression
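
For reference, the standard ridge-penalized logistic objective can be written with the symbols from Section 7.1 as shown below. The tuning parameter \(\lambda \ge 0\) and the sample size \(n\) are notation introduced here, the intercept \(\beta_0\) is conventionally left unpenalized, and software packages may scale the two terms differently.

\[ \hat{\beta} = \arg\min_{\beta} \left\{ -\sum_{i=1}^{n} \left[ Y_i \eta_i - \log\left(1 + e^{\eta_i}\right) \right] + \lambda \sum_{j=1}^{p} \beta_j^2 \right\} \]

Larger values of \(\lambda\) shrink the coefficients toward zero, which stabilizes the estimates when predictors are correlated; \(\lambda = 0\) recovers the unpenalized model of Section 7.1.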

7.3 Classification Tree

Decision trees provide interpretable classification rules.

# classification tree
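
Classification trees are typically grown by choosing, at each node, the split that most reduces an impurity measure such as the Gini index. The sketch below scores candidate thresholds on a single feature; the data and the choice of Gini impurity are illustrative assumptions, since the splitting criterion is not specified here.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Return (threshold, weighted impurity) for the best split `value <= threshold`."""
    best = (None, float("inf"))
    for t in sorted(set(values))[:-1]:  # splitting at the maximum leaves one side empty
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical prior arrival delays (minutes) and delay labels.
prior_delay = [0, 5, 10, 20, 45, 60, 90]
delayed = [0, 0, 0, 0, 1, 1, 1]
threshold, impurity = best_threshold(prior_delay, delayed)
print(threshold, impurity)  # 20 0.0
```

The resulting rule ("prior delay greater than 20 minutes") is the kind of readable condition that makes a single tree easy to interpret.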

7.4 Random Forest

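Random forests improve on a single tree by averaging the predictions of many trees, each grown on a bootstrap sample with a random subset of predictors considered at each split. This reduces variance at the cost of some interpretability.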

# random forest model

7.5 K-Nearest Neighbors

KNN classifies each flight by majority vote among its k nearest observations in feature space.

# knn model
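
A minimal from-scratch sketch of this idea follows, assuming Euclidean distance and majority voting; the training points and the value k = 3 are hypothetical. Because KNN relies on distances, features measured on different scales would normally be standardized first, a step this sketch omits.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the majority label among the k nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features: (prior delay in minutes, turnaround in minutes).
train_x = [(0, 90), (5, 75), (10, 60), (50, 40), (70, 35), (95, 30)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_x, train_y, (65, 38)))  # 1
print(knn_predict(train_x, train_y, (3, 80)))   # 0
```

An odd k avoids tied votes in the binary case.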

8 Unsupervised Analysis

8.1 Principal Component Analysis

PCA identifies the orthogonal directions along which the features vary most, summarizing correlated predictors in a smaller number of components.

# PCA analysis
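
As a rough illustration of what the first principal component represents, the sketch below estimates it by power iteration on the sample covariance matrix. The data are hypothetical, the features are centered but not standardized, and a real analysis would more likely rely on an established library routine.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit direction, share of variance explained) for the top component."""
    n, d = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    # Sample covariance matrix of the centered features.
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d  # starting vector; assumed not orthogonal to the top component
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance


# Hypothetical (prior delay, departure delay) pairs in minutes.
rows = [(0, 1), (10, 19), (20, 41), (30, 62), (40, 80)]
direction, explained = first_principal_component(rows)
print([round(x, 3) for x in direction], round(explained, 4))
```

When two features move together this closely, a single component captures nearly all of the variance, which is the sense in which PCA finds a dominant dimension.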

8.2 K-Means Clustering

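K-means clustering partitions flight observations into k groups by minimizing within-cluster squared distance, which can reveal natural groupings in delay behavior.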

# kmeans clustering
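
The alternating structure of k-means (Lloyd's algorithm) can be sketched in a few lines. The points, the choice of k = 2, and the fixed seed are hypothetical; results on real data depend on initialization and feature scaling.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k distinct data points
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment, centroids


# Hypothetical (prior delay, departure delay) pairs in minutes.
points = [(0, 2), (3, 0), (5, 4), (60, 55), (70, 62), (65, 58)]
assignment, centroids = kmeans(points, k=2)
print(assignment)
```

On this toy data the three near-on-time flights and the three heavily delayed flights end up in separate clusters; which cluster receives label 0 or 1 depends on the random initialization.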

9 Model Comparison

We compare models using metrics such as:

  • accuracy
  • precision
  • recall
  • F1 score

# model comparison table
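
The four metrics above follow directly from the confusion-matrix counts. The sketch below computes them for a hypothetical set of predictions and for a naive baseline that never predicts a delay; the labels are invented for illustration and are not results from this study.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = delayed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or observed.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 3 delayed flights out of 10.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
model_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
never_delayed = [0] * len(y_true)

model_scores = classification_metrics(y_true, model_pred)
baseline_scores = classification_metrics(y_true, never_delayed)
print(model_scores)
print(baseline_scores)
```

In this toy example the baseline reaches 70% accuracy while detecting no delayed flights at all, which is why precision, recall, and F1 are reported alongside accuracy when delayed flights are the minority class.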

10 Results

10.1 Key Findings

The analysis suggests that delay outcomes are influenced by several interacting factors:

  • upstream aircraft delays
  • airport congestion patterns
  • weather conditions
  • temporal scheduling effects

10.2 Operational Interpretation

Predictive models can help identify flights at elevated risk of delay before departure, enabling airlines and planners to anticipate disruptions.


11 Discussion & Limitations

Several limitations should be considered:

  • missing operational factors such as crew and maintenance constraints
  • imperfect weather measurements
  • route-specific findings that may not generalize

Despite these challenges, the framework provides a replicable approach for studying delay behavior.


12 Conclusion

This study demonstrates how machine learning methods can be applied to understand delay dynamics in the BWI–EWR market pair.

Tree-based methods and penalized regression provide the best balance of predictive performance and interpretability. The analysis highlights the importance of upstream delay propagation and temporal operational patterns.

Future work could extend this framework to larger airline networks and incorporate time-series modeling approaches.


13 References