Breakdowns, Bottlenecks & Buses: A Data-Driven Look at NYC School Transit Delays
OBJECTIVE: Analyze breakdown and delay patterns across New York City school buses to uncover common causes, identify peak delay times, and recommend actionable improvements for operational efficiency.
BACKGROUND: The New York City Department of Education relies on a vast network of contracted bus providers to transport students safely and on time. However, recurring delays and mechanical issues affect service reliability. This analysis examines incident-level records of bus breakdowns and delays to answer key operational questions:
What are the most common reasons for delays?
How do delay times vary by bus company and depot?
Is there a correlation between specific days of the week and delay frequency?
TECH STACK:
Microsoft Excel: End-to-end data analysis and visualization
Excel Functions:
IFERROR
,VALUE
,TEXT
,AVERAGE
,YEAR
,COUNTIF
Pivot Tables: Used to aggregate by vendor, weekday, and delay reason
Bar & Line Charts: Visualized breakdown trends, delay time ranges, and day-of-week frequencies
Filtering & Logical Categorization: Applied to classify short vs. long delays and flag breakdowns
PROCESS:
Data Cleaning:
◇ Removed rows with missing dates, delay reasons, or invalid placeholders
◇ Standardized formatting for timestamps and numeric fields
◇ Used Excel formulas to deriveDay_of_Week
, delay duration categories, and binary breakdown flags
◇ Categorized delay types as “Running Late” or “Breakdown”
◇ CleanedHow_Long_Delayed
field to enable numerical aggregation of short and long delay estimatesTrend Analysis & Visualization:
◇ Created pivot tables summarizing delays by reason, vendor, and weekday
◇ Built charts to show frequency and duration of delays across operators
◇ Analyzed day-of-week patterns and company-specific performance
KEY INSIGHTS:
Common Causes: Heavy traffic and mechanical issues are the top delay contributors, with mechanical issues frequently resulting in full breakdowns
Delay Time Variability: Vendors differed in performance - some averaged delays over 60 minutes while others maintained shorter average estimates under 45 minutes
Weekday Patterns: Mondays and Fridays experienced the most delays, hinting at transitional issues at the beginning and end of the week
Operational Gaps: Companies with longer delays and frequent breakdowns may benefit from preventive maintenance policies or contractual review
CHALLENGES & SOLUTIONS:
⚠️ Challenge 1: Inconsistent Delay Time Categories
✅ Solution: Reclassified using logical groupings (e.g., short vs. long) for consistent analysis
⚠️ Challenge 2: Default or Invalid Timestamps (e.g., 1900-01-01)
✅ Solution: Filtered and flagged using Excel’s IFERROR
and logical filters
⚠️ Challenge 3: Raw Data Was Not Analysis-Ready
✅ Solution: Built calculated columns and used pivot tables to create structured, meaningful views
DATA SOURCES:
Dataset: School_Bus_Breakdown_and_delays_in_NYC.xlsx
Source: The dataset was sourced from opendata.cityofnewyork.us
Timeframe: Multi-year records from 2016 to 2023
DATA DICTIONARY:
Occurred_On: Date and time of the delay or breakdown
Reason: Description of the issue causing the delay
Breakdown_or_Running_Late: Binary label for full breakdown vs. typical delay
Short_Delay_Time_Estimate: Estimated duration of the delay in minutes
Long_Delay_Time_Estimate: Estimated duration of the delay in minutes
Bus Company Name: Operator responsible for the affected vehicle
Depot: Dispatch location of the bus
School_Age_or_PreK
: Indicates student age group on board
VISUAL WALKTHROUGH: Each visualization below corresponds to the key questions above, now brought to life through data storytelling.
Click on any image to enlarge it to full screen.
Findings:
The majority of breakdowns are due to mechanical issues, including instances where buses "won't start," which also appears to be a mechanical problem.
Recommendation:
Increase preventative maintenance to reduce mechanical failures. Consider scheduling bus servicing on a monthly or quarterly basis.
Findings:
The primary cause of delays is due to traffic.
Potential Solution:
Adjust bus routes and/or pickup times to mitigate traffic-related delays.
Note: Nearly 14,000 delays are due to mechanical issues. Implementing a preventative maintenance schedule will help reduce instances of buses running late.
Findings:
These are the top ten companies with the highest delay times.
Recommendation:
Use this data to compare the number of delays, boroughs, and total number of trips.
Note: Implementing a preventative maintenance schedule could help reduce delay times. Additionally, consider whether replacing the bus company with a more reliable option is necessary.
Findings:
The boroughs with the longest delays are in the main areas of New York City, where the highest concentrations of people and vehicles contribute to prolonged wait times.
Findings:
The largest number of delays occur on Mondays, with a significant decrease by Friday.
Conclusion:
Since most delays are caused by traffic, we can infer that the decline on Fridays is due to fewer people commuting to work or driving.
Findings:
There is a gradual decrease in breakdowns throughout the week, with a higher percentage occurring on Monday and declining as the week progresses.
Recommendations:
Schedule maintenance over the weekends to reduce breakdowns.