World Life Expectancy - Part 2: Exploratory Data Analysis
OBJECTIVE: Conduct exploratory data analysis (EDA) on global life expectancy trends to uncover patterns, correlations, and disparities across countries, years, and socioeconomic factors. This analysis builds on the cleaned dataset prepared in Part 1.
BACKGROUND: Understanding life expectancy requires more than just clean data. It calls for context-driven analysis. This project explores how life expectancy varies by GDP, BMI, development status, and over time. It also investigates which countries saw the greatest or smallest changes in life expectancy over the past 15+ years.
The data cleaning completed in Part 1 was essential for making the dataset suitable for deeper analysis. While this project did not aim to fully answer specific research questions or include visualizations, the cleaned data structure enables future exploration.
This analysis was conducted entirely in SQL, with a focus on demonstrating how key trends and relationships can be explored through queries, aggregations, and statistical breakdowns.
With a reliable dataset in place, questions such as the following can be explored:
How much has life expectancy changed across countries?
What’s the relationship between life expectancy and economic/health factors?
How do developed and developing countries compare?
Are there countries with unusually slow or fast improvements?
TECH STACK:
SQL: All analysis conducted using aggregations, window functions, and filtering
Window Functions: Used SUM() OVER() to analyze cumulative trends in adult mortality
GROUP BY & HAVING Clauses: Applied for segment-level insights and data integrity checks
Ordering & Ranking: Used to surface top/bottom performers in change, GDP, and BMI metrics
PROCESS:
Descriptive Trends:
◇ Calculated each country's min, max, and range in life expectancy
◇ Ranked countries by greatest and smallest improvements in life expectancy over time
◇ Summarized average global life expectancy by year, observing upward trends
Comparative Analysis:
◇ Aggregated average life expectancy by development status (Developed vs. Developing)
◇ Evaluated correlations between GDP and life expectancy, showing positive economic-health links
◇ Assessed relationship between BMI and life expectancy, highlighting extremes
Cumulative Metrics:
◇ Used a window function to compute running (cumulative) adult mortality totals by country over time
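The yearly and status-level summaries described above might look roughly like the sketch below. The table name world_life_expectancy and the snake_case column names (country, year, status, life_expectancy) are assumptions for illustration; the actual schema may use names with spaces, such as "Life expectancy", which would need quoting.

```sql
-- Hypothetical table and column names; adjust to the real schema.

-- Global average life expectancy per year.
SELECT year,
       AVG(life_expectancy) AS avg_life_expectancy
FROM world_life_expectancy
WHERE life_expectancy > 0            -- row-level filter: skip zero placeholders
GROUP BY year
ORDER BY year;

-- Average life expectancy by development status, with the number of
-- distinct countries behind each average.
SELECT status,
       COUNT(DISTINCT country) AS countries,
       AVG(life_expectancy)    AS avg_life_expectancy
FROM world_life_expectancy
WHERE life_expectancy > 0
GROUP BY status;
```

Reporting the country count next to each status average helps show how lopsided the two groups are before reading much into the gap.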
KEY INSIGHTS:
While this project did not perform in-depth statistical validation, the following trends were observed through exploratory queries:
Global Growth: Most countries have experienced measurable improvements in life expectancy since 2000
Top Improvers: Certain countries showed gains of over 20 years in life expectancy
Low Variance: Some countries had minimal change, prompting further investigation into health systems or stability
Economic Correlation: Countries with higher average GDPs tend to have higher life expectancy
BMI Patterns: Nations with extreme average BMI levels often show distinctive life expectancy trends
Development Disparity: Developed countries still hold a significant life expectancy lead over developing nations
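One way the GDP observation could be probed in plain SQL is a two-group comparison, sketched below. The table name world_life_expectancy, the column names (country, gdp, life_expectancy), and the choice of the mean country GDP as the cutoff are all assumptions for illustration, not the project's actual query. Common table expressions require a reasonably recent engine (for example MySQL 8.0 or later).

```sql
-- Hypothetical table and column names; the mean-GDP cutoff is an
-- arbitrary illustrative split, not a value taken from the project.
WITH country_avg AS (
    -- One row per country with its average life expectancy and GDP.
    SELECT country,
           AVG(life_expectancy) AS avg_life_exp,
           AVG(gdp)             AS avg_gdp
    FROM world_life_expectancy
    GROUP BY country
    HAVING AVG(life_expectancy) > 0
       AND AVG(gdp) > 0
),
labeled AS (
    -- Tag each country by whether its average GDP reaches the mean
    -- of all country-level averages.
    SELECT avg_life_exp,
           CASE
               WHEN avg_gdp >= (SELECT AVG(avg_gdp) FROM country_avg)
               THEN 'At or above mean GDP'
               ELSE 'Below mean GDP'
           END AS gdp_group
    FROM country_avg
)
SELECT gdp_group,
       COUNT(*)          AS countries,
       AVG(avg_life_exp) AS mean_life_expectancy
FROM labeled
GROUP BY gdp_group
ORDER BY gdp_group;
```

A group comparison like this only suggests a relationship; it is not a correlation coefficient and says nothing about causation.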
CHALLENGES & SOLUTIONS:
⚠️ Challenge 1: Raw data contained outliers and zero values that distorted aggregates
✅ Solution: Used HAVING clauses to exclude invalid records from analysis
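A HAVING filter of this kind might look like the sketch below, shown here for the BMI comparison. The table name world_life_expectancy and the column names (country, life_expectancy, bmi) are assumed for illustration.

```sql
-- Hypothetical table and column names; adjust to the real schema.
SELECT country,
       AVG(life_expectancy) AS avg_life_exp,
       AVG(bmi)             AS avg_bmi
FROM world_life_expectancy
GROUP BY country
-- Drop countries whose averages are zero, i.e. groups with no usable data.
HAVING AVG(life_expectancy) > 0
   AND AVG(bmi) > 0
ORDER BY avg_bmi DESC;
```

HAVING filters whole groups after aggregation, so an isolated zero inside an otherwise valid country still pulls that country's average down. A WHERE condition on the raw rows would be needed to exclude those individual records.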
⚠️ Challenge 2: Needed to track change over time at a granular level
✅ Solution: Applied MIN(), MAX(), and difference calculations grouped by country
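As a rough sketch of that calculation, assuming a table named world_life_expectancy with columns country and life_expectancy (hypothetical names):

```sql
-- Hypothetical table and column names; adjust to the real schema.
SELECT country,
       MIN(life_expectancy) AS min_life_exp,
       MAX(life_expectancy) AS max_life_exp,
       MAX(life_expectancy) - MIN(life_expectancy) AS life_exp_range
FROM world_life_expectancy
GROUP BY country
-- Skip countries with a zero placeholder, which would inflate the range.
HAVING MIN(life_expectancy) > 0
ORDER BY life_exp_range DESC;   -- largest spread first; use ASC for the smallest
```

Note that this measures the spread between the lowest and highest recorded values, not the net change from the first year to the last. For a country whose life expectancy dipped mid-period, the two can differ.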
⚠️ Challenge 3: Cumulative adult mortality trends were not visible in static aggregates
✅ Solution: Leveraged SUM() OVER(PARTITION BY Country ORDER BY Year) to track trends year by year
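A minimal sketch of that window function in a full query, assuming a table named world_life_expectancy with columns country, year, and adult_mortality (hypothetical names), and an engine that supports window functions (for example MySQL 8.0 or later):

```sql
-- Hypothetical table and column names; adjust to the real schema.
SELECT country,
       year,
       adult_mortality,
       -- Running total: restarts for each country and accumulates in year order.
       SUM(adult_mortality) OVER (
           PARTITION BY country
           ORDER BY year
       ) AS running_adult_mortality
FROM world_life_expectancy
ORDER BY country, year;
```

PARTITION BY keeps each country's total separate, and ORDER BY inside the window makes the sum accumulate up to the current year rather than covering the whole partition.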
DATA SOURCES:
Cleaned Dataset: world_life_expectancy_cleaned.sql
Source: Derived from raw global life expectancy data used in Part 1
Timeframe: Covers 15+ years of life expectancy, GDP, BMI, and adult mortality records by country
DATA DICTIONARY:
Country: Nation or territory name
Year: Observation year
Life expectancy: Average years a newborn is expected to live
GDP: Gross domestic product per capita
BMI: Average body mass index by country and year
Adult Mortality: Death rate of adults (typically ages 15–60) per 1,000 population
Status: Country classification (Developed or Developing)
VIEW THE SQL ANALYSIS CODE:
Want to explore how the analysis was built in SQL?
👉 Click here to view the full SQL exploratory analysis code — includes trend detection, country ranking, correlation breakdowns, and rolling mortality insights.
This code demonstrates a structured, data-driven approach to uncovering meaningful patterns in global health.
The GitHub repository is well-documented, query-by-query, to support reproducibility and transparency.