🌍 Global Development Analytics

An End-to-End Data Science Project in R

🚀 Overview

This project delivers an end-to-end data science analysis of the key drivers behind the Human Development Index (HDI) across 180+ countries from 2000 to 2022.

Going beyond descriptive statistics, the study integrates economic theory, robust data engineering, and predictive modeling, comparing traditional econometric approaches (Linear Regression) with machine learning methods (Random Forest) to uncover non-linear dynamics in global development.

The result is a reproducible, research-grade workflow suitable for policy analysis, academic research, and applied data science portfolios.

📊 Key Visualizations

1️⃣ Data Quality & Correlation Structure

| Data Quality & Imputation | Correlation Analysis |
| --- | --- |
| Group-wise imputation of missing time-series data. | Strong correlation (0.91) between Life Expectancy and HDI. |
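
As a rough illustration of what group-wise imputation of a time series can look like, the sketch below interpolates gaps within each country separately, so one country's values never fill another's. It uses base R only; the toy `panel` data, the `interpolate_series` helper, and the choice of linear interpolation are assumptions for illustration, not necessarily the method used in the report.

```r
# Toy panel: two countries, yearly values with gaps (synthetic, illustration only)
panel <- data.frame(
  country = rep(c("A", "B"), each = 5),
  year    = rep(2000:2004, times = 2),
  value   = c(1, NA, 3, NA, 5, 10, 20, NA, 40, NA)
)

# Hypothetical helper: linear interpolation within one country's series.
# rule = 2 carries the nearest observed value outside the observed range.
interpolate_series <- function(year, value) {
  ok <- !is.na(value)
  if (sum(ok) < 2) return(value)  # too few points to interpolate
  approx(x = year[ok], y = value[ok], xout = year, rule = 2)$y
}

# Split by country, impute each series, then restore the original row order
panel$value_imputed <- unsplit(
  lapply(split(panel, panel$country),
         function(d) interpolate_series(d$year, d$value)),
  panel$country
)

print(panel)
```

Interpolating inside each group keeps a country's trend intact; a global mean would pull low- and high-income series toward each other.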

2️⃣ Economic & Structural Insights

| The Preston Curve | Structural Clustering (PCA) |
| --- | --- |
| Diminishing returns of GDP on Life Expectancy. | Distinct regional and income-based development clusters. |
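
The diminishing-returns shape of the Preston curve is commonly modeled by regressing life expectancy on the logarithm of GDP per capita. The sketch below shows that idea on synthetic data that is log-shaped by construction; the variable names, coefficients, and noise level are assumptions for illustration and are not taken from the project's dataset.

```r
set.seed(42)

# Synthetic data (illustration only): GDP per capita spread over a wide range,
# life expectancy generated as a log function of GDP plus noise
gdp      <- exp(runif(200, log(500), log(80000)))
life_exp <- 20 + 6 * log(gdp) + rnorm(200, sd = 2)

# Log-linear fit: life expectancy as a function of log(GDP per capita)
fit   <- lm(life_exp ~ log(gdp))
slope <- unname(coef(fit)["log(gdp)"])

# Under this fit, doubling income adds the same number of years at any level
gain_per_doubling <- slope * log(2)

# So an extra $1,000 is worth far less at high income than at low income
pred      <- function(x) unname(predict(fit, newdata = data.frame(gdp = x)))
gain_low  <- pred(2000)  - pred(1000)
gain_high <- pred(51000) - pred(50000)

cat("Years gained per doubling of GDP:", round(gain_per_doubling, 2), "\n")
cat("Gain from $1,000 to $2,000:", round(gain_low, 2), "\n")
cat("Gain from $50,000 to $51,000:", round(gain_high, 2), "\n")
```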

🧠 Modeling & Results

📈 Model Performance

Linear Regression (Baseline): RMSE = 0.056
Random Forest (ML): RMSE = 0.026
Performance Gain: 54% reduction in RMSE relative to the linear baseline
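
The 54% figure follows directly from the two RMSE values listed here. A minimal sketch of the arithmetic, with a generic `rmse` helper that is an assumption for illustration and not code from the project's scripts:

```r
# Root mean squared error between observed and predicted values
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# RMSE values as reported in this README
rmse_lm <- 0.056  # linear regression baseline
rmse_rf <- 0.026  # random forest

# Relative reduction in RMSE compared with the baseline
relative_reduction <- (rmse_lm - rmse_rf) / rmse_lm
cat("Relative RMSE reduction:", round(100 * relative_reduction, 1), "%\n")
```

Because RMSE is expressed in HDI units (a 0 to 1 scale), the absolute gap of 0.030 is also worth reporting alongside the relative figure.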

➡️ This suggests that human development follows non-linear patterns that linear models capture poorly.

🔍 Feature Importance (Random Forest)

Key Drivers of HDI:

GDP per Capita (Primary Driver)
Life Expectancy
Health Expenditure
Unemployment Rate (Critical bottleneck effect)
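
A ranking like this one is usually derived from how much a model's error grows when a feature's values are shuffled. The sketch below shows that permutation idea in base R on synthetic data with a linear model; it is a simplified analogue, not the `randomForest` or `vip` computation used in the report, and the feature names, the `permutation_importance` helper, and the data are all assumptions for illustration.

```r
set.seed(1)

# Synthetic data (illustration only): one strong, one weak, one irrelevant feature
n <- 300
d <- data.frame(x_strong = rnorm(n), x_weak = rnorm(n), x_noise = rnorm(n))
d$y <- 2 * d$x_strong + 0.5 * d$x_weak + rnorm(n, sd = 0.5)

model <- lm(y ~ ., data = d)
rmse  <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Hypothetical helper: mean increase in RMSE after shuffling each feature
permutation_importance <- function(model, data, target, n_rep = 10) {
  baseline <- rmse(data[[target]], predict(model, data))
  features <- setdiff(names(data), target)
  sapply(features, function(f) {
    mean(replicate(n_rep, {
      shuffled <- data
      shuffled[[f]] <- sample(shuffled[[f]])  # break the feature-target link
      rmse(data[[target]], predict(model, shuffled)) - baseline
    }))
  })
}

imp <- sort(permutation_importance(model, d, "y"), decreasing = TRUE)
print(round(imp, 3))
```

Permutation scores measure predictive reliance, not causal effect, so correlated indicators such as GDP per capita and life expectancy can share or trade importance between them.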

📊 Key Findings

Machine Learning Superiority: Random Forest cuts RMSE by roughly half relative to linear regression (0.026 vs. 0.056), pointing to complex interactions among development indicators.
Preston Curve Validated: Economic growth yields diminishing returns on health and human development after a threshold.
Policy-Relevant Bottlenecks: Health spending and labor market conditions meaningfully constrain development outcomes beyond income alone.

View the Report

📄 View Full Analysis Report - Download `report.html` and open in your browser for the complete interactive report with all visualizations and code.

🛠️ Tech Stack

🔧 Language

R (4.x)

📦 Data Engineering

tidyverse, janitor
WDI (World Bank API)
naniar (Missing data diagnostics & imputation)

📊 Visualization & EDA

ggplot2, GGally
corrplot
factoextra (PCA & clustering)

🤖 Modeling

tidymodels
randomForest
vip (Model interpretability)

🌐 Data Sources

World Bank Open Data API
UNDP Human Development Reports
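
For orientation, a World Bank pull with the `WDI` package might look like the sketch below. The `fetch_wdi` helper and the choice of indicator codes are assumptions for illustration; the project's actual indicator list lives in `scripts/01_data_collection.R`. HDI itself is published by UNDP and is not part of this call.

```r
# Hypothetical helper: download a few World Bank indicators for all countries.
# The indicator selection is an assumption, not the project's confirmed list.
fetch_wdi <- function(start = 2000, end = 2022) {
  indicators <- c(
    gdp_per_capita     = "NY.GDP.PCAP.KD",     # GDP per capita (constant USD)
    life_expectancy    = "SP.DYN.LE00.IN",     # life expectancy at birth (years)
    health_expenditure = "SH.XPD.CHEX.GD.ZS",  # current health expenditure (% of GDP)
    unemployment       = "SL.UEM.TOTL.ZS"      # unemployment (% of labor force)
  )
  # A named vector renames the output columns; extra = TRUE adds region and
  # income-group metadata. Requires the WDI package and network access.
  WDI::WDI(country = "all", indicator = indicators,
           start = start, end = end, extra = TRUE)
}

# Usage (not run here because it calls the World Bank API):
# wdi_raw <- fetch_wdi()
```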

💻 How to Run the Project

Clone the repository

git clone https://github.com/your-username/global-development-analytics.git

Open human_development_index_R
Run data collection from the R console:
```r
source("scripts/01_data_collection.R")
```
Generate the full report:
```r
rmarkdown::render("report.Rmd")
```
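
Before running these steps, it can help to check that the packages named in the Tech Stack are installed. The sketch below is one way to do that; the `missing_packages` helper is hypothetical, and `rmarkdown` is added to the list on the assumption that it is needed to render the report.

```r
# Packages named in the Tech Stack, plus rmarkdown for rendering (assumption)
required <- c("tidyverse", "janitor", "WDI", "naniar", "GGally", "corrplot",
              "factoextra", "tidymodels", "randomForest", "vip", "rmarkdown")

# Hypothetical helper: return the packages that are not installed.
# system.file() returns "" when a package cannot be found.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, function(p) nzchar(system.file(package = p)),
                      logical(1))
  pkgs[!installed]
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
} else {
  message("All required packages are installed.")
}
```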

📂 Project Structure

├── data/            # Raw and processed datasets
├── scripts/         # Modular R scripts (ETL, EDA, Modeling)
├── results/         # Figures and model outputs
├── report.Rmd       # Reproducible analysis report
├── mega-hdi-analysis.Rproj
└── README.md        # Project documentation

⭐ If you find this project useful, feel free to star the repository or reach out for collaboration.
