Projects

Discover the story behind the data

CardioRisk Insights

Exploratory Health Data Dashboard (Excel Project)

Project Overview

This Excel-based data analysis project was designed to demonstrate professional data analytics skills using a clean dataset representing demographic and clinical variables associated with the presence or absence of heart disease.

The project aims to showcase proficiency in data cleaning, descriptive statistics, and visual analytics — all within the Excel environment.

Objectives

  • Perform an exploratory data analysis (EDA) to identify patterns and relationships between key health indicators

  • Create statistical summaries and visualizations that communicate insights clearly and effectively

  • Design an executive dashboard with key performance indicators (KPIs), dynamic charts, and slicers for interactive exploration

Methodological Highlights

The project was developed entirely within Microsoft Excel, emphasizing professional-level analytical techniques and structured data workflows. The following methodological steps summarize the key analytical processes and Excel functionalities applied:

  • Data Cleaning and Preparation:
    The dataset was systematically verified for missing, inconsistent, or duplicated values. Categorical fields were standardized and corrected to ensure uniformity across all records

  • Categorical Value Mapping:
    Functions such as XLOOKUP and VLOOKUP were used to assign standardized labels to categorical variables (e.g., gender, chest pain type, fasting blood sugar category). This allowed for automated classification and improved readability throughout the analysis

  • Data Categorization and Grouping:
    Custom-defined categories were created using the IFS function to generate analytical groupings such as Age Group (e.g., <35, 35-49, >65) and BMI Category (e.g., Normal, Overweight, Obese). These groupings enabled more intuitive visualizations and comparative analysis across population segments

  • Use of Structured Excel Tables:
    All datasets were converted into structured Excel Tables to maintain organized referencing, facilitate dynamic range expansion, and improve formula consistency. This structure also ensured compatibility with Pivot Tables and Charts

  • Descriptive Statistics:
    Core statistical measures — including mean, standard deviation, minimum, and maximum — were computed using native Excel functions (AVERAGE, STDEV, MIN, MAX, COUNTIFS, etc.). Group-based statistics were calculated with conditional formulas to compare sex and disease categories

  • Pivot Tables and Pivot Charts:
    Multiple Pivot Tables were created to summarize categorical and numerical variables efficiently, while Pivot Charts provided dynamic visualizations supporting the executive dashboard

  • Slicers:
    Slicers were integrated into the dashboard to enable interactive filtering by sex, disease status, age group, and cholesterol range, allowing users to explore patterns dynamically

  • Data Visualization:
    A range of visualization types were designed, including radar charts, clustered column charts, bar charts, and scatter plots, all formatted for clarity and professional presentation
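As a rough illustration of the mapping, grouping, and descriptive-statistics steps above, the following Python sketch mirrors the same logic outside Excel. It is not the workbook's actual formulas: the `SEX_LABELS` coding, the `50-65` age band, the helper names, and the sample rows are all assumptions made for the example.

```python
from statistics import mean, stdev

# Hypothetical code-to-label table, standing in for an XLOOKUP/VLOOKUP range.
SEX_LABELS = {0: "Female", 1: "Male"}


def age_group(age):
    """Mimic an IFS-style cascade: the first condition that matches wins."""
    if age < 35:
        return "<35"
    if age < 50:
        return "35-49"
    if age <= 65:
        return "50-65"  # assumed middle band; the source lists only examples
    return ">65"


def group_stats(rows, key, value):
    """Summarize `value` per `key`, similar to AVERAGE/STDEV/MIN/MAX by group."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row[value])
    return {
        name: {
            "n": len(vals),
            "mean": mean(vals),
            # Sample standard deviation needs at least two values.
            "stdev": stdev(vals) if len(vals) > 1 else 0.0,
            "min": min(vals),
            "max": max(vals),
        }
        for name, vals in groups.items()
    }


# Made-up rows for illustration only; not values from the project dataset.
records = [
    {"sex": 1, "age": 29, "chol": 180},
    {"sex": 0, "age": 41, "chol": 200},
    {"sex": 1, "age": 58, "chol": 240},
    {"sex": 0, "age": 70, "chol": 260},
]
for r in records:
    r["sex_label"] = SEX_LABELS[r["sex"]]
    r["age_group"] = age_group(r["age"])

summary = group_stats(records, "sex_label", "chol")
```

Ordering the conditions from the lowest threshold upward keeps each band's upper limit implicit, which is the same reason an IFS formula is usually written as a cascade.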

Key Insights

  • Individuals with heart disease tend to show higher average values for cholesterol, oldpeak, and age, regardless of gender

  • Healthy female participants display the most balanced health profiles across variables

  • The radar and dashboard visuals clearly distinguish demographic and metabolic risk patterns, aligning with known cardiovascular trends

Table 1. Descriptive Statistics of Key Numerical Variables (n = 3,069)

The dataset reveals a population with elevated cardiovascular risk indicators — notably high cholesterol, blood pressure, and BMI levels. These findings underline the relevance of further analysis by sex and disease status to identify differential health profiles and potential intervention priorities

Figure 1. Age distribution among individuals with and without heart disease

The age distribution of the study population appears relatively uniform, with moderate variations across age intervals. This pattern suggests that participants are well represented across different age groups, without a strong concentration in any specific range

The absence of marked skewness or extreme values suggests balanced sampling, which limits potential age-related bias. Overall, these results indicate that the full age range is well covered among individuals both with and without heart disease, providing a solid demographic foundation for subsequent comparative analyses

The appropriate number of classes for the histogram was determined using Sturges' rule
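For reference, Sturges' rule sets the number of classes to k = ceil(log2(n)) + 1. The short Python sketch below applies it to the sample size reported for Table 1 (n = 3,069). The function name is hypothetical, and the exact class count used in Figure 1 is not stated in the text, so this only shows what the rule itself yields.

```python
import math


def sturges_bins(n):
    """Number of histogram classes by Sturges' rule: ceil(log2(n)) + 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return math.ceil(math.log2(n)) + 1


# Sample size taken from the Table 1 caption.
k = sturges_bins(3069)
print(k)  # 13
```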

Figure 2. Multidimensional Cardiovascular Risk Profile by Gender and Disease Status

The radar chart visually compares normalized cardiovascular indicators across four demographic-health groups: women with and without heart disease, and men with and without the condition

Distinct patterns emerge—individuals with heart disease (both sexes) consistently exhibit higher normalized values for age, cholesterol, and oldpeak, highlighting the influence of these factors on disease presence

Conversely, maximum heart rate (thalach) and BMI remain more balanced across groups, suggesting less discriminative power in this dataset

Interestingly, women with heart disease show the broadest radar spread, indicating a stronger cumulative risk profile, while healthy males display the most compact profile, reflecting comparatively favorable cardiovascular metrics
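The text does not state how the radar indicators were normalized. One common choice, assumed here purely for illustration, is min-max scaling of each indicator's group means to the range [0, 1] so that variables with different units can share the same axes. The function name and the sample values in this Python sketch are hypothetical.

```python
def min_max(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All groups share the same value; return zeros to avoid dividing by zero.
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


# Made-up group means for a single indicator across four groups (illustration only).
example_means = [10.0, 20.0, 30.0, 20.0]
scaled = min_max(example_means)
print(scaled)  # [0.0, 0.5, 1.0, 0.5]
```

Under this assumption each indicator is scaled independently, so the radar spread reflects a group's position relative to the other groups rather than absolute clinical values.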

Table 2. Impact of Smoking on Age-Related ST Depression Patterns in Young Adults

This table summarizes the pairwise correlations among key cardiovascular variables (age, resting blood pressure, cholesterol, maximum heart rate, ST depression, and BMI) across young adult subgroups. Noticeable patterns emerge only in groups with diagnosed heart disease, particularly among smokers, where age and oldpeak (ST depression) show the strongest negative relationship. A negative coefficient means that, within these young adult subgroups, ST depression tends to be larger at younger ages, which is consistent with an earlier onset of the ischemic response

Figure 3A. Young Non-Smokers with Heart Disease

Displays a moderate negative correlation (r = -0.356) between age and oldpeak, suggesting that even in non-smoking individuals, once disease is established, larger ST depression tends to appear among the younger members of the subgroup

Figure 3B. Young Smokers with Heart Disease

Shows the strongest inverse relationship (r = -0.514) between age and oldpeak. This more pronounced pattern is consistent with smoking amplifying ischemic vulnerability and accelerating cardiovascular deterioration in early adulthood

Figure 3C. Young Smokers without Heart Disease

Reveals a weaker correlation (r = -0.152), indicating that although smoking may begin to influence cardiovascular markers, the relationship between age and oldpeak remains weak in the absence of disease
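The r values quoted for Figures 3A to 3C are Pearson correlation coefficients (what Excel's CORREL function returns). As a minimal sketch of how such a coefficient is computed for one subgroup, the Python function below uses made-up age and oldpeak pairs; the function name and the data are assumptions and do not reproduce the project's results.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Made-up subgroup data (illustration only): oldpeak falls as age rises.
ages = [30, 32, 34, 36]
oldpeak = [2.0, 1.5, 1.0, 0.5]
r = pearson_r(ages, oldpeak)
```

A negative r, as in this toy example, means oldpeak tends to be lower at higher ages within the subgroup; the magnitude indicates how closely the points follow a straight line.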

Turning Data into Insight: Uncovering Hidden Patterns in Cardiovascular Risk

This analytical project reveals how structured data exploration transforms clinical patterns into actionable insights. Age and ST depression emerge as key indicators linking lifestyle factors—particularly smoking—to early cardiovascular decline. Beyond health analytics, the approach showcases the analytical precision, data visualization, and business-oriented storytelling essential for strategic decision-making across industries

Services

Data analysis and insights

Data Cleaning

Refining datasets for accuracy.

Exploratory Analysis

Uncovering patterns and insights.

Visualization

Creating impactful visual narratives.

FAQ

What is data analysis?

Data analysis transforms data into insights.

Why is it important?

It supports informed decision-making and strategy.

What tools do you use?

I use Python, SQL, Excel, and Power BI.

How do you approach projects?

I focus on data cleaning and visualization.

Can you work remotely?

Yes, I can work remotely and collaborate.

What industries do you serve?

I serve various industries, including science and tech.