DA-102 Excel Sprint Summary

Modified on Wed, 23 Jul at 1:54 PM


Getting Started

Topic 1: Excel Basics


- Introduction to Excel

What is it?  
Excel is a powerful spreadsheet application used for data organization, analysis, and visualization.


What the data looks like:


How this analysis can be performed:


Analyse the cricket dataset:



Outcome:


Excel plays an important role in creating tables to hold data. 


Where is it used?  

It is widely used in business, finance, education, and research for data analysis and reporting.


How is it used?  

  1. Open Excel

  2. Create a new ‘Blank Workbook’

  3. File -> Save

  4. Save it with filename <your name>.xlsx




Takeaways / best practices  
- Keep data organized and well-structured for easier analysis.
- Regularly save your work to prevent data loss.
- Use clear and descriptive labels for columns and rows.
- Familiarize yourself with Excel functions to enhance analysis capabilities.
- Always double-check formulas for accuracy.

- Worksheets and Workbooks

Worksheets are individual pages within a workbook that contain data, while workbooks are files that can contain multiple worksheets, commonly used in data analytics for organizing and analyzing data.


Worksheets:

At the bottom of the window you can see “Sheet1”, which is the name of your worksheet. You can rename it as needed.





Where is it used?
- In spreadsheet software like Microsoft Excel, Google Sheets, and data analysis tools.

How is it used?
- Create a new workbook.
- Add multiple worksheets for different datasets or analyses.
- Input or import data into the worksheets.
- Use formulas and functions to analyze data.
- Create charts and visualizations to represent findings.
- Save and share the workbook for collaboration.


Workbook:



Takeaways / best practices:
- Organize data logically across worksheets for clarity.
- Use clear naming conventions for worksheets.
- Regularly back up workbooks to prevent data loss.
- Utilize data validation to maintain data integrity.
- Document formulas and analyses for future reference.

- Importing Data

Importing data refers to the process of bringing data from external sources into a data analytics environment for analysis.


Different data formats:


Above are a few common ones. You can explore more formats here

Where is it used?
- Data analytics platforms
- Business intelligence tools
- Data visualization software
- Machine learning frameworks

How is it used?
- Identify the data source (e.g., databases, CSV files, APIs).
- Establish a connection to the data source.
- Use appropriate tools or programming languages (e.g., SQL, Python, R) to extract the data.
- Clean and preprocess the data as needed.
- Load the data into the analytics environment for analysis.
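
The extract, clean, and load steps above can be sketched in Python using only the standard library. This is a minimal illustration, not a prescribed workflow: the sample CSV text, its column names, and the `load_rows` helper are all hypothetical.

```python
import csv
import io

# Hypothetical CSV content standing in for an external file.
SAMPLE_CSV = """player,runs,matches
A. Kumar, 450 ,12
B. Singh,,9
C. Rao,310,10
"""

def load_rows(text):
    """Parse CSV text, strip stray spaces, and convert numeric fields."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        cleaned = {key: value.strip() for key, value in record.items()}
        # Empty strings become None so missing values are explicit.
        cleaned["runs"] = int(cleaned["runs"]) if cleaned["runs"] else None
        cleaned["matches"] = int(cleaned["matches"]) if cleaned["matches"] else None
        rows.append(cleaned)
    return rows

rows = load_rows(SAMPLE_CSV)
print(rows[0])
```

Reading from a real file would replace `io.StringIO(text)` with an opened file handle; the cleaning step stays the same.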


Importing Data in Excel:

  • Open Excel

  • Go to File → Open files from this device

  • Select the file you want to open 

  • In this case, open the CSV dataset file.

  • The file will open in a new window in Excel format.



Tips:

If you open a CSV file in the desktop version of Excel, make sure you save it in Excel format (.xlsx). 

Otherwise you might lose data, because the CSV format stores only one sheet and keeps no formatting or formulas.


Takeaways / best practices:
- Ensure data quality and integrity before importing.
- Use standardized formats for data consistency.
- Document the data import process for reproducibility.
- Regularly update and maintain data connections.
- Be mindful of data privacy and compliance regulations.

- Data Entry and Formatting

Data Entry and Formatting refers to the process of inputting and organizing data in a structured manner for analysis.





1. Fixing Alignment:


2. Wrapping Text:



3. Highlighting column header:


Data Validation in Excel

Data Validation in Excel is a feature that allows you to control the type of data or the values that users enter into a cell or range of cells.


What can be done using data validation? 

  • Define rules for data type to be entered in a cell    

  • Prevent Invalid entries

  • Specify Input/Error Message

  • Dropdown lists 

Applying Data Validation:


On the sales data :

  1. Select the cells where you want to apply data validation.

  2. In the Excel ribbon, click on the “Data” tab.

  3. In the “Data Tools” group, click on “Data Validation.”

  4. Under the “Settings” tab, set the validation criteria, for example to allow only numbers. 

  5. Apply the Validation


Date Formatting:

  1. Select the column or cells containing the dates.

  2. Right-click the selected cells and choose Format Cells.

  3. In the Format Cells dialog box, select the Number tab.

  4. Click on Date in the Category list. Scroll through the list of date formats and choose the one that looks like "yyyy-mm-dd"

  5. Click OK to apply the format.



Conditional Formatting in Excel:

Conditional Formatting in Excel is a feature that allows you to automatically apply formatting to cells based on the values they contain or specific conditions you set. 


  1. Select the cells on which you want to apply cell formatting

  2. In the Excel ribbon, click on the "Home" tab

  3. In the "Styles" group, click on "Conditional Formatting."

  4. Choose one of the predefined rules, such as "Highlight Cells Rules".

  5. Set the Conditions

  6. Select the formatting style you want to apply when the condition is met. 

  7. Preview and Apply




Where is it used?
- Business intelligence
- Market research
- Financial analysis
- Academic research

How is it used?
- Collect data from various sources (surveys, databases, etc.)
- Input data into spreadsheets or databases
- Ensure data accuracy through validation checks
- Format data consistently (e.g., date formats, numerical precision)
- Organize data into tables or categories for analysis
- Use software tools for data cleaning and transformation

Takeaways / best practices:
- Always verify data accuracy before analysis.
- Maintain consistent formatting to avoid confusion.
- Use automated tools for repetitive tasks to minimize errors.
- Document data sources and changes for transparency.
- Regularly back up data to prevent loss.

- Cell Referencing

Cell referencing is the method of using cell addresses in spreadsheets to refer to data for calculations and analysis.






Relative Referencing:

  • A relative reference changes when you copy the formula to another cell. 

  • It’s relative to the position of the cell containing the formula.

  • Syntax: A1, B2, etc.


Absolute Referencing:

  • An absolute reference remains constant no matter where you copy the formula. It always refers to the same cell.

  • Syntax: $A$1, $B$2, etc.



Mixed Referencing:

  • A mixed reference locks either the row or the column.

  • The part with the dollar sign ($) remains constant, while the other part changes.

  • Syntax: $A1, A$1, etc.
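
To make the three reference types concrete, the sketch below imitates how a reference changes when a formula is copied to another cell. It is a simplified model written for illustration, not Excel's own logic: the `copy_reference` helper and its parameters are hypothetical, and it handles only single-cell A1-style references.

```python
import re

# Matches A1-style references such as A1, $A$1, $A1, or A$1.
REF_PATTERN = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")

def column_to_number(letters):
    """Convert column letters to a 1-based index (A -> 1, Z -> 26, AA -> 27)."""
    number = 0
    for char in letters:
        number = number * 26 + (ord(char) - ord("A") + 1)
    return number

def number_to_column(number):
    """Convert a 1-based column index back to letters."""
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters

def copy_reference(ref, rows_down, cols_right):
    """Return the reference as it would appear after copying the formula."""
    match = REF_PATTERN.fullmatch(ref)
    if match is None:
        raise ValueError(f"Not an A1-style reference: {ref}")
    col_lock, col, row_lock, row = match.groups()
    # Only the parts without a dollar sign shift with the copy.
    if not col_lock:
        col = number_to_column(column_to_number(col) + cols_right)
    if not row_lock:
        row = str(int(row) + rows_down)
    return f"{col_lock}{col}{row_lock}{row}"

for ref in ["A1", "$A$1", "$A1", "A$1"]:
    print(ref, "->", copy_reference(ref, rows_down=2, cols_right=1))
```

Copying two rows down and one column right moves the relative reference entirely, leaves the absolute reference untouched, and shifts only the unlocked part of each mixed reference.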


Where is it used?
- Data analysis in spreadsheets (e.g., Excel, Google Sheets)
- Financial modeling
- Reporting and dashboards

How is it used?
- Identify the cell or range of cells you want to reference.
- Use the cell address (e.g., A1, B2) in formulas or functions.
- Choose between relative (changes when copied), absolute (remains constant), or mixed referencing (partially constant).
- Apply the formula to perform calculations or data manipulation.
- Drag the fill handle to copy the formula to adjacent cells if needed.

Takeaways / best practices:
- Use absolute references for fixed values to avoid errors in calculations.
- Keep formulas simple for better readability and maintenance.
- Document complex formulas with comments for clarity.
- Regularly check for broken references when copying or moving data.


Topic 2: Excel Functions

Use this dataset for this session

- Basic Excel Functions

What is it?  
Basic Excel functions are built-in formulas that perform calculations or data manipulation to analyze and summarize data.


Explore Excel Functions:





Where is it used?  
Used in data analytics for tasks like data cleaning, summarization, and visualization.

How is it used?  
- Open Excel and input data into cells.
- Select a cell where you want the result.
- Type the function name (e.g., SUM, AVERAGE, COUNT).
- Open parentheses and select the range of cells to include.
- Close parentheses and press Enter to see the result.
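
For readers who also work in Python, the sketch below shows roughly what SUM, COUNT, and AVERAGE compute over a range. The cell values are made up, and the range A1:A6 is only an assumed location for them.

```python
# Hypothetical values standing in for cells A1:A6; None is an empty cell
# and "n/a" is a text entry.
cells = [120, 85, None, 40, "n/a", 55]

# Like SUM, AVERAGE, and COUNT over a range, only numeric cells are used.
numbers = [value for value in cells if isinstance(value, (int, float))]

total = sum(numbers)       # comparable to =SUM(A1:A6)
count = len(numbers)       # comparable to =COUNT(A1:A6)
average = total / count    # comparable to =AVERAGE(A1:A6)

print(total, count, average)
```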

Takeaways / best practices  
- Familiarize yourself with commonly used functions like VLOOKUP, IF, and CONCATENATE.
- Use cell references instead of hardcoding values for dynamic calculations.
- Keep formulas simple and well-organized for easier troubleshooting.
- Regularly save your work to prevent data loss.

- Mathematical

Mathematics in data analytics refers to the application of mathematical concepts and techniques to analyze and interpret data.


Where is it used?
- Business intelligence
- Predictive modeling
- Statistical analysis
- Machine learning

How is it used?
- Define the problem and objectives.
- Collect and preprocess data.
- Apply statistical methods to summarize data.
- Use mathematical models to identify patterns and relationships.


- Validate models using statistical tests.
- Interpret results to inform decision-making.


Tip: 

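When you copy a cell that contains a formula, Excel pastes the formula by default, not the value. To paste only the value and avoid carrying over the formula, use:

CTRL + SHIFT + V to paste values only (available in recent versions of Excel for Microsoft 365 and Excel for the web). In older desktop versions, use Paste Special (CTRL + ALT + V, then V, then Enter).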


Takeaways / best practices:
- Ensure data quality before analysis.
- Choose appropriate mathematical techniques based on the problem.
- Validate findings with multiple methods.
- Communicate results clearly to stakeholders.
- Continuously update models with new data for accuracy.

- Statistics

Statistics in the context of Data Analytics is the science of collecting, analyzing, interpreting, and presenting data to uncover insights and inform decision-making.




Where is it used?
- Business intelligence
- Market research
- Healthcare analytics
- Social sciences
- Sports analytics

How is it used?
- Define the problem or question to be answered.
- Collect relevant data through surveys, experiments, or existing datasets.
- Clean and preprocess the data to ensure accuracy.
- Apply statistical methods to analyze the data (e.g., descriptive statistics, inferential statistics).
- Interpret the results to draw conclusions or make predictions.
- Visualize the findings to communicate insights effectively.


Example:






Takeaways / best practices:
- Always understand the context of the data before analysis.
- Ensure data quality through rigorous cleaning and validation.
- Use appropriate statistical methods for the type of data and research question.
- Be cautious of biases and assumptions in data interpretation.
- Communicate results clearly and effectively to stakeholders.

- Date & Time

Date & Time in data analytics refers to the representation and manipulation of temporal data to analyze trends, patterns, and behaviors over time.






Where is it used?
- Time series analysis
- Event tracking
- Sales forecasting
- User behavior analysis
- Financial reporting

How is it used?
- Collect date and time data from various sources.
- Convert date and time into a standardized format.
- Extract relevant features (e.g., year, month, day, hour).
- Perform time-based aggregations (e.g., daily, weekly, monthly).
- Visualize trends using time series plots.
- Apply statistical methods to analyze temporal patterns.
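
The standardize, extract, and aggregate steps above might look like this in Python. The records are invented for illustration; in Excel, the feature extraction would correspond to functions such as YEAR and MONTH.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sales records with dates stored as text.
records = [
    ("2024-01-15", 200),
    ("2024-01-28", 150),
    ("2024-02-03", 300),
    ("2024-02-17", 100),
]

monthly_totals = defaultdict(int)
for date_text, amount in records:
    # Convert the text to a date object using one standard format.
    date = datetime.strptime(date_text, "%Y-%m-%d")
    # Extract features (year and month) and aggregate by them.
    monthly_totals[(date.year, date.month)] += amount

for (year, month), total in sorted(monthly_totals.items()):
    print(f"{year}-{month:02d}: {total}")
```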


Example:



Takeaways / best practices:
- Always standardize date and time formats for consistency.
- Be mindful of time zones when analyzing global data.
- Use appropriate granularity based on the analysis needs.
- Handle missing or irregular time data carefully.
- Document date and time transformations for reproducibility.

- Logical

Logical functions in data analytics apply logical reasoning and operations to analyze data and derive insights.


Where is it used?
- Data cleaning and preprocessing
- Querying databases
- Building algorithms for data analysis
- Decision-making processes

How is it used?
- Identify the problem or question to be answered.
- Gather and prepare the relevant data.
- Apply logical operations (AND, OR, NOT) to filter or manipulate data.
- Use conditional statements to derive insights or make predictions.
- Validate results through testing and iteration.
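
As a rough illustration of the logical operations and conditional statements listed above, the sketch below labels and filters a few made-up order records. The field names and the 1000 threshold are assumptions chosen for the example.

```python
# Hypothetical order records used only for illustration.
orders = [
    {"id": 1, "amount": 1200, "region": "North", "returned": False},
    {"id": 2, "amount": 300, "region": "South", "returned": False},
    {"id": 3, "amount": 1500, "region": "North", "returned": True},
    {"id": 4, "amount": 900, "region": "East", "returned": False},
]

def label(order):
    """Comparable to =IF(AND(amount>1000, NOT(returned)), "Priority", "Standard")."""
    if order["amount"] > 1000 and not order["returned"]:
        return "Priority"
    return "Standard"

# OR condition used as a filter: keep orders from the North or East region.
north_or_east = [
    o["id"] for o in orders if o["region"] == "North" or o["region"] == "East"
]

labels = {o["id"]: label(o) for o in orders}
print(labels)
print(north_or_east)
```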





Takeaways / best practices:
- Ensure data quality before applying logical operations.
- Document logical processes for reproducibility.
- Use clear and concise logical expressions to avoid confusion.
- Regularly review and update logical frameworks as data evolves.

- Lookup

Lookup in data analytics refers to the process of retrieving specific data from a dataset based on a defined key or criteria.



Example:



Where is it used?
- Data cleaning and preparation
- Merging datasets
- Data validation
- Reporting and dashboarding

How is it used?
- Identify the key or criteria for the lookup.
- Select the dataset containing the data to be retrieved.
- Use a lookup function or method to match the key with the corresponding data.
- Extract the relevant information based on the match.
- Integrate the retrieved data into the original dataset or analysis.

Takeaways / best practices:
- Ensure data consistency in keys to avoid mismatches.
- Use appropriate lookup functions based on the data structure.
- Validate the results of the lookup to confirm accuracy.
- Document the lookup process for reproducibility and clarity.

- Text

Text with context in data analytics refers to the integration of textual data and its surrounding information to enhance understanding and insights.

  1. Concat:



  2. Trim:
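
The two text functions named above can be approximated in Python as follows. These helpers are hypothetical stand-ins, not Excel's implementation: `excel_trim` removes leading and trailing spaces and collapses repeated spaces between words, and `excel_concat` joins its arguments with no separator.

```python
def excel_trim(text):
    """Mimic TRIM: drop leading/trailing spaces and collapse runs of spaces."""
    return " ".join(part for part in text.split(" ") if part)

def excel_concat(*parts):
    """Mimic CONCAT: join the text of all arguments with no separator."""
    return "".join(str(part) for part in parts)

# Made-up name fields with stray spaces.
first_name = "  Asha "
last_name = "Verma   "
full_name = excel_concat(excel_trim(first_name), " ", excel_trim(last_name))

print(repr(excel_trim("  data    analysis  ")))
print(repr(full_name))
```

Trimming before concatenating is a common order of operations, since stray spaces otherwise end up inside the combined text.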


Where is it used?
- Sentiment analysis
- Customer feedback analysis
- Social media monitoring
- Market research

How is it used?
- Collect textual data from various sources (e.g., surveys, social media).
- Identify relevant context (e.g., time, location, user demographics).
- Analyze the text using natural language processing (NLP) techniques.
- Combine insights from the text with contextual data for deeper analysis.
- Visualize findings to communicate insights effectively.

Takeaways / best practices:
- Ensure data quality by cleaning and preprocessing text data.
- Use appropriate NLP tools and techniques for analysis.
- Always consider the context to avoid misinterpretation of text.
- Regularly update models and methods to adapt to changing data trends.
- Collaborate with domain experts to enhance contextual understanding.


Topic 3: Data Cleaning Basics

- Introduction to the Community Health Dataset

What is it?
A structured dataset containing information about public health indicators, demographics, and healthcare access in a given community.



The Data Analysis process:


Where is it used?
It is used in public health research, healthcare service planning, policy development, and data-driven decision-making in health-related projects.

How is it used?
• Load the dataset into a data analysis tool (Excel, Python, R, etc.)
• Explore column names and types of variables (categorical, numerical, date, etc.)
• Understand the scope – e.g., geographic coverage, time range, population groups
• Identify relevant fields for specific analysis (e.g., hospital visits, vaccination rates)
• Check for initial data quality issues before diving into analysis

Takeaways / best practices:
• Understand the context and source of the dataset to avoid misinterpretation
• Pay attention to column names and data types for proper processing
• Ensure the dataset aligns with your analysis goals and questions
• Always conduct a basic scan for issues like missing values or invalid entries before deep analysis

- Conditional Formatting for Data Integrity

What is it?
A visual tool that highlights data cells based on specific rules or conditions to quickly spot anomalies or inconsistencies.


Where is it used?
Commonly used in spreadsheet tools like Excel or Google Sheets during data cleaning and validation stages.

How is it used?
• Select a range of data in your spreadsheet
• Apply conditional formatting rules (e.g., highlight cells > 1000 or text containing errors)
• Use color codes to mark duplicates, blank cells, or outliers
• Review highlighted cells for potential corrections or further investigation
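
The rule-based highlighting described above can be mirrored outside a spreadsheet as a set of flags. The sketch below is one possible version: the values are invented, and the rule order (blank, then threshold, then duplicate) is an assumption.

```python
from collections import Counter

# Hypothetical column of values; None represents a blank cell.
values = [250, 1200, None, 250, 980, 4000]
THRESHOLD = 1000  # mirrors the "highlight cells > 1000" rule above

# Count occurrences of each non-blank value to detect duplicates.
counts = Counter(v for v in values if v is not None)

flags = []
for value in values:
    if value is None:
        flags.append("blank")
    elif value > THRESHOLD:
        flags.append("above threshold")
    elif counts[value] > 1:
        flags.append("duplicate")
    else:
        flags.append("ok")

for value, flag in zip(values, flags):
    print(value, "->", flag)
```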

Takeaways / best practices:
• Use it to visually inspect data for red flags like extreme values, blanks, or incorrect entries
• Keep formatting rules simple and relevant to the data context
• Combine it with filters or sorting to analyze affected rows efficiently
• Do not rely solely on it—use alongside statistical checks or formulas for robust integrity review

- Handling Missing Data

What is it?
The process of identifying, analyzing, and addressing gaps or null values in your dataset.




Where is it used?
Essential in all data analysis workflows, especially when working with survey data, health records, or large datasets.

How is it used?
• Identify missing values using built-in functions (e.g., ISBLANK in Excel, .isnull() in Python)
• Quantify how much data is missing and in which columns
• Decide on a treatment strategy:
  - Remove rows/columns with too much missing data
  - Fill with mean, median, or mode (imputation)
  - Use default values or predictive modeling
• Document your approach and rationale for transparency
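
A minimal sketch of the identify, quantify, and impute steps above, using Python's standard library. The ages are made up, and median imputation is shown only as one of the listed options, not as a recommendation for every dataset.

```python
from statistics import median

# Hypothetical column of ages; None marks a missing value.
ages = [34, None, 29, 41, None, 38]

# Quantify how much data is missing.
missing_count = sum(1 for age in ages if age is None)
missing_share = missing_count / len(ages)

# Impute with the median of the observed values.
observed = [age for age in ages if age is not None]
fill_value = median(observed)
imputed = [fill_value if age is None else age for age in ages]

print(missing_count, round(missing_share, 2), fill_value)
print(imputed)
```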

Takeaways / best practices:
• Always analyze the pattern and reason for missingness before removing or imputing
• Avoid filling in missing data blindly—it can distort results
• Use visualization (e.g., heatmaps) to understand missing data distribution
• Consistent handling ensures more reliable and valid analysis

- Standardizing the Data

What is it?
The process of converting data into a consistent format to ensure accuracy and comparability.




Where is it used?
Applied during data cleaning and preparation stages before modeling or visualization, especially with multi-source datasets.

How is it used?
• Review inconsistencies in format (e.g., dates as text, uppercase vs lowercase)
• Convert data types where necessary (e.g., text to numbers)
• Normalize naming conventions (e.g., “Male” vs “male” vs “M”)
• Convert units of measure to a standard format (e.g., meters to kilometers)
• Apply transformations if needed (e.g., log-scaling, standard scaling for machine learning)
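
The standardization steps above can be sketched as small helper functions. Everything named here is an assumption for illustration: the label mapping, the list of date layouts expected in the raw data, and the helper names.

```python
from datetime import datetime

# Hypothetical mapping of label variants to one standard value.
GENDER_MAP = {"male": "Male", "m": "Male", "female": "Female", "f": "Female"}

# Date layouts assumed to occur in the raw data.
DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y"]

def standardize_gender(value):
    """Map case and abbreviation variants to a single label."""
    return GENDER_MAP.get(value.strip().lower(), "Unknown")

def standardize_date(text):
    """Try each known layout and return the date as yyyy-mm-dd."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None  # leave unparseable dates for manual review

def meters_to_kilometers(meters):
    """Convert a distance to the standard unit."""
    return meters / 1000

print(standardize_gender(" M "), standardize_date("05/03/2024"), meters_to_kilometers(2500))
```

Returning `None` or `"Unknown"` instead of guessing keeps unrecognized values visible for later review.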

Takeaways / best practices:
• Standardization reduces the risk of errors and makes data analysis smoother
• Consistent formats are crucial for filtering, grouping, and comparing data
• Automate where possible to avoid manual inconsistencies
• Keep a data dictionary or documentation of changes for transparency


Topic 4: Ensuring Data Accuracy and Gaining Insights


- Looking Up the Correct Value

Ever pulled data from a messy sheet and wondered, “Is this the right value?” That’s where lookup functions come in – they help fetch the correct data based on a reference.

What is it?
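A way to retrieve accurate information from large datasets using lookup functions like:

1. VLOOKUP: searches the first column of a range and returns a value from the same row in a column you specify.

2. HLOOKUP: searches the first row of a range and returns a value from the same column in a row you specify.

3. XLOOKUP: searches one range and returns the matching item from a second range; it uses exact match by default.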


Where is it used?
Common in spreadsheets, dashboards, and reporting tools – especially when matching IDs to names, prices, or category labels.

How is it used?
• Identify the reference key (e.g., ID, code, name)
• Choose the correct lookup formula based on your data structure
• Apply the formula across relevant columns
• Double-check for #N/A or mismatches to spot issues
• Use exact or approximate match based on need
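
The difference between exact and approximate match can be illustrated with a short Python sketch. The product table, grade bands, and helper names are hypothetical; the approximate version assumes the lower bounds are sorted in ascending order, as approximate-match lookups in Excel also require.

```python
from bisect import bisect_right

# Hypothetical reference table: product ID -> name.
products = {"P101": "Keyboard", "P102": "Mouse", "P103": "Monitor"}

def exact_lookup(key, table, if_not_found="#N/A"):
    """Exact match: return the mapped value or a not-found marker."""
    return table.get(key, if_not_found)

# Hypothetical grade bands, sorted by lower bound, for approximate match.
BOUNDS = [0, 50, 70, 90]
GRADES = ["D", "C", "B", "A"]

def approximate_lookup(score):
    """Approximate match: use the largest lower bound that is <= score."""
    position = bisect_right(BOUNDS, score) - 1
    if position < 0:
        return "#N/A"
    return GRADES[position]

print(exact_lookup("P102", products), exact_lookup("P999", products))
print(approximate_lookup(49), approximate_lookup(70), approximate_lookup(95))
```

Exact match suits IDs and codes, while approximate match suits banded values such as grade or tax brackets.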

Takeaways / best practices:
• Always clean your reference column to avoid lookup failures
• Avoid hardcoding values; use dynamic cell references
• Validate your lookup with a few manual checks
• Prefer XLOOKUP for better flexibility over VLOOKUP

- Adding Data Validation

Tired of fixing typos or strange entries in your data? Data validation acts like a gatekeeper – it controls what users can enter in a cell.


Example:

Date of birth validation 


What is it?
A method to restrict data entry to a predefined format, range, or list.

Where is it used?
Heavily used in spreadsheets (like Excel or Google Sheets) for data entry forms, survey sheets, and shared data templates.

How is it used?
• Select the cell or range you want to validate
• Go to the data validation menu
• Set rules (e.g., only allow numbers, pick from a list, dates within a range)
• Add input messages or error alerts to guide users
• Test with sample data to ensure rules work
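
The same gatekeeping idea can be expressed in code. The sketch below checks a date of birth against a rule; the allowed range (from 1900-01-01 up to today), the yyyy-mm-dd input format, and the helper name are assumptions chosen for the example.

```python
from datetime import date

# Assumed rule: a date of birth must fall between 1900-01-01 and today.
EARLIEST = date(1900, 1, 1)

def validate_date_of_birth(text, today=None):
    """Return (is_valid, message) for a yyyy-mm-dd date of birth."""
    today = today or date.today()
    try:
        value = date.fromisoformat(text)
    except ValueError:
        return False, "Enter a date in yyyy-mm-dd format."
    if not EARLIEST <= value <= today:
        return False, f"Date must be between {EARLIEST} and {today}."
    return True, "OK"

# Test with sample data, using a fixed "today" so results are repeatable.
print(validate_date_of_birth("1995-08-14", today=date(2025, 1, 1)))
print(validate_date_of_birth("2090-01-01", today=date(2025, 1, 1)))
print(validate_date_of_birth("14/08/1995", today=date(2025, 1, 1)))
```

The returned message plays the role of the error alert shown to users when an entry is rejected.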

Takeaways / best practices:
• Use drop-down lists for consistent categorical data
• Prevent blanks or invalid entries that break your analysis
• Combine with conditional formatting to visually flag errors
• Keep validation rules updated when your dataset evolves

- Exploratory Data Analysis

Think of this as getting to know your dataset before making decisions. EDA is where curiosity meets structure – it helps uncover patterns, trends, and surprises.

What is it?
A process of summarizing and visualizing the main features of a dataset before formal modeling.

Where is it used?
Used in almost every data project to understand the data landscape – from business analytics to scientific research.

How is it used?
• Start with basic descriptive statistics (mean, median, mode, count)
• Visualize data using charts (bar, histogram, boxplot, scatterplot)
• Look for distributions, correlations, and outliers
• Segment data by groups or categories for comparison
• Document insights or anomalies found during the process
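
A small sketch of the first EDA steps above: descriptive statistics followed by an outlier check. The visit counts are invented, and the 1.5 × IQR rule is used here as one common convention for flagging outliers, not as the only valid choice.

```python
from statistics import mean, median, quantiles

# Hypothetical daily hospital visit counts.
visits = [12, 15, 14, 13, 16, 15, 14, 48, 13, 15]

# Quartiles split the sorted data into four parts.
q1, q2, q3 = quantiles(visits, n=4)
iqr = q3 - q1

# Flag values far outside the middle 50% of the data.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in visits if v < lower or v > upper]

print("mean:", mean(visits), "median:", median(visits))
print("outliers:", outliers)
```

A flagged value is a prompt to investigate, not proof of an error.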

Takeaways / best practices:
• Always perform EDA before jumping into conclusions or models
• Use visuals to communicate patterns and issues effectively
• Let EDA guide the next steps—feature selection, cleaning, or hypothesis testing
• Don’t ignore outliers; investigate their meaning and impact

- Introduction to Statistics

Statistics is the language of data. It helps turn raw numbers into interpretable insights—essential for making data-driven decisions.

What is it?
A set of methods for collecting, analyzing, interpreting, and presenting data.

Where is it used?
Core to data analysis, research studies, predictive modeling, and decision-making in every industry.


Types:

  1. Descriptive Statistics:

It helps you summarize and describe the data you already have. 

It provides insights about past events but does not make predictions about the future.


  2. Inferential Statistics:

Inferential statistics let you use data from a sample to draw conclusions about a larger population or to make predictions about future events. 




How is it used?
• Start with descriptive stats (mean, median, variance, standard deviation)
• Understand data distribution (normal, skewed, etc.)
• Use inferential stats (hypothesis testing, confidence intervals) for deeper analysis
• Apply correlation and regression techniques for relationship analysis
• Interpret results to support or refute assumptions

Takeaways / best practices:
• Know when to use descriptive vs. inferential statistics
• Understand your data types—categorical vs. numerical
• Be cautious of statistical significance vs. real-world relevance
• Use statistics to support—not replace—domain understanding


Topic 5: Understanding Data Types and Descriptive Statistics

- Types of Data

There are different types of data, and knowing which one you're working with is the first step in any analysis. Broadly, data is categorized as qualitative (categorical) or quantitative (numerical).


1. Quantitative:

Two types:

  • DISCRETE: Can be counted but not measured

            Salary in whole currency units [30000, 15000, 20000, etc.]

            Age in completed years [56, 35, 18, 46, etc.]

  • CONTINUOUS: Can be measured but not counted

            Weight [45 kg, 67 kg, 85 kg, …]

            Height [5 ft, 6 ft, 5.4 ft, …]


2. Qualitative:

Two types:

  • NOMINAL: Values are labels for categories with no inherent order

            Gender [Male, Female]

            Marital Status [Married, Unmarried, …]

  • ORDINAL: Values are categories with a meaningful order

            Education Level [High school, Graduate, Post Graduate]

            Satisfaction Level [Poor, Average, Good, Excellent]



What is it?
Classification of data into types such as nominal, ordinal, interval, and ratio based on their nature and properties.

Where is it used?
Used in every analytics project to choose the right tools, techniques, and visualizations.

How is it used?
• Identify if data is categorical (e.g., gender, color) or numerical (e.g., age, salary)
• Further break down numerical data into discrete (countable) or continuous (measurable)
• Use ordinal data for ranked categories (e.g., survey satisfaction: low, medium, high)
• Understand which statistical methods apply to each data type

Takeaways / best practices:
• Always determine data type before selecting any analysis method
• The type of data influences your choice of graphs, stats, and models
• Clean and label your data clearly to reflect its type
• Misclassifying data can lead to incorrect conclusions

- Quantifying and Analyzing Data

To make data useful, we often need to convert it into numbers and analyze it in structured ways.

What is it?
The process of turning raw data into measurable quantities for statistical or visual analysis.

Where is it used?
Applied across data analytics pipelines, from surveys to financial reports to dashboards.


Example:



How is it used?
• Assign numeric values to qualitative data where needed (e.g., encoding Yes = 1, No = 0)
• Use statistical tools to summarize trends (e.g., average income, total sales)
• Apply sorting, grouping, and aggregation techniques
• Compare variables to detect patterns, relationships, or differences
• Interpret outcomes to support decision-making
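
As an illustration of encoding and aggregation, the sketch below turns Yes/No answers into 1 and 0 and then summarizes them by group. The survey rows and field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical survey responses.
responses = [
    {"region": "North", "subscribed": "Yes"},
    {"region": "North", "subscribed": "No"},
    {"region": "South", "subscribed": "Yes"},
    {"region": "South", "subscribed": "Yes"},
]

ENCODING = {"Yes": 1, "No": 0}

# Group the encoded values by region, then aggregate.
encoded_by_region = defaultdict(list)
for row in responses:
    encoded_by_region[row["region"]].append(ENCODING[row["subscribed"]])

# The mean of a 0/1 column is the share of "Yes" answers.
share_yes = {
    region: sum(values) / len(values)
    for region, values in encoded_by_region.items()
}
print(share_yes)
```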

Takeaways / best practices:
• Ensure the integrity of data before quantifying it
• Avoid forcing numerical analysis on purely categorical variables
• Use appropriate tools (e.g., Excel functions, Python/Pandas, Power BI)
• Visualization can help make quantitative insights clearer and more actionable

- Measures of Central Tendency

These are your go-to tools to understand the center or typical value of a dataset.

What is it?
Statistical measures that describe the center point of a dataset – typically mean, median, and mode.


Outliers:

Outliers are extreme values that differ significantly from the rest of the data.

These values can skew your data analysis.


MEDIAN:

The median is the middle value of a dataset once the values are sorted.

Syntax: =MEDIAN(<range>)




MODE:

The mode is the most frequently occurring value in a dataset. 

It’s particularly useful for identifying trends and common occurrences.

Syntax: =MODE(<range>)



Where is it used?
Used in data summaries, dashboards, business reporting, and performance comparisons.

How is it used?
• Calculate the mean by summing all values and dividing by the count
• Find the median by ordering values and selecting the middle one
• Identify the mode as the most frequently occurring value
• Compare these metrics to understand data skew or distribution
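
The effect of an outlier on the three measures can be seen in a short Python sketch using the standard `statistics` module. The salary figures are made up, with one deliberately extreme value.

```python
from statistics import mean, median, mode

# Hypothetical salaries; the last value is an outlier.
salaries = [30000, 32000, 32000, 35000, 38000, 250000]

print("mean:  ", mean(salaries))    # pulled upward by the outlier
print("median:", median(salaries))  # middle of the sorted values
print("mode:  ", mode(salaries))    # most frequent value
```

Comparing the mean with the median is a quick way to spot skew: a large gap between them suggests extreme values on one side.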

Takeaways / best practices:
• Use mean for normally distributed data; median for skewed data
• Always check for outliers—these can distort the mean
• Mode is helpful for categorical or discrete data
• Comparing all three gives a clearer view of data distribution

Topic 6: From Batch Processing to Powerful Data Summaries


- Batch Processing

When working with large volumes of data, processing tasks one-by-one just isn’t efficient. Batch processing lets you handle it all in chunks automatically and quickly.

What is it?
A method of executing a series of data operations on multiple records at once, usually without user interaction.

Where is it used?
Used in ETL processes, scheduled reporting, data cleaning pipelines, and in backend data systems.

How is it used?
• Identify repetitive or time-intensive data operations (e.g., file imports, transformations)
• Set up batch scripts or workflows (e.g., in Excel, Python, R, SQL, or cloud tools)
• Schedule the batch jobs using a task scheduler or automation platform
• Monitor and log output to ensure successful execution
• Review processed outputs for quality assurance
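
A minimal batch workflow along the lines above might look like the sketch below. It is an assumption-laden example: the file names, the `amount` column, and the helper functions are invented, and temporary files stand in for real input data.

```python
import csv
import logging
import tempfile
from pathlib import Path

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("batch")

def total_sales(csv_path):
    """Sum the 'amount' column of one CSV file."""
    with open(csv_path, newline="") as handle:
        return sum(float(row["amount"]) for row in csv.DictReader(handle))

def run_batch(folder):
    """Process every CSV file in the folder; log failures and keep going."""
    results = {}
    for csv_path in sorted(Path(folder).glob("*.csv")):
        try:
            results[csv_path.name] = total_sales(csv_path)
            log.info("processed %s", csv_path.name)
        except (KeyError, ValueError) as error:
            log.error("skipped %s: %r", csv_path.name, error)
    return results

# Demonstration with temporary, made-up files.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "jan.csv").write_text("amount\n100\n250\n")
    Path(folder, "feb.csv").write_text("amount\n75\n")
    Path(folder, "bad.csv").write_text("value\n1\n")  # missing 'amount' column
    batch_results = run_batch(folder)

print(batch_results)
```

Catching errors per file means one malformed input is logged and skipped rather than stopping the whole run.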

Takeaways / best practices:
• Ideal for repetitive tasks like report generation, cleaning, or formatting
• Test batch scripts on smaller samples before full runs
• Always include logging and error handling
• Helps in saving time and ensuring consistency across large datasets

- Analyzing Insurance Coverage

Ever wondered how much of a population is insured or how coverage affects healthcare access? This type of analysis helps answer such questions.

What is it?
The process of evaluating patterns, gaps, and trends in health insurance coverage using available data.



Example:

For each region, calculate the insurance coverage %:

Insurance coverage % = (Total Insurance × 100) / Total Bill
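
The per-region calculation above can be sketched in Python as follows. The billing rows and field names are hypothetical and do not come from the course dataset.

```python
from collections import defaultdict

# Hypothetical billing records; the field names are assumptions.
bills = [
    {"region": "North", "total_bill": 1000, "insurance_paid": 800},
    {"region": "North", "total_bill": 500, "insurance_paid": 250},
    {"region": "South", "total_bill": 2000, "insurance_paid": 500},
]

# Accumulate total bill and total insurance per region.
totals = defaultdict(lambda: {"bill": 0, "insurance": 0})
for row in bills:
    totals[row["region"]]["bill"] += row["total_bill"]
    totals[row["region"]]["insurance"] += row["insurance_paid"]

# Coverage % = total insurance * 100 / total bill, per region.
coverage_pct = {
    region: sums["insurance"] * 100 / sums["bill"]
    for region, sums in totals.items()
}
print(coverage_pct)
```

Summing first and dividing once per region avoids averaging row-level percentages, which would weight small and large bills equally.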


Where is it used?
Used in healthcare research, policy planning, public health projects, and insurance analytics.

How is it used?
• Identify variables related to insurance (e.g., insured/uninsured status, type of plan)
• Segment data by demographics (age, income, region)
• Compare insured vs. uninsured populations to assess disparities
• Use pivot tables or filters to summarize by coverage type or provider
• Visualize trends and gaps using charts

Takeaways / best practices:
• Always clean and categorize insurance-related fields clearly
• Consider combining coverage data with outcome variables (e.g., hospital visits)
• Use percentages, not just counts, for fair comparisons
• Privacy is key: ensure sensitive information is handled securely

- Data Lookup

When datasets don’t have everything in one place, you’ll often need to pull in missing details. That’s where data lookup comes in.

What is it?
A method for retrieving related information from another dataset or range based on a key identifier.

Where is it used?
Common in spreadsheets, databases, and scripting to join or enrich data.

How is it used?
• Identify the lookup key (e.g., ID, code, name)
• Use lookup formulas (e.g., VLOOKUP, XLOOKUP)
• Match values from one dataset to another based on shared keys
• Validate results by sampling or checking for #N/A errors
• Handle mismatches or missing values appropriately





Example:

A sample function used in one of the cells in the dashboard

=INDEX('Patient Health Records'!J:J, MATCH('Patient Lookup'!$A$1, 'Patient Health Records'!$A:$A, 0))
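
In this formula, MATCH finds the row position of the ID entered in 'Patient Lookup'!A1 within column A of the records sheet, and INDEX returns the value at that position in column J. The Python sketch below mirrors that two-step logic; the IDs, the contents of column J, and the helper names are hypothetical.

```python
# Hypothetical columns standing in for 'Patient Health Records'!A:A and J:J.
patient_ids = ["PT-001", "PT-002", "PT-003"]
column_j = ["Stable", "Under review", "Discharged"]

def match(lookup_value, lookup_list):
    """Like MATCH(..., 0): 1-based position of the first exact match."""
    return lookup_list.index(lookup_value) + 1

def index(values, position):
    """Like INDEX: return the item at a 1-based position."""
    return values[position - 1]

selected_id = "PT-002"  # stands in for 'Patient Lookup'!$A$1
print(index(column_j, match(selected_id, patient_ids)))
```

When the ID is missing, `list.index` raises a `ValueError`, which plays a role comparable to the #N/A error in Excel.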


Takeaways / best practices:
• Always ensure both datasets are clean and consistent
• Prefer XLOOKUP, where available, for its flexibility (exact match by default, lookups in any direction)
• Sort data and check for duplicate keys before applying lookups
• Combine lookups with conditional formatting for quick review

- Advanced Pivot Table Techniques

You’ve used pivot tables but now it’s time to level up. Advanced techniques let you uncover richer, more interactive insights.

What is it?
Powerful features in pivot tables that allow dynamic summarization, filtering, and slicing of data.


Example:

Creating a pivot table slicer in Excel is a matter of seconds. Here's what you do:

  • Click anywhere in the pivot table.

  • Go to the Analyze tab (named PivotTable Analyze in newer versions) > Filter group, and click the Insert Slicer button.

  • The Insert Slicers dialog box will pop up and show the checkboxes for each of your pivot table fields. 

  • Select one or more fields for which you want to create a slicer.

  • Click OK.

As an example, let's add two slicers to filter our pivot table by Product and Reseller:

Two pivot table slicers are created immediately:



Tip:

To filter your pivot table by various time periods, such as years, quarters, months, or days, you can add a timeline - a dynamic filter by date fields.


Where is it used?
Widely used in Excel, Google Sheets, and BI tools like Power BI and Tableau for dashboarding and reporting.

How is it used?
• Use calculated fields to perform custom calculations within pivot tables
• Apply slicers and filters for interactive controls
• Group data by time (e.g., months, quarters) or custom ranges
• Nest multiple levels of rows/columns for deeper analysis
• Use % of total, running totals, and difference from previous for trend analysis
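
To show what a pivot table computes behind the scenes, the sketch below sums values by a row field, derives a % of total, and applies an optional filter that behaves like a slicer. The sales rows and the `pivot_sum` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales rows, echoing the Product and Reseller slicers above.
sales = [
    {"product": "Apples", "reseller": "Acme", "amount": 120},
    {"product": "Apples", "reseller": "Best", "amount": 80},
    {"product": "Pears", "reseller": "Acme", "amount": 150},
    {"product": "Pears", "reseller": "Best", "amount": 50},
]

def pivot_sum(rows, row_field, reseller=None):
    """Sum 'amount' by row_field; the optional reseller acts like a slicer."""
    totals = defaultdict(int)
    for row in rows:
        if reseller is None or row["reseller"] == reseller:
            totals[row[row_field]] += row["amount"]
    return dict(totals)

by_product = pivot_sum(sales, "product")
grand_total = sum(by_product.values())
pct_of_total = {key: value * 100 / grand_total for key, value in by_product.items()}

print(by_product)
print(pct_of_total)
print(pivot_sum(sales, "product", reseller="Acme"))
```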

Takeaways / best practices:
• Keep your source data structured and clean for best results
• Use descriptive labels for calculated fields and groupings
• Combine slicers with visuals to create mini dashboards
• Explore drill-downs to investigate underlying patterns in summary data



Topic 7: Bringing Data to Life with Basic Charts


- Why is visualization important?

Ever looked at a wall of numbers and felt lost? That’s where visualization steps in: it turns complex data into clear, visual stories.

What is it?
Visualization is the graphical representation of data to help identify trends, patterns, and insights.

Where is it used?
Used in dashboards, presentations, reports, and exploratory data analysis across all industries.

How is it used?
• Select the right type of chart based on your data and message
• Apply consistent and clear labels, colors, and legends
• Highlight key insights using visuals rather than raw numbers
• Use interactive visuals in tools like Power BI or Tableau for deeper exploration
• Make comparisons, trends, or distributions more accessible to decision-makers

--- Takeaways / best practices
• Choose the simplest chart that effectively communicates the message
• Avoid clutter—every element should serve a purpose
• Use visuals to support storytelling, not just to decorate
• Always include context: titles, legends, and labels matter

- Components of a Chart

A chart isn’t just lines or bars; it’s made up of essential elements that ensure the story is told clearly.

What is it?
The individual parts of a chart that together create meaning, such as the title, axes, data series, and legend.

Where is it used?
In all data visualization tools, including Excel, Power BI, Tableau, and any reporting platform.





How is it used?
• Add a clear title that reflects the insight being presented
• Use the x-axis and y-axis to represent the right variables
• Format data labels to show exact values if needed
• Include a legend to explain different series or colors
• Apply gridlines or markers for better readability

--- Takeaways / best practices
• Always label axes and use meaningful titles
• Avoid overloading the chart with too many components
• Use consistent colors and formatting for clarity
• Test if someone unfamiliar can interpret it easily

- Basic Charts

Basic charts are your go-to tools for exploring and explaining data: simple yet powerful visuals that highlight key points.

What is it?
Fundamental chart types such as column, bar, pie, and line charts used for standard data presentation.

Where is it used?
In data summaries, dashboards, business reports, and exploratory data tasks.

How is it used?
• Understand what you're trying to show: a comparison, a trend, or a part of a whole
• Choose the appropriate chart type (e.g., column for comparison, pie for share)
• Insert chart using built-in tools in Excel, Google Sheets, or BI software
• Format labels, titles, and colors for clarity
• Adjust axes, sort values, or add filters for enhanced interpretation


Prerequisite for Chart Creation:

To create a chart, we first need to summarize the data so that each category has a single value to plot.
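
As a minimal sketch of that summarizing step, the formulas below build a small table that a chart can be based on. The layout is hypothetical: raw records on a sheet named Data with Region in B2:B500 and BMI in F2:F500, and the list of region names typed into H2 downward on the summary sheet.

```excel
' Assumed layout (hypothetical): Data!B2:B500 = Region, Data!F2:F500 = BMI,
' region names listed in H2, H3, ... on the summary sheet

' I2, filled down: number of records per region
=COUNTIFS(Data!$B$2:$B$500, H2)

' J2, filled down: average BMI per region
=AVERAGEIFS(Data!$F$2:$F$500, Data!$B$2:$B$500, H2)
```

Selecting the region names together with one of the result columns then gives a one-value-per-category range to chart. A pivot table produces the same summary without formulas.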


Example:


--- Takeaways / best practices
• Know which chart works best for your data type and message
• Use basic charts for clear, easy-to-read visuals
• Don’t mix too many types in one view—keep it focused
• Always test your chart’s readability and impact



- Column Chart

Need to compare values across categories? Column charts are perfect for side-by-side comparisons.

What is it?
A chart that uses vertical bars to represent data values across categories.





Where is it used?
Used in business reports, trend analysis, and comparing multiple groups or time periods.

How is it used?
• Select categorical and numerical data (e.g., Region vs. Sales)
• Insert a column chart using chart tools
• Add data labels for clarity
• Sort columns to emphasize patterns
• Customize colors, gridlines, and titles

--- Takeaways / best practices
• Great for comparing a few categories—not ideal for large datasets
• Keep bar widths and spacing uniform
• Order columns logically (alphabetical or by value)
• Use labels and legends for added clarity

- Bar Chart

Similar to column charts but flipped horizontally, bar charts shine when you have long category names or many items.

What is it?
A chart that displays data with horizontal bars, making it easier to compare across categories.



Where is it used?
Used in survey results, ranked lists, and categorical comparisons.

How is it used?
• Choose data with categories and associated values
• Insert a bar chart (horizontal format)
• Format axis labels to avoid overlap
• Sort bars for visual emphasis
• Use color or grouping to show segments or comparisons

--- Takeaways / best practices
• Ideal when category names are long or numerous
• Avoid using bar charts for time series data
• Sort bars from highest to lowest for easier interpretation
• Ensure there’s enough space for all labels

Column Chart Vs Bar Chart:



- Pie Chart

Want to show parts of a whole? Pie charts can work, but they need to be used wisely.

What is it?
A circular chart that shows proportions or percentages of a total.


Where is it used?
Used in financial breakdowns, market share displays, or resource distribution.

How is it used?
• Select categorical data with corresponding values
• Insert a pie chart from chart options
• Use labels or percentage displays for clarity
• Limit the number of slices (ideally under 6)
• Highlight key slices using colors or callouts

--- Takeaways / best practices
• Use pie charts only when showing part-to-whole relationships
• Avoid using with too many categories—it becomes cluttered
• Consider using a bar chart if comparisons are important
• Always label slices clearly to avoid confusion


- Line Chart

Tracking changes over time? Line charts make it easy to see trends, growth, or declines.

What is it?
A chart that uses connected points to show trends over time or continuous data.




Where is it used?
Ideal for time series data, KPIs, or monitoring trends like sales, temperature, or usage.

How is it used?
• Use a continuous variable on the x-axis (e.g., dates, months)
• Plot the metric of interest on the y-axis
• Connect data points with lines
• Add multiple lines to compare trends across categories
• Label axes and lines clearly

--- Takeaways / best practices
• Best for showing change over time
• Avoid clutter: limit the number of lines to 3–4
• Use markers to highlight key points if needed
• Ensure the time axis is evenly spaced for accurate interpretation



- Scatter Plot

When you want to see how two variables relate, such as the cholesterol levels of people in the North and South regions, a scatter plot makes the connection visible.






What is it?
A chart that displays individual data points plotted across two axes to identify relationships or patterns.

Where is it used?
Used in statistical analysis, correlation studies, trend detection, and regression modeling.

How is it used?
• Choose two numerical variables (e.g., cost vs. visits, age vs. blood pressure)
• Plot one on the x-axis and the other on the y-axis
• Each point represents a data record
• Add trend lines or regression curves if analyzing correlations
• Optionally, color points by category to add a third variable
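
The relationship that a scatter plot and its linear trendline show can also be put into numbers. The sketch below assumes a hypothetical layout with age in B2:B101 and blood pressure in C2:C101 (the "age vs. blood pressure" pairing mentioned above); adjust the ranges to your own data.

```excel
' Assumed layout (hypothetical): B2:B101 = age (x), C2:C101 = blood pressure (y)

' Correlation coefficient between the two variables (from -1 to 1)
=CORREL(B2:B101, C2:C101)

' Slope and intercept of the least-squares line; note that the y range comes first
=SLOPE(C2:C101, B2:B101)
=INTERCEPT(C2:C101, B2:B101)

' R-squared of that line, the same figure a linear trendline can display on the chart
=RSQ(C2:C101, B2:B101)
```

A value close to 0 from CORREL suggests little linear relationship, even if the chart shows some other pattern, so the numbers are best read alongside the plot rather than instead of it.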

--- Takeaways / best practices
• Best for spotting trends, clusters, or outliers in paired data
• Ensure both axes are scaled correctly and consistently
• Avoid overplotting—use transparency or size controls if too many points
• Use titles and legends to clearly explain the variables shown

- Metrics of Interest

In any dataset, not everything matters equally. Metrics of interest are the values that actually drive insight and decision-making.


Example:

Health analysis

  • BMI/Diabetic trends between regions

  • Most prevalent health condition per region

  • Risks per region

  • Smoking and drinking trends across regions

  • Correlation between smoking and breathing



What is it?
Key numerical indicators that help measure performance, patterns, or specific outcomes.

Where is it used?
Used across dashboards, performance reports, KPIs, and monitoring systems.

How is it used?
• Identify business or analysis goals (e.g., reduce cost, increase access)
• Select relevant metrics (e.g., average wait time, insurance coverage rate)
• Ensure each metric has a clear definition and unit
• Track metrics over time, across categories, or against benchmarks
• Use filters to explore them by region, gender, or other dimensions
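
As an illustration of turning a metric definition into a formula, the sketch below computes the two example metrics named above for one region. The column layout and the "North" / "Yes" values are assumptions, not taken from the course dataset.

```excel
' Assumed layout (hypothetical): B2:B500 = Region, D2:D500 = Insured ("Yes"/"No"),
' E2:E500 = wait time in minutes

' Average wait time for the North region
=AVERAGEIFS($E$2:$E$500, $B$2:$B$500, "North")

' Insurance coverage rate for the North region: insured records / all records
=COUNTIFS($B$2:$B$500, "North", $D$2:$D$500, "Yes") / COUNTIFS($B$2:$B$500, "North")
```

Both formulas return #DIV/0! if no row matches the region, which is a useful signal that the region label is misspelled or missing from the data.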

--- Takeaways / best practices
• Focus on metrics that align with your objectives
• Define metrics clearly to avoid misinterpretation
• Ensure data sources for each metric are clean and consistent
• Regularly review and update the list of tracked metrics as priorities evolve

- Visualize the Metrics

Having metrics is good, but visualizing them turns them into actionable insights people can understand at a glance.


Example:


  1. Create a new sheet named “Report - Health”

  2. Create a pivot table to understand the relation between BMI and Diabetic trends in each region

    1. Row: Region

    2. Column: Diabetes

    3. Value: Average BMI

  3. Round off all the values to 2 decimal places
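
To cross-check the pivot table, the same Region-by-Diabetes grid of average BMI can be sketched with a formula. The layout here is hypothetical: raw records on a sheet named Data with Region in B2:B500, BMI in F2:F500, and Diabetes (for example "Yes"/"No") in G2:G500, plus a small grid with region names in A2 downward and the Diabetes values in B1 and C1.

```excel
' Assumed layout (hypothetical): Data!B = Region, Data!F = BMI, Data!G = Diabetes;
' grid with region names in A2:A5 and Diabetes values in B1:C1

' B2, filled across and down: average BMI for that region and Diabetes value,
' rounded to 2 decimal places
=ROUND(AVERAGEIFS(Data!$F$2:$F$500, Data!$B$2:$B$500, $A2, Data!$G$2:$G$500, B$1), 2)
```

ROUND changes the stored value. In the pivot table itself, Value Field Settings > Number Format changes only how the average is displayed, which is usually the safer choice when the numbers feed further calculations.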







What is it?
The process of representing important metrics through visual tools like charts, dashboards, and infographics.

Where is it used?
Widely used in business intelligence platforms, executive dashboards, and public health reports.

How is it used?
• Select the right chart type for each metric (e.g., line for trends, bar for comparison)
• Highlight key metrics using KPIs, scorecards, or gauges
• Use color coding or conditional formatting to show performance levels
• Group related metrics together in interactive dashboards
• Make visuals responsive to filters or user selections

--- Takeaways / best practices
• Simplicity matters: show one clear insight per chart
• Use consistent formats and labels across metrics
• Let visuals guide decision-making, not just display data
• Always test visuals with actual users for clarity and relevance


Topic 9: Designing, Combining, and Finalizing Interactive Reports

- Make it Interactive

Static reports are outdated; interactive elements allow users to explore data on their own terms.

What is it?
The process of adding dynamic controls like filters, slicers, or clickable elements to enhance data exploration.


Filters:


Multiple filters:


Example:

Your manager wants to filter the “Age vs Income” chart by state along with the region:




Where is it used?
Common in tools like Power BI, Tableau, Looker Studio (formerly Google Data Studio), and Excel dashboards.

How is it used?
• Identify key fields users may want to filter or explore
• Add slicers, drop-downs, or clickable elements for drill-downs
• Link visuals to these controls for responsive updates
• Apply tooltips and hover interactions for extra insights
• Test interactivity for usability and performance

--- Takeaways / best practices
• Interactivity empowers decision-makers to find their own answers
• Keep controls simple; too many filters can confuse users
• Use visual cues to guide user actions (e.g., buttons, highlights)
• Test with real users to ensure intuitive navigation

- Introduction to Report Design

Great reports don’t just show data; they tell a story clearly and visually.

What is it?
The practice of structuring and laying out reports for clarity, effectiveness, and visual appeal.

Where is it used?
In dashboards, executive summaries, board presentations, and operational reporting.

How is it used?
• Define the report’s objective and target audience
• Choose the layout: tabs, sections, flow direction
• Group related content (KPIs, trends, summaries)
• Use consistent color schemes, fonts, and alignment
• Include titles, legends, and contextual notes


Inspirations:



--- Takeaways / best practices
• Design with the reader in mind—focus on what's actionable
• Don’t overcrowd a page—use whitespace and structure
• Maintain consistency in font sizes, colors, and layouts
• Start with key insights, then add supporting visuals

- Combining Reports

Sometimes, one report isn’t enough: you need to bring multiple views into one comprehensive summary.

What is it?
Merging data or visuals from multiple reports or datasets into a single, unified output.

Where is it used?
Used in executive dashboards, cross-functional reporting, and strategic reviews.

How is it used?
• Identify which reports or datasets need to be combined
• Align structure, formatting, and metrics for consistency
• Use summary pages or master tabs to bring elements together
• Apply linking tools or consolidated visuals
• Test to ensure filters and calculations work across all views

--- Takeaways / best practices
• Align definitions and formats before merging reports
• Highlight the relationship between different reports (e.g., via navigation)
• Combine only what adds value; avoid clutter
• Document data sources for transparency


- Formatting Reports

A report may be accurate, but poor formatting can make it unreadable or confusing.

What is it?
Applying consistent structure, style, and alignment to improve readability and professionalism.


Example:







Where is it used?
Across dashboards, print reports, presentations, and stakeholder updates.

How is it used?
• Set up standard fonts, sizes, and alignment
• Use color themes and conditional formatting
• Format numbers consistently (e.g., currency, percentages, decimals)
• Align titles, visuals, and filters neatly
• Include a cover page or intro section when necessary
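
Consistent number formats can also be applied inside formulas with the TEXT function, which is handy for report titles and labels. The cell references below are placeholders, and the currency symbol is an assumption that depends on your regional settings.

```excel
' Assumed cells (hypothetical): A2 = an amount, B2 = a ratio such as 0.256, C2 = an average

' Currency with thousands separators and 2 decimals
=TEXT(A2, "$#,##0.00")

' Percentage with 1 decimal
=TEXT(B2, "0.0%")

' A label that embeds a value formatted to 2 decimals
="Average BMI: " & TEXT(C2, "0.00")
```

TEXT returns text, so its result should not be used in further calculations. For cells that must stay numeric, apply the same format codes through Format Cells (Ctrl+1) instead.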

--- Takeaways / best practices
• Format with the audience in mind: make key info easy to find
• Use consistent visual styling for trust and professionalism
• Align elements visually for a cleaner layout
• Keep accessibility in mind: contrast and font size matter

- Final Review

Before you hit “send” or “publish,” every report needs one last check to ensure quality and impact.

What is it?
The last step where the entire report is reviewed for accuracy, clarity, completeness, and usability.


Review Checklist:

You can use this checklist to finalize your report: 

  • The data analysis is summarized cleanly.  

  • The graphs are depicting data correctly.

    • X-axis and Y-axis labels are intact

    • Legends are in place

  • The pivots and slicers are working as expected.

  • The report is in a clean and readable format.


Where is it used?
Before submitting reports to stakeholders, launching dashboards, or sharing insights.

How is it used?
• Review all visuals and numbers for correctness
• Check alignment, layout, and formatting
• Ensure filters, buttons, and interactivity work properly
• Validate that all data sources are current and reliable
• Get feedback from a peer or sample user
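
Part of the correctness check can be done with a couple of helper formulas placed on a scratch sheet. This is a sketch under assumed names: the report sheet is taken to be “Report - Health” with content inside A1:Z200, and the raw data is taken to sit on a hypothetical sheet named Data with Region in B2:B500.

```excel
' Assumed ranges (hypothetical): report content in 'Report - Health'!A1:Z200,
' Region values in Data!B2:B500

' Number of cells in the report range that show an error (#N/A, #DIV/0!, #REF!, ...)
=SUMPRODUCT(--ISERROR('Report - Health'!A1:Z200))

' Number of blank cells in a key source column
=COUNTBLANK(Data!$B$2:$B$500)
```

Both results should be 0 before the report is shared. They do not replace a visual review, since a formula can return a wrong number without returning an error.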

--- Takeaways / best practices
• Final review helps catch errors you missed during creation
• Test from the perspective of your audience
• Always update the date and versioning info
• A polished report builds credibility and confidence

- Data Changes

Data evolves. Reports must too. Knowing how to adapt to data changes keeps your outputs accurate and relevant.

What is it?
The process of updating or adjusting reports in response to changes in the underlying dataset or structure.





Where is it used?
In automated dashboards, recurring reports, and dynamic databases.

How is it used?
• Monitor your data sources for updates or schema changes
• Refresh connections and recalculate metrics if needed
• Adjust visuals, filters, or logic to fit new data fields
• Validate that changes didn’t break existing visuals or KPIs
• Communicate any major updates to users or stakeholders
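
One way to make formulas tolerate new rows is to convert the source range into an Excel Table (Ctrl+T) and use structured references. The table name Health and its column names below are hypothetical.

```excel
' Assumed table (hypothetical): an Excel Table named Health with columns Region and BMI

' The referenced columns grow automatically when rows are added to the table,
' unlike a fixed range such as F2:F500
=AVERAGEIFS(Health[BMI], Health[Region], "North")

' Current number of data rows, useful to note in a change log after each update
=ROWS(Health)
```

Pivot tables built on the table still need to be refreshed (Data > Refresh All) before they show the new rows.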

--- Takeaways / best practices
• Design reports to be adaptable to new data
• Always test thoroughly after a data update
• Keep a change log for transparency and tracking
• Notify users if data changes affect interpretation or results

