# Descriptive Statistics and Visualizing Data


## Requirements

• This course is intended for beginners. No prior statistics knowledge is required.

## Description

### Visualizing Data

The data visualization process aims to make sense of the raw data, presenting it in a manner that is easy to understand even for non-experts.

Not everyone you are presenting data to will understand statistics, but most people can look at a visual and tell what the story is!

For example:

• Pie charts and bar charts are among the most commonly used data visualization techniques; they can be created easily using standard spreadsheet programs like Microsoft Excel.
• Numerical data can be represented in histograms, while categorical data can be visualized using frequency tables and charts.
• Cumulative frequencies count the number of data points up to and including a given value and are used to find the median, quartiles, and percentiles of a data set.
• Relative frequency represents how often something happens relative to some total. For example, relative frequency can show that a given sales team won 10 of its last 13 contracts (about 77%), while cumulative frequency can be used to locate the median number of contracts won across the entire year.
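The frequency ideas in the list above can be sketched in a few lines of Python. This is only an illustration, not part of the course material: the data in `contracts_won` is made up, the helper names `frequency_table` and `value_at_fraction` are hypothetical, and the median is read off with the simple nearest-rank rule, which is one of several conventions in use.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: contracts won in each of 13 reporting periods.
contracts_won = [2, 3, 1, 2, 4, 2, 3, 3, 1, 2, 5, 3, 4]


def frequency_table(values):
    """Return rows of (value, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    total = len(values)
    distinct = sorted(counts)
    # Running total of frequencies, in ascending order of value.
    cumulative = accumulate(counts[v] for v in distinct)
    return [(v, counts[v], counts[v] / total, c) for v, c in zip(distinct, cumulative)]


def value_at_fraction(table, fraction):
    """Smallest value whose cumulative frequency covers `fraction` of the data.

    This is the nearest-rank rule: fraction=0.5 gives a median,
    fraction=0.25 a first quartile, and so on.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total = table[-1][3]  # the last cumulative frequency equals the data size
    for value, _freq, _rel, cum in table:
        if cum >= fraction * total:
            return value


table = frequency_table(contracts_won)
for value, freq, rel, cum in table:
    print(f"{value:>5} {freq:>5} {rel:>8.2%} {cum:>5}")

print("median:", value_at_fraction(table, 0.5))
print("first quartile:", value_at_fraction(table, 0.25))
```

The relative frequencies always sum to 1, and the last cumulative frequency always equals the number of data points, which makes both columns easy to sanity-check.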

Visualizing data is an essential skill, whether you are running a business or teaching a class.

You can get a sense of the importance of the data by looking at numbers on a spreadsheet, but a well-chosen visual representation of that data can be much more helpful.

### Descriptive Statistics and Numerical Data Summary

Charts and graphs are useful for representing data in a user-friendly format, but there are times when it is more important to summarize data numerically.

A quality numerical data summary has meaning even to individuals who have no previous experience with the data being presented. You do not have to be a scientist or a doctor to understand that the average cholesterol level for a given male population is 220 mg/dL, while the average cholesterol level for women in the same population is 190 mg/dL.

For example, to appreciate the power of a numerical data summary, it is important to understand the difference between the mean and the median:

• The mean is the average of all observations (their sum divided by their count), while the median is the middle value: half the observations are larger and half are smaller.
• The mode is another important numerical summary: it is the observation that occurs most frequently.
• The mean, median, and mode describe where the data is centered, but the variance is just as important.
• Variance measures how widely or narrowly a group of numbers is spread out around its mean. For instance, a set of numbers in which all values are 7 has zero variance. Together, these measures summarize a set of numbers and help the observer make sense of the results.
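The measures in the list above can be computed with Python's standard `statistics` module. The sketch below is illustrative only: the cholesterol readings are made-up values, and the `summarize` helper is a hypothetical name. It reports both the population variance (`pvariance`) and the sample variance (`variance`), since the text does not say which one is meant.

```python
import statistics

# Hypothetical cholesterol readings for a small group; not real data.
readings = [190, 200, 220, 220, 240, 310]


def summarize(values):
    """Return the basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Population variance: average squared distance from the mean.
        "population_variance": statistics.pvariance(values),
        # Sample variance: divides by n - 1 instead of n.
        "sample_variance": statistics.variance(values),
    }


for name, value in summarize(readings).items():
    print(f"{name}: {value}")

# A set in which every value is 7 has no spread at all.
print("variance of all sevens:", statistics.pvariance([7, 7, 7, 7]))
```

In this sample the single high reading of 310 pulls the mean (230) above the median (220), which is why it helps to look at both.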

Whether the data being represented is a list of baseball statistics, salary data for a Fortune 500 company, or the cholesterol levels of heart patients, the summarization techniques are the same.

## Who this course is for:

• Anyone interested in data analysis for improving quality and processes in business and industry
• Anyone who wants to be a Data Analyst, Business User, Functional Consultant, Administrator, Solutions Architect, or Developer
• Any manager or businessperson responsible for decision-making
• Anyone who wants to learn statistics for business analytics
• Anyone interested in a career in data science or business analytics