Intro to data visualization¶
Readings¶
Business Analytics: Chapter 2 - Describing the Distribution of a Single Variable
Business Analytics: Chapter 3 - Finding Relationships Among Variables
Both Ch 2 and 3 are chock full of good stuff for using Excel to start exploring data. Hopefully some of it is review for you, but I’m sure much of it will be new for most of you.
The following compressed folder contains a few PDFs discussing principles of graphical excellence and the development of effective business dashboards. I’ve also included it in the Downloads-DataViz.zip file.
Downloads¶
Screencasts and other activities¶
Start with a general introduction to data visualization principles.
SCREENCAST: Data Viz Intro (15:45)
Now we’ll go into more detail by using Excel-specific examples.
First, some general table and graph principles.
A tour of the Conditional Formatting features in Excel (including creating formula-based formats).
Summary statistics and plots such as histograms and box plots are among the main ways we describe and visualize the distribution of a dataset. In these next few screencasts, we’ll use the Data Analysis ToolPak (DAT) along with Excel formulas to compute descriptive statistics. I’ll also show how range names or Excel Tables make formula creation more efficient. Then I’ll show you three different ways to create histograms: using the Data Analysis ToolPak, using the FREQUENCY() array function, and using the newish Excel histogram chart type. We’ll see a few more ways in later parts of the course; for example, histograms can also be created using Pivot Tables and Charts, which I’ll show in the upcoming session on multidimensional data modeling and analysis. In the screencast on histograms, I’ll also demo the newish box & whisker plots.
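If you want to see the same ideas outside of Excel, here is a minimal Python sketch (not something covered in the screencasts). It computes a few descriptive statistics and counts values per bin using the same convention as Excel's FREQUENCY() function: each bin counts values greater than the previous upper limit and less than or equal to its own, and one extra count at the end holds everything above the last limit. The function names `describe` and `frequency` and the sample numbers are made up purely for illustration.

```python
import statistics
from bisect import bisect_left


def describe(values):
    """Return a few common descriptive statistics for a list of numbers."""
    # method="inclusive" treats the data as containing its own min and max
    # and interpolates linearly between sorted values.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


def frequency(values, bins):
    """Count values per bin, following the FREQUENCY() convention.

    `bins` must be sorted ascending. The result has len(bins) + 1 counts:
    values <= bins[0], then (bins[i-1], bins[i]], then values > bins[-1].
    """
    counts = [0] * (len(bins) + 1)
    for v in values:
        # bisect_left finds the first bin upper limit that is >= v.
        counts[bisect_left(bins, v)] += 1
    return counts


# Made-up sample data, for illustration only.
values = [12, 15, 21, 22, 22, 27, 30, 34, 41, 58]
bins = [20, 30, 40]

stats = describe(values)
counts = frequency(values, bins)
print(stats)
print(counts)  # -> [2, 5, 1, 2]
```

The five values `min`, `q1`, `median`, `q3` and `max` are the ingredients of a box plot, and the bin counts are the bar heights of a histogram.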
Now, see some advanced chart techniques.
SCREENCAST: Dynamic Charts (9:30)
The final few slides introduce motion charts, small multiples and some future possibilities. Check out the links on those slides. In particular, the notion of small multiples has become quite important in the field of data visualization. We’ll see that these are quite easy to create with tools like Tableau, but are much more tedious to do in Excel. Creating small multiples with programmatic tools like R or Python is also quite easy. Here’s an example from a blog post I did on Great Lakes water level analysis with R.
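Separately from that R example, here is a rough sketch of how small multiples might be built in Python, assuming matplotlib is installed. The helper name `small_multiples`, the group labels and the data are all invented for illustration (the data is synthetic, not real measurements). The key idea is a grid of small panels that share the same x and y axes so the panels can be compared at a glance.

```python
import math

import matplotlib

matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt


def small_multiples(series_by_group, ncols=2):
    """Draw one small line chart per group on shared axes; return the figure."""
    names = list(series_by_group)
    nrows = math.ceil(len(names) / ncols)
    fig, axes = plt.subplots(
        nrows,
        ncols,
        sharex=True,  # shared axes are what make the panels comparable
        sharey=True,
        figsize=(4 * ncols, 2.5 * nrows),
        squeeze=False,  # always get a 2-D array of axes
    )
    panels = list(axes.flat)
    for ax, name in zip(panels, names):
        ax.plot(series_by_group[name])
        ax.set_title(name)
    # Hide any unused panels when the grid has more cells than groups.
    for ax in panels[len(names):]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig


# Synthetic data, for illustration only: four shifted sine waves.
data = {
    f"Group {label}": [math.sin(0.3 * t + shift) for t in range(48)]
    for shift, label in enumerate("ABCD")
}

fig = small_multiples(data)
```

Calling `fig.savefig("small_multiples.png")` afterwards would write the grid to an image file.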
Explore (OPTIONAL)¶
Towards a Theory of Bullshit Visualization - a wickedly good read
A few Bret Victor creations:
What do analysts actually do day to day? Enterprise Data Analysis and Visualization: An Interview Study
The classic TED talk by Hans Rosling - the world meets animated bubble charts
A Tableau story from the World Happiness Report - see Figure 2.1