For our final projects in data visualization class, we partnered with domain experts at UW and in the Seattle area to find interesting data sets or analysis problems. My group worked with Dargan Frierson of UW Atmospheric Sciences to display climate change data. Here are the results of our project:
This quarter I’ve been taking Jeff Heer’s data visualization class at UW. For our third assignment we had to create an interactive visualization. I partnered up with fellow oceanographer Michelle Weirathmueller to visualize data from the World Ocean Atlas.
The Human Microbiome Project was an NIH-sponsored initiative to characterize the healthy and diseased microbiome from the human mouth, gut, lung/nose, skin, and vagina. Metagenomic, whole genome, and 16S rRNA sequencing were used to generate taxonomic and functional data.
I used this microbial community data to create a series of animated bubble charts. Each chart shows an averaged community profile from the healthy human microbiome. Circles are labeled by genus and coloured by class. Circle size represents the percentage of the community made up by each OTU, averaged over several samples.
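As a rough illustration of how such an averaged profile could be computed, here is a minimal Python sketch. It is not the code behind the charts: the function names, the `max_radius` parameter, and the counts are all hypothetical, and it assumes each sample is a simple mapping from genus to read count. It converts each sample to percentages, averages them across samples, and scales the circle radius with the square root of the percentage so that circle area tracks abundance.

```python
import math
from collections import defaultdict


def mean_relative_abundance(samples):
    """Average each taxon's percentage of the community across samples.

    `samples` is a list of dicts mapping taxon name -> raw count.
    Samples with no counts at all are skipped.
    """
    totals = defaultdict(float)
    used = 0
    for counts in samples:
        sample_total = sum(counts.values())
        if sample_total == 0:
            continue
        used += 1
        for taxon, n in counts.items():
            totals[taxon] += 100.0 * n / sample_total
    if used == 0:
        return {}
    # A taxon missing from a sample contributes 0% for that sample.
    return {taxon: pct_sum / used for taxon, pct_sum in totals.items()}


def bubble_radius(percent, max_radius=50.0):
    """Radius such that circle area is proportional to `percent`."""
    return max_radius * math.sqrt(percent / 100.0)


# Hypothetical counts, for illustration only (not Human Microbiome Project data).
example_samples = [
    {"Streptococcus": 60, "Prevotella": 40},
    {"Streptococcus": 20, "Prevotella": 80},
]
profile = mean_relative_abundance(example_samples)
radii = {genus: bubble_radius(pct) for genus, pct in profile.items()}
```

Scaling the radius by the square root is a common choice for bubble charts, because readers tend to compare circle areas rather than diameters.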
I followed much of the Canadian election over Twitter because I live in Seattle. This got me wondering whether there is a large overlap in Twitter followers between the party leaders or whether people just follow a single party.
Sets are typically represented with Venn diagrams. The first option I came across was Ben Frederickson’s code for producing proportional Venn diagrams with D3. The result is really great! Unfortunately, it’s not really possible to represent five proportional overlapping sets with circles. You can see that overlaps between Duceppe, Mulcair, and May were excluded from this representation.
Ultimately, I like this chord diagram the best. This visualization was translated into D3 by Mike Bostock. While it doesn’t show all possible set intersections, it is easy to read and you can get a sense of the overlap between each leader’s followers. People do, however, get counted multiple times in this diagram if they are following more than two party leaders.
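To make the pairwise counting concrete, here is a small Python sketch of the kind of symmetric overlap matrix a chord diagram can be drawn from. It is an assumption about how such a matrix might be built, not the code used for the diagram, and the leader names and follower IDs are placeholders. It also shows the double counting: an account that follows three leaders appears in three different pairs.

```python
from itertools import combinations


def pairwise_overlap(followers):
    """Build a symmetric matrix of shared-follower counts.

    `followers` maps a leader name -> set of follower IDs.
    The diagonal is left at 0, since a chord links two different leaders.
    """
    names = list(followers)
    matrix = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        shared = len(followers[a] & followers[b])
        matrix[a][b] = shared
        matrix[b][a] = shared
    return matrix


def times_counted(followers, user_id):
    """Number of leader pairs in which a single account is counted."""
    k = sum(1 for ids in followers.values() if user_id in ids)
    return k * (k - 1) // 2


# Placeholder data: account 4 follows all three hypothetical leaders.
example_followers = {
    "leader_a": {1, 2, 3, 4},
    "leader_b": {3, 4, 5},
    "leader_c": {4, 6},
}
overlap = pairwise_overlap(example_followers)
```

In this toy example, account 3 follows two leaders and is counted once, while account 4 follows three and is counted in all three pairs.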
Mulcair has far fewer followers than I expected, considering he was the leader of the opposition for four years. Harper’s and Trudeau’s followers have remarkably similar affiliations, and a large share of them follow only Harper and/or Trudeau.
Databases can be used to access much larger dynamic datasets and even to record interactions with users. I used data from the Fatal Encounters project to create a choropleth map showing fatal encounters with the police.
Police brutality and racial discrimination are among the most important social issues in America right now. In its series Deadly Force (Nov. 28, 2011), the Las Vegas Review-Journal pointed out that
“The nation’s leading law enforcement agency [FBI] collects vast amounts of information on crime nationwide, but missing from this clearinghouse are statistics on where, how often, and under what circumstances police use deadly force. In fact, no one anywhere comprehensively tracks the most significant act police can do in the line of duty: take a life.”
The Fatal Encounters project aims to remedy this situation by compiling data on deaths occurring during encounters with the police. The general public can assist by submitting records through a form on the website and the data is freely available.
The issue of discrimination by the police is complicated by correlations with crime and poverty. I believe that media consumers have an appetite to explore these types of data for themselves. The choropleth map that I created allows the user to filter the data by gender, race, mental illness, cause of death, and official disposition of the police. The data can also be viewed as total deaths per state or as deaths per million people to control for population size. Users can click on any count statistic to link to articles associated with that statistic.
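The filtering and per-capita normalization can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the map itself reads from a database, whereas this sketch uses in-memory records, and the field names (`state`, `gender`), state codes, counts, and populations are all made up.

```python
from collections import Counter


def deaths_per_million(records, populations, **filters):
    """Count matching records per state and normalize by population.

    `records` is a list of dicts with at least a "state" key.
    `populations` maps state -> resident population.
    `filters` are exact-match conditions, e.g. gender="Male".
    """
    matching = [
        r for r in records
        if all(r.get(field) == value for field, value in filters.items())
    ]
    counts = Counter(r["state"] for r in matching)
    # Counter returns 0 for states with no matching records.
    return {
        state: counts[state] * 1_000_000 / population
        for state, population in populations.items()
    }


# Made-up records and populations, for illustration only.
example_records = [
    {"state": "XX", "gender": "Male"},
    {"state": "XX", "gender": "Female"},
    {"state": "YY", "gender": "Male"},
]
example_populations = {"XX": 2_000_000, "YY": 500_000}

all_rates = deaths_per_million(example_records, example_populations)
male_rates = deaths_per_million(example_records, example_populations, gender="Male")
```

Here the hypothetical state "YY" has fewer records than "XX" but a higher rate per million, which is the distinction the per-capita view is meant to expose.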
The data set is still incomplete and there is some bias. For example, there are 84 deaths per million people in Nevada; that’s much higher than any other state. What’s going on? Are the police in Nevada particularly deadly? Is Nevada especially crime-ridden? More likely, the project’s author, D. Brian Burghart, an instructor at the University of Nevada, Reno, has focused on collecting data from his own state. This bias should diminish as more data is collected. Visualizations and projects like this can inspire people to think critically about stories in the media, investigate their own stories, and participate in data collection.
Since this map was created, a more polished visualization has been published by Fatal Encounters with Silk.
Interactive Data Visualization for the Web by Scott Murray
Using a MySQL database as a source of data by D3noob
Graph data from a MySQL database in Python by modern data
Get Apache, MySQL, PHP and phpMyAdmin working on OSX 10.10 Yosemite, Coolest Guides on the Planet by Neil Gee
In 1951, the graphic designer Will Burtin published a plot to visualize the efficacy of 3 antibiotics on 16 different bacteria. Antibiotic efficacy is measured by the minimum concentration required to inhibit bacterial growth.
This plot was admired for its simplicity in comparing the effects of different antibiotics. But it does not clearly show how the bacteria group together in their response. Different visualizations can emphasize different aspects of the data. (See article in American Scientist).
I used Burtin’s data to create a simple bar plot that alternates between displays of different antibiotics. The plot shows more clearly that Gram-positive bacteria are more resistant to streptomycin and neomycin, while Gram-negative bacteria are more resistant to penicillin. It also shows which bacteria do not follow the trend. Gram staining uses violet and pink dyes to distinguish between different groups of bacteria by cell wall structure.
Interactive Data Visualization for the Web by Scott Murray