Data Visualization of Census Data with R


This article shows, step by step, how to use R to access US Census data, visualize it, and plot it on a map.



By Krishna Prasad, June 2014.

This article focuses on how to use R to access and visualize census data. Contributed packages greatly enhance your ability to interact with the graphs you create in R. I will concentrate on obtaining data from the US Census through an API connection and plotting that data on different types of US maps.

Our first step is figuring out how to use the Census API within R.
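A minimal setup sketch, assuming the acs package from CRAN and a personal API key requested from the Census Bureau's developer site. The string "YOUR_CENSUS_API_KEY" is a placeholder, not a real key:

```r
# One-time setup for Census API access through the acs package.
if (!requireNamespace("acs", quietly = TRUE)) {
  install.packages("acs")
}
library(acs)

# Store the key so that later acs calls can use it without a key= argument.
# "YOUR_CENSUS_API_KEY" is a placeholder; substitute your own key.
api.key.install(key = "YOUR_CENSUS_API_KEY")
```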

Then we use the acs.lookup function to search all tables by keyword for the data we need.

For example, the following are the search results for the keywords owner, occupied, and median.

> acs.lookup(endyear = 2012, span = 5, dataset = "acs", keyword = c("owner", "occupied", "median"), case.sensitive = FALSE)

An object of class "acs.lookup"

endyear= 2012  ; span= 5

results:

variable.code           table.number                         table.name
1    B25021_002       B25021               MEDIAN NUMBER OF ROOMS BY TENURE
2    B25037_002       B25037               MEDIAN YEAR STRUCTURE BUILT BY TENURE
3    B25039_002       B25039               MEDIAN YEAR HOUSEHOLDER MOVED INTO UNIT BY TENURE
4    B25119_002       B25119               Median Household Income by Tenure

variable.name
1               Median number of rooms -- Owner occupied
2               Median year structure built -- Owner occupied
3                Median year householder moved into unit -- Owner occupied
4                Median household income in the past 12 months (in 2012 inflation-adjusted dollars) -- Owner occupied (dollars)
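
In the result above, each variable.code is the table.number followed by an underscore and a column index, so a lookup hit can be traced back to its table. A minimal base-R sketch using only the four codes printed above (the `lookup` data frame and its `column` field are illustrative names, not part of the acs package):

```r
# Variable codes copied from the acs.lookup output above.
codes <- c("B25021_002", "B25037_002", "B25039_002", "B25119_002")

# Split each code at the underscore: table number on the left,
# column index on the right.
parts <- do.call(rbind, strsplit(codes, "_", fixed = TRUE))

lookup <- data.frame(variable.code = codes,
                     table.number  = parts[, 1],
                     column        = as.integer(parts[, 2]),
                     stringsAsFactors = FALSE)
print(lookup)
```

The table.number values recovered this way are what the table.number argument of acs.fetch expects, as in the B25077 download later in this article.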

Visualizing Census Data on Maps

The choroplethr package simplifies the creation of choropleths (thematic maps) in R. It provides native support for creating choropleths from US Census data through the choroplethr_acs function.

How it Works

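The choroplethr package does not store any data locally. Instead, it uses the R acs package to get ACS data via the Census API. For users of choroplethr, this means the acs package must be installed and a Census API key must be registered with api.key.install before any map is requested: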

> library(acs)
> library(choroplethr)
> api.key.install(key = "your secret key here")
> choroplethr_acs("B01002", "state", endyear = 2012, span = 5)


Table B01002 has 3 columns.  Please choose the column to render:

1: Median Age by Sex: Median age -- Total:
2: Median Age by Sex: Median age -- Male
3: Median Age by Sex: Median age -- Female

Selection: 1

Fig. 1 US Census - Median Age of Home Buyers

According to a National Association of Home Builders (NAHB) study, the average buyer is expected to stay in a home for 13 years. To estimate the major costs paid by home buyers, we combined median home price data with average home insurance costs over a 13-year period, and plotted the result on the US map to give a clear view of total costs by state.

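# Downloading median home price data (table B25077)
> my.states <- geo.make(state = "*")
> home_median_price <- acs.fetch(endyear = 2012, span = 5, geography = my.states, table.number = "B25077")
# acs.fetch returns an acs object; estimate() extracts the values as a matrix
> write.csv(estimate(home_median_price), file = "home_median_price.csv")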


The average latitude and longitude of each US state were downloaded from MaxMind.

# Merging three data frames: median home price, average insurance, and state coordinates
# (each is assumed to have a State column; merge() joins two data frames at a time, so the calls are nested)
> Total_Cost <- merge(merge(home_median_price, home_average_insurance, by = "State"), Lat_Long, by = "State")
# Adding median home price and 13 years of average insurance
> Total_Cost$Sum <- Total_Cost$Median_Price + Total_Cost$Average_Insurance
# Plotting data on the US map
> install.packages("ggmap")
> install.packages("mapproj")
> library(ggmap)
> library(mapproj)
> map <- get_map(location = 'US', zoom = 4)
> ggmap(map)
> TC <- ggmap(map) + geom_point(aes(x = Longitude, y = Latitude, size = Sum), data = Total_Cost, alpha = .5) + ggtitle("Total Cost of Homes in the US")
> TC
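
The merge and sum steps above can be tried without any download. The following is a sketch on toy data: every number is made up for illustration, the coordinates are rough, and the Annual_Insurance column (a yearly premium multiplied by 13) is an assumption about how the 13-year insurance figure might be built, not the author's actual data:

```r
# Toy inputs: all values below are invented for illustration only.
home_median_price <- data.frame(State = c("AL", "AK", "AZ"),
                                Median_Price = c(120000, 240000, 160000),
                                stringsAsFactors = FALSE)
home_average_insurance <- data.frame(State = c("AL", "AK", "AZ"),
                                     Annual_Insurance = c(1200, 900, 700),
                                     stringsAsFactors = FALSE)
Lat_Long <- data.frame(State = c("AL", "AK", "AZ"),
                       Latitude = c(32.8, 61.4, 33.7),
                       Longitude = c(-86.8, -152.4, -111.4),
                       stringsAsFactors = FALSE)

# merge() joins two data frames at a time, so fold it over the list.
Total_Cost <- Reduce(function(x, y) merge(x, y, by = "State"),
                     list(home_median_price, home_average_insurance, Lat_Long))

# Total cost = purchase price + insurance over the expected 13-year stay.
years_in_home <- 13
Total_Cost$Sum <- Total_Cost$Median_Price +
  years_in_home * Total_Cost$Annual_Insurance

print(Total_Cost[, c("State", "Sum")])
```

Folding merge() with Reduce keeps the join readable when more than two data frames share the same key column.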


Fig. 2 Total Cost of Homes in the US

Author Profile: Krishna Prasad is a Data Analyst with experience programming in Python and R. He is a Computer Science Engineer from JNTU, Hyderabad.





