Introduction
The US housing crisis has undermined the world economy in wide-reaching and poorly understood ways. Although there is a lot of speculation over the causes and effects of the housing crisis, most of these ideas come from opinionated blogs or news articles that do not list their sources. This lack of data becomes perilous as the US government invests trillions of dollars based on untested hypotheses concerning the crisis. Our PFUG's focus is to compile, clean, and analyze data pertaining to the housing crisis to get a clearer picture of what is actually going on.
Overview and Motivation
Real Estate Bubble : Around 2006, house prices rose far above their true value. Eventually, prices became so high that it was difficult for current owners to afford their houses. As foreclosure rates increased, house prices began to plummet. This has significantly affected the global economy.
Little Public Organized Data : There is a lot of speculation over the causes and effects of the housing crisis. Unfortunately, most of these ideas come from opinionated blogs or news articles that don’t list their sources. Therefore, it is difficult to collect reliable information.
Government Expenditures : The government has already spent millions of dollars to aid those affected by the housing crisis. With so little public data about the crisis, we are left wondering what data the government is using.
Still Unfolding : It is important to realize that the housing crisis is ongoing. This allows us to track its progression and, hopefully, make predictions for the upcoming years.
Large Data Sets : The housing crisis serves as a perfect model for visualizing large data sets. Most of the data sets we collect cover multiple years, counties, and variables.
Problems with Large Data
Hard To Find : The data we have collected come from many different sources. Currently, there is no central repository where data can be found.
Licenses and Fees : Some of the data sets have licenses that do not allow us to reproduce the data or publish any of our findings. Also, many of the data sets cost large amounts of money to purchase.
Size : Some data sets were as large as 10 GB. To work around this problem, we extracted only the parts of the data sets we needed, without having to download them completely.
Dirty : Most of the data sets we find are what we call “dirty.” They are usually unorganized and practically unreadable.
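The workaround described under Size can be illustrated with a minimal Python sketch. It is not the PFUG's actual code. It assumes the data arrive as CSV text that can be read line by line, and the function name `extract_columns`, the column names, and the sample rows are all hypothetical, made up for illustration. The idea is to stream the rows, keep only the columns of interest, and stop early, so the full file never has to be held in memory or read to the end. Stripping stray whitespace along the way is one small example of cleaning "dirty" values.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Optional


def extract_columns(
    lines: Iterable[str],
    wanted: List[str],
    max_rows: Optional[int] = None,
) -> Iterator[Dict[str, str]]:
    """Lazily yield only the `wanted` columns from CSV text.

    `lines` can be any iterable of text lines (an open file, a wrapped
    network stream, ...), so rows are processed one at a time.
    """
    reader = csv.DictReader(lines)
    # islice stops pulling rows after max_rows (None means no limit).
    for row in islice(reader, max_rows):
        # Missing values become "" and surrounding whitespace is removed.
        yield {name: (row.get(name) or "").strip() for name in wanted}


# Made-up sample standing in for a much larger file.
sample = io.StringIO(
    "county,year,median_price,foreclosures\n"
    "County A, 2006 ,150000,1200\n"
    "County B,2006,210000,300\n"
    "County C,2007,160000,900\n"
)

rows = list(extract_columns(sample, ["county", "year"], max_rows=2))
print(rows)
```

Because the function accepts any iterable of lines, the same sketch could in principle be fed a remote file opened as a stream (for example, the file-like object returned by `urllib.request.urlopen`, wrapped in `io.TextIOWrapper`), so that reading stops as soon as enough rows have been collected.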
Data Sets
To view our most current data sets and work, please visit our PFUG's website: http://github.com/hadley/data-housing-crisis . Some of our major data sets include...