This is a breakdown of the data gathered from the GSS survey on real estate, along with the correlation between hours worked and income in 1989.
The full report is available at: http://chrisso.tv/pdf/GSS_Survey_RealEstateReport.pdf
- Use Python to clean data fields and convert them to proper data types for aggregation and visualization.
- Compute correlation coefficients, as well as R-squared, to assess how strongly pairs of variables are related.
- Fit a linear regression to forecast future trends based on the current data.
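
The three steps above can be sketched in a few lines of dependency-free Python. This is only an illustration, not the code behind the report: the field names `hours` and `income`, the sample rows, and the helper names (`to_number`, `clean_pairs`, `summarize`) are all made up for the example, and the actual GSS columns and cleaning rules may well differ.

```python
from math import sqrt


def to_number(raw):
    """Turn a raw field such as '$16,000' or ' 40 ' into a float, or None if it is not numeric."""
    if raw is None:
        return None
    text = str(raw).replace("$", "").replace(",", "").strip()
    try:
        return float(text)
    except ValueError:
        return None


def clean_pairs(rows, x_key, y_key):
    """Keep only the rows where both fields convert cleanly to numbers."""
    pairs = []
    for row in rows:
        x, y = to_number(row.get(x_key)), to_number(row.get(y_key))
        if x is not None and y is not None:
            pairs.append((x, y))
    return pairs


def summarize(pairs):
    """Return (Pearson r, slope, intercept) for a least-squares line y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    r = cov / sqrt(var_x * var_y)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical sample rows (NOT real GSS data), including one unusable entry.
rows = [
    {"hours": "20", "income": "$10,000"},
    {"hours": "30", "income": "$16,000"},
    {"hours": "N/A", "income": "$18,000"},
    {"hours": "40", "income": "$19,000"},
    {"hours": "50", "income": "$25,000"},
]

pairs = clean_pairs(rows, "hours", "income")
r, slope, intercept = summarize(pairs)
r_squared = r ** 2  # for a one-variable linear fit, R-squared equals r squared


def predict(hours):
    """Estimate income for a given number of hours using the fitted line."""
    return slope * hours + intercept


print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
print(f"income ~ {slope:.1f} * hours + {intercept:.1f}")
print(f"predicted income at 60 hours: {predict(60):.0f}")
```

In practice a library such as pandas would probably handle the cleaning and aggregation more conveniently; the plain-Python version is used here only so that each step is visible.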
This report was a fun one to put together because it covered every major stage of working with data: cleaning, optimizing, aggregating, and visualizing. Because the real-world data wasn't already optimized, I had to reshape it before I could make something presentable out of it. In the end, I was able to extract a surprising amount of information from the data and learned a lot about the full process from start to finish.