US House Price Index (HPI) vs. Consumer Price Index (CPI)
Data Analysis Using Python
Import Python packages for data manipulation and visualization¶
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
import pandas_datareader.data as web
Read US House Price Index (HPI) and US Consumer Price Index (CPI) data from Econdb into new Pandas dataframes¶
hpi_df = web.DataReader('ticker=HOUUS', 'econdb')
cpi_df = web.DataReader('ticker=CPIUS', 'econdb')
Look at first and last few rows of data to see start/end dates and general layout¶
hpi_df.head()
| hpi_type | traditional |
|---|---|
| hpi_flavor | purchase-only |
| Seasonal Adjustment | sa |
| None | None |
| TIME_PERIOD | |
| 2017-06-01 | 242.6 |
| 2017-07-01 | 244.0 |
| 2017-08-01 | 245.7 |
| 2017-09-01 | 246.6 |
| 2017-10-01 | 247.6 |
hpi_df.tail()
| hpi_type | traditional |
|---|---|
| hpi_flavor | purchase-only |
| Seasonal Adjustment | sa |
| None | None |
| TIME_PERIOD | |
| 2021-10-01 | 358.5 |
| 2021-11-01 | 363.1 |
| 2021-12-01 | 367.9 |
| 2022-01-01 | 373.6 |
| 2022-02-01 | 381.4 |
cpi_df.head()
| area_code | U.S. city average |
|---|---|
| item_code | All items |
| base_code | Current |
| None | None |
| TIME_PERIOD | |
| 2017-06-01 | 244.2 |
| 2017-07-01 | 244.2 |
| 2017-08-01 | 245.2 |
| 2017-09-01 | 246.4 |
| 2017-10-01 | 246.6 |
cpi_df.tail()
| area_code | U.S. city average |
|---|---|
| item_code | All items |
| base_code | Current |
| None | None |
| TIME_PERIOD | |
| 2021-12-01 | 280.1 |
| 2022-01-01 | 281.9 |
| 2022-02-01 | 284.2 |
| 2022-03-01 | 287.7 |
| 2022-04-01 | 288.7 |
Combine HPI and CPI data into one table. Similar to doing a full outer join with time_period as key.¶
df = pd.concat([hpi_df,cpi_df],axis=1)
df
| hpi_type | traditional | U.S. city average |
|---|---|---|
| hpi_flavor | purchase-only | All items |
| Seasonal Adjustment | sa | Current |
| None | None | None |
| TIME_PERIOD | ||
| 2017-06-01 | 242.6 | 244.2 |
| 2017-07-01 | 244.0 | 244.2 |
| 2017-08-01 | 245.7 | 245.2 |
| 2017-09-01 | 246.6 | 246.4 |
| 2017-10-01 | 247.6 | 246.6 |
| 2017-11-01 | 249.2 | 247.3 |
| 2017-12-01 | 250.4 | 247.8 |
| 2018-01-01 | 252.2 | 248.7 |
| 2018-02-01 | 254.2 | 249.4 |
| 2018-03-01 | 255.1 | 249.6 |
| 2018-04-01 | 255.9 | 250.1 |
| 2018-05-01 | 257.3 | 250.8 |
| 2018-06-01 | 258.3 | 251.1 |
| 2018-07-01 | 259.3 | 251.3 |
| 2018-08-01 | 260.5 | 251.7 |
| 2018-09-01 | 261.0 | 252.2 |
| 2018-10-01 | 261.9 | 252.9 |
| 2018-11-01 | 263.4 | 252.7 |
| 2018-12-01 | 264.1 | 252.6 |
| 2019-01-01 | 265.6 | 252.5 |
| 2019-02-01 | 266.5 | 253.1 |
| 2019-03-01 | 267.5 | 254.3 |
| 2019-04-01 | 269.1 | 255.2 |
| 2019-05-01 | 270.5 | 255.3 |
| 2019-06-01 | 271.2 | 255.4 |
| 2019-07-01 | 272.2 | 255.9 |
| 2019-08-01 | 273.0 | 256.2 |
| 2019-09-01 | 274.7 | 256.6 |
| 2019-10-01 | 275.9 | 257.3 |
| 2019-11-01 | 277.3 | 257.8 |
| 2019-12-01 | 279.5 | 258.3 |
| 2020-01-01 | 281.3 | 258.7 |
| 2020-02-01 | 283.5 | 259.0 |
| 2020-03-01 | 284.4 | 258.2 |
| 2020-04-01 | 285.2 | 256.1 |
| 2020-05-01 | 284.7 | 255.9 |
| 2020-06-01 | 287.7 | 257.2 |
| 2020-07-01 | 291.1 | 258.5 |
| 2020-08-01 | 296.0 | 259.6 |
| 2020-09-01 | 300.7 | 260.2 |
| 2020-10-01 | 305.1 | 260.4 |
| 2020-11-01 | 308.5 | 260.7 |
| 2020-12-01 | 312.5 | 261.6 |
| 2021-01-01 | 315.9 | 262.2 |
| 2021-02-01 | 319.4 | 263.3 |
| 2021-03-01 | 325.1 | 265.0 |
| 2021-04-01 | 331.0 | 266.7 |
| 2021-05-01 | 336.8 | 268.6 |
| 2021-06-01 | 342.7 | 271.0 |
| 2021-07-01 | 347.6 | 272.2 |
| 2021-08-01 | 351.0 | 273.1 |
| 2021-09-01 | 354.4 | 274.2 |
| 2021-10-01 | 358.5 | 276.6 |
| 2021-11-01 | 363.1 | 278.5 |
| 2021-12-01 | 367.9 | 280.1 |
| 2022-01-01 | 373.6 | 281.9 |
| 2022-02-01 | 381.4 | 284.2 |
| 2022-03-01 | NaN | 287.7 |
| 2022-04-01 | NaN | 288.7 |
Remove rows with empty cells because they will later cause problems with calculated columns and graphing¶
df.dropna(inplace=True)
df
| hpi_type | traditional | U.S. city average |
|---|---|---|
| hpi_flavor | purchase-only | All items |
| Seasonal Adjustment | sa | Current |
| None | None | None |
| TIME_PERIOD | ||
| 2017-06-01 | 242.6 | 244.2 |
| 2017-07-01 | 244.0 | 244.2 |
| 2017-08-01 | 245.7 | 245.2 |
| 2017-09-01 | 246.6 | 246.4 |
| 2017-10-01 | 247.6 | 246.6 |
| 2017-11-01 | 249.2 | 247.3 |
| 2017-12-01 | 250.4 | 247.8 |
| 2018-01-01 | 252.2 | 248.7 |
| 2018-02-01 | 254.2 | 249.4 |
| 2018-03-01 | 255.1 | 249.6 |
| 2018-04-01 | 255.9 | 250.1 |
| 2018-05-01 | 257.3 | 250.8 |
| 2018-06-01 | 258.3 | 251.1 |
| 2018-07-01 | 259.3 | 251.3 |
| 2018-08-01 | 260.5 | 251.7 |
| 2018-09-01 | 261.0 | 252.2 |
| 2018-10-01 | 261.9 | 252.9 |
| 2018-11-01 | 263.4 | 252.7 |
| 2018-12-01 | 264.1 | 252.6 |
| 2019-01-01 | 265.6 | 252.5 |
| 2019-02-01 | 266.5 | 253.1 |
| 2019-03-01 | 267.5 | 254.3 |
| 2019-04-01 | 269.1 | 255.2 |
| 2019-05-01 | 270.5 | 255.3 |
| 2019-06-01 | 271.2 | 255.4 |
| 2019-07-01 | 272.2 | 255.9 |
| 2019-08-01 | 273.0 | 256.2 |
| 2019-09-01 | 274.7 | 256.6 |
| 2019-10-01 | 275.9 | 257.3 |
| 2019-11-01 | 277.3 | 257.8 |
| 2019-12-01 | 279.5 | 258.3 |
| 2020-01-01 | 281.3 | 258.7 |
| 2020-02-01 | 283.5 | 259.0 |
| 2020-03-01 | 284.4 | 258.2 |
| 2020-04-01 | 285.2 | 256.1 |
| 2020-05-01 | 284.7 | 255.9 |
| 2020-06-01 | 287.7 | 257.2 |
| 2020-07-01 | 291.1 | 258.5 |
| 2020-08-01 | 296.0 | 259.6 |
| 2020-09-01 | 300.7 | 260.2 |
| 2020-10-01 | 305.1 | 260.4 |
| 2020-11-01 | 308.5 | 260.7 |
| 2020-12-01 | 312.5 | 261.6 |
| 2021-01-01 | 315.9 | 262.2 |
| 2021-02-01 | 319.4 | 263.3 |
| 2021-03-01 | 325.1 | 265.0 |
| 2021-04-01 | 331.0 | 266.7 |
| 2021-05-01 | 336.8 | 268.6 |
| 2021-06-01 | 342.7 | 271.0 |
| 2021-07-01 | 347.6 | 272.2 |
| 2021-08-01 | 351.0 | 273.1 |
| 2021-09-01 | 354.4 | 274.2 |
| 2021-10-01 | 358.5 | 276.6 |
| 2021-11-01 | 363.1 | 278.5 |
| 2021-12-01 | 367.9 | 280.1 |
| 2022-01-01 | 373.6 | 281.9 |
| 2022-02-01 | 381.4 | 284.2 |
Rename columns to better describe data¶
df.index = df.index.rename('Date')
df.columns = ['HPI','CPI']
df.head()
| HPI | CPI | |
|---|---|---|
| Date | ||
| 2017-06-01 | 242.6 | 244.2 |
| 2017-07-01 | 244.0 | 244.2 |
| 2017-08-01 | 245.7 | 245.2 |
| 2017-09-01 | 246.6 | 246.4 |
| 2017-10-01 | 247.6 | 246.6 |
For each index, calculate columns for Percent Change from Previous Month and Cumulative Percent Change¶
df['HPI Pct Change from Prev Month'] = df['HPI'].pct_change(1) * 100
df['HPI Cumulative Pct Change'] = df['HPI'].diff().cumsum() / df['HPI'].iloc[0] * 100
df['CPI Pct Change from Prev Month'] = df['CPI'].pct_change(1) * 100
df['CPI Cumulative Pct Change'] = df['CPI'].diff().cumsum() / df['CPI'].iloc[0] * 100
df
| HPI | CPI | HPI Pct Change from Prev Month | HPI Cumulative Pct Change | CPI Pct Change from Prev Month | CPI Cumulative Pct Change | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2017-06-01 | 242.6 | 244.2 | NaN | NaN | NaN | NaN |
| 2017-07-01 | 244.0 | 244.2 | 0.577082 | 0.577082 | 0.000000 | 0.000000 |
| 2017-08-01 | 245.7 | 245.2 | 0.696721 | 1.277824 | 0.409500 | 0.409500 |
| 2017-09-01 | 246.6 | 246.4 | 0.366300 | 1.648805 | 0.489396 | 0.900901 |
| 2017-10-01 | 247.6 | 246.6 | 0.405515 | 2.061006 | 0.081169 | 0.982801 |
| 2017-11-01 | 249.2 | 247.3 | 0.646204 | 2.720528 | 0.283861 | 1.269451 |
| 2017-12-01 | 250.4 | 247.8 | 0.481541 | 3.215169 | 0.202184 | 1.474201 |
| 2018-01-01 | 252.2 | 248.7 | 0.718850 | 3.957131 | 0.363196 | 1.842752 |
| 2018-02-01 | 254.2 | 249.4 | 0.793021 | 4.781533 | 0.281464 | 2.129402 |
| 2018-03-01 | 255.1 | 249.6 | 0.354052 | 5.152514 | 0.080192 | 2.211302 |
| 2018-04-01 | 255.9 | 250.1 | 0.313603 | 5.482275 | 0.200321 | 2.416052 |
| 2018-05-01 | 257.3 | 250.8 | 0.547089 | 6.059357 | 0.279888 | 2.702703 |
| 2018-06-01 | 258.3 | 251.1 | 0.388651 | 6.471558 | 0.119617 | 2.825553 |
| 2018-07-01 | 259.3 | 251.3 | 0.387147 | 6.883759 | 0.079650 | 2.907453 |
| 2018-08-01 | 260.5 | 251.7 | 0.462784 | 7.378401 | 0.159172 | 3.071253 |
| 2018-09-01 | 261.0 | 252.2 | 0.191939 | 7.584501 | 0.198649 | 3.276003 |
| 2018-10-01 | 261.9 | 252.9 | 0.344828 | 7.955482 | 0.277557 | 3.562654 |
| 2018-11-01 | 263.4 | 252.7 | 0.572738 | 8.573784 | -0.079083 | 3.480753 |
| 2018-12-01 | 264.1 | 252.6 | 0.265756 | 8.862325 | -0.039573 | 3.439803 |
| 2019-01-01 | 265.6 | 252.5 | 0.567967 | 9.480627 | -0.039588 | 3.398853 |
| 2019-02-01 | 266.5 | 253.1 | 0.338855 | 9.851608 | 0.237624 | 3.644554 |
| 2019-03-01 | 267.5 | 254.3 | 0.375235 | 10.263809 | 0.474121 | 4.135954 |
| 2019-04-01 | 269.1 | 255.2 | 0.598131 | 10.923331 | 0.353913 | 4.504505 |
| 2019-05-01 | 270.5 | 255.3 | 0.520253 | 11.500412 | 0.039185 | 4.545455 |
| 2019-06-01 | 271.2 | 255.4 | 0.258780 | 11.788953 | 0.039170 | 4.586405 |
| 2019-07-01 | 272.2 | 255.9 | 0.368732 | 12.201154 | 0.195771 | 4.791155 |
| 2019-08-01 | 273.0 | 256.2 | 0.293902 | 12.530915 | 0.117233 | 4.914005 |
| 2019-09-01 | 274.7 | 256.6 | 0.622711 | 13.231657 | 0.156128 | 5.077805 |
| 2019-10-01 | 275.9 | 257.3 | 0.436840 | 13.726298 | 0.272798 | 5.364455 |
| 2019-11-01 | 277.3 | 257.8 | 0.507430 | 14.303380 | 0.194326 | 5.569206 |
| 2019-12-01 | 279.5 | 258.3 | 0.793365 | 15.210223 | 0.193949 | 5.773956 |
| 2020-01-01 | 281.3 | 258.7 | 0.644007 | 15.952185 | 0.154859 | 5.937756 |
| 2020-02-01 | 283.5 | 259.0 | 0.782083 | 16.859027 | 0.115964 | 6.060606 |
| 2020-03-01 | 284.4 | 258.2 | 0.317460 | 17.230008 | -0.308880 | 5.733006 |
| 2020-04-01 | 285.2 | 256.1 | 0.281294 | 17.559769 | -0.813323 | 4.873055 |
| 2020-05-01 | 284.7 | 255.9 | -0.175316 | 17.353669 | -0.078094 | 4.791155 |
| 2020-06-01 | 287.7 | 257.2 | 1.053741 | 18.590272 | 0.508011 | 5.323505 |
| 2020-07-01 | 291.1 | 258.5 | 1.181787 | 19.991756 | 0.505443 | 5.855856 |
| 2020-08-01 | 296.0 | 259.6 | 1.683270 | 22.011542 | 0.425532 | 6.306306 |
| 2020-09-01 | 300.7 | 260.2 | 1.587838 | 23.948887 | 0.231125 | 6.552007 |
| 2020-10-01 | 305.1 | 260.4 | 1.463252 | 25.762572 | 0.076864 | 6.633907 |
| 2020-11-01 | 308.5 | 260.7 | 1.114389 | 27.164056 | 0.115207 | 6.756757 |
| 2020-12-01 | 312.5 | 261.6 | 1.296596 | 28.812861 | 0.345224 | 7.125307 |
| 2021-01-01 | 315.9 | 262.2 | 1.088000 | 30.214345 | 0.229358 | 7.371007 |
| 2021-02-01 | 319.4 | 263.3 | 1.107946 | 31.657049 | 0.419527 | 7.821458 |
| 2021-03-01 | 325.1 | 265.0 | 1.784596 | 34.006595 | 0.645651 | 8.517609 |
| 2021-04-01 | 331.0 | 266.7 | 1.814826 | 36.438582 | 0.641509 | 9.213759 |
| 2021-05-01 | 336.8 | 268.6 | 1.752266 | 38.829349 | 0.712411 | 9.991810 |
| 2021-06-01 | 342.7 | 271.0 | 1.751781 | 41.261336 | 0.893522 | 10.974611 |
| 2021-07-01 | 347.6 | 272.2 | 1.429822 | 43.281121 | 0.442804 | 11.466011 |
| 2021-08-01 | 351.0 | 273.1 | 0.978136 | 44.682605 | 0.330639 | 11.834562 |
| 2021-09-01 | 354.4 | 274.2 | 0.968661 | 46.084089 | 0.402783 | 12.285012 |
| 2021-10-01 | 358.5 | 276.6 | 1.156885 | 47.774114 | 0.875274 | 13.267813 |
| 2021-11-01 | 363.1 | 278.5 | 1.283124 | 49.670239 | 0.686913 | 14.045864 |
| 2021-12-01 | 367.9 | 280.1 | 1.321950 | 51.648805 | 0.574506 | 14.701065 |
| 2022-01-01 | 373.6 | 281.9 | 1.549334 | 53.998351 | 0.642628 | 15.438165 |
| 2022-02-01 | 381.4 | 284.2 | 2.087794 | 57.213520 | 0.815892 | 16.380016 |
Calculate summary statistics for combined table to get a better understanding of data distribution¶
df.describe()
| HPI | CPI | HPI Pct Change from Prev Month | HPI Cumulative Pct Change | CPI Pct Change from Prev Month | CPI Cumulative Pct Change | |
|---|---|---|---|---|---|---|
| count | 57.000000 | 57.000000 | 56.000000 | 56.000000 | 56.000000 | 56.000000 |
| mean | 289.147368 | 258.357895 | 0.812528 | 19.529502 | 0.271663 | 5.901193 |
| std | 38.452829 | 9.833324 | 0.524276 | 15.779321 | 0.292642 | 3.985909 |
| min | 242.600000 | 244.200000 | -0.175316 | 0.577082 | -0.813323 | 0.000000 |
| 25% | 260.500000 | 251.700000 | 0.384169 | 7.532976 | 0.115775 | 3.224816 |
| 50% | 275.900000 | 256.200000 | 0.633359 | 14.014839 | 0.234374 | 4.995905 |
| 75% | 312.500000 | 261.600000 | 1.163110 | 29.163232 | 0.429850 | 7.186732 |
| max | 381.400000 | 284.200000 | 2.087794 | 57.213520 | 0.893522 | 16.380016 |
Create time series graphs for indices to see trends¶
plt.figure(figsize=(12,8))
plt.plot(df['HPI'],label='HPI')
plt.plot(df['CPI'],label='CPI')
plt.title('US House Price Index vs Consumer Price Index')
plt.xlabel('Year', fontsize=12)
plt.ylabel('Index', fontsize=12)
plt.tick_params(axis='both', which='major', labelsize=12)
plt.legend(fontsize=12);
plt.figure(figsize=(12,8))
plt.plot(df['HPI Cumulative Pct Change'],label='HPI')
plt.plot(df['CPI Cumulative Pct Change'],label='CPI')
plt.title('US House Price Index vs Consumer Price Index')
plt.xlabel('Year', fontsize=12)
plt.ylabel('Cumulative Pct Change', fontsize=12)
plt.tick_params(axis='both', which='major', labelsize=12)
plt.legend(fontsize=12);
plt.figure(figsize=(12,8))
plt.plot(df['HPI Pct Change from Prev Month'],label='HPI')
plt.plot(df['CPI Pct Change from Prev Month'],label='CPI')
plt.title('US House Price Index vs Consumer Price Index')
plt.xlabel('Year', fontsize=12)
plt.ylabel('Pct Change from Prev Month', fontsize=12)
plt.tick_params(axis='both', which='major', labelsize=12)
plt.legend(fontsize=12);
Add boxplot to see data distribution from another perspective¶
plt.figure(figsize=(12,8))
sns.boxplot(data=df[['HPI Pct Change from Prev Month','CPI Pct Change from Prev Month']])
plt.title('US House Price Index vs Consumer Price Index')
plt.tick_params(axis='both', which='major', labelsize=12)
plt.ylabel('Pct Change from Prev Month', fontsize=12);
Analysis
The CPI increased slowly but steadily from mid-2017 until early 2020 when it decreased slightly, then began to increase a bit more sharply in 2021.
Meanwhile, the HPI grew at a similarly steady, but higher rate until early 2020 when it skyrocketed.
This means that from early 2020 to early 2022, everyday items that consumers purchase (CPI) have grown a weighted average of 13% more expensive, while the weighted average price of single-family housing (HPI) rose 40%.
This could be due to several factors. During the beginning of the COVID pandemic in 2020, the initial small drop in HPI and CPI could have been because of people reducing their non-essential spending while the pandemic altered their daily lives. It is possible that this temporarily lowered demand and therefore prices for housing and consumer goods.
After a few months, there may have been greater housing demand from people relocating from higher density urban areas to more suburban and rural areas as a result of infection fears and the ability for more people to work remotely. People may have also desired homes with more space because they were spending more time at home.
The fact that the HPI has continued to increase suggests that homes have not been built fast enough to keep up with demand. Unless the housing supply can be increased or the demand decreased, the HPI will likely continue to grow at a 1%-2% or higher monthly rate.
The CPI monthly percent change increased from 0%-0.5% before 2020 to 0.25%-1% starting in 2021. This could be the result of supply chain issues during the pandemic and there is a possibility of even higher growth if these kinds of problems persist.