Research Article: Calibration and evaluation of Quigley’s hybrid housing price model in Microsoft Excel

Date Published: April 25, 2019

Publisher: Public Library of Science

Author(s): Alan G. Phipps, Dingding Li, Shihe Fu.


Quigley derived his hybrid price model to improve the precision of predicted prices of sold homes by statistically merging data on resold homes in a repeat sales model with data on once-sold homes in a single sales hedonic price model. The literature has few applications of the hybrid model aside from those by Quigley and his collaborators. Two reasons for this underuse may be its computational intensiveness and its marginal empirical improvement in comparison with the two other models. This paper first demystifies the computational intensiveness by calibrating the models in Microsoft Excel with procedures that are transferable to other software. Second, it evaluates the hybrid price model’s empirical improvement as a reason for its underuse by predicting prices of 2,559 sold and resold homes observed in two inner-city neighbourhoods in Windsor, Ontario, during a 30-year period. As hypothesized, the hybrid model has lower standard errors of regression coefficients and a higher simple R-squared than a single sales hedonic price model. Moreover, the hybrid model’s predictions have higher correlations than those of the single sales model not only with in-sample observed prices or changes in prices but also with out-of-sample ones. The conclusion outlines plans for future research into why the two models’ similar or dissimilar regression coefficients and standard errors predict correspondingly similar or dissimilar sale prices of homes through time.

Partial Text

Quigley and his collaborators derived the hybrid price model as a synthesis of two different models of the determinants of prices of durables such as homes [1–3]. The first of the two models, the hedonic housing price model, predicts the value of a home by statistically comparing its attributes with those of different individual homes across space and/or through time (e.g., [4–8]). This model assumes analytically that each observed home is different from the others and thus has been sold once, even though some are the same homes sold twice or more. These resold homes are specifically analysed in the second model, the repeat sales model, which predicts the change in the value of a resold home from the difference in times and the change in its attributes between its original sale and later resale (e.g., [7, 9–11]).
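The contrast between the two models can be sketched on synthetic data. This is a minimal illustration and not the authors' Excel procedure: the single attribute, the four periods, the noise-free prices, and all variable names are assumptions made for the example. The dependent variable is treated as a price or log price, and the study's own specification may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n_periods = 4
beta_true = np.array([0.5])                    # hypothetical implicit price of one attribute
gamma_true = np.array([0.0, 0.1, 0.2, 0.3])    # hypothetical period effects; period 0 is the baseline


def time_dummies(periods, n_periods):
    """One 0/1 column per period, omitting the baseline period 0."""
    d = np.zeros((len(periods), n_periods - 1))
    for row, t in enumerate(periods):
        if t > 0:
            d[row, t - 1] = 1.0
    return d


# Hedonic model: each once-sold home contributes one row in price levels.
n_once = 48
t_once = np.arange(n_once) % n_periods         # sale periods cycle through 0..3
x_once = rng.normal(size=(n_once, 1))          # one attribute per home
y_once = 10.0 + x_once @ beta_true + gamma_true[t_once]
X_hed = np.hstack([np.ones((n_once, 1)), x_once, time_dummies(t_once, n_periods)])
coef_hed, *_ = np.linalg.lstsq(X_hed, y_once, rcond=None)

# Repeat sales model: each resold home contributes one row in differences
# between its resale (time tau) and its first sale (time t).
pairs = [(t, tau) for t in range(n_periods) for tau in range(t + 1, n_periods)]
n_re = 48
t_first = np.array([pairs[k % len(pairs)][0] for k in range(n_re)])
t_resale = np.array([pairs[k % len(pairs)][1] for k in range(n_re)])
x_first = rng.normal(size=(n_re, 1))
x_resale = x_first + rng.normal(scale=0.3, size=(n_re, 1))   # attribute changes between sales
dy = (x_resale - x_first) @ beta_true + gamma_true[t_resale] - gamma_true[t_first]
X_rep = np.hstack([
    x_resale - x_first,
    time_dummies(t_resale, n_periods) - time_dummies(t_first, n_periods),
])
coef_rep, *_ = np.linalg.lstsq(X_rep, dy, rcond=None)

print("hedonic [intercept, attribute, periods 1-3]:", np.round(coef_hed, 3))
print("repeat sales [attribute, periods 1-3]:      ", np.round(coef_rep, 3))
```

In this sketch the intercept drops out of the repeat sales rows because it cancels in the difference, while the attribute and period parameters are shared by both models.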

Four equations in this section formalize the single sales hedonic price model and the repeat sales model as predictors of the value of a home, or of the change in its value, as a function of its time of sale or resale and of its attributes, or the changes in them, such as those of its dwelling unit and neighbourhood. Each equation refers to an observed home, such as an ith once-sold home at time t and/or a jth resold home either sold for the first time at time t or resold at a later time τ. Eqs (1) through (3) differ from Eq (4) in that the latter relates differences in prices to changes in attributes and differences in time. Otherwise, the four equations are similar in calculating implicit prices for a home’s attributes and times of sale by means of parameters from a linear or nonlinear curve-fitting method, which is ordinary least squares in this study. They therefore formalize an assumption that each has the same parameters, or rescaled values of them, for periods of sale and for attributes or changes in attributes. This assumption permits the stacking of Eqs (1) through (4) as the mathematical foundation for the hybrid price model. Their stacking consequently also specifies the order of observations in four submatrices on an Excel worksheet.
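The stacking of four submatrices under shared parameters can be sketched as follows. This is an illustrative assumption about the layout, not a reproduction of the authors' worksheet: the block order, the single attribute, the noise-free synthetic prices, and the helper names are hypothetical, and the statistical rescaling that the hybrid model applies to the blocks is omitted, so the fit below is plain ordinary least squares on the stacked rows.

```python
import numpy as np

rng = np.random.default_rng(0)
n_periods = 4
alpha, beta = 10.0, 0.5                       # hypothetical intercept and attribute price
gamma = np.array([0.0, 0.1, 0.2, 0.3])        # hypothetical period effects; period 0 is the baseline


def time_dummies(periods):
    """One 0/1 column per period, omitting the baseline period 0."""
    d = np.zeros((len(periods), n_periods - 1))
    for row, t in enumerate(periods):
        if t > 0:
            d[row, t - 1] = 1.0
    return d


def block(intercept, x, dummies):
    """One submatrix with columns [intercept, attribute, period dummies]."""
    return np.hstack([np.full((len(x), 1), intercept), x, dummies])


# Block 1: once-sold homes at time t.
n_i = 40
t_i = np.arange(n_i) % n_periods
x_i = rng.normal(size=(n_i, 1))
y1 = alpha + beta * x_i[:, 0] + gamma[t_i]

# Blocks 2 and 3: resold homes at first sale (time t) and at resale (time tau).
pairs = [(t, tau) for t in range(n_periods) for tau in range(t + 1, n_periods)]
n_j = 30
t_j = np.array([pairs[k % len(pairs)][0] for k in range(n_j)])
tau_j = np.array([pairs[k % len(pairs)][1] for k in range(n_j)])
x_t = rng.normal(size=(n_j, 1))
x_tau = x_t + rng.normal(scale=0.3, size=(n_j, 1))
y2 = alpha + beta * x_t[:, 0] + gamma[t_j]
y3 = alpha + beta * x_tau[:, 0] + gamma[tau_j]

# Block 4: differences between resale and first sale; the intercept cancels.
y4 = y3 - y2

# Stack the four submatrices so that one parameter vector serves all of them.
X = np.vstack([
    block(1.0, x_i, time_dummies(t_i)),
    block(1.0, x_t, time_dummies(t_j)),
    block(1.0, x_tau, time_dummies(tau_j)),
    block(0.0, x_tau - x_t, time_dummies(tau_j) - time_dummies(t_j)),
])
y = np.concatenate([y1, y2, y3, y4])

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("stacked [intercept, attribute, periods 1-3]:", np.round(coef, 3))
```

Because every block shares the same column order, the stacked matrix can be passed to a single regression call, which is the property that makes a worksheet layout of contiguous submatrices workable.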

The results of the rerun LINEST array formula for the hybrid price model resemble those in Fig 2. Together with 95% confidence intervals of attributes’ values, they are displayed in Table 1 for comparison with those of the single sales hedonic model. Each model’s regression coefficients and standard errors are also used in Fig 7 for calculating 95% confidence intervals of predicted annual percentage changes in prices for 2,559 house sales in two neighbourhoods between 1986 and mid-2017. As hypothesized, the hybrid price model yields more precise estimates of prices than the single sales hedonic model but statistically the same regression coefficients (Table 1). Each of the hybrid model’s standard errors for 31 statistically significant regression coefficients of times of sale and resale is lower than the corresponding one of the single sales hedonic model, even though the two sets of standard errors have similar values. Each is also lower than the corresponding standard error of the repeat sales model, in results available from the authors.
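The step from regression coefficients and standard errors to 95% confidence intervals can be sketched outside Excel as below. The function name, the synthetic data, and the use of the large-sample multiplier 1.96 in place of an exact t critical value are assumptions made for illustration. LINEST itself reports coefficients, standard errors and R-squared, and an Excel worksheet could obtain the exact multiplier from T.INV.2T.

```python
import numpy as np


def ols_summary(X, y, z=1.96):
    """Return coefficients, standard errors, approximate 95% bounds and R-squared.

    X must already contain a column of ones if an intercept is wanted.
    z = 1.96 is the large-sample normal approximation to the t critical value.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - k)              # residual variance
    se = np.sqrt(sigma2 * np.diag(xtx_inv))       # coefficient standard errors
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, se, coef - z * se, coef + z * se, r2


# Hypothetical data: one regressor plus an intercept, with random noise.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 200)
y_obs = 2.0 + 0.3 * x + rng.normal(scale=0.5, size=x.size)
X_demo = np.column_stack([np.ones_like(x), x])

coef, se, lower, upper, r2 = ols_summary(X_demo, y_obs)
for name, c, s, lo, hi in zip(["intercept", "slope"], coef, se, lower, upper):
    print(f"{name}: {c:.3f} (se {s:.3f}), 95% CI [{lo:.3f}, {hi:.3f}]")
print(f"R-squared: {r2:.3f}")
```

Narrower intervals for the same coefficient in one model than in another correspond to the lower standard errors that the comparison in Table 1 reports.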

This study has demonstrated the calibration of Quigley’s hybrid price model in Excel with procedures that are transferable to other statistical software, which simplifies its computation if computational intensiveness is a reason for its underuse. The hybrid model was operationalized in this study by statistically rescaling data for the prices, times of (re)sale, and attributes of the dwelling unit and neighbourhood of 1,275 once-sold homes with those of 1,284 resold homes during a 30-year period. In addition to this long study period, an innovation was the rescaling of the data for the attributes of homes as well as for their times of sale, although the former subsequently contributed less than the latter to the hybrid price model’s marginally improved predictions of sold house prices in two inner-city neighbourhoods. Data for these observations’ 70 variables were analyzed within the executable limits of a linear regression procedure in Excel, but a statistical analysis package will be utilized if data are available for refinements such as quarterly periods of sales rather than annual ones, or changes in prices of attributes through time rather than assumed enduring ones. So far, the hybrid price model has had the hypothesized empirical improvement in modelling house prices. First, it statistically explains more of the variation in the house price dependent variable than does a single sales hedonic price model or a repeat sales model. Second, it has more accurate predictions than the former model of changes in both in-sample and out-of-sample house prices through time, and it has more accurate in-sample predictions than the latter model. It achieves these improvements with regression coefficients for homes’ times of sale and resale that have marginally lower standard errors than those in the single sales hedonic price model or the repeat sales model.