
How to predict building periods from building images

  • Engineering & Design
Published by Hassen Miri - 15 September 2022

At PriceHubble, our real estate automated valuation models (AVMs) are powered by large amounts of precise and insightful data. A study of these models showed that the construction year of a building is a key variable for accurately estimating property prices.

However, in several of our datasets the construction year is missing, which prevents our Data Science teams from leveraging this variable to build more accurate models. We therefore decided to use real estate property images to infer the missing construction years.

Data

The data we use for this task consist of real estate property offers that contain facade images of the building, along with different variables describing the property itself. We filtered this data to retain only offers for which the construction year is known and lies between 1850 and 2020.

[Figure 1]

Strategy

We decided to approach this task as a classification problem, since image-based regression is harder and requires more data of higher quality. To this end, we converted the construction years to construction decades. This gave us a training dataset whose labels fall into 17 classes, each representing one decade.
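The year-to-decade conversion can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the project: the function names are hypothetical, and because the post does not say how the boundary year 2020 is handled, the sketch assumes it is folded into the 2010s class so that exactly 17 classes result.

```python
FIRST_YEAR = 1850   # lower bound of the filtered data
LAST_YEAR = 2020    # upper bound of the filtered data
NUM_CLASSES = 17    # number of decade classes stated in the post


def year_to_decade_class(year):
    """Map a construction year to a 0-based decade class index."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"year {year} is outside {FIRST_YEAR}-{LAST_YEAR}")
    # Assumption: the single year 2020 is merged into the last class (2010s).
    return min((year - FIRST_YEAR) // 10, NUM_CLASSES - 1)


def decade_class_to_label(class_index):
    """Human-readable name of a decade class, e.g. 14 -> '1990s'."""
    return f"{FIRST_YEAR + 10 * class_index}s"


example_class = year_to_decade_class(1999)
example_label = decade_class_to_label(example_class)
```

Keeping the class index 0-based makes it directly usable as a label for a classification loss, while the label helper is only there for display.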

A traditional solution for image-based classification is the use of convolutional neural networks (ConvNets) [1]. However, although state-of-the-art ConvNet architectures now reach very good performance on image classification, these networks still need huge amounts of data for optimal training.

For this reason, we chose a transfer learning approach instead: we start from a ConvNet architecture that has already been trained on a problem similar to ours, so that we only need to retrain the last layers of the network. The prior knowledge acquired from pre-training on a different and much larger image dataset allows us to decrease the training time and reach good accuracy with a reasonable number of data samples. For our use case, we chose the EfficientNetV2 [2] architecture pre-trained on the ImageNet dataset.

To further improve our model accuracy, we enhanced our model with some structural data that we have along with the facade images, such as the geographic location (latitude and longitude) and the energy consumption level. These variables are indeed closely correlated with the construction period.

Implementation

The implementation progresses in several stages:

  • First, we created a multi-layer perceptron network that takes, as input, the processed numerical data of the property, and we trained it on our dataset.
  • In parallel, we also trained our EfficientNetV2 architecture on the images of our training dataset.
  • Finally, we dropped the last layer of each of these two trained networks, concatenated their respective outputs, and fed the result into a third multi-layer perceptron, which we trained while keeping the weights of the first two networks frozen.
[Figure 2]
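The data flow of the final stage can be sketched in plain Python. This is only an illustration of the concatenation step under simplifying assumptions: the two embeddings are hard-coded stand-ins for the activations of the truncated, frozen networks, the embedding sizes and head parameters are dummy values, and the head is reduced to a single linear layer rather than a full multi-layer perceptron.

```python
NUM_CLASSES = 17  # decade classes, as stated in the post


def linear(inputs, weights, biases):
    """Plain fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def fuse(image_embedding, tabular_embedding):
    """Concatenate the outputs of the two truncated networks."""
    return list(image_embedding) + list(tabular_embedding)


# Stand-ins for the outputs of the two frozen networks once their
# last layer has been dropped (hypothetical sizes 4 and 3).
image_embedding = [0.2, 0.0, 1.3, 0.7]
tabular_embedding = [0.9, 0.1, 0.4]
fused = fuse(image_embedding, tabular_embedding)

# Parameters of the head, the only part trained in the final stage.
# Dummy values here; in practice they are learned.
head_weights = [[0.01 * (k + 1)] * len(fused) for k in range(NUM_CLASSES)]
head_biases = [0.0] * NUM_CLASSES

logits = linear(fused, head_weights, head_biases)
predicted_class = max(range(NUM_CLASSES), key=logits.__getitem__)
```

Freezing the first two networks means that only `head_weights` and `head_biases` would receive gradient updates during this last training stage.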

Improving the accuracy

So far, the described approach uses a multi-class classification model, which treats the decades as unrelated categories. Since the construction decade is an ordinal variable, we can get better accuracy by taking the order between the classes into account in our data representation.

The standard approach is to one-hot encode the label vector: if we have K classes and our label belongs to class k, the target vector is t = (0, ..., 0, 1, 0, ..., 0), where only the element t_k is set to 1 and all others are 0.

To encode the order of the classes, a first method changes this encoding so that a data point x belonging to category k is automatically assigned to all lower-order categories (1, 2, ..., k − 1) as well. The target vector of x then becomes t = (1, 1, ..., 1, 0, 0, ..., 0), where the elements t_i (1 ≤ i ≤ k) are set to 1 and all other elements to 0 [3].
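The two target encodings can be compared with a short Python sketch. The function names are hypothetical, and classes are numbered from 1 to K to match the notation used in the text.

```python
def one_hot_encode(k, num_classes):
    """Standard target: only element t_k is 1 (classes numbered 1..K)."""
    return [1 if i == k else 0 for i in range(1, num_classes + 1)]


def ordinal_encode(k, num_classes):
    """Ordinal target: elements t_1..t_k are 1, the remaining ones are 0."""
    return [1 if i <= k else 0 for i in range(1, num_classes + 1)]


# Class k = 3 out of K = 6 classes.
one_hot_target = one_hot_encode(3, 6)   # [0, 0, 1, 0, 0, 0]
ordinal_target = ordinal_encode(3, 6)   # [1, 1, 1, 0, 0, 0]
```

With the ordinal encoding, two neighbouring decades differ in a single element of the target vector, whereas distant decades differ in many, so a prediction that is off by one decade is penalised less than one that is off by ten.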

[Figure 3]

The problem with this method is that the neural network does not guarantee consistent outputs, as illustrated in Figure 3: a higher-order category can receive a larger probability than a lower-order one. To overcome this, we use CORAL (COnsistent RAnk Logits) [4]. This architecture uses weight sharing in the penultimate layer, so that all output nodes share the same weights and differ only in their bias, which ensures the order consistency of the output.
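The difference between independent outputs and CORAL-style outputs can be illustrated with a small Python sketch. It is a simplified illustration, not the implementation from [4]: the function names and all numeric values are made up, and the biases are assumed to be in non-increasing order, which is the property that [4] shows training to produce.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def is_rank_consistent(probabilities):
    """True if the probabilities never increase from one output to the next."""
    return all(a >= b for a, b in zip(probabilities, probabilities[1:]))


def coral_probabilities(features, shared_weights, biases):
    """One shared weight vector for every output node, one bias per node."""
    score = sum(w * x for w, x in zip(shared_weights, features))
    return [sigmoid(score + b) for b in biases]


def predict_rank(probabilities, threshold=0.5):
    """Predicted rank = number of outputs whose probability exceeds 0.5."""
    return sum(p > threshold for p in probabilities)


# Independent output nodes can contradict each other (made-up values):
independent = [0.9, 0.4, 0.7, 0.1]
independent_ok = is_rank_consistent(independent)  # False

# With shared weights, the outputs differ only through their bias.
# Assumption: biases are non-increasing.
coral = coral_probabilities(features=[0.5, -1.0],
                            shared_weights=[2.0, 0.5],
                            biases=[2.0, 0.5, -1.0, -3.0])
coral_ok = is_rank_consistent(coral)  # True
coral_rank = predict_rank(coral)
```

Because every output applies the same monotonic function to the same score plus its own bias, ordered biases directly translate into ordered probabilities, which is why the contradiction seen in `independent` cannot occur.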

Conclusion

By using transfer learning on EfficientNetV2, combining it with a network that takes the structured data as input, and changing the target encoding and the penultimate layer to implement the CORAL technique, we achieved 43% accuracy. We also considered a more tolerant metric that counts a prediction as correct when it is at most one decade away from the true label, and we reached 60% accuracy on this metric.
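Both metrics can be computed with the same helper, sketched below in Python. The function name is hypothetical, and the labels are made-up values for illustration only; they are not the project's data and do not reproduce the figures above.

```python
def accuracy_within(true_classes, predicted_classes, tolerance=0):
    """Share of predictions at most `tolerance` classes away from the truth."""
    if len(true_classes) != len(predicted_classes) or not true_classes:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(abs(t - p) <= tolerance
               for t, p in zip(true_classes, predicted_classes))
    return hits / len(true_classes)


# Made-up decade class indices.
true_classes = [3, 5, 10, 16, 0]
predicted_classes = [3, 6, 8, 16, 1]

exact_accuracy = accuracy_within(true_classes, predicted_classes)
one_decade_accuracy = accuracy_within(true_classes, predicted_classes,
                                      tolerance=1)
```

With `tolerance=0` the helper reduces to standard accuracy, and with `tolerance=1` it accepts predictions in the neighbouring decade as well.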

References

[1] An Introduction to Convolutional Neural Networks, Keiron O'Shea, Ryan Nash

[2] EfficientNetV2: Smaller Models and Faster Training, Mingxing Tan, Quoc V. Le

[3] A neural network approach to ordinal regression, Jianlin Cheng

[4] Rank consistent ordinal regression for neural networks with application to age estimation, Wenzhi Cao, Vahid Mirjalili, Sebastian Raschka

