How to predict building periods from building images
Engineering & Design
Published by Hassen Miri - 15 September 2022
At PriceHubble, our real estate automated valuation models (AVMs) are powered by large amounts of precise and insightful data. A study of these models showed that the construction year of the building is a key variable for accurately estimating property prices.
However, in several of our datasets the construction year is missing, which prevents our Data Science teams from leveraging this variable to build more accurate models. In this context, we decided to use real estate property images to infer the missing construction years.
The data we use for this task consist of real estate property offers that contain facade images of the building, along with different variables describing the property itself. We filtered this data to retain only offers for which the construction year is known and lies between 1850 and 2020.
We decided to approach this task as a classification problem, as regression on images is less straightforward and requires more data of higher quality. To this end, we converted the construction years to construction decades. This gave us a training dataset whose labels span 17 classes, one per decade.
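Concretely, the year-to-decade mapping can be sketched as follows. Note that how the boundary year 2020 is binned is our assumption (we fold it into the last class to keep 17 classes); the post does not specify it.

```python
def year_to_decade_class(year: int) -> int:
    """Map a construction year in [1850, 2020] to one of 17 decade classes.

    Class 0 covers 1850-1859, class 1 covers 1860-1869, ..., class 16
    covers 2010-2020. Folding the single year 2020 into the last bin is
    an assumption on our part, made to keep the label space at 17 classes.
    """
    if not 1850 <= year <= 2020:
        raise ValueError(f"year {year} outside the supported range")
    return min((year - 1850) // 10, 16)
```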
A traditional solution for image-based classification is the use of convolutional neural networks (ConvNets). However, although state-of-the-art ConvNet architectures now achieve strong performance on image classification, these networks still need huge amounts of data for optimal training.
For this reason, we chose a transfer learning approach instead: we start from a ConvNet architecture that has already been trained on a problem similar to ours, so that we only need to retrain the last layers of the network. That way, the prior knowledge acquired from pre-training on a different and much larger image dataset lets us decrease the training time and reach good accuracy with a reasonable number of data samples. For our use case, we chose the EfficientNetV2 architecture pre-trained on the ImageNet dataset.
To further improve accuracy, we enriched the model with some structured data that we have alongside the facade images, such as the geographic location (latitude and longitude) and the energy consumption level. These variables are closely correlated with the construction period.
The implementation progresses in several stages:
- First, we created a multi-layer perceptron network that takes the processed numerical data of the property as input, and trained it on our dataset.
- In parallel, we trained our EfficientNetV2 architecture on the images of our training dataset.
- Finally, we dropped the last layer of these two trained networks, concatenated their respective outputs, and fed the result into a third multi-layer perceptron, which we trained while freezing the weights of the first two networks.
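The fusion stage above can be sketched as follows. The two branch modules and their embedding sizes are stand-ins (the post does not give exact architecture details): a headless EfficientNetV2 would play the role of `image_branch`, and the tabular branch would take features such as latitude, longitude, and energy level.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 17  # construction decades

class FusionModel(nn.Module):
    """Late fusion: concatenate the embeddings of two pre-trained branches
    (whose final classification layers have been dropped) and classify with
    a third MLP. The branch weights are frozen; only the head trains."""

    def __init__(self, image_branch, tabular_branch,
                 image_dim, tabular_dim, hidden=64):
        super().__init__()
        self.image_branch = image_branch
        self.tabular_branch = tabular_branch
        for branch in (self.image_branch, self.tabular_branch):
            for p in branch.parameters():
                p.requires_grad = False  # freeze the pre-trained networks
        self.head = nn.Sequential(
            nn.Linear(image_dim + tabular_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, NUM_CLASSES),
        )

    def forward(self, image_x, tabular_x):
        z = torch.cat([self.image_branch(image_x),
                       self.tabular_branch(tabular_x)], dim=1)
        return self.head(z)

# Stand-in branches with hypothetical dimensions: 1280-d image embeddings
# (EfficientNetV2-S penultimate size) and 3 structured features.
image_branch = nn.Sequential(nn.Linear(1280, 128), nn.ReLU())
tabular_branch = nn.Sequential(nn.Linear(3, 16), nn.ReLU())
model = FusionModel(image_branch, tabular_branch, image_dim=128, tabular_dim=16)

logits = model(torch.randn(4, 1280), torch.randn(4, 3))  # batch of 4
```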
Improving the accuracy
So far, the described approach treats the task as a plain multi-class classification problem. To get better accuracy, and since the construction decade is an ordinal variable, we took the order between the different classes into account in our data representation.
The standard approach uses one-hot encoding for the label vector: with K classes, a label belonging to class k gives the target vector t = (0, ..., 0, 1, 0, ..., 0), where only the element tk is set to 1 and all others to 0.
To encode the order between the classes, we change this encoding so that a data point belonging to class k is automatically assigned to all lower-order classes (1, 2, ..., k − 1) as well. Its target vector then becomes t = (1, 1, ..., 1, 0, ..., 0), where ti is set to 1 for 1 ≤ i ≤ k and all other elements are 0.
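The two encodings can be illustrated side by side; this small sketch uses 0-based class indices rather than the 1-based indices in the text.

```python
import numpy as np

K = 17  # decade classes

def one_hot(k: int) -> np.ndarray:
    """Standard one-hot target: only element k is 1 (classes indexed from 0)."""
    t = np.zeros(K)
    t[k] = 1.0
    return t

def ordinal_target(k: int) -> np.ndarray:
    """Cumulative ordinal target: a sample of class k is also assigned to
    every lower-order class, so elements 0..k are all set to 1."""
    return (np.arange(K) <= k).astype(float)
```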
The problem with this method is that our neural network does not guarantee the consistency of the output, as illustrated in figure 3. To overcome this, we use CORAL (COnsistent RAnk Logits). This architecture uses a weight-sharing technique in the penultimate layer to ensure the order consistency of the output.
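A minimal sketch of such a weight-sharing layer, following the CORAL construction: one shared weight vector produces a single score g(x), and K − 1 independent biases give one logit per binary task P(y > class i). Because every task shares g(x), the probabilities sigmoid(g(x) + b_i) are ordered the same way as the biases for every input; the CORAL loss drives the learned biases to be non-increasing. The bias values below are set by hand for illustration, where a trained model would learn them.

```python
import torch
import torch.nn as nn

class CoralLayer(nn.Module):
    """CORAL-style weight-sharing layer: a single shared weight vector
    (no bias) plus K-1 independent, task-specific biases."""

    def __init__(self, in_features: int, num_classes: int):
        super().__init__()
        self.fc = nn.Linear(in_features, 1, bias=False)  # shared weights
        self.biases = nn.Parameter(torch.zeros(num_classes - 1))

    def forward(self, h):
        # (batch, 1) score broadcast against (K-1,) biases -> (batch, K-1)
        return self.fc(h) + self.biases

K = 17
layer = CoralLayer(in_features=32, num_classes=K)

# Descending biases, as a trained CORAL model would have; the resulting
# K-1 probabilities are then non-increasing for every input.
with torch.no_grad():
    layer.biases.copy_(torch.linspace(3.0, -3.0, K - 1))
probs = torch.sigmoid(layer(torch.randn(5, 32)))

# The predicted class is the number of binary tasks answered "yes".
predicted = (probs > 0.5).sum(dim=1)
```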
By using transfer learning on EfficientNetV2, combining it with an architecture that takes the structured data as input, and changing the target encoding and the penultimate layer to implement the CORAL technique, we succeeded in obtaining 43% accuracy. We also considered a relaxed metric that accepts predictions at most one decade away from the true label, and reached 60% accuracy on that metric.
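The relaxed metric can be computed as a sketch like this (the label values below are hypothetical, purely for illustration):

```python
import numpy as np

def decade_accuracies(y_true, y_pred):
    """Exact accuracy, plus the relaxed accuracy that also counts
    predictions one decade (one class index) away from the true label."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    exact = float(np.mean(y_true == y_pred))
    within_one = float(np.mean(np.abs(y_true - y_pred) <= 1))
    return exact, within_one

# 2 of 4 predictions exact; the prediction 6 vs. label 5 is one decade off.
exact, within_one = decade_accuracies([3, 5, 10, 0], [3, 6, 8, 0])
```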
References
1. An Introduction to Convolutional Neural Networks, Keiron O'Shea, Ryan Nash
2. EfficientNetV2: Smaller Models and Faster Training, Mingxing Tan, Quoc V. Le
3. A Neural Network Approach to Ordinal Regression, Jianlin Cheng
4. Rank Consistent Ordinal Regression for Neural Networks with Application to Age Estimation, Wenzhi Cao, Vahid Mirjalili, Sebastian Raschka