
Class definition

A semantic annotation scheme for the point clouds has been defined according to CityGML Level of Detail (LOD) 3/4. In addition to this open data model, the IFC standard and the AAT (Art and Architecture Thesaurus) have been taken into account.

Nine classes were therefore selected, plus a tenth, “other”, which contains all points that do not belong to the previous classes (e.g. paintings, altars, benches, statues, waterspouts).

"arch":0, "column":1, "moldings":2, "floor":3, "door_window":4, "wall":5, "stairs":6, "vault":7, "roof":8, "other":9
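As a minimal sketch of how this mapping might be used in practice, the snippet below stores it as a Python dictionary, builds the inverse lookup, and counts how many of the nine classes (excluding “other”) occur in a scene. The helper names (`class_histogram`, `n_classes_present`) and the example labels are hypothetical and are not part of the benchmark tooling.

```python
from collections import Counter

# Class name -> integer label, copied from the list above.
CLASS_TO_ID = {
    "arch": 0, "column": 1, "moldings": 2, "floor": 3, "door_window": 4,
    "wall": 5, "stairs": 6, "vault": 7, "roof": 8, "other": 9,
}
ID_TO_CLASS = {idx: name for name, idx in CLASS_TO_ID.items()}
OTHER_ID = CLASS_TO_ID["other"]


def class_histogram(labels):
    """Count points per class name for an iterable of integer labels."""
    counts = Counter(labels)
    return {ID_TO_CLASS[idx]: n for idx, n in sorted(counts.items())}


def n_classes_present(labels):
    """Number of distinct classes in a scene, excluding 'other'."""
    return len(set(labels) - {OTHER_ID})


# Hypothetical per-point labels for a tiny scene (not real data).
example_labels = [5, 5, 3, 9, 7, 5, 3]
print(class_histogram(example_labels))
print(f"{n_classes_present(example_labels)}/9 classes present")
```

The second helper corresponds to the kind of "8/9" figure reported per scene: the number of classes present out of the nine named ones, with “other” left out of the count.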

These classes were used to label the point clouds. A further extension of this scheme to a higher Level of Detail (LOD 4/5), to be exploited for instance segmentation, is planned.


The dataset comprises 17 annotated point clouds and a further 10 non-annotated ones; the latter can be labelled by users and added to the main dataset.
Many of the scenes included in the ArCH benchmark are part of, or candidates for, the UNESCO World Heritage List (WHL).

The other scenes are nevertheless part of the historical built heritage and represent various historical periods and architectural styles. This variety could be a drawback when defining the dataset classes, as it introduces inhomogeneity within the same class. However, providing the neural network with differing elements improves its ability to generalise across various cultural heritage (CH) case studies.

Among the labelled scenes of the benchmark, 15 are available for training and 2 for testing. Together they cover churches, chapels, porticoes, loggias, pavilions and cloisters.
The 2 test scenes (named A and B) have different characteristics:

  • the first (A_SMG_portico) represents a simple, almost symmetrical building on one level and with more standard and repetitive geometric elements;
  • the second (B_SMV_chapel_27to35) represents a complex, non-symmetrical building, structured on two levels, surveyed both indoor and outdoor, with different types of vaults, stairways and windows.

These two test scenes were chosen to (i) simplify the comparisons of the results, (ii) assess the effectiveness of the proposed algorithms and (iii) try to highlight the generalisation and learning capability of the networks not only on a relatively simple scene but also on a complex one.
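The split described above can be sketched from the scene names alone. The snippet assumes that test scenes carry the prefix `A` or `B` before the first underscore and that every other scene is a training scene; the function name `split_scenes` and the short scene list are illustrative only.

```python
# Prefixes of the two test scenes, as named in the text (A and B).
TEST_PREFIXES = {"A", "B"}


def split_scenes(scene_names):
    """Split scene names into (train, test) lists based on their prefix."""
    train, test = [], []
    for name in scene_names:
        prefix = name.split("_", 1)[0]  # text before the first underscore
        if prefix in TEST_PREFIXES:
            test.append(name)
        else:
            train.append(name)
    return train, test


# A few scene names taken from the benchmark; the list is not exhaustive.
scenes = ["1_TR_cloister", "4_CA_church", "A_SMG_portico", "B_SMV_chapel_27to35"]
train_scenes, test_scenes = split_scenes(scenes)
print("train:", train_scenes)
print("test:", test_scenes)
```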



Name | N. of points | Scene | Data acquisition | N. of classes (excluding “other”) | Subsampling (cm)
1_TR_cloister | 15,740,229 | Indoor/Outdoor | TLS + UAV | 8/9 | 1
4_CA_church | 4,850,807 | Outdoor | TLS + UAV | 6/9 | 1
5_SMV_chapel_1 | 3,783,412 | Outdoor | TLS + UAV | 9/9 | 1
6_SMV_chapel_2to4 | 6,326,871 | Indoor/Outdoor | TLS + UAV | 9/9 | 1
7_SMV_chapel_24 | 3,571,064 | Outdoor | TLS + UAV | 9/9 | 1
8_SMV_chapel_28 | 3,156,753 | Outdoor | TLS + UAV | 9/9 | 1
9_SMV_chapel_10 | 2,193,189 | Indoor/Outdoor | TLS + UAV | 6/9 | 1
10_SStefano_portico_1 | 3,783,699 | Outdoor | Terrestrial photogrammetry | 8/9 | 1
11_SStefano_portico_2 | 10,047,392 | Outdoor | Terrestrial photogrammetry | 8/9 | 1
14_TRE_square | 9,409,239* | Outdoor | Terrestrial photogrammetry | 8/9 | 1.5
TOTAL | 102,139,969
* Number of points updated from Matrone et al., 2020


Name | N. of points | Scene | Data acquisition | N. of classes (excluding “other”) | Subsampling (cm)
A_SMG_portico | 17,798,012* | Outdoor | TLS + UAV | 9/9 | 1
B_SMV_chapel_27to35 | 16,200,442 | Indoor/Outdoor | TLS + UAV | 9/9 | 1
TOTAL | 33,998,454
* Number of points updated from Matrone et al., 2020
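
As a quick arithmetic check, the total for the two test scenes can be recomputed from the per-scene counts. The sketch below parses the counts as they are printed (with thousands separators and an optional footnote asterisk); `parse_count` is a hypothetical helper, not part of any benchmark tooling.

```python
def parse_count(text):
    """Parse a printed count such as '17,798,012*' into an integer."""
    return int(text.rstrip("*").replace(",", ""))


# Point counts of the two test scenes, copied from the table above.
test_counts = {
    "A_SMG_portico": "17,798,012*",
    "B_SMV_chapel_27to35": "16,200,442",
}

total = sum(parse_count(value) for value in test_counts.values())
print(f"{total:,}")  # 33,998,454
```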