How it works

Deep Learning directly on 3D

Unlike approaches that first project the cloud onto 2D images, VL3D++ does not convert the cloud into images.
Learning happens directly on the spatial structure of the points.
This enables:

Greater geometric precision
Better understanding of complex shapes
Application to very different domains with the same approach

How is it done?

Neighbourhoods and receptive fields

Neighbourhoods are the basis of the input to the models. VL3D++ uses spherical or rectangular neighbourhoods to capture the local environment of each point. It relies on sampling methods such as Grid Subsampling (cell-based division) and FPS (Furthest Point Sampling), which spreads the selected points evenly over the cloud, and on hierarchical versions that let the network learn features at different resolution scales.
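
As a rough illustration of the two sampling methods named above, the following minimal sketch implements grid subsampling and furthest point sampling in plain Python. It is not VL3D++ code: the function names `grid_subsample` and `furthest_point_sampling`, the choice of the cell centroid as representative, and the list-of-tuples point format are assumptions made for this example.

```python
import math
from collections import defaultdict


def grid_subsample(points, cell_size):
    """Return one centroid per occupied grid cell of side `cell_size`."""
    cells = defaultdict(list)
    for p in points:
        # Integer cell index of the point along each axis.
        key = tuple(math.floor(c / cell_size) for c in p)
        cells[key].append(p)
    # Average the points of each cell, coordinate by coordinate.
    return [tuple(sum(axis) / len(group) for axis in zip(*group))
            for group in cells.values()]


def furthest_point_sampling(points, n_samples, start=0):
    """Greedily pick indices of points far from those already picked."""
    selected = [start]
    # Distance from every point to its nearest selected point so far.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < n_samples:
        # The next pick is the point farthest from the current selection.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist = [min(d, math.dist(p, points[nxt]))
                    for d, p in zip(min_dist, points)]
    return selected
```

On four collinear points at x = 0, 1, 2 and 10, starting from the first, the sampler picks the point at x = 10 next and then the one at x = 2, which shows how each pick maximises the distance to the points already chosen. A production implementation would normally use vectorised arrays and a spatial index rather than Python lists.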

Neural architectures

The framework implements state-of-the-art 3D processing architectures. It supports everything from classic PointNet/PointNet++ to point convolutions (KPConv) and sparse convolution networks (Sparse 3D) that substantially reduce memory usage. These models are usually wrapped in hierarchical encoder-decoder structures (U-Net style) for dense and precise segmentation.

Inference

Trained models are integrated into a Predictive Pipeline. During inference, specialised sequencers feed the cloud to the model in blocks or overlapping neighbourhoods, which makes it possible to analyse massive point clouds that exceed RAM capacity.

The complete VL3D++ pipeline

01/Import the raw point cloud

VL3D++ works directly on 3D point clouds generated by aerial or terrestrial LiDAR, photogrammetry, 3D scanners, volumetric reconstructions (CT/MRI), depth sensors, bathymetry or industrial scanning, etc. It operates on 3D geometry without converting data into images or going through intermediate 2D representations.

02/Preprocessing

Data Mining components extract geometric, volumetric, height and colour (HSV) features. Transformers normalise and clean the data (Imputers), while Decorators enable intelligent subsampling (FPS) and structure smoothing (Simple smoother).
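
To give a feel for what per-point feature extraction can look like, here is a minimal sketch that computes a height feature and the HSV colour components for each point. The function name `height_and_hsv`, the definition of height as elevation above the lowest point of the cloud, and the 0 to 255 RGB input range are assumptions for this example; the source does not specify how VL3D++ computes these features.

```python
import colorsys


def height_and_hsv(points, colours):
    """Per-point height above the lowest point, plus HSV colour.

    points:  list of (x, y, z) tuples
    colours: list of (r, g, b) tuples with channels in 0..255
    """
    z_min = min(p[2] for p in points)
    features = []
    for (_, _, z), (r, g, b) in zip(points, colours):
        # colorsys expects channels in [0, 1] and returns h, s, v in [0, 1].
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        features.append({"height": z - z_min, "h": h, "s": s, "v": v})
    return features
```

HSV separates hue from brightness, which tends to make colour features less sensitive to lighting changes than raw RGB values.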

03/Point selection and neighbourhood definition

To analyse the cloud, the system does not process all points at once. It selects points of interest and defines their local neighbourhood, that is, the set of nearby points providing geometric context. This neighbourhood is defined through geometric volumes such as spheres, cubes or cylinders that encapsulate the spatial environment of each point.
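
The three volume types mentioned above can be sketched as simple membership tests around a centre point. This is an illustrative sketch only: the function names are hypothetical, the cube is axis-aligned, and the cylinder is assumed to be vertical and unbounded along z, which the source does not state.

```python
import math


def in_sphere(p, centre, radius):
    """True if p lies within `radius` of the centre in 3D."""
    return math.dist(p, centre) <= radius


def in_cube(p, centre, half_side):
    """True if p lies in the axis-aligned cube of side 2 * half_side."""
    return all(abs(a - b) <= half_side for a, b in zip(p, centre))


def in_cylinder(p, centre, radius):
    """True if p lies in a vertical cylinder (only x and y are checked)."""
    return math.hypot(p[0] - centre[0], p[1] - centre[1]) <= radius


def neighbourhood(points, centre, contains, size):
    """Indices of the points that fall inside the chosen volume."""
    return [i for i, p in enumerate(points) if contains(p, centre, size)]
```

For example, `neighbourhood(points, centre, in_cylinder, 1.0)` keeps every point whose horizontal distance to the centre is at most 1, whatever its height, which is a common choice for aerial data where vertical context matters.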

04/Training

Training is configured modularly through JSON files that define the exact pipeline flow. It includes advanced validation strategies such as Stratified K-folding, class balancing techniques (SMOTE) and hyperparameter optimisation (Grid/Random Search).
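
The source does not show the JSON format, so none is reproduced here. To illustrate only the idea behind stratified K-folding, the sketch below splits sample indices into K folds so that each fold keeps roughly the same class proportions as the full dataset. The function name `stratified_kfold` and its signature are assumptions for this example, not part of VL3D++.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the samples of this class round-robin across the folds,
        # so every fold receives a similar share of the class.
        for j, i in enumerate(indices):
            folds[j % k].append(i)

    for f in range(k):
        validation = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, validation
```

With 8 samples of one class and 4 of another and `k=4`, every validation fold contains 2 samples of the first class and 1 of the second, so a rare class is never missing from a fold by accident.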

04.1/Receptive field: ordering the unordered

Point clouds are unordered by nature, whereas a neural network needs ordered, fixed-size inputs. This is one of the key technical points of VL3D++: neighbourhoods are transformed into regular structures called receptive fields, which turn unordered points into inputs a neural network can process.
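
One simple way to obtain an ordered, fixed-size input from a variable-size neighbourhood is sketched below: sort the neighbours by distance to the centre, truncate or pad to a fixed length, and express coordinates relative to the centre. This is an assumption-laden illustration, not the method VL3D++ uses, which may instead rely on the sampling strategies described earlier. The function name `receptive_field` is hypothetical.

```python
import math


def receptive_field(neighbours, centre, n_points):
    """Return exactly n_points centre-relative coordinates, nearest first."""
    if not neighbours:
        raise ValueError("empty neighbourhood")
    # Sorting by distance gives the same order however the input is shuffled
    # (as long as no two neighbours are at exactly the same distance).
    ordered = sorted(neighbours, key=lambda p: math.dist(p, centre))
    # Too many points: keep the nearest ones.
    ordered = ordered[:n_points]
    # Too few points: pad by repeating the farthest kept point.
    ordered += [ordered[-1]] * (n_points - len(ordered))
    # Centre the coordinates so the network sees local geometry only.
    return [tuple(a - b for a, b in zip(p, centre)) for p in ordered]
```

Whatever the size or ordering of the input neighbourhood, the output always has `n_points` rows, which is the property a network with fixed-size inputs requires.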

04.2/Deep Learning processing in 3D

The receptive fields feed the deep neural network. The network learns complex geometric patterns directly on the 3D structure, without working with images or 2D projections.

04.3/Prediction on each cloud fragment

The model generates a prediction for each receptive field, assigning classes and confidence levels to the points analysed in that local environment. At this point, the result is still fragmented into small portions of the cloud.

04.4/Propagation and reconstruction of the complete cloud

The prediction step is repeated systematically until the entire cloud is covered. The local predictions are then propagated and aggregated to rebuild the original point cloud, now fully classified.
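
The aggregation and propagation step can be sketched as follows, under stated assumptions: each receptive field returns class probabilities for a subset of the cloud, overlapping predictions are summed per point, and points that no receptive field covered inherit the label of their nearest covered point. The function name `merge_and_propagate`, the input format and the nearest-neighbour rule are illustrative choices, not confirmed details of VL3D++.

```python
import math


def merge_and_propagate(points, fields, n_classes):
    """Merge overlapping local predictions into one label per point.

    points: list of (x, y, z) tuples for the whole cloud
    fields: list of (indices, probs); probs[j] holds the class
            probabilities predicted for cloud point indices[j]
    """
    sums = [[0.0] * n_classes for _ in points]
    covered = set()
    for indices, probs in fields:
        for i, p in zip(indices, probs):
            covered.add(i)
            # Accumulate probabilities from every field that saw this point.
            for c in range(n_classes):
                sums[i][c] += p[c]

    labels = []
    for i, p in enumerate(points):
        # Uncovered points borrow the prediction of the nearest covered one.
        src = i if i in covered else min(
            covered, key=lambda j: math.dist(p, points[j]))
        labels.append(max(range(n_classes), key=sums[src].__getitem__))
    return labels
```

Summing probabilities lets confident predictions outweigh hesitant ones where receptive fields overlap. The brute-force nearest-neighbour search is quadratic and only suitable for a small demonstration; a spatial index would replace it in practice.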

05/Model prediction and evaluation: errors, uncertainty and metrics

The system generates a serialised Predictive Pipeline that encapsulates all of the model's knowledge. Results are evaluated automatically through standard metrics (OA, R, P, F1-score, IoU) and summarised in detailed visual reports. Once the model is built, the system generates additional information to evaluate the quality of the result:

• Areas where the model is wrong
• Areas where the model has doubts
• Standard quantitative point-by-point metrics

The model improves on its own, with minimal expert input

VL3D++ does not need an expert to classify thousands of clouds by hand.
The system automatically detects where it has doubts and asks for help only in those areas.

We start with a few already-classified clouds.

The model learns to recognise patterns in 3D.

It classifies new clouds and detects where it's unsure.

The expert only reviews those areas and the system improves.

How does the model learn to classify point clouds better?

The model learns and detects where it has doubts, the expert corrects the errors in those areas, and the system improves with each iteration.

"Human-in-the-loop" paradigm.
A semi-automatic training approach where a human expert (oracle) collaborates with the model. It enables working with large volumes of unlabelled data, optimising manual supervision time.

Iterative Refinement Cycle.
The process starts with an "initial budget" (a small set of labelled data). The model trains, predicts on the rest of the cloud and the expert steps in to correct and validate the most critical results.

Uncertainty-Guided Selection.
The system automatically identifies the regions where the model has more doubts (high uncertainty). The expert only labels these key areas, which maximises the model's improvement with the minimum human effort.

Labelling Efficiency.
By focusing human attention on the most informative points, state-of-the-art accuracy is achieved without the need to manually process billions of points exhaustively.
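
The uncertainty-guided selection described above can be illustrated with a short sketch. The source does not say which uncertainty measure VL3D++ uses; Shannon entropy of the predicted class probabilities is assumed here as one common choice, and the names `entropy` and `most_uncertain` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability vector (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def most_uncertain(probabilities, budget):
    """Indices of the `budget` points with the highest entropy."""
    scores = [entropy(p) for p in probabilities]
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return ranked[:budget]
```

A point predicted as `[0.98, 0.01, 0.01]` has low entropy and is left alone, while one predicted as `[0.34, 0.33, 0.33]` is ranked first for expert review. The `budget` parameter plays the role of the labelling effort the expert is willing to spend per iteration.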

Metrics that validate the result

Every classification comes with standard metrics that allow the result to be evaluated objectively:

Overall Accuracy

The percentage of all processed points that the model assigned to their correct category. It is the simplest metric for understanding overall performance.

Precision

Measures the reliability of the model's predictions. If the model classifies a point as "tree", precision tells you how likely it is that it really is one. A high precision means few false alarms.

Recall

Evaluates the model's ability to find all elements of a class. A high recall means the model identifies almost all objects present (for instance, buildings), leaving almost none out.

F1 Score

The harmonic mean of precision and recall. It provides a joint view of the model's quality, and is high only when the model is both reliable and able to detect most of the points.

Jaccard Index

Also known as Intersection over Union (IoU), it measures how much the model's prediction overlaps with reality. It is a fundamental metric in 3D segmentation: it reaches its maximum only when the predicted region coincides exactly with the real one.
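
The five metrics above can all be derived from a confusion matrix. The following minimal sketch shows the standard definitions; the function names are chosen for this example, and returning 0.0 when a metric is undefined (for instance, precision of a class that was never predicted) is an assumption of the sketch rather than a documented VL3D++ behaviour.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """cm[t][p] counts points of reference class t predicted as class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def ratio(num, den):
    """Division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(cm):
    """Return (overall accuracy, list of per-class metric dicts)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    overall_accuracy = ratio(sum(cm[c][c] for c in range(n)), total)
    per_class = []
    for c in range(n):
        tp = cm[c][c]                                # correctly labelled c
        fp = sum(cm[r][c] for r in range(n)) - tp    # wrongly labelled c
        fn = sum(cm[c]) - tp                         # points of c missed
        precision = ratio(tp, tp + fp)
        recall = ratio(tp, tp + fn)
        per_class.append({
            "precision": precision,
            "recall": recall,
            "f1": ratio(2 * precision * recall, precision + recall),
            "iou": ratio(tp, tp + fp + fn),
        })
    return overall_accuracy, per_class
```

Note that IoU is never higher than F1 for the same class, because it counts each false positive and false negative against a single count of the true positives.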

wP, wR, wF1, wIoU

Versions of the previous metrics that are adjusted according to the abundance of each category. They prevent success in very frequent classes (such as terrain) from masking errors in poorly represented classes (such as urban furniture).
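
The source does not spell out the weighting scheme. A common convention, assumed in the sketch below, weights each class's score by its number of reference points (its support); the unweighted mean over classes is shown alongside for comparison. Both function names are hypothetical.

```python
def macro_mean(scores):
    """Unweighted mean: every class counts the same."""
    return sum(scores) / len(scores)


def weighted_mean(scores, support):
    """Mean of per-class scores weighted by the points in each class."""
    total = sum(support)
    return sum(s * n for s, n in zip(scores, support)) / total
```

Computing both figures for the same per-class scores makes it visible how much each class contributes to the final number, so the weighting actually used by a given report should always be checked before comparing results.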

MCC

A metric that measures the correlation between the model's predictions and the ground truth. Unlike other metrics, it considers all hits and errors (both positive and negative) in a balanced way. Its value ranges between -1 and +1, where +1 indicates a perfect model, 0 a performance no better than chance, and -1 a total contradiction, which makes it one of the most demanding reliability tests in data science.

Kappa

Measures how much better the model is compared to a classification made purely by chance. It is an extra guarantee that the results are the fruit of real learning by the system.
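
Both MCC and Kappa can be computed from the same confusion matrix. The sketch below uses the standard multi-class generalisation of MCC and Cohen's kappa; whether VL3D++ uses exactly these formulations is an assumption, as are the function names and the choice to return 0.0 when the denominator vanishes.

```python
import math


def _marginals(cm):
    """Total count, diagonal sum, row sums (reference), column sums (predicted)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(n))
    true_counts = [sum(cm[c]) for c in range(n)]
    pred_counts = [sum(cm[r][c] for r in range(n)) for c in range(n)]
    return total, correct, true_counts, pred_counts


def mcc(cm):
    """Multi-class Matthews correlation coefficient."""
    s, c, t, p = _marginals(cm)
    num = c * s - sum(tk * pk for tk, pk in zip(t, p))
    den = math.sqrt((s * s - sum(pk * pk for pk in p))
                    * (s * s - sum(tk * tk for tk in t)))
    return num / den if den else 0.0


def cohen_kappa(cm):
    """Agreement beyond what chance alone would produce."""
    s, c, t, p = _marginals(cm)
    observed = c / s
    # Expected agreement if predictions were independent of the reference.
    expected = sum(tk * pk for tk, pk in zip(t, p)) / (s * s)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0
```

Both scores equal 1 for a perfect classification and fall towards 0 when the agreement is no better than what the class frequencies alone would produce.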

We foster an ecosystem of open innovation and continuous collaboration between academia and industry. Our technical and research team can guide you in exploring how to apply VL3D++ to your case. Anyone interested in professional point cloud classification, or who needs a specific technical functionality developed for their workflows, can contact our development team directly. We are open to joint research projects, technology transfer and advanced support for implementing the framework in production environments.