Out-of-Core GPU Volume Rendering of Large-Scale Medical Datasets Using MMU-Style Memory Management

Jun 23, 2026
Vulkan / C++ / Computer Graphics

Company

Objective

Massive data streaming and visualization.

Tools & Technologies

Vulkan, C++, GLSL, ISPC, ECS, MMU, Virtual Texturing

Description

This is a Vulkan port of an earlier OpenGL volume rendering engine written in C++. Volumetric data are processed by a node-based, out-of-core, multi-threaded system whose kernels are vectorized with the Intel ISPC compiler. The software follows the Entity-Component-System (ECS) paradigm, and its UX is inspired by Unity.

Several standard techniques have been implemented to minimize the number of samples taken along a ray (empty-space skipping, early ray termination, pre-integrated transfer functions, image downscaling, stochastic jittering, adaptive sampling…). Some of them were designed specifically for this engine.
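As an illustration of two of these techniques, here is a minimal CPU-side C++ sketch of front-to-back compositing with early ray termination and a jittered start offset. It is not the engine's shader code: the names `marchRay`, `Rgba`, `MarchResult`, the `sample` callback and the 0.99 opacity cutoff are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

struct Rgba { float r, g, b, a; };

struct MarchResult {
    Rgba color;         // accumulated opacity-weighted color and opacity
    std::size_t steps;  // number of samples actually taken
};

// Front-to-back compositing along one ray.
// `sample(t)` is a hypothetical callback returning the classified color and
// opacity at ray parameter t. `jitter` in [0, 1) offsets the first sample by a
// fraction of a step (stochastic jittering trades banding for noise).
MarchResult marchRay(const std::function<Rgba(float)>& sample,
                     float tNear, float tFar, float stepSize,
                     float jitter, float opacityCutoff = 0.99f)
{
    assert(stepSize > 0.0f && jitter >= 0.0f && jitter < 1.0f);
    MarchResult out{{0.0f, 0.0f, 0.0f, 0.0f}, 0};
    for (float t = tNear + jitter * stepSize; t < tFar; t += stepSize) {
        const Rgba s = sample(t);
        // Weight of this sample: what is still visible times its own opacity.
        const float w = (1.0f - out.color.a) * s.a;
        out.color.r += w * s.r;
        out.color.g += w * s.g;
        out.color.b += w * s.b;
        out.color.a += w;
        ++out.steps;
        // Early ray termination: later samples can no longer change the pixel visibly.
        if (out.color.a >= opacityCutoff) break;
    }
    return out;
}
```

With a constant opacity of 0.5 per sample, the accumulated opacity crosses 0.99 after seven samples, so a ray that could take a hundred steps stops after seven.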

Shading uses gradients computed on the fly with finite differences. Rays are launched from a GLSL shader.
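The gradient computation can be sketched on the CPU as follows. The text does not say which finite-difference scheme is used, so central differences with clamp-to-edge addressing are assumed here, and `Volume` and `centralGradient` are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Volume {
    long nx, ny, nz;
    std::vector<float> voxels;  // x varies fastest, then y, then z

    // Clamp-to-edge fetch, mimicking a clamped texture lookup.
    float at(long x, long y, long z) const {
        assert(nx > 0 && ny > 0 && nz > 0);
        x = std::clamp(x, 0L, nx - 1);
        y = std::clamp(y, 0L, ny - 1);
        z = std::clamp(z, 0L, nz - 1);
        return voxels[static_cast<std::size_t>((z * ny + y) * nx + x)];
    }
};

// Central differences: six extra fetches per shaded sample, no stored gradient volume.
std::array<float, 3> centralGradient(const Volume& v, long x, long y, long z)
{
    return {0.5f * (v.at(x + 1, y, z) - v.at(x - 1, y, z)),
            0.5f * (v.at(x, y + 1, z) - v.at(x, y - 1, z)),
            0.5f * (v.at(x, y, z + 1) - v.at(x, y, z - 1))};
}
```

The normalized gradient then serves as the surface normal for shading. Computing it on the fly avoids storing a precomputed gradient volume, at the cost of six additional density fetches per shaded sample.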

Massive Volume Visualization (750 GB)

The data shown here have been duplicated on disk (no instancing). The chosen approach is virtual texturing (Barret (3), Beyer et al. (4)) rather than an octree (Crassin (5)). The implementation has been profiled with Nsight to identify the main bottlenecks and to maximize performance. Bricks are fetched in a ray-guided fashion on a separate thread. They are uploaded through Vulkan async transfer queues and copied into the main brick pool by compute shaders.
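The MMU analogy behind virtual texturing can be sketched as a page-table lookup: a virtual voxel coordinate is split into a brick index and an offset inside the brick, and the page table maps the brick to a slot in the resident brick pool or reports a miss. The sketch below is a simplified single-level CPU version, not the engine's actual layout; the brick size of 32, the `kNotResident` sentinel and the names `PageTable`, `Vec3u` and `translate` are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

constexpr std::uint32_t kBrickSize = 32;             // voxels per brick edge (assumed)
constexpr std::uint32_t kNotResident = 0xFFFFFFFFu;  // sentinel for "brick not in the pool"

struct Vec3u { std::uint32_t x, y, z; };

struct PageTable {
    Vec3u virtualBricks;                 // size of the virtual volume, in bricks
    Vec3u poolBricks;                    // size of the resident brick pool, in bricks
    std::vector<std::uint32_t> entries;  // pool slot per virtual brick, or kNotResident

    // Translate a virtual voxel coordinate into a brick-pool voxel coordinate.
    // An empty result plays the role of a page fault: a ray-guided renderer
    // would record the missing brick so the streaming thread can load it.
    std::optional<Vec3u> translate(Vec3u voxel) const {
        const Vec3u brick{voxel.x / kBrickSize, voxel.y / kBrickSize, voxel.z / kBrickSize};
        assert(brick.x < virtualBricks.x && brick.y < virtualBricks.y && brick.z < virtualBricks.z);
        const std::uint32_t index =
            (brick.z * virtualBricks.y + brick.y) * virtualBricks.x + brick.x;
        const std::uint32_t slot = entries[index];
        if (slot == kNotResident) return std::nullopt;
        // Unpack the linear slot into 3D pool coordinates, then add the in-brick offset.
        const Vec3u poolBrick{slot % poolBricks.x,
                              (slot / poolBricks.x) % poolBricks.y,
                              slot / (poolBricks.x * poolBricks.y)};
        return Vec3u{poolBrick.x * kBrickSize + voxel.x % kBrickSize,
                     poolBrick.y * kBrickSize + voxel.y % kBrickSize,
                     poolBrick.z * kBrickSize + voxel.z % kBrickSize};
    }
};
```

A flat table like this one trades memory for a constant-time lookup with a single indirection, which is the usual argument for virtual texturing over an octree traversal.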

Transfer Function Editing

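Transfer functions are edited in an editor with HSV and opacity channels. Pre-integrated transfer functions are used to improve both quality and performance: they account for the high frequencies that the transfer function introduces between two consecutive samples, which would otherwise require a much higher sampling rate to satisfy the Nyquist criterion. See Engel et al. (1).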
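To make the idea concrete, here is a small C++ sketch that builds the opacity part of a pre-integration table. It is a simplification under stated assumptions: color is omitted, the scalar value is assumed to vary linearly between the front and back samples of a ray segment, and the table is filled from a running integral of the extinction so that each entry costs constant time. The function name `buildPreintegratedOpacity` and the row-major `table[back * n + front]` layout are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// extinction[i] is the extinction coefficient the transfer function assigns to
// scalar value i. The result is an n*n table: table[sb * n + sf] is the opacity
// of a ray segment of length `segmentLength` whose scalar value goes from sf
// (front) to sb (back).
std::vector<float> buildPreintegratedOpacity(const std::vector<float>& extinction,
                                             float segmentLength)
{
    const std::size_t n = extinction.size();

    // Running integral: integral[i] = sum of extinction[0 .. i-1].
    std::vector<double> integral(n + 1, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        integral[i + 1] = integral[i] + extinction[i];

    std::vector<float> table(n * n);
    for (std::size_t sb = 0; sb < n; ++sb) {
        for (std::size_t sf = 0; sf < n; ++sf) {
            const std::size_t lo = std::min(sf, sb);
            const std::size_t hi = std::max(sf, sb);
            // Mean extinction over every scalar value crossed by the segment.
            const double mean =
                (integral[hi + 1] - integral[lo]) / static_cast<double>(hi - lo + 1);
            table[sb * n + sf] = static_cast<float>(1.0 - std::exp(-mean * segmentLength));
        }
    }
    return table;
}
```

With an extinction of {0, 0, 4, 0}, a segment going from scalar 0 to scalar 3 crosses the narrow peak at 2 and receives a non-zero opacity, whereas classifying only the two endpoint samples would miss the peak entirely.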

Volumetric Raytraced Shadows

Ray-traced shadows require secondary rays, which are expensive to compute. I developed an acceleration structure called "interval list octree", inspired by Li, Mueller and Kaufman (2).

Empty-space skipping

Empty space is skipped by sampling a texture that flags regions with no density to sample. To hide the latency of density texture fetches (the main bottleneck), rays are marched four samples at a time, and the four samples are shaded together.
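The two ideas combine as in the following CPU-side C++ sketch, reduced to a single ray for brevity. It is only an illustration of the control flow, not the engine's GLSL: the coarse `occupied` array stands in for the emptiness texture, `density` stands in for the opacity found at each sample position, and the names `marchWithSkipping` and `MarchStats` are assumptions.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct MarchStats {
    float opacity;        // accumulated opacity along the ray
    std::size_t fetches;  // number of density fetches actually issued
};

// density[i]  : opacity at sample i along the ray (stand-in for the density texture)
// occupied[c] : false if coarse cell c, covering `samplesPerCell` samples, is empty
MarchStats marchWithSkipping(const std::vector<float>& density,
                             const std::vector<bool>& occupied,
                             std::size_t samplesPerCell)
{
    constexpr std::size_t kBatch = 4;  // samples fetched and shaded together
    assert(samplesPerCell > 0 && occupied.size() * samplesPerCell >= density.size());

    MarchStats out{0.0f, 0};
    std::size_t i = 0;
    while (i < density.size()) {
        const std::size_t cell = i / samplesPerCell;
        if (!occupied[cell]) {
            i = (cell + 1) * samplesPerCell;  // jump to the start of the next cell
            continue;
        }
        // Issue up to four fetches back to back so their latencies overlap...
        std::array<float, kBatch> batch{};
        const std::size_t count = std::min<std::size_t>(kBatch, density.size() - i);
        for (std::size_t k = 0; k < count; ++k) {
            batch[k] = density[i + k];
            ++out.fetches;
        }
        // ...then composite the whole batch front to back.
        for (std::size_t k = 0; k < count; ++k)
            out.opacity += (1.0f - out.opacity) * batch[k];
        i += count;
    }
    return out;
}
```

On a ray of 16 samples where only the third of four cells is occupied, this takes 4 fetches instead of 16 and produces the same accumulated opacity as a full march.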