Introducing Ammolite

Introduction

Welcome to the inaugural article of the metaview project. metaview is an effort to create a universal platform for VR/AR applications, similar to how the web browser is a platform for web applications. For more information about metaview and its goals, see About. This article covers ammolite, a work-in-progress rendering engine with a focus on VR/AR.

The most distinctive aspect of a VR/AR application compared to a regular desktop application is undoubtedly the level of immersion it facilitates. This is achieved using an indispensable peripheral -- the VR headset. The VR headset provides two main features: position and motion tracking of the headset itself, and a pair of screens to provide 3D vision with depth perception. ammolite aims to provide a solution for the latter; the goal is a rendering engine optimized for VR/AR applications.

With this goal in mind, a universal solution will require a wide range of features, including support for physically based rendering, in order to make it possible to create environments and models that are as realistic as possible. The GL Transmission Format (glTF) seems to be a perfect fit for the task; it is a widespread format for graphics asset delivery built with modern graphics APIs in mind, meaning artists can create assets in popular 3D modeling tools such as Blender and use those assets within ammolite.

Vulkan is a cross-platform, low-level graphics API, which suits the needs of the project perfectly, as high performance will be vital in our use case. For the same reason, the programming language Rust has been chosen: its design focuses on making it possible to use multi-threading and parallelism effectively, along with many other useful features that modern languages usually provide.

Development and current status

There are currently two productive ways to interface with Vulkan from Rust -- Vulkano and gfx-rs. gfx-rs is a highly sophisticated library that provides a Vulkan-like API implemented on top of Vulkan as well as other graphics APIs like OpenGL 2.1, DX11/12 and Metal. However, since VR headsets require high-end hardware, we can assume the hardware used supports Vulkan and take the simpler path of choosing Vulkano over gfx-rs. That's about as low-level as it gets, which, of course, carries the obvious advantages and disadvantages -- while being heavily customizable, this customizability requires a lot of boilerplate code to get anything working.

Getting a triangle on screen took ~550 lines of code.
Figuring out the math for perspective projection, the intricacies of loading a texture to the GPU and the way the depth buffer works took another ~1100 lines of code.

Figuring out how to load glTF models was not particularly difficult, as there already is an implementation of a glTF parser which handles the parsing for us; the only task left is to transfer the data to the GPU. Thankfully, Vulkano provides nice synchronization primitives in the form of GpuFutures (similar to Java's Futures or JavaScript's Promises, but managing GPU-CPU synchronization).
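
Just to illustrate the shape of this code path, here is a minimal sketch of loading vertex positions with the gltf crate and uploading them through Vulkano. It assumes an older Vulkano release where ImmutableBuffer::from_iter takes the data, the buffer usage and the queue (the exact API differs between Vulkano versions), and upload_positions is just an illustrative name, not ammolite's actual API.

```rust
use std::sync::Arc;
use vulkano::buffer::{BufferUsage, ImmutableBuffer};
use vulkano::device::Queue;
use vulkano::sync::GpuFuture;

/// Read the vertex positions of the first primitive in a glTF file and upload
/// them to the GPU, returning the buffer together with the GpuFuture that
/// signals when the transfer has finished.
fn upload_positions(
    path: &str,
    queue: Arc<Queue>,
) -> Result<(Arc<ImmutableBuffer<[[f32; 3]]>>, Box<dyn GpuFuture>), Box<dyn std::error::Error>> {
    // The `gltf` crate parses the document and resolves external buffers/images.
    let (document, buffers, _images) = gltf::import(path)?;

    // For brevity, only the first primitive of the first mesh is considered.
    let mesh = document.meshes().next().ok_or("no meshes")?;
    let primitive = mesh.primitives().next().ok_or("no primitives")?;
    let reader = primitive.reader(|buffer| Some(&*buffers[buffer.index()]));
    let positions: Vec<[f32; 3]> = reader.read_positions().ok_or("no positions")?.collect();

    // `from_iter` records the staging copy and hands back a GpuFuture; rendering
    // that uses the buffer must wait on (or join with) that future.
    let (buffer, future) =
        ImmutableBuffer::from_iter(positions.into_iter(), BufferUsage::vertex_buffer(), queue)?;
    Ok((buffer, Box::new(future)))
}
```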

The Damaged Helmet glTF model with the albedo (base color) texture.

Transparency

It turns out that transparency is a difficult problem in rasterization. The reason is that rasterization works by drawing triangles on top of a framebuffer. Thus, in order to draw transparent objects, we need to draw all opaque objects first and then draw the transparent ones over the opaque objects that have already been accumulated in the framebuffer. Not only that, the transparent triangles themselves have to be rendered in back-to-front order to get the correct resulting fragment color (that's not entirely correct either, but I'll leave the details out). That means we have to sort the models before drawing them, but sorting geometry is a very expensive task.

$$ C_{f,n} = \alpha_n C_n + (1 - \alpha_n)C_{f,n-1} $$ $$ C_{f,0} = C_0 $$
Calculating the final fragment color $C_{f,n}$ when drawing geometry in back-to-front order, with $C_n$ being the RGB vector and $\alpha_n$ being the opacity component. $C_0$ is the background color with $\alpha_0 = 1$.
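
To make the recurrence concrete, here is a small CPU-side sketch of the compositing it describes; the Fragment type is hypothetical, and in practice this blending is performed by the GPU's blending stage rather than in a loop like this.

```rust
/// One transparent fragment covering a pixel: straight (non-premultiplied)
/// RGB color and opacity.
struct Fragment {
    color: [f32; 3],
    alpha: f32,
}

/// Composite fragments over a background color, assuming the slice really is
/// ordered from the farthest fragment to the nearest one. This is exactly the
/// recurrence C_{f,n} = a_n * C_n + (1 - a_n) * C_{f,n-1} with C_{f,0} = C_0.
fn composite(background: [f32; 3], sorted_back_to_front: &[Fragment]) -> [f32; 3] {
    let mut result = background;
    for fragment in sorted_back_to_front {
        for channel in 0..3 {
            result[channel] = fragment.alpha * fragment.color[channel]
                + (1.0 - fragment.alpha) * result[channel];
        }
    }
    result
}
```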

Enter order-independent transparency (OIT). OIT is a technique for rendering transparent primitives without having to sort them on the CPU first. There are various ways to go about OIT, and there is always a trade-off between quality and computation time. As a first implementation of transparency for ammolite, I chose Weighted Blended OIT, as it seemed like the simplest yet still versatile implementation of OIT that would produce good results. This technique approximates the resulting color by computing a weighted average of the fragments' RGB components, with a weight that grows the closer the fragment is to the camera. It is obviously an approximation: since there is no ordering of primitives, the recurrence above cannot be used.

$$ C_f = \frac{\sum_{i=1}^n C_i \cdot w(z_i,\alpha_i)}{\sum_{i=1}^n w(z_i,\alpha_i)}\left(1-\prod_{i=1}^n(1-\alpha_i)\right) + C_0 \prod_{i=1}^n(1-\alpha_i) $$
Calculating the final fragment color $C_f$ when drawing geometry with Weighted Blended OIT, with $w$ being the weight function, which is decreasing in both parameters. Note that this expression contains no recurrence.
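
For comparison with the sketch above, the following evaluates this formula directly on the CPU; the weight function is a made-up placeholder (the actual choice of $w$ is what the Weighted Blended OIT paper tunes), and in the real renderer the sums and the product are accumulated in framebuffer attachments during a single pass rather than in a loop.

```rust
/// A transparent fragment with its normalized depth z in [0, 1].
struct TransparentFragment {
    color: [f32; 3],
    alpha: f32,
    z: f32,
}

/// Placeholder weight function: larger for fragments closer to the camera.
fn weight(z: f32, _alpha: f32) -> f32 {
    (1.0 - z).powi(3).max(1e-2)
}

/// Order-independent resolve: the fragments may come in any order, because
/// only commutative sums and a product over them are used.
fn composite_wboit(background: [f32; 3], fragments: &[TransparentFragment]) -> [f32; 3] {
    let mut weighted_color_sum = [0.0f32; 3];
    let mut weight_sum = 0.0f32;
    let mut transmittance = 1.0f32; // running product of (1 - alpha_i)

    for fragment in fragments {
        let w = weight(fragment.z, fragment.alpha);
        for channel in 0..3 {
            weighted_color_sum[channel] += fragment.color[channel] * w;
        }
        weight_sum += w;
        transmittance *= 1.0 - fragment.alpha;
    }

    let mut result = [0.0f32; 3];
    for channel in 0..3 {
        let average = if weight_sum > 0.0 {
            weighted_color_sum[channel] / weight_sum
        } else {
            0.0
        };
        result[channel] = average * (1.0 - transmittance) + background[channel] * transmittance;
    }
    result
}
```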
An implementation of Weighted Blended OIT within ammolite and alpha masking passing all glTF specification tests. ~3900 lines of code in total.
The albedo with Weighted Blended OIT of the Steampunk Explorer model by miashow. Note the transparency of the light bulb and the "windshield".

Normal mapping

Normal mapping works by storing normal directions relative to the model's surface. Instead of having to increase the number of vertices to add fine surface detail to the model, we can store these normal directions in a texture.

$$ \vec{s}\in\left\langle 0;1\right\rangle ^{3} $$ $$ \vec{n_{t}}=2\vec{s}-1 $$
We unpack $\vec{s}$, the color sample at the specific texture coordinate, into the vector $\vec{n_{t}}$. This is the "surface-relative" normal.
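
In code, the unpacking is just a per-channel remap from $\left\langle 0;1\right\rangle$ to $\left\langle -1;1\right\rangle$; a tiny CPU-side sketch (the texture sampling itself is omitted):

```rust
/// Map a sampled normal-map texel from the [0, 1] color range back to the
/// [-1, 1] tangent-space vector it encodes: n_t = 2s - 1, per channel.
fn unpack_tangent_space_normal(sample: [f32; 3]) -> [f32; 3] {
    [
        2.0 * sample[0] - 1.0,
        2.0 * sample[1] - 1.0,
        2.0 * sample[2] - 1.0,
    ]
}
```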

Here is what a typical normal map looks like.

Normal maps are typically mostly blueish-gray textures. This one is taken from the Damaged Helmet model. The blue color is dominant because it corresponds to the Z coordinate of the normal, and most normals point directly away from the surface.

The vectors we retrieve from the texture are in so-called tangent space. Since I do most of the fragment shader calculations in world space, which I find the most intuitive, we have to transform the vector.

$$ M_{\mathrm{T}\rightarrow\mathrm{K}}=\left(\vec{t_{w}},\vec{b_{w}},\vec{n_{w}}\right) $$ $$ \vec{n_{f}}=\mathrm{normalize}(M_{\mathrm{T}\rightarrow\mathrm{K}}\cdot\vec{n_{t}}) $$
$M_{\mathrm{T}\rightarrow\mathrm{K}}$ is the transformation matrix from tangent space to canonical space, constructed by vectors in world space, namely the tangent $\vec{t_w}$, the bitangent $\vec{b_w}$ and the normal $\vec{n_w}$. We get the resulting final normal in world space $\vec{n_{f}}$ by applying this matrix to the sampled normal in tangent space.
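
A sketch of that change of basis, again written on the CPU for clarity; in ammolite the equivalent happens in the fragment shader, and plain arrays stand in here for whatever vector type the math library provides.

```rust
/// Transform a tangent-space normal into world space: multiply by the matrix
/// whose columns are the world-space tangent, bitangent and normal, then
/// renormalize the result.
fn tangent_to_world_normal(
    tangent: [f32; 3],
    bitangent: [f32; 3],
    normal: [f32; 3],
    n_tangent_space: [f32; 3],
) -> [f32; 3] {
    let mut n_world = [0.0f32; 3];
    for i in 0..3 {
        // Column-vector convention: the result is a linear combination of the
        // basis vectors weighted by the components of the tangent-space normal.
        n_world[i] = tangent[i] * n_tangent_space[0]
            + bitangent[i] * n_tangent_space[1]
            + normal[i] * n_tangent_space[2];
    }
    let length =
        (n_world[0] * n_world[0] + n_world[1] * n_world[1] + n_world[2] * n_world[2]).sqrt();
    [n_world[0] / length, n_world[1] / length, n_world[2] / length]
}
```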

Now let's compare the difference between using and not using a normal map.

An animated GIF demonstrating the difference between using and not using a normal map. As we can see, the difference is quite significant; normal mapping adds a lot of perceived detail to our geometry.

Physically based rendering

The glTF specification describes a reference implementation of PBR, which I have attempted to add to ammolite. It seems that the most common way of representing light in a scene is via multiple precomputed cubemaps with some kind of spatial interpolation between them.

A cubemap is suitable for representing static light sources in a scene.

Unfortunately, glTF currently does not specify a way to provide cubemaps for a scene. This may be by design, as the expected cubemaps should correspond to how the scene is rendered by the implementation; therefore, cubemaps should be generated by the implementation itself. Nevertheless, I think it should still be possible to let the artist specify which locations in the scene to create cubemaps from. Ideally, one would use ray tracing techniques to sample the radiance of desired points in a dynamic scene, as ray tracing is particularly suitable for this kind of problem. Traditional rasterization techniques fall short, especially when trying to render dynamic volumetric lights. Rasterization does, however, offer an elegant solution for the simplest case of dynamic lighting -- point lights, where the source of radiance is reduced to a single point with no volume.

With these considerations in mind, I've decided to first implement the reference PBR using point lights only. While writing this article, I have also discovered what I currently consider the best-written guide on PBR theory and implementation -- Learn OpenGL, which has been an incredibly useful resource while trying to debug my implementation of the glTF reference.

What follows are some results of the implementation. It's clear that the missing image-based lighting (IBL) has a significant impact on the resulting look of the model; without any IBL, the model appears mostly dark. The scene in which the models were rendered contained 3 white point lights with no attenuation.
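
To give an idea of what the per-light computation looks like, here is a heavily simplified CPU-side sketch: Lambertian diffuse plus a GGX specular lobe with Schlick Fresnel and an approximate Smith visibility term, parameterized by the metallic-roughness inputs from glTF. It is an illustration under those assumptions, not ammolite's actual shader code, and it omits emissive, occlusion and IBL terms.

```rust
use std::f32::consts::PI;

type Vec3 = [f32; 3];

fn dot(a: Vec3, b: Vec3) -> f32 { a[0] * b[0] + a[1] * b[1] + a[2] * b[2] }
fn scale(a: Vec3, s: f32) -> Vec3 { [a[0] * s, a[1] * s, a[2] * s] }
fn add(a: Vec3, b: Vec3) -> Vec3 { [a[0] + b[0], a[1] + b[1], a[2] + b[2]] }
fn normalize(a: Vec3) -> Vec3 { scale(a, 1.0 / dot(a, a).sqrt()) }
fn lerp(a: Vec3, b: Vec3, t: f32) -> Vec3 { add(scale(a, 1.0 - t), scale(b, t)) }

/// Outgoing radiance at a surface point lit by one white point light with no
/// attenuation. All direction vectors are unit length and in world space.
fn shade_point_light(
    base_color: Vec3,
    metallic: f32,
    roughness: f32,
    normal: Vec3,    // shading normal, e.g. after normal mapping
    to_view: Vec3,   // from the surface towards the camera
    to_light: Vec3,  // from the surface towards the light
    light_intensity: f32,
) -> Vec3 {
    let n_dot_l = dot(normal, to_light).max(0.0);
    let n_dot_v = dot(normal, to_view).max(1e-4);
    if n_dot_l <= 0.0 {
        return [0.0; 3];
    }
    let half = normalize(add(to_view, to_light));
    let n_dot_h = dot(normal, half).max(0.0);
    let v_dot_h = dot(to_view, half).max(0.0);

    // Metallic-roughness parameterization: dielectrics reflect ~4% specularly.
    let f0 = lerp([0.04, 0.04, 0.04], base_color, metallic);
    let diffuse_color = scale(base_color, (1.0 - 0.04) * (1.0 - metallic));
    let alpha = roughness * roughness;

    // GGX (Trowbridge-Reitz) normal distribution.
    let denom = n_dot_h * n_dot_h * (alpha * alpha - 1.0) + 1.0;
    let d = alpha * alpha / (PI * denom * denom);
    // Schlick's Fresnel approximation.
    let fresnel = add(
        f0,
        scale(
            [1.0 - f0[0], 1.0 - f0[1], 1.0 - f0[2]],
            (1.0 - v_dot_h).powi(5),
        ),
    );
    // Approximate Smith visibility term, folded together with 1 / (4 N·L N·V).
    let k = alpha / 2.0;
    let visibility = 1.0 / (4.0 * (n_dot_v * (1.0 - k) + k) * (n_dot_l * (1.0 - k) + k));

    let specular = scale(fresnel, d * visibility);
    let diffuse = scale(diffuse_color, 1.0 / PI);
    // White light with no attenuation: radiance = intensity * N·L * BRDF.
    scale(add(diffuse, specular), light_intensity * n_dot_l)
}
```

In the fragment shader, this contribution would simply be summed over all point lights in the scene.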

The current implementation contains ~4700 lines of code.

Conclusion and future work

From the results of the current PBR implementation, it is evident that having a way to specify volumetric lighting would be desirable. Utilizing image-based lighting seems like the most versatile approach when using conventional rasterization; this approach requires texture mip-mapping, which is not currently implemented in Vulkano. Recently released consumer NVIDIA GPUs provide hardware acceleration for ray tracing, a technique which offers a solution for rendering lighting in dynamic scenes. Unfortunately, ray tracing is currently too computationally intensive to be used in VR, due to the high resolutions VR headsets must render at.

In the future, I would like to add support for stereo rendering for VR headsets as well as more glTF compatibility improvements, such as animation.

The source code for ammolite is, of course, publicly accessible, as will be the rest of the metaview codebase.

Acknowledgements

I have to give a shoutout to RenderDoc, which has been, and continues to be, an invaluable tool for learning the Vulkan API, and which has saved me hours of manual debugging. A huge thanks to baldurk for making the tool free and available to anyone.

The Damaged Helmet glTF model used in this article is published under the Creative Commons Attribution-NonCommercial license, was created by theblueturtle_ and is available for download here.