Streamlining Subpasses
Just over two years ago, the Khronos Vulkan® Working Group introduced the VK_KHR_dynamic_rendering extension in a blog titled “Streamlining Render Passes”. This extension allowed developers to bypass the creation of the complex render pass and framebuffer objects that had been in Vulkan since version 1.0, significantly reducing the complexity required to start rendering in Vulkan.
While VK_KHR_dynamic_rendering solved several problems with rendering, developers using input attachments or subpasses could not port to the new API, because dynamic rendering offered no equivalent functionality.
Today we’re happy to announce the release of VK_KHR_dynamic_rendering_local_read, which adds support for local dependencies to dynamic rendering, enabling developers to fully move over to dynamic rendering as support is rolled out.
Local Reads
The primary functionality exposed in this extension is the ability to execute pipeline barriers using VK_DEPENDENCY_BY_REGION_BIT inside a dynamic render pass, allowing framebuffer-local dependencies between draw calls on either side of the barrier. These dependencies can be between rendering attachments and input attachments (as in the original render pass API), or additionally between storage resources or pointer accesses for greater flexibility.

Local reads can be used for techniques such as order-independent transparency, deferred rendering, or simple post-processing techniques that do not rely on neighboring values. A sample showing how to use this should be available in the Vulkan samples repository soon.
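To illustrate why "does not rely on neighboring values" matters, here is a minimal CPU-side model of a framebuffer-local dependency. It is not Vulkan code, and every name in it (first_draw, second_draw, the 8x8 framebuffer, the 4x4 tile size) is a hypothetical stand-in chosen for the example. It only shows that when the second draw reads just the value written at its own pixel, running both draws tile by tile gives the same image as running them over the whole frame, with an intermediate buffer no larger than one tile.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes for the model; real tile sizes are implementation details. */
#define WIDTH  8u
#define HEIGHT 8u
#define TILE   4u

/* Stand-in for the first draw: writes one value per pixel. */
uint32_t first_draw(uint32_t x, uint32_t y)
{
    return x * 3u + y * 5u;
}

/* Stand-in for the second draw: reads only the value at its own pixel. */
uint32_t second_draw(uint32_t valueAtSamePixel)
{
    return valueAtSamePixel * 2u + 1u;
}

/* Whole-frame ordering: the first draw finishes everywhere before the second starts. */
void render_full_frame(uint32_t out[HEIGHT][WIDTH])
{
    uint32_t intermediate[HEIGHT][WIDTH];
    for (uint32_t y = 0; y < HEIGHT; ++y)
        for (uint32_t x = 0; x < WIDTH; ++x)
            intermediate[y][x] = first_draw(x, y);
    for (uint32_t y = 0; y < HEIGHT; ++y)
        for (uint32_t x = 0; x < WIDTH; ++x)
            out[y][x] = second_draw(intermediate[y][x]);
}

/* Tile ordering: both draws complete for one tile before the next tile starts,
 * so the intermediate data only ever needs a tile-sized buffer. */
void render_tiled(uint32_t out[HEIGHT][WIDTH])
{
    assert(WIDTH % TILE == 0u && HEIGHT % TILE == 0u);
    for (uint32_t ty = 0; ty < HEIGHT; ty += TILE) {
        for (uint32_t tx = 0; tx < WIDTH; tx += TILE) {
            uint32_t tileBuffer[TILE][TILE];
            for (uint32_t y = 0; y < TILE; ++y)
                for (uint32_t x = 0; x < TILE; ++x)
                    tileBuffer[y][x] = first_draw(tx + x, ty + y);
            for (uint32_t y = 0; y < TILE; ++y)
                for (uint32_t x = 0; x < TILE; ++x)
                    out[ty + y][tx + x] = second_draw(tileBuffer[y][x]);
        }
    }
}

/* Returns 1 when both orderings produce identical images, 0 otherwise. */
int orderings_match(void)
{
    uint32_t fullFrame[HEIGHT][WIDTH];
    uint32_t tiled[HEIGHT][WIDTH];
    render_full_frame(fullFrame);
    render_tiled(tiled);
    for (uint32_t y = 0; y < HEIGHT; ++y)
        for (uint32_t x = 0; x < WIDTH; ++x)
            if (fullFrame[y][x] != tiled[y][x])
                return 0;
    return 1;
}
```

If second_draw instead sampled a neighboring pixel, as a blur would, the tile ordering could read values from a tile that had not been written yet, which is why such effects fall outside what a local read can express.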
Mapping to Subpasses
For newly written applications, using this extension is relatively straightforward: there’s a new layout for attachments to use (VK_IMAGE_LAYOUT_RENDERING_LOCAL_READ_KHR), and input attachments still need descriptors created for them as before, but no other setup is required. If you want to read a storage image with a local dependency, just insert a barrier with the “by region” flag and you’re good to read it in the following fragment shaders:
// Make storage writes from earlier fragment shaders visible to later fragment shader reads.
VkMemoryBarrier2 memoryBarrier = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER_2,
    .srcStageMask = VK_PIPELINE_STAGE_2_FRAGMENT_SHADER_BIT,
    .dstStageMask = VK_PIPELINE_STAGE_2_FRAGMENT_SHADER_BIT,
    .srcAccessMask = VK_ACCESS_2_SHADER_WRITE_BIT,
    .dstAccessMask = VK_ACCESS_2_SHADER_READ_BIT };
// The by-region flag makes the dependency framebuffer-local.
VkDependencyInfo dependencyInfo = {
    .sType = VK_STRUCTURE_TYPE_DEPENDENCY_INFO,
    .dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT,
    .memoryBarrierCount = 1,
    .pMemoryBarriers = &memoryBarrier };
// Recorded inside the dynamic render pass, between the writing and reading draw calls.
vkCmdPipelineBarrier2(commandBuffer, &dependencyInfo);
For applications looking to port content from render pass objects, we’ve added a few extra bits to the API to map your existing shader code bases to this new API. Color attachments can be remapped to different indices between pipelines to allow emulation of switching subpasses, and input attachment indices can be remapped to different color, depth, or stencil attachments, or directly to bound descriptors.
typedef struct VkRenderingAttachmentLocationInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint32_t           colorAttachmentCount;
    const uint32_t*    pColorAttachmentLocations;
} VkRenderingAttachmentLocationInfoKHR;
void vkCmdSetRenderingAttachmentLocationsKHR(
    VkCommandBuffer                                commandBuffer,
    const VkRenderingAttachmentLocationInfoKHR*    pLocationInfo);
typedef struct VkRenderingInputAttachmentIndexInfoKHR {
    VkStructureType    sType;
    const void*        pNext;
    uint32_t           colorAttachmentCount;
    const uint32_t*    pColorAttachmentInputIndices;
    const uint32_t*    pDepthInputAttachmentIndex;
    const uint32_t*    pStencilInputAttachmentIndex;
} VkRenderingInputAttachmentIndexInfoKHR;
void vkCmdSetRenderingInputAttachmentIndicesKHR(
    VkCommandBuffer                                  commandBuffer,
    const VkRenderingInputAttachmentIndexInfoKHR*    pInputAttachmentIndexInfo);
Notably, this setup for mapping to the previous API can be entirely ignored for new applications, as the mappings default to the API array indices, but it is available to anyone who wants to make use of it. Applications porting existing shader bases that don’t make use of subpasses will likely also find they can make do without specifying the mappings.
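For anyone porting, the remapping arrays can be derived mechanically from an existing subpass description. The sketch below is one way to do that, not part of the extension: build_attachment_mapping is a hypothetical helper, ATTACHMENT_UNUSED is a local stand-in for VK_ATTACHMENT_UNUSED so the snippet compiles without the Vulkan headers, and it assumes that color attachment i of the dynamic render pass corresponds to attachment i of the old render pass. Depth and stencil are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VK_ATTACHMENT_UNUSED, which the Vulkan headers define as (~0U). */
#define ATTACHMENT_UNUSED (~0U)

/* Hypothetical helper. subpassRefs[slot] is the attachment a legacy subpass
 * referenced at a given slot (a shader output location for color references,
 * or an input attachment index for input references). For each attachment in
 * the dynamic render pass, mapping[i] receives the slot that referenced it,
 * or ATTACHMENT_UNUSED if the subpass did not reference it. */
void build_attachment_mapping(const uint32_t *subpassRefs, uint32_t refCount,
                              uint32_t *mapping, uint32_t attachmentCount)
{
    assert(mapping != NULL);
    assert(subpassRefs != NULL || refCount == 0);
    for (uint32_t i = 0; i < attachmentCount; ++i) {
        mapping[i] = ATTACHMENT_UNUSED;
        for (uint32_t slot = 0; slot < refCount; ++slot) {
            if (subpassRefs[slot] == i) {
                mapping[i] = slot;
                break;
            }
        }
    }
}
```

As an illustration, take a deferred renderer with three color attachments, where the old lighting subpass wrote attachment 2 at location 0 and read attachments 0 and 1 as input attachments 0 and 1. Calling the helper with color references {2} yields locations {unused, unused, 0}, and calling it with input references {0, 1} yields input indices {0, 1, unused}. Those arrays would then be passed as pColorAttachmentLocations and pColorAttachmentInputIndices, each with colorAttachmentCount set to 3.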
Staying On-Chip
Render pass objects were designed to be more expressive than implementations were necessarily able to accelerate, leaving developers unsure whether, and when, an implementation would keep data in tile buffers between subpasses. With this extension, the API has been designed so that use cases which would require vendors to split render passes are not expressible within a single dynamic render pass (though they remain expressible using multiple passes), reducing the possibility of performance cliffs from falling off chip.
Conclusion
This extension will be available as part of the Vulkan Roadmap 2024 milestone, which requires new high-end devices from each vendor to support it. However, we expect it to be more widely available: there are no specific hardware requirements beyond those needed for Vulkan 1.0, so platforms with regular driver updates should see this extension roll out over the next year on a wide variety of hardware.
The extension proposal for VK_KHR_dynamic_rendering_local_read goes into significantly more detail about how this extension can be used, and is a good starting point for anyone planning to adopt it.