radv: use radv_nir_opt_tid_function for shuffles

The main motivation was the open-coded clustered inclusive scans
and clustered broadcasts in the gdeflate decompression shader used by
DirectStorage.

Foz-DB Navi21 (only the_last_of_us_part1 is affected):
Totals from 8 (0.01% of 79395) affected shaders:
Instrs: 6230 -> 5438 (-12.71%)
CodeSize: 33376 -> 29148 (-12.67%)
Latency: 77017 -> 72917 (-5.32%)
InvThroughput: 10190 -> 9280 (-8.93%)
Copies: 566 -> 569 (+0.53%)
PreSGPRs: 528 -> 524 (-0.76%)
PreVGPRs: 232 -> 230 (-0.86%)
VALU: 2889 -> 2616 (-9.45%)
SALU: 1748 -> 1491 (-14.70%); split: -14.82%, +0.11%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24650>
Georg Lehmann 2023-08-12 16:51:21 +02:00 committed by Marge Bot
parent ca88783318
commit 39de178656

@@ -384,6 +384,16 @@ radv_postprocess_nir(struct radv_device *device, const struct radv_graphics_stat
       NIR_PASS(_, stage->nir, radv_nir_lower_fs_intrinsics, stage, gfx_state);
    }
 
+   /* LLVM could support more of these in theory. */
+   bool use_llvm = radv_use_llvm_for_stage(pdev, stage->stage);
+   radv_nir_opt_tid_function_options tid_options = {
+      .use_masked_swizzle_amd = true,
+      .use_dpp16_shift_amd = !use_llvm && gfx_level >= GFX8,
+      .use_clustered_rotate = !use_llvm,
+      .hw_subgroup_size = stage->info.wave_size,
+   };
+   NIR_PASS(_, stage->nir, radv_nir_opt_tid_function, &tid_options);
+
    enum nir_lower_non_uniform_access_type lower_non_uniform_access_types =
       nir_lower_non_uniform_ubo_access | nir_lower_non_uniform_ssbo_access | nir_lower_non_uniform_texture_access |
       nir_lower_non_uniform_image_access;
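
For readers unfamiliar with this kind of optimization, the following is a rough, standalone sketch of the idea behind `use_masked_swizzle_amd`: a shuffle whose index is a pure function of the invocation id can sometimes be rewritten as a swizzle of the form `((tid & and_mask) | or_mask) ^ xor_mask` on the low 5 lane bits. This is not the code of `radv_nir_opt_tid_function`; every name below (`tid_fn`, `swizzle_masks`, `match_masked_swizzle`) is hypothetical, and the brute-force check over all lanes is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: a shuffle index written as a function of the lane id. */
typedef uint32_t (*tid_fn)(uint32_t tid);

/* Hypothetical container for the three 5-bit swizzle masks. */
struct swizzle_masks {
   uint32_t and_mask, or_mask, xor_mask;
};

/* Source lane selected by the masks; bits above the low 5 are kept as-is. */
static uint32_t
apply_swizzle(struct swizzle_masks m, uint32_t tid)
{
   uint32_t low = ((tid & m.and_mask) | m.or_mask) ^ m.xor_mask;
   return (tid & ~31u) | (low & 31u);
}

/* Try to express fn as a masked swizzle. Returns true and fills *out on success. */
static bool
match_masked_swizzle(tid_fn fn, uint32_t wave_size, struct swizzle_masks *out)
{
   assert(wave_size == 32 || wave_size == 64);

   struct swizzle_masks m = {0, 0, 0};
   for (uint32_t b = 0; b < 5; b++) {
      /* Probe output bit b with input bit b cleared and set. */
      bool v0 = (fn(0) >> b) & 1;
      bool v1 = (fn(1u << b) >> b) & 1;

      if (!v0 && v1) {
         m.and_mask |= 1u << b; /* bit is copied */
      } else if (v0 && !v1) {
         m.and_mask |= 1u << b; /* bit is inverted */
         m.xor_mask |= 1u << b;
      } else if (v0 && v1) {
         m.or_mask |= 1u << b; /* bit is constant 1 */
      } /* else: constant 0, all masks stay clear for this bit */
   }

   /* The probes only give a candidate; verify it on every lane. */
   for (uint32_t tid = 0; tid < wave_size; tid++) {
      if (apply_swizzle(m, tid) != fn(tid))
         return false;
   }

   *out = m;
   return true;
}

/* Example index functions (hypothetical, for illustration only). */
static uint32_t xor_one(uint32_t tid) { return tid ^ 1u; }
static uint32_t broadcast_lane3_of_8(uint32_t tid) { return (tid & ~7u) | 3u; }
static uint32_t next_lane(uint32_t tid) { return tid + 1u; }
```

With these definitions, `xor_one` and `broadcast_lane3_of_8` (an open-coded clustered broadcast) are matched, while `next_lane` is not, since an addition with carry cannot be written with per-bit masks. Shifts and rotates of that kind are presumably what the separate `use_dpp16_shift_amd` and `use_clustered_rotate` options are for.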