This drops the mesa/gallium lists from some build rules, since zink common
rules brings them in already. If we do more driver common rules, we might
end up with those core lists appearing in the yaml multiple times, but
that seems like a small price to pay for not being able to forget some.
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17287>
Because !references merging happens after yaml parsing, this lets us
remove a duplicated definition between .test-source-dep.yml and
.gitlab-ci.yml.
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17287>
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17242>
nir_ine_imm(b, nir_iand_imm(b, x, mask), 0) and
nir_i2b(b, nir_iand_imm(b, x, mask)) are common
patterns which become quite messy when they are
part of a larger expression. Clang-format does
not improve things either and we can end up with
some rather interesting looking code.
(RADV ray tracing pipeline and query lowering)
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17242>
Change NGG GS emit vertex code to emit combined shared stores,
also change the export vertex code to emit combined shared loads.
This results in more optimal code generation, ie. fewer LDS
instructions are generated.
GS vertices are stored using an odd stride to minimize the chance
of bank conflicts, which means that unfortunately
we still can't use an alignment higher than 4 here,
so the best we can get are some ds_read2_b32 instructions.
Fossil DB stats on Navi 21 (formerly Sienna Cichlid):
Totals from 135 (0.10% of 128653) affected shaders:
VGPRs: 6416 -> 6512 (+1.50%)
CodeSize: 529436 -> 503792 (-4.84%)
MaxWaves: 2952 -> 2924 (-0.95%)
Instrs: 93384 -> 90176 (-3.44%)
Latency: 290283 -> 293611 (+1.15%); split: -0.36%, +1.50%
InvThroughput: 81218 -> 82598 (+1.70%)
Copies: 6603 -> 6606 (+0.05%)
PreVGPRs: 5037 -> 5076 (+0.77%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11425>
In order to reduce moves when coalescing multiple registers into a
larger register, RA will try to coalesce MERGE instructions with their
definitions. For example, for something like this in GLSL:
uint a = ...;
uint b = ...;
uint64 x = packUint2x32(a, b);
The compiler will try to coalesce x with a and b, in the same way as
something like:
uint a = ...;
uint b = ...;
...
uint x = phi(a, b);
with the crucial difference that the definitions of a and b only clobber
part of the register, instead of the whole thing. This information is
carried through the compound flag and compMask bitmask. If compound is
set, then the value has been coalesced in such a way that not all the
defs clobber the entire register. The compMask bitmask describes which
subregister each def clobbers, although it does it in a slightly
convoluted way. It's an invariant that once compound is set on one def,
it must be set for all the defs in a given coalesced value.
In more detail, the constraints pass will first create extra moves:
uint a = ...;
uint b = ...;
uint a' = a;
uint b' = b;
uint64 x = packUint2x32(a', b');
and then RA will merge values involved in MERGE/SPLIT instructions,
merging x with a' and b' and making the combined value compound -- this
is relatively simple, and will always succeed since we just created a'
and b', so they never interfere with x, and x has no other definitions,
since we haven't started coalescing moves yet. Basically, we just replaced
the MERGE instruction with an equivalent sequence of partial writes to the
destination. The tricky part comes when we try to merge a' with a
and b' with b. We need to transfer the compound information from a' to a
and b' to b, which copyCompound() does, but we also need to transfer it
to any defs coalesced with a and b, which the code failed to do. Similarly,
if x is the argument to a phi instruction, then when we try to merge it
with other arguments to the same phi by coalescing moves, we'd have
problems guaranteeing that all the other merged defs stay up-to-date.
One tricky part of fixing this is that in order to properly propagate
the information from a' to a, we need to do it before the defs for a and
a' are merged in coalesceValues(), since we need to know which defs are
merged with a but not a' -- after coalesceValues() returns, all the defs
have been combined, so we don't know which is which. I took the approach
of calling copyCompound() inside coalesceValues(), instead of
afterwards.
v2: (mhenning) This now loops over mergedDefs in copyCompound, to update
it for changes made in bcf6a9ec
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Karol Herbst <kherbst@redhat.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17115>
If there is a z24 unorm depth buffer, but it's the hw is using
a z32, the border color needs to be clamped appropriately.
This creates a second sampler with the clamped border color,
and uses it if needed. The checks might need some tightening up.
Fixes: zink on radv
dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth
dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth_uint_stencil_sample_depth
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17305>
Fixes the following compiler warning:
svga_pipe_query.c:1295:17: warning: 'result.u64' may be used uninitialized [-Werror=maybe-uninitialized]
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17121>
Now that we have a common pipeline layout with reference counting, we
don't need these driver hooks for reference counting anymore.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17286>
There's some tricky stuff in here with properly handling Vulkan
allocation scopes and reference counting. Probably best to do it once.
Also, this means that common code can now take references to descriptor
set layouts which seems useful.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17286>
D3D12 wants Width, Height and Depth to be aligned on the block width,
height and depth size. But Vulkan allows the width, height or depth to
be unaligned at the image boundary if image.{width,height,depth} is
not aligned.
Let's explicitly align things in the copy paths.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17250>
D3D12_RESOURCE_STATE_DEPTH_READ only provides access for fixed-function
depth/stencil test. If we want the shaders to be able to read the
depth/stencil attachment, we need to combine
D3D12_RESOURCE_STATE_DEPTH_READ and
D3D12_RESOURCE_STATE_PIXEL_SHADER_RESOURCE.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17250>
Sometimes there's no obvious mappings between a VkImageLayout and
a D3D12_RESOURCE_STATE, so let's just provide a helper that takes
before/after resource states instead of old/new layouts, and use it
to fix the resolve case.
Fixes: 35356b1173 ("dzn: Cache and pack transition barriers")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17250>
We are about to introduce the same function taking D3D12_RESOURCE_STATES
arguments instead of VkImageLayout ones. Let's rename the function
accordingly.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17250>
We want to pack ResourceBarriers() call as much as we can, so let's
not dzn_cmd_buffer_queue_transition_barriers() when we could still queue
new barriers.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17250>
Some Vulkan -> D3D12 API mismatches force us to do behind-the-scene
transitions to get resources in the right state when executing
non-native operations. In this case, caching the transition until the
resource is actually used might save us unneeded transitions.
The packing aspect is mostly useful to limit the ExecuteBarriers()
call overhead. Right now we do per-resource packing, and any hole
in the subresource range will trigger several ExecuteBarriers()
calls. This can be improved by collecting barriers in a separate
array, and flushing the collected transition barriers just before
executing the operation using the subresources pointed by those
barriers. While not impossible, it'd be more verbose than what we
have right now, so I'm not entirely convinced it's worth it.
Caching could be improved to avoid any unnecessary flush when we do
blit or copy operations and transition the resources back to their
original state, since the user might decide to transition the image to
a new layout just after that. But doing that would require keeping
track of all resources used by dispatch/draw operations, which in turn
implies keeping info about which of the descriptor set resources are
used by the graphics/compute pipelines. Not sure the it's worth the
extra complexity given D3D12 enhanced barriers are just around the
corner, and those map pretty nicely to the vulkan barrier+image-layout
model.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17274>