Commit Graph

146816 Commits

Author SHA1 Message Date
Mike Blumenkrantz
35f4051a90 zink: fix 64bit float shader ops
this was being set from back before zink actually supported 64bit
natively and only 32bit was functional, but it breaks 64bit support

cc: mesa-stable

fixes (lavapipe):
KHR-GL46.gpu_shader_fp64.builtin.mod_dvec2
KHR-GL46.gpu_shader_fp64.builtin.mod_dvec3
KHR-GL46.gpu_shader_fp64.builtin.mod_dvec4

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15274>
(cherry picked from commit 5fae35fb17)
2022-03-10 23:24:00 +00:00
Dave Airlie
129966c778 gallivm/nir: extract a valid texture index according to exec_mask.
When using indirect textures, some lanes may not be active,
particularly in a loop, so as with some other areas, extracting
the correct lane is needed here. This extracts the last valid one.

KHR-GL45.texture_barrier.* on zink.

Fixes: e168d148d7 ("gallivm/nir: handle non-uniform texture offsets")

Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15259>
(cherry picked from commit 8346983775)
2022-03-10 23:24:00 +00:00
Pierre-Eric Pelloux-Prayer
45e0a5ddb4 radeonsi: change rounding mode to round to even
Use ROUND_TO_EVEN instead of TRUNCATE; this matches what pal and radv do.

This fixes the spec@ext_framebuffer_multisample@turn-on-off tests.

Cc: mesa-stable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15240>
(cherry picked from commit 9c49550163)
2022-03-10 23:24:00 +00:00
Mike Blumenkrantz
1aba006c84 zink: invalidate non-punted recycled descriptor sets that are not valid
these sets may contain refs from the descriptors which need to be removed
to avoid invalid memory access if the ref is leaked

cc: mesa-stable

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>
(cherry picked from commit e0030bc39f)
2022-03-10 23:24:00 +00:00
Mike Blumenkrantz
3cca717bc1 zink: stop leaking descriptor sets
when migrating a recycled set here, the set was previously invalid and
in the recycled table, meaning it can be reused directly so long as
it's first invalidated

the previous code would instead pop a different set off the allocation array,
leaking this one

cc: mesa-stable

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>
(cherry picked from commit d63f3c31b7)
2022-03-10 23:24:00 +00:00
Mike Blumenkrantz
3579041897 zink: mark fbfetch push sets as non-cached
these can't be cached, so ensure the value isn't uninitialized

Test case 'KHR-GL46.blend_equation_advanced.blend_all.GL_HARDLIGHT_KHR_all_qualifier'..
==1193311== Conditional jump or move depends on uninitialised value(s)
==1193311==    at 0x634EF05: update_push_ubo_descriptors (zink_descriptors.c:1230)
==1193311==    by 0x634FCC5: zink_descriptors_update (zink_descriptors.c:1412)
==1193311==    by 0x63A5EA1: void zink_draw<(zink_multidraw)1, (zink_dynamic_state)3, true, false>(pipe_context*, pipe_draw_info const*, unsigned int, pipe_draw_indirect_info const*, pipe_draw_start_count_bias const*, unsigned int, pipe_vertex_state*, unsigned int) (zink_draw.cpp:788)
==1193311==    by 0x635A0AF: void zink_draw_vbo<(zink_multidraw)1, (zink_dynamic_state)3, true>(pipe_context*, pipe_draw_info const*, unsigned int, pipe_draw_indirect_info const*, pipe_draw_start_count_bias const*, unsigned int) (zink_draw.cpp:907)
==1193311==    by 0x6174397: tc_call_draw_single (u_threaded_context.c:3150)
==1193311==    by 0x616C403: tc_batch_execute (u_threaded_context.c:211)
==1193311==    by 0x616CA44: _tc_sync (u_threaded_context.c:362)
==1193311==    by 0x61725E9: tc_texture_map (u_threaded_context.c:2274)
==1193311==    by 0x5A9DAE9: pipe_texture_map_3d (u_inlines.h:572)
==1193311==    by 0x5A9EB80: st_ReadPixels (st_cb_readpixels.c:530)
==1193311==    by 0x5A0647A: read_pixels (readpix.c:1178)
==1193311==    by 0x5A0647A: _mesa_ReadnPixelsARB (readpix.c:1195)
==1193311==    by 0x5A06517: _mesa_ReadPixels (readpix.c:1210)

cc: mesa-stable

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>
(cherry picked from commit 9a91a520de)
2022-03-10 23:24:00 +00:00
Mike Blumenkrantz
7efd944384 zink: fix descriptor cache pointer array allocation
this got mixed up during some refactor and started indexing based on the
number of bindings instead of the number of descriptors, which means
that array descriptor bindings would have overlapping array memory

cc: mesa-stable

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>
(cherry picked from commit ab3725f533)
2022-03-10 23:24:00 +00:00
Mike Blumenkrantz
ffee465701 zink: wait on program cache fences before destroying programs
if these still have outstanding cache jobs, deleting the object now
will cause a crash

maybe fixes some cts flakiness?

cc: mesa-stable

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>
(cherry picked from commit c5f585f45a)
2022-03-10 23:24:00 +00:00
Mike Blumenkrantz
969b7b4a8a zink: use a fence for pipeline cache update jobs
otherwise there's nothing to wait on

cc: mesa-stable

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>
(cherry picked from commit 382798ddbd)
2022-03-10 23:24:00 +00:00
Mike Blumenkrantz
2579355eea zink: always update shader variants when rebinding a gfx program
the variant data may have changed in the meanwhile, so do a pass to
ensure that the correct variant is used

cc: mesa-stable

fixes caselist:
dEQP-GLES31.functional.shaders.sample_variables.sample_mask.discard_half_per_two_samples.multisample_texture_4
dEQP-GLES31.functional.shaders.sample_variables.sample_mask.discard_half_per_two_samples.singlesample_rbo

Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15226>
(cherry picked from commit 9a6c58b2f7)
2022-03-10 23:24:00 +00:00
Alyssa Rosenzweig
7e57cfd1f7 panfrost: Fix set_sampler_views for big GL
Roughly use the freedreno logic to handle all the extra things that will
come up in our Piglit sooner than later.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13203>
(cherry picked from commit 304851422a)
2022-03-10 23:23:59 +00:00
Icecream95
2e061263c7 panfrost: Fix ubo_mask calculation
BITSET_MASK returns ~0 when given an input of zero, when we need it to
return 0 instead.

Fixes shaders with only sysvals but no UBOs when push constants are
disabled.

This breaks when 31 or 32 UBOs are used, but PAN_MAX_CONST_BUFFERS is
currently set to 16.

Fixes: c246af0dd8 ("panfrost: Only upload UBOs when needed")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>
(cherry picked from commit 9d4441c71a)
2022-03-10 23:23:59 +00:00
Icecream95
26d8573cd2 pan/bi: Make disassembler build reproducibly
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>
(cherry picked from commit 42caddcf6b)
2022-03-10 23:23:59 +00:00
Icecream95
215ee9b85d panfrost: Re-emit descriptors after resource shadowing
This could be made slightly more efficient by only setting the dirty
state that is needed, but eventually you reach a point where it's
cheaper to re-emit everything than work out what can or can't be kept.

Fixes rendering issues in Duckstation.

Fixes: cd2c1ef9da ("panfrost: Dirty track textures/samplers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>
(cherry picked from commit d6c431c2e3)
2022-03-10 23:23:59 +00:00
Icecream95
8770ff5655 panfrost: Set dirty state in set_shader_buffers
Otherwise the pointer (which is uploaded as a sysval) won't be updated
when a new SSBO is bound.

Fixes: c34b760b9f ("panfrost: Dirty track constant buffers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>
(cherry picked from commit b164ee0d7b)
2022-03-10 23:23:59 +00:00
Icecream95
65021851cb pan/bi: Check dependencies of both destinations of instructions
TEXC can have two destinations; the value for neither of them can be
used in the same bundle, so extend the code to check for this to
iterate over both destinations.

Fixes artefacts in the game "LIMBO".

Fixes: a303076c1a ("pan/bi: Add bi_instr_schedulable predicate")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>
(cherry picked from commit cb8c47b15e)
2022-03-10 23:23:59 +00:00
Icecream95
c627a2d201 panfrost: Set PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
Fixes arb-provoking-vertex-render Piglit test.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250>
(cherry picked from commit d54efebf04)
2022-03-10 23:23:59 +00:00
Mike Blumenkrantz
d178bc8195 gallivm: avoid division by zero when computing cube face
this is illegal and produces NaNs which blow up the sample instr

cc: mesa-stable

fixes (llvmpipe and zink):
KHR-GL45.incomplete_texture_access.sampler
dEQP-GLES31.functional.program_uniform.by_pointer.render.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.array_in_struct.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic.samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic.samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic.samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic_struct.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic_struct.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_pointer.render.struct_in_array.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.struct_in_array.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.struct_in_array.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_value.render.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.array_in_struct.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_value.render.basic.samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.basic.samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.basic.samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_value.render.basic_struct.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.basic_struct.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.nested_structs_arrays.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.nested_structs_arrays.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.nested_structs_arrays.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_value.render.struct_in_array.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.struct_in_array.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.struct_in_array.sampler2D_samplerCube_vertex

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15246>
(cherry picked from commit c82dcdf598)
2022-03-10 23:23:59 +00:00
Rhys Perry
418c6ba164 radv: include adjust_frag_coord_z in key
Fixes potential pipeline caching bug.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15175>
(cherry picked from commit feb7e30e2d)
[Eric: the disable_aniso_single_level part of the original commit was
dropped in the backport as it didn't apply]
2022-03-10 23:23:59 +00:00
Marek Olšák
1fe2b1c76b radeonsi: fix an assertion failure with register shadowing
The problem is that dirty_states must be 0 for any state that is NULL
in "queued". This code was flagging dirty_states for such states because
it was only looking at "emitted". It should have been looking at "queued".

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15209>
(cherry picked from commit a02dd17cb3)
2022-03-10 23:23:59 +00:00
Samuel Pitoiset
7bd9a72762 radv: disable DCC for Fable Anniversary, Dragons Dogma, GTA IV and more
Also Starcraft 2 and The Force Unleashed II.

These games are known to be affected by the feedback loop issue. We will
fix this properly soon but as a hotfix disabling DCC should be enough.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4424
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15203>
(cherry picked from commit 4380916b76)
2022-03-10 23:23:59 +00:00
Samuel Pitoiset
513aac74cf radv,drirc: move RADV workarounds to 00-radv-defaults.conf
Because we have to maintain two different packages of Mesa, one
specific to RADV and another one for RadeonSI and such, it's a bit
annoying to have to synchronize the drirc entries. Currently, only our
Mesa package installs 00-mesa-defaults.conf which means we have to
backport the drirc RADV changes.

This splits 00-mesa-defaults.conf in two to move the drirc RADV entries
to src/amd/vulkan/00-radv-defaults.conf. Meson will install the file
only if RADV is built.

There is still a caveat for common drirc workarounds like for WSI but
they are rare enough and we could still duplicate them if needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15152>
(cherry picked from commit 53ca85ac2a)
2022-03-10 23:23:58 +00:00
Jonathan Gray
df8cd7d8a4 radv: use MAJOR_IN_SYSMACROS for sysmacros.h include
fixes build on OpenBSD
../src/amd/vulkan/radv_device.c:35:10: fatal error: 'sys/sysmacros.h' file not found

Fixes: 7aaa54feb5 ("radv: implement VK_EXT_physical_device_drm")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13448>
(cherry picked from commit f0398180a5)
2022-03-10 23:23:58 +00:00
Timur Kristóf
4d16399ebf ac/nir/ngg: Fix mixed up primitive ID after culling.
When NGG culling is enabled, make sure that the correct
primitive ID is exported by each lane.

Fixes: e97f0463a8 "ac/nir: Implement NGG deferred attribute culling in NIR."
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6050
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15055>
(cherry picked from commit 3759a16d8a)
2022-03-10 23:23:58 +00:00
Marek Olšák
2872f75219 amd: add a workaround for an SQ perf counter bug
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098>
(cherry picked from commit 197467c238)
2022-03-10 23:23:58 +00:00
Alyssa Rosenzweig
7e569dd59f panfrost: Push twice as many uniforms
The limit for Bifrost is twice as high as previously thought -- the limit is 64
*slots* of FAU, not 64 words. Each slot is 2 words. We can push twice as much,
saving a considerable number of cycles in some cases.

total instructions in shared programs: 2454260 -> 2431502 (-0.93%)
instructions in affected programs: 845176 -> 822418 (-2.69%)
helped: 3376
HURT: 304
helped stats (abs) min: 1.0 max: 60.0 x̄: 7.92 x̃: 6
helped stats (rel) min: 0.13% max: 45.45% x̄: 4.60% x̃: 4.11%
HURT stats (abs)   min: 1.0 max: 60.0 x̄: 13.06 x̃: 8
HURT stats (rel)   min: 0.16% max: 35.09% x̄: 7.58% x̃: 6.52%
95% mean confidence interval for instructions value: -6.50 -5.87
95% mean confidence interval for instructions %-change: -3.75% -3.43%
Instructions are helped.

total tuples in shared programs: 1963383 -> 1951560 (-0.60%)
tuples in affected programs: 638622 -> 626799 (-1.85%)
helped: 2959
HURT: 573
helped stats (abs) min: 1.0 max: 54.0 x̄: 5.61 x̃: 4
helped stats (rel) min: 0.15% max: 28.57% x̄: 3.61% x̃: 3.12%
HURT stats (abs)   min: 1.0 max: 50.0 x̄: 8.35 x̃: 6
HURT stats (rel)   min: 0.25% max: 27.34% x̄: 6.24% x̃: 4.92%
95% mean confidence interval for tuples value: -3.61 -3.08
95% mean confidence interval for tuples %-change: -2.18% -1.85%
Tuples are helped.

total clauses in shared programs: 387817 -> 365111 (-5.85%)
clauses in affected programs: 135527 -> 112821 (-16.75%)
helped: 3489
HURT: 25
helped stats (abs) min: 1.0 max: 43.0 x̄: 6.52 x̃: 5
helped stats (rel) min: 0.82% max: 58.33% x̄: 17.48% x̃: 15.87%
HURT stats (abs)   min: 1.0 max: 3.0 x̄: 1.56 x̃: 1
HURT stats (rel)   min: 2.94% max: 11.11% x̄: 6.87% x̃: 6.67%
95% mean confidence interval for clauses value: -6.67 -6.26
95% mean confidence interval for clauses %-change: -17.65% -16.96%
Clauses are helped.

total cycles in shared programs: 201842.21 -> 168754.04 (-16.39%)
cycles in affected programs: 84035.50 -> 50947.33 (-39.37%)
helped: 3547
HURT: 136
helped stats (abs) min: 0.041665999999999315 max: 54.0 x̄: 9.33 x̃: 8
helped stats (rel) min: 0.17% max: 80.77% x̄: 36.10% x̃: 36.84%
HURT stats (abs)   min: 0.041665999999999315 max: 1.0 x̄: 0.12 x̃: 0
HURT stats (rel)   min: 0.18% max: 12.24% x̄: 1.18% x̃: 0.61%
95% mean confidence interval for cycles value: -9.26 -8.71
95% mean confidence interval for cycles %-change: -35.34% -34.11%
Cycles are helped.

total arith in shared programs: 74918.46 -> 75022.62 (0.14%)
arith in affected programs: 22471.04 -> 22575.21 (0.46%)
helped: 1571
HURT: 1492
helped stats (abs) min: 0.041665999999999315 max: 1.125 x̄: 0.17 x̃: 0
helped stats (rel) min: 0.17% max: 40.00% x̄: 2.50% x̃: 1.96%
HURT stats (abs)   min: 0.041665999999999315 max: 2.375 x̄: 0.25 x̃: 0
HURT stats (rel)   min: 0.16% max: 100.00% x̄: 5.35% x̃: 2.37%
95% mean confidence interval for arith value: 0.02 0.05
95% mean confidence interval for arith %-change: 1.08% 1.56%
Arith are HURT.

total ldst in shared programs: 174812 -> 137889 (-21.12%)
ldst in affected programs: 81319 -> 44396 (-45.41%)
helped: 3722
HURT: 0
helped stats (abs) min: 1.0 max: 62.0 x̄: 9.92 x̃: 8
helped stats (rel) min: 1.82% max: 100.00% x̄: 47.18% x̃: 43.75%
95% mean confidence interval for ldst value: -10.20 -9.64
95% mean confidence interval for ldst %-change: -47.97% -46.39%
Ldst are helped.

total quadwords in shared programs: 1757124 -> 1714130 (-2.45%)
quadwords in affected programs: 584065 -> 541071 (-7.36%)
helped: 3474
HURT: 173
helped stats (abs) min: 1.0 max: 90.0 x̄: 12.66 x̃: 9
helped stats (rel) min: 0.26% max: 34.18% x̄: 8.78% x̃: 8.33%
HURT stats (abs)   min: 1.0 max: 26.0 x̄: 5.76 x̃: 4
HURT stats (rel)   min: 0.45% max: 20.66% x̄: 4.48% x̃: 2.63%
95% mean confidence interval for quadwords value: -12.21 -11.37
95% mean confidence interval for quadwords %-change: -8.36% -7.95%
Quadwords are helped.

total threads in shared programs: 52898 -> 53142 (0.46%)
threads in affected programs: 262 -> 506 (93.13%)
helped: 250
HURT: 6
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.92 0.99
95% mean confidence interval for threads %-change: 93.69% 99.28%
Threads are helped.

total spills in shared programs: 161 -> 107 (-33.54%)
spills in affected programs: 54 -> 0
helped: 27
HURT: 0

total fills in shared programs: 1386 -> 796 (-42.57%)
fills in affected programs: 590 -> 0
helped: 27
HURT: 0

Fixes: d4dccea0ba ("panfrost: Add UBO push data structure")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15239>
(cherry picked from commit 7bda838c56)
2022-03-10 23:23:58 +00:00
Alyssa Rosenzweig
673f812835 panfrost: Flush resources when shadowing
When we shadow a resource, the backing BO is changed; as such,
existing references to the resource become invalid. So batches accessing the
resource need to be flushed (or otherwise have their references invalidated).

The wrong behaviour change (not flushing) was introduced when we started
tracking resources instead of BOs. The issue manifested as a severe performance
regression in glmark2's -bbuffer test, particular the subdata subtest. The issue
is magnified on slow CPUs; without the fix, the test becomes completely CPU
bound

Relevant glmark2 -bbuffer test from 43fps to 84fps.

Apparently, this causes functional issues too -- this performance-minded change
also fixes a few piglits.

Fixes: cecb889481 ("panfrost: Do tracking of resources, not BOs")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-by: Chris Healy <cphealy@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13502>
(cherry picked from commit 988d5aae74)
2022-03-10 23:23:58 +00:00
Alyssa Rosenzweig
74ff12b0d6 panfrost: Handle NULL samplers
Fixes a NULL dereference in Piglit fp-fragment-position, getting the
test to pass.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13203>
(cherry picked from commit 5536852d60)
2022-03-10 23:23:58 +00:00
Alyssa Rosenzweig
67e2cd9ec6 panfrost: Handle NULL sampler views
Fixes a NULL dereference in Piglit fp-fragment-position.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13203>
(cherry picked from commit 53ef20f08d)
2022-03-10 23:23:58 +00:00
Alyssa Rosenzweig
3e1d5797ae panfrost: Fix FD resource_get_handle
When handle->type is WINSYS_HANDLE_TYPE_FD, the caller wants a file descriptor
for the BO backing the resource. We previously had two paths for this:

1. If rsrc->scanout is available, we prime the GEM handle from the KMS device
   (rsrc->scanout->handle) to a file descriptor via the KMS device.

2. If rsrc->scanout is not available, we prime the GEM handle from the GPU
   (bo->gem_handle) to a file descriptor via the GPU device.

In both cases, the caller passes in a resource (with BO) and expects out a file
descriptor. There are no direct GEM handles in the function signature; the
caller doesn't care which GEM handle we prime to get the file descriptor. In
principle, both paths produce the same file descriptor for the same BO, since
both GEM handles represent the same underlying resource (viewed from different
devices).

On grounds of redundancy alone, it makes sense to remove the rsrc->scanout path.
Why have a path that only works sometimes, when we have another path that works
always?

In fact, the issues with the rsrc->scanout path are deeper. rsrc->scanout is
populated by renderonly_create_gpu_import_for_resource, which does the
following:

1. Get a file descriptor for the resource by resource_get_handle with
   WINSYS_HANDLE_TYPE_FD
2. Prime the file descriptor to a GEM handle via the KMS device.

Here comes strike number 2: in order to get a file descriptor via the KMS
device, we had to /already/ get a file descriptor via the GPU device. If we go
down the KMS device path, we effectively round trip:

   GPU handle -> fd -> KMS handle -> fd

There is no good reason to do this; if everything works, the fd is the same in
each case. If everything works. If.

The lifetimes of the GPU handle and the KMS handle are not necessarily bound. In
principle, a resource can be created with scanout (constructing a KMS handle).
Then the KMS view can be destroyed (invalidating the GEM handle for the KMS
device), even though the underlying resource is still valid. Notice the GPU
handle is still valid; its lifetime is tied to the resource itself. Then a
caller can ask for the FD for the resource; as the resource is still valid, this
is sensible. Under the scanout path, we try to get the FD by priming the GEM
handle on the KMS device... but that GEM handle is no longer valid, causing the
PRIME ioctl to fail with ENOENT. On the other hand, if we primed the GPU GEM
handle, everything works as expected.

These edge cases are not theoretical; recent versions of Xwayland trigger this
ENOENT, causing issue #5758 on all Panfrost devices. As far as I can tell, no
other kmsro driver has this 'special' kmsro path; the only part of
resource_get_handle that needs special handling for kmsro is getting a KMS
handle.

Let's remove the broken, useless path, fix Xwayland, bring us in line with other
drivers, and delete some code.

Thank you for coming to my ted talk.

Closes: #5758
Fixes: 7da251fc72 ("panfrost: Check in sources for command stream")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reported-and-tested-by: Jan Palus <jpalus@fastmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: James Jones <jajones@nvidia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Tested-by: Dan Johansen <strit@manjaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15120>
(cherry picked from commit b5734cc1c4)
2022-03-10 23:23:58 +00:00
Xiaohui Gu
293e3692b1 iris: Mark a dirty update when vs_needs_sgvs_element value changed
Add vs_needs_sgvs_element value check when updating vertex
element dirty state in iris_update_compiled_vs to solve
render error of Android game "Genshin Impact".

Signed-off-by: Xiaohui Gu <xiaohui.gu@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15142>
(cherry picked from commit 4d81c60e11)
2022-03-10 23:23:57 +00:00
Dave Airlie
ddcecb54b0 crocus: change the line width workaround for gfx4/5
This fixes piglit line-flat-clip-color and the hud fps counter.

Fixes: 6b7a68b7c2 ("crocus: add missing line smooth bits.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15229>
(cherry picked from commit a10b5d7086)
2022-03-10 23:23:57 +00:00
Lionel Landwerlin
f2ff1142ba intel/fs: fix total_scratch computation
We only have a single prog_data::total_scratch for all shader variants
(SIMD 8, 16, 32). Therefore we should always max the total_scratch on
top of existing variant.

We probably haven't run into that issue before because we compile by
increasing SIMD size and higher SIMD size is more likely to spill. But
for bindless shaders with return shaders, if the last return part
doesn't spill, we completely ignore the previous parts' scratch
computation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15193>
(cherry picked from commit 96c8880900)
2022-03-10 23:23:57 +00:00
Lionel Landwerlin
f4213683da anv: fix fast clear type value with external images
Disable fast clear if not supported by the external modifier.

v2: Set fast_clear value to NONE in case of import/export from/to external

v3: Move logic next to existing acquire/release checks (Nanley)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6056
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15096>
(cherry picked from commit 214092da87)
2022-03-10 23:23:57 +00:00
Connor Abbott
1e233de9e9 ir3/nir: Fix 1d array readonly images
ncoords includes the array index, and the NIR source has the array index
as its last component, so we have to insert the extra y coordinate in
the middle in this case.

Fixes: 0bb0cac ("freedreno/ir3: handle image buffer")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
(cherry picked from commit 58d72f45e5)
2022-03-10 23:23:57 +00:00
Connor Abbott
3af00d0624 ir3: Don't always set bindless_tex with readonly images
Fixes: 274f381 ("ir3: Plumb through bindless support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
(cherry picked from commit 21ac044c3e)
2022-03-10 23:23:57 +00:00
Jonathan Gray
63885e7135 util: use correct type in sysctl argument
Fixes build on OpenBSD/macppc powerpc

error: incompatible pointer types passing 'int *' to parameter of type 'size_t *'
    (aka 'unsigned long *') [-Werror,-Wincompatible-pointer-types]

Fixes: 01bd21eef8 ("gallium: Import Dennis Smit cpu detection code.")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6511>
(cherry picked from commit 6250a3bc18)
2022-03-10 23:23:57 +00:00
Jonathan Gray
c9afc6a089 util: fix build with clang 10 on mips64
On mips64, the compiler does not allow use of non-zero argument with
__builtin_frame_address(). However, the returned frame address is only
used when PIPE_ARCH_X86 is defined. The compile error can be avoided
by making #ifdef PIPE_ARCH_X86 cover the getting of frame address too.

The argument checking of __builtin_frame_address() has been present
as a debug assert in clang 8. In clang 10, there is a proper runtime
check for the argument. This is why the build has not failed before.

Fixes: dc94a0506f ("gallium: Do not add -Wframe-address option for gcc <= 4.4.")
from Visa Hankala

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6511>
(cherry picked from commit 0536b69133)
2022-03-10 23:23:57 +00:00
Jonathan Gray
28afbf3840 util/u_atomic: fix build on clang archs without 64-bit atomics
Make this build on clang architectures that don't have 64-bit atomic
instructions.  Clang doesn't allow redeclaration (and therefore
redefinition) of the __sync_* builtins.  Use #pragma redefine_extname
to work around that restriction.  Clang also turns __sync_add_and_fetch
into __sync_fetch_and_add (and __sync_sub_and_fetch into
__sync_fetch_and_sub) in certain cases, so provide these functions as
well.

Fixes: a6a38a038b ("util/u_atomic: provide 64bit atomics where they're missing")
patch from Mark Kettenis

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6511>
(cherry picked from commit f12c107b03)
2022-03-10 23:23:57 +00:00
Jonathan Gray
e1a35767c1 util: fix util_cpu_detect_once() build on OpenBSD
Correct type for sysctl argument to fix the build.

../src/util/u_cpu_detect.c:631:29: error: incompatible pointer types passing 'int *' to parameter of type 'size_t *' (aka 'unsigned long *') [-Werror,-Wincompatible-pointer-types]
      sysctl(mib, 2, &ncpu, &len, NULL, 0);
                            ^~~~

Fixes: 5623c75e40 ("util: Fix setting nr_cpus on some BSD variants")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13448>
(cherry picked from commit afece589dc)
2022-03-10 23:23:57 +00:00
Jonathan Gray
237197ee69 util: unbreak non-linux mips64 build
Put linux specific path inside an ifdef.  Unbreaks mips64 build on
OpenBSD and likely other systems without Elf64_auxv_t.

Fixes: 88b234d7a7 ("gallivm: add basic mips64 support and set mcpu to mips64r5 on ls3a4000")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15166>
(cherry picked from commit 7d609431d4)
2022-03-10 23:23:57 +00:00
Erik Faye-Lund
f6a5684750 docs: remove incorrect drivers from extension
This extension isn't wired up in Gallium, so there's just no way a
Gallium driver like Panfrost exposes it.

While there were support in i965 for this for the cancelled Broxton GPU,
thre's no such support in the Iris driver. And since Broxton has been
cancelled, it's unlikely to be wired up any time soon.

Fixes: da23a31726 ("docs/features: Update ASTC entries for Panfrost")
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15145>
(cherry picked from commit 834db3aa8d)
2022-03-10 23:23:56 +00:00
Dave Airlie
6ae28c43dc draw/so: don't use pre clip pos if we have a tes either.
This check for geom shader needed to be expanded for tess support.

dEQP-VK.transform_feedback.simple.depth_clip_control_tese with lvp

Fixes: dacf8f5f5c ("draw: hook up final bits of tessellation")

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15128>
(cherry picked from commit b77ef4dd60)
2022-03-10 23:23:56 +00:00
Rhys Perry
0e61e32758 anv: Enable nir_opt_access
This commit will enable pass for searching readonly / writeonly
access when it's missing.

We don't support shaderStorageImageReadWithoutFormat
and the optimization pass causes those shaders to
take the write-only path which does support formatless.

Following games are affected with positive result:
 - Wolfenstein: Youngblood
 - Wolfenstein II: The New Colossus https://gitlab.freedesktop.org/mesa/mesa/-/issues/3138
 - Rage 2 https://gitlab.freedesktop.org/mesa/mesa/-/issues/5791
 - The Surge 2 https://gitlab.freedesktop.org/mesa/mesa/-/issues/5805
 - Metro Exodus https://gitlab.freedesktop.org/mesa/mesa/-/issues/4703
 - DOOM Eternal https://gitlab.freedesktop.org/mesa/mesa/-/issues/4273

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3138,https://gitlab.freedesktop.org/mesa/mesa/-/issues/5791,https://gitlab.freedesktop.org/mesa/mesa/-/issues/4273
Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15082>
(cherry picked from commit ded9cb904f)
2022-03-10 23:23:56 +00:00
Danylo Piliaiev
ea8391a890 turnip: Use LATE_Z when there might be depth/stencil feedback loop
Otherwise a shader invocation would read the value which should have
been set AFTER this shader invocation.

Fixes tests:
 dEQP-VK.rasterization.rasterization_order_attachment_access.depth.samples_1.multi_draw_barriers
 dEQP-VK.rasterization.rasterization_order_attachment_access.stencil.samples_1.multi_draw_barriers

Fixes: 71595a189a
("tu: Fix feedback loops in sysmem mode")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106>
(cherry picked from commit dab34bd5c8)
2022-03-10 23:23:56 +00:00
Jason Ekstrand
8d920d52c5 anv: Don't assume depth/stencil attachments have depth
If a secondary command buffer is used and the client provides a
framebuffer and that framebuffer has a stencil-only attchment, we would
try to get the aux usage for the depth component of that attachment and
crash.  Check the aspects of the image before looking at aux usage.
This fixes at least the following SkQP tests on my Tigerlake:

 - vk_circular-clips
 - vk_filterfastbounds
 - vk_innershapes_bw
 - vk_lineclosepath
 - vk_multipicturedraw_rrectclip_simple
 - vk_pathinvfill
 - vk_quadclosepath
 - vk_rrect_clip_bw
 - vk_windowrectangles

Fixes: 0d8b9c529c ("anv: Allow PMA optimization to be enabled in secondary command buffers")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15048>
(cherry picked from commit df0e2a1565)
2022-03-10 23:23:56 +00:00
Lionel Landwerlin
38c023e16b anv: fix conditional render for vkCmdDrawIndirectByteCountEXT
We just forgot about conditional render for this entry point.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2be89cbd82 ("anv: Implement vkCmdDrawIndirectByteCountEXT")
Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14891>
(cherry picked from commit 93a90fc85d)
2022-03-10 23:23:56 +00:00
Paulo Zanoni
23094f01be iris: fix register spilling on compute shaders on XeHP
XeHP scratch space is handled differently. Commit ae18e1e707
implemented support for it, but handled it differently between render
and compute shaders: it calculates scratch_addr differently and
doesn't pin the buffer on compute. Make it work on compute shaders by
calling pin_scratch_space() from iris_compute_walker(), which fixes
both the address and the pinning.

This commit can be verified by the two-year-old-but-still-unreviewed
Piglit MR 234. You can also verify this by running a very simple
compute shader with INTEL_DEBUG=spill_fs.

References: https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/234
Fixes: ae18e1e707 ("iris: Add support for scratch on XeHP")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15070>
(cherry picked from commit d10fd5b7c9)
2022-03-10 23:23:56 +00:00
Eric Engestrom
2d12eb323e .pick_status.json: Mark 3f7da0c584 as denominated 2022-03-10 23:23:56 +00:00
Eric Engestrom
52a36a4595 .pick_status.json: Mark 4ed7329236 as denominated 2022-03-10 23:23:56 +00:00