Commit Graph

194760 Commits

Author SHA1 Message Date
Daniel Stone
75c4f447bd ci/piglit: Use common $RESULTS_DIR
This means that $PIGLIT_RESULTS_DIR no longer works.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31110>
2024-09-13 10:12:09 +01:00
Daniel Stone
b8c9bbabcf ci/dxvk: Use common results dir
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31110>
2024-09-13 10:12:09 +01:00
Daniel Stone
476a5aab34 ci/deqp: Use common $RESULTS_DIR
This means that setting $DEQP_RESULTS_DIR no longer works, but it does
clean up the CI setup.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31110>
2024-09-13 10:12:09 +01:00
Daniel Stone
4143199be7 ci/android: Use common $RESULTS_DIR for cuttlefish
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31110>
2024-09-13 10:12:09 +01:00
Daniel Stone
9b6d14aed1 ci: Always create results dir from init
During init-stage2 (used for hardware jobs) and setup-test-env (used
for running directly on shared runners), make sure we always create a
results directory.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31110>
2024-09-13 10:12:09 +01:00
Daniel Stone
111c15ae4a ci/bare-metal: Don't move structured log file
Just create it in the right place to begin with.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31110>
2024-09-13 10:12:08 +01:00
Daniel Stone
2dbadf8109 ci: Avoid subshell for executing HWCI_TEST_SCRIPT
Ensure that $HWCI_TEST_SCRIPT is an executable we can run ourselves, and
run that directly instead of invoking a subshell.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31110>
2024-09-13 10:12:08 +01:00
Daniel Stone
275727add0 ci/virgl: Special-case llvmpipe parallelisation
When we're running VirGL/Venus, we sometimes want to invert our
parallelism. As some commands can serialise at the host level, we don't
always want to launch as many test clients as we have CPU cores.
Instead, we want to use our parallelism for llvmpipe's rendering, and
launch only a single test at a time.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31110>
2024-09-13 10:12:08 +01:00
Konstantin Seurer
bacf9752f4 radv: Work around broken terrain in Warhammer III
Hiding storage support for depth formats forces the game to take a
different, working path for terrain height map initialization.

cc: mesa-stable

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31152>
2024-09-13 07:48:02 +00:00
Martin Roukala (né Peres)
82946dc152 freedreno/ci: fix the stage of the a750 jobs
We were accidentally overriding the job stage in .b2c-freedreno-vk-test,
which ended up moving the a750 jobs to the `freedreno` stage instead of
`freedreno-postmerge`.

Fixes: 25c70888a5 ("ci/broadcom: Move manual/nightly jobs to postmerge stage")
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31142>
2024-09-13 01:51:45 +00:00
Caio Oliveira
5e47c5f94a intel/executor: Fix a couple of memory leaks in the tool
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31120>
2024-09-13 01:21:24 +00:00
Ian Romanick
3b13a0018f radv: Use nir_opt_generate_bfi to generate bitfield_select
v2: Move to radv_optimize_nir_algebraic. Suggested by Georg.

Tested-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>
2024-09-13 00:21:00 +00:00
Ian Romanick
55448cf43a radeonsi: Use nir_opt_generate_bfi to generate bitfield_select
Not tested.

v2: Move after nir_opt_algebraic. Suggested by Georg.

v3: has_bitfield_select is always enabled on GCN+. Suggested by Georg.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>
2024-09-13 00:21:00 +00:00
Ian Romanick
79bc1da203 r600: Use nir_opt_generate_bfi to generate bitfield_select
Not tested.

v2: Move after nir_opt_algebraic. Suggested by Georg.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>
2024-09-13 00:21:00 +00:00
Ian Romanick
447dae7c13 intel/brw: Use nir_opt_generate_bfi
No shader-db changes on any Intel platform.

The "regression" in SEND messages occurs because a loop containing a
SEND is unrolled.

v2: Move after nir_opt_algebraic. Suggested by Georg.

shader-db:

All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19787034 -> 19785933 (<.01%)
instructions in affected programs: 373573 -> 372472 (-0.29%)
helped: 541 / HURT: 6

total cycles in shared programs: 906012612 -> 905626304 (-0.04%)
cycles in affected programs: 58456516 -> 58070208 (-0.66%)
helped: 382 / HURT: 180

fossil-db:

Lunar Lake
Totals:
Instrs: 140671401 -> 140670495 (-0.00%); split: -0.00%, +0.00%
Send messages: 12891430822 -> 12891430834 (+0.00%)
Loop count: 46905 -> 46904 (-0.00%)
Cycle count: 21527511599 -> 21530278999 (+0.01%); split: -0.00%, +0.02%
Spill count: 70728 -> 70766 (+0.05%)
Fill count: 139397 -> 139254 (-0.10%); split: -0.13%, +0.02%
Max live registers: 47512432 -> 47512500 (+0.00%)

Totals from 355 (0.06% of 549270) affected shaders:
Instrs: 878953 -> 878047 (-0.10%); split: -0.18%, +0.08%
Send messages: 19289 -> 19301 (+0.06%)
Loop count: 1243 -> 1242 (-0.08%)
Cycle count: 1434664642 -> 1437432042 (+0.19%); split: -0.06%, +0.25%
Spill count: 15826 -> 15864 (+0.24%)
Fill count: 38454 -> 38311 (-0.37%); split: -0.46%, +0.08%
Max live registers: 52530 -> 52598 (+0.13%)

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 152516575 -> 152516147 (-0.00%); split: -0.00%, +0.00%
Send messages: 7491001 -> 7491013 (+0.00%)
Loop count: 47588 -> 47587 (-0.00%)
Cycle count: 17124433133 -> 17126147156 (+0.01%); split: -0.01%, +0.02%
Max live registers: 31854704 -> 31854764 (+0.00%)

Totals from 402 (0.06% of 633223) affected shaders:
Instrs: 839338 -> 838910 (-0.05%); split: -0.09%, +0.04%
Send messages: 20203 -> 20215 (+0.06%)
Loop count: 1243 -> 1242 (-0.08%)
Cycle count: 1327042160 -> 1328756183 (+0.13%); split: -0.11%, +0.24%
Max live registers: 33237 -> 33297 (+0.18%)

Tiger Lake
*** Shaders only in 'before' results are ignored:
fossil-db/steam-native/wolfenstein_youngblood/b8cefe7f700304c4/fs.32/0
from 1 apps: fossil-db/steam-native/wolfenstein_youngblood

Totals:
Instrs: 150549467 -> 150548952 (-0.00%); split: -0.00%, +0.00%
Send messages: 7495582 -> 7495594 (+0.00%)
Loop count: 46605 -> 46604 (-0.00%)
Cycle count: 15472381586 -> 15472247085 (-0.00%); split: -0.00%, +0.00%
Spill count: 59776 -> 59775 (-0.00%)
Fill count: 103475 -> 103464 (-0.01%)
Scratch Memory Size: 2384896 -> 2383872 (-0.04%)
Max live registers: 31760724 -> 31760787 (+0.00%)
Max dispatch width: 5569928 -> 5569912 (-0.00%)

Totals from 525 (0.08% of 632443) affected shaders:
Instrs: 349074 -> 348559 (-0.15%); split: -0.25%, +0.11%
Send messages: 24355 -> 24367 (+0.05%)
Loop count: 849 -> 848 (-0.12%)
Cycle count: 187080291 -> 186945790 (-0.07%); split: -0.19%, +0.12%
Spill count: 483 -> 482 (-0.21%)
Fill count: 1372 -> 1361 (-0.80%)
Scratch Memory Size: 22528 -> 21504 (-4.55%)
Max live registers: 36705 -> 36768 (+0.17%)
Max dispatch width: 6272 -> 6256 (-0.26%)

Ice Lake
Totals:
Instrs: 151804923 -> 151804396 (-0.00%); split: -0.00%, +0.00%
Send messages: 7553216 -> 7553228 (+0.00%)
Loop count: 46196 -> 46195 (-0.00%)
Cycle count: 15099805668 -> 15099533898 (-0.00%); split: -0.00%, +0.00%
Fill count: 103978 -> 103979 (+0.00%)
Max live registers: 32168254 -> 32168323 (+0.00%)

Totals from 527 (0.08% of 637191) affected shaders:
Instrs: 347482 -> 346955 (-0.15%); split: -0.25%, +0.10%
Send messages: 24586 -> 24598 (+0.05%)
Loop count: 849 -> 848 (-0.12%)
Cycle count: 191147758 -> 190875988 (-0.14%); split: -0.16%, +0.02%
Fill count: 1392 -> 1393 (+0.07%)
Max live registers: 37379 -> 37448 (+0.18%)

Skylake
Totals:
Instrs: 140981504 -> 140980647 (-0.00%); split: -0.00%, +0.00%
Cycle count: 14653477192 -> 14653249734 (-0.00%); split: -0.00%, +0.00%
Fill count: 99636 -> 99637 (+0.00%)
Max live registers: 31472062 -> 31472126 (+0.00%)

Totals from 523 (0.08% of 626432) affected shaders:
Instrs: 335551 -> 334694 (-0.26%); split: -0.26%, +0.01%
Cycle count: 178047284 -> 177819826 (-0.13%); split: -0.14%, +0.02%
Fill count: 1100 -> 1101 (+0.09%)
Max live registers: 36734 -> 36798 (+0.17%)

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>
2024-09-13 00:21:00 +00:00
Ian Romanick
6a09d33549 nir: Add a pass to generate BFI instructions from logical operations
Inspired by a commit message in !30934, I set about optimizing the code
generated for nir_copysign. It would be possible to just implement an
opt_algebraic pattern for the specific values used by nir_copysign, but
this casts a slightly larger net.

As noted in a comment in the code, there may be variations of the
pattern that this pass misses. The opt_algebraic pattern would miss them
too.

v2: Use nir_def_replace. Suggested by Alyssa. Allow more "root"
instruction types. Suggested by Georg.

v3: Treat extract_u16(x, 0) as (x & 0x0000ffff), and treat extract_u8(x,
0) as (x & 0x000000ff).

v4: Use nir_scalar. Suggested by Georg.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>
2024-09-13 00:21:00 +00:00
Ian Romanick
057c7c9f53 nir/algebraic: Recognize open-coded bitfield_reverse in XCOM 2
The XCOM 2 shaders in my shader-db use iadd instead of ior.

No fossil-db changes on any Intel platform.

shader-db:

All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19787210 -> 19787034 (<.01%)
instructions in affected programs: 1187 -> 1011 (-14.83%)
helped: 6 / HURT: 0

total cycles in shared programs: 906024436 -> 906012612 (<.01%)
cycles in affected programs: 72978 -> 61154 (-16.20%)
helped: 6 / HURT: 0

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>
2024-09-13 00:21:00 +00:00
Rhys Perry
97f4250a7c nir: skip opt_loop_peel_initial_break if continue block only has phis
Doing that optimization wouldn't do anything useful in this case.

nir_block_has_non_copy() is used by opt_loop_peel_initial_break().

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>
2024-09-12 23:36:58 +00:00
Rhys Perry
8410b4cdd6 nir/tests: add some loop peeling tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>
2024-09-12 23:36:58 +00:00
Rhys Perry
64ac601049 nir/opt_loop: skip peeling if the loop ends with any kind of jump
Any kind of jump prevents us from moving it to the top of the loop, not
just breaks.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 6b4b044739 ("nir/opt_loop: add loop peeling optimization")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>
2024-09-12 23:36:58 +00:00
Rhys Perry
af3b099e0a nir/opt_loop: skip peeling if the break is non-trivial
If this nir_if contains continues or other breaks, we can't move it
outside the loop.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 6b4b044739 ("nir/opt_loop: add loop peeling optimization")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>
2024-09-12 23:36:57 +00:00
Rhys Perry
4f44a944bb nir/opt_if: fix fighting between split_alu_of_phi and peel_initial_break
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 6b4b044739 ("nir/opt_loop: add loop peeling optimization")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11822
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>
2024-09-12 23:36:57 +00:00
David Rosca
b89d03838e frontends/va: Reset intra refresh in beginPicture
If app doesn't send updated intra refresh parameters, intra refresh
should be disabled instead of using values from old frame.

Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30992>
2024-09-12 22:50:21 +00:00
David Rosca
379dd3ff52 pipe: Remove unused fields in video rate control
Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30992>
2024-09-12 22:50:21 +00:00
David Rosca
0818ed770d frontends/va: Create encoder at context creation
Instead of creating it when handling sequence parameter buffer. Drivers
don't use the max_references value (and it was only somewhat correct for
H264) and level is also available from sequence parameters.

Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30992>
2024-09-12 22:50:20 +00:00
David Rosca
5632a6e24f Revert "frontends/va: Process VAEncSequenceParameterBufferType first in
vaRenderPicture"

We now set default parameters at context creation, so this is not needed
anymore.

This reverts commit c970a9b663.

Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30992>
2024-09-12 22:50:20 +00:00
David Rosca
4e38b56d80 frontends/va: Set default encoding parameters at context creation
Instead of doing it when handling sequence parameter buffer, which could
overwrite previously set values.

Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30992>
2024-09-12 22:50:20 +00:00
David Rosca
77d5e8ab19 d3d12: Stop using base.level for H264 level_idc
Instead use the value from sequence parameters.

Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30992>
2024-09-12 22:50:20 +00:00
David Rosca
bebda07718 radeonsi/vcn: Stop using base.level for H264 level_idc
Instead use the value from sequence parameters.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30992>
2024-09-12 22:50:20 +00:00
David Rosca
bfc36b0aef radeonsi/uvd_enc: Stop using base.level
Instead use the value from sequence parameters.
This needs to move the context buffer allocation to first frame
which also allows to remove the temporary surface allocation.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30992>
2024-09-12 22:50:20 +00:00
David Rosca
e891717936 radeonsi/vce: Stop using base.level and base.max_references
Instead use the values from sequence parameters.
This needs to move the context buffer allocation to first frame
which also allows to remove the temporary surface allocation.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30992>
2024-09-12 22:50:20 +00:00
David Rosca
96975bc32f radeonsi/vce,uvd_enc: Stop using obsolete rate control params
Other drivers don't use these and the values can be derived from other
fields.

Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30992>
2024-09-12 22:50:20 +00:00
Georg Lehmann
7fa7812219 nir: merge out of loop decision with nir_can_move_instr logic
One place to modify instead of two when adding new intrinsics here.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>
2024-09-12 21:49:34 +00:00
Georg Lehmann
91f8e32a85 nir/opt_sink: do not sink inverse_ballot out of loops
Inverse_ballot result is undefined if the input is not dynamically uniform.
And sinking out of loops might make the input divergent.

Fixes: 18a0ff137f ("nir: sink/move inverse_ballot like moves")

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>
2024-09-12 21:49:34 +00:00
Georg Lehmann
1ec3cc2aed nir/opt_sink: do not sink load_ubo_vec4 out of loops
Same reason as for load_ubo.

Fixes: d199d65c3a ("nir/nir_opt_move,sink: Include load_ubo_vec4 as a load_ubo instr.")

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>
2024-09-12 21:49:34 +00:00
Kenneth Graunke
02482604e5 intel/brw: Delete old-style surface and A64 message opcodes
These have now been replaced by the MEMORY_*_LOGICAL opcodes.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Kenneth Graunke
7090578c35 intel/brw: Switch load_ubo_uniform_block_intel over to memory intrinsics
While there are many cases that turn into the *_PULL_CONSTANT_LOAD ops
or push constants, this one piece was emitting surface block loads.
Switch it over to use the new intrinsics to delete a bunch of code.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Kenneth Graunke
b55f77161d intel/brw: Switch to emitting MEMORY_*_LOGICAL opcodes
We introduce a new fs_nir_emit_memory_access() helper that can handle
image, bindless image, SSBO, shared, global, and scratch memory, and
handles loads, stores, atomics, and block loads.  It translates each
of these NIR intrinsics into the new MEMORY_*_LOGICAL intrinsics.

As a result, we delete a lot of similar surface access emitter code.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Kenneth Graunke
3ba97176d6 intel/brw: Switch load_num_workgroups to the new memory intrinsic
A simple case we handle directly.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Kenneth Graunke
dc4770b005 intel/brw: Lower MEMORY_OPCODE_*_LOGICAL to HDC messages
This is more complicated.  We map the MEMORY_*_LOGICAL opcodes to the
older HDC messages: typed and untyped surface read/write/atomic (whether
float or integer), DWord and Byte scattered messages, OWord block, and
both A64, BTI, and stateless messages.

- MEMORY_MODE_* is used to select stateless-scratch, typed, or untyped.
- MEMORY_FLAG_TRANSPOSE is used to select block access.
- MEMORY_BINDING_TYPE = FLAT and 64-bit address size selects A64.
- Alignment and data type size select between byte/dword scattered or
  surface messages.

While we may not be able to handle the full generality of message
possibilities, we can handle everything we generate currently.  The plan
here is to assert/validate that we don't generate MEMORY_*_LOGICAL ops
on HDC-based platforms which can't support those particular messages.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Kenneth Graunke
3255c9cc49 intel/brw: Lower MEMORY_OPCODE_*_LOGICAL to LSC messages
This is pretty straightforward, as the new MEMORY_*_LOGICAL opcodes
are designed to match the new LSC's capabilities.  The main part is
constructing the message payload.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Kenneth Graunke
a82e8b1c6b intel/brw: Pretty-print memory logical opcodes
The new MEMORY_*_LOGICAL intrinsics have a lot of control sources with
a bunch of LSC_* enums (opcode, memory type, address type, address and
data sizes), as well as flags, coordinate components vs. components...
they unfortunately are nigh-unreadable with the default printing since
there's just a string of unreadable UD immediates in some order.

To fix this, we add some basic pretty-printing.  If a control source is
simply an enum whose value communicates the entire purpose, we print it.
If it has a numeric value (i.e. alignment, or data), we add a label.

For example:

   memory_store(16) (null):UD store shared flat addr: %2:UD coord_comps:1u align:16u d32 comps:2u data0: %3:UD

   memory_store(16) (null):UD store typed bti:%2+0.0<0>:UD addr: %3+0.0:D coord_comps:2u align:0u d32 comps:4u data0: %4:UD

This make them much easier to read.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Kenneth Graunke
2c67729386 intel/brw: Expose functions to convert LSC enums to strings
We had tables for these in the disassembler already, but I'd like to use
them in brw_print.cpp as well.  Just wrap the tables in convenience
functions we can use there.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Kenneth Graunke
d5f38be713 intel/brw: Introduce new MEMORY_*_LOGICAL opcodes
This is a new unified set of opcodes for memory access loosely patterned
after the new LSC-style data port messages introduced on Alchemist GPUs.

Rather than creating an opcode for every type of memory access, it has
only three opcodes: load, store, and atomic.  It has various sources to
indicate the rest:

- Binding type (raw pointer, pointer to surface state, or BT index)
- Address size (A64, A32, A16)
- Data size (bit size, number of components)
- Opcode (atomic opcode, or LOAD/STORE vs. LOAD_CMASK/STORE_CMASK)
- Mode (typed vs. untyped vs. shared-local vs. scratch)
- Address (and its dimensionality)
- Data (0 for loads, 1 for stores, 2 for atomics)
- Whether we want block access

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Kenneth Graunke
b8f264cfe4 intel/brw: Handle load/stores in lsc_op_for_nir_intrinsic()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Kenneth Graunke
8a6903e50d intel/brw: Rename lsc_aop_for_nir_intrinsic to "op" instead of "aop"
This is going to handle more than atomics shortly.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Kenneth Graunke
e8883bd40b intel/brw: Use size_written for NoMask instructions in is_partial_write
The intention of inst->is_partial_write() is that it should return true
when any REG_SIZE (32B) chunk of inst's destination is written but not
fully overwritten.  This can be used to tell whether inst combines new
data with existing data, or screens off any previous writes, so the old
values are no longer required.

The existing (exec_size * brw_type_size_bytes(this->dst.type) < 32)
check doesn't work in a number of cases.  For example, LSC block loads
have exec_size == 1 and force_writemask_all set, but may write multiple
full registers of data.  (Currently, we only see them with exec_size 1
after logical-send-lowering, so our SHADER_OPCODE_SEND special case
was covering those.)  We had also special cased UNDEF.

Instead, we can simply check:

   1. Predication
   2. !inst->dst.contiguous()
   3. inst->dst.offset % REG_SIZE != 0
   4. inst->size_written % REG_SIZE != 0

We had the first three already, but #4 is new.  If either #3 or #4
are true, then that implies there is a REG_SIZE chunk of the destination
which is written, but not entirely written, so it's a partial write.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Kenneth Graunke
ab0b9b6792 intel/brw: Use NUM_BRW_OPCODES in can_omit_write() check
The intention here is to detect ALU hardware instructions, but not
virtual instructions that haven't been explicitly whitelisted.

For some reason we had arbitrarily hardcoded 128 here, but our virtual
opcodes don't start at 128.  They start at NUM_BRW_OPCODES.  So, use
that instead.

This prevents regressions later when we delete some opcodes, shifting
some virtual opcodes into the 72-128 range.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Lars-Ivar Hesselberg Simonsen
0c18aa996b panfrost: Enable support for depth clamping
Depth clamping was not enabled in panfrost, leading to the fixed range
[0.0, 1.0] always being used.

This commit sets the bits to enable depth clamp to near/far plane
depending on the passed rasterizer state.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11506
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31041>
2024-09-12 20:04:58 +00:00
Lars-Ivar Hesselberg Simonsen
f5dab1b77e panfrost: Fix near/far depth clip
Near/far depth clip was implemented by setting the low/high_depth_clamp
to -/+INFINITY, which is invalid on Mali.

This commit removes the modification of the depth clamp values and
enables depth clipping by setting the depth_cull_enable state in the ZS
descriptor for v9+ (the equivalent pre-v9 RSD state is already set
correctly) and setting the primitive cull flags correctly for both.

Finally, it disables PIPE_CAP_DEPTH_CLIP_DISABLE_SEPARATE for v9+, as
both plane clips are controlled by a single value.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11506
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31041>
2024-09-12 20:04:58 +00:00