This change is to support 10bit YUV input in addition to
original H264/HEVC 8bit output case. It adds
rvcn_enc_input_format_t and rvcn_enc_output_format_t for
picture input format and output format separately.
Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20284>
There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b. Instead of adding a huge
pile of additional patterns (including variations that include both ine
and i2b), always lower i2b to a != 0.
At this point in the series, it should be impossible for anything to
generate i2b, so there /should not/ be any changes.
The failing test on d3d12 is a pre-existing bug that is triggered by
this change. I talked to Jesse about it, and, after some analysis, he
suggested just adding it to the list of known failures.
v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b.
v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py.
v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after
nir_lower_doubles makes progress. The latter can generate b2i
instructions, but nir_lower_int64 can't handle them (anymore).
v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I
had accidentally removed the f2b(bf2(x)) optimization.
v6: Just eliminate the i2b instruction.
v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused)
emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this
function was still used. 🤷
No shader-db changes on any Intel platform.
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141165875 -> 141165873 (-0.0%)
Instructions helped: 2
Cycles in all programs: 9098956382 -> 9098956350 (-0.0%)
Cycles helped: 2
The two Vulkan shaders are helped because of the "new" (('b2i32',
('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern.
Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version]
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
The shaders affected here are ones that were previously affected when
i2b was unconditionally lowered in opt_algebraic. There are a few places
where some transformations happen in a different order, so some
algebraic patterns are missed.
All Broadwell and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19914369 -> 19914566 (<.01%)
instructions in affected programs: 92375 -> 92572 (0.21%)
helped: 0 / HURT: 90
total cycles in shared programs: 853851470 -> 853867215 (<.01%)
cycles in affected programs: 12400663 -> 12416408 (0.13%)
helped: 28 / HURT: 69
Haswell and Ivy Bridge had similar results. (Haswell shown)
total instructions in shared programs: 16710721 -> 16710700 (<.01%)
instructions in affected programs: 108010 -> 107989 (-0.02%)
helped: 57 / HURT: 103
total cycles in shared programs: 884299412 -> 884306546 (<.01%)
cycles in affected programs: 12986423 -> 12993557 (0.05%)
helped: 87 / HURT: 102
total spills in shared programs: 14937 -> 14925 (-0.08%)
spills in affected programs: 12 -> 0
helped: 9 / HURT: 0
total fills in shared programs: 17569 -> 17557 (-0.07%)
fills in affected programs: 12 -> 0
helped: 9 / HURT: 0
Sandy Bridge
total instructions in shared programs: 13902341 -> 13902347 (<.01%)
instructions in affected programs: 7311 -> 7317 (0.08%)
helped: 3 / HURT: 8
total cycles in shared programs: 741795500 -> 741792266 (<.01%)
cycles in affected programs: 273308 -> 270074 (-1.18%)
helped: 9 / HURT: 2
No shader-db changes on any other Intel platform.
No fossil-db changes on any Intel platform.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
In a future commit, nit_type_conversion_op won't be able to handle i2b
(and in a much later commit f2b), so switch many users to the fully
featured function.
No shader-db or fossil-db changes on any Intel platform.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
In a future commit, nit_type_conversion_op won't be able to handle i2b
(and in a much later commit f2b), so switch many users to the fully
featured function.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
In a future commit, nit_type_conversion_op won't be able to handle i2b
(and in a much later commit f2b), so switch many users to the fully
featured function.
No shader-db or fossil-db changes on any Intel platform.
v2: Use the actual bit size of the source to determine the conversion
op. With mediump, the "planned" bit size and the actual bit size might
be different. Fixes many, many Vulkan CTS assertion failures on any
platform that sets mediump_16bit_alu (e.g., Freedreno).
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
In a future commit, nit_type_conversion_op won't be able to handle i2b
(and in a much later commit f2b), so switch many users to the fully
featured function.
No shader-db or fossil-db changes on any Intel platform.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
In a future commit, nit_type_conversion_op won't be able to handle i2b
(and in a much later commit f2b), so switch many users to the fully
featured function.
In gl_nir_lower_packed_varyings, all of the type conversions are between
int32 and uint32 types. In NIR, those are just moves, so elide them.
No shader-db or fossil-db changes on any Intel platform.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
Previously these would return result->bit_size of 32 even though the
type might have been int16_t or uint16_t. This prevents many assertion
failures in "glsl: Use nir_type_convert instead of
nir_type_conversion_op" on zink.
Fixes: 5e922fbc16 ("glsl_to_nir: fix bitfield_extract with 16-bit operands")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
It is invalid to have Boolean variables as either shader inputs or
outputs, so there is no point to try to lower them in general. The only
use for this was some two-phase lowering of
nir_intrinsic_load_front_face that could be done in a single phase.
Create the SYSTEM_VALUE_FRONT_FACE as a uint and compare it with zero at
the same time.
No shader-db or fossil-db changes on any Intel platform.
v2: Remove dxil_nir_lower_bool_input from dxil_nir.h and drop it from
the other caller in the spirv_to_dxil codepath. Noticed by Jesse. Fix
setting bit size when loading SYSTEM_VALUE_FRONT_FACE. Caught by CI.
v3: Use nir_ine_imm. Change type of gl_FrontFacing GS output in
d3d12_nir_passes from Boolean to integer. Both suggested by Jesse.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b. Instead of adding a huge
pile of additional patterns (including variation that include both ine
and i2b), just emit a != 0 instead of i2b(a).
I think that the changes to the unit tests weaken them slightly, but
perhaps that's okay?
No shader-db changes on any Intel platform. The GLSL paths use other
means to generate i2b operations, but the SPIR-V paths use nir_i2b.
Presumably since 4676b3d3dd (nir: Use nir_test_mask instead of
i2b(iand)), no fossil-db changes either.
v2: Use nir_ine_imm. Suggested by Jesse.
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
No shader-db or fossil-db changes on any Intel platform.
v2: Add missed i2b1 in ir3_nir_opt_preamble.c.
v3: Add missed i2b1 in ac_nir_lower_ngg.c.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
v2: Remove the ineg from the b2i in the ior pattern. Suggested by
Jason.
All Ivy Bridge and newer Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19914441 -> 19914369 (<.01%)
instructions in affected programs: 63507 -> 63435 (-0.11%)
helped: 24 / HURT: 0
total cycles in shared programs: 853869766 -> 853851470 (<.01%)
cycles in affected programs: 10551542 -> 10533246 (-0.17%)
helped: 24 / HURT: 0
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141163061 -> 141092683 (-0.0%)
Instructions helped: 14103
Instructions hurt: 55
Cycles in all programs: 9132376195 -> 9133183045 (+0.0%)
Cycles helped: 13775
Cycles hurt: 380
Spills in all programs: 18286 -> 18284 (-0.0%)
Spills helped: 1
Fills in all programs: 30647 -> 30643 (-0.0%)
Fills helped: 1
Gained: 133
Lost: 130
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
This helps because it enables cmod propagation to do more.
The removed patterns involving b2i will be handled by other existing
patterns after the unary operations are removed.
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19914458 -> 19914441 (<.01%)
instructions in affected programs: 5456 -> 5439 (-0.31%)
helped: 17 / HURT: 0
total cycles in shared programs: 855302118 -> 853869766 (-0.17%)
cycles in affected programs: 327354347 -> 325921995 (-0.44%)
helped: 291 / HURT: 81
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141205979 -> 141205961 (-0.0%)
Instructions helped: 4
Instructions hurt: 3
SENDs in all programs: 7466919 -> 7466913 (-0.0%)
SENDs helped: 1
Cycles in all programs: 9133387327 -> 9133384475 (-0.0%)
Cycles helped: 3
Cycles hurt: 12
In the shader that was helped for sends, it appears that a NIR pass that
moves code out of loops was able to move 3 send operations outside a
loop after this change. I did not investigate further.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
This prevents ~400 shader-db regresssions and a handful of fossil-db
regressions after i2b is always lowered.
All Ivy Bridge and newer Intel platforms had similar results. (Ice Lake shown)
total cycles in shared programs: 855301494 -> 855302118 (<.01%)
cycles in affected programs: 52787 -> 53411 (1.18%)
helped: 4 / HURT: 5
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141206055 -> 141205979 (-0.0%)
Instructions helped: 14
Cycles in all programs: 9133376616 -> 9133387327 (+0.0%)
Cycles helped: 13
Cycles hurt: 3
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
No shader-db changes on any Intel platform.
All of the helped shaders were presumably regressed by 4676b3d3dd (nir:
Use nir_test_mask instead of i2b(iand)).
v2: Add some comments explaining why specific replacements are used. In
the umin pattern, only markup the first usage of 'b' in the source
pattern.
Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown)
Instructions in all programs: 141384970 -> 141200966 (-0.1%)
Instructions helped: 45842
Cycles in all programs: 9133648977 -> 9133282672 (-0.0%)
Cycles helped: 26812
Cycles hurt: 6025
Gained: 23
Lost: 135
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
A loop below already adds all the permutations... including the 1-bit
version that isn't included in this group.
No shader-db or fossil-db changes on any Intel platform.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
The exact same pattern appears later (around line 1323).
No shader-db or fossil-db changes on any Intel platform.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
A later commit adds a pattern
(('umin', ('iand', a, '#b(is_pos_power_of_two)'),
('iand', c, '#b(is_pos_power_of_two)')),
('iand', ('iand', a, b), ('iand', c, b))),
When I originally made that pattern, I copied and pasted the search to
the replacement as
(('umin', ('iand', a, '#b(is_pos_power_of_two)'),
('iand', c, '#b(is_pos_power_of_two)')),
('iand', ('iand', a, '#b(is_pos_power_of_two)'),
('iand', c, '#b(is_pos_power_of_two)'))),
The caused the variables in the replacement to be marked is_constant,
and that resulted in an assertion failure deep inside nir_search.
src/compiler/nir/nir_search.c:530: construct_value: Assertion `!var->is_constant' failed.
These extra validation rules catch this kind of error at compile time
rather than at run time.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
The new format support is only tested on Ice Lake and onward. Makes the
next patch clearer.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19937>
Tiger Lake and onward allow drivers to specify a compression format
independently from the surface format. So, even if the surface format
changes, hardware is still able to determine how to access the CCS.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19937>
The sampler's decompressor seems to lack support for some types of
format re-interpretation. Use the more capable decompressor for these
cases. This will be needed to avoid regressing piglit's
arb_texture_view-rendering-formats in later commits.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19937>
We'll want 2 batches :
* the main one
* another to contain dispatch commands to generate stuff in the
main batch
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20295>
And allow the function to get the very first address in the batch.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20295>
Gen11 added a nifty feature where we have three custom system-generated
values called extended parameters that we can set to any 32-bit values
we want. These work just like vertex and instance ID and are controlled
in the pipeline by the 3DSTATE_SGVS_2 packet. They are provided to the
draw call either by extra DWORDs on the end of 3DSTATE_PRIMITIVE or by
storing values to more state registers.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20295>
On Gen11 and above, the 3DPRIMITIVE command takes an optional additional
three DWORDs of data as "extended parameters". These extended
parameters only exist in the packet if "Extended Parameters Present" is
set. Because our packing code doesn't handle variable-length commands
well, this commit adds a second version of the command which isn't real
but is just a copy of 3DPRIMITIVE with the additional dwords where the
"Extended Parameters Present" defaults to true and "DWord Length" is
adjusted by 3 as needed. The 3DPRIMITIVE command is then the gen4-9
version which still works fine but doesn't have the new parameters.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20295>