Merge remote-tracking branch 'cros/master' into venus

* cros/master:
  nir/algebraic: Convert up-cast of down-cast to extract on Intel
  nir/algebraic: Clean up up-cast of down-cast when we can
  nir: Add some range analysis for used bits
  intel/nir: Lower 8-bit phis on Gen11+
  nir/lower_bit_size: Support phi instructions
  nir: Add a couple helpers for phis and cursors
  virgl: Return total video memory if available
  virgl: update headers
  frontends/va/config: Fix check for packed header config
  gitlab-ci: add intel APL and GLK devices with manual triggers
  gitlab-ci: build the iris gallium driver as well
  gitlab-ci: extend x86_64 kernel config to support Intel devices
  intel/compiler: Free resources on test teardown
  intel/genxml: Free resource before exiting
  pan/bi: Use the correct size for UBO loads
  radv: Do pipe misalignment check per plane.
  broadcom/compiler: Merge instructions more efficiently
  meson: invalid keyword argument dependencies
  radv: only apply the MRT output NaN fixup to non-meta shaders
  ci: Update baremetal kernel to 5.11 plus patches
  lavapipe: add support for missing 10/10/10/2 formats.
  lavapipe: add support for 2/10/10/10 scaled formats.
  llvmpipe: don't support scaled formats outside vertex buffers
  util/format: add helper to check if a format is scaled.
  zink: support nir_intrinsic_group_memory_barrier
  features: mark off GL 4.6 and ES 3.1 for zink
  zink: GLSL 460
  zink: PIPE_CAP_GL_SPIRV
  zink: enable pipeline statistics cap
  zink: enable PIPE_CAP_QUERY_SO_OVERFLOW
  zink: enable PIPE_CAP_POLYGON_OFFSET_CLAMP
  zink: enable PIPE_CAP_DRAW_PARAMETERS
  zink: enable PIPE_CAP_TGSI_VOTE
  zink: add util function for submitting the compute batch
  zink: rewrite drawid based on shader key value
  zink: break out push constant creation in compiler and add drawid value
  zink: add a vs shader key for rewriting gl_DrawID
  zink: add a draw_id param to vs push constants
  zink: wrap shader gl_BaseVertex access with a bcsel based on push constant state
  zink: add push constant value to indicate whether the current draw is indexed
  zink: rework tcs injection to be more compatible with new push const struct
  zink: create a struct for tracking push constant layout
  zink: add handling for ARB_shader_draw_parameters variables in ntv
  zink: handle 1bit undef values in ntv
  zink: fix slot mapping for legacy gl io with tess stages
  zink: add support for pipeline statistics queries
  zink: hook up cs invocation queries to the compute batch
  zink: unset generated TCS if its parent TESS is unset
  aco: fix assertion in insert_exec_mask pass
  aco: fix transition_to_{WQM,Exact} if exec.back() is not in exec
  mesa: add debug code to catch missing _mesa_update_valid_to_render_state calls
  mesa: inline draw validate functions
  mesa: inline _mesa_set_draw_vao and set_varying_vp_inputs for draw calls
  mesa: gather errors and call _mesa_error only once in validate_Draw
  mesa: precompute draw time determination of enabled vertex arrays
  mesa: precompute _mesa_get_vao_vp_inputs
  mesa: set _DrawVAOEnabledAttribs only when it changes
  mesa: move gl_context::varying_vp_inputs into ctx->VertexProgram._VaryingInputs
  mesa: optimize set_varying_vp_inputs by precomputing the conditions
  mesa: validate numInstances in common functions to unify code
  mesa: move disallowed TFB in DrawElements on GLES from draws to state changes
  mesa: add a separate valid primitive mask just for glDrawElements
  mesa: don't skip draws with count == 0 or numInstances == 0
  mesa: skip MultiDrawArrays with primcount == 0
  mesa: remove an optional GL error about mapped buffers during execution
  mesa: call _mesa_update_state() before validation
  mesa: remove optional draw validation code to increase performance
  mesa: remove VERBOSE_DRAW
  mesa: optimize the dual source blend error checking using a bitmask
  mesa: inline _mesa_valid_to_render now that it doesn't do validation
  mesa: move blending validation from draws to state changes
  mesa: move GL_FILL_RECTANGLE validation from draws to state changes
  mesa: move ARB program and integer FBO validation from draws to state changes
  mesa: move FBO completeness checking from draws to state changes
  mesa: move some uniform debug code from draws to state changes
  mesa: move sampler uniform validation from draws to state changes
  mesa: move shader pipeline validation from draws to state changes
  mesa: don't report 1 for GL_VALIDATE_STATUS if user didn't validate pipeline
  mesa: add skeleton code for DrawPixels/CopyPixels/Bitmap precomputed validation
  mesa: inline check_valid_to_render
  mesa: fold most of check_valid_to_render into _mesa_update_valid_to_render_state
  mesa: move check_valid_to_render call into _mesa_valid_prim_mode
  mesa: precompute draw time prim validation during state changes
  mesa: precompute all valid primitive types at context creation
  mesa: optimize draw index type checking
  freedreno: Add missing dep on freedreno tracepoints.
  vulkan: document flags choice for vkGetDeviceQueue
  ci/v3d: Add V3D and V3DV testing
  ci: add option to overwrite CPU arch
  aco: add DeviceInfo
  aco: consider that GFX10.3 allocates LDS in 1024 byte blocks
  radv,aco: add radv_nir_compiler_options::wgp_mode
  aco: add Program::wgp_mode
  aco: fix waves calculation for wave32
  radv: round up max_lds_per_simd / lds_per_wave
  radv: use lds_{encode,alloc}_granularity
  ac: split lds_granularity into encode and allocation granularities
  radv: switch MaxWaves statistic to wave32 waves
  radv: fix max_lds_per_simd on GFX10
  ci: Bump deqp to current vulkan-cts-1.2.5.1
  intel/dump_gpu: mark bo as unmapped if its address changes
  intel/tools/aub: remove superfluous new line from error messages
  intel/tools/aub: handle truncated input file
  intel/tools/aub: print better error message when mmap fails
  panfrost: Move the blend logic out of the gallium driver
  panfrost: Move the blend lowering code out of the gallium driver
  panfrost: Rename pan_blend.h into pan_blend_cso.h
  panfrost: Use the pan_shader_prepare_rsd() helper
  panfrost: Provide a helper to prepare the shader related parts of an RSD
  panfrost: Move the shader compilation logic out of the gallium driver
  panfrost: Keep the compiler inputs in the context
  panfrost: Move sysval_to_id out of panfrost_sysvals
  panfrost: Prefix shader related helpers with pan_shader_
  panfrost: Hide backend compiler internals
  panfrost: Use panfrost_get_shader_options() in panfrost_build_blit_shader()
  amd: update addrlib
  radv: Properly handle modifier import failure.
  radv: Remove vk_format_has_stencil/depth helpers.
  radv: Remove the format table.
  radv: Start using util_format_description for everything.
  radv: Only support format with a PIPE_FORMAT.
  radv: Stop using plane_count.
  radv: Stop checking for MULTIPLANE layout.
  radv: Do not use generated table for plane formats.
  radv: Do not use vk_format for getting divisors.
  radv: Remove VK_SWIZZLE_*.
  radv: Use u_format helpers when possible.
  radv: Add plane width/height helpers.
  radv: Determine swizzles correctly.
  zink: fix detection of KHR_maintenance1/2
  lima: implement GL_EXT_texture_swizzle
  r600/sfn: Initialize FragmentShaderFromNir member m_pos_input.
  radeonsi: add debug options nodisplaytiling and nodisplaydcc
  radeonsi: skip s_sendmsg(gs_alloc_req) for NGG passthrough on new chips
  amd: sort chip enums based on hw revision
  ac/gpu_info: conceal L2 cache sizes
  ac/gpu_info: inline get_l2_cache_size and set cache sizes farther down
  ac/gpu_info: remove redundant radeon_info::num_sdp_interfaces
  ac/gpu_info: add radeon_info::num_tcc_blocks
  ac/gpu_info: rename num_tcc_blocks -> max_tcc_blocks
  ac/gpu_info: print use_late_alloc
  winsys/amdgpu: disallow pb_cache for backing buffers of sparse buffers
  compiler: Drop now unused gl_varying_slot_name()
  st/atifs: Use gl_varying_slot_name_for_stage()
  etnaviv: Use gl_varying_slot_name_for_stage()
  freedreno/ir3: Use gl_varying_slot_name_for_stage()
  intel/compiler: Use gl_varying_slot_name_for_stage()
  zink: flag exact alu op results in ntv with NoContraction
  aco: remove dead code for the handling of exec temporaries
  aco: make all exec accesses non-temporaries
  aco: handle non-temp phi definitions and operands
  aco: don't create unnecessary exec phi on merge blocks
  v3dv/meta_copy: get tlb compatible BC compressed formats for copies
  v3dv/formats: expose support for BC1-3 compressed formats
  v3dv/device: clarify that we can't expose textureCompressionBC
  docs/features: gl_HelperInvocation on Panfrost
  docs/features: Mark sample shading done on Panfrost
  docs/features: Mark some ES3.1 done on Panfrost
  docs/features: Mark more TBO exts done on panfrost
  panfrost: Advertise OES_standard_derivatives
  panfrost: Bump advertised ESSL feature level
  panfrost: Bump max SSBO count
  panfrost: Advertise SAMPLE_SHADING
  panfrost: Assert on indirect compute shaders
  panfrost: Remove stale TODOs
  panfrost: Simplify bind_compute_state
  pan/{mdg, bi}: Lower load_sample_pos
  pan/{mdg, bi}: Lower load_helper_invocation
  pan/bi: Implement coverage mask updates
  pan/bi: Decouple sysval loading from NIR
  pan/bi: Implement nir_intrinsic_load_sample_positions_pan
  pan/bi: Implement load_sample_mask_in
  pan/bi: Fix gl_SampleID read
  pan/bi: Lower ifind_msb
  pan/bi: Implement ufind_msb
  pan/bi: Implement bitfield_reverse
  pan/bi: Support bit_count()
  pan/bi: Add uclz() support
  pan/bi: Lower bitfield inserts/extracts
  pan/bi: Implement texture gathers
  pan/bi: Remove redundant TEXC opcode check
  pan/mdg: Lower stores from helpers
  pan/mdg: Stub load_barycentric_sample
  pan/mdg: Lower ufind_msb, poorly
  pan/mdg: Implement uclz
  pan/mdg: Rename bitcount8 to popcnt, fixing the unit
  pan/mdg: Lower bitfield instructions
  pan/mdg: Remove unused pack_unorm_4x8 lowering
  pan/mdg: Assert on bad 64-bit swizzle in disassembly
  panfrost: Add MULTISAMPLED sysval
  panfrost: Overhaul sysval handling
  panfrost: Implement get_sample_position
  panfrost: Advertise MSAA 8x and 16x
  panfrost: Ensure open_device has pandecode initialized
  panfrost: Use sample location LUT
  panfrost: Upload sample positions on device init
  panfrost: Set sample count/pattern for tiler FBD
  panfrost: Remove batch_is_scanout
  panfrost: Remove PAN_REQ_DEPTH_WRITE
  panfrost: Remove PAN_REQ_MSAA
  panfrost: Don't use PAN_REQ_MSAA in SFBD
  panfrost: Don't set REQ_MSAA in pan_mfbd
  panfrost: Generalize MSAA handling
  panfrost: Set tiler descriptor sampler pattern
  panfrost: Add panfrost_sample_pattern helper
  panfrost: Respect info.fs.uses_sample_shading
  panfrost: Refactor sample shading state
  panfrost: Push sample positions sysval for Midgard
  panfrost: Add sample positions sysval
  panfrost: Preload sample mask if needed
  pan/decode: Only print local storage for vertex jobs
  pan/decode: Cleanup sample locations decode
  nir: Add sample_positions_pan intrinsic
  iris: Make a pin_scratch_space() helper
  zink: enable KHR_shader_draw_parameters on Vulkan <1.2
  zink/codegen: do not enable extensions that are now core
  zink/codegen: fix type annotations
  zink/codegen: validate has_properties and has_features
  zink/codegen: perform basic validation in zink_device_info
  zink/codegen: make zink_device_info accept vk.xml
  zink/codegen: introduce notion of non-standard extensions
  zink/codegen: more validation in zink_instance
  zink/codegen: introduce ExtensionRegistry
  radv/winsys: set use_global_list inside the critical section
  radv: only make the WSI images resident if the global BO list is used
  aco: use VCC as regular SGPR pair on GFX10
  aco: don't abort() if disassembly fails
  aco: check get_reg_specified() on register hints
  aco: also consider VCC in get_reg_specified()
  aco: don't decrease the vgpr_limit when encountering bpermute
  aco: refactor GPR limit calculation
  aco: change gpr_alloc_granule to full alignment
  aco: fix shared VGPR allocation on RDNA2
  zink: VK_KHR_draw_indirect_count is a device extension
  radv: emit pipeline bind markers for SQTT
  zink: fix streamout for tess stage
  wgl: Disable automatic use of layered drivers with LIBGL_ALWAYS_SOFTWARE
  d3d12: Fail screen creation if a shader validator is needed and can't be created
  wgl: Add a loop for screen creation with an ordered list of fallbacks
  wgl: Refactor screen creation to a function
  pan/bi: Fix empty shader handling
  pan/bi: Fix jumps to terminal block again
  panfrost: Fake shader images for bifrost+deqp
  ci: Disable scons-win64 job
  radv: Ignore WC flags for VRAM.
  zink: support SO_OVERFLOW pipe query types
  zink: put SO_OVERFLOW queries on the primgen list
  zink: break out cpu query reading for qbos into separate function
  zink: make the xfb_query_pool into an array
  zink: always use query->type for starting/stopping xfb queries
  pan/bi: Skip ATEST for colour blit shaders
  panfrost: Pass is_blit flag around
  zink: use gallium api to copy to display-target
  zink: ignore irrelevant bind-flags
  zink: limit host-visible bind-flags
  zink: don't always require linear display-targets
  zink: do not use extra staging resource unless needed
  zink: drop extra set of parens
  ci: disable sporadically failing test
  lavapipe: handle null-buffers for xfb
  anv: Allow null handle in DestroyDescriptorUpdateTemplate.
  broadcom/compiler: use unifa for UBO loads from uniform addresses
  broadcom/compiler: emit ldunifarf when needed
  broadcom/compiler: do not DCE ldunifa
  broadcom/compiler: disallow reading two uniforms in the same instruction
  broadcom/compiler: ensure 3-slot delay between unifa and ldunifa
  broadcom/compiler: preserve ordering of unifa/ldunifa sequences
  broadcom/compiler: disallow unifa overlap with thread switch/end
  broadcom/compiler: add a helper to check if an instruction writes unifa
  broadcom/compiler: don't check for GFXH-1633 on V3D 4.2.x
  broadcom/compiler: name registers correctly based on V3D version
  broadcom/compiler: pass a devinfo to check if an instruction writes to TMU
  broadcom/compiler: add V3D_QPU_WADDR_UNIFA
  disk_cache: Fail creation when cannot initialize queue.
  broadcom/compiler: Skip bool_to_cond where possible
  broadcom/compiler: Add a v3d_compile argument to vir_set_[pu]f
  radv: Define supported extensions in C.
  radv: Remove custom icd json generation.
  panfrost: Set barriers flag for compute shaders
  compiler, nir: Add and set barrier metadata
  panfrost: Enable ES3 conformant floating-point
  iris: Remove context from iris_disk_cache_retrieve
  iris: Remove context from iris_create_uncompiled_shader
  iris: Remove context from iris_compile_vs and friends
  iris: Remove context from iris_upload_shader()
  iris: Remove context from iris_debug_recompile
  iris: Fill out scratch base address dynamically
  zink: lower flrp64 and ffma64 when in softfp64 mode
  zink: add spirv interfaces for bo and image/sampler/push variables
  anv: Add ANV_QUEUE_OVERRIDE env-var to override advertised queues
  anv: Add fake graphics-only and compute-only queue families
  ci: enable max texture size tests for zink
  vulkan: Fix windows api conflict
  pan/bi: Push UBOs on Bifrost
  pan/bi: Add SSA-based scalar copy propagation
  pan/bi: Simplify derivative lowering
  pan/bi: Rework FAU lowering
  pan/bi: Handle modifiers in rewrite_fau_to_pass
  pan/bi: Generalize bi_update_fau with fast zero
  pan/bi: Print FAU uniforms in IR
  pan/bi: Add bi_is_ssa helper
  pan/bi: Add bi_replace_index helper
  pan/bi: Fix multithreaded shader-db
  pan/mdg: Push uniforms based on UBO analysis
  pan/mdg: Update UBO promotion comment
  panfrost: Don't store uniform_count on Midgard
  panfrost: Set FAU count based on program->push
  panfrost: Push uniforms required by the program
  panfrost: Add UBO push data structure
  panfrost: Don't truncate uniform_count
  panfrost: Move sysvals to dedicated UBO
  panfrost: Respect buffer_offset when mapping to CPU
  panfrost: Fix race condition in UBO mapping to CPU
  pan/mdg: Set lower_uniforms_to_ubo
  pan/mdg: Optimize UBO offset calculations
  pan/mdg: Add MIDGARD_MESA_DEBUG=inorder option
  pan/mdg: Fix multithreaded shader-db
  anv: discard all timeline wait/signal value=0
  features: mark off GL 4.5 for zink
  zink: GLSL 450
  zink: enable PIPE_CAP_TEXTURE_BARRIER
  zink: enable PIPE_CAP_TGSI_TXQS
  zink: enable PIPE_CAP_CLIP_HALFZ
  zink: enable PIPE_CAP_CONDITIONAL_RENDER_INVERTED
  zink: GLSL 440
  zink: enable PIPE_CAP_QUERY_BUFFER_OBJECT
  zink: enable PIPE_CAP_TGSI_ARRAY_COMPONENTS
  zink: add a get_query_result_resource hook
  zink: add PIPE_BIND_QUERY_BUFFER to the all-purpose resource creation path
  ci: Ensure that jobs inheriting the ci-deqp jobs artifact meson logs
  zink: fix xfb buffer refcounting
  tgsi_to_nir: Fix uniform ranges.
  zink: enable excluded test
  zink: correctly handle 64 valid timestamp bits
  radv: use a more relaxed alignment for upload buffer allocations
  ac/rgp: append the number of seconds to the generated RGP file
  radv: add support for resizing the SQTT buffer automatically
  radv: adjust an error message related to the SQTT buffer size
  radv: do not overallocate the SQTT buffer
  ci: document arm oddity in build-rules
  ci: Restrict meson-gallium job to gstreamer runners
  llvmpipe: enable GL spir-v support
  glsl: fix leak in gl_nir_link_uniform_blocks

BUG=None
TEST=Builds

Change-Id: I7f76ba91619027c6515c3be4dc021fad6e2bf8cd