BACKPORT: anv: Use vma_heap for descriptor pool host allocation
Pre-patch, anv_descriptor_pool used a free list for host allocations
that never merged adjacent free blocks. If the pool only allocated
fixed-sized blocks, then this would not be a problem. But the pool
allocations are variable-sized, and this caused over half of the pool's
memory to be consumed by unusable free blocks in some workloads, causing
unnecessary memory footprint.
Replacing the free list with util_vma_heap, which does merge adjacent
free blocks, fixes the memory explosion in the target workload.
Disdavantges of util_vma_heap compared to the free list:
- The heap calls malloc() when a new hole is created.
- The heap calls free() when a hole disappears or is merged with an
adjacent hole.
- The Vulkan spec expects descriptor set creation/destruction to be
thread-local lockless in the common case. For workloads that
create/destroy with high frequency, malloc/free may cause overhead.
Profiling is needed.
Tested with a ChromeOS internal TensorFlow benchmark, provided by
package 'tensorflow', running with its OpenCL backend on clvk.
cmdline: benchmark_model --graph=mn2.tflite --use_gpu=true --min_secs=60
gpu: adl
memory footprint from start of benchmark:
before: init=132.691MB max=227.684MB
after: init=134.988MB max=134.988MB
Reported-by: Romaric Jodin <rjodin@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20289>
(cherry picked from commit a5f9e59ce357c2974a97004d943aae92ad6f5004)
BUG=b:243157751
TEST=dEQP-VK.api.descriptor_pool.*
TEST=tflist benchmark in commit message
Change-Id: I7a530db0905ccc19fcbc7867c32df863f7de8ee8
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/mesa/+/4115830
Reviewed-by: Romaric Jodin <rjodin@chromium.org>
Tested-by: Romaric Jodin <rjodin@chromium.org>
Reviewed-by: Matt Turner <msturner@google.com>
Commit-Queue: Romaric Jodin <rjodin@chromium.org>
2 files changed