2deead184cfbd84a617f64bffcdd7dcaaf2bd6f1 - chromium.googlesource.com/chromiumos/third_party/mesa

commit	2deead184cfbd84a617f64bffcdd7dcaaf2bd6f1
author	Connor Abbott <cwabbott0@gmail.com>	Wed Nov 11 15:31:09 2020 +0100
committer	Marge Bot <eric+marge@anholt.net>	Wed Apr 14 17:33:58 2021 +0000
tree	02077f372d2fb5f3c6b5d9f65319063fbcd55e21
parent	7821e5a3f8d593e1e9738924f5f4dc5996583518

ir3/sched: Don't schedule too many tex/SFU instructions

Consider a simple loop that does a series of texture instructions and
then reduces the results:

vec4 sum = vec4(0);
for (int i = 0; i < N; i++) {
   sum += texture(...);
}

Assume that the loop is unrolled and we schedule the resulting basic
block. Right now, after we schedule the first texture instruction, the
only instructions available to schedule that don't incur a sync are the
instructions to set up the second texture instruction. So we keep picking
the texture instructions, no matter how large N is, resulting in a
pathological schedule for register pressure when N is very large:

sum1 = texture(...);
sum2 = texture(...);
sum3 = texture(...);
...
sum = sum1 + sum2 + sum3 + ...;
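
To make the register-pressure cost concrete, here is a rough counting model
in C. It is only a sketch: the function names are hypothetical, each texture
result is counted as one live value, and nothing here is taken from
ir3_sched.c. It compares the fetch-everything-first order above with an
order that adds each result into the accumulator right away.

```c
#include <stddef.h>

/* Hypothetical model, not ir3 code: peak number of live texture results
 * when all n fetches are scheduled before the reduction. */
size_t peak_live_fetch_first(size_t n)
{
    size_t live = 0, peak = 0;

    for (size_t i = 0; i < n; i++) {
        live++;                 /* sumK = texture(...); result stays live */
        if (live > peak)
            peak = live;
    }
    /* The reduction only starts here, after every result is live. */
    return peak;
}

/* Hypothetical model: peak number of live values when each fetch is
 * immediately followed by the add that consumes it. */
size_t peak_live_interleaved(size_t n)
{
    size_t live = 1, peak = 1;  /* the accumulator is always live */

    for (size_t i = 0; i < n; i++) {
        live++;                 /* tmp = texture(...) */
        if (live > peak)
            peak = live;
        live--;                 /* sum += tmp; tmp is dead afterwards */
    }
    return peak;
}
```

Under this model the fetch-first order keeps N values live at once, while
the interleaved order never needs more than two, at the cost of waiting on
every fetch.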

In particular this happens with some CTS tests for VK_EXT_robustness2,
where a loop like that with many iterations is marked as [[unroll]],
forcing NIR to unroll it.

This solution is a balance between the current approach and always
scheduling for register pressure (and ignoring syncs). We only allow a
certain number of texture fetches to be in flight before considering
textures to "sync", even though they don't really, both because they
likely *will* sync in reality (overflowing the internal queue of waiting
texture instructions) and because at some point we need the normal
algorithm to kick in and start lowering register pressure.
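
The idea of capping in-flight fetches can be sketched roughly as follows.
This is an illustration under stated assumptions, not the code in
ir3_sched.c: MAX_TEX_IN_FLIGHT, struct sched_state and all function names
are hypothetical, the cap of 8 is an arbitrary placeholder, and the toy
loop simply consumes the oldest result whenever another fetch would be
treated as a sync.

```c
#include <stdbool.h>

/* Hypothetical cap; the commit message does not state the real value. */
#define MAX_TEX_IN_FLIGHT 8

/* Hypothetical scheduler state for this sketch. */
struct sched_state {
    unsigned tex_in_flight;     /* fetches issued but not yet consumed */
};

/* Treat another texture fetch as if it would sync once the cap is hit. */
bool tex_would_sync(const struct sched_state *s)
{
    return s->tex_in_flight >= MAX_TEX_IN_FLIGHT;
}

void note_tex_issued(struct sched_state *s)
{
    s->tex_in_flight++;
}

void note_tex_consumed(struct sched_state *s)
{
    if (s->tex_in_flight > 0)
        s->tex_in_flight--;
}

/* Toy schedule of n fetch+add pairs: prefer issuing a fetch unless it
 * would "sync", otherwise consume one result. Returns the peak number
 * of fetches in flight. */
unsigned simulate_peak_in_flight(unsigned n)
{
    struct sched_state s = { 0 };
    unsigned issued = 0, consumed = 0, peak = 0;

    while (consumed < n) {
        if (issued < n && !tex_would_sync(&s)) {
            note_tex_issued(&s);
            issued++;
            if (s.tex_in_flight > peak)
                peak = s.tex_in_flight;
        } else {
            note_tex_consumed(&s);
            consumed++;
        }
    }
    return peak;
}
```

In this toy model the number of outstanding fetches, and with it the number
of live results, stays bounded by the cap no matter how large N is, while
short loops still issue all of their fetches back to back.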

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7571>

src/freedreno/ir3/ir3_sched.c

1 file changed
