This is always true in the single-draw case, since setDrawOffset()
asserts on this. In the multi-draw case this optimization is pointless,
because there's no reason to create a multidraw shader with just one
draw.
On an Intel 630 GPU this made single-draw single-material Phong go from
550 ms to 440 ms, which is roughly a 20% improvement. For the simpler
shaders the difference is even higher. The multidraw numbers stayed the
same as before, obviously.
So that they all have the same workflow. This one saves even more UBO
slots per draw than was the case with Flat, and the slowdown on Intel is
as bad as expected.
While it's one additional indirection (which apparently has an extra
cost on Intel GPUs, as with Phong, MeshVisualizer and
DistanceFieldVector already), with the assumption that draws usually
share the material info it allows cramming more draws into the 16/64 kB
UBO limit, as the per-draw data are now one vec4 smaller.
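To put rough numbers on the "more draws per UBO" claim, here is a
minimal sketch. The DrawInline and DrawIndirect structs and the
drawsPerUbo() helper are hypothetical, made up for illustration, and are
not the actual per-draw layouts used by the shaders. The only thing
taken from the text above is that the indirect variant is one vec4
(16 bytes) smaller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layouts for illustration only, not the real shader structs.
struct Vec4 { float x, y, z, w; };

// Before: the material (here just one color) is stored inline per draw.
struct DrawInline {
    Vec4 color;                 // per-draw copy of the material data
    std::uint32_t objectId;
    std::uint32_t padding[3];   // pad to a multiple of 16 bytes
};

// After: the draw only holds an index into a separate material buffer.
struct DrawIndirect {
    std::uint32_t materialId;   // index into the material buffer
    std::uint32_t objectId;
    std::uint32_t padding[2];   // pad to a multiple of 16 bytes
};

static_assert(sizeof(DrawInline) == 32, "unexpected layout");
static_assert(sizeof(DrawIndirect) == 16, "unexpected layout");

// How many per-draw entries fit into a UBO of uboSize bytes.
inline std::size_t drawsPerUbo(std::size_t uboSize, std::size_t drawSize) {
    assert(drawSize != 0);
    return uboSize/drawSize;
}
```

With these made-up layouts, drawsPerUbo(16*1024, sizeof(DrawInline))
gives 512 draws and the DrawIndirect variant gives 1024. The real ratio
depends on how large the rest of the per-draw data is.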
For the indirection overhead I can imagine adding a new flag which makes
material mapping implicit (materialId == drawId). That seems to put the
benchmark numbers back to the original speed. The same could be done for
other shaders.
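A CPU-side sketch of what such a flag would mean, assuming nothing about
how it would actually be implemented in the shaders. The Draw and
Material structs, the materialFor() helper and the
implicitMaterialMapping parameter are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical data layout, for illustration only.
struct Draw { std::uint32_t materialId; };
struct Material { float shininess; };

// Returns the material used by draw drawId. With implicit mapping the
// per-draw data are not read at all and materialId == drawId.
inline const Material& materialFor(const std::vector<Draw>& draws,
                                   const std::vector<Material>& materials,
                                   std::size_t drawId,
                                   bool implicitMaterialMapping) {
    std::size_t materialId = drawId;
    if(!implicitMaterialMapping) {
        // The extra indirection: one more dependent read per draw.
        assert(drawId < draws.size());
        materialId = draws[drawId].materialId;
    }
    assert(materialId < materials.size());
    return materials[materialId];
}
```

The tradeoff in this sketch is that implicit mapping needs one material
per draw, so it gives up the sharing that made the indirection
attractive in the first place.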
These deliberately share the same binding (because there's very little
space), but the shader wasn't guarding against both being used at once.
Discovered completely by accident when adding tests for "multidraw with
all the things" -- Mesa gives just a warning, but ANGLE outright fails
the shader compilation, so better have an assert there.
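The kind of guard meant here could look roughly like the following
sketch. FeatureA, FeatureB and FeatureC are placeholder flags, and
hasBindingConflict() and validatedFlags() are hypothetical helpers, not
the actual shader API. The assumption is that FeatureA and FeatureB
stand for the two things that share one binding.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder feature flags, names are made up for illustration.
constexpr std::uint32_t FeatureA = 1 << 0;
constexpr std::uint32_t FeatureB = 1 << 1; // assumed to share A's binding
constexpr std::uint32_t FeatureC = 1 << 2;

// True if both features that share a binding are enabled together.
inline bool hasBindingConflict(std::uint32_t flags) {
    return (flags & FeatureA) && (flags & FeatureB);
}

// Fails early on the CPU side instead of leaving it to the GLSL
// compiler, which may only warn or may reject the shader outright.
inline std::uint32_t validatedFlags(std::uint32_t flags) {
    assert(!hasBindingConflict(flags) &&
        "FeatureA and FeatureB are mutually exclusive");
    return flags;
}
```

Checking this before compilation gives the same failure on every driver,
regardless of how strict its compiler is.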
Besides expanding the tested platform set and updating thresholds where
needed, it makes more sense to list what is tested than what is not,
because when I forget to update the list it looks like I tested
something I did not.
I just put this aside when I discovered the error, thinking it was a
Mesa bug. Now that ARM Mali yelled about the same, I realized it wasn't
just Mesa.
Note to self: Mesa has no bugs. Can you just finally accept that?!
That feeling when you lose three hours debugging STRANGE shader compiler
issues that happen only on ES, seeing stuff like "unexpected HASH_TOKEN
at line 140" or "unterminated ifdef" on just any compiler you try, and
then you spot THIS. FFS.
Apparently this is how I was porting shaders in 2013. Not all of them,
though: I was mostly sane, wrapping things in a nice ifdef
EXPLICIT_UNIFORM_LOCATION, except for this one case in b9a72bd3d1 where
I temporarily lost my mind. No idea why.
Interestingly, shaders that have indirect material references are about
2x slower on Intel. Flat and Vector, which contain the full material in
the DrawUniform, are not affected. This will probably need extra
Intel-specific optimizations (like avoiding the indirection if
MATERIAL_COUNT=1).
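As a sketch of the MATERIAL_COUNT=1 idea, translated to C++ for
illustration: when the material count is a compile-time constant equal
to one, the index is irrelevant and the lookup can go straight to the
only material. The Material struct and fetchMaterial() are hypothetical.
In the shaders this would presumably be a preprocessor branch instead.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical material data, for illustration only.
struct Material { float shininess; };

// MaterialCount plays the role of the MATERIAL_COUNT define.
template<std::size_t MaterialCount>
const Material& fetchMaterial(
    const std::array<Material, MaterialCount>& materials,
    std::uint32_t materialId)
{
    static_assert(MaterialCount >= 1, "need at least one material");
    // The condition is a compile-time constant, so with a single
    // material the index is never used.
    if(MaterialCount == 1) return materials[0];
    assert(materialId < MaterialCount);
    return materials[materialId];
}
```

Whether this actually recovers the lost speed on Intel would need to be
benchmarked. The sketch only shows where the indirection disappears.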
It probably didn't matter much, as the only platform without
ARB_explicit_uniform_location is Mac, which doesn't have
ARB_shading_language_420pack either.
Took me a while (several years?) to figure out a way to benchmark this
without basically duplicating the testing effort and without new
variants being too hard to add.
It didn't really make sense to have a separate test case just to check a
bunch of extra extensions first. This makes it much easier to test the
UBO variants as well.
Plus the "invalid" tests don't actually need to test any extensions, as
they're supposed to fail before any extension-dependent code path is
called.
I went through renaming this in many places quite some time ago, but
this one slipped through. Now that UBOs will be a thing, rename
EXPLICIT_UNIFORM_BINDING to EXPLICIT_BINDING.