This caused MeshToolsCompileGLTest to fail in a strange way, and
PhongGLTest::renderLowLightAngle() as well. Which looked rare enough that
I first suspected some random driver bug, but apparently it was all
caused by these using the default infinity range instead of explicitly
calling setLightRanges() on the shader.
The test is now updated to explicitly verify the default value when a
setter isn't called, to catch this problem better in case it reappears in
a different form elsewhere.
Not that C++ STL and exceptions would be anything to take inspiration
from, but there's std::out_of_range. Python IndexError is also specified
as "index out of range", not "bounds".
Partially needed to avoid build breakages because Corrade itself
switched as well, partially because a cleanup is always good. Done
except for (STL-heavy) code that's deprecated or SceneGraph-related APIs
that are still quite full of STL as well.
With the workarounds moved to the GL::Shader class itself, it's just a
complicated wrapper for adding the compatibility.glsl file and a rather
strange way to define a file-local helper for resource import on static
builds. Do that directly instead.
Done this way only in the Phong shader, everywhere else it's just the
MAGNUM_ASSERT_GL_EXTENSION_SUPPORTED() macro. Some WIP code that I
forgot to clean up?
So it's possible to have light culling enabled on, say, 64 lights, but
with only at most 3 applied each draw, allowing the shader compiler to
unroll the loop if it makes sense. This also better prepares for SSBO
support where the total light count would be unbounded and thus the
value ignored, and thus the value can be 0.
This prepares for SSBO support where the total count is unbounded (and
thus the value is ignored, thus it can be 0).
Also regroup the doc paragraphs so it's clear what's related to UBO
usage and what applies to classic uniforms as well.
Mainly important for Shader::addSource() to prevent it from creating a
needless copy, but doesn't hurt to do the same also for
uniformLocation(), bindAttributeLocation() etc. -- it'll avoid a runtime
strlen() in that case at least.
Same as in the previous commit, most cases are inputs so a StringStl.h
compatibility include will do, the only breaking change is
GL::Shader::sources() which now returns a StringIterable instead of a
std::vector<std::string> (ew).
Awesome about this whole thing is that The Shader API now allows
creating a shader from sources coming either from string view literals
or Utility::Resource completely without having to allocate any strings
internally, because all those can be just non-owning references wrapped
with String::nullTerminatedGlobalView(). The only parts which aren't
references are the #line markers, but (especially on 64bit) those can
easily fit into the 22-byte (or 10-byte on 32bit) SSO storage.
Also, various Shader constructors and assignment operators had to be
deinlined in order to avoid having to include the String header, which
would be needed for Array destruction during a move.
Co-authored-by: Hugo Amiard <hugo.amiard@wonderlandengine.com>
It's not a GL error, and allows the application to compile just a single
shader for all skinned meshes, not one for each skeleton size. Together
with the dynamic per-vertex joint count this means the app only needs a
single shader for all skinned meshes, which is nice.
High-level docs with examples will be written once there's corresponding
support in MeshTools::compile() *and* in importer plugins, as skinned
meshes are usually brought in from files, never set up directly.
Co-authored-by: Squareys <squareys@googlemail.com>
* No need to repeat the type for all variables, unnecessary redundancy.
* Reducing the amount of redundant local variables and if they stay
making their definitions more localized to where they get used.
* All uniform setters used the "initial value is" phrase instead of
"default is", this one didn't.
Because it was no longer bearable with three UnsignedInt arguments in a
row, especially when some of them are only available on a subset of
platforms. And it would get even worse with introduction of planned
features such as multiview or skinning.
Backwards compatibility is in place, as always. To ensure nothing
breaks, this commit still has all tests and snippets using the old API.
Consistently with checkLink(), this avoids having to explicitly include
both Iterable and Reference in shader code. Alsod allowing people to
have direct arrays of shaders, runtime-sized lists of shaders etc.
A compat include is provided on a deprecated build to avoid breaking
existing code.
The class is rather heavy (strings, STL vector) and it'll stay heavier
than strictly needed even after the planned STL cleanup -- shader users
should not bear the overhead of Array, StringView etc. that it needs in
order to compile the shader sources.
I might eventually come to a different conclusion (maybe separating
GL::Shader population and usage like doing in Vulkan with CreateInfos),
but right now this commit has the best available solution -- converting
the instance to a lightweight class containing just ID and type, and
then converting that back to a GL::Shader upon checking compilation/link
status.
While at it, also removed the not-strictly-needed Optional usage from
the header. It wouldn't work with forward-declared GL::Shader anyway.
The new async APIs were just checking the link status, and printing the
linker error. Because drivers commonly do all that in a single step,
without really separating compilation from linking (or at least that's
what I thought?), I assumed the linker error would contain *also* the
compilation error, if any.
But on a quick check with Mesa that's not the case, I only get "error:
linking with uncompiled/unspecialized shader", which is very useless.
Which means, to get proper error output, the checkLink() function now
explicitly takes a list of the input shaders. It will unconditionally go
through them at the beginning and call checkCompile() on each.
To further encourage the shaders to be passed, there's no default
argument -- so if the application calls checkCompile() on its own for
some reason, it has to pass an empty list to checkLink().
Hah this took a while, as there was no texture scaffolding in place at
all. Thus all this had to be added and tested for the first time:
* 2D textures
* 2D texture arrays
* Texture transformation uniforms
* Texture transformation UBOs
* Instanced texture offset
This also means that MeshVisualizer can be used to visualize arbitrary
(single-channel) integer textures now, not just render meshes with
object ID textures. Yay for feature parity!
In cases when specular highlights are not desired, results in 30%
speedup (on Intel) and ~25% speedup on AMD, compared to setting the
specular color to transparent black.
Testing was easy thanks to already having a ground truth image for this
case.
After several failed attempts to make UBO performance not suck on Intel
Mesa and Windows drivers, I ended up hiding the dynamic aspect under a
flag. That way it's still possible to get the proper perf in UBO
workflows that don't do light culling, and for workflows where light
culling matters the 2x slowdown might be still better than looping
through several extra lights that don't contribute anything.
Hm, I wonder why I did it this way, the operation isn't really heavy to
benefit from the hardware interpolators, and with a lot of lights we'd
hit the maximum output count in the vertex shader.
With classic uniforms this seems to be ~5% faster on both Intel and AMD
cards. With UBOs this is ~15% (!) faster on AMD (I guess because the
constant UBO access overhead is moved to just one stage instead of
both?) but slower on Intel (of course, sigh... I assume due to UBO reads
being slow and so when done for every fragment instead of every vertex it
costs more?). Since this benefits the *real* GPUs while the card that
was already awful is more awful I don't think that's a big deal. This
change stays.
This is always true in the single-draw case, since setDrawOffset()
asserts on this. In the multi-draw case this optimization doesn't make
sense, because it doesn't make sense to create a multidraw shader with
just one draw.
On an Intel 630 GPU this resulted in single-draw single-material Phong
to go from 550 ms to 440, which is roughly a 20% improvement. For the
simpler shaders the difference is even higher. The multidraw numbers
stayed the same as before, obviously.
These deliberately share the same binding (because there's very little
space), but the shader wasn't guarding that. Discovered completely by
accident when adding tests for "multidraw with all the things" -- Mesa
gives just a warning, but ANGLE straight out fails the shader
compilation, so better have an assert there.