This prepares for SSBO support, where the total count is unbounded (so
the value is ignored and can be 0).
Also regroup the doc paragraphs so it's clear what's related to UBO
usage and what applies to classic uniforms as well.
Same as in the previous commit, most cases are inputs so a StringStl.h
compatibility include will do, the only breaking change is
GL::Shader::sources() which now returns a StringIterable instead of a
std::vector<std::string> (ew).
What's awesome about this whole thing is that the Shader API now allows
creating a shader from sources coming either from string view literals
or Utility::Resource completely without having to allocate any strings
internally, because all those can be just non-owning references wrapped
with String::nullTerminatedGlobalView(). The only parts which aren't
references are the #line markers, but (especially on 64bit) those can
easily fit into the 22-byte (or 10-byte on 32bit) SSO storage.
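The size claim about the #line markers can be checked with a small
standalone sketch. It assumes the marker has the GLSL form
"#line <line> <source index>" followed by a newline, and both helper names
are hypothetical, not part of the Shader API:

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper, not part of the Magnum API: length of a GLSL
// "#line <line> <source index>\n" marker, excluding the null terminator.
std::size_t lineMarkerLength(int line, int sourceIndex) {
    // snprintf() with a null buffer and zero size only measures the output
    int length = std::snprintf(nullptr, 0, "#line %d %d\n", line, sourceIndex);
    return length < 0 ? 0 : std::size_t(length);
}

// Whether the marker fits into a small-string buffer holding `capacity`
// characters (assumed here to be 22 on 64-bit and 10 on 32-bit).
bool fitsInSmallStorage(int line, int sourceIndex, std::size_t capacity) {
    return lineMarkerLength(line, sourceIndex) <= capacity;
}
```

Under these assumptions a marker with a single-digit source index is
exactly 10 characters, so it fits both capacities, while a two-digit
source index fits only the larger one.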
Also, various Shader constructors and assignment operators had to be
deinlined in order to avoid having to include the String header, which
would be needed for Array destruction during a move.
Co-authored-by: Hugo Amiard <hugo.amiard@wonderlandengine.com>
It's not a GL error, and together with the dynamic per-vertex joint
count it means the app only needs to compile a single shader for all
skinned meshes, not one for each skeleton size, which is nice.
High-level docs with examples will be written once there's corresponding
support in MeshTools::compile() *and* in importer plugins, as skinned
meshes are usually brought in from files, never set up directly.
Co-authored-by: Squareys <squareys@googlemail.com>
With skinning the TransformationUniform*D structure will be reused for
two different UBOs and referencing it without the corresponding
bind*Buffer() API would be confusing. So just do that for all.
* No need to repeat the type for all variables, unnecessary redundancy.
* Reducing the number of redundant local variables and, if they stay,
making their definitions more localized to where they get used.
* All uniform setters used the "initial value is" phrase instead of
"default is", this one didn't.
Those were originally named set*Buffer(), but in the process of
finishing up ef9da0ec96 got changed to
bind*Buffer() to avoid a false impression that the buffer stays bound to
the shader instance forever (which it doesn't, same as with textures).
However the documentation didn't get updated, apparently.
A considerable chunk of the docs mentioned that there has to be one
ProjectionUniform3D per draw. Probably a copypaste error from the case
where there's a combined TransformationProjectionUniform3D, which *is*
one per draw. Sorry for the confusion.
There was also quite a lot of documentation content referencing the old
deprecated constructors. Fixed now.
Because it was no longer bearable with three UnsignedInt arguments in a
row, especially when some of them are only available on a subset of
platforms. And it would get even worse with the introduction of planned
features such as multiview or skinning.
Backwards compatibility is in place, as always. To ensure nothing
breaks, this commit still has all tests and snippets using the old API.
The class is rather heavy (strings, STL vector) and it'll stay heavier
than strictly needed even after the planned STL cleanup -- shader users
should not bear the overhead of Array, StringView etc. that it needs in
order to compile the shader sources.
I might eventually come to a different conclusion (maybe separating
GL::Shader population and usage like Vulkan does with CreateInfos),
but right now this commit has the best available solution -- converting
the instance to a lightweight class containing just ID and type, and
then converting that back to a GL::Shader upon checking compilation/link
status.
While at it, also removed the not-strictly-needed Optional usage from
the header. It wouldn't work with forward-declared GL::Shader anyway.
The proper practice is to have GLES and WebGL requirements separated,
as the two editions diverge more and more and treating one as a subset
of another no longer works.
Because a MeshView might not be the best thing to have when you are
submitting a batch of a thousand draws. It takes strided array views to
allow for more flexibility, but can also detect if the input is already
contiguous and use it as-is.
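The contiguity check described above can be sketched roughly as follows.
StridedView and packed() are hypothetical stand-ins written for this
illustration, not the Corrade StridedArrayView API, and the stride is
assumed to be in bytes:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal strided view, for illustration only
template<class T> struct StridedView {
    const void* data;
    std::size_t size;
    std::ptrdiff_t stride; // distance between items, in bytes

    const T& operator[](std::size_t i) const {
        return *reinterpret_cast<const T*>(
            static_cast<const char*>(data) + std::ptrdiff_t(i)*stride);
    }

    // Items are tightly packed if the stride matches the item size
    bool isContiguous() const {
        return stride == std::ptrdiff_t(sizeof(T));
    }
};

// Returns a pointer to tightly packed items. Contiguous input is used
// as-is, anything else is copied into the caller-provided scratch storage.
template<class T>
const T* packed(const StridedView<T>& view, std::vector<T>& scratch) {
    if(view.isContiguous()) return static_cast<const T*>(view.data);
    scratch.resize(view.size);
    for(std::size_t i = 0; i != view.size; ++i) scratch[i] = view[i];
    return scratch.data();
}
```

With this shape, a caller that already has tightly packed counts pays
nothing extra, while interleaved per-draw data still works at the cost of
one copy.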
UNFORTUNATELY the GL 1.0 legacy still continues to stink and so there
has to be a 64-bit-specific overload which is the *actual* variant that
doesn't allocate because glMultiDrawElements takes a `void**` for INDEX
OFFSETS and it's IN BYTES! Which foolish soul designed such a thing back
in the 1860s, I wonder. There's no reason to not have an index offset
in elements because all indices have to have the same type ANYWAY. And
yes, I wasted about three hours debugging driver crashes because I
THOUGHT this parameter takes offset in elements, not bytes.
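To make the bytes-versus-elements trap concrete, here is a minimal
standalone sketch of the conversion. The indexByteOffsets() helper is
hypothetical and calls no GL function; it only shows that each offset
handed to glMultiDrawElements is a byte offset disguised as a pointer:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts per-draw index offsets given in elements
// to the pointer-typed byte offsets that glMultiDrawElements expects in
// its `indices` argument. `indexTypeSize` is 1, 2 or 4 depending on the
// index type.
std::vector<const void*> indexByteOffsets(
    const std::vector<std::uint32_t>& offsetsInElements,
    std::size_t indexTypeSize)
{
    std::vector<const void*> out;
    out.reserve(offsetsInElements.size());
    for(std::uint32_t offset: offsetsInElements)
        // Multiply by the index size first, then disguise as a pointer
        out.push_back(reinterpret_cast<const void*>(
            std::uintptr_t(offset)*indexTypeSize));
    return out;
}
```

This version always allocates. On 64-bit platforms an array of 64-bit byte
offsets presumably already has the same layout as an array of pointers,
which would be why only the 64-bit-specific overload can skip the
allocation.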
Also note: on 32-bit platforms this depends on latest Corrade with the
CORRADE_TARGET_32BIT definition. Spent an embarrassing amount of time
wondering why all local builds but Emscripten work.
This is always true in the single-draw case, since setDrawOffset()
asserts on this. In the multi-draw case this optimization doesn't make
sense, because it doesn't make sense to create a multidraw shader with
just one draw.
On an Intel 630 GPU this made single-draw single-material Phong go
from 550 ms to 440 ms, which is roughly a 20% improvement. For the
simpler shaders the difference is even higher. The multidraw numbers
stayed the same as before, obviously.
While it's one additional indirection (that has an extra cost on Intel
GPUs apparently, like with Phong and MeshVisualizer and
DistanceFieldVector already), with the assumption that draws usually
share the material info it allows cramming more draws into the 16/64k UBO
limit as the per-draw data are now one vec4 smaller.
For the indirection overhead I can imagine adding a new flag which makes
material mapping implicit (materialId == drawId). That seems to put the
benchmark numbers back to the original speed. Same could be done for
other shaders.