And use static functions with an explicit "self" pointer instead. Those
have half the size (8 vs 16 bytes on 64-bit x86), which in turn reduces
the state tracker memory use by about 750 bytes. On desktop GL with an
Intel GPU & Mesa this reduces the state tracker allocation size by almost
10%, from 8.3 kB to 7.6 kB. Not bad.
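Roughly, the change is the following (a condensed sketch with illustrative
names, not the actual state tracker internals):

    class Buffer {
        public:
            void bind();

        private:
            /* implementations picked at runtime based on available extensions */
            static void bindImplementationDefault(Buffer& self);
            static void bindImplementationDSA(Buffer& self);
    };

    struct BufferState {
        /* before: void(Buffer::*bindImplementation)() -- 16 bytes on 64-bit x86 */
        void(*bindImplementation)(Buffer& self); /* after: 8 bytes */
    };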
Apart from the small memory savings, this also removes the need to include
the full class definition in the State headers on MSVC (because
on that compiler the member function pointer size is different based on
whether the type definition is known or not, IMAGINE THAT BEING A
FEATURE AND NOT A BUG), leading to fewer header dependencies and better
incremental compile times there.
This was already done in some cases (and the Vk library used this from
the beginning), and as I'm about to add some more extension-dependent
functionality it felt like a good time to finish that change, finally.
In some cases the *Implementation() could even be dropped in favor of
pointing to the GL API directly (as is already done for various
glUniform*() calls); that'd be another step -- this is good enough for
now.
All the tests were updated to explicitly check that non-null-terminated
strings get handled properly (the GL label APIs take an explicit size,
so they *should* be, but just in case). Also, because various subclasses
override the setter to return the correct type for method chaining and the
override has to be deinlined to avoid relying on a StringView include,
the tests are now explicitly done for each leaf class, instead of the
subclass.
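For illustration, a test along these lines (the setLabel() / label() names
match the public API, the rest is a condensed sketch):

    GL::Buffer buffer;
    Containers::StringView label = "this is a buffer!";
    /* a non-null-terminated prefix should result in exactly "this" being
       set as the label, nothing more, nothing less */
    buffer.setLabel(label.prefix(4));
    CORRADE_COMPARE(buffer.label(), "this");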
The <string> being removed from the base class for all GL objects may
affect downstream projects that relied on it being included. In the case
of Magnum itself, the breakages were already fixed in the previous commit.
Compile time improvement for the MagnumGL library alone is 0.2 seconds, or
about 4% (6.1 seconds before, 5.9 after). Not bad, given that there are
three more files to compile and strings are still heavily used in other GL
debug output APIs and all the shader code. For a build of just the GL
library and all tests, it goes down from 28.9 seconds to 28.1. Most
tests also still rely quite heavily on std::stringstream for debug
output testing, so the numbers could still go further down.
The type is now extended to 32 bits. In the GL and Vk libraries it means
one can now do things like
MeshIndexType type = meshIndexTypeWrap(GL_UNSIGNED_BYTE);
and passing that to GL::Mesh or Vk::Mesh will cause it to use the value
directly, instead of doing a mapping from a generic type. The *real* use
case for this is however to allow custom index buffer representations in
Trade::MeshData. Support for that will be hooked up in the following
commit.
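For example, setting up an index buffer with a GL-specific index type might
then look like this (a sketch; the indexBuffer and indexCount variables are
assumed to exist):

    GL::Mesh mesh;
    mesh.setCount(indexCount)
        .setIndexBuffer(indexBuffer, 0, meshIndexTypeWrap(GL_UNSIGNED_BYTE));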
It was used just for certain corner cases that weren't already covered by
{EXT,OES}_draw_elements_base_vertex, but because of a misunderstood
extension interaction the EXT/OES multidraw entrypoint (which
isn't implemented on ANGLE) was used by these as well.
And while trying to disable the two extensions for the MeshGLTest, I
discovered the test half-expects the ANGLE variant to be implemented, and
so I finished it for both the single-draw and multi-draw cases.
The extension interaction that makes base-vertex multidraw calls broken
on ANGLE will be fixed in the following commit.
Yes, this is only ever provided by an ANGLE extension, and as you might
have guessed, The Great Google Overlords didn't bother to have this
supported outside of their browser.
Since the main point of the low-level API is to make use of the
EXT_multi_draw_arrays / ANGLE_multi_draw / WEBGL_multi_draw extensions
for a reduced driver overhead *and* the builtin gl_DrawID variable, it
just doesn't make sense to provide a fallback to draw() in a loop. When
the extensions are not available, the user has to do extra work in order
to supply a proper UBO draw offset for each draw call, and thus the
fallback path has basically no practical use.
On the other hand, draw() taking the MeshView instances is kept as-is --
this one is rather old and so breaking compatibility would be an
unnecessary pain point. Plus, since it has to allocate the contiguous
arrays internally, it's not desirable for any fast-path rendering
anyway.
Because a MeshView might not be the best thing to have when you are
submitting a batch of a thousand draws. It takes strided array views to
allow for more flexibility, but can also detect if the input is already
contiguous and use it as-is.
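The contiguity handling is roughly like the following (a sketch with
illustrative names, using Corrade's StridedArrayView; the actual signatures
differ):

    void submit(Containers::StridedArrayView1D<const UnsignedInt> counts) {
        if(counts.isContiguous()) {
            /* tightly packed already, can go to the GL call as-is */
            const UnsignedInt* data = &counts[0];
            // ... pass data directly
        } else {
            /* otherwise gather the values into a temporary contiguous array */
            Containers::Array<UnsignedInt> storage{counts.size()};
            for(std::size_t i = 0; i != counts.size(); ++i)
                storage[i] = counts[i];
            // ... pass storage.data() instead
        }
    }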
UNFORTUNATELY the GL 1.0 legacy still continues to stink and so there
has to be a 64-bit-specific overload which is the *actual* variant that
doesn't allocate because glMultiDrawElements takes a `void**` for INDEX
OFFSETS and it's IN BYTES! Which foolish soul designed such a thing back
in the 1860s, I wonder. There's no reason to not have an index offset
in elements because all indices have to have the same type ANYWAY. And
yes, I wasted about three hours debugging driver crashes because I
THOUGHT this parameter takes offset in elements, not bytes.
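For the record, this is roughly what the call expects (a sketch with
made-up data, assuming 16-bit indices and a GL header being included):

    #include <cstdint>

    const GLsizei counts[]{12, 30};
    const void* const offsets[]{
        nullptr,                                    /* first draw starts at byte 0 */
        reinterpret_cast<void*>(std::uintptr_t{24}) /* 12 16-bit indices = byte 24, not "index 12" */
    };
    glMultiDrawElements(GL_TRIANGLES, counts, GL_UNSIGNED_SHORT, offsets, 2);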
Also note: on 32-bit platforms this depends on the latest Corrade with the
CORRADE_TARGET_32BIT definition. Spent an embarrassing amount of time
wondering why all local builds worked but the Emscripten one didn't.
This means that instead of 12 separate allocations we have just one,
allocating everything together in a contiguous piece of memory. That
should also be a bit more cache-friendly when accessing the state, as
it's not scattered around the memory like crazy.
Because there are no Pointer indirections needed anymore, the State
members are just references now. That resulted in a lot of sweeping
changes around the whole GL library, but they're all trivial, mostly
changing `->` to `.`.
There are two more nested allocations in the TextureState struct; will
take care of them in a separate commit.
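The overall shape of the change (illustrative only, not the actual State
members):

    struct BufferState { /* ... */ };
    struct MeshState { /* ... */ };

    /* one allocation holding all the substates together */
    struct State {
        BufferState buffer;
        MeshState mesh;
        /* ... */
    };

    struct ContextState {
        /* before: Containers::Pointer<BufferState> buffer etc., each
           allocated separately and accessed with -> */
        BufferState& buffer; /* after: references into the single State */
        MeshState& mesh;
    };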
As usual, the old APIs are still present, but marked as deprecated.
Existing code is not updated yet to ensure I didn't break anything with
this.
This way it's much more intuitive and makes the code shorter and nicer
in many cases. Shaders are now also able to hide irrelevant
draw/dispatch APIs to avoid accidents.
Better for catching accidents, as picking a wrong primitive / index type
can lead to *serious* rendering issues. Similarly to a change done to
(Compressed)PixelFormat in 2019.10.
Good thing I checked this -- based on WebGL I was under the assumption that
all GPUs have it at just 256, but that was really just a WebGL limitation.
Also, ugh, why wasn't this query there since the 90s?
Bloaty says it saved 10 kB in a Debug build of MagnumGL:

        VM SIZE                       FILE SIZE
     --------------                --------------
      [ = ]       0 .debug_info     +1.59Ki  +0.0%
      +0.4% +1.50Ki .text           +1.50Ki  +0.4%
      [ = ]       0 .debug_str         +409  +0.0%
      [ = ]       0 .debug_line        +276  +0.1%
      [ = ]       0 .debug_abbrev       +20  +0.0%
     -28.6%      -2 [LOAD [RX]]          -2 -28.6%
      [ = ]       0 [Unmapped]      -4.28Ki -41.0%
     -22.7% -9.23Ki .rodata         -9.23Ki -22.7%
      -0.8% -7.73Ki TOTAL           -9.73Ki  -0.1%
And 4 kB in Release:

        VM SIZE                       FILE SIZE
     --------------                --------------
      +1.1% +3.44Ki .text           +3.44Ki  +1.1%
      +1.7% +1.39Ki .eh_frame       +1.39Ki  +1.7%
      [ = ]       0 [Unmapped]         +656   +51%
     -25.5% -9.47Ki .rodata         -9.47Ki -25.5%
      -0.7% -4.64Ki TOTAL           -4.00Ki  -0.4%
That's not negative, so I guess that's good. This change is of course
more significant in the context of a minimal WebGL build, where the exe
can be as little as 50 kB -- there 4 kB is almost 10% of the size.
Deprecated in 2018.04, so it's been almost a year since. Whoever is using
Magnum regularly has updated already, and whoever hasn't can always upgrade
gradually (2018.02, 2018.04, 2018.10, 2019.01 etc.).
Fixing the new (and now failing) test from the previous commit. In
setIndexBuffer(), I was resetting the state tracker to a VAO state that
was about to be set in the very next step, and then, when doing that
next step, the state tracker "optimized away" the state change because it
thought it was already done (even though it wasn't). Reordering the two
operations fixes it.
In comparison to how this was meant to be done in the original
169031fb7b, the new way should do the same
but additionally avoid a bunch of redundant state calls. Let's hope no
more bugs related to this appear.
This reverts commit f6ba4111e1, which in
turn reverts commit 4ce2875262 from 2015.
Turns out glDrawRangeElements() *is* fixed now in Firefox, but is broken
in Emscripten because their function dependency handling doesn't work
correctly. Related PR: https://github.com/kripken/emscripten/pull/7112
Reverting this until the Emscripten PR is merged and a release containing
the fix is widespread enough (assuming a year-long delay should do).
The change done in 680144f1c5 was not
properly handling these cases:
* Mesh(NoCreateT) and wrap() were not constructing the internal vector,
which blew up when move-assigning another instance.
* ~Mesh() was not destructing the internal vector if the VAO ID was
zero or non-owning wrap() was used.
Strangely enough none of these were causing *any* problems for me on
Linux -- even ASan was totally quiet and, due to the unfortunate
combination of bugs, even assigning totally random data to the storage
vector went unnoticed. This however blew up on MSVC, presumably because
the implementation there does more checking.
Because it's possible to construct a Mesh with no GL context available,
the move construction and destruction need to avoid accessing the Context
unless really necessary (it would also be unclear which type of vector
should be constructed if we have no context).
Extended the tests to hopefully cover all these cases.
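A condensed sketch of what this class of bug looks like (illustrative, not
the actual Mesh internals) -- the storage is constructed lazily with a
placement new, so every constructor, move operation and the destructor have
to agree on whether it's currently alive:

    #include <new>
    #include <vector>

    struct AttributeStorage {
        alignas(std::vector<int>) char data[sizeof(std::vector<int>)];

        void construct() { new(data) std::vector<int>{}; }
        void destruct() {
            reinterpret_cast<std::vector<int>*>(data)->~vector();
        }
    };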
This allows moving Buffer instances without fear of breaking stuff. I
thought this had been in place for a long time already, but doesn't seem
so. This reintroduces a header dependency, but until I have
Containers::Storage or something that *can* store a forward-declared type
safely, it's inevitable.
This is actually a preparation for making buffer-owning meshes a
possibility (where I would otherwise need a union of vectors);
nevertheless, it removes the dependency on a vector.
Was Magnum::GL::Extensions::GL before and the redundancy was completely
unnecessary. Potential future extensions coming from GLX, EGL or whatnot
will most probably be in the Platform namespace in a completely separate
file, so this is not a problem.
All code internal to the GL library is affected, but not much on the
outside, as that is handled by the compatibility alias.
Similarly to pixel formats, there are now generic Magnum::MeshPrimitive
and Magnum::MeshIndexType enums, convertible to GL::MeshPrimitive and
GL::MeshIndexType using GL::meshPrimitive() and GL::meshIndexType(). In
addition, the following is done (a short conversion sketch follows the
list):
* The original GL::Mesh::IndexType is now GL::MeshIndexType, the original
name is now just a typedef.
* GL::Mesh::indexSize() is deprecated in favor of
Magnum::meshIndexTypeSize() and GL::Mesh::indexTypeSize().
* New GL::Mesh::indexType() and GL::MeshView::mesh() getters (not sure
why they were omitted before).
* GL::Mesh::indexType(), GL::Mesh::indexTypeSize() and
GL::MeshView::setIndexRange() now expect that the mesh is indexed (a
useful property in my opinion, which also avoids getting random results).
* The extra MeshPrimitive::LinesAdjacency etc. are still present for
backwards compatibility, but marked as deprecated. Use
GL::MeshPrimitive values instead.
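And the promised conversion sketch, using the new names from the list
above:

    GL::MeshPrimitive primitive = GL::meshPrimitive(MeshPrimitive::Triangles);
    GL::MeshIndexType type = GL::meshIndexType(MeshIndexType::UnsignedShort);
    UnsignedInt size = meshIndexTypeSize(MeshIndexType::UnsignedShort); /* 2 */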