To avoid combinatorial explosion of functions that have to be
implemented on the plugin side, the single-image variants proxy to the
multi-level variants, if available.
THe plugin interface string had to be bumped because otherwise there was
a *really nasty* silent ABI break where instead of doSetFlags() it was
suddenly calling doConvert().
The "funniest" of all is that those get printed in a totally random
fashion. Some in Debug, some in Release, some not at all (like, half of
the MaterialDataTest doesn'ŧ initialize `state` and yet it doesn't
complain). Fuck these half-assed features that achieve nothing except
bullying people WHO KNOW WHAT THEY ARE DOING FFS.
Because::This::Is::Colon::Cancer. Also rename Cube to CubeMap and get
rid of the comment that said cube map images are six consecutive 2D
images. Now it's one 3D image instead because that makes more sense, in
the very rare case we'd need to have six different images again we could
probably add a CubeFace value or some such.
Interestingly enough, there were no docs whatsoever for image and scene
conversion, and neither it was mentioned what all scene data can be
imported. Sigh.
We have half-float vectors and matrices, so why not these as well. Not
sure for what all is the angle precision usable, but at the very least
it could be useful for compact meshlet occlusion cone / AABB
representation or rough animations.
This is already done for the AbstractImporter and the new
AbstractShaderConverter, as there's a common use case of checking just
the filename for input/output path or file type detection and then
delegating to the common implementation working directly on data.
Minor but very important convenience feature, especially useful when
dealing with command-line apps. This now works:
magnum-imageconverter a.png a.jpg -c jpegQuality=0.75
The AnyImageConverter gets the jpegQuality option and then
automatically propagates it to the concrete plugin (which is either
JpegImageConverter or StbImageConverter), possibly warning in case the
target plugin doesn't recognize given option (i.e., doesn't list it in
its default configuration). Previously the user had to always specify a
concrete converter implementation using -C, which was rather annoying
and nonintuitive.
In cases when specular highlights are not desired, results in 30%
speedup (on Intel) and ~25% speedup on AMD, compared to setting the
specular color to transparent black.
Testing was easy thanks to already having a ground truth image for this
case.
After several failed attempts to make UBO performance not suck on Intel
Mesa and Windows drivers, I ended up hiding the dynamic aspect under a
flag. That way it's still possible to get the proper perf in UBO
workflows that don't do light culling, and for workflows where light
culling matters the 2x slowdown might be still better than looping
through several extra lights that don't contribute anything.
While (of course) having zero effect on a single-light scenario, with
five lights it saves about 10% in the classic uniform case (on Intel).
Not bad (also, FFS, what the compiler is doing if it's not able to
optimize this?!).
Hm, I wonder why I did it this way, the operation isn't really heavy to
benefit from the hardware interpolators, and with a lot of lights we'd
hit the maximum output count in the vertex shader.
With classic uniforms this seems to be ~5% faster on both Intel and AMD
cards. With UBOs this is ~15% (!) faster on AMD (I guess because the
constant UBO access overhead is moved to just one stage instead of
both?) but slower on Intel (of course, sigh... I assume due to UBO reads
being slow and so when done for every fragment instead of every vertex it
costs more?). Since this benefits the *real* GPUs while the card that
was already awful is more awful I don't think that's a big deal. This
change stays.
"Luckily", thanks to the DRAW_COUNT=1 and MATERIAL_COUNT=1 optimizations
not everything blows up, so i don't need to skip absolutely everything,
unfortunately Phong lights are affected by this insane crapfest as well
so basically nothing from Phong UBO support is tested there. FFS.
Unlike the drawId optimization before, there's no possibility to
check this anywhere, so the assumption is just documented.
On an Intel 630 this resulted in further significant reduction for the
single-draw single-material case, down to 260 from 440 in the previous
commit, about a 45% reduction compared to the original 550 ms; multidraw
case is still around the 550 there.
This is always true in the single-draw case, since setDrawOffset()
asserts on this. In the multi-draw case this optimization doesn't make
sense, because it doesn't make sense to create a multidraw shader with
just one draw.
On an Intel 630 GPU this resulted in single-draw single-material Phong
to go from 550 ms to 440, which is roughly a 20% improvement. For the
simpler shaders the difference is even higher. The multidraw numbers
stayed the same as before, obviously.