This makes the Vector3 to np.array conversion about 24x faster. Yes, *that*
much. Crazy. Timings from the benchmark added in the previous commit, before:
    np.array([])                                         0.66096 µs
    np.array([1.0, 2.0, 3.0])                            0.70623 µs
    a = array.array("f", [1.0, 2.0, 3.0]); np.array(a)   0.57877 µs
    a = Vector3(1.0, 2.0, 3.0); np.array(a)             18.18542 µs
after:
    np.array([])                                         0.57162 µs
    np.array([1.0, 2.0, 3.0])                            0.68309 µs
    a = array.array("f", [1.0, 2.0, 3.0]); np.array(a)   0.53958 µs
    a = Vector3(1.0, 2.0, 3.0); np.array(a)              0.74818 µs
I think there's still some overhead that could be removed, which would
make the Vector3-to-numpy conversion faster than list-to-numpy.
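For reference, timings like the ones above can be collected with the
standard timeit module. This is only a sketch and not the benchmark from
the previous commit: the time_us() helper and its parameters are made up
for illustration, the NumPy cases run only if NumPy is installed, and the
Vector3 case is left out so the snippet has no dependency on the bindings.

```python
import importlib.util
import timeit


def time_us(stmt, setup="pass", number=20_000, repeat=3):
    """Best-of-`repeat` time for one execution of `stmt`, in microseconds."""
    timer = timeit.Timer(stmt, setup=setup)
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6


# (statement, setup) pairs; the first one needs only the standard library
cases = [
    ("list(a)", 'import array; a = array.array("f", [1.0, 2.0, 3.0])'),
]

# The NumPy cases are added only when NumPy can be imported
if importlib.util.find_spec("numpy") is not None:
    cases += [
        ("np.array([])", "import numpy as np"),
        ("np.array([1.0, 2.0, 3.0])", "import numpy as np"),
        ("np.array(a)",
         'import array; import numpy as np; '
         'a = array.array("f", [1.0, 2.0, 3.0])'),
    ]

results = {stmt: time_us(stmt, setup) for stmt, setup in cases}
for stmt, us in results.items():
    print(f"{stmt:<30} {us:.5f} µs")
```

Taking the minimum over several repeats is what the timeit documentation
recommends, since the higher values are mostly noise from other processes.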
Lots of optimization opportunities here. In particular, the conversion
of Vector3 to np.array is *crazy slow*, which turns out to be caused
mainly by the overhead of exception throwing in pybind11. In the case of
the Matrix3 to np.array conversion there's no such overhead because the
buffer protocol takes care of that.
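To illustrate where an exception can come from in such a conversion: when
an object offers only __len__ and __getitem__, Python's sequence iteration
finds the end by asking for one index too many and catching the
IndexError. The sketch below shows that in pure Python. Vec3 is a
hypothetical stand-in, not the actual binding, and whether NumPy takes
exactly this path for Vector3 is an assumption.

```python
class Vec3:
    """Hypothetical three-component vector exposing only the sequence protocol."""

    def __init__(self, x, y, z):
        self._data = (x, y, z)
        self.index_errors = 0  # counts how often iteration hit the end

    def __len__(self):
        return 3

    def __getitem__(self, i):
        if not 0 <= i < 3:
            # This exception is the only signal that the sequence is over
            self.index_errors += 1
            raise IndexError("Vec3 index out of range")
        return self._data[i]


v = Vec3(1.0, 2.0, 3.0)

# There is no __iter__, so list() falls back to calling __getitem__ with
# 0, 1, 2, 3 and stops once the IndexError is raised for index 3
values = list(v)
print(values, v.index_errors)
```

In pure Python that one exception is cheap. When it has to be thrown in
C++ and translated into a Python exception it can cost considerably more,
which is the kind of overhead described above.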
Another thing is that the pybind11 buffer protocol interface has a
relatively large overhead compared to e.g. Python's own array.array. I
blame the unneeded allocations.
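For comparison, this is what the buffer protocol looks like from the
Python side with array.array, using only the standard library. A consumer
such as memoryview gets the element format, item size and shape directly
and shares the memory with the array instead of iterating over it. It's a
minimal sketch and says nothing about how pybind11 implements the same
thing internally.

```python
import array

a = array.array("f", [1.0, 2.0, 3.0])

# memoryview requests the buffer from the array, no elements are copied
view = memoryview(a)
info = (view.format, view.itemsize, view.shape, view.nbytes)
print(info)

# The view shares memory with the array, so a write through the array is
# visible through the view
a[0] = 5.0
shared = view[0] == 5.0
print(shared)

# Release the buffer so the array can be resized again
view.release()
```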
Some non-trivial tricks were needed in order to expose the
GL::defaultFramebuffer variable without causing leaks or double frees.
Also, AbstractFramebuffer has a protected destructor, so it needs
special handling as well.
The clash of static constructors and members / properties is ...
unfortunate. This is resolved using a minor hack, but I think that's
warranted if it preserves C++/Python API compatibility. The
translation() static constructor and property are not done yet, though.
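To show what the clash is about: in Python a class attribute name can be
bound only once, so a static constructor and a property of the same name
can't simply coexist. One possible way around that in pure Python is a
descriptor that behaves differently on the class and on an instance. This
is only an illustration and not necessarily the hack used here. Both
static_or_property and the toy Transform2D type with its scaling
attribute are hypothetical.

```python
class static_or_property:
    """Hypothetical descriptor: a static constructor when accessed on the
    class, a read-only property when accessed on an instance."""

    def __init__(self, constructor, getter):
        self._constructor = constructor
        self._getter = getter

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed as Type.name, hand back the constructor function
            return self._constructor
        # Accessed as obj.name, evaluate the getter like a property would
        return self._getter(instance)


class Transform2D:
    """Toy type for illustration, not a real binding."""

    def __init__(self, sx=1.0, sy=1.0):
        self._scaling = (sx, sy)

    scaling = static_or_property(
        constructor=lambda vector: Transform2D(*vector),
        getter=lambda self: self._scaling,
    )


t = Transform2D.scaling((2.0, 3.0))  # static constructor
print(t.scaling)                     # property on the instance
```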
Only the double variants are exposed (since Python doesn't really
differentiate between 32bit and 64bit floats), and they go directly into
math to mimic Python's math module.
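As a quick check of the claim about Python floats: the built-in float is
a C double on common platforms, so a value survives a round trip through
a 64bit representation unchanged but loses precision when squeezed
through a 32bit one. The sketch uses only the standard struct and sys
modules.

```python
import struct
import sys

# Number of mantissa bits of the built-in float, 53 for IEEE 754 doubles
mantissa_bits = sys.float_info.mant_dig
print(mantissa_bits)

x = 0.1

# Round trip through a 64bit double keeps the value bit for bit
as_double = struct.unpack("d", struct.pack("d", x))[0]

# Round trip through a 32bit float rounds it to the nearest float32
as_float = struct.unpack("f", struct.pack("f", x))[0]

print(as_double == x, as_float == x)
```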
Only the double ones, exposed as floats, because the extra ALU work
required by doubles is negligible compared to function call overhead.
It'll be different for non-scalar types, but here this is what I use.