Only the double variants (since Python doesn't really differentiate
between 32bit and 64bit floats) and directly into math to mimic Python's
math module.
Only the double ones, exposed as floats, because the extra ALU required
by doubles is negligible to function call overhead. It'll be different
for non-scalar types, but here I use this.