Sigh, again. That's what I get for removing std::string only from some
APIs and not all. Explicitly fetching the argument values as a
StringView everywhere now, it avoids a copy so it's good to do even if
not strictly needed.
Thus it has no place in the general overview docs of Vk::Instance and
Vk::Device. Better to put it into the Vulkan wrapping docs, where it
also makes sense to have both the device and instance side together.
Makes it possible to write Vk::Instance instance{{argc, argv}} which is
a good tradeoff between passing no arguments at all and doing the fully
verbose thing.
More convenient to use since one doesn't have to explicitly store a
DeviceProperties instance to call pickQueueFamily() on it and then move
it to DeviceCreateInfo to keep it efficient.
This required a slight redesign of how DeviceProperties are stored in
DeviceCreateInfo; the instance now always needs to be valid (but it may
never get used). The wrap() API isn't doing any extra work so this
won't add any inefficiency.
For some reason it wants me to allocate 16 bytes more. Why can't that be
stored somewhere else, I wonder?
Hm, and for this I implemented VK_KHR_driver_properties only to discover
that the info is not queryable if we run the tests with
KHR_get_physical_device_properties2 disabled. Sigh.
Today I spent six hours wrongly convincing myself that it's a driver bug
when vkGetPhysicalDeviceProperties2() is null on a 1.1 instance for a
1.0 physical device. It's not a bug, it's me not reading specs
carefully.
This commit thus basically moves all Instance-level extension-dependent
state to DeviceProperties, because it's actually device-dependent. Which
makes the DeviceProperties class quite heavy and thus it's good it was
readied to be transferred all the way to a Device instance a few commits
back -- I don't really want to do all the dispatch, string processing,
sorting and other mess more times than strictly necessary.
In addition, DeviceProperties::apiVersion() got renamed to version() and
a new isVersionSupported() API got added, mirroring what's on Device
itself; plus thanks to the chicken-and-egg problem of having to call
vkGetPhysicalDeviceProperties() twice, the device version and other
things can now be retrieved in a slightly more efficient way.
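
For illustration, a minimal sketch of what a version check like
isVersionSupported() boils down to. The bit layout is the one
VK_MAKE_VERSION produces (major from bit 22 up, minor in bits 12 to 21,
patch in the low 12 bits), so packed versions compare correctly as plain
integers. The packVersion() helper and the FakeDeviceProperties struct
are made-up stand-ins for this example, not the actual API:

```cpp
#include <cassert>
#include <cstdint>

/* Same bit layout as VK_MAKE_VERSION: major << 22 | minor << 12 | patch.
   Hypothetical helper, named just for this sketch. */
constexpr std::uint32_t packVersion(std::uint32_t major, std::uint32_t minor, std::uint32_t patch = 0) {
    return (major << 22) | (minor << 12) | patch;
}

/* Stand-in for a class that caches the version reported by the physical
   device so it doesn't have to be queried again */
struct FakeDeviceProperties {
    std::uint32_t reportedVersion;

    constexpr std::uint32_t version() const { return reportedVersion; }

    /* Major is in the highest bits, so an integer comparison is enough */
    constexpr bool isVersionSupported(std::uint32_t requested) const {
        return reportedVersion >= requested;
    }
};

static_assert(packVersion(1, 1, 0) == 0x401000u, "unexpected bit layout");
```

With this, a device reporting 1.0.3 passes a check for 1.0 but not for
1.1, independently of what the instance itself supports.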
This avoids allocating a potentially large array in case just the first
device is needed. Originally I did this in the hope of avoiding a stall
and CPU power management issues due to some bad shit in the AMD driver,
but it seems that enumerating even just one device still makes it stall.
Sigh.
Similarly to Corrade Assert.h, so it's possible to include this header
without having to worry about irrelevant overhead when asserts are
disabled.
Also test it properly, as it should have been from the start.
I'm a bit unsure why there's a device extension that actually gets used
on an instance and doesn't need any enabling (and thus there's
currently no way to disable it to test all code paths, hmm).
I have more and more cases where I need to query device properties later
down the road (memory capabilities, device name, ...) and leaving all
this up to the user / making this impossible to do in the library
internals is complicating everything too much.
Since there's a shitton of device properties with a new bag of props
coming with every other new extension, I expect the queries to get quite
involved / complicated over time (chaining 100s of structs and such), so
let's design this upfront in a way that can avoid repeatedly querying
the same thing just because we needlessly discarded a fully populated
instance before.
It also means the users don't need to drag their own DeviceProperties
instance along anymore and can just let the Device take care of that.
Unfortunately the only nice way to make this work with DeviceCreateInfo
method chaining is to add & and && overloads for each. But it's quite
easy to test that all of them work and properly return an r-value
reference so it shouldn't be too much of a maintenance nightmare.
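
A minimal sketch of the & and && overload pattern, assuming a made-up
FakeCreateInfo class with a single chainable addExtension() method (both
names are hypothetical, the real DeviceCreateInfo has a different
interface). The static_assert lines show the kind of return type check
mentioned above:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

class FakeCreateInfo {
    public:
        /* Called on l-values, returns an l-value reference for chaining */
        FakeCreateInfo& addExtension(const char* name) & {
            _extensions.push_back(name);
            return *this;
        }

        /* Called on temporaries. Inside the function *this is an l-value,
           so the call below picks the & overload; the result is then
           handed back as an r-value reference so the whole chain can be
           moved from. */
        FakeCreateInfo&& addExtension(const char* name) && {
            addExtension(name);
            return std::move(*this);
        }

        std::size_t extensionCount() const { return _extensions.size(); }

    private:
        std::vector<const char*> _extensions;
};

/* Verify each overload returns the reference category it should */
static_assert(std::is_same<decltype(std::declval<FakeCreateInfo&>().addExtension("")),
    FakeCreateInfo&>::value, "l-value overload should return an l-value reference");
static_assert(std::is_same<decltype(std::declval<FakeCreateInfo>().addExtension("")),
    FakeCreateInfo&&>::value, "r-value overload should return an r-value reference");
```

A chain starting from a temporary, such as
FakeCreateInfo{}.addExtension("a").addExtension("b"), then stays an
r-value all the way and can be moved into whatever consumes it.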
You won't believe it, but it took me over a month of sitting on the
shitter until this design idea materialized out of [..] air. The whole
story, in order:
- Vulkan doesn't allow one VkDeviceMemory to be mapped more than once.
This is rather sad: since Vulkan best practices suggest allocating a
large block and suballocating from that, the engine needs an extra
layer that "emulates" mapping the suballocations for the users, but
behind the scenes it inevitably has to map the whole VkDeviceMemory
anyway and keep it mapped for as long as any of the sub-mappings is
active.
- Because if it mapped just a certain suballocation and the user then
wanted to map another suballocation, it would have to discard the
original mapping and create a new one spanning both suballocations,
and that has a risk of suddenly being in a different VM block, making
all pointers to the previous mapping invalid.
- The Vulkan Memory Allocator implements this approach of mapping the
whole thing and because of all the bookkeeping it doesn't give
direct access to the underlying VkDeviceMemory, making it rather
hard to integrate.
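
The "map the whole block, keep it mapped while any sub-mapping is
active" approach described in the list above can be sketched roughly
like this. It's a toy model only: FakeMemoryBlock is a hypothetical name,
a std::vector stands in for the VkDeviceMemory and the comments mark
where the real vkMapMemory() / vkUnmapMemory() calls would go. It is not
how VMA is actually implemented:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class FakeMemoryBlock {
    public:
        explicit FakeMemoryBlock(std::size_t size): _storage(size) {}

        /* "Map" a suballocation. The whole block is mapped on the first
           request, every later request just offsets into that mapping,
           so pointers handed out earlier stay valid. */
        char* mapRange(std::size_t offset, std::size_t size) {
            assert(offset + size <= _storage.size());
            if(_activeMappings++ == 0) {
                /* vkMapMemory() for the whole block would go here */
                _base = _storage.data();
                ++_mapCalls;
            }
            return _base + offset;
        }

        /* The block gets unmapped only when the last sub-mapping is gone */
        void unmapRange() {
            assert(_activeMappings > 0);
            if(--_activeMappings == 0) {
                /* vkUnmapMemory() would go here */
                _base = nullptr;
            }
        }

        bool isMapped() const { return _base != nullptr; }
        std::size_t mapCalls() const { return _mapCalls; }

    private:
        std::vector<char> _storage;
        char* _base = nullptr;
        std::size_t _activeMappings = 0;
        std::size_t _mapCalls = 0;
};
```

Mapping two different ranges results in a single underlying map call,
and the block stays mapped until both ranges are unmapped again.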
Here I realized that:
- Most allocations won't need to be mapped ever, so the hiding and
obfuscation done by VMA isn't needed for those --- and we want
interoperability with 3rd party code, so preventing access to
VkDeviceMemory is out of the question.
- There's KHR_dedicated_allocation, which (probably?) wasn't around
when VMA was originally designed. The extension was created because
a dedicated allocation actually *does* make sense in certain
cases and on certain architectures. Providing a way to make those
thus shouldn't be something "temporary, until a real allocator
exists" but rather a well-designed API that's there to stay.
- Except for iGPUs, the usual way to populate a GPU buffer would be to
first copy the data to some host-accessible scratch buffer and then
do a GPU-side copy of that buffer to device-local memory. The
scratch buffer is very likely to have a vastly different
suballocation scheme than GPU buffers (grow & discard everything
once it's all uploaded, for example) so again trying to put the two
under the same allocator umbrella doesn't make sense.
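
To show how different the scratch buffer suballocation scheme can be, a
minimal sketch of a linear allocator that only hands out offsets and
discards everything at once. FakeScratchAllocator and OutOfSpace are
hypothetical names and the capacity is fixed to keep the example short;
growing the underlying buffer is left out:

```cpp
#include <cassert>
#include <cstddef>

/* Returned when a request doesn't fit into the remaining space */
constexpr std::size_t OutOfSpace = ~std::size_t{};

class FakeScratchAllocator {
    public:
        explicit FakeScratchAllocator(std::size_t capacity): _capacity{capacity} {}

        /* Returns an offset into the scratch buffer. The alignment is
           assumed to be a power of two. */
        std::size_t allocate(std::size_t size, std::size_t alignment) {
            const std::size_t offset = (_used + alignment - 1) & ~(alignment - 1);
            if(offset + size > _capacity) return OutOfSpace;
            _used = offset + size;
            return offset;
        }

        /* Once everything is uploaded, all suballocations are discarded
           at once, there's no per-allocation free */
        void reset() { _used = 0; }

        std::size_t used() const { return _used; }

    private:
        std::size_t _capacity;
        std::size_t _used = 0;
};
```

There's no free list and no fragmentation handling at all, which is the
point: a general-purpose allocator for long-lived GPU buffers needs both,
the scratch buffer needs neither.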
Thus:
- To avoid implementing a full-blown allocator right from the start,
we'll first provide convenience APIs only for dedicated allocations
-- making it possible to transfer memory ownership to an
Image/Buffer so it can be treated the same way as in GL, and later
having the Image/Buffer constructor implicitly allocate a dedicated
VkDeviceMemory.
- This default allocation will be subsequently equipped with
KHR_dedicated_allocation bits.
- Thanks to the extensible/layered nature of the design, the user is
still capable of being completely in control of allocations,
managing VkDeviceMemory sub-allocations by hand.
Finally, once allocator APIs are figured out, the default Buffer/Image
behavior gets switched from a dedicated allocation to using an
allocator, and dedicated allocation will be only used if the
KHR_dedicated_allocation bit is requested.
Memory type flags are put into a new, separate Memory.h header as those
will be needed more often than the (ever-growing) DeviceProperties --
from Image and Buffer constructors, in particular.