
doc: a high-level intro to the new serialization format.

Branch: meshdata-cereal-killer
Vladimír Vondruš, 6 years ago, commit 59a85ed347
  1. doc/blob.dox (107 changes)
  2. doc/changelog.dox (3 changes)
  3. doc/features.dox (1 change)
  4. doc/snippets/MagnumTrade.cpp (13 changes)
  5. src/Magnum/Trade/Data.h (11 changes)
  6. src/Magnum/Trade/MeshData.h (2 changes)

doc/blob.dox (107 changes)

@@ -0,0 +1,107 @@
/*
This file is part of Magnum.
Copyright © 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019
Vladimír Vondruš <mosra@centrum.cz>
Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
*/
namespace Magnum {
/** @page blob Magnum's memory-mappable serialization format
@brief Efficient and extensible format for storing binary data
@m_since_latest
@tableofcontents
@m_footernavigation
Apart from various data import and conversion plugins, described in the
@ref plugins "previous chapter", Magnum provides its own binary format. Files
stored in this format have a `*.blob` extension and are identified by various
permutations of the letters `BLOB` in their first few bytes.
The goal of the format is to be usable directly, without having to process the
data payload in any way. That allows, for example, the file contents to be
memory-mapped and operated on directly. To achieve this, there are four
different variants of the format based on whether it's running on a 32-bit or
64-bit system and whether the machine is Little- or Big-Endian. The @ref Trade
library itself provides serialization and deserialization of blob formats
matching the platform it's running on. Import and conversion of blobs with
different endianness or bitness (as well as compatibility with previous format
versions as the format evolves) is handled by the
@ref Trade::MagnumImporter "MagnumImporter" and
@ref Trade::MagnumSceneConverter "MagnumSceneConverter" plugins --- since this
functionality is not strictly needed when shipping an application, it's
provided separately.
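As a concrete illustration, the four variants can be told apart from the first
four bytes alone. The following is a self-contained sketch, not Magnum's actual
API (the real check goes through @ref Trade::DataChunkSignature), classifying a
buffer by the signature letters described for @ref Trade::DataChunkSignature
below --- Big-Endian data has the letter order reversed, 32-bit data has the
`L` lowercase:

```cpp
#include <cstring>
#include <string>

/* Illustrative only --- classify a blob by its four-byte signature. Returns
   an empty string if the data is not a recognized blob variant. */
std::string blobVariant(const char* data, std::size_t size) {
    if(size < 4) return {};
    if(std::memcmp(data, "BLOB", 4) == 0) return "64-bit Little-Endian";
    if(std::memcmp(data, "BOLB", 4) == 0) return "64-bit Big-Endian";
    if(std::memcmp(data, "BlOB", 4) == 0) return "32-bit Little-Endian";
    if(std::memcmp(data, "BOlB", 4) == 0) return "32-bit Big-Endian";
    return {};
}
```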
@section blob-implementation Implementation
The binary format consists of "chunks" similar to [RIFF](https://en.wikipedia.org/wiki/Resource_Interchange_File_Format),
and its main property is the ability to combine arbitrary chunks together in
the most trivial way possible, as well as to extract them back. Each chunk has
a @ref Trade::DataChunkHeader containing a [FourCC](https://en.wikipedia.org/wiki/FourCC)-like
@ref Trade::DataChunkType identifier and a chunk size, allowing applications to
pick the chunks they're interested in and reliably skip the others. Compared to
RIFF, the file doesn't have any "global" chunk, in order to make trivial file
concatenation possible:
@code{.sh}
cat chair.blob table.blob > furniture.blob
@endcode
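The skipping principle can be sketched with a deliberately simplified chunk
model --- a 4-byte type followed by a 32-bit size covering the whole chunk. The
actual @ref Trade::DataChunkHeader additionally stores a version and an
endianness/bitness signature, and uses 64-bit sizes on 64-bit platforms, so
this is only a model of the iteration logic, not the real layout:

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

/* List the FourCC types of all chunks in a concatenated buffer. Each chunk is
   assumed to be a 4-byte type plus a 32-bit size that includes the 8-byte
   header itself, so skipping a chunk is just advancing by its size. */
std::vector<std::string> chunkTypes(const std::vector<char>& blob) {
    std::vector<std::string> types;
    std::size_t offset = 0;
    while(offset + 8 <= blob.size()) {
        types.emplace_back(blob.data() + offset, 4);
        std::uint32_t size;
        std::memcpy(&size, blob.data() + offset + 4, 4);
        if(size < 8 || offset + size > blob.size()) break; /* invalid chunk */
        offset += size; /* skip the payload without looking at it */
    }
    return types;
}

/* Build one chunk with a zeroed-out payload of the given size */
std::vector<char> makeChunk(const char* type, std::size_t payloadSize) {
    std::vector<char> chunk(8 + payloadSize, '\0');
    std::memcpy(chunk.data(), type, 4);
    const std::uint32_t size = std::uint32_t(8 + payloadSize);
    std::memcpy(chunk.data() + 4, &size, 4);
    return chunk;
}
```

Because every chunk is self-delimiting, concatenating two files is just
appending their bytes, which is exactly what the `cat` example above relies on.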
@section blob-iteration Chunk iteration
To be designed & written first.
@section blob-meshdata Mesh data
Currently there's just a single serializable data type, @ref Trade::MeshData.
You can create serialized blobs using @ref Trade::MeshData::serialize() or
alternatively using the @ref magnum-sceneconverter "magnum-sceneconverter"
tool, for example:
@code{.sh}
magnum-sceneconverter avocado.glb avocado.blob
@endcode
Deserialization is then done with @ref Trade::MeshData::deserialize(). The
function takes a memory view as input and returns a @ref Trade::MeshData
instance pointing to that view, without copying or processing the data in any
way. The recommended way to access serialized data is thus to memory-map the
file (for example using @ref Utility::Directory::mapRead() or any other way
your platform allows) and keep it around for as long as you need:
@snippet MagnumTrade.cpp blob-deserialize-mesh
@section blob-custom Custom chunk types
As mentioned above, the format is designed to allow custom chunk types to be
mixed together with data recognized by Magnum. To make a custom chunk, create
your own @ref Trade::DataChunkType using
@ref Corrade::Utility::Endianness::fourCC() --- identifiers starting with an
uppercase letter are reserved for Magnum itself, application-specific data
types should use a lowercase first letter instead.
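A FourCC is just four characters packed into a 32-bit integer. The following
standalone equivalent (the chunk names are hypothetical, shown only to
illustrate the uppercase/lowercase convention) packs the first character into
the lowest byte:

```cpp
#include <cstdint>

/* Standalone sketch of a fourCC() helper: four characters packed into a
   32-bit code, first character in the lowest byte */
constexpr std::uint32_t fourCC(char a, char b, char c, char d) {
    return std::uint32_t(std::uint8_t(a)) |
           std::uint32_t(std::uint8_t(b)) << 8 |
           std::uint32_t(std::uint8_t(c)) << 16 |
           std::uint32_t(std::uint8_t(d)) << 24;
}

/* Hypothetical reserved-for-Magnum identifier: uppercase first letter */
constexpr std::uint32_t MeshChunk = fourCC('M', 's', 'h', 'D');
/* Hypothetical application-specific identifier: lowercase first letter */
constexpr std::uint32_t NavMeshChunk = fourCC('n', 'a', 'v', 'M');
```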
Then write a serialization/deserialization API similar to
@ref Trade::MeshData::serialize() / @ref Trade::MeshData::deserialize() with
the help of the low-level @ref Trade::dataChunkHeaderSerializeInto() and
@ref Trade::dataChunkHeaderDeserialize(). Those functions take care of properly
filling in the required chunk header fields when serializing and of checking
chunk validity when deserializing. Validation of the chunk data itself is then
up to you.
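The overall shape of such a pair can be sketched with the same simplified
8-byte header model as before --- these stand-ins for
@ref Trade::dataChunkHeaderSerializeInto() /
@ref Trade::dataChunkHeaderDeserialize() omit the version and signature fields
the real header carries, and only show the fill-on-serialize,
validate-on-deserialize split:

```cpp
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>
#include <vector>

/* Serialize a chunk: fill in the type and total size, then copy the payload */
std::vector<char> serializeChunk(std::uint32_t type, const std::string& payload) {
    std::vector<char> out(8 + payload.size());
    std::memcpy(out.data(), &type, 4);
    const std::uint32_t size = std::uint32_t(out.size());
    std::memcpy(out.data() + 4, &size, 4);
    std::memcpy(out.data() + 8, payload.data(), payload.size());
    return out;
}

/* Deserialize: check the header is complete, the type matches and the stated
   size fits in the buffer, then hand back a view-equivalent of the payload.
   Validating the payload contents is up to the caller. */
std::optional<std::string> deserializeChunk(const std::vector<char>& data,
                                            std::uint32_t expectedType) {
    if(data.size() < 8) return std::nullopt;
    std::uint32_t type, size;
    std::memcpy(&type, data.data(), 4);
    std::memcpy(&size, data.data() + 4, 4);
    if(type != expectedType || size < 8 || size > data.size())
        return std::nullopt;
    return std::string(data.data() + 8, size - 8);
}
```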
*/
}

doc/changelog.dox (3 changes)

@@ -272,6 +272,9 @@ See also:
scene formats
- New @ref Trade::AbstractSceneConverter plugin interface and an
@ref Trade::AnySceneConverter "AnySceneConverter" plugin
- Efficient and extensible memory-mappable serialization format for binary
data. See @ref blob for an introduction, see also
[mosra/magnum#427](https://github.com/mosra/magnum/pull/427).
- Ability to import image mip levels via an additional parameter in
@ref Trade::AbstractImporter::image2D(),
@ref Trade::AbstractImporter::image2DLevelCount() and similar APIs for 1D

doc/features.dox (1 change)

@@ -37,6 +37,7 @@ necessary to read through everything, pick only what you need.
- @subpage transformations --- @copybrief transformations
- @subpage animation --- @copybrief animation
- @subpage plugins --- @copybrief plugins
- @subpage blob --- @copybrief blob
- @subpage opengl-wrapping --- @copybrief opengl-wrapping
- @subpage shaders --- @copybrief shaders
- @subpage scenegraph --- @copybrief scenegraph

doc/snippets/MagnumTrade.cpp (13 changes)

@@ -65,6 +65,19 @@ using namespace Magnum::Math::Literals;
int main() {
{
/* [blob-deserialize-mesh] */
Containers::Array<const char, Utility::Directory::MapDeleter> blob =
Utility::Directory::mapRead("extremely-huge-spaceship.blob");
Containers::Optional<Trade::MeshData> spaceship =
Trade::MeshData::deserialize(blob);
if(!spaceship) Fatal{} << "oh no";
// ...
/* [blob-deserialize-mesh] */
}
{
/* [AbstractImporter-usage] */
PluginManager::Manager<Trade::AbstractImporter> manager;

src/Magnum/Trade/Data.h (11 changes)

@@ -104,6 +104,8 @@ specified effect in the current version of the header. It doesn't need to be
alphanumeric either, but for additional versioning of a particular chunk type
it's recommended to use @ref DataChunkHeader::typeVersion, keeping the chunk
type FourCC clearly recognizable.
@see @ref blob
*/
enum class DataChunkType: UnsignedInt {
/**
@@ -139,7 +141,7 @@ MAGNUM_TRADE_EXPORT Debug& operator<<(Debug& debug, DataChunkType value);
Reads as `BLOB` letters for a Little-Endian 64 bit data chunk. For Big-Endian
the order is reversed (thus `BOLB`), 32-bit data have the `L` letter lowercase.
@see @ref DataChunkHeader::signature
@see @ref blob, @ref DataChunkHeader::signature
*/
enum class DataChunkSignature: UnsignedInt {
/** Little-Endian 32-bit data. The letters `BlOB`. */
@@ -176,6 +178,8 @@ MAGNUM_TRADE_EXPORT Debug& operator<<(Debug& debug, DataChunkSignature value);
@brief Header for memory-mappable data chunks
@m_since_latest
See @ref blob for an introduction.
Since the goal of the serialization format is to be a direct equivalent of the
in-memory data layout, there are four different variants of the header based on
whether it's running on a 32-bit or 64-bit system and whether the machine is
@@ -241,6 +245,7 @@ current platform and @p data is large enough to contain the whole chunk,
@cpp false @ce otherwise. The function doesn't print any diagnostic messages on
validation failure, use @ref dataChunkHeaderDeserialize() instead if you need
to know why.
@see @ref blob
*/
MAGNUM_TRADE_EXPORT bool isDataChunk(Containers::ArrayView<const void> data);
@@ -251,7 +256,7 @@ MAGNUM_TRADE_EXPORT bool isDataChunk(Containers::ArrayView<const void> data);
Checks that @p data is large enough to contain a valid data chunk, validates
the header and then returns @p data reinterpreted as a @ref DataChunkHeader
pointer. On failure prints an error message and returns @cpp nullptr @ce.
@see @ref isDataChunk(), @ref dataChunkHeaderSerializeInto()
@see @ref blob, @ref isDataChunk(), @ref dataChunkHeaderSerializeInto()
*/
MAGNUM_TRADE_EXPORT const DataChunkHeader* dataChunkHeaderDeserialize(Containers::ArrayView<const void> data);
@@ -267,7 +272,7 @@ Expects that @p data is at least the size of @ref DataChunkHeader. Fills in
@ref DataChunkHeader::typeVersion and @ref DataChunkHeader::type with passed
values used in constructor, and @ref DataChunkHeader::size with @p data size.
@see @ref dataChunkHeaderDeserialize()
@see @ref blob, @ref dataChunkHeaderDeserialize()
*/
MAGNUM_TRADE_EXPORT std::size_t dataChunkHeaderSerializeInto(Containers::ArrayView<char> out, DataChunkType type, UnsignedShort typeVersion);

src/Magnum/Trade/MeshData.h (2 changes)

@@ -715,7 +715,7 @@ implementation-specific @ref VertexFormat values.
Using @ref serialize(), an instance of this class can be serialized into
Magnum's memory-mappable serialization format, and deserialized back using
@ref deserialize().
@ref deserialize(). See @ref blob for a high-level introduction.
The deserialization only involves various sanity checks followed by a creation
of a new @ref MeshData instance referencing the index, vertex and attribute
