1 <?xml version="1.0" encoding="UTF-8"?>
3 * Scilab ( http://www.scilab.org/ ) - This file is part of Scilab
4 * Copyright (C) 2008 - INRIA
7 * Copyright (C) 2012 - 2016 - Scilab Enterprises
9 * This file is hereby licensed under the terms of the GNU GPL v2.0,
10 * pursuant to article 5.3.4 of the CeCILL v.2.1.
11 * This file was originally licensed under the terms of the CeCILL v2.1,
12 * and continues to be available under such terms.
13 * For more information, see the COPYING file which you should have received
14 * along with this program.
17 <refentry xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg" xmlns:ns5="http://www.w3.org/1999/xhtml" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:db="http://docbook.org/ns/docbook" xmlns:scilab="http://www.scilab.org" xml:id="save_format" xml:lang="en">
19 <refname>save format</refname>
20 <refpurpose>format of files produced by "save"</refpurpose>
23 <title>Abstract</title>
24 <para>The goal of this document is to specify the HDF5 format used by Scilab to store its data.</para>
25 <para>The format is called SOD for Scilab Open Data.</para>
<para>SOD was first publicly released with Scilab 5.4.0.</para>
29 <title>Rationale</title>
<para>Interoperability is one of the key aspects of modern software. To improve it further, this SEP proposes a standard definition of the HDF5 format.</para>
<para>Since Scilab 5.2.0, an export / import capability has been developed and maintained to exchange data. It is already one of the base components that Xcos uses to store and exchange data.</para>
34 <title>Supported data types</title>
35 <para>All Scilab data types are supported. For example:</para>
36 <informaltable border="1">
40 <emphasis role="bold">Name</emphasis>
43 <emphasis role="bold">Example in Scilab</emphasis>
48 <emphasis role="bold">double</emphasis>
59 <emphasis role="bold">string</emphasis>
63 b=["string 1";"my string 2"];
68 <emphasis role="bold">boolean</emphasis>
77 <emphasis role="bold">integer</emphasis>
81 int8([1 -120 127 312])
88 <emphasis role="bold">polynomial</emphasis>
99 <emphasis role="bold">sparse</emphasis>
103 sp=sparse([1,2;4,5;3,10],[1,2,3])
109 <emphasis role="bold">boolean sparse</emphasis>
113 dense=[%F, %F, %T, %F, %F
123 <emphasis role="bold">list</emphasis>
127 l = list(1,["a" "b"])
133 <emphasis role="bold">tlist</emphasis>
137 t = tlist(["listtype","field1","field2"], [], []);
143 <emphasis role="bold">mlist</emphasis>
147 M=mlist(['V','name','value'],['a','b';'c' 'd'],[1 2; 3 4]);
154 Several "types" are based on <emphasis>tlist</emphasis> or <emphasis>mlist</emphasis>.
155 It is the case of <emphasis>rational</emphasis>, <emphasis>state-space</emphasis>,
156 <emphasis>cell</emphasis> and <emphasis>struct</emphasis>. They are therefore transparently saved.
159 <emphasis>void </emphasis>and <emphasis>undefined </emphasis>are two specific elements created to manage special cases in the list management. They are described later in this document.
163 <title>HDF5 File Structure</title>
<para>The Scilab HDF5 architecture is straightforward.</para>
165 <emphasis role="bold">General</emphasis>
166 <para>For each Scilab variable, a dataset at the root position is declared. The name of the dataset is the name of Scilab variable. </para>
<para>For example, the following code:</para>
169 emptyuint32matrix = uint32([]);
170 uint32scalar = uint32(1);
171 uint32rowvector = uint32([1 4 7]);
172 uint32colvector = uint32([1;4;7]);
173 uint32matrix = uint32([1 4 7;9 6 3]);
174 save("uint32.sod","emptyuint32matrix","uint32scalar","uint32rowvector","uint32colvector","uint32matrix");
176 <para>produces:</para>
179 <imagedata fileref="../images/img001.png"/>
Each root dataset has an attribute called <literal>SCILAB_Class</literal>. This attribute defines the type of the variable stored in the HDF5 file.
<para>If the variable is of a primitive type and has no associated complex values, its data are stored directly in the dataset. Otherwise, the dataset contains references to the actual data.</para>
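<para>The one-dataset-per-variable layout can be checked from Scilab itself. The following is a minimal sketch, assuming a writable <literal>TMPDIR</literal>; the file name <literal>uint32_demo.sod</literal> is an arbitrary choice made for this illustration.</para>
<programlisting role="no-scilab-exec"><![CDATA[
// Sketch: save two variables, then list what the SOD file contains.
// The file name below is hypothetical, chosen only for this example.
uint32scalar = uint32(1);
uint32matrix = uint32([1 4 7;9 6 3]);
filename = fullfile(TMPDIR, "uint32_demo.sod");
save(filename, "uint32scalar", "uint32matrix");

// listvarinfile reports the names, type codes and sizes of the saved variables
[names, typs, dims, vols] = listvarinfile(filename);
disp(names);

// Round trip: remove the variables from memory, then reload them by name
clear uint32scalar uint32matrix
load(filename);
disp(uint32matrix);
]]></programlisting>
<para>Each name returned by <literal>listvarinfile</literal> corresponds to one saved Scilab variable, which is the name the root dataset carries in the file.</para>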
186 <para>Every SOD file contains two specific variables:</para>
<literal>SCILAB_scilab_version</literal> – Describes which version of Scilab has been used to save the SOD file.
192 <para>For example, with Scilab 5.4.0, the data will be:</para>
194 <emphasis>SCILAB_scilab_version = scilab-5.4.0</emphasis>
<para>SCILAB_sod_version – Describes which version of the SOD specification has been used to save the file.</para>
199 <para>For example, with Scilab 5.4.0, the data will be:</para>
201 <emphasis>SCILAB_sod_version = 2</emphasis>
206 Types where data are stored straight into the dataset.
208 <informaltable border="1">
212 <emphasis role="bold">Scilab Type</emphasis>
215 <emphasis role="bold">HDF5 Scilab type attribute</emphasis>
218 <emphasis role="bold">HDF5 attributes</emphasis>
221 <emphasis role="bold">HDF data types mapping</emphasis>
229 <para>SCILAB_Class = string</para>
242 <td namest="c2" nameend="c3" align="left">
243 <para>SCILAB_Class = boolean</para>
249 <para>32-bit integer</para>
256 <td namest="c2" nameend="c3" align="left">
257 <para>SCILAB_Class = integer</para>
260 <para>SCILAB_precision = {8, 16, 32, u8, u16, u32}</para>
262 <td namest="c1" nameend="c2" align="left">
263 <para>8 = 8-bit character</para>
264 <para>16 = 16-bit integer</para>
265 <para>32 = 32-bit integer</para>
266 <para>u8 = 8-bit unsigned character</para>
267 <para>u16 = 16-bit unsigned integer</para>
268 <para>u32 = 32-bit unsigned integer</para>
<para>For these types, as in Scilab, the data are stored in a one-dimensional array, in column-major order.</para>
<para>To reconstruct the matrix, vector or scalar, two attributes provide the number of columns and rows.</para>
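<para>The column-wise flattening described above can be sketched in memory, without writing any file. This only illustrates the ordering of the values and how two dimensions suffice to rebuild the matrix; it is not the code used by <literal>save</literal>.</para>
<programlisting role="no-scilab-exec"><![CDATA[
// Sketch of the column-wise flattening, done in memory only.
A = int32([1 -4 7; -9 6 -3]);
[nbRows, nbCols] = size(A);

// A(:) walks the matrix column by column
flat = A(:)';
disp(flat);

// The number of rows and columns is enough to rebuild the original matrix
rebuilt = matrix(flat, nbRows, nbCols);
disp(and(rebuilt == A));
]]></programlisting>
<para>Here <literal>flat</literal> holds the values in the order 1, -9, -4, 6, 7, -3, that is, the first column followed by the second and the third.</para>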
276 Since the 5.4.0 release of Scilab and SOD v2, <literal>SCILAB_cols</literal> and <literal>SCILAB_rows</literal> are no longer used for matrices of double, integer, polynomial and string. SOD uses the native multidimensional HDF5 feature.
279 <emphasis role="bold">Example</emphasis>
282 The saving of the declaration: <code>int32([1 -4 7;-9 6 -3])</code> will be displayed as:
286 <imagedata fileref="../images/img002.png"/>
289 <para>in hdfview.</para>
290 <para>And the metadata will be:</para>
291 <emphasis role="italic">
292 <para>int32matrix (800, 2)</para>
293 <para>32-bit integer, 3 x 2 => the size of the variable</para>
294 <para>Number of attributes = 2</para>
295 <para>SCILAB_Class = integer</para>
296 <para>SCILAB_precision = 32</para>
Scalar values are stored as a matrix of size 1 by 1.
304 An empty variable (<literal>[]</literal>) will have the attribute <literal>SCILAB_empty</literal> set to true.
307 <emphasis role="bold">Types where data are stored in a dedicated group</emphasis>
<para>Many Scilab data types are stored using groups. This allows a clear separation of the values as well as easy access.</para>
311 Groups are named from the variables enclosed by "#". For example, for a matrix of double called matrixofdouble, the name of the root dataset will be matrixofdouble, the name of the associated group will be <emphasis role="strong">#matrixofdouble#</emphasis>.
<para>For recursive data types (list, mlist, tlist, etc.), the names of subgroups are constructed in the following way:</para>
The <literal>#</literal> allows the creation of a unique identifier. The number of leading <literal>#</literal> characters shows the level of depth. Therefore, the sublist <emphasis>###listnested#_#2##_#1##</emphasis> is located at the second level.
<para>The underscore "_" is a way to represent the depth. Usually, the "/" character is used in such cases, but it is reserved in the HDF5 specification.</para>
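<para>As an illustration, the name <emphasis>###listnested#_#2##_#1##</emphasis> can be rebuilt with a small helper. The rule used here is only inferred from that single example and is an assumption, not a statement of how Scilab builds the names internally; <literal>sod_group_name</literal> is a hypothetical function name.</para>
<programlisting role="no-scilab-exec"><![CDATA[
// Hypothetical helper, inferred from the single example in this document:
// each nesting level wraps the parent name as "#" + parent + "_#" + index + "##".
// "path" holds the 0-based position at each level, from the outermost inwards.
function name = sod_group_name(varname, path)
    name = "#" + varname + "#";
    for k = 1:size(path, "*")
        name = "#" + name + "_#" + string(path(k)) + "##";
    end
endfunction

disp(sod_group_name("listnested", []));
disp(sod_group_name("listnested", [2 1]));
]]></programlisting>
<para>With an empty path the helper returns <emphasis>#listnested#</emphasis>, and with the path [2 1] it returns <emphasis>###listnested#_#2##_#1##</emphasis>.</para>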
The integers used in the naming show the position in the data structure, both within the current structure and relative to the parent element. In the example <emphasis>###listnested#_#2##_#1##</emphasis>, the 1 shows that it refers to the second element of the third structure of the main element (elements are indexed from 0).
322 For example, the group named <emphasis>###listnested#_#2##_#1##</emphasis>, will point to the value [32, 42] from the example:
325 listnested=list(2,%i,'f',ones(3,3))
326 listnested(3) = list( %t, [32,42]);
329 <emphasis role="bold">Sparse</emphasis>
332 <emphasis role="strong">Scilab type:</emphasis> sparse
335 <emphasis role="strong">HDF5 Scilab type attribute:</emphasis> SCILAB_Class = sparse
338 <emphasis role="strong">HDF5 attributes: </emphasis>
<para>SCILAB_rows = &lt;int&gt;</para>
<para>Number of rows</para>
<emphasis>SCILAB_cols = &lt;int&gt;</emphasis>
<para>Number of columns</para>
<emphasis>SCILAB_items = &lt;int&gt;</emphasis>
<para>Number of non-zero elements in the sparse matrix</para>
351 <emphasis role="strong">Root dataset values:</emphasis>
First value (<literal>#0#</literal>): Each element of this dataset gives the number of non-zero elements per row. Therefore, the first element gives the number of non-zero elements in the first row of the sparse matrix.
Second value (<literal>#1#</literal>): Provides the column index of each element of the sparse matrix.
Third value (<literal>#2#</literal>): Stores the reference to the actual values of the elements of the sparse matrix (which are stored in a specific group).
<para>For example, take this matrix:</para>
363 <programlisting role="no-scilab-exec">
364 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.
365 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
366 0. 0. 0. 0. 0. 0. 0. 0. 0. 3.
367 0. 0. 0. 0. 2. 0. 0. 0. 0. 0.
369 <para>which is generated by the function:</para>
370 <programlisting role="scilab_code">
371 sparse([1,2;4,5;3,10],[1,2,3])
377 <emphasis>#0#</emphasis> contains <emphasis>1;0;1;1</emphasis>
380 <emphasis>#1#</emphasis> contains <emphasis>2;10;5</emphasis>
383 <emphasis>#2#</emphasis> references a matrix of double (not complex in this example) which contains <emphasis>1.0; 3.0; 2.0</emphasis>
386 <emphasis role="bold">Boolean sparse</emphasis>
389 <emphasis role="strong">Scilab type:</emphasis> boolean sparse
392 <emphasis role="strong">HDF5 Scilab type attribute:</emphasis> SCILAB_Class = boolean sparse
395 <emphasis>HDF5 attributes:</emphasis>
<emphasis>SCILAB_rows = &lt;int&gt;</emphasis>
<para>Number of rows</para>
<emphasis>SCILAB_cols = &lt;int&gt;</emphasis>
<para>Number of columns</para>
<emphasis>SCILAB_items = &lt;int&gt;</emphasis>
<para>Number of defined (true) elements in the sparse matrix</para>
410 <emphasis role="strong">Root dataset values:</emphasis> While a sparse has 3 datasets, the boolean sparse has only 2 because defined values are automatically considered as true.
First value (<literal>#0#</literal>): Each element of this dataset gives the number of defined elements per row.
<para>Therefore, the first element gives the number of defined elements in the first row of the sparse matrix.</para>
Second value (<literal>#1#</literal>): Provides the column index of each element of the sparse matrix.
419 <para>With the boolean sparse matrix:</para>
421 dense=[%F, %F, %T, %F, %F
427 <emphasis>#0#</emphasis> contains <emphasis>1;1;0;1</emphasis>.
430 <emphasis>#1#</emphasis> contains <emphasis>3;1;5</emphasis>.
<para>Only these two pieces of information are necessary to recreate the boolean sparse matrix.</para>
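<para>The row-count and column-index layout described for sparse and boolean sparse matrices can be reproduced in memory. The sketch below uses the numeric sparse example from this document and only illustrates the layout; it is not the code used by <literal>save</literal>. The entries are sorted explicitly so that the result does not rely on the order in which <literal>spget</literal> returns them.</para>
<programlisting role="no-scilab-exec"><![CDATA[
// Sketch: derive the three arrays (#0#, #1#, #2#) from a sparse matrix.
sp = sparse([1,2;4,5;3,10], [1,2,3]);

// ij: (row, column) of each non-zero entry, v: the values, mn: matrix size
[ij, v, mn] = spget(sp);

// Sort the entries by row, then by column
[ij, order] = gsort(ij, "lr", "i");
v = v(order);

// Count the non-zero entries of each row
rowCounts = zeros(mn(1), 1);
for k = 1:size(ij, 1)
    r = ij(k, 1);
    rowCounts(r) = rowCounts(r) + 1;
end

disp(rowCounts');   // content described for #0#
disp(ij(:, 2)');    // content described for #1#
disp(v');           // values referenced by #2#
]]></programlisting>
<para>For this matrix the sketch gives 1 0 1 1 for the row counts, 2 10 5 for the column indices and 1 3 2 for the values, which matches the sparse example. A boolean sparse matrix needs only the first two arrays, since every defined element is true.</para>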
433 <para>HDF data types mapping:</para>
434 <para>32-bit integer</para>
436 <emphasis role="bold">Double</emphasis>
439 <emphasis role="strong">Scilab type:</emphasis> double
442 <emphasis role="strong">HDF5 Scilab type attribute:</emphasis> SCILAB_Class = double
<emphasis role="strong">Root dataset values:</emphasis> Both real and complex values are stored in a group called <literal>#&lt;variable name&gt;#</literal>.
448 First value: Reference to the real values. Named <literal>#0#</literal>.
451 If the matrix is complex, the second value will reference the complex values. Named <literal>#1#</literal>.
454 <emphasis role="strong">HDF data types mapping:</emphasis> 64-bit floating-point
457 <emphasis role="bold">Polynomial</emphasis>
460 <emphasis role="strong">Scilab type:</emphasis> polynomial
463 <emphasis role="strong">HDF5 Scilab type attribute:</emphasis> SCILAB_Class = polynomial
466 <emphasis role="strong">HDF5 attributes: </emphasis>
468 <para>SCILAB_Class = polynomial</para>
<para>SCILAB_varname = &lt;string&gt;</para>
<para>The symbolic variable name</para>
<emphasis>SCILAB_Complex = &lt;boolean&gt;</emphasis>
<para>Set if the polynomial is complex (not set if false)</para>
476 <emphasis role="strong">Root dataset values:</emphasis>
<para>Coefficients are stored as a matrix of double (see the section on double storage). Note that coefficients can be complex and, in that case, are stored as a complex matrix. The naming rules for (sub-)groups and datasets are described at the beginning of this chapter.</para>
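<para>The two pieces of information that describe a polynomial here, its symbolic variable name and its coefficients, can be inspected in Scilab as sketched below. This shows the in-memory values only and makes no claim about how they are laid out in the file.</para>
<programlisting role="no-scilab-exec"><![CDATA[
// Sketch: the variable name and coefficients of a polynomial.
p = poly([1 2 3], "x", "coeff");   // 1 + 2x + 3x^2
disp(varn(p));     // symbolic variable name
disp(coeff(p));    // coefficients as a matrix of double

// Coefficients may be complex
q = poly([1 %i], "x", "coeff");    // 1 + i*x
disp(imag(coeff(q)));
]]></programlisting>
<para>The name returned by <literal>varn</literal> is what the text above calls the symbolic variable name.</para>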
480 <emphasis role="strong">HDF data types mapping:</emphasis> Object reference
483 <emphasis role="bold">list</emphasis>
486 <emphasis role="strong">Scilab type:</emphasis> list
489 <emphasis role="strong">HDF5 Scilab type attribute:</emphasis>
491 <para>SCILAB_Class = list</para>
<emphasis role="strong">HDF5 attributes:</emphasis> SCILAB_items = &lt;number of items in the list&gt;
496 <emphasis role="strong">Root dataset values:</emphasis>
The values stored in the root dataset are references to the values stored in the list. The values themselves are stored in the group called <literal>#&lt;variable name&gt;#</literal>. In this group, data can be of any type and are included directly in the group. Their representation is the same as in other cases and is recursive (meaning that lists of lists of lists of various types can be stored and loaded).
<para>The naming rules for (sub-)groups and datasets are described at the beginning of this chapter.</para>
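<para>The recursive storage of lists can be exercised with a simple round trip. This is a minimal sketch that reuses the <literal>listnested</literal> example from this document and assumes a writable <literal>TMPDIR</literal>; the file name is an arbitrary choice made for this illustration.</para>
<programlisting role="no-scilab-exec"><![CDATA[
// Sketch: save a nested list, clear it, and load it back.
listnested = list(2, %i, 'f', ones(3,3));
listnested(3) = list(%t, [32,42]);

// Hypothetical file name, chosen only for this example
filename = fullfile(TMPDIR, "listnested_demo.sod");
save(filename, "listnested");
clear listnested
load(filename);

// Scilab indexes from 1: element 3, then element 2, is [32 42]
disp(listnested(3)(2));
]]></programlisting>
<para>Scilab indexes list elements from 1, while the group names index from 0, so <literal>listnested(3)(2)</literal> is the value that the document associates with the positions 2 and 1.</para>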
503 <emphasis role="strong">HDF data types mapping:</emphasis> Object reference
506 <emphasis role="bold">tlist </emphasis>
509 <emphasis role="strong">Scilab type:</emphasis> tlist
512 <emphasis role="strong">HDF5 Scilab type attribute:</emphasis>
514 <para> SCILAB_Class = tlist</para>
516 <emphasis role="strong">HDF5 attributes:</emphasis> cf list
519 <emphasis role="bold">mlist </emphasis>
522 <emphasis role="strong">Scilab type:</emphasis> mlist
525 <emphasis role="strong">HDF5 Scilab type attribute:</emphasis>
527 <para>SCILAB_Class = mlist</para>
529 <emphasis role="strong">HDF5 attributes:</emphasis> cf list
532 <emphasis role="bold">void </emphasis>
535 <emphasis role="strong">Scilab type:</emphasis> void
538 <emphasis role="strong">HDF5 Scilab type attribute:</emphasis>
540 <para>SCILAB_Class = void</para>
541 <para>A void value can only be found in very special usages of list, tlist and mlist. It can be created with the following syntax:</para>
542 <programlisting>voidelement_ref=list(1,,3);</programlisting>
544 <emphasis role="bold">undefined </emphasis>
547 <emphasis role="strong">Scilab type:</emphasis> undefined
550 <emphasis role="strong">HDF5 Scilab type attribute:</emphasis>
552 <para> SCILAB_Class = undefined</para>
An undefined value is generated when the size of a list is increased and some elements are left undefined. It can be generated with the syntax:
557 undefinedelement_ref=list(2,%i,'f',ones(3,3));
558 undefinedelement_ref(6)="toto"
562 <title>Real life examples</title>
Sample files of all these variables are provided in the Scilab distribution. They are available in the directory: <emphasis>SCI/modules/hdf5/tests/sample_scilab_data/</emphasis>
<para>At the time of writing, the following files are provided with the Scilab distribution:</para>
567 <emphasis role="italic">
571 <para>booleanscalar.sod
573 <para>booleansparse.sod
575 <para>emptymatrix.sod
577 <para>emptysparse.sod
579 <para>hypermatrixcomplex.sod
581 <para>hypermatrix.sod
593 <para>matricedoublecomplexscalar.sod
595 <para>matricedoublecomplex.sod
597 <para>matricedoublescalar.sod
599 <para>matricedouble.sod
601 <para>matricestringscalar.sod
603 <para>matricestring.sod
607 <para>polynomialscoef.sod
609 <para>polynomials.sod
611 <para>sparsematrix.sod
621 <para>undefinedelement.sod
623 <para>voidelement.sod
628 <title>Format evolutions</title>
629 <informaltable border="1">
633 <emphasis role="bold">SOD version</emphasis>
636 <emphasis role="bold">Scilab version</emphasis>
639 <emphasis role="bold">Description</emphasis>
650 <para>Initial version of the Scilab/HDF5 format</para>
658 <para>5.4.0 alpha / beta</para>
661 <para>Default format for load and save</para>
662 <para>Previous format (.bin) still supported</para>
674 For matrices of double, integer, polynomial and string <emphasis>SCILAB_cols</emphasis> / <emphasis>SCILAB_rows</emphasis> have been removed to use multidimensional HDF5
686 <para>.bin support dropped.</para>
692 <refsection role="see also">
693 <title>See also</title>
694 <simplelist type="inline">
696 <link linkend="save">save</link>
699 <link linkend="load">load</link>
702 <link linkend="listvarinfile">listvarinfile</link>
705 <link linkend="type">type</link>
708 <link linkend="typeof">typeof</link>