* Bug #7858 fixed - Statistics: variance and variancef
<?xml version="1.0" encoding="UTF-8"?>
<!--
 * Scilab ( http://www.scilab.org/ ) - This file is part of Scilab
 * Copyright (C) 2013 - Samuel GOUGEON
 * Copyright (C) 2000 - INRIA - Carlos Klimann
 *
 * This file must be used under the terms of the CeCILL.
 * This source file is licensed as described in the file COPYING, which
 * you should have received as part of this distribution.  The terms
 * are also available at
 * http://www.cecill.info/licences/Licence_CeCILL_V2.1-en.txt
 *
 -->
<refentry xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:db="http://docbook.org/ns/docbook" xmlns:scilab="http://www.scilab.org" xml:lang="en" xml:id="variance">
    <refnamediv>
        <refname>variance</refname>
        <refpurpose>variance (and mean) of a vector or matrix (or hypermatrix) of real or complex numbers</refpurpose>
    </refnamediv>
    <refsynopsisdiv>
        <title>Calling Sequence</title>
        <synopsis>
            [s [,mc]] = variance(x [,orien [,m]])
            
            [s, mc] = variance(x)
            [s, mc] = variance(x, "r"|1 )
            [s, mc] = variance(x, "c"|2 )
            [s, mc] = variance(x, "*"  , %nan)
            [s, mc] = variance(x, "r"|1, %nan)
            [s, mc] = variance(x, "c"|2, %nan)
            s = variance(x, "*", m)
            s = variance(x, "r"|1, m)
            s = variance(x, "c"|2, m)
        </synopsis>
    </refsynopsisdiv>
    <refsection>
        <title>Arguments</title>
        <variablelist>
            <varlistentry>
                <term>x</term>
                <listitem>
                    <para>
                        real or complex vector or matrix. A hypermatrix is accepted only for non-directional computations: <literal>variance(x)</literal> or <literal>variance(x,"*",m)</literal>.
                    </para>
                </listitem>
            </varlistentry>
            <varlistentry>
                <term>orien</term>
                <listitem>
                    <para>the orientation of the computation. Valid values are
                        <itemizedlist>
                            <listitem>1 or "r": result is a row, after a column-wise computation.</listitem>
                            <listitem>2 or "c": result is a column, after a row-wise computation.</listitem>
                            <listitem>
                                "*": full non-directional computation (default); explicitly required when <varname>m</varname> is used.
                            </listitem>
                        </itemizedlist>
                    </para>
                </listitem>
            </varlistentry>
            <varlistentry>
                <term>m</term>
                <listitem>
                    <para>
                        The true mean of the underlying statistical distribution, when it is known.
                        <itemizedlist>
                            <listitem>
                                "*" mode (default): <varname>m</varname> must be scalar
                            </listitem>
                            <listitem>
                                "r" or 1 mode: <varname>m</varname> is a row of length <literal>size(x,2)</literal>. The variance along the column #j is computed using <literal>m(j)</literal> as the mean for the considered column. If <literal>m(j)</literal> is the same for all columns, it can be provided as a scalar <varname>m</varname>.
                            </listitem>
                            <listitem>
                                "c" or 2 mode: <varname>m</varname> is a column of length <literal>size(x,1)</literal>. The variance along the row #i is computed using <literal>m(i)</literal> as the mean for the considered row. If <literal>m(i)</literal> is the same for all rows, it can be provided as a scalar <varname>m</varname>.
                            </listitem>
                        </itemizedlist>
                    </para>
                    <para>
                        When <varname>m</varname> is not provided, the <literal>variance</literal> is computed by dividing the sum of the squared deviations of the <literal>n</literal> values from <literal>mean(x)</literal> (or <literal>mean(x,"c")</literal> or <literal>mean(x,"r")</literal>) by <literal>n-1</literal> (<literal>n</literal> being <literal>length(x)</literal> or <literal>size(x,1)</literal> or <literal>size(x,2)</literal>). If the elements of <varname>x</varname> are mutually independent, the result is then statistically unbiased.
                    </para>
                    <para>
                        Otherwise, the <literal>variance</literal> is computed by dividing the sum of the squared deviations of the values from <varname>m</varname> by the number <literal>n</literal> of considered values (<literal>n</literal> being <literal>length(x)</literal> or <literal>size(x,1)</literal> or <literal>size(x,2)</literal>).
                    </para>
                    <para>
                        If <varname>m</varname> is the true mean, known independently of the elements of <varname>x</varname>, the result is then unbiased.
                    </para>
                    <para>
                        When the special value <literal>m = %nan</literal> is provided, the variance is still normalized by <literal>n</literal> (not <literal>n-1</literal>), but it is computed using
                        <literal>m = mean(x)</literal> (or <literal>m = mean(x,"c")</literal> or <literal>m = mean(x,"r")</literal>). Such an <varname>m</varname> brings no independent information, and the result is statistically biased.
                    </para>
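                    <para>
                        As a minimal sketch of the three normalizations described above, the following lines compare each calling sequence with a hand-made computation. The sample <literal>x</literal> and the "known" mean <literal>m = 4</literal> are arbitrary values chosen only for illustration.
                    </para>
                    <programlisting role="example"><![CDATA[
x = [2 4 4 4 5 5 7 9];               // illustrative sample: n = 8, mean(x) = 5
n = length(x);
d2 = sum((x - mean(x)).^2);          // sum of squared deviations from mean(x): 32

s_default = variance(x);             // no m given: d2/(n-1) = 32/7
s_nan     = variance(x, "*", %nan);  // m = %nan:   d2/n     = 4
m = 4;                               // hypothetical known mean (assumption)
s_known   = variance(x, "*", m);     // sum((x-m).^2)/n = 40/8 = 5

// Each row shows the value returned by variance() next to the manual formula
disp([s_default, d2/(n-1)])
disp([s_nan,     d2/n])
disp([s_known,   sum((x - m).^2)/n])
 ]]></programlisting>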
                </listitem>
            </varlistentry>
            <varlistentry>
                <term>s</term>
                <listitem>
                    The unweighted variance of the elements of <varname>x</varname>. It is a scalar, a column vector, or a row vector, according to <varname>orien</varname>.
                </listitem>
            </varlistentry>
            <varlistentry>
                <term>mc</term>
                <listitem>
                    Unweighted mean of the elements of <varname>x</varname> (<literal>= mean(x,..)</literal>): a scalar, or a vector of means computed along <varname>orien</varname>. It is the mean computed from <varname>x</varname> and used as the reference in the variance.
                </listitem>
            </varlistentry>
        </variablelist>
    </refsection>
    <refsection>
        <title>Description</title>
        <para>
            This function computes the variance of the real or complex numbers stored in a vector or matrix <varname>x</varname>. If <varname>x</varname> is complex, <literal>variance(x,..) = variance(real(x),..) + variance(imag(x),..)</literal> is returned.
        </para>
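        <para>
            The relation stated above for complex input can be checked numerically. This is only a sketch on an arbitrary three-element sample; both values are expected to agree up to rounding errors.
        </para>
        <programlisting role="example"><![CDATA[
x = [1+2*%i, 3-%i, -2+4*%i];                  // illustrative complex sample
s  = variance(x);                             // variance of the complex values
s2 = variance(real(x)) + variance(imag(x));   // sum of the two real variances
disp([s, s2])                                 // both entries should match
 ]]></programlisting>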
        <para>
            For a vector, a matrix, or a hypermatrix <varname>x</varname>, <code>s = variance(x)</code>
            returns in the scalar <varname>s</varname> the variance of all the entries of <varname>x</varname>.
        </para>
        <para>
            <code>s = variance(x,"c")</code> (or, equivalently, <code>s = variance(x,2)</code>)
            returns the variance of each row of <varname>x</varname>: <varname>s</varname> is a column vector, with <code>s(i) = variance(x(i,:))</code>.
        </para>
        <para>
            <code>s = variance(x,"r")</code> (or, equivalently, <code>s = variance(x,1)</code>)
            returns the variance of each column of <varname>x</varname>: <varname>s</varname> is a row vector, with <code>s(j) = variance(x(:,j))</code>.
        </para>
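        <para>
            To make the shape of the result concrete, here is a small sketch on an arbitrary 2x3 matrix. The directional results are compared with variances computed one column, or one row, at a time.
        </para>
        <programlisting role="example"><![CDATA[
x = [1 2 3; 4 6 8];       // illustrative 2x3 matrix

sr = variance(x, "r")     // 1x3 row, one variance per column of x: [4.5 8 12.5]
sc = variance(x, "c")     // 2x1 column, one variance per row of x: [1; 4]

// Same values, computed explicitly column by column, then row by row
sr_check = [variance(x(:,1)), variance(x(:,2)), variance(x(:,3))]
sc_check = [variance(x(1,:)); variance(x(2,:))]
 ]]></programlisting>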
        <para>
            The second output argument <varname>mc</varname> is the mean of the input, with respect to the <varname>orien</varname> parameter.
        </para>
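        <para>
            As a short sketch of the second output, the lines below request the mean together with the variance for each orientation. The matrix is an arbitrary illustration; the returned mean is expected to match the corresponding call to <literal>mean</literal>.
        </para>
        <programlisting role="example"><![CDATA[
x = [1 2 3; 4 6 8];              // illustrative 2x3 matrix

[s, mc_r] = variance(x, "r");
disp(mc_r)                       // expected to match mean(x, "r"): [2.5 4 5.5]

[s, mc_c] = variance(x, "c");
disp(mc_c)                       // expected to match mean(x, "c"): [2; 6]

[s, mc_all] = variance(x);
disp(mc_all)                     // expected to match mean(x): 4
 ]]></programlisting>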
        <para>
            <warning>
                The <literal>variance(x, "*"|"c"|"r", 1)</literal> syntax, used only in Scilab 5.4.1, must be replaced with
                <literal>variance(x, "*"|"c"|"r", %nan)</literal>. <literal>variance(x, "*"|"c"|"r", 1)</literal> will warn
                the user until Scilab 6.0, because <literal>1</literal> is now interpreted as <literal>m=1</literal>.
                If <literal>1</literal> is the true value provided as <varname>m</varname>, the warning may be avoided by entering <literal>1+%eps</literal> instead
                of <literal>1</literal>.
            </warning>
        </para>
    </refsection>
    <refsection>
        <title>Examples</title>
        <programlisting role="example"><![CDATA[
x = [ 0.2113249 0.0002211 0.6653811; 0.7560439 0.4453586 0.6283918 ]
s = variance(x)
s = variance(x, "r")
s = variance(x, "c")

// The underlying statistical distribution and its mean are known:
x = grand(100, 5, "unf", 0, 7);      // Uniform distribution on [0, 7]
// => the true asymptotic mean is (0+7)/2 = 3.5 and variance = (7-0)^2/12
(7-0)^2/12                  // True asymptotic variance
s = variance(x)             // Unbiased (division by n-1).
s = variance(x, "*", 3.5)   // Unbiased (division by n). Always >= variance(x, "*", %nan)
s = variance(x, "*", %nan)  // Biased   (division by n). Always <= variance(x)
// Across columns:
s = variance(x, "c")
s = variance(x, "c", 3.5)
s = variance(x, "c", %nan)

// With complex numbers uniformly distributed on [0,1] + [0,1].i:
x = rand(4, 3) + rand(4, 3)*%i
s = variance(x)
s = variance(x, "*", 0.5 + 0.5*%i)
s = variance(x, "*", %nan)
s = variance(x, "r")
s = variance(x, "c")

// With a hypermatrix
x = rand(3, 2, 2)    // Uniform distribution on [0, 1]
s = variance(x)
s = variance(x, "*", 0.5)
s = variance(x, "*", %nan)
// s = variance(x, "r")  // Is not supported for a hypermatrix
// s = variance(x, "c")  // Is not supported for a hypermatrix
 ]]></programlisting>
    </refsection>
    <refsection role="see also">
        <title>See Also</title>
        <simplelist type="inline">
            <member>
                <link linkend="variancef">variancef</link>
            </member>
            <member>
                <link linkend="mtlb_var">mtlb_var</link>
            </member>
            <member>
                <link linkend="stdev">stdev</link>
            </member>
        </simplelist>
    </refsection>
    <refsection>
        <title>Bibliography</title>
        <para>
            Wonacott, T.H. &amp; Wonacott, R.J.; Introductory Statistics, fifth edition, J.Wiley &amp; Sons, 1990.
        </para>
    </refsection>
    <refsection>
        <title>History</title>
        <revhistory>
            <revision>
                <revnumber>5.5.0</revnumber>
                <revdescription>
                    <itemizedlist>
                        <listitem>
                            <para>variance(x, orien, 0|1) removed (it had been introduced in Scilab 5.4.1)</para>
                        </listitem>
                        <listitem>
                            <para>variance(x, orien, m) introduced: the true mean m of the underlying statistical law can be used.</para>
                        </listitem>
                        <listitem>
                            <para>variance(x, orien, %nan) introduced: mean(x,..) is used as the reference, and the sum of squared deviations is divided by n (instead of n-1)</para>
                        </listitem>
                        <listitem>
                            <para>[s, mc] = variance(x,..) introduced: the mean mc computed from x is now also returned</para>
                        </listitem>
                    </itemizedlist>
                </revdescription>
            </revision>
            <revision>
                <revnumber>5.4.1</revnumber>
                <revdescription>
                    <itemizedlist>
                        <listitem>
                            <para>variance(complexes) fixed. variance(x,"*",1) introduced. Vectorization of the code for directional usages variance(x,"r"|"c"). Full revision of the help page</para>
                        </listitem>
                    </itemizedlist>
                </revdescription>
            </revision>
        </revhistory>
    </refsection>
</refentry>