<?xml version="1.0" encoding="UTF-8"?>
<!--
 * Scilab ( http://www.scilab.org/ ) - This file is part of Scilab
 * Copyright (C) 2013 - Samuel GOUGEON
 * Copyright (C) 2000 - INRIA - Carlos Klimann
 *
 * This file must be used under the terms of the CeCILL.
 * This source file is licensed as described in the file COPYING, which
 * you should have received as part of this distribution.  The terms
 * are also available at
 * http://www.cecill.info/licences/Licence_CeCILL_V2.1-en.txt
 *
 -->
<refentry xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:db="http://docbook.org/ns/docbook" xmlns:scilab="http://www.scilab.org" xml:lang="en" xml:id="variancef">
    <refnamediv>
        <refname>variancef</refname>
        <refpurpose>variance (and mean) of a vector or matrix of frequency-weighted real or complex numbers</refpurpose>
    </refnamediv>
    <refsynopsisdiv>
        <title>Calling Sequence</title>
        <synopsis>
            [s [,mc]] = variancef(x, fre [,orien [,m]])

            [s, mc] = variancef(x, fre)
            [s, mc] = variancef(x, fre, "r"|1 )
            [s, mc] = variancef(x, fre, "c"|2 )
            [s, mc] = variancef(x, fre, "*"  , %nan)
            [s, mc] = variancef(x, fre, "r"|1, %nan)
            [s, mc] = variancef(x, fre, "c"|2, %nan)
            s = variancef(x, fre, "*", m)
            s = variancef(x, fre, "r", m)
            s = variancef(x, fre, "c", m)
        </synopsis>
    </refsynopsisdiv>
    <refsection>
        <title>Arguments</title>
        <variablelist>
            <varlistentry>
                <term>x</term>
                <listitem>
                    <para>
                        vector or matrix of real or complex numbers
                    </para>
                </listitem>
            </varlistentry>
            <varlistentry>
                <term>fre</term>
                <listitem>
                    <para>
                        vector or matrix of positive decimal integers (the frequencies): <code>fre(i,j)</code> is the number of times that <code>x(i,j)</code> must be counted.
                        <varname>fre</varname> and <varname>x</varname> must have the same size.
                    </para>
                </listitem>
            </varlistentry>
            <varlistentry>
                <term>orien</term>
                <listitem>
                    <para>the orientation of the computation. Valid values are:
                        <itemizedlist>
                            <listitem>1 or "r" : result is a row, after a column-wise computation.</listitem>
                            <listitem>2 or "c" : result is a column, after a row-wise computation.</listitem>
                            <listitem>
                                "*" : non-directional computation over all elements (default); it must be given explicitly when <literal>m</literal> is used.
                            </listitem>
                        </itemizedlist>
                    </para>
                </listitem>
            </varlistentry>
            <varlistentry>
                <term>m</term>
                <listitem>
                    <para>
                        The mean of the underlying statistical distribution, when it is known.
                        <itemizedlist>
                            <listitem>
                                "*" mode (default): <varname>m</varname> must be a scalar.
                            </listitem>
                            <listitem>
                                "r" or 1 mode: <varname>m</varname> is a row of length <literal>size(x,2)</literal>. The variance along column #j is computed using <literal>m(j)</literal> as the mean of that column. If the mean is the same for all columns, <varname>m</varname> can be given as a scalar.
                            </listitem>
                            <listitem>
                                "c" or 2 mode: <varname>m</varname> is a column of length <literal>size(x,1)</literal>. The variance along row #i is computed using <literal>m(i)</literal> as the mean of that row. If the mean is the same for all rows, <varname>m</varname> can be given as a scalar.
                            </listitem>
                        </itemizedlist>
                    </para>
                    <para>
                        When <varname>m</varname> is not provided, the <literal>variance</literal> is computed by dividing the sum of the squared deviations of the n values from <literal>meanf(x,fre)</literal> (or <literal>meanf(x,fre,"c")</literal> or <literal>meanf(x,fre,"r")</literal>) by (n-1), where n is <literal>sum(fre)</literal> (or <literal>sum(fre,"c")</literal> or <literal>sum(fre,"r")</literal>). If the elements of <varname>x</varname> are mutually independent, the result is statistically unbiased.
                    </para>
                    <para>
                        Otherwise, the <literal>variance</literal> is computed by dividing the sum of the squared deviations of the values from <varname>m</varname> by the number n of values considered.
                    </para>
                    <para>
                        If <varname>m</varname> is a true value known independently of the elements of <varname>x</varname>, then <varname>x</varname> and <varname>m</varname> are mutually independent and the result is unbiased.
                    </para>
                    <para>
                        When the special value <literal>m = %nan</literal> is provided, the variance is still normalized by n (not n-1) but is computed using
                        <literal>m = meanf(x, fre)</literal> (or <literal>m = meanf(x,fre,"c")</literal> or <literal>m = meanf(x,fre,"r")</literal>). Such an <varname>m</varname> brings no independent information, so the result is statistically biased.
                    </para>
                </listitem>
            </varlistentry>
            <varlistentry>
                <term>s</term>
                <listitem>
                    The variance of the frequency-weighted elements of <varname>x</varname>. It is a scalar, a column vector or a row vector, according to <varname>orien</varname>.
                </listitem>
            </varlistentry>
            <varlistentry>
                <term>mc</term>
                <listitem>
                    Scalar or <varname>orien</varname>-wise mean of the frequency-weighted elements of <varname>x</varname> (<literal>= meanf(x, fre,..)</literal>), as computed internally and used as the reference in the variance.
                </listitem>
            </varlistentry>
        </variablelist>
    </refsection>
    <refsection>
        <title>Description</title>
        <para>
            This function computes the variance of the values of a
            vector or matrix <varname>x</varname>, each <literal>x(i,j)</literal> being counted <literal>fre(i,j)</literal> times.
            If <literal>x</literal> is complex, then <literal>variancef(x,fre,..) = variancef(real(x),fre,..) + variancef(imag(x),fre,..)</literal> is returned.
        </para>
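        <para>
            A minimal sketch of these two statements, using arbitrary illustrative values: weighting by <varname>fre</varname> is expected to give the same result as calling <literal>variance</literal> on a vector in which each value is written out as many times as it is counted, and the complex case is expected to split into its real and imaginary parts.
        </para>
        <programlisting role="example"><![CDATA[
x   = [1 2 4];
fre = [2 1 3];
xr  = [1 1 2 4 4 4];          // x(i) written out fre(i) times
s_weighted = variancef(x, fre)
s_repeated = variance(xr)     // expected to match s_weighted

// Complex values (imaginary parts chosen arbitrarily for the illustration)
z  = x + %i*[0 1 -1];
sz = variancef(z, fre)
sz_split = variancef(real(z), fre) + variancef(imag(z), fre)   // expected to match sz
 ]]></programlisting>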
        <para>
            <literal>s = variancef(x,fre)</literal> (or <literal>s = variancef(x,fre,"*")</literal>) returns the scalar variance computed over all values of <varname>x</varname>.
        </para>
        <para>
            <literal>s = variancef(x,fre,"r")</literal> (or equivalently <literal>s = variancef(x,fre,1)</literal>) returns a row <varname>s</varname> such that for each j,
            <literal>s(j) = variancef(x(:,j),fre(:,j),..)</literal>.
        </para>
        <para>
            <literal>s = variancef(x,fre,"c")</literal> (or equivalently <literal>s = variancef(x,fre,2)</literal>) returns a column <varname>s</varname> such that for each i,
            <literal>s(i) = variancef(x(i,:),fre(i,:),..)</literal>.
        </para>
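        <para>
            The "r" and "c" orientations can be checked against an explicit loop. The following is only a sketch with arbitrary values; the names <literal>s_r_loop</literal> and <literal>s_c_loop</literal> are introduced here for the illustration.
        </para>
        <programlisting role="example"><![CDATA[
x   = [1 2 3; 4 6 9];
fre = [1 2 1; 3 1 2];
s_r = variancef(x, fre, "r");    // row of column-wise variances
s_c = variancef(x, fre, "c");    // column of row-wise variances

// Rebuild both results one column (or one row) at a time
s_r_loop = zeros(1, size(x, 2));
for j = 1:size(x, 2)
    s_r_loop(j) = variancef(x(:, j), fre(:, j));
end
s_c_loop = zeros(size(x, 1), 1);
for i = 1:size(x, 1)
    s_c_loop(i) = variancef(x(i, :), fre(i, :));
end
disp([s_r; s_r_loop])    // the two rows are expected to agree
disp([s_c, s_c_loop])    // the two columns are expected to agree
 ]]></programlisting>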
        <para>
            When the mean <varname>m</varname> is provided, it is used as the reference in the variance computation instead of being estimated internally from <varname>x</varname> (unless it is the special value <code>%nan</code>; see the description of <varname>m</varname>). This makes it possible to compute the variance of a sample <varname>x</varname> with respect to a given statistical model, rather than extracting an empirical dispersion in order to build the model.
        </para>
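        <para>
            As a summary sketch for the "*" mode, the three normalizations can be written as follows. The symbols are shorthand introduced here, not names used by the function: f_i stands for the entries of <varname>fre</varname>, x_i for those of <varname>x</varname>, n for <literal>sum(fre)</literal> and mc for the weighted mean. The first variance is the default (no <varname>m</varname>), the second applies when a known mean <varname>m</varname> is given, and the third when <varname>m</varname> is the special not-a-number value.
        </para>
        <para>
            <latex style="display"><![CDATA[
            n = \sum_i f_i , \qquad mc = \frac{1}{n} \sum_i f_i \, x_i
            ]]></latex>
        </para>
        <para>
            <latex style="display"><![CDATA[
            s = \frac{1}{n-1} \sum_i f_i \, |x_i - mc|^2
            ]]></latex>
        </para>
        <para>
            <latex style="display"><![CDATA[
            s = \frac{1}{n} \sum_i f_i \, |x_i - m|^2
            ]]></latex>
        </para>
        <para>
            <latex style="display"><![CDATA[
            s = \frac{1}{n} \sum_i f_i \, |x_i - mc|^2
            ]]></latex>
        </para>
        <para>
            The modulus makes the same expressions cover complex <varname>x</varname>, consistently with the sum of the variances of the real and imaginary parts.
        </para>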
    </refsection>
    <refsection>
        <title>Examples</title>
        <programlisting role="example"><![CDATA[
// Example #1:
x = [0.2113249 0.0002211 0.6653811; 0.7560439 0.9546254 0.6283918]
fre = [1 2 3; 3 4 3]
[s, m] = variancef(x, fre)
[s, m] = variancef(x, fre, "r")
[s, m] = variancef(x, fre, "c")

// Example #2:
x0 = grand(20, 7, "uin", -9, 10)+0.4
x = matrix((-9:10)+0.4, 5, 4)
fre = members(x, x0)        // Computes the frequencies of x's elements in x0
[s, m] = variancef(x, fre)  // Should be equal to variance(x0)
[s, m] = variance(x0)

// Example #2 (follow-up):
m = (-9+10)/2+0.4                    // Known asymptotic mean (if x0 had an infinite number of elements)
s = variancef(x, fre, "*", m)        // Sample variance wrt the true mean
s0 = ((10 - (-9) + 1)^2 - 1) / 12    // Known asymptotic variance of the discrete uniform law on the 20 values -9:10
s2 = variancef(x, fre, "*", %nan)    // Takes m = meanf(x,fre) =>  always <= s
 ]]></programlisting>
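        <para>
            The following sketch recomputes the three normalizations by hand for the "*" mode and prints each one next to the corresponding call. The values are arbitrary, and <literal>m = 3</literal> is a hypothetical known mean chosen only for the illustration.
        </para>
        <programlisting role="example"><![CDATA[
x   = [1 2 4];
fre = [2 1 3];
n   = sum(fre);
mc  = sum(fre .* x) / n;                        // weighted mean, as meanf(x, fre)

s_default = sum(fre .* (x - mc).^2) / (n - 1);  // m not provided: divide by n-1
s_nan     = sum(fre .* (x - mc).^2) / n;        // m = %nan: same mean, divide by n

m = 3;                                          // hypothetical known mean
s_known   = sum(fre .* (x - m).^2) / n;         // m provided: divide by n

// Each pair is expected to agree
disp([s_default, variancef(x, fre)])
disp([s_nan,     variancef(x, fre, "*", %nan)])
disp([s_known,   variancef(x, fre, "*", m)])
 ]]></programlisting>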
    </refsection>
    <refsection role="see also">
        <title>See Also</title>
        <simplelist type="inline">
            <member>
                <link linkend="variance">variance</link>
            </member>
            <member>
                <link linkend="mtlb_var">mtlb_var</link>
            </member>
            <member>
                <link linkend="stdevf">stdevf</link>
            </member>
        </simplelist>
    </refsection>
    <refsection>
        <title>Bibliography</title>
        <para>
            Wonacott, T.H. &amp; Wonacott, R.J.; Introductory Statistics, fifth edition, J.Wiley &amp; Sons, 1990.
        </para>
    </refsection>
    <refsection>
        <title>History</title>
        <revhistory>
            <revision>
                <revnumber>5.5.0</revnumber>
                <revdescription>
                    <itemizedlist>
                        <listitem>
                            <para>variancef(complexes,..) fixed.</para>
                        </listitem>
                        <listitem>
                            <para>variancef(x, fre, orien, m) introduced: the true mean m of the underlying statistical law can be used.</para>
                        </listitem>
                        <listitem>
                            <para>variancef(x, fre, orien, %nan) introduced: meanf(x, fre,..) is used as the mean, but the sum of squared deviations is divided by n (instead of n-1).</para>
                        </listitem>
                        <listitem>
                            <para>[s, mc] = variancef(x,fre,..) introduced: the mean mc computed from x and fre is now also returned.</para>
                        </listitem>
                    </itemizedlist>
                </revdescription>
            </revision>
        </revhistory>
    </refsection>
</refentry>