<?xml version="1.0" encoding="UTF-8"?>
 * Copyright (C) 2010-2011 - INRIA - Allan CORNET
 * Scilab ( http://www.scilab.org/ ) - This file is part of Scilab
<refentry version="5.0-subset Scilab" xml:id="csvRead" xml:lang="en"
          xmlns="http://docbook.org/ns/docbook"
          xmlns:xlink="http://www.w3.org/1999/xlink"
          xmlns:svg="http://www.w3.org/2000/svg"
          xmlns:ns3="http://www.w3.org/1999/xhtml"
          xmlns:mml="http://www.w3.org/1998/Math/MathML"
          xmlns:db="http://docbook.org/ns/docbook">
<pubdate>$LastChangedDate$</pubdate>
<refname>csvRead</refname>
<refpurpose>Read comma-separated value file</refpurpose>
<title>Calling Sequence</title>
M = csvRead(filename, separator)
M = csvRead(filename, separator, decimal)
M = csvRead(filename, separator, decimal, conversion)
M = csvRead(filename, separator, decimal, conversion, substitute)
M = csvRead(filename, separator, decimal, conversion, substitute, regexpcomments, range)
[M, comments] = csvRead(filename, separator, decimal, conversion, substitute, regexpcomments, range)
<title>Parameters</title>
<para>a 1-by-1 matrix of strings, the file path.</para>
<term>separator</term>
<para>a 1-by-1 matrix of strings, the field separator used.</para>
<para>a 1-by-1 matrix of strings, the decimal mark used.</para>
<term>conversion</term>
<para>a 1-by-1 matrix of strings, the type of the output
<literal>M</literal>. Available values are "string" or "double"
<term>substitute</term>
<para>an m-by-2 matrix of strings, a replacement map (default = [],
meaning no replacements). The first column
<literal>substitute(:,1)</literal> contains the searched strings and
the second column <literal>substitute(:,2)</literal> contains the
replacement strings. Every occurrence of a searched string in the file is
<term>regexpcomments</term>
<para>a string, a regular expression: lines of the file that match it are removed. (default:
<para>a 1-by-4 matrix of floating point integers, the range of rows
and columns which must be read (default range=[], meaning that all
the rows and columns are read). Specify range using the format <literal>[R1
where (R1,C1) is the upper left corner of the
data to be read and (R2,C2) is the lower right corner.
<para>an m-by-n matrix of strings or doubles.</para>
<term>comments</term>
<para>an m-by-n matrix of strings, the lines matched by the regexp.</para>
<title>Description</title>
<para>Given an ASCII file with comma-separated fields,
this function returns the corresponding Scilab matrix of strings or
<para>For example, the .csv data file may have been created by
spreadsheet software using the "Text and comma" format.
<para>The columns may be separated by a character other than a comma.
In this case, use csvRead(filename, separator) for another
<para>The default values of the optional input arguments are defined by the
<literal>csvDefault</literal> function.
<para>Any optional input argument equal to the empty matrix
<literal>[]</literal> is set to its default value.
<para>When the input argument "conversion" is equal to "double", the
non-numeric fields within the .csv (e.g. strings) are converted into
<title>Examples</title>
<para>The following script presents some basic uses of the
<literal>csvRead</literal> function.
<programlisting role="example">// Create a file with some data separated with tabs.
filename = fullfile(TMPDIR, "data.csv");
csvWrite(M, filename, ascii(9), '.');
M1 = csvRead(filename,ascii(9), [], 'string')
M2 = csvRead(filename,ascii(9), '.', 'double')
// Compare original data and result.
// Use the substitute argument to manage
// special data files.
mputl(content,filename);
M = csvRead(filename,",",".","double",substitute)
isnan(M(2,1)) // Expected=%t
isnan(M(4,1)) // Expected=%t
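
// Illustrative addition (not part of the original example): a sketch of how
// an optional argument set to [] falls back to the value held by csvDefault.
// It assumes that csvDefault("separator") and csvDefault("decimal") return
// the current defaults; D, deffile, D1 and D2 are names made up here.
D = [1.5 2.25; 3.75 4];
deffile = fullfile(TMPDIR, "defaults_demo.csv");
csvWrite(D, deffile);
// Read once with [] placeholders, once with the defaults passed explicitly.
D1 = csvRead(deffile, [], [], "double");
D2 = csvRead(deffile, csvDefault("separator"), csvDefault("decimal"), "double");
// Both reads are expected to agree.
and(D1 == D2)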
<para>The following script presents more practical uses of the
<literal>csvRead</literal> function.
<programlisting role="example">// Define a matrix of strings
"1" "8" "15" "22" "29" "36" "43" "50"
"2" "9" "16" "23" "30" "37" "44" "51"
"3" "10" "17" "6+3*I" "31" "38" "45" "52"
"4" "11" "18" "25" "32" "39" "46" "53"
"5" "12" "19" "26" "33" "40" "47" "54"
"6" "13" "20" "27" "34" "41" "48" "55"
"+0" "-0" "Inf" "-Inf" "Nan" "1.D+308" "1.e-308" "1.e-323"
// Create a file with some data separated with commas
filename = fullfile(TMPDIR , 'foo.csv');
fd = mopen(filename,'wt');
for i = 1 : size(Astr,"r")
    mfprintf(fd,"%s\n",strcat(Astr(i,:),sep));
// To see the file : edit(filename)
Bstr = csvRead ( filename )
// Create a file with a particular separator: here ";"
filename = fullfile(TMPDIR , 'foo.csv');
fd = mopen(filename,'wt');
for i = 1 : size(Astr,"r")
    mfprintf(fd,"%s\n",strcat(Astr(i,:),sep));
// Read the file and customize the separator
csvRead ( filename , sep )
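
// Illustrative addition (not part of the original example): a sketch of the
// range argument, given as [R1 C1 R2 C2]. The names A, rangefile and B are
// made up for this illustration; unused optional arguments are left to [].
A = [1 2 3 4; 5 6 7 8; 9 10 11 12];
rangefile = fullfile(TMPDIR, "range_demo.csv");
csvWrite(A, rangefile);
// Request only rows 2 to 3 and columns 1 to 2 of the file.
B = csvRead(rangefile, [], [], "double", [], [], [2 1 3 2])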
<para>The following script shows how to remove lines using the regexp argument
of the <literal>csvRead</literal> function.
<programlisting role="example">CSV = ["// tata"; ..
filename = fullfile(TMPDIR , 'foo.csv');
mputl(CSV, filename);
// Remove the lines that contain "//"
[M, comments] = csvRead(filename, [], [], [], [], '/\/\//')
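
// Illustrative addition (not part of the original example): the same
// mechanism with another comment marker. This sketch assumes that the regexp
// "/^#/" matches the lines starting with "#"; txt, hashfile, Mh and ch are
// names made up for this illustration.
txt = ["# header"; "1,2,3"; "# note"; "4,5,6"];
hashfile = fullfile(TMPDIR, "hash_comments.csv");
mputl(txt, hashfile);
// Mh receives the remaining data, ch the lines matched by the regexp.
[Mh, ch] = csvRead(hashfile, [], [], [], [], "/^#/")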
<para>Empty fields are handled by csvRead.</para>
<programlisting role="example">
csvWrite(['1','','3';'','','6'], TMPDIR + "/example.csv")
csvRead(TMPDIR + "/example.csv", [], [], "string")
csvRead(TMPDIR + "/example.csv", [], [], "double")
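
// Illustrative addition (not part of the original example): locating the
// empty fields after a "double" read. This sketch assumes that empty fields
// come back as %nan with the "double" conversion; emptyfile, E, r and c are
// names made up for this illustration.
emptyfile = fullfile(TMPDIR, "empty_demo.csv");
csvWrite(['1','','3';'','','6'], emptyfile);
E = csvRead(emptyfile, [], [], "double");
// Row and column indices of the fields that could not be read as numbers.
[r, c] = find(isnan(E))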
<programlisting role="example">
// Define a matrix of strings
"1" "8" "15" "22" "29" "36" "43" "50"
"2" "9" "16" "23" "30" "37" "44" "51"
"3" "10" "17" "6+3*I" "31" "38" "45" "52"
"4" "11" "18" "25" "32" "39" "46" "53"
"5" "12" "19" "26" "33" "40" "47" "54"
"6" "13" "20" "27" "34" "41" "48" "55"
"+0" "-0" "Inf" "-Inf" "Nan" "1.D+308" "1.e-308" "1.e-323"
// Create a file with some data separated with commas
filename = fullfile(TMPDIR , 'foo.csv');
fd = mopen(filename,'wt');
for i = 1 : size(Astr,"r")
    mfprintf(fd,"%s\n",strcat(Astr(i,:),sep));
// To see the file : edit(filename)
Bstr = csvRead ( filename )
// Create a file with a particular separator: here ";"
filename = fullfile(TMPDIR , 'foo.csv');
fd = mopen(filename,'wt');
for i = 1 : size(Astr,"r")
    mfprintf(fd,"%s\n",strcat(Astr(i,:),sep));
// Read the file and customize the separator
csvRead ( filename , sep )
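
// Illustrative addition (not part of the original example): a sketch of the
// decimal argument for a file that uses ";" between fields and "," as the
// decimal mark. The names txt, commafile and Dc are made up for this
// illustration.
txt = ["1,5;2,25"; "3,75;4,0"];
commafile = fullfile(TMPDIR, "decimal_comma.csv");
mputl(txt, commafile);
// Separator and decimal mark are both given explicitly.
Dc = csvRead(commafile, ";", ",", "double")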
<para>In the following script, the file "filename" is read by blocks of
5000 rows. The algorithm stops when the number of rows actually read from
the file differs from 5000, i.e. when the end of the file has been
<programlisting role="example">blocksize = 5000;
R1 = (iblock-1) * blocksize + 1;
R2 = blocksize + R1-1;
irange = [R1 C1 R2 C2];
mprintf("Block #%d, rows #%d to #%d\n",iblock,R1,R2);
// range is the 7th argument: the 5 optional arguments before it are left to []
M=csvRead(filename , [] , [] , [] , [] , [] , irange );
if ( nrows > 0 ) then
    p = t/(nrows*ncols)*1.e6;
    mprintf(" Actual #rows=%d\n",nrows);
    mprintf(" T=%.3f (s)\n",t);
    mprintf(" T=%.1f (ms/cell)\n",p);
if ( nrows &lt; blocksize ) then
    mprintf("... End of the file.\n");
<para>This produces:</para>
<programlisting role="no-scilab-exec">Block #1, rows #1 to #5000
Block #2, rows #5001 to #10000
Block #3, rows #10001 to #15000
<title>See Also</title>
<simplelist type="inline">
<link linkend="csvWrite">csvWrite</link>
<title>History</title>
<revnumber>5.4.0</revnumber>
<revremark>Function introduced. Based on the 'csv_readwrite' module.</revremark>