Q: Reading comma- and space-separated numbers from a text file

I have a text file containing numbers separated by both commas and spaces. Each row holds the row index followed by pairs of numbers: the pairs are separated by spaces, and the two numbers within a pair are separated by a comma. The number of columns differs from row to row. For example, two rows of the file look like this:

1 34,10 12,23
2 22,123 11,102 12,34 22,232

I tried dlmread, but it gives an error because there are commas in the file. I tried csvread, but it reads only some of the data, in an unclear pattern. I also used:

mymatrix = load('filename.txt','-ascii')

but it gives an error because the number of columns is not the same in each row.

How can I read this irregular data pattern?

answer1:

The importdata GUI is a really powerful tool.

I have almost never encountered a case where it wouldn't work. Sometimes it's overkill, because the file has a more regular structure and other functions should be used. But in your case, with different row lengths, it is a good option.

The nice thing is that it's intuitive: load your file, change some parameters, and finally generate a script automatically (the "Import Selection" button).

And you get a nice double matrix:

data =

     1    34    10    12    23   NaN   NaN   NaN   NaN
     2    22   123    11   102    12    34    22   232

If you have multiple files with different maximum numbers of value pairs per row, you can either run the GUI on the file with the overall maximum, or take the generated script and modify it for dynamic detection. But first check whether this already works well enough for you.


This is the code generated by the GUI:

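filename = 'data.txt';

% Nine numeric columns (the widest row), then the rest of the line as text
formatSpec = '%f%f%f%f%f%f%f%f%f%[^\n\r]';
fileID = fopen(filename,'r');
data = textscan(fileID, formatSpec, 'Delimiter', {',',' '}, ...
               'MultipleDelimsAsOne', true, 'EmptyValue', NaN, 'ReturnOnError', false);
fclose(fileID);
% Concatenate the numeric columns, dropping the trailing text column
out = [data{1:end-1}];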
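
For the dynamic detection mentioned above, one option is a two-pass read: count the values on each line first, then size formatSpec to the widest line. This is only a sketch, not code produced by the GUI. The temporary sample file and the names maxCols and tline are made up for illustration, and it assumes textscan pads short rows with NaN in the same way as the generated script.

```matlab
% Write the two sample rows to a temporary file (illustrative only).
filename = [tempname, '.txt'];
fileID = fopen(filename, 'w');
fprintf(fileID, '1 34,10 12,23\n2 22,123 11,102 12,34 22,232\n');
fclose(fileID);

% Pass 1: count the values on each line and keep the maximum.
fileID = fopen(filename, 'r');
maxCols = 0;
tline = fgetl(fileID);
while ischar(tline)
    % A value is any run of characters that is not whitespace or a comma.
    maxCols = max(maxCols, numel(regexp(tline, '[^\s,]+', 'match')));
    tline = fgetl(fileID);
end

% Pass 2: rewind and reuse the generated textscan call with a
% format sized to the widest line.
frewind(fileID);
formatSpec = [repmat('%f', 1, maxCols), '%[^\n\r]'];
data = textscan(fileID, formatSpec, 'Delimiter', {',', ' '}, ...
    'MultipleDelimsAsOne', true, 'EmptyValue', NaN, 'ReturnOnError', false);
fclose(fileID);
delete(filename);

% Assumption: short rows were padded, so all columns have equal length.
out = [data{1:end-1}];
```

The first pass costs one extra read of the file, but it removes the need to know the widest row before writing the format string.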

answer2:

In the general case, where you do not know in advance the maximum number of values per row, you can read the file line by line using fgetl.

fgetl returns a char array, which you can then convert to an array of numbers using str2num.

Since the arrays may have different lengths, you can collect them all in a cell array.

% Open input file
fid = fopen('tmp_in.txt');
% Initialize output as an empty cell array
the_data = {};
% Read the input file line by line
while true
   tline = fgetl(fid);
   % fgetl returns -1 (not a char array) at end of file
   if ~ischar(tline)
      break
   end
   % Convert to numbers and store in the cell array
   the_data{end+1,1} = str2num(tline);
end
% Close the input file
fclose(fid);
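
Once each row is a numeric vector, the row index and the pairs can be separated with reshape. The sketch below also swaps str2num for strrep plus sscanf, which avoids evaluating file contents as MATLAB expressions (str2num uses eval internally). The temporary sample file and the names rowIdx and pairs are assumptions made for illustration, and every line is assumed to be non-empty.

```matlab
% Write the two sample rows to a temporary file (illustrative only).
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1 34,10 12,23\n2 22,123 11,102 12,34 22,232\n');
fclose(fid);

fid = fopen(fname, 'r');
rowIdx = [];    % first number of each line
pairs  = {};    % one N-by-2 matrix of pairs per line
tline = fgetl(fid);
while ischar(tline)
    % Turn commas into spaces, then parse every number on the line.
    v = sscanf(strrep(tline, ',', ' '), '%f').';
    rowIdx(end+1, 1) = v(1);
    % The remaining values are consecutive pairs: reshape to N-by-2.
    pairs{end+1, 1} = reshape(v(2:end), 2, []).';
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);
```

With the two sample rows, pairs{1} is [34 10; 12 23] and pairs{2} has four rows, one per pair.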

Hope this helps.

answer3:

Since the row lengths are irregular, I don't believe there's a simple solution. But here's a strategy that should work.

  1. Iterate over each line in the file. fgetl will work.
  2. For each line, count the pairs: nPair = length(strfind(myLine, ','));
  3. formatSpec = ['%d', repmat(' %d,%d', 1, nPair)];
  4. rowNums = textscan(myLine, formatSpec);
  5. rowNums now holds the numbers of the line as a cell array.
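
The steps above can be sketched as a single loop. This is a minimal sketch rather than a tested solution: the temporary sample file and the name allRows are assumptions for illustration, and it assumes every line follows the "index, then comma-separated pairs" layout exactly.

```matlab
% Write the two sample rows to a temporary file (illustrative only).
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1 34,10 12,23\n2 22,123 11,102 12,34 22,232\n');
fclose(fid);

fid = fopen(fname, 'r');
allRows = {};                      % one numeric row vector per line
myLine = fgetl(fid);               % step 1: iterate with fgetl
while ischar(myLine)
    % Step 2: each pair contributes exactly one comma.
    nPair = length(strfind(myLine, ','));
    % Step 3: one %d for the index, then ' %d,%d' repeated along a row.
    formatSpec = ['%d', repmat(' %d,%d', 1, nPair)];
    % Step 4: parse the line; textscan returns one cell per field.
    rowNums = textscan(myLine, formatSpec);
    % Step 5: flatten the cells into a plain double row vector.
    allRows{end+1, 1} = double([rowNums{:}]);
    myLine = fgetl(fid);
end
fclose(fid);
delete(fname);
```

Note that repmat must repeat the pattern horizontally (1, nPair); repeating it vertically would build a multi-row char matrix that cannot be concatenated with '%d'.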

Good luck!

matlab  file-io