0

I am trying to take two text files and concatenate every line in file1 with every line in file2.

Example: file1:

a
b

file2:

c
d

result:

a c

a d

b c

b d

This is the code:

{
        //int counter = 0;


        string[] lines1 = File.ReadLines("e:\\1.txt").ToArray();
        string[] lines2 = File.ReadLines("e:\\2.txt").ToArray();

        int len1 = lines1.Length;
        int len2 = lines2.Length;

        string[] names = new string[len1 * len2];
        int i = 0;
        int finish = 0;
        //Console.WriteLine("Check this");
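        // Outer loop over file1, inner loop over file2, so the output order
        // matches the example above (a c, a d, b c, b d).
        for (i = 0; i < lines1.Length; i++)
        {
            for (int j = 0; j < lines2.Length; j++)
            {
                names[finish] = lines1[i] + ' ' + lines2[j];
                finish++;
            }
        }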

        using (System.IO.StreamWriter file = new System.IO.StreamWriter(@"E:\text.txt"))
        {
            foreach (string line in names)
            {
                // Write each combined line to the output file.
                file.WriteLine(line);
            }
        }
    }

I get this exception:

"An unhandled exception of type 'System.OutOfMemoryException' occurred in ConsoleApplication2.exe" on this line:

string[] names = new string[len1 * len2];

Is there another way to combine these two files without getting an OutOfMemoryException?

3
  • How big are the files? Commented Dec 18, 2014 at 10:18
  • Put your for loops inside the "using" block and, instead of "names[finish] =", write directly to the file like this: "file.WriteLine(lines2[i] + ' ' + lines1[j]);". This way you don't have to create the string[] names array. Commented Dec 18, 2014 at 10:34
  • Combined, the two files come to 400,000,000 rows. Commented Dec 18, 2014 at 12:49

6 Answers

2

Something like:

using (var output = new StreamWriter(@"E:\text.txt"))
{
    foreach(var line1 in File.ReadLines("e:\\1.txt"))
    {
        foreach(var line2 in File.ReadLines("e:\\2.txt"))
        {
            output.WriteLine("{0} {1}", line1, line2);
        }
    }
}

Unless the lines are very long, this should avoid an OutOfMemoryException.


1

It looks like you want a Cartesian product rather than a concatenation. Instead of loading all lines into memory, use ReadLines with SelectMany. This may not be fast, but it will avoid the exception:

var file1 = File.ReadLines("e:\\1.txt");
var file2 = File.ReadLines("e:\\2.txt");

var lines = file1.SelectMany(x => file2.Select(y => string.Join(" ", x, y)));
File.WriteAllLines("output.txt", lines);


0

Use a StringBuilder instance instead of concatenating strings. Strings are immutable in .NET, so each change to an instance creates a new one, consuming the available memory. Make names a StringBuilder[] and use the Append method to construct your result.

2 Comments

But it throws at string[] names = new string[len1 * len2];; it doesn't even reach the for loop.
Another solution is to read lines from the files sequentially instead of reading them all at once, which avoids allocating all the strings in memory at the same time. See the StreamReader.ReadLine() method.

0

If your files are large (which is the reason for the out-of-memory error), you should never load a complete file into memory. The result file in particular (with size = size1 * size2) will get very large. I'd suggest using a StreamReader to read through the input files line by line and a StreamWriter to write the result file line by line.

With this technique you can process arbitrarily large files (as long as the result fits on your hard disk).
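A minimal sketch of that approach, in case it helps. The class and method names (LineCombiner, WriteAllPairs) are made up for illustration and are not from the question. It assumes the output path is different from both input paths, and it reopens the second file once per line of the first, so it trades extra disk reads for low memory use:

```csharp
using System.IO;

static class LineCombiner
{
    // Hypothetical helper: pairs every line of firstPath with every line of
    // secondPath and writes "line1 line2" to outputPath.
    public static void WriteAllPairs(string firstPath, string secondPath, string outputPath)
    {
        using (var writer = new StreamWriter(outputPath))
        using (var first = new StreamReader(firstPath))
        {
            string a;
            while ((a = first.ReadLine()) != null)
            {
                // Reopen the second file for each line of the first, so only
                // one line from each input is held in memory at a time.
                using (var second = new StreamReader(secondPath))
                {
                    string b;
                    while ((b = second.ReadLine()) != null)
                    {
                        writer.Write(a);
                        writer.Write(' ');
                        writer.WriteLine(b);
                    }
                }
            }
        }
    }
}
```

With the paths from the question, the call would be LineCombiner.WriteAllPairs(@"e:\1.txt", @"e:\2.txt", @"E:\text.txt");. No names array is ever allocated, so memory use stays roughly constant regardless of how many lines the inputs contain.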


-1

Use StringBuilder instead of appending with "+".


-1

Use "List<string> names" and "Add" your combined lines to the list, so you don't need to allocate the memory up front.
