Jonas Midstrup's Blog

Programming, IT in general and everything in between.

CSV Parser

I’m currently working on a project where I needed a C# console application that could read through an Excel CSV (Comma Separated Values) file.

Basically, the CSV file format is just a plain text file in which each line is a row and the columns in a row are separated by a comma (surprise!) or a semicolon. In addition, the data in each column can optionally be “framed” by quotation marks.

So I started out with the following code, just as I would when reading a normal text file:

try
{
    using (StreamReader readFile = new StreamReader(path))
    {
        // Do something here…
    }
}
catch (Exception e)
{
    // Do some error handling here…
}

This is, as you can see, really straightforward. First I create a StreamReader object in a using statement. Through the object “readFile” I can then navigate the file. The using statement is important because it does the cleanup for me by calling StreamReader.Dispose() when the block finishes, even if an exception is thrown. I always wrap this kind of code in a try…catch because errors occasionally happen when you work with files.
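As a rough illustration of what the error handling could look like, here is a minimal sketch that catches the file-related exception types separately instead of a bare Exception. The class ReadSketch, the method CountLines and the choice to return -1 on failure are all assumptions made for this example, not part of the project described here.

```csharp
using System;
using System.IO;

public static class ReadSketch
{
    // Hypothetical helper: returns the number of lines in the file,
    // or -1 if the file could not be read.
    public static int CountLines(string path)
    {
        try
        {
            using (StreamReader readFile = new StreamReader(path))
            {
                int count = 0;
                while (readFile.ReadLine() != null)
                    count++;
                return count;
            }
        }
        catch (IOException e)
        {
            // FileNotFoundException and DirectoryNotFoundException derive
            // from IOException, so a missing file ends up here.
            Console.Error.WriteLine("Could not read file: " + e.Message);
            return -1;
        }
        catch (UnauthorizedAccessException e)
        {
            // Thrown when the process is not allowed to open the file.
            Console.Error.WriteLine("Access denied: " + e.Message);
            return -1;
        }
    }

    public static void Main()
    {
        // Assumes no file with this name exists in the working directory.
        Console.WriteLine(CountLines("does-not-exist.csv"));
    }
}
```

Catching the narrower types keeps programming errors, such as passing a null path, from being silently swallowed by the same handler.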

Now, to read the data from the CSV file I add the following lines of code inside the using statement:

List<string[]> parsedData = new List<string[]>();
string line;
string[] row;

while ((line = readFile.ReadLine()) != null)
{
    row = line.Split(',');
    parsedData.Add(row);
}

This declares a new List that holds arrays of strings; the line and row variables are needed when traversing the file. I then use the readFile object to call the ReadLine() method of the StreamReader class in a while loop. When there are no more lines in the file, ReadLine() returns null and the loop ends. Inside the while loop I use the string.Split() method to split the line into an array of strings (my columns), and I then add this array to my List object (parsedData).
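One caveat with string.Split(): when a column is framed by quotation marks and contains the separator itself, a plain split cuts that column in two and leaves the quotation marks in the data. Below is a minimal sketch of a quote-aware alternative. It assumes that a doubled quotation mark inside a framed column stands for a literal quotation mark and that a row never spans more than one line; the names QuoteAwareSplitter and SplitLine are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuoteAwareSplitter
{
    // Hypothetical helper: splits one line into columns, treating
    // separators inside quotation marks as ordinary data.
    public static string[] SplitLine(string line, char splitter)
    {
        List<string> columns = new List<string>();
        StringBuilder current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];

            if (c == '"')
            {
                if (inQuotes && i + 1 < line.Length && line[i + 1] == '"')
                {
                    // Assumption: "" inside a framed column is a literal quote.
                    current.Append('"');
                    i++;
                }
                else
                {
                    // Otherwise the quote opens or closes the framing.
                    inQuotes = !inQuotes;
                }
            }
            else if (c == splitter && !inQuotes)
            {
                // A separator outside quotes ends the current column.
                columns.Add(current.ToString());
                current.Length = 0;
            }
            else
            {
                current.Append(c);
            }
        }

        // The last column has no trailing separator.
        columns.Add(current.ToString());
        return columns.ToArray();
    }

    public static void Main()
    {
        string[] row = SplitLine("1,\"Hello, world\",\"say \"\"hi\"\"\"", ',');
        foreach (string column in row)
            Console.WriteLine(column);
    }
}
```

For the sample line in Main this yields three columns: 1, then Hello, world, then say "hi". A call to SplitLine(line, ',') could stand in for line.Split(',') inside the while loop if the files contain framed columns.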

The problem then was that I didn’t know exactly what encoding the file would be in. What to do then? I settled on a solution where I tell the StreamReader what encoding the file probably has, and it then decodes the file using that encoding. This can be done by passing an extra argument to the StreamReader constructor, like this:

using (StreamReader readFile = new StreamReader(path, encoding))
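A possible refinement, sketched here under some assumptions: StreamReader also has a constructor overload that takes a third bool argument, detectEncodingFromByteOrderMarks. When it is true and the file starts with a byte order mark, the detected encoding is used instead of the one passed in. This only helps for files that actually carry a byte order mark; otherwise the supplied encoding is still used as the fallback. The class EncodingSketch, the method ReadAll and the sample data are made up for the example.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingSketch
{
    // Hypothetical helper: reads the whole file, letting a byte order
    // mark (if present) override the fallback encoding.
    public static string ReadAll(string path, Encoding fallback)
    {
        using (StreamReader readFile = new StreamReader(path, fallback, true))
        {
            string text = readFile.ReadToEnd();
            // CurrentEncoding reflects the detection only after the first read.
            Console.WriteLine("Decoded as: " + readFile.CurrentEncoding.WebName);
            return text;
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Write a small UTF-8 file that starts with a byte order mark.
            File.WriteAllText(path, "id;name\n1;S\u00F8ren\n", new UTF8Encoding(true));

            // ASCII is deliberately the wrong guess; the byte order mark wins.
            Console.WriteLine(ReadAll(path, Encoding.ASCII));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```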

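Finally, all this can be wrapped in a nice method. I also added a check to make sure that the file I want to parse actually exists, and the separator is now passed in as a parameter so the same method works for both commas and semicolons. There you go: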

// Requires: using System; using System.Collections.Generic; using System.IO; using System.Text;
public static List<string[]> ParseCSV(string path, Encoding encoding, char splitter)
{
    // Nothing to parse if the file is not there.
    if (!File.Exists(path))
        return null;

    List<string[]> parsedData = new List<string[]>();

    try
    {
        using (StreamReader readFile = new StreamReader(path, encoding))
        {
            string line;
            string[] row;

            // ReadLine() returns null at the end of the file.
            while ((line = readFile.ReadLine()) != null)
            {
                row = line.Split(splitter);
                parsedData.Add(row);
            }
        }
    }
    catch (Exception e)
    {
        // Do some error handling here…
    }

    return parsedData;
}

Written by jonasm

June 23rd, 2011 at 3:47 pm

Posted in .NET
