CSV - Comma Separated Value

Written by Paul Bourke
April 2000


The CSV (Comma Separated Value) format is a (mostly) straightforward way of transferring data between programs. It is typically used for spreadsheet and database applications because it suits data arranged in a table, that is, data made up of rows and columns.

In its simplest form a CSV file consists of records and fields. Records are separated by any combination of linefeeds, carriage returns, or carriage return/linefeed pairs. Fields are separated by commas, and any leading or trailing spaces are removed from fields.

For example, the following has two records, each with four fields.
1,tree,december,1999
5,bush,january, 2000
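
The simplest form described above can be sketched in a few lines of Python. This is only an illustration under the assumption that no field is quoted; the function name parse_simple_csv is made up for this example and is not part of any library.

```python
import re

def parse_simple_csv(text):
    """Parse unquoted CSV text into a list of records (lists of fields)."""
    records = []
    # Any run of carriage returns and/or linefeeds separates records.
    for line in re.split(r"[\r\n]+", text):
        if line == "":
            continue  # nothing between or after record separators
        # Fields are separated by commas; surrounding spaces are dropped.
        records.append([field.strip(" ") for field in line.split(",")])
    return records

sample = "1,tree,december,1999\r\n5,bush,january, 2000\n"
print(parse_simple_csv(sample))
# [['1', 'tree', 'december', '1999'], ['5', 'bush', 'january', '2000']]
```

Note that the space before 2000 in the second record does not survive, in line with the rule that leading and trailing spaces are removed.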

The above raises the question, "What if my field contains leading or trailing spaces, commas, linefeeds, or carriage returns?" The answer is to enclose such fields in double quotes. Note that this means a record can span more than one line. The following example has two records, each with three fields.

1," tree","december, 1999"
5,bush,"january, 2000"
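
As a rough check of the quoting rule, the example can be read with the csv module from the Python standard library. The helper name read_csv_text is invented for this sketch. One caveat: with its default settings this module preserves spaces around unquoted fields rather than removing them, so it does not follow every rule described here exactly.

```python
import csv
import io

def read_csv_text(text):
    """Return all records in a CSV string as lists of fields."""
    # newline="" leaves line endings untouched so that the csv module
    # can handle linefeeds that occur inside quoted fields.
    return list(csv.reader(io.StringIO(text, newline="")))

sample = '1," tree","december, 1999"\n5,bush,"january, 2000"\n'
print(read_csv_text(sample))
# [['1', ' tree', 'december, 1999'], ['5', 'bush', 'january, 2000']]

# A quoted field may contain a linefeed, so one record spans two lines.
multiline = '1,"first line\nsecond line",end\n'
print(len(read_csv_text(multiline)))
# 1
```

The quoted leading space in " tree" and the quoted commas are kept as part of the field values.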

So what about fields that contain double quotes? Each double quote within the field is represented by two double quotes, and the field itself must also be enclosed in double quotes. For example, the first field in the following becomes I said "stop".

"I said ""stop""","January 2000","5pm"

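The doubling of quotes can be seen by writing the record out and reading it back, again using Python's standard csv module as one possible tool. The QUOTE_ALL setting is chosen here only so that every field is enclosed in double quotes, as in the line above; by default the module quotes only the fields that need it.

```python
import csv
import io

row = ['I said "stop"', "January 2000", "5pm"]

# Write the record, enclosing every field in double quotes.
buffer = io.StringIO(newline="")
csv.writer(buffer, quoting=csv.QUOTE_ALL).writerow(row)
encoded = buffer.getvalue()
print(encoded, end="")
# "I said ""stop""","January 2000","5pm"

# Reading the text back restores the original field values.
decoded = next(csv.reader(io.StringIO(encoded, newline="")))
print(decoded[0])
# I said "stop"
```

The writer ends each record with a carriage return/linefeed pair by default, which is one of the record separators described earlier.
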
In the vast majority of cases CSV is used to transfer human-readable ASCII data; however, there is no reason why it can't be used for more general binary data.