- Question: What is CSV?
-
- A CSV, or Comma Separated Values, file is a
loosely defined text file format that is commonly used for data exchange.
Unfortunately, there is no single true standard for how CSV files are to be
defined, so they can be problematic when exchanging data between
applications. This application uses the Microsoft Text ODBC driver to
implement CSV file support, so the software follows the same conventions
used by Microsoft applications. A CSV file encodes a single record of
information on a single line of plain text, with individual data fields
separated by commas. Numeric data usually does not use any field delimiters
(the quote characters that enclose a field's contents). Other types of data
fields are not implemented in a consistent manner across applications.
The two typical problem areas are which field delimiters get used, if any,
and how special characters are encoded when they are part of the data
field. Some applications use field delimiters on all non-numeric fields.
Usually the field delimiter is a single or double quote mark character.
The Microsoft implementation generally avoids field delimiters unless the
field contains special characters. Delimited and undelimited fields are
commonly mixed within a single record, so some text fields may use field
delimiters and some may not, depending on the data in each field.
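
The two quoting styles described above can be sketched with Python's
standard csv module. This is only an illustration of the conventions, not
the Microsoft Text driver itself; the sample record and the write_row helper
are hypothetical. QUOTE_MINIMAL delimits a field only when its contents
require it, while QUOTE_NONNUMERIC delimits every non-numeric field.

```python
import csv
import io


def write_row(row, quoting):
    """Encode one record as a single CSV line using the given quoting style."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=quoting, lineterminator="\n")
    writer.writerow(row)
    return buf.getvalue()


# Hypothetical sample record: two numeric fields and two text fields,
# one of which contains an embedded comma.
record = [42, "Smith", "Portland, OR", 19.5]

# Quote only the fields that need it (mixed within one record).
minimal = write_row(record, csv.QUOTE_MINIMAL)

# Quote every non-numeric field, as some applications do.
nonnumeric = write_row(record, csv.QUOTE_NONNUMERIC)

print(minimal, end="")     # 42,Smith,"Portland, OR",19.5
print(nonnumeric, end="")  # 42,"Smith","Portland, OR",19.5
```

In the first output line only the field with the embedded comma is
delimited, so delimited and undelimited text fields sit side by side in the
same record.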
-
- Special characters can present additional
problems with CSV data files because there is no universally defined
convention for how to encode them. The most common problems are what to do
with commas and quote marks that are embedded in the data fields. The
Microsoft Text driver forces a field to use a double-quote mark as a field
delimiter when either of these special characters is present in the data
field. This lets the application tell commas that separate fields apart from
commas that are part of the data. If the data contains a double-quote mark,
the Microsoft Text driver escapes the character by outputting it twice. So a
single double-quote mark embedded in the field data is represented by two
double-quote marks, because a single double-quote mark delimits the start
and end of the field.
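
The escaping rule above can be sketched in a few lines of Python. This is a
minimal illustration written from the description in this answer, not the
driver's actual code; encode_field and encode_record are hypothetical
helper names, and the sketch assumes fields contain no embedded line
breaks. Python's csv reader, which doubles quotes by default, is used only
to check that the line decodes back to the original values.

```python
import csv
import io


def encode_field(value):
    """Delimit a field with double quotes only if it holds a comma or quote."""
    text = str(value)
    if "," in text or '"' in text:
        # Escape each embedded double-quote mark by writing it twice.
        return '"' + text.replace('"', '""') + '"'
    return text


def encode_record(fields):
    """Join the encoded fields into one comma-separated line."""
    return ",".join(encode_field(f) for f in fields)


line = encode_record(["widget", 'He said "hi"', "a,b", 7])
print(line)  # widget,"He said ""hi""","a,b",7

# Round trip: a reader using the same convention recovers the original text.
decoded = next(csv.reader(io.StringIO(line)))
print(decoded)  # ['widget', 'He said "hi"', 'a,b', '7']
```

Note that the numeric field comes back as the text '7': CSV itself carries
no type information, so the reading application decides how to interpret
each field.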
Compliments of xcent.com