Tuesday, October 7, 2008

Loading Empty Lines

From time to time, I need to load raw text data into a table, line by line, preserving the original text and the original order of lines. The LOAD TABLE command does an excellent job, as follows:

CREATE TABLE raw_text (
line_number BIGINT NOT NULL
DEFAULT AUTOINCREMENT
PRIMARY KEY CLUSTERED,
line_text LONG VARCHAR NOT NULL DEFAULT '' );

LOAD TABLE raw_text ( line_text )
FROM 'c:/temp/xxx.txt'
DEFAULTS ON
DELIMITED BY ''
ESCAPES OFF
HEXADECIMAL OFF
QUOTES OFF
STRIP OFF;

SELECT * FROM raw_text ORDER BY line_number;
When the input file looks like this...
first line

third line
...the SELECT displays this...



Now, I said that LOAD TABLE did an "excellent job" and it does... if the input file doesn't have any empty lines, or if you don't want to preserve empty lines. Because they are completely ignored.

If you do want to preserve empty input lines, there is a solution: include the ROW DELIMITED BY option like this:
LOAD TABLE raw_text ( line_text )
FROM 'c:/temp/xxx.txt'
DEFAULTS ON
DELIMITED BY ''
ESCAPES OFF
HEXADECIMAL OFF
QUOTES OFF
ROW DELIMITED BY '\x0d\x0a'
STRIP OFF;

SELECT * FROM raw_text ORDER BY line_number;
Now the SELECT shows all three lines, including the empty one:



The ROW DELIMITED BY option was added in SQL Anywhere version 10, another reason to upgrade... but if you're not using version 10 already, consider moving straight to version 11.

No comments: