Tuesday, August 9, 2011

Importing length fixed data in Oracle

These days I have been working a lot with Oracle, usually I have to deal with MS SQL Server, Oracle, MySQL and PostgreSQL so sometimes I get confused and sometimes I missed functionality from other DB engines. Anyways, this time I want to show how to import huge amount of data using sqlldr.

Explain sqlldr in one post is impossible so I will just focus in one part. When you import data usually you get a nice formatted csv file, you know, coma separated or something like that, but what do you do when you just get a huge amount of text?, no comma, tab, quotes separated, nothing! well, if you have the table structure information (the size of each column), you can use sqlldr and specify the size of each column in the control file and sqlldr will load each column based on that.

This is how you need to write in your control file:
  INFILE filename.dat
  INTO TABLE coredata
Column1    POSITION(1:12) CHAR,
Column2    POSITION(13:204) CHAR

Couple of things to remark:
position[start pont:end point]
I made the mistake to think that the second value was the length of the column, which is wrong is the end of the column which I do not like since it woukd be better to just give the start point and the length. You need to calculate that for each column. I know it is just adding all the time but still I would rather put the length instead of the end point.

You can add multiple lines like this to import multiple files in one execution instead of changing the name for every file and execution.

For example, if you add:
at the beginning it will skip 1 row, load just 10 rows and limit to 5 errors.

Once you have that you only need to execute it using sqlldr like this:
sqlldr username/password loader=loader.ctl
Sqlldr is an amazing tool if you need to import huge amounts of data. I really recommend it. It has the best performance comparing other options and it supports tons of configurations like, importing multiple files at once, or import to a different tables, you can manipulate the data while it inserts, parallel processing, append or replace data, skip rows, limit the rows inserted. It is a huge list so you should really look into it.

Hope it helps someone, happy coding :)

No comments:

Post a Comment