Let’s look at an actual FastLoad script that you might see in the real world. In the script below, every comment line is placed inside the normal Teradata comment syntax, [/* ... */]. FastLoad and SQL commands are written in upper case to make them stand out; in reality, Teradata utilities, like Teradata itself, are not case sensitive by default. You will also note that when column names are listed vertically, we recommend placing the comma separator in front of the following column. Coding this way makes reading or debugging the script easier for everyone. The purpose of this script is to load the Employee_Profile table in the SQL01 database. The input file used for the load is named EMPS.TXT. Each step is described in detail below the sample script.
Normally it is not a good idea to put the DROP and CREATE statements in a FastLoad script. The reason is that when any of the tables that FastLoad is using are dropped, the script cannot be restarted. It can only be rerun from the beginning. Since FastLoad has restart logic built into it, a restart is normally the better solution if the initial load attempt should fail. However, for purposes of this example, it shows the table structure and the description of the data being read.
/* !/bin/ksh* */ /* FASTLOAD SCRIPT TO LOAD THE */ /* Version 1.1 */ /* ++++++++++++++++++++++++++++*/ /* Setup the FastLoad Parameters */ | It is always good to identify the script and author in comments. Since this script does not drop the target or error tables, it is restartable, which is a good thing for production jobs. |
SESSIONS 100; /* or the number of sessions supportable */ |
TENACITY 4; /* the default is no tenacity, which means no retry */ |
SLEEP 10; /* the default is 6, which means retry in 6 minutes */ | Waits 10 minutes between retries. |
LOGON CW/SQL01,SQL01; |
SHOW VERSIONS; /* Shows the utility’s release number */ |
/* Set the record type to comma delimited for FastLoad */ RECORD 2; |
SET RECORD VARTEXT ","; | Specifies that the record layout is VARTEXT with a comma delimiter. |
Notice that all fields are defined as VARCHAR. When using VARTEXT, the fields do not contain the length field as they do in these formats: text, FastLoad, or unformatted. |
Defines the flat file name. |
/* Optional to show the layout of the input */ SHOW; |
/* Begin the Load and Insert Process into the */ |
ERRORFILES SQL01.Emp_Err1, SQL01.Emp_Err2 | Names the error tables. |
CHECKPOINT 100000; | Sets the number of rows at which to pause and record progress in the restart log before loading further. |
Defines the insert statement to use for loading the rows. |
END LOADING; | Continues the loading process with Phase 2. |
LOGOFF; | Logs off of Teradata. |
Step One: Before logging onto Teradata, it is important to specify how many sessions you need. The syntax is [SESSIONS {n}].
Step Two: Next, you LOGON to the Teradata system. You will quickly see that the utility commands in FastLoad are similar to those in BTEQ, because FastLoad commands were designed from the underlying commands in BTEQ. However, unlike BTEQ, most FastLoad commands do not allow a dot [“.”] in front of them and therefore need a semicolon. At this point we chose to have Teradata tell us which version of FastLoad is being used for the load, using SHOW VERSIONS. Why would we recommend this? Because as FastLoad’s capabilities are enhanced in newer versions, the syntax of the scripts may have to be revisited.
Step Three: If the input file is not in FastLoad format, then before you describe the INPUT FILE structure in the DEFINE statement, you must first set the RECORD layout type for the file being passed to FastLoad. We have used VARTEXT in our example with a comma delimiter. The available options are FastLoad, TEXT, UNFORMATTED, or VARTEXT. You need to know this about your input file ahead of time.
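To make the VARTEXT idea concrete, here is a minimal Python sketch (it is not FastLoad, only an illustration) of how a comma-delimited record breaks into character fields, and of how beginning at the second record, as a RECORD 2 command would, skips a header line. The function name `read_vartext` and the sample data are hypothetical.

```python
def read_vartext(text, delimiter=",", start_record=1):
    """Yield each record as a list of strings, beginning at start_record (1-based).

    Every field stays a string, mirroring the all-VARCHAR layout used with VARTEXT.
    """
    for record_number, line in enumerate(text.splitlines(), start=1):
        if record_number >= start_record:
            yield line.split(delimiter)


# Hypothetical sample input whose first record is a header line
sample = "emp_no,last_name,dept\n1001,Smith,10\n1002,Jones,20\n"

# Begin reading at record 2 so the header is not treated as data
rows = list(read_vartext(sample, start_record=2))
```

Note that the numeric-looking values remain text here; any conversion to the target column types happens at insert time rather than while the record is being split.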
Step Four: Next, comes the DEFINE statement. FastLoad must know the structure and the name of the flat file to be used as the input FILE, or source file for the load.
Step Five: FastLoad makes no assumptions from the DROP TABLE statements with regard to what you want loaded. In the BEGIN LOADING statement, the script must name the target table and the two error tables for the load. Did you notice that there is no CREATE TABLE statement for the error tables in this script? FastLoad will automatically create them for you once you name them in the script. In this instance, they are named “Emp_Err1” and “Emp_Err2”. Phase 1 uses “Emp_Err1” because it comes first, and Phase 2 uses “Emp_Err2”. The names are arbitrary, of course; you may call them whatever you like. At the same time, they must be unique within a database, so using a combination of your userid and target table name helps ensure this uniqueness between multiple FastLoad jobs running in the same database.
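The naming advice can be sketched as a small helper. This is only an illustration in Python; the function name `error_table_names` and the `_Err1`/`_Err2` suffix pattern are assumptions chosen to resemble the names used in this example, not a FastLoad requirement.

```python
def error_table_names(userid, target_table):
    """Build two error-table names from a userid and a target table name.

    Combining both parts keeps names distinct across jobs in one database.
    """
    base = f"{userid}_{target_table}"
    return f"{base}_Err1", f"{base}_Err2"


# Hypothetical usage with the userid and table from this example
names = error_table_names("SQL01", "Employee_Profile")
```

Teradata object names have a maximum length, so a long userid and table name may need to be abbreviated before use.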
In the BEGIN LOADING statement we have also included the optional CHECKPOINT parameter, [CHECKPOINT 100000]. Although not required, this parameter performs a vital task with regard to the load. In the old days, children were always told to focus on the three “R’s” in grade school (“reading, ’riting, and ’rithmetic”). There are two very different, yet equally important, R’s to consider whenever you run FastLoad: RERUN and RESTART. RERUN means that the job is capable of running all the processing again from the beginning of the load. RESTART means that the job is capable of resuming the processing from the point where it left off when the job was interrupted. When CHECKPOINT is requested, it allows FastLoad to resume loading from the first row following the last successful CHECKPOINT. We will learn more about CHECKPOINT in the section on Restarting FastLoad.
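The difference between RERUN and RESTART can be illustrated with a toy simulation. The Python sketch below is not how FastLoad is implemented; the function `load_rows` and its parameters are hypothetical, and it only counts how many rows each approach has to send after an interruption.

```python
def load_rows(rows, checkpoint_every, start_at=0, fail_at=None):
    """Simulate sending rows, recording a checkpoint every N rows.

    Returns (rows_sent_this_run, last_checkpoint). If fail_at is given, the
    run is interrupted just before sending the row at that index.
    """
    last_checkpoint = start_at
    sent = 0
    for index in range(start_at, len(rows)):
        if fail_at is not None and index == fail_at:
            return sent, last_checkpoint  # interrupted mid-load
        sent += 1
        if (index + 1) % checkpoint_every == 0:
            last_checkpoint = index + 1  # rows up to here are recorded as done
    return sent, len(rows)


rows = list(range(25))

# First attempt is interrupted at row index 17, with a checkpoint every 10 rows.
sent_first, checkpoint = load_rows(rows, checkpoint_every=10, fail_at=17)

# RESTART: resume from the first row after the last checkpoint.
sent_restart, _ = load_rows(rows, checkpoint_every=10, start_at=checkpoint)

# RERUN: start over from the beginning.
sent_rerun, _ = load_rows(rows, checkpoint_every=10)
```

In this toy run the restart resends only the rows after the last checkpoint, while the rerun resends everything. A smaller checkpoint interval shrinks the amount of repeated work at the cost of pausing more often.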
Step Six: FastLoad focuses on its task of loading data blocks to AMPs like little Yorkshire terriers do when playing with a ball! It will not stop unless you tell it to stop. Therefore, it will not proceed to Phase 2 without the END LOADING command.
In reality, this provides a very valuable capability for FastLoad. Since the table must be empty at the start of the job, it would seem to prevent loading rows as they arrive from different time zones. However, to accomplish this processing, simply omit the END LOADING on the load job. Then you can run the same FastLoad multiple times and continue loading the worktables until the last file is received. Finally, run the last FastLoad job with an END LOADING, and you have partitioned your load into smaller segments instead of one huge job. This makes FastLoad even faster!
Of course to make this work, FastLoad must be restartable. Therefore, you cannot use the DROP or CREATE commands within the script. Additionally, every script is exactly the same with the exception of the last one, which contains the END LOADING causing FastLoad to proceed to Phase 2. That’s a pretty clever way to do a partitioned type of data load.
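The staged approach described above can be pictured with a small model. This Python sketch is only an analogy: the class `StagedLoad` and its methods are hypothetical, and the sort stands in for whatever Phase 2 does on a real system.

```python
class StagedLoad:
    """Toy model of a load that stays in Phase 1 until end_loading() is called."""

    def __init__(self):
        self.staged = []   # rows received so far, not yet sorted
        self.table = None  # populated only after end_loading()

    def load_file(self, rows):
        """Accept another input file's rows while the load is still open."""
        if self.table is not None:
            raise RuntimeError("load already ended")
        self.staged.extend(rows)

    def end_loading(self):
        """Finish the load: sort everything staged and expose the table."""
        self.table = sorted(self.staged)
        return self.table


load = StagedLoad()
load.load_file([("NY", 3), ("NY", 1)])  # first file arrives
load.load_file([("LA", 2)])             # a later file from another time zone
still_pending = load.table is None      # nothing is visible in the table yet
final_rows = load.end_loading()         # only the last run ends the load
```

The point of the model is that every run except the last only adds to the staged rows; the table becomes usable once, after the final run.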
Step Seven: All that goes up must come down, and all the sessions must LOGOFF. This will be the last utility command in your script. At this point the table lock is released and, if there are no rows in the error tables, they are dropped automatically. However, if even a single row is in one of them, you are responsible for checking it, taking the appropriate action, and dropping the table manually.
FastLoad utility is used to load data into empty tables. Since it does not use transient journals, data can be loaded quickly. It doesn't load duplicate rows even if the target table is a MULTISET table.
Limitation
The target table must not have secondary indexes, join indexes, or foreign key references.
How FastLoad Works
FastLoad is executed in two phases.
Phase 1
The parsing engines read the records from the input file and send a block to each AMP.
Each AMP stores the blocks of records.
Then AMPs hash each record and redistribute them to the correct AMP.
At the end of Phase 1, each AMP has its rows but they are not in row hash sequence.
Phase 2
Phase 2 starts when FastLoad receives the END LOADING statement.
Each AMP sorts the records on row hash and writes them to the disk.
Locks on the target table are released, and the error tables are dropped if they contain no rows.
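The two phases can be sketched as a toy simulation. This Python example is an illustration under stated assumptions: the number of AMPs, the round-robin dealing of blocks, and the CRC32 hash are all stand-ins, and Teradata's real row hashing is different.

```python
import zlib

NUM_AMPS = 4  # hypothetical system size


def row_hash(primary_index_value):
    """Stand-in hash; Teradata's real hashing algorithm is different."""
    return zlib.crc32(str(primary_index_value).encode("utf-8"))


def phase1(records, block_size=2):
    """Deal blocks to AMPs, then redistribute each record by its row hash."""
    received = [[] for _ in range(NUM_AMPS)]
    for block_no, start in enumerate(range(0, len(records), block_size)):
        received[block_no % NUM_AMPS].extend(records[start:start + block_size])

    owned = [[] for _ in range(NUM_AMPS)]
    for amp_rows in received:
        for record in amp_rows:
            h = row_hash(record[0])
            owned[h % NUM_AMPS].append((h, record))
    return owned  # each AMP now holds its own rows, still unsorted


def phase2(owned):
    """Each AMP sorts its rows into row-hash sequence."""
    return [sorted(amp_rows) for amp_rows in owned]


# Hypothetical records: (employee number, name)
records = [(n, f"name{n}") for n in range(101, 111)]
amps = phase2(phase1(records))
```

After `phase1`, every record sits on the AMP selected by its hash but in arrival order; `phase2` is what puts each AMP's rows into row-hash sequence.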
Example
Create a text file with the following records and name the file as employee.txt.
Following is a sample FastLoad script to load the above file into Employee_Stg table.
Executing a FastLoad Script
Once the input file employee.txt is created and the FastLoad script is named as EmployeeLoad.fl, you can run the FastLoad script using the following command in UNIX and Windows.
Once the above command is executed, the FastLoad script will run and produce the log. In the log, you can see the number of records processed by FastLoad and status code.
FastLoad Terms
Following is the list of common terms used in FastLoad script.
LOGON − Logs into Teradata and initiates one or more sessions.
DATABASE − Sets the default database.
BEGIN LOADING − Identifies the table to be loaded.
ERRORFILES − Identifies the two error tables that need to be created/updated.
CHECKPOINT − Defines when to take a checkpoint.
SET RECORD − Specifies whether the input file format is formatted, binary, text or unformatted.
DEFINE − Defines the input file layout.
FILE − Specifies the input file name and path.
INSERT − Inserts the records from the input file into the target table.
END LOADING − Initiates phase 2 of the FastLoad. Distributes the records into the target table.
LOGOFF − Ends all sessions and terminates FastLoad.