-- Read *csv File from IFS
With x as (-- Split IFS File into Rows (at CRLF)
           Select Ordinal_Position as RowKey, Element as RowInfo
             from Table(SysTools.Split(Get_Clob_From_File('/home/Hauser/Employee.csv'), x'0D25')) a
            Where Trim(Element) > ''),
     y as (-- Split IFS File Rows into Columns (and remove leading/trailing double quotes ")
           Select x.*, Ordinal_Position ColKey,
                  Trim(B '"' from Element) as ColInfo
             from x cross join Table(SysTools.Split(RowInfo, ',')) a)
-- Return the Result as Table
Select RowKey,
       Min(Case When ColKey = 1 Then ColInfo End) EmployeeNo,
       Min(Case When ColKey = 2 Then ColInfo End) Name,
       Min(Case When ColKey = 3 Then ColInfo End) FirstName,
       Min(Case When ColKey = 4 Then ColInfo End) Address,
       Min(Case When ColKey = 5 Then ColInfo End) Country,
       Min(Case When ColKey = 6 Then ColInfo End) ZipCode,
       Min(Case When ColKey = 7 Then ColInfo End) City
  From y
 Where RowKey > 1 -- Remove header
 Group By RowKey;
BirgittaHauser commented Feb 2, 2020 via email
Thanks, Birgitta. I think MODIFIES SQL DATA is also required here. The ACS wizard defaults to READS SQL DATA, and without MODIFIES SQL DATA the system returns SQL0577 for this routine.
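For illustration, a minimal sketch of where that clause would go. The function name, parameter, and result columns are invented for this example and are not the actual parsecsv() definition. It assumes Get_Clob_From_File and SysTools.Split behave as in the gist above, and that the release in use accepts this form of SQL table function.

```sql
-- Sketch only: ParseCsvSketch, ParFile and the result columns are hypothetical
Create Or Replace Function ParseCsvSketch (ParFile VarChar(256))
       Returns Table (RowKey  Integer,
                      RowInfo VarChar(4096))
       Language SQL
       Modifies SQL Data   -- instead of the default Reads SQL Data
Return
   -- Same row split as the first CTE in the gist (EBCDIC CRLF = x'0D25')
   Select Ordinal_Position, Cast(Element as VarChar(4096))
     from Table(SysTools.Split(Get_Clob_From_File(ParFile), x'0D25')) a
    Where Trim(Element) > '';
```

It could then be queried like any other table function, for example with `Select * from Table(ParseCsvSketch('/home/Hauser/Employee.csv')) a`.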
Also, as a matter of taste, I would name the columns after their Excel counterparts (A, B, C, ..., AA, AB, AC, ...) so they map back to the source columns more easily.
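A small sketch of what that naming could look like in the final Select. The CTE y is replaced here by a few literal placeholder rows so that the statement stands alone; the cell values are invented.

```sql
With y (RowKey, ColKey, ColInfo) as (
     -- Placeholder rows standing in for the split CSV cells (invented values)
     Values (2, 1, 'r2c1'), (2, 2, 'r2c2'), (2, 3, 'r2c3'),
            (3, 1, 'r3c1'), (3, 2, 'r3c2'), (3, 3, 'r3c3'))
Select RowKey,
       Min(Case When ColKey = 1 Then ColInfo End) as A,
       Min(Case When ColKey = 2 Then ColInfo End) as B,
       Min(Case When ColKey = 3 Then ColInfo End) as C
  From y
 Group By RowKey
 Order By RowKey;
```

Following the same scheme, column 27 would be named AA, column 28 AB, and so on.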
One also needs to be aware of the size limit that split() currently imposes.
According to the parameter definition, INPUT_LIST is defined as CLOB(1048576). I have a relatively large CSV file (40 columns, just under 30K rows, just under 7 MB in size). The first CTE in the new UDTF parsecsv() cuts the processing short when run against that file.
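One way to see in advance whether a file runs into this limit is to compare its length with the INPUT_LIST length quoted above. This is a rough sketch: it reuses the path from the gist, the column names are made up, and, like the gist itself, it may need to run under commitment control because of Get_Clob_From_File.

```sql
-- Sketch: compare the file length with the INPUT_LIST length of split()
Select FileLength,
       1048576 as SplitInputLength,
       Case When FileLength > 1048576
            Then 'longer than INPUT_LIST'
            Else 'fits into INPUT_LIST'
       End as Verdict
  from (Select Length(Get_Clob_From_File('/home/Hauser/Employee.csv')) as FileLength
          from SysIBM.SysDummy1) a;
```

For a file of just under 7 MB, FileLength would come out at roughly six to seven times the INPUT_LIST length.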
The input parameter of split() could probably be enlarged to ease this limit, but I would be concerned about the performance impact. The new UDTF parsecsv() is not very fast as it is.
It seems to me that this routine, while very cool, can only be deployed against relatively small files. Birgitta, would you agree, or am I missing something?