
Create a field based on multiple rows (with the same identifier) in UDJC (PDI version 8.0)

Question asked by Peter Parker on Jul 25, 2018
Latest reply on Aug 9, 2018 by John Craig


I am taking a comma-delimited text file as input and using PDI 8.0 to output a new text file with changes and new fields.


I have used a UDJC to generate a new field based on calculations from other fields, but I've only learned how to look at a single row at a time.


Here is an example of the rows for a single identifier (imagine a CSV with multiple rows for every ID; this is one of the IDs):


EmployeeID, FTE, Job ID, Rownum
99, 0.25, 3, 101
99, 0.5, 3, 102
99, 0.25, 7, 103


I'd like to collect data from all the rows for an ID and do a custom calculation that fills a "Keep Row" field: Y for exactly one row per ID and N for the others. We plan a fairly complex de-duplication that involves many fields, and creating such a field to use with a later "Filter" step seems like the way to do it. Are there examples of a similar PDI job I can look at (with a UDJC that looks at more than one row at a time), or can anyone point me in the right direction?
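
To make the goal concrete, here is a rough sketch in plain Java (not actual UDJC code) of the kind of per-ID logic I have in mind. The class name KeepRowSketch, the Row holder, and the rule in beats() (highest FTE wins, ties go to the lowest Rownum) are all made up for illustration; the real rule would involve many more fields. Inside a UDJC I assume the same idea would mean holding on to the rows of one ID before writing any of them out, but that part is not shown here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeepRowSketch {

    // One input row; the fields mirror the sample data above.
    static final class Row {
        final int employeeId;
        final double fte;
        final int jobId;
        final int rownum;
        String keepRow = "N"; // the new field, defaults to N

        Row(int employeeId, double fte, int jobId, int rownum) {
            this.employeeId = employeeId;
            this.fte = fte;
            this.jobId = jobId;
            this.rownum = rownum;
        }
    }

    // Hypothetical placeholder rule: highest FTE wins, ties go to the lowest Rownum.
    static boolean beats(Row a, Row b) {
        if (a.fte != b.fte) {
            return a.fte > b.fte;
        }
        return a.rownum < b.rownum;
    }

    // Sets keepRow to "Y" on exactly one row per EmployeeID and "N" on the rest.
    static void markKeepRows(List<Row> rows) {
        Map<Integer, Row> bestPerId = new HashMap<>();
        for (Row r : rows) {
            r.keepRow = "N";
            Row current = bestPerId.get(r.employeeId);
            if (current == null || beats(r, current)) {
                bestPerId.put(r.employeeId, r);
            }
        }
        for (Row winner : bestPerId.values()) {
            winner.keepRow = "Y";
        }
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(99, 0.25, 3, 101));
        rows.add(new Row(99, 0.5, 3, 102));
        rows.add(new Row(99, 0.25, 7, 103));

        markKeepRows(rows);

        for (Row r : rows) {
            System.out.println(r.employeeId + ", " + r.fte + ", " + r.jobId
                    + ", " + r.rownum + ", " + r.keepRow);
        }
    }
}
```

With the sample rows above and this placeholder rule, row 102 would get the Y and rows 101 and 103 would get an N. This version does not need the input sorted by EmployeeID because it groups through a map, at the cost of holding every row in memory.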