I am taking a comma-delimited text file as input and using PDI 8.0 to output a new text file with modified and additional fields.
I have used a User Defined Java Class (UDJC) step to generate a new field from calculations on other fields, but I have only learned how to look at a single row at a time.
Here is an example of the rows for a single identifier (imagine a CSV with multiple rows for every ID; this is one of the IDs):
EmployeeID, FTE, Job ID, Rownum
99,0.25, 3, 101
99,0.5, 3, 102
99,0.25, 7, 103
I'd like to collect all the rows for each ID and run a custom calculation that fills a "Keep Row" field, with Y on exactly one row per ID and N on the others. We plan a very complex way of removing the duplicates that involves many fields, and creating such a field to use with "Filter" later seems like the way to do it. Are there examples of a similar PDI job I can view (with a UDJC that looks at more than one row at a time), or can anyone point me in the right direction?
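
To make the goal concrete, here is a rough sketch of the logic in plain Java. It is not PDI code and does not use the UDJC API: the class and method names are made up for illustration, and the rule it applies (keep the row with the highest FTE, breaking ties by the lowest Rownum) is only a placeholder assumption standing in for the real, more complex rule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch, not PDI API code. All names here are hypothetical.
class KeepRowSketch {

    // Each row is {EmployeeID, FTE, Job ID, Rownum} as trimmed strings.
    // Returns the rows in their original order with a "Keep Row" value appended.
    static List<String[]> markKeepRows(List<String[]> rows) {
        // Pass 1: for each EmployeeID, remember the index of the row to keep.
        Map<String, Integer> keepIndex = new HashMap<>();
        for (int i = 0; i < rows.size(); i++) {
            String id = rows.get(i)[0];
            Integer best = keepIndex.get(id);
            if (best == null || isBetter(rows.get(i), rows.get(best))) {
                keepIndex.put(id, i);
            }
        }
        // Pass 2: append Y to the chosen row of each ID and N to all others.
        List<String[]> out = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            String[] row = rows.get(i);
            String[] marked = Arrays.copyOf(row, row.length + 1);
            marked[row.length] = (keepIndex.get(row[0]) == i) ? "Y" : "N";
            out.add(marked);
        }
        return out;
    }

    // Placeholder rule (an assumption): higher FTE wins, ties go to lower Rownum.
    static boolean isBetter(String[] candidate, String[] current) {
        double candidateFte = Double.parseDouble(candidate[1]);
        double currentFte = Double.parseDouble(current[1]);
        if (candidateFte != currentFte) {
            return candidateFte > currentFte;
        }
        return Integer.parseInt(candidate[3]) < Integer.parseInt(current[3]);
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"99", "0.25", "3", "101"});
        rows.add(new String[] {"99", "0.5", "3", "102"});
        rows.add(new String[] {"99", "0.25", "7", "103"});
        for (String[] row : markKeepRows(rows)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

With the three sample rows above, this placeholder rule marks Rownum 102 as Y and Rownums 101 and 103 as N. What I am missing is how to buffer and compare rows like this inside a UDJC rather than in standalone Java.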