Kettle PDI how to define parameters before Table input - kettle

I'm converting data from one database to another with a slightly different structure.
In my flow at some point I need to read data from the first database filtering on the id coming from previous steps.
This is the image of my flow
The last step is where I need to filter data. The query is:
SELECT e.*,UNIX_TIMESTAMP(v.dataInserimento)*1000 as timestamp
FROM verbale_evento ve JOIN evento e ON ve.eventi_id=e.id
WHERE ve.Verbale_id=? AND e.titolo='Note verbale'
Unfortunately, the value for ve.Verbale_id comes from a column of the first table (first step). How can I filter by that field?
Right now I get an error:
2017/12/22 15:01:00 - Error setting value #2 [Boolean] on prepared statement
2017/12/22 15:01:00 - Parameter index out of range (2 > number of parameters, which is 1).
I need to do this query at the end of the entire transformation.

You can pass previous rows of data as parameters.
However, the number of parameter placeholders (?) in the Table input query must match the number of fields in the incoming data stream. Order matters too.
Try trimming the data stream down to only the field you want to pass, using a Select values step, and then choose that step in the "get data from" box near the bottom of the Table input dialog. Also check "execute for each input row".
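The placeholder rule is not specific to PDI; prepared statements in general behave this way. As a rough analogy only (Python's built-in sqlite3 instead of Kettle, with a made-up in-memory table and a hypothetical two-field incoming row), binding two values to a query with a single ? fails, while trimming the row to the one wanted field works:

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the real source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE verbale_evento (Verbale_id INTEGER, eventi_id INTEGER)")
conn.execute("INSERT INTO verbale_evento VALUES (7, 100)")

query = "SELECT eventi_id FROM verbale_evento WHERE Verbale_id = ?"

# Hypothetical incoming row with two fields, while the query has one placeholder.
incoming_row = (7, True)

try:
    conn.execute(query, incoming_row)
    mismatch_error = None
except sqlite3.ProgrammingError as exc:
    # sqlite3 rejects a binding count that differs from the placeholder count.
    mismatch_error = exc

# Rough equivalent of a Select values step: keep only the id field.
trimmed_row = incoming_row[:1]
matched = conn.execute(query, trimmed_row).fetchall()
```

The exact error text differs between drivers, but the cause is the same as in the log above: more values supplied than placeholders in the statement.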

Sorting in Redshift table

I created a test table with two columns, id and name, and ran the following commands.
CREATE TABLE "public"."test1"(id integer, name character varying(256) encode lzo)distkey(id) compound sortkey(id);
INSERT INTO test1 VALUES (1,'First'),(2,'Second'),(4,'Fourth'),(3,'Third');
VACUUM test1
However, on running SELECT * FROM test1; I receive the following data:
Shouldn't the returned data be sorted according to id? If not, how can I make sure that a SELECT query without an ORDER BY clause returns the data sorted according to the key, id?
You have to use ORDER BY. From the docs:
When a query doesn't contain an ORDER BY clause, the system returns result sets with no predictable ordering of the rows. The same query run twice might return the result set in a different order.
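A minimal sketch of the fix, assuming SQLite (via Python's sqlite3) as a local stand-in for Redshift; the distkey, sortkey and encoding options are left out because SQLite does not support them. The only point it shows is that the ordering comes from the explicit ORDER BY, not from how the rows were inserted:

```python
import sqlite3

# Assumption: in-memory SQLite replaces Redshift for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test1 (id INTEGER, name VARCHAR(256))")
conn.executemany(
    "INSERT INTO test1 VALUES (?, ?)",
    [(1, "First"), (2, "Second"), (4, "Fourth"), (3, "Third")],
)

# An explicit ORDER BY is what guarantees the row order.
rows = conn.execute("SELECT id, name FROM test1 ORDER BY id").fetchall()
ids = [row[0] for row in rows]
```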

How do I Loop through a multi-value People Field or Lookup field in SharePoint 2013 designer using REST

I have a multi-valued people picker and a multi-valued Lookup field, and I need to read all of their entries in a 2013 workflow. I know how to create a workflow that retrieves the data and iterates through each list item using REST and a dictionary. Given that I'm iterating through each item, I now need to iterate through each multi-valued field.
In the past, I have done this using a loop iterator and a second dictionary entry representing the data in the multi-valued field, but I don't have access to that code anymore. I could use a loop and the find function to parse through my responseContent, but this is not reliable since my responseContent will have multiple records in it, and I know it can be done using a second dictionary entry.
My REST query is:
_api/lists/GetByTitle('EmailSetup')/Items?$select=EmailYN,EmailSubject,EmailBody,EmailTo/EMail,Emailcc/EMail,EmailToWorkflowPerson/Title,EmailccWorkflowPersons/Title&$filter=(Title%20eq%20%27BSM%20Review%27)%20and%20(WorkflowName%20eq%20%27ProcessBSMRequests%27)&$expand=EmailTo,Emailcc,EmailToWorkflowPerson,EmailccWorkflowPersons
Where my multi-valued fields are Emailcc and EmailccWorkflowPersons (people picker and lookup, respectively).
I have my first dictionary as the following data structure that captures the requestHeaders
Accept String application/json;odata=verbose
Content-Type String application/json;odata=verbose
In my first loop I get all my attributes, but I'm not certain how to get the multi-valued fields Emailcc and EmailccWorkflowPersons.
Yes, I can parse through my response, but there should be a better way: put these multi-value fields into a structure and then loop through it.
What I need to know is what that structure (dictionary) is, how to get the data into it, and how to loop through it.
The final result should be of this sort (pseudocode), where Index is which record I am on and Index2 is which multi-value I am on.
d/results([%Variable: Index%])/Emailcc/Email[%Variable: Index2%])xxx
With a lot of debugging I have gotten half my answer and maybe someone can help me with the other half. The data structure of the data when it comes back via the REST looks like (some masking of our own data):
responseContent={"d":{"results":[{
"__metadata":{"id":"Web\/Lists(guid'c7bb71c8-a9dd-495f-aa5f-4dcacdf8db5c')\/Items(1)","uri":"https:\/\/xxxxx.xxxxxx.xxxxxxxxx.xxx\/hc\/teams\/MES\/_api\/Web\/Lists(guid'c7bb71c8-a9dd-495f-aa5f-4dcacdf8db5c')\/Items(1)","etag":"\"13\"","type":"SP.Data.EmailSetupListItem"},
"EmailTo":{"__metadata":{"id":"b493bee4-ec1a-4b76-a028-11766bdb7e5b","type":"SP.Data.UserInfoItem"},"EMail":"xyxee.dyff#homeward.com"},
"EmailToWorkflowPerson":{"__deferred":{"uri":"https:\/\/xxxxx.xxxxxx.xxxxxxxxx.xxx\/hc\/teams\/MES\/_api\/Web\/Lists(guid'c7bb71c8-a9dd-495f-aa5f-4dcacdf8db5c')\/Items(1)\/EmailToWorkflowPerson"}},
"Emailcc":{"results":[{"__metadata":{"id":"790a690a-515b-4d07-bba3-73bf325fbbed","type":"SP.Data.UserInfoItem"},"EMail":"xyxee.dyffns1#homeward.com"},{"__metadata":{"id":"3d77e75c-5fa8-4df6-937c-97e572714843","type":"SP.Data.UserInfoItem"},"EMail":"xyxee.dyffr#homeward.com"}]},
"EmailccWorkflowPersons":{"results":[{"__metadata":{"id":"06582ed9-0910-4932-9b43-0cfb072942c7","type":"SP.Data.WorkflowPersonsListItem"},"Title":"Assistant Administrator"},{"__metadata":{"id":"13d03566-1703-4550-a21f-08ea286d4940","type":"SP.Data.WorkflowPersonsListItem"},"Title":"Initiator"}]},
"EmailYN":"No",
"EmailSubject":"BSM Request # %%ID%%",
"EmailBody":"<div class=\"ExternalClass645790473F7D4B62BE6224DD7B93990F\">%%IDLINK%%<br><\/div><div class=\"ExternalClass645790473F7D4B62BE6224DD7B93990F\">and the BSM# %%ID%%<br><\/div><div class=\"ExternalClass645790473F7D4B62BE6224DD7B93990F\"><br><\/div>"
}]}}
I created another dictionary variable, EmailccResults, just like the first one, to store the multi-value Emailcc addresses.
Then the following Get:
Get d/results([%Variable: Index%])/Emailcc/results from Variable: responseContent (Output to Variable: EmailccResults)
To get the record count, I use Count Items in EmailccResults.
I set my second index to start at zero and loop up to the count of items in EmailccResults.
To set my intermediate email address (getting one value at a time from the multi-value people picker), I use:
Get d/results([%Variable: Index%])/Emailcc/results([%Variable: Index2%])/EMail from Variable: responseContent (Output to Variable: EmailCc)
Then I increment the Index2 variable and go to the next record. This works perfectly.
Now my problem is I have a multi-value Lookup that is included in this query (see what the results are above). I attempt the same logic and I am successfully getting the count, but not the Title fields.
My get is:
Get d/results([%Variable: Index%])/EmailccWorkflowPersons/results from Variable: responseContent (Output to Variable: EmailccResults)
My actual assignment is:
Get d/results([%Variable: Index%])/EmailccWorkflowPersons/results([%Variable: Index2%])/Title from Variable: responseContent (Output to Variable: tmpvar)
Update: the Lookup works exactly the same way. My problem was that my Get above had some blank lines in the text box.
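For readers who want to see the shape of the two nested loops outside of SharePoint Designer, here is a small sketch in Python. It is not workflow code; it only mirrors the d/results(Index)/Field/results(Index2) paths against a trimmed-down copy of the response, with placeholder e-mail addresses that are assumptions rather than real data:

```python
import json

# Trimmed, hypothetical copy of the REST response (odata=verbose shape).
response_content = json.loads("""
{"d": {"results": [{
  "Emailcc": {"results": [{"EMail": "first@example.com"},
                          {"EMail": "second@example.com"}]},
  "EmailccWorkflowPersons": {"results": [{"Title": "Assistant Administrator"},
                                         {"Title": "Initiator"}]}
}]}}
""")

cc_emails = []
cc_titles = []
for item in response_content["d"]["results"]:                 # outer loop: Index
    # d/results(Index)/Emailcc/results(Index2)/EMail
    for person in item["Emailcc"]["results"]:                  # inner loop: Index2
        cc_emails.append(person["EMail"])
    # d/results(Index)/EmailccWorkflowPersons/results(Index2)/Title
    for lookup in item["EmailccWorkflowPersons"]["results"]:   # inner loop: Index2
        cc_titles.append(lookup["Title"])
```

Both multi-valued fields share the same "results" wrapper, which is why the people picker and the Lookup can be read with the same pattern.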

Kettle PDI how to pass multiple parameters not used in Table Input

I'm converting data from one database to another with a slightly different structure. In my flow at some point I need to read data from the first database filtering on the id coming from previous steps.
This is the image of my flow:
In the step "ZtlBus note" the query is:
SELECT e.*,UNIX_TIMESTAMP(v.dataInserimento)*1000 as timestamp
FROM verbale_evento ve JOIN evento e ON ve.eventi_id=e.id
WHERE ve.Verbale_id=? AND e.titolo='Note verbale'
Because I have just one parameter, I use a Select values step right before it. Unfortunately, after the Table input I need other fields coming from previous steps (the Audit step), as marked in the picture.
I'm wondering how I can pass these fields after Table input. Some advice is appreciated.
If you use the "Database join" step instead of the Table input step, you will be able to keep the previous values of your transformation.
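Conceptually, this kind of step runs the parameterized query once per incoming row and appends the query's columns to that row, so the earlier fields are not lost. The sketch below imitates that behaviour in plain Python with sqlite3; it is an illustration of the idea, not PDI code, and the sample tables, the audit_user field and the row values are all assumptions:

```python
import sqlite3

# Assumption: small in-memory tables stand in for the source database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE evento (id INTEGER, titolo TEXT);
CREATE TABLE verbale_evento (Verbale_id INTEGER, eventi_id INTEGER);
INSERT INTO evento VALUES (100, 'Note verbale'), (101, 'Altro');
INSERT INTO verbale_evento VALUES (7, 100), (7, 101);
""")

query = """
SELECT e.id, e.titolo
FROM verbale_evento ve JOIN evento e ON ve.eventi_id = e.id
WHERE ve.Verbale_id = ? AND e.titolo = 'Note verbale'
"""

# Hypothetical incoming rows; audit_user stands in for the extra Audit fields.
incoming_rows = [{"verbale_id": 7, "audit_user": "alice"}]

joined_rows = []
for row in incoming_rows:
    # Only the id is bound to the placeholder; the other fields ride along.
    for evento_id, titolo in conn.execute(query, (row["verbale_id"],)):
        joined_rows.append({**row, "evento_id": evento_id, "titolo": titolo})
```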

How to Query Large Sharepoint 2013 Lists in Infopath 2010?

I'm designing an Infopath form to help guide people in a data creation process. The form needs to draw from a Sharepoint list that contains around 19,000 rows, each with six columns that contain attributes (Column 1 = Attribute A, Column 2 = Attribute B, etc.). I've reduced the first three columns to their own lists, which contain only a few hundred unique entries each, if that. When I get to Column 4, there are 8,000 unique entries, which makes querying the list outright impossible.
In an attempt to get around the item limitation, I've created an Infopath form with a data connection to the list (which does not automatically query when the form is loaded). Additionally, I've added drop-downs that set values for the queryFields of the secondary data source (one for Column 1, another for Column 2, and another for Column 3). On the last drop-down, I set an action to query the database, but I still get the error regarding limitations and that rules cannot be applied.
Is there any way to "pre-filter" the data connection so that I can bypass the limitation by only drawing the data I need? Am I going about this the right way?
Any guidance would be greatly appreciated.
Are you able to add indexes to your list columns that you intend to query on? I've found that I can get around the error message on list limits if I go to the list and add an index for the columns that I will be setting as query fields prior to running my query data connection.

multiple strings as argument in table input

I'm trying to use SQL like select column from table where column in (?), where ? should be a concatenation of strings. I wrote a script that concatenates rows into something like 'string','secondstring' and so on.
I know I should just use more parameters, but until the moment of execution I don't know how many arguments there will be, and there are hundreds of them each time.
I'd like to do it in one SQL statement, so putting every argument in its own row and checking "execute for each row" isn't ideal either.
Any clue how to do this?
You can use loops and variables in Kettle.
For example, create a job that contains:
1) a transformation where you store the concatenation of all input rows in an environment variable (setVariable("varname", value, "r"), where "r" is the parameter that makes the variable accessible to the parent job).
2) a transformation that runs the desired query with variable replacement (SELECT column FROM table WHERE column IN (${varname})).
If you need I can send the example files.
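The two transformations can be mimicked outside Kettle to see what the final SQL looks like. This is a sketch under assumptions: Python's string.Template plays the role of the ${varname} replacement, sqlite3 stands in for the real database, and the items table, its name column and the quote helper are hypothetical:

```python
import sqlite3
from string import Template

# Assumption: in-memory SQLite table used only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [("string",), ("secondstring",), ("other",)],
)

# Step 1: concatenate all input rows into one quoted, comma-separated value.
input_rows = ["string", "secondstring"]

def quote(value):
    # Double any single quote so the value stays a valid SQL string literal.
    return "'" + value.replace("'", "''") + "'"

varname = ",".join(quote(value) for value in input_rows)

# Step 2: substitute the variable into the query text, then run it once.
sql = Template(
    "SELECT name FROM items WHERE name IN (${varname}) ORDER BY name"
).substitute(varname=varname)
found = [row[0] for row in conn.execute(sql)]
```

Because the values are pasted into the SQL text rather than bound as parameters, the quoting step matters; with untrusted input, generating one ? per value and binding them is the safer variant.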