Multiple strings as argument in Table input - Kettle

I'm trying to use SQL like SELECT column FROM table WHERE column IN (?),
where ? should be a concatenation of strings. I wrote a script that concatenates the rows into something like 'string','secondstring' and so on.
I know I should just use more parameters, but at execution time I don't know how many arguments there will be, and there are hundreds of them each time.
I'd like to do it in one SQL statement, so putting every argument in its own row and checking "execute for each row" isn't ideal either.
Any clue how to do this?

You can use loops and variables in Kettle.
For example:
- Create a job that contains:
1) A transformation where you store the concatenation of all the input rows in a variable, using setVariable("varname", value, "r"); "r" is the scope parameter that makes the variable accessible to the parent job.
2) A transformation that runs the desired query with variable substitution: SELECT column FROM table WHERE column IN (${varname}).
If you need, I can send the example files.
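As a minimal sketch of the Table input in that second transformation (assuming the variable is named varname and the step's "Replace variables in script?" option is checked, so ${varname} is substituted with the quoted, comma-separated list before the query is sent to the database):
-- Table input step, "Replace variables in script?" checked
SELECT column
FROM table
WHERE column IN (${varname})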

Related

Kettle PDI: how to define parameters before Table input

I'm converting data from one database to another with a slightly different structure.
In my flow, at some point I need to read data from the first database, filtering on the id coming from previous steps.
This is the image of my flow
The last step is where I need to filter data. The query is:
SELECT e.*,UNIX_TIMESTAMP(v.dataInserimento)*1000 as timestamp
FROM verbale_evento ve JOIN evento e ON ve.eventi_id=e.id
WHERE ve.Verbale_id=? AND e.titolo='Note verbale'
Unfortunately ve.Verbale_id is a column of the first table (first step). How can I filter by that field?
Right now I get this error:
2017/12/22 15:01:00 - Error setting value #2 [Boolean] on prepared statement
2017/12/22 15:01:00 - Parameter index out of range (2 > number of parameters, which is 1).
I need to do this query at the end of the entire transformation.
You can pass previous rows of data as parameters.
However, the number of parameter placeholders in the Table input query must match the number of fields of the incoming data stream. Also, order matters.
Try trimming the data stream down to only the field you want to pass, using a Select values step, and then choose that step in the "get data from" box near the bottom of the Table input. Also, check the "execute for each input row" option.
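A rough sketch of that setup (step names and exact option labels are assumptions and may vary slightly by PDI version; the query is reproduced from the question):
-- Select values step: keep only the Verbale_id field in the stream
-- Table input step:
--   Insert data from step : the Select values step above
--   Execute for each row  : checked
-- With exactly one incoming field, the single ? below binds to Verbale_id:
SELECT e.*,UNIX_TIMESTAMP(v.dataInserimento)*1000 as timestamp
FROM verbale_evento ve JOIN evento e ON ve.eventi_id=e.id
WHERE ve.Verbale_id=? AND e.titolo='Note verbale'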

Power Query: replacing a cell's content with the previous one under conditions

On a regular basis, I have to clean a CSV file.
I'd like to automate this task, which I'm trying to achieve with Power Query.
But I'm stuck on one step: in this file, some dates are wrongly filled (always with the same value, so I can easily identify them), and I'd like to replace them with the previous row's value.
Is there an M language function allowing this kind of operation?
Thanks!
Right now, Table.FillDown only fills in null values. Maybe you can first replace those wrong values with null, then call Table.FillDown on those columns?
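A minimal M sketch of that idea, assuming a column named "Date", #date(1900,1,1) as the known wrong marker value, and a local CSV path (adjust all three to your file):
let
    Source = Csv.Document(File.Contents("C:\data\input.csv"), [Delimiter = ",", Encoding = 65001]),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    Typed = Table.TransformColumnTypes(Promoted, {{"Date", type date}}),
    // Turn the known wrong value into null...
    Nulled = Table.ReplaceValue(Typed, #date(1900, 1, 1), null, Replacer.ReplaceValue, {"Date"}),
    // ...so Table.FillDown can copy the previous row's date over it
    Filled = Table.FillDown(Nulled, {"Date"})
in
    Filled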

Using RegEx in SSIS

I currently have a package pulling data from an Excel file, but when pulling the data out I get rows I do not want. So I need to extract everything from the 'ID' field that has any sort of letter in it.
I need to be able to run a RegEx pattern such as "%[a-zA-Z]%" to pull out that data, but the current limitations of the Conditional Split don't let me do that. Any ideas on how this can be done?
At the core of the logic, you would use a Script Transformation, as that's the only place you can access regular expressions.
You could simply add a second column to your data flow, IDCleaned, and that column would only contain cleaned values or NULL. You could then use the Conditional Split to filter good rows vs. bad (see: System.Text.RegularExpressions.Regex.Replace error in C# for SSIS).
If you don't want to add another column, you can set your current ID column to be ReadWrite for the Script and then update it in place. Perhaps adding a boolean column might make the Conditional Split logic easier at this point.
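A minimal sketch of the Script Transformation body for the boolean-flag variant (the input column is assumed to be called ID and to come through as a string; HasLetter is a hypothetical DT_BOOL output column added to the script component):
using System.Text.RegularExpressions;

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Flag rows whose ID contains at least one letter; the downstream
    // Conditional Split can then route rows on the HasLetter flag.
    Row.HasLetter = !Row.ID_IsNull && Regex.IsMatch(Row.ID, "[a-zA-Z]");
}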

Is there a way to get each row's value from a database into an array?

Say I have a query like the one below. What would be the best way to put each value into an array if I don't know how many results there will be? Normally I would do this with a loop, but I have no idea how many results there are. Would I need to run another query to count the results first?
<CFQUERY name="alllocations" DATASOURCE="#DS#">
SELECT locationID
FROM tblProjectLocations
WHERE projectID = '#ProjectName#'
</CFQUERY>
Depending on what you want to do with the array, you can just refer to the column directly for most array operations, eg:
i = arrayLen(alllocations["locationID"]);
Using that notation will work for most array operations.
Note that this doesn't "create an array"; it's simply that a query column (a coldfusion.sql.QueryColumn object) is close enough to a CFML array for CF to convert it to one when an array is needed. Hence the column can be passed to an array function.
What one cannot do is this:
myArray = q["locationID"];
This is because, by default, CF will treat q["locationID"] as a string if it can, and the string value is what's in the first row of the locationID column in the q query. It's only when an array is actually required that CF will convert it to an array instead. This is basically how loose typing works.
So if you just need to pass your query column to some function that expects an array, you can use the syntax above. If you want to actually put the column into a variable, then you will need to do something like this:
myArray = listToArray(valueList(q.locationID));
NB: make sure you use <cfqueryparam> on your filter values instead of hard-coding them into your SQL statement.
myquery.column.toArray() is also a good undocumented choice.
Since you're only retrieving 1 field value from the query, you could use ValueList() to convert the query results into a comma-delimited list of locationIds, then use listToArray() to change that list into an array.
If you were retrieving multiple field values from the query, then you'd want to loop through the query, copy all the field values from the given row into a struct, and then add that struct to an array using arrayAppend().
(If you're not familiar with these functions, you can look them up in the Adobe docs or on cfquickdocs.com).
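For the multi-column case described above, a rough sketch (the query name q and the locationName column are made up for illustration):
<cfset rows = []>
<cfloop query="q">
    <cfset row = {}>
    <cfset row.locationID = q.locationID>
    <cfset row.locationName = q.locationName>
    <cfset arrayAppend(rows, row)>
</cfloop>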

QuerySetCell on a column with a numeric name in ColdFusion

I'm attempting to use QuerySetCell to change the value of a specific column in a query object, and have been receiving this error:
Column names must be valid variable names. They must start with a letter and can only include letters, numbers, and underscores.
The reason for this error, and the complication here, is that the columns I am trying to update have some integers as names, taken from a separate record's key/ID. For example, the query may contain three columns with names: "6638, 6639, 6640".
Now, I understand why this error is occurring (though not necessarily why CF has this limitation), however I cannot come up with a workaround. The further complication is that I cannot change how the initial query sets its column names, and I need to preserve the column names and their order for when I convert the query to a JSON string and update my results table using the JSONified query.
Has anyone encountered this issue before, and if so how were you able to work around it, or were you forced to change how the columns were named in your initial query?
I'm using CF8 and have the ability to edit the JSONified query after it is returned from my Ajax handler, if that makes a difference.
You can use bracket notation to set the values in a query (at least you can in CF9 - I do not have CF8 installed to test).
The syntax is pretty simple:
<cfset queryName[columnName][row] = "some new value" />
From your example, you could use this:
<cfset myQuery["6638"][1] = "moo" />
This will set the value of the '6638' column in the first row to 'moo'. If you have multiple rows being returned, you would need to set each row.
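If you do need to set every row, a small sketch of that loop (still using the example column name "6638"):
<cfloop from="1" to="#myQuery.recordCount#" index="r">
    <cfset myQuery["6638"][r] = "moo">
</cfloop>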