I have a cpp class that implements a Netezza User-defined function (documentation here). It takes an argument that will be a string in some date format, and converts it to YYYYMMDD format. If it isn't a valid date, it will return "99991231". Whenever I run the code on some tables, I get different outputs every time for the same inputs. I assume there is some memory issue that I am not seeing.
Logically, we set the char array retval equal to the output of the date command. If it gave a null output, we set to "99991231". Then we set a temp char array to the first 9 bytes of retval (last one being the null terminator). Then we memcpy into ret->data (a char ptr of the struct we must return).
#include <stdarg.h>
#include <string.h>
#include "udxinc.h"
#include "udxhelpers.h"
using namespace nz::udx_ver2;
class Dateconvert: public Udf
{
public:
Dateconvert(UdxInit *pInit) : Udf(pInit){}
~Dateconvert(){}
static Udf* instantiate(UdxInit *pInit);
virtual ReturnValue evaluate()
{
StringReturn* ret = stringReturnInfo();
StringArg *str;
str = stringArg(0);
int lengths = str->length;
char *datas = str->data;
string tempData = datas;
string shell_arg = tempData;
shell_arg = "'" + shell_arg + "'";
string cmd="date -d " + shell_arg + " +%Y%m%d 2>/dev/null";
FILE *ls = popen(cmd.c_str(), "r");
char retval[100];
retval[0]='n';
fgets(retval, sizeof(retval), ls);
if(!isdigit(retval[0]))
{
strcpy(retval,"99991231");
}
pclose(ls);
char temp1[9];
memcpy(temp1, retval, 8);
temp1[8]='\0';
ret->size = 9;
memcpy(ret->data, temp1, 9);
NZ_UDX_RETURN_STRING(ret);
}
};
Udf* Dateconvert::instantiate(UdxInit *pInit)
{
return new Dateconvert(pInit);
}
When I run the UDF on one distinct value in Netezza, it gives me the expected output. However, when I run it over multiple columns, the output is sometimes correct, sometimes wrong, seemingly randomly. I assume this has to be an internal memory issue. Examples:
input output
1) 8/11/2014 20140811
2) 8/11/2014 20140811
Fri 10/17/14 20141017
3) 8/11/2014 99991231
Fri 10/17/14 20141017
4) 8/11/2014 20140811
Fri 10/17/14 20141017
5) 8/11/2014 20140811
Fri 10/17/14 20141017
9-Nov-12 20121109
6) 8/11/2014 20140811
Fri 10/17/14 20141017
9-Nov-12 01241109 (what?)
7) 8/11/2014 99991231
Fri 10/17/14 20141017
9-Nov-12 20121109
Anytime there is only one call of the function, it returns the correct answer. The problem arises when it is called multiple times, which I don't understand. Why would anything be carrying over? Changing the return value size to 8 from 9 at the end of the evaluate function does not solve the issue.
This is the format by which the function is called:
select a.val1, DATECONVERT(a.val1)
from
(
select '8/11/2014' as val1 from calendar
union
select 'Fri 10/17/14' as val1 from calendar
union
select '9-Nov-12' as val1 from calendar
) a
And compile command for the UDF:
nzudxcompile /export/home/nz/dateconvert.cpp -o dateconvert.o --sig "Dateconvert(VARCHAR(200))" --version 2 --return "VARCHAR(200)" --class Dateconvert --user user1 --pw mypw --db mydb
To cut to the chase, the problem here is in how you assign tempData.
StringReturn* ret = stringReturnInfo();
StringArg *str;
str = stringArg(0);
int lengths = str->length;
char *datas = str->data;
string tempData = datas;
StringArg does not store a NUL-terminated string, but instead provides the length and expects you to manage that yourself.
select a.val1, ADMIN.DATECONVERT(a.val1)
from
(
select '09-Nov-12'::varchar(20) as val1
union all
select '9-Nov-12'::varchar(20) as val1
) a;
VAL1 | DATECONVERT
-----------+-------------
09-Nov-12 | 20121109
9-Nov-12 | 01221109
(2 rows)
In this example, what is happening is that the longer first string still has a character hanging around in memory when the second, shorter string is assigned to tempData. That hanging '2' at the end gets added on like so:
09-Nov-12
9-Nov-122
Both of these are valid inputs to date, which nicely explains the output you are seeing.
$ date -d 09-Nov-12 +%Y%m%d
20121109
$ date -d 09-Nov-122 +%Y%m%d
01221109
Change the assignment to use that length, and you'll avoid the problem.
//string tempData = datas;
string tempData(datas, datas+lengths);
And then you get the expected output:
select a.val1, ADMIN.DATECONVERT(a.val1)
from
(
select '09-Nov-12'::varchar(20) as val1
union all
select '9-Nov-12'::varchar(20) as val1
) a;
VAL1 | DATECONVERT
-----------+-------------
09-Nov-12 | 20121109
9-Nov-12 | 20121109
(2 rows)
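The carry-over can be reproduced outside Netezza. Below is a minimal C++ sketch, assuming a reused buffer that is not NUL-terminated behaves like the StringArg data described above. FakeArg and reuse_demo are hypothetical names used only for illustration; they are not part of the UDX API.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for StringArg: a pointer plus an explicit length,
// with no promise that data[length] is a NUL byte.
struct FakeArg {
    const char* data;
    int length;
};

// Simulates two consecutive rows sharing one buffer, then copies the second
// argument either by length (safe) or up to the first NUL byte (unsafe).
std::string reuse_demo(bool use_length) {
    char buffer[32] = {0};                // zero-filled so the unsafe read stays in bounds
    std::memcpy(buffer, "09-Nov-12", 9);  // first row: 9 bytes
    std::memcpy(buffer, "9-Nov-12", 8);   // second row: 8 bytes, the old 9th byte remains
    FakeArg arg = {buffer, 8};
    if (use_length) {
        // Copy exactly arg.length bytes, as the corrected assignment does.
        return std::string(arg.data, static_cast<std::size_t>(arg.length));
    }
    // Copy until a NUL is found, which runs on into the stale '2'.
    return std::string(arg.data);
}
```

Calling reuse_demo(false) yields the nine-character string with the stale trailing '2', while reuse_demo(true) yields only the eight bytes that belong to the second row.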
All that being said, I don't know that the overall approach you are taking in this UDF will work. As I'm running it above, the rows are generated on the host, because they are hard-coded in the SQL, and date is certainly available on the host. However, you can't expect the code that runs on the MPP backend (which we often refer to as SPUs) to have the same Linux utilities available that you will find on the host, or, if they exist, to have the same capabilities.
If I move the date values into an actual table, the UDF will operate on them on the SPU, and it will give me bad output because the date command on the SPU image is significantly different from that of the host and doesn't understand this input format at all.
select a.col1, admin.DATECONVERT(a.col1) from calendar a;
COL1 | DATECONVERT
-----------+-------------
09-Nov-12 | 99991231
9-Nov-12 | 99991231
(2 rows)
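One way to avoid depending on an external date binary is to parse the string in-process. The following is only a sketch under assumptions: it uses POSIX strptime (assumed to be available in the environment where the UDF runs), it handles only the three formats that appear in the question, and to_yyyymmdd is a hypothetical helper name. Note that strptime checks field ranges but does not reject impossible dates such as 30 February.

```cpp
#include <ctime>
#include <string>
#include <time.h>  // strptime (POSIX)

// Hypothetical helper: tries a fixed list of formats and returns YYYYMMDD,
// or "99991231" when none of them consumes the whole input.
std::string to_yyyymmdd(const std::string& input) {
    // Assumed format list, taken from the sample inputs in the question.
    static const char* const formats[] = {
        "%m/%d/%Y",     // 8/11/2014
        "%a %m/%d/%y",  // Fri 10/17/14
        "%d-%b-%y"      // 9-Nov-12
    };
    for (const char* fmt : formats) {
        std::tm parsed = {};
        const char* end = strptime(input.c_str(), fmt, &parsed);
        // Accept only when the whole string was consumed, so "9-Nov-122" is rejected.
        if (end != nullptr && *end == '\0') {
            char out[9];
            if (std::strftime(out, sizeof out, "%Y%m%d", &parsed) == 8) {
                return std::string(out);
            }
        }
    }
    return "99991231";
}
```

Inside evaluate(), the input would still need to be built from the pointer and length, for example std::string(str->data, str->length), before being passed to such a helper.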
I have some data, starting from A10 to column M, until the 59th row.
I have some dates in column F10:F that are text strings, converted to official dates in column N (here the question with the process)
M3 is set to =NOW().
In cell N3 I have: =M3+14.
I want to delete all the rows, with a date in column N10:N that comes before [today + 2 weeks] (so cell N3).
When I create a script in Apps Script, it doesn't run the if statement, but if I leave it in comments, it can go in the for loop and deletes the rows, so I'm pretty sure the problem is, again, date formatting.
In this question I ask: how do I compare the values of N10:N with N3, in order to delete all the rows that don't meet the condition if(datesNcol <= targetDate)? (in code is written as if (rowData[i] < flatArray))
I leave also a demo sheet with this problem explained in detail and two alternatives (getBackground condition and numeric days condition).
Attempts:
This is a simplified code example:
const gen = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Generatore');
const bVals = gen.getRange('B10:B').getValues();
const bFilt = bVals.filter(String);
const dataLastRow = bFilt.length;
function deleteExpired() {
dateCorrette(); //ignore, formula that puts corrected dates from N10 to dataLastRow
var dateCorrect = gen.getRange(10,14,dataLastRow,1).getValues();
var targetDate = gen.getRange('N3').getValues();
var flatArray = [].concat.apply([], targetDate);
for (var i = dateCorrect.length - 1; i >= 0; i--) {
var rowData = dateCorrect[i];
if (rowData[i] < flatArray) {
gen.deleteRow(i+10);
}
}
};
If I run the script, nothing is deleted.
If I //comment out the if statement and its closing bracket, it deletes all the rows of the list one by one.
I can't manage to meet that condition.
Right now, it logs this [Sun Jan 01 10:33:20 GMT-05:00 2023] as flatArray
and this [Wed Dec 21 03:00:00 GMT-05:00 2022] as dateCorrect[49], so the first row to delete, that is the 50th (is correct for all the dateCorrect[i] dates).
I tried putting a getTime() method in the targetDate variable, but it only functions if there is the getValue() method, not getValues(), so I then don't know how to use getTime() method on rowData, which is based on dateCorrected[i], which have to use the getValues() method. And then it also doesn't accept the flatArray variable, that has to be commented out (or it logs [ ] for flatArray, not the corrected date)
I leave the other attempts in the demo sheet, because I want to prioritize this problem around the date and make it clear in my head.
Thanks for all the help.
DEMO SHEET, ITA Locale time
I don't know how the demo sheet works with Apps Script, so I suggest copying the code into a personal sheet.
UPDATE:
I've also tried adding an extra column, with a built-in IF function that writes "del" if the row has to be deleted.
=IF(O10>14;"del";"")
And then
var boba = gen.getRange(10,16,bLast,1).getDisplayValues();
.
.
if (boba[i] == 'del')
This does the job. But I can't understand why the other methods don't work.
Try this. It seems like you do a lot of things that aren't necessary. Unless I'm missing something.
A few notes. I typically do not use global variables unless absolutely necessary. I don't create a variable for the last row unless I have to use that value multiple times in my script; I use the method Sheet.getLastRow() instead. dateCorrect is a 2D array of 1 column, so the second index can only be [0]. And getRange('N3') is a single cell, so getValue() is good enough.
function deleteExpired() {
const gen = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Generatore');
var dateCorrect = gen.getRange(10,14,gen.getLastRow()-9,1).getValues();
var targetDate = gen.getRange('N3').getValue();
for (var i = dateCorrect.length - 1; i >= 0; i--) {
if (dateCorrect[i][0] < targetDate) {
gen.deleteRow(i+10);
}
}
}
Try this:
function delRows() {
  const ss = SpreadsheetApp.getActive();
  const gsh = ss.getSheetByName('Generatore');
  // getValues() returns a 2D array; each element of colN is a one-cell row
  const colN = gsh.getRange('N10:N' + gsh.getLastRow()).getValues();
  const now = new Date();
  // Midnight of the current date + 14 days, in milliseconds
  const tdv = new Date(now.getFullYear(), now.getMonth(), now.getDate() + 14).valueOf();
  let d = 0; // rows deleted so far; every deletion shifts the remaining rows up by one
  colN.forEach((n, i) => {
    if (new Date(n[0]).valueOf() < tdv) {
      gsh.deleteRow(i + 10 - d++);
    }
  });
}
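The row arithmetic in deleteRow(i + 10 - d++) can be checked away from the spreadsheet. The following is a minimal C++ sketch (C++ only so that it can be run offline; rows_deleted_forward is a hypothetical name) that records which sheet row number each call would receive when iterating top to bottom.

```cpp
#include <cstddef>
#include <vector>

// Returns, in call order, the row numbers a top-to-bottom pass would hand to
// deleteRow. first_row is the sheet row of values[0] (10 in the script above).
std::vector<int> rows_deleted_forward(const std::vector<long long>& values,
                                      long long cutoff, int first_row) {
    std::vector<int> calls;
    int deleted = 0;  // every earlier deletion moves later rows up by one
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] < cutoff) {
            calls.push_back(first_row + static_cast<int>(i) - deleted);
            ++deleted;
        }
    }
    return calls;
}
```

With values {5, 20, 3, 1}, a cutoff of 10 and first_row 10, the calls are rows 10, 11 and 11: after the first deletion, the entries that started in rows 12 and 13 each arrive at row 11 by the time they are removed. Iterating from the bottom up avoids the offset entirely.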
I'm trying to divide the data by a certain datetime.
I've created e_time from what was originally a string, "2019-10-15 20:33:04" for example.
To keep all the information in the string, including h:m:s, I used the following command to create a double:
gen double e_time = clock(event_timestamp, "YMDhms")
Now I get the result I want from format e_time %tc (human readable).
I want to generate a new variable that is 1 for any time later than 2019-10-15 and 0 for any time earlier than that.
I've tried
// 1
gen new_d = 0 if e_time < "1.887e+12"
replace new_d = 1 if e_time >= "1.887e+12"
// 2
gen new_d = 0 if e_time < "2019-10-15"
replace new_d = 1 if e_time > "2019-10-15"
However, I get an error message type mismatch.
I tried converting the string "2019-10-15" to a double to check whether 1.887e+12 really meant 2019-10-15 using display, but I'm not sure how the command really works here.
Anyhow I tried
// 3
di clock("2019-10-15", "YMDhms")
but it didn't work.
Can anyone give advice on comparing dates that are in a double format properly?
Your post is a little hard to follow (a reproducible data example would help a lot) but the error type mismatch is because e_time is numeric, and "2019-10-15" is a string.
I suggest the following:
clear
input str20 datetime
"2019-10-14 20:33:04"
"2019-10-16 20:33:04"
end
* Keep first 10 characters
gen date = substr(datetime,1,10)
* Check that all strings are 10 characters
assert length(date) == 10
* Convert from string to numeric date variable
gen m = substr(date,6,2)
gen d = substr(date,9,2)
gen y = substr(date,1,4)
destring m d y, replace
gen newdate = mdy(m,d,y)
format newdate %d
gen wanted = newdate >= mdy(10,15,2019) & !missing(newdate)
drop date m d y
list
+------------------------------------------+
| datetime newdate wanted |
|------------------------------------------|
1. | 2019-10-14 20:33:04 14oct2019 0 |
2. | 2019-10-16 20:33:04 16oct2019 1 |
+------------------------------------------+
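On the 1.887e+12 figure from the question: a Stata %tc value counts milliseconds since 01jan1960 00:00:00, so the number can be checked by hand. The sketch below does that arithmetic in C++ using a standard days-from-civil-date calculation; days_from_civil and ms_since_1960 are illustrative names, not Stata functions.

```cpp
#include <cstdint>

// Days since 1970-01-01 for a proleptic Gregorian calendar date.
inline std::int64_t days_from_civil(std::int64_t y, unsigned m, unsigned d) {
    y -= (m <= 2) ? 1 : 0;  // treat January and February as part of the previous year
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);               // year of era
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;    // day of year, March-based
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;              // day of era
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Milliseconds from 1960-01-01 00:00:00 to midnight at the start of the given date.
inline std::int64_t ms_since_1960(std::int64_t y, unsigned m, unsigned d) {
    const std::int64_t days = days_from_civil(y, m, d) - days_from_civil(1960, 1, 1);
    return days * 86400000LL;
}
```

For 2019-10-15 this gives 21837 days, that is 1,886,716,800,000 ms, which rounds to the 1.887e+12 shown in the question. The comparison fails there only because the number was written in quotes, which makes it a string.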
I have created the function below, which should return details of the workspaces that the logged-in user has access to.
But this function is returning only the first record from the select list.
I need all the records to be displayed as output.
Please modify it and let me know.
CREATE OR REPLACE FUNCTION "F_WORKSPACE_LOGIN_USERS" (
p_email VARCHAR2
) RETURN VARCHAR2 IS
l_error VARCHAR2(1000);
l_workspace VARCHAR2(1000);
l_teams VARCHAR2(1000);
l_team VARCHAR2(1000);
BEGIN
FOR i IN ( SELECT a.name workspace,
a.team_id id
FROM slackdatawarehouse.teams a,
( SELECT TRIM(workspaces) workspaces
FROM alluser_workspaces_fact
WHERE lower(email) = lower(p_email)
) b
WHERE a.team_id IN ( SELECT c.team_id
FROM slackdatawarehouse.team_tokens c
)
OR instr(', '
|| lower(b.workspaces),', '
|| lower(a.name) ) > 0
ORDER BY 1 ) LOOP
l_teams := l_team
|| ','
|| i.id;
l_teams := ltrim(rtrim(l_teams,','),',');
RETURN l_teams;
END LOOP;
END;
Current output is :
T6HPQ5LF7,T6XBXVAA1,T905JLZ62,T7CN08JPQ,T9MV4732M,T5PGS72NA,T5A4YHMUH,TAAFTFS0P,T69BE9T2A,T85D2D8MT,T858U7SF4,T9D16DF5X,T9DHDV61G,T9D17RDT3,T5Y03HDQ8,T5F5QPRK7
Required output is :
T6HPQ5LF7
T6XBXVAA1
T905JLZ62
I need the output like the above, one value per line.
I don't know what that code really does (can't test it), but this might be the culprit:
...
RETURN l_teams;
END LOOP;
As soon as code reaches the RETURN statement, it exits the loop and ... well, returns what's currently in L_TEAMS variable. Therefore, move RETURN out of the loop:
...
END LOOP;
RETURN l_teams;
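The same control-flow point, shown as a small C++ analogue rather than PL/SQL (join_ids is a hypothetical name): the accumulator keeps growing inside the loop, and the single return sits after it.

```cpp
#include <string>
#include <vector>

// Builds a comma-separated list from all ids, returning only after the loop.
std::string join_ids(const std::vector<std::string>& ids) {
    std::string joined;
    for (const std::string& id : ids) {
        if (!joined.empty()) {
            joined += ',';
        }
        joined += id;  // append to the same variable that is returned
        // A return placed here would end the function after the first id.
    }
    return joined;
}
```

Note that the sketch appends to the same variable it returns. In the function from the question, the loop concatenates onto l_team, which is never assigned, and stores the result in l_teams.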
If it still doesn't work as expected (which might be the case), have a look at pipelined functions (for example, on Oracle-base site) as they are designed to return values you seem to be looking for.
A simple example:
SQL> create or replace type t_dp_row as object
2 (deptno number,
3 dname varchar2(20));
4 /
Type created.
SQL> create or replace type t_dp_tab is table of t_dp_row;
2 /
Type created.
SQL> create or replace function f_depts
2 return t_dp_tab pipelined
3 is
4 begin
5 for cur_r in (select deptno, dname from dept)
6 loop
7 pipe row(t_dp_row(cur_r.deptno, cur_r.dname));
8 end loop;
9 return;
10 end;
11 /
Function created.
SQL> select * from table(f_depts);
DEPTNO DNAME
---------- --------------------
10 ACCOUNTING
20 RESEARCH
30 SALES
40 OPERATIONS
SQL>
I don't understand what's wrong with this code. I have read it many times, but I can't find the error.
pstmt = con->prepareStatement("SELECT (?) FROM votos WHERE id = (?)");
pstmt->setString(1, eleccion);
pstmt->setInt(2, p->getId());
res = pstmt->executeQuery();
while(res->next())
{
p->setVoto(res->getInt(1));
}
When the eleccion and id variables are Provincial and 1 respectively the getInt(1) function should return 1, but it returns 0.
The command (in the mysql command line):
SELECT Provincial from Votos WHERE id=1
Returns a table with one row and one column with the value 1
Side notes:
Spelling was checked
The getId() function works correctly
The compiler doesn't give any error
You can't use a placeholder in a prepared query for a column name. It's returning the value of the string eleccion, not using it as the name of a column in the table. You need to do string concatenation to substitute the column name.
std::string sql = std::string("SELECT `") + eleccion + "` FROM votos WHERE id = ?";
pstmt = con->prepareStatement(sql.c_str());
pstmt->setInt(1, p->getId());
res = pstmt->executeQuery();
while(res->next())
{
p->setVoto(res->getInt(1));
}
If the value of eleccion is coming from the user or some other untrusted source, make sure you validate it before concatenating, to prevent SQL injection.
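One common way to do that validation is an allow-list: accept the column name only if it exactly matches a known column. A minimal sketch follows, assuming a fixed set of vote columns. Only Provincial comes from the question; Municipal and Nacional are made-up placeholders, and build_vote_query is a hypothetical helper.

```cpp
#include <stdexcept>
#include <string>

// Returns the SQL text for a known column, or throws for anything else.
std::string build_vote_query(const std::string& column) {
    // Assumed allow-list; replace with the real column names of the table.
    static const char* const allowed[] = {"Provincial", "Municipal", "Nacional"};
    for (const char* name : allowed) {
        if (column == name) {
            // Safe to concatenate: the value is one of our own literals.
            return "SELECT `" + column + "` FROM votos WHERE id = ?";
        }
    }
    throw std::invalid_argument("unknown column: " + column);
}
```

The returned string can then be passed to prepareStatement as in the code above, with the id still bound through setInt(1, ...).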
I am a bit stuck: I am not able to extract the last 4 characters of a string. When I write:
indikan=substr(Indikation,length(Indikation)-3,4);
it gives an invalid argument error.
How can I do this?
This code works:
data temp;
indikation = "Idontknow";
run;
data temp;
set temp;
indikan = substrn(indikation,max(1,length(indikation)-3),4);
run;
Can you provide more context on the variable? If indikation has length 3 or less, then I could see this erroring. If it were numeric, that may also cause issues, because numbers are right-justified when converted to character (http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000245907.htm).
If it's likely to be under four characters in some cases, I would recommend adding max:
indikan = substrn(indikation,max(1,length(indikation)-3),4);
I've also added substrn as Rob suggests given it better handles a not-long-enough string.
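The clamping idea behind max(1, length - 3) is independent of SAS. As an illustration only, here is the same logic as a C++ sketch (last_chars is a hypothetical name): trailing blanks are ignored, and the start position is clamped so that short strings are returned whole instead of producing an invalid position.

```cpp
#include <cstddef>
#include <string>

// Returns the last n non-padding characters of s, or all of them if fewer exist.
std::string last_chars(const std::string& s, std::size_t n) {
    const std::size_t last = s.find_last_not_of(' ');  // skip trailing blanks
    if (last == std::string::npos) {
        return "";  // empty or all blanks
    }
    const std::size_t len = last + 1;
    const std::size_t start = (len > n) ? len - n : 0;  // clamp, like max(1, length - 3)
    return s.substr(start, len - start);
}
```

For "Idontknow" and n = 4 this returns "know"; for a three-character value padded with blanks it returns the three characters.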
Or one could use the reverse function twice, like this:
data _null_;
my_string = "Fri Apr 22 13:52:55 +0000 2016";
_day = substr(my_string, 9, 2);
_month = lowcase(substr(my_string, 5, 3));
* Check the _year out;
_year = reverse(substr(reverse(trim(my_string)), 1, 4));
created_at = input(compress(_day || _month || _year), date9.);
put my_string=;
put created_at=weekdatx29.;
run;
Wrong results might be caused by trailing blanks, so strip or trim your string before you call substr:
indikan=substr(strip(Indikation),length(strip(Indikation))-3);
This should give you the last 4 characters.
Or you can try this approach, which, while initially a bit less intuitive, is stable, shorter, uses fewer functions, and works with numeric and text values:
indikan = prxchange("s/.*(.{4}$)/$1/",1,indikation);
data temp;
input trt$;
cards;
treat123
treat121
treat21
treat1
treat1
trea2
;run;
data abc;
set temp;
b=substr(trt,length(trt)-3);
run;