ZF2 JsonModel() using Blob in Doctrine 2 ORM - doctrine-orm

I am trying to return a JSON response using JsonModel() in ZF2 with the following code:
$dql = "SELECT * FROM \Application\Entity\Message m ";
$resultSet = $objectManager->createQuery($dql)
->getResult(\Doctrine\ORM\Query::HYDRATE_ARRAY);
$result = new JsonModel($resultSet);
I am getting the following warning:
Warning: json_encode(): type is unsupported, encoded as null in....
If I exclude the blob type field from the selection, it works fine.
Why doesn't JsonModel work with blob type fields?
Is there any alternative I can use in Doctrine?

The json_encode() function only works with UTF-8 encoded data, and blob is a binary data type; there is a difference between blob (binary) and text data types.
You might need to convert your binary data into a string before encoding the data into JSON. There are different ways to do it: you can convert the blob to a string directly in SQL (for example with a hex or base64 function, depending on your database), or you can encode it in PHP after fetching the result.
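For example, here is a minimal PHP sketch of the second approach (the blob field name messageBlob is hypothetical; with array hydration Doctrine typically returns blob columns as stream resources):
foreach ($resultSet as &$row) {
    // Blob fields usually come back as stream resources with HYDRATE_ARRAY
    if (is_resource($row['messageBlob'])) {
        $row['messageBlob'] = stream_get_contents($row['messageBlob']);
    }
    // Base64-encode so json_encode() receives valid UTF-8 instead of raw binary
    $row['messageBlob'] = base64_encode($row['messageBlob']);
}
unset($row);
$result = new JsonModel($resultSet);
The client can then base64-decode the field whenever it actually needs the binary content.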

Related

How can I correct AWS Glue Crawler/Data Catalog inferring all fields in CSV as strings when they're clearly not?

I have a big CSV text file uploaded weekly to an S3 path partitioned by upload date (maybe not important). The schema of these files is the same, the formatting is all the same, and the naming conventions are all the same. Each file contains ~100 columns and ~1M rows of mixed text/numeric types. The raw data looks like this:
id,date,string,int_values,double_values
"6F87U",2021-03-21,"Text",0,1.1483
"8DU87",2021-03-22,"More text, oh yes",1,2.525
"79LO2",2021-03-23,"Moar, give me moar, text",2,3.485489
When I run a Crawler with everything default, querying with Athena like so:
select * from tb_csv_data
...the results in Athena are thus:
id | date | string | int_values | double_values
"6F87U" | 2021-03-21 | "Text" | 0 | 1.1483
"8DU87" | 2021-03-22 | "More text | oh yes" | 1
"79LO2" | 2021-03-23 | "Moar | give me moar | text
The problem at this level seems to be with proper detection (read: ignoring) of commas as delimiters within quotation marks. So I have attached a CSV classifier with the following characteristics to the Crawler. When I run the Crawler again with the classifier attached, the resulting table properties are as follows:
Input format: org.apache.hadoop.mapred.TextInputFormat
Output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Serde serialization lib: org.apache.hadoop.hive.serde2.OpenCSVSerde
Serde parameters:
  quoteChar: "
  separatorChar: ,
Table properties:
  sizeKey: 4356512114
  objectCount: 3
  UPDATED_BY_CRAWLER: crawler-name
  CrawlerSchemaSerializerVersion: 1.0
  recordCount: 3145398
  averageRecordSize: 1384
  CrawlerSchemaDeserializerVersion: 1.0
  compressionType: none
  columnsOrdered: true
  areColumnsQuoted: true
  delimiter: ,
  typeOfData: file
The resulting table with the same simple Athena query as above seems to be correct:
id | date | string | int_values | double_values
6F87U | 2021-03-21 | Text, yes | 0 | 1.1483
8DU87 | 2021-03-22 | More text, oh yes | 1 | 2.525
79LO2 | 2021-03-23 | Moar, give me moar, text | 2 | 3.485489
The expected automatic inference of data types is supposed to be this (let's simplify and presume the date is correct as a string):
Column name | Data type
id | string
date | string
string | string
int_values | bigint (or long)
double_values | double
...but instead they're all strings!
Column name | Data type
id | string
date | string
string | string
int_values | string
double_values | string
I need this data to be accurately queryable from Athena as it is, where it is, so what can I do without further processing of the raw data? I suppose I could manually adjust the table properties in the Console but is that really correct when I need the entire pipeline to be automated? I also want to avoid having to cast types in queries 80+ times for each field as most of these columns are numeric. What can I do?
Thank you!
The limitation arises from the SerDe that you are using in your query. Refer to the note section in this doc, which has the following explanation:
When you use Athena with OpenCSVSerDe, the SerDe converts all column types to STRING. Next, the parser in Athena parses the values from STRING into actual types based on what it finds. For example, it parses the values into BOOLEAN, BIGINT, INT, and DOUBLE data types when it can discern them. If the values are in TIMESTAMP in the UNIX format, Athena parses them as TIMESTAMP. If the values are in TIMESTAMP in Hive format, Athena parses them as INT. DATE type values are also parsed as INT.
For a date type to be detected, it has to be in UNIX numeric format, such as 1562112000, according to the doc.
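If you need typed columns without further processing of the raw data, one option consistent with the note above is to declare the types yourself in the table DDL and let Athena parse the string values into them at query time. A sketch, not the crawler's own output; the S3 location and the header-skip property are assumptions:
CREATE EXTERNAL TABLE tb_csv_data (
  id string,
  `date` string,
  `string` string,
  int_values bigint,
  double_values double
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  'separatorChar' = ',',
  'quoteChar' = '"'
)
LOCATION 's3://your-bucket/path-to-csv-files/'
TBLPROPERTIES ('skip.header.line.count' = '1');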

Get parameters from Rest Http url for get method using streamsets microservice pipeline

I have created a microservice pipeline in StreamSets. Upon making a GET callout, I have to retrieve data from MySQL depending on the parameters sent in the HTTP GET URL, using an expression evaluator.
My url is supposed to be like this: http://my.url.com:0191?param1=xyz&param2=abc
I have to retrieve data based on param1 value and param2 value.
Also, how do I handle cases where the parameters sent are null?
The URL parameters appear in the queryString record header attribute; you can parse them out in an Expression Evaluator with the Field Expression:
${str:splitKV(record:attribute('queryString'), '&', '=')}
If you set the Output Field to /, then your parameters will now be in the fields /param1 and /param2. You can use these in a MySQL query in the JDBC Lookup processor like this:
-- Assuming col1 is an integer (doesn't need quotes) and col2 is a string (needs quotes)
SELECT * FROM tablename
WHERE col1 = ${record:value('/param1')} AND col2 = '${record:value('/param2')}'
You can handle nulls using the record:attributeOrDefault() function to set a default, or by using a Stream Selector to send the record along a different path.
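For example (a sketch using only the functions already mentioned, not tested against your pipeline), you could fall back to an empty query string before splitting:
${str:splitKV(record:attributeOrDefault('queryString', ''), '&', '=')}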

How to use IFNULL() in BigQuery - Standard SQL?

I would like to know how to use the IFNULL() BigQuery Standard SQL function properly. This is my current data structure. The columns named "key" and "stringColumn" store strings. Meanwhile, the column named "integerColumn" stores integers:
I would like to create a new column named "singleValueColumn" that takes the value of the "stringColumn" or "integerColumn" that is not null:
This is my BigQuery Standard SQL query:
SELECT key,
value.string_value as stringColumn,
value.int_value as integerColumn,
IFNULL(value.string_value, value.int_value) as singleValueColumn
FROM `com_skytracking_ANDROID.app_events_*`,
UNNEST(event_dim) as event,
UNNEST(event.params) as event_param
WHERE event.name = "order_event"
However, when I run the query I am getting this error:
Error: No matching signature for function IFNULL for argument types: STRING, INT64. Supported signature: IFNULL(ANY, ANY) at [4:9]
Thanks for your help.
Check this doc. I think you need to cast the int_value as a string:
IFNULL(value.string_value, CAST(value.int_value AS STRING)) AS singleValueColumn
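Applied to the query from the question, the full statement would look like this (table and field names unchanged):
SELECT key,
value.string_value as stringColumn,
value.int_value as integerColumn,
IFNULL(value.string_value, CAST(value.int_value AS STRING)) as singleValueColumn
FROM `com_skytracking_ANDROID.app_events_*`,
UNNEST(event_dim) as event,
UNNEST(event.params) as event_param
WHERE event.name = "order_event"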

Informix: Modify from CLOB to LVARCHAR

I have a table
CREATE TABLE TEST
(
test_column CLOB
)
I want to change the data type of test_column to LVARCHAR. How can I achieve this? I have tried several things so far:
alter table test modify test_column LVARCHAR(2500)
This works, but the content of test_column gets converted from 'test' to '01000000d9c8b7a61400000017000000ae000000fb391956000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000'.
alter table test add tmp_column LVARCHAR(2500);
update test set tmp_column = DBMS_LOB.SUBSTR(test_column,2500,1);
This does not work and I get the following exception:
[Error Code: -674, SQL State: IX000] Method (substr) not found.
Do you have any further ideas?
I am using a 12.10.xC5DE instance to do some tests.
From what I could find in the manuals, there isn't a cast from CLOB to other data types.
CLOB data type
No casts exist for CLOB data. Therefore, the database server cannot convert data of the CLOB type to any other data type, except by using these encryption and decryption functions to return a BLOB. Within SQL, you are limited to the equality ( = ) comparison operation for CLOB data. To perform additional operations, you must use one of the application programming interfaces from within your client application.
The encryption/decryption functions mentioned still return CLOB type objects, so they do not do what you want.
Despite the manual saying that there is no cast for CLOB, there is a registered cast in the SYSCASTS table. Using dbaccess, I tried an explicit cast on some test data and got return values similar to the ones you are seeing. The text in the CLOB column is 'teste 01', terminated with a line break.
CREATE TABLE myclob
(
id SERIAL NOT NULL
, doc CLOB
);
INSERT INTO myclob ( id , doc ) VALUES ( 0, FILETOCLOB('file1.txt', 'client'));
SELECT
id
, doc
, doc::LVARCHAR AS conversion
FROM
myclob;
id 1
doc
teste 01
conversion 01000000d9c8b7a6080000000800000007000000a6cdc0550000000001000000000
0000000000000000000000000000000000000000000000000000000000000000000
0000000000
So, there is a cast from CLOB, but it does not seem to be useful for what you want.
So back to the SQL Packages Extension. You need to register this DataBlade on the database. The required files are located in $INFORMIXDIR/extend, and you want the excompat.* module. Using the admin API, you can register the module by executing the following:
EXECUTE FUNCTION sysbldprepare('excompat.*', 'create');
If the return value is 0 (zero) then the module should now be registered.
SELECT
id
, DBMS_LOB_SUBSTR(doc, DBMS_LOB_GETLENGTH(doc) - 1, 1) as conversion
FROM
myclob;
id 1
conversion teste 01
Another way would be to register your own cast from CLOB to LVARCHAR, but you would have to code an UDR to implement it.
P.S.: I subtract 1 from the CLOB length to remove the trailing line break.
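With the module registered, the approach from your question should then work using these functions. A sketch against your original table, keeping the 2500 limit from the question (it assumes the CLOB contents fit in 2500 bytes):
ALTER TABLE test ADD tmp_column LVARCHAR(2500);
UPDATE test SET tmp_column = DBMS_LOB_SUBSTR(test_column, DBMS_LOB_GETLENGTH(test_column), 1);
ALTER TABLE test DROP test_column;
RENAME COLUMN test.tmp_column TO test_column;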

Empty blob insert query in ODBC C++ (Oracle)

I need to insert a blob into an Oracle database. I am using C++ and the ODBC library.
I am stuck at the insert query and update query. It is not clear to me how to write a blob insert query.
I know how to write a query for a non-blob column.
My table structure is:
CREATE TABLE t_testblob (
filename VARCHAR2(30) DEFAULT NULL NULL,
apkdata BLOB NULL
)
I found an example of an insert and an update:
INSERT INTO table_name VALUES (memberlist,?,memberlist)
UPDATE table_name SET ImageFieldName = ? WHERE ID=yourId
But the structure of these queries is abstract to me. What should memberlist be? Why is there a "?"? Where are the values to be inserted?
Those question marks mean that it is a prepared statement. Such statements are good for both the server and the client. The server has less work because it is easier to parse such a statement, and the client does not need to worry about SQL injection. The client prepares the query, builds a buffer for the input values, and executes it.
Such a statement also executes very quickly compared to "normal" queries, especially in loops, when importing data from a CSV file, etc.
I don't know which ODBC C++ library you use, since ODBC is strictly a C library. Other languages like Java or Python can use it too. I think the easiest example is in Python:
cursor = connection.cursor()
for txt in ('a', 'b', 'c'):
    cursor.execute('SELECT * FROM test WHERE txt=?', (txt,))
Of course such a prepared statement can be used in INSERT or UPDATE statements too, and for your example it can look like this:
cursor.execute("INSERT INTO t_testblob (filename, apkdata) VALUES (?, ?)", filename, my_binary_data)