JPQL query syntax exception - jpa-2.0

I am trying to run the following query:
select new br.com.edipo.ada.entity.Resultado (et, avg(es.vlEscolha) as vlCalculado)
from Escolha es
join fetch Resolucao re on re.idResolucao = es.idResolucao
join fetch Alternativa al on al.idAlternativa = es.idAlternativa
join fetch Questao qu on qu.idQuestao = al.idQuestao
join fetch QuestaoEtiqueta qe on qe.idQuestao = qu.idQuestao
join fetch Etiqueta et on et.idEtiqueta = qe.idEtiqueta
where es.blSelecionada = 1
and re.idAvaliacao = :idAvaliacao
and re.idUsuario = :idUsuario
group by et.dsEtiqueta
But I am getting the following error:
Caused by: org.hibernate.hql.internal.ast.QuerySyntaxException: unexpected token: on near line 1, column 149 [select new br.com.edipo.ada.entity.Resultado (et, avg(es.vlEscolha) as vlCalculado) from br.com.edipo.ada.entity.Escolha es join fetch Resolucao re on re.idResolucao = es.idResolucao join fetch Alternativa al on al.idAlternativa = es.idAlternativa join fetch Questao qu on qu.idQuestao = al.idQuestao join fetch QuestaoEtiqueta qe on qe.idQuestao = qu.idQuestao join fetch Etiqueta et on et.idEtiqueta = qe.idEtiqueta where es.blSelecionada = 1 and re.idAvaliacao = :idAvaliacao and re.idUsuario = :idUsuario group by et.dsEtiqueta]
According to it, the error is on column 149 ("... Resolucao re ON ..."), but I cannot see what is wrong.
I am using JPA 2.0 on JBoss AS 7.

Indeed, the problem is the ON keyword: it is not part of JPQL in JPA 2.0. Try replacing your query with:
select new br.com.edipo.ada.entity.Resultado (et,avg(es.vlEscolha) as vlCalculado)
from Escolha es
join fetch Resolucao re
join fetch Alternativa al
join fetch Questao qu
join fetch QuestaoEtiqueta qe
join fetch Etiqueta et
where es.blSelecionada = 1
and re.idAvaliacao = :idAvaliacao
and re.idUsuario = :idUsuario
group by et.dsEtiqueta

Thanks! Here is how it ended up:
select new br.com.edipo.ada.entity.Resultado (et.dsEtiqueta,avg(es.vlEscolha) as vlCalculado)
from Escolha es
join es.resolucao re
join es.alternativa al
join al.questao qu
join qu.etiquetas et
where es.blSelecionada = 1
and re.avaliacao.id = :idAvaliacao
and re.idUsuario = :idUsuario
group by et.dsEtiqueta

Related

PyFlink on Kinesis Analytics Studio - Cannot convert DataStream to Amazon Kinesis Data Stream

I have a DataStream <pyflink.datastream.data_stream.DataStream> coming from a CoFlatMapFunction (simplified here):
%flink.pyflink
# join two streams and update the rule-set
class MyCoFlatMapFunction(CoFlatMapFunction):

    def open(self, runtime_context: RuntimeContext):
        state_desc = MapStateDescriptor('map', Types.STRING(), Types.BOOLEAN())
        self.state = runtime_context.get_map_state(state_desc)

    def bool_from_user_number(self, user_number: int):
        '''Returns True if user_number is greater than 0, False otherwise.'''
        if user_number > 0:
            return True
        else:
            return False

    def flat_map1(self, value):
        '''This method is called for each element in the first of the connected streams'''
        self.state.put(value[1], self.bool_from_user_number(value[2]))

    def flat_map2(self, value):
        '''This method is called for each element in the second of the connected streams (exchange_server_tickers_data_py)'''
        current_dateTime = datetime.now()
        dt = current_dateTime
        x = value[1]
        y = value[2]
        yield Row(dt, x, y)

def generate__ds(st_env):
    # interpret the updating Tables as DataStreams
    type_info1 = Types.ROW([Types.SQL_TIMESTAMP(), Types.STRING(), Types.INT()])
    ds1 = st_env.to_append_stream(table_1, type_info=type_info1)
    type_info2 = Types.ROW([Types.SQL_TIMESTAMP(), Types.STRING(), Types.STRING()])
    ds2 = st_env.to_append_stream(table_2, type_info=type_info2)
    output_type_info = Types.ROW([Types.PICKLED_BYTE_ARRAY(), Types.STRING(), Types.STRING()])
    # Connect the two streams
    connected_ds = ds1.connect(ds2)
    # Apply the CoFlatMapFunction
    ds = connected_ds.key_by(lambda a: a[0], lambda a: a[0]).flat_map(MyCoFlatMapFunction(), output_type_info)
    return ds

ds = generate__ds(st_env)
However, I am unable to view the output, either by registering it as a view/table, writing it to a sink table, or (the best case) using a Kinesis Streams sink to write data from the Flink stream into a Kinesis stream. Firehose would also not fit my use case, as the 30-second latency would be too long. Any help would be appreciated, thanks!
What I have tried:
Registering it as a view / table like so:
# interpret the DataStream as a Table
input_table = st_env.from_data_stream(ds).alias("dt", "x", "y")
z.show(input_table, stream_type="update")
Which gives an error of:
Query schema: [dt: RAW('[B', '...'), x: STRING, y: STRING]
Sink schema: [dt: RAW('[B', ?), x: STRING, y: STRING]
I have also tried writing to a sink table, like so:
%flink.pyflink
# create a sink table to emit results
st_env.execute_sql("""DROP TABLE IF EXISTS table_sink""")
st_env.execute_sql("""
CREATE TABLE table_sink (
dt RAW('[B', '...'),
x VARCHAR(32),
y STRING
) WITH (
'connector' = 'print'
)
""")
# convert the Table API table to a SQL view
table = st_env.from_data_stream(ds).alias("dt", "spread", "spread_orderbook")
st_env.execute_sql("""DROP TEMPORARY VIEW IF EXISTS table_api_table""")
st_env.create_temporary_view('table_api_table', table)
# emit the Table API table
st_env.execute_sql("INSERT INTO table_sink SELECT * FROM table_api_table").wait()
I get the error:
org.apache.flink.table.api.ValidationException: Unable to restore the RAW type of class '[B' with serializer snapshot '...'.
I have also tried to use add_sink to write the data to a sink, which would be an AWS Kinesis data stream as in these docs, like so:
%flink.pyflink
from pyflink.common.serialization import JsonRowSerializationSchema
from pyflink.datastream.connectors import KinesisStreamsSink
output_type_info = Types.ROW([Types.SQL_TIMESTAMP(), Types.STRING(), Types.STRING()])
serialization_schema = JsonRowSerializationSchema.Builder().with_type_info(output_type_info).build()
# Required
sink_properties = {
    'aws.region': 'eu-west-2'
}
kds_sink = (
    KinesisStreamsSink.builder()
    .set_kinesis_client_properties(sink_properties)
    .set_serialization_schema(SimpleStringSchema())
    .set_partition_key_generator(PartitionKeyGenerator.fixed())
    .set_stream_name("test_stream")
    .set_fail_on_error(False)
    .set_max_batch_size(500)
    .set_max_in_flight_requests(50)
    .set_max_buffered_requests(10000)
    .set_max_batch_size_in_bytes(5 * 1024 * 1024)
    .set_max_time_in_buffer_ms(5000)
    .set_max_record_size_in_bytes(1 * 1024 * 1024)
    .build()
)
ds.sink_to(kds_sink)
I assume this would work, but KinesisStreamsSink is not found in pyflink.datastream.connectors and I am unable to find any documentation on how to do this within AWS Kinesis Analytics Studio. Any help would be much appreciated, thank you! How would I go about writing the data to a Kinesis Streams sink / converting it to a table?
Okay, I have figured it out. There were a couple of issues with the particular PyFlink version available on AWS Kinesis Analytics Studio (1.13). The error messages themselves were not that useful, so for anyone who is having similar issues I would really recommend viewing the errors in the Flink Web UI. Firstly, the MapStateDescriptor datatypes must be specified using Types.PICKLED_BYTE_ARRAY(). Secondly, not shown in the question, each MapStateDescriptor must have a distinct name. I also found that using Row from pyflink.common threw errors for me; it worked better to switch to using tuples by specifying Types.TUPLE(), as is done in this example. I also had to switch to specifying the output as a tuple.
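In short, the state descriptors and the output type ended up looking roughly like this (a minimal sketch summarizing the fixes; the descriptor names here are just illustrative):
# each MapStateDescriptor needs a distinct name, and on this runtime the
# key/value types are declared as pickled byte arrays
state_desc = MapStateDescriptor('rules_map', Types.PICKLED_BYTE_ARRAY(), Types.PICKLED_BYTE_ARRAY())
ob_state_desc = MapStateDescriptor('orderbook_map', Types.PICKLED_BYTE_ARRAY(), Types.PICKLED_BYTE_ARRAY())
# emit plain tuples from flat_map1/flat_map2 and declare the output as a TUPLE type
output_type_info = Types.TUPLE([Types.SQL_TIMESTAMP(), Types.STRING()])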
Another thing I have not done is specify a watermark strategy for the DataStream, which could potentially be done by extracting the timestamp from the first field and assigning watermarks based on knowledge of the stream:
class MyTimestampAssigner(TimestampAssigner):
    def extract_timestamp(self, value, record_timestamp: int) -> int:
        return int(value[0])

watermark_strategy = WatermarkStrategy.for_bounded_out_of_orderness(Duration.of_seconds(5)).with_timestamp_assigner(MyTimestampAssigner())
ds = ds.assign_timestamps_and_watermarks(watermark_strategy)
# the first field has been used for timestamp extraction and is no longer necessary,
# so replace it with a logical event time attribute
table = st_env.from_data_stream(ds, col("dt").rowtime, col('f0'), col('f1'))
But I have instead created a sink table for writing to a Kinesis Data Stream again as the output. In total, the corrected code would look something like this:
from pyflink.table.expressions import col
from pyflink.datastream.state import MapStateDescriptor
from pyflink.datastream.functions import RuntimeContext, CoFlatMapFunction
from pyflink.common.typeinfo import Types
from pyflink.common import Duration as Time, WatermarkStrategy, Duration
from pyflink.common.watermark_strategy import TimestampAssigner
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.functions import KeyedProcessFunction, RuntimeContext
from pyflink.datastream.state import ValueStateDescriptor
from datetime import datetime
# Register the tables in the env
table1 = st_env.from_path("sql_table_1")
table2 = st_env.from_path("sql_table_2")
# interpret the updating Tables as DataStreams
type_info1 = Types.TUPLE([Types.SQL_TIMESTAMP(), Types.STRING(), Types.INT()])
ds1 = st_env.to_append_stream(table2, type_info=type_info1)
type_info2 = Types.TUPLE([Types.SQL_TIMESTAMP(), Types.STRING(), Types.STRING()])
ds2 = st_env.to_append_stream(table1, type_info=type_info2)
# join two streams and update the rule-set state
class MyCoFlatMapFunction(CoFlatMapFunction):

    def open(self, runtime_context: RuntimeContext):
        '''This method is called when the function is opened in the runtime. It is used for initialization.'''
        # Map state that we use to maintain the filtering and rules
        state_desc = MapStateDescriptor('map', Types.PICKLED_BYTE_ARRAY(), Types.PICKLED_BYTE_ARRAY())
        self.state = runtime_context.get_map_state(state_desc)
        # maintain state 2
        ob_state_desc = MapStateDescriptor('map_OB', Types.PICKLED_BYTE_ARRAY(), Types.PICKLED_BYTE_ARRAY())
        self.ob_state = runtime_context.get_map_state(ob_state_desc)

    # called on ds1
    def flat_map1(self, value):
        '''This method is called for each element in the first of the connected streams'''
        list_res = value[1].split('|')
        for i in list_res:
            time = datetime.utcnow().replace(microsecond=0)
            yield (time, f"{i}_one")

    # called on ds2
    def flat_map2(self, value):
        '''This method is called for each element in the second of the connected streams'''
        list_res = value[1].split('|')
        for i in list_res:
            time = datetime.utcnow().replace(microsecond=0)
            yield (time, f"{i}_two")
connectedStreams = ds1.connect(ds2)
output_type_info = Types.TUPLE([Types.SQL_TIMESTAMP(), Types.STRING()])
ds = connectedStreams.key_by(lambda value: value[1], lambda value: value[1]).flat_map(MyCoFlatMapFunction(), output_type=output_type_info)
name = 'output_table'
ds_table_name = 'temporary_table_dump'
st_env.execute_sql(f"""DROP TABLE IF EXISTS {name}""")
def create_table(table_name, stream_name, region, stream_initpos):
    return """ CREATE TABLE {0} (
        f0 TIMESTAMP(3),
        f1 STRING,
        WATERMARK FOR f0 AS f0 - INTERVAL '5' SECOND
    )
    WITH (
        'connector' = 'kinesis',
        'stream' = '{1}',
        'aws.region' = '{2}',
        'scan.stream.initpos' = '{3}',
        'sink.partitioner-field-delimiter' = ';',
        'sink.producer.collection-max-count' = '100',
        'format' = 'json',
        'json.timestamp-format.standard' = 'ISO-8601'
    ) """.format(
        table_name, stream_name, region, stream_initpos
    )
# Creates a sink table writing to a Kinesis Data Stream
st_env.execute_sql(create_table(name, 'output-test', 'eu-west-2', 'LATEST'))
table = st_env.from_data_stream(ds)
st_env.execute_sql(f"""DROP TEMPORARY VIEW IF EXISTS {ds_table_name}""")
st_env.create_temporary_view(ds_table_name, table)
# emit the Table API table
st_env.execute_sql(f"INSERT INTO {name} SELECT * FROM {ds_table_name}").wait()

Odoo 10 search active and inactive records using search() method

I have a many2many field location_from_ids and I am trying to find all the children of those locations.
location_from_ids = fields.Many2many(comodel_name='stock.location',relation='report_stock_config_location_from_rel',column1='report_id',column2='location_id',string='Locations From', context={'active_test': False})
I am using the search() method to get all the children:
def _get_filter(self, report):
    res = ''
    if report.location_from_ids:
        location_ids = [l.id for l in report.location_from_ids]
        locations = self.env['stock.location'].search([('id', 'child_of', location_ids), ('active', 'in', ('t', 'f'))])
I need to get all the locations (active and inactive) but I am getting only the active records. How can I get all the records, both active and inactive?
Just "deactivate" the active test on searches:
locations = self.env['stock.location'].with_context(
active_test=False).search(
[('id', 'child_of', location_ids)])
As a complement to the answer, it is worth reviewing the operations that records support in Odoo:
https://odoo-new-api-guide-line.readthedocs.io/en/latest/environment.html#environment
Supported Operations
RecordSets also support set operations; you can include, union, intersect, ... recordsets:
record in recset1 # include
record not in recset1 # not include
recset1 + recset2 # extend
recset1 | recset2 # union
recset1 & recset2 # intersect
recset1 - recset2 # difference
recset.copy() # to copy recordset (not a deep copy)
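Combined with the active_test trick above, these operations make it easy to split active from inactive children. A minimal sketch (the variable names follow the question):
# search the children regardless of the active flag, then split the result
all_locations = self.env['stock.location'].with_context(
    active_test=False).search([('id', 'child_of', location_ids)])
active_locations = all_locations.filtered(lambda l: l.active)  # only active records
inactive_locations = all_locations - active_locations          # recordset difference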

using pd.read_sql() to extract large data (>5 million records) from oracle database, making the sql execution very slow

Initially I tried using pd.read_sql(). Then I tried using sqlalchemy and query objects, but none of these methods is useful, as the SQL keeps executing for a long time and never finishes.
I tried using Hints.
I guess the problem is the following: pandas creates a cursor object in the background, and with cx_Oracle we cannot influence the "arraysize" parameter that it uses, i.e. the default value of 100 is always used, which is far too small.
CODE:
import pandas as pd
import Configuration.Settings as CS
import DataAccess.Databases as SDB
import sqlalchemy
import cx_Oracle
dfs = []
DBM = SDB.Database(CS.DB_PRM,PrintDebugMessages=False,ClientInfo="Loader")
sql = '''
WITH
l AS
(
SELECT DISTINCT /*+ materialize */
hcz.hcz_lwzv_id AS lwzv_id
FROM
pm_mbt_materialbasictypes mbt
INNER JOIN pm_mpt_materialproducttypes mpt ON mpt.mpt_mbt_id = mbt.mbt_id
INNER JOIN pm_msl_materialsublots msl ON msl.msl_mpt_id = mpt.mpt_id
INNER JOIN pm_historycompattributes hca ON hca.hca_msl_id = msl.msl_id AND hca.hca_ignoreflag = 0
INNER JOIN pm_tpm_testdefprogrammodes tpm ON tpm.tpm_id = hca.hca_tpm_id
inner join pm_tin_testdefinsertions tin on tin.tin_id = tpm.tpm_tin_id
INNER JOIN pm_hcz_history_comp_zones hcz ON hcz.hcz_hcp_id = hca.hca_hcp_id
WHERE
mbt.mbt_name = :input1 and tin.tin_name = 'x1' and
hca.hca_testendday < '2018-5-31' and hca.hca_testendday > '2018-05-30'
),
TPL as
(
select /*+ materialize */
*
from
(
select
ut.ut_id,
ut.ut_basic_type,
ut.ut_insertion,
ut.ut_testprogram_name,
ut.ut_revision
from
pm_updated_testprogram ut
where
ut.ut_basic_type = :input1 and ut.ut_insertion = :input2
order by
ut.ut_revision desc
) where rownum = 1
)
SELECT /*+ FIRST_ROWS */
rcl.rcl_lotidentifier AS LOT,
lwzv.lwzv_wafer_id AS WAFER,
pzd.pzd_zone_name AS ZONE,
tte.tte_tpm_id||'~'||tte.tte_testnumber||'~'||tte.tte_testname AS Test_Identifier,
case when ppd.ppd_measurement_result > 1e15 then NULL else SFROUND(ppd.ppd_measurement_result,6) END AS Test_Results
FROM
TPL
left JOIN pm_pcm_details pcm on pcm.pcm_ut_id = TPL.ut_id
left JOIN pm_tin_testdefinsertions tin ON tin.tin_name = TPL.ut_insertion
left JOIN pm_tpr_testdefprograms tpr ON tpr.tpr_name = TPL.ut_testprogram_name and tpr.tpr_revision = TPL.ut_revision
left JOIN pm_tpm_testdefprogrammodes tpm ON tpm.tpm_tpr_id = tpr.tpr_id and tpm.tpm_tin_id = tin.tin_id
left JOIN pm_tte_testdeftests tte on tte.tte_tpm_id = tpm.tpm_id and tte.tte_testnumber = pcm.pcm_testnumber
cross join l
left JOIN pm_lwzv_info lwzv ON lwzv.lwzv_id = l.lwzv_id
left JOIN pm_rcl_resultschipidlots rcl ON rcl.rcl_id = lwzv.lwzv_rcl_id
left JOIN pm_pcm_zone_def pzd ON pzd.pzd_basic_type = TPL.ut_basic_type and pzd.pzd_pcm_x = lwzv.lwzv_pcm_x and pzd.pzd_pcm_y = lwzv.lwzv_pcm_y
left JOIN pm_pcm_par_data ppd ON ppd.ppd_lwzv_id = l.lwzv_id and ppd.ppd_tte_id = tte.tte_id
'''
#method1: using query objects.
Q = DBM.getQueryObject(sql)
Q.execute({"input1":'xxxx',"input2":'yyyy'})
while not Q.AtEndOfResultset:
    print Q
#method2: using sqlalchemy
connectstring = ("oracle+cx_oracle://username:Password@(description="
                 "(address_list=(address=(protocol=tcp)(host=tnsconnectstring)"
                 "(port=portnumber)))(connect_data=(sid=xxxx)))")
engine = sqlalchemy.create_engine(connectstring, arraysize=10000)
df_p = pd.read_sql(sql, params=
{"input1":'xxxx',"input2":'yyyy'}, con=engine)
#method3: using pd.read_sql()
df_p = pd.read_sql_query(SQL_PCM, params=
{"input1":'xxxx',"input2":'yyyy'},
coerce_float=True, con= DBM.Connection)
It would be great if someone could help me out with this. Thanks in advance.
And here is yet another possibility to adjust the array size without needing to create the oraaccess.xml suggested by Chris. This may not work with the rest of your code as is, but it should give you an idea of how to proceed if you wish to try this approach!
import cx_Oracle
import sqlalchemy
import pandas

class Connection(cx_Oracle.Connection):
    def __init__(self):
        super(Connection, self).__init__("user/pw@dsn")

    def cursor(self):
        c = super(Connection, self).cursor()
        c.arraysize = 5000
        return c

# the URL only selects the dialect; the creator callable supplies the connections
engine = sqlalchemy.create_engine("oracle+cx_oracle://", creator=Connection)
pandas.read_sql(sql, engine)
Here's another alternative to experiment with.
Set a prefetch size by using the external configuration available to Oracle Call Interface programs like cx_Oracle. This overrides internal settings used by OCI programs. Create an oraaccess.xml file:
<?xml version="1.0"?>
<oraaccess xmlns="http://xmlns.oracle.com/oci/oraaccess"
           xmlns:oci="http://xmlns.oracle.com/oci/oraaccess"
           schemaLocation="http://xmlns.oracle.com/oci/oraaccess
                           http://xmlns.oracle.com/oci/oraaccess.xsd">
  <default_parameters>
    <prefetch>
      <rows>1000</rows>
    </prefetch>
  </default_parameters>
</oraaccess>
If you use tnsnames.ora or sqlnet.ora for cx_Oracle, then put the oraaccess.xml file in the same directory. Otherwise, create a new directory and set the environment variable TNS_ADMIN to that directory name.
cx_Oracle needs to be using Oracle Client 12c, or later, libraries.
Experiment with different sizes.
See OCI Client-Side Deployment Parameters Using oraaccess.xml.
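If you go the TNS_ADMIN route, you can point cx_Oracle at that directory from Python as well; a minimal sketch (the path is just an example, and the variable must be set before the first connection is made):
import os

# make the Oracle Client pick up oraaccess.xml from this directory;
# this must run before cx_Oracle opens its first connection
os.environ['TNS_ADMIN'] = '/path/to/dir/with/oraaccess'

import cx_Oracle
conn = cx_Oracle.connect('username', 'password', 'dsn')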

python replace string function throws asterisk wildcard error

When I use * I receive the error:
raise error, v # invalid expression
error: nothing to repeat
Other wildcard characters such as ^ work fine. The line of code:
df.columns = df.columns.str.replace('*agriculture', 'agri')
I am using pandas and Python.
Edit:
When I try using / to escape, the wildcard does not work as I intend:
In [44]: df = pd.DataFrame(columns=['agriculture', 'dfad agriculture df'])
In [45]: df
Out[45]:
Empty DataFrame
Columns: [agriculture, dfad agriculture df]
Index: []
In [46]: df.columns.str.replace('/*agriculture*','agri')
Out[46]: Index([u'agri', u'dfad agri df'], dtype='object')
I thought the wildcard should output Index([u'agri', u'agri'], dtype='object')
edit:
I am currently using hierarchical columns and would like to replace 'agriculture' with 'agri' only in that specific level (level = 2).
original:
df.columns[0] = ('grand total', '2005', 'agriculture')
df.columns[1] = ('grand total', '2005', 'other')
desired:
df.columns[0] = ('grand total', '2005', 'agri')
df.columns[1] = ('grand total', '2005', 'other')
I'm looking at this link right now: Changing columns names in Pandas with hierarchical columns
and that author says it will get easier at 0.15.0 so I am hoping there are more recent updated solutions
You need to put the asterisk * at the end; in a regex it matches the preceding character 0 or more times, see the docs:
In [287]:
df = pd.DataFrame(columns=['agriculture'])
df
Out[287]:
Empty DataFrame
Columns: [agriculture]
Index: []
In [289]:
df.columns.str.replace('agriculture*', 'agri')
Out[289]:
Index(['agri'], dtype='object')
EDIT
Based on your new and actual requirements, you can use str.contains to find matches, use this to build a dict mapping the old names to the new ones, and then call rename:
In [307]:
matching_cols = df.columns[df.columns.str.contains('agriculture')]
df.rename(columns = dict(zip(matching_cols, ['agri'] * len(matching_cols))))
Out[307]:
Empty DataFrame
Columns: [agri, agri]
Index: []
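If the columns are a hierarchical MultiIndex as in the edit, another option is to rewrite just the level in question. A minimal sketch (the level number and labels are taken from the example in the question):
import pandas as pd

cols = pd.MultiIndex.from_tuples([
    ('grand total', '2005', 'agriculture'),
    ('grand total', '2005', 'other'),
])
df = pd.DataFrame(columns=cols)

# replace 'agriculture' with 'agri' only in level 2 of the column MultiIndex
df.columns = df.columns.set_levels(
    df.columns.levels[2].str.replace('agriculture', 'agri'), level=2)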

Setting the datasource of the child band in UltraGrid

I have an UltraGrid in my Windows Forms application and it has to have two bands. I was able to set up the parent band with no problem by using the code below:
Try
    con.Open()
    da = New SqlDataAdapter("SELECT o.OJTID, o.Surname + ', ' + o.FirstName + ' ' + o.MiddleName AS FullName, t.TotalGrade FROM tblOJTs o INNER JOIN tblTGrades t ON o.OJTID = t.OJTID", con)
    da.Fill(ds, "tblOGrades")
    con.Close()
Catch ex As Exception
    MsgBox("Error connecting to database.", MsgBoxStyle.Critical)
    MsgBox(ex.ToString)
    con.Close()
    Exit Sub
End Try
UltraGrid1.DisplayLayout.ViewStyle = Infragistics.Win.UltraWinGrid.ViewStyle.MultiBand
UltraGrid1.DataSource = ds.Tables("tblOGrades")
The thing now is, I don't know how to set the DataSource of the child band. I don't encounter such a problem with UltraWebGrid because of its hierarchical DataSource feature, but I think it's not available for WinForms. I know you guys will help; thanks in advance :)
You need to fill another DataTable in your DataSet with the data related to the first table. After that, you should define the relation between the two DataTables and finally set the DataSource of the UltraWinGrid to the DataSet itself instead of the single DataTable.
For example:
con.Open()
da = New SqlDataAdapter("SELECT o.OJTID, o.Surname ....", con)
da.Fill(ds, "tblOGrades")
da = New SqlDataAdapter("SELECT .related data....", con)
da.Fill(ds, "tblRelated")
ds.Relations.Add("Grades_Relation", _
    ds.Tables("tblOGrades").Columns("OJTID"), _
    ds.Tables("tblRelated").Columns("relatedID"))
con.Close()
....
UltraGrid1.DataSource = ds