I have a table with 10,000 rows. I want to select the first 1,000 rows, then select again to get the next set of rows (1001-2000), and so on.
I am using the BETWEEN clause to select each range, incrementing the bounds on every pass. Here is my code:
count = cursor.execute("select count(*) from casa4").fetchone()[0]
ctr = 1
ctr1 = 1000
str1 = ''
while ctr1 <= count:
    sql = "SELECT AccountNo FROM ( \
        SELECT AccountNo, ROW_NUMBER() OVER (ORDER BY Accountno) rownum \
        FROM casa4 ) seq \
        WHERE seq.rownum BETWEEN " + str(ctr) + " AND " + str(ctr1) + ""
    ctr = ctr1 + 1
    ctr1 = ctr1 + 1000
    cursor.execute(sql)
    sleep(2) #interval in printing of the rows.
    for row in cursor:
        str1 = str1 + '|'.join(map(str,row)) + '\n'
    print "Records:" + str1 #var in storing the fetched rows from database.
    print sql #prints the sql statement(str) and I can see that the var, ctr and ctr1 have incremented correctly. The way I want it.
My goal is to send these rows to another database through a message queue (RabbitMQ), and I want to speed up the process. Selecting all the rows at once and sending them to the queue returns an error.
The first pass correctly returns rows 1-1000. On the second pass, however, instead of rows 1001-2000 it returns rows 1-2000, then 1-3000, and so on. It always starts at 1.
I was able to recreate your issue with both pyodbc and pypyodbc. I also tried using
WITH seq (AccountNo, rownum) AS
(
SELECT AccountNo, ROW_NUMBER() OVER (ORDER BY Accountno) rownum
FROM casa4
)
SELECT AccountNo FROM seq
WHERE rownum BETWEEN 11 AND 20
When I run that in SSMS I just get rows 11 through 20, but when I run it from Python I get all the rows (starting from 1).
The following code does work using pyodbc. It uses a temporary table named #numbered, and might be helpful in your situation since your process looks like it would do all of its work using the same database connection:
import pyodbc
cnxn = pyodbc.connect("DSN=myDb_SQLEXPRESS")
crsr = cnxn.cursor()
sql = """\
CREATE TABLE #numbered (rownum INT PRIMARY KEY, AccountNo VARCHAR(10))
"""
crsr.execute(sql)
cnxn.commit()
sql = """\
INSERT INTO #numbered (rownum, AccountNo)
SELECT
ROW_NUMBER() OVER (ORDER BY Accountno) AS rownum,
AccountNo
FROM casa4
"""
crsr.execute(sql)
cnxn.commit()
sql = "SELECT AccountNo FROM #numbered WHERE rownum BETWEEN ? AND ? ORDER BY rownum"
batchsize = 1000
ctr = 1
while True:
    crsr.execute(sql, [ctr, ctr + batchsize - 1])
    rows = crsr.fetchall()
    if len(rows) == 0:
        break
    print("-----")
    for row in rows:
        print(row)
    ctr += batchsize
cnxn.close()
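The same numbered-table idea can be tried without a SQL Server instance. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for pyodbc, the sample account numbers are made up, and fetch_in_batches is a hypothetical helper name. It also builds a fresh list for every batch. The loop in the question never resets str1, so its "Records:" printout would keep growing from row 1 on each pass regardless of what the query returns, which is worth ruling out as well.

```python
import sqlite3


def fetch_in_batches(conn, batch_size):
    """Yield lists of AccountNo values, batch_size rows at a time."""
    cur = conn.cursor()
    # Numbered copy of the data; rownum is assigned 1, 2, 3, ... on insert.
    cur.execute(
        "CREATE TEMP TABLE numbered (rownum INTEGER PRIMARY KEY, AccountNo TEXT)"
    )
    cur.execute(
        "INSERT INTO numbered (AccountNo) "
        "SELECT AccountNo FROM casa4 ORDER BY AccountNo"
    )
    sql = ("SELECT AccountNo FROM numbered "
           "WHERE rownum BETWEEN ? AND ? ORDER BY rownum")
    start = 1
    while True:
        rows = cur.execute(sql, (start, start + batch_size - 1)).fetchall()
        if not rows:
            break
        # A new list per batch, so nothing carries over from earlier batches.
        yield [row[0] for row in rows]
        start += batch_size
    cur.execute("DROP TABLE numbered")


# Made-up sample data: 2,500 account numbers in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE casa4 (AccountNo TEXT)")
conn.executemany(
    "INSERT INTO casa4 VALUES (?)",
    [("A%05d" % n,) for n in range(1, 2501)],
)

batches = list(fetch_in_batches(conn, 1000))
print([len(batch) for batch in batches])  # [1000, 1000, 500]
```

Each yielded batch could then be joined into one message and published to the queue before the next batch is fetched.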
The code below was written in SQL to update columns of a target table. Can someone help me rewrite it for Amazon Redshift? When I run the same query on Redshift, it fails with this error:
Amazon Invalid operation: relation "c" does not exist;
With TempTable As
(
SELECT Left('abcdefghijk',len(TerritoryName)/3) + Substring(TerritoryName,len(TerritoryName)-len(TerritoryName)/3-len(TerritoryName)/3+1,len(TerritoryName)-len(TerritoryName)/3-len(TerritoryName)/3) + Right('ijklmnopqrstuv',len(TerritoryName)/3) As Masked_TerritoryName
,Left('abcdefghijk',len(DistrictName)/3) + Substring(DistrictName,len(DistrictName)-len(DistrictName)/3-len(DistrictName)/3+1,len(DistrictName)-len(DistrictName)/3-len(DistrictName)/3) + Right('ijklmnopqrstuv',len(DistrictName)/3) As Masked_DistrictName
,Left('abcdefghijk',len(RegionName)/3) + Substring(RegionName,len(RegionName)-len(RegionName)/3-len(RegionName)/3+1,len(RegionName)-len(RegionName)/3-len(RegionName)/3) + Right('ijklmnopqrstuv',len(RegionName)/3) As Masked_RegionName
,Left('abcdefghijk',len(RSMTerritoryName)/3) + Substring(RSMTerritoryName,len(RSMTerritoryName)-len(RSMTerritoryName)/3-len(RSMTerritoryName)/3+1,len(RSMTerritoryName)-len(RSMTerritoryName)/3-len(RSMTerritoryName)/3) + Right('ijklmnopqrstuv',len(RSMTerritoryName)/3) As Masked_RSMTerritoryName
,Left('abcdefghijk',len(CCAName)/3) + Substring(CCAName,len(CCAName)-len(CCAName)/3-len(CCAName)/3+1,len(CCAName)-len(CCAName)/3-len(CCAName)/3) + Right('ijklmnopqrstuv',len(CCAName)/3) As Masked_CCAName
,Left('abcdefghijk',len(LCAName)/3) + Substring(LCAName,len(LCAName)-len(LCAName)/3-len(LCAName)/3+1,len(LCAName)-len(LCAName)/3-len(LCAName)/3) + Right('ijklmnopqrstuv',len(LCAName)/3) As Masked_LCAName
,Left('abcdefghijk',len(TMComp)/3) + Substring(TMComp,len(TMComp)-len(TMComp)/3-len(TMComp)/3+1,len(TMComp)-len(TMComp)/3-len(TMComp)/3) + Right('ijklmnopqrstuv',len(TMComp)/3) As Masked_TMComp
,Left('abcdefghijk',len(ASMTerritoryName)/3) + Substring(ASMTerritoryName,len(ASMTerritoryName)-len(ASMTerritoryName)/3-len(ASMTerritoryName)/3+1,len(ASMTerritoryName)-len(ASMTerritoryName)/3-len(ASMTerritoryName)/3) + Right('ijklmnopqrstuv',len(ASMTerritoryName)/3) As Masked_ASMTerritoryName
,TerritoryCode
FROM TargetTable
)
Update C
Set C.TerritoryName = N.Masked_TerritoryName
,C.DistrictName = N.Masked_DistrictName
,C.RegionName = N.Masked_RegionName
,C.RSMTerritoryName = N.Masked_RSMTerritoryName
,C.CCAName = N.Masked_CCAName
,C.LCAName = N.Masked_LCAName
,C.TMComp = N.Masked_TMComp
,C.ASMTerritoryName = N.Masked_ASMTerritoryName
From TargetTable C
Inner Join TempTable N ON C.TerritoryCode = N.TerritoryCode
I don't believe you can use just an alias as the update target. You have "... Update C ..."; I expect you need "... Update TargetTable ..." or "... Update TargetTable C ...".
You also don't need to list TargetTable in the FROM clause, since the target table is implied. Your join ON conditions become WHERE conditions, so your query will look like this:
With TempTable As
(
SELECT ...
FROM TargetTable
)
Update TargetTable C
Set ...
From TempTable N
Where C.TerritoryCode = N.TerritoryCode
I am trying to write a SQL query in BigQuery, and I need to filter records based on a GROUP BY column and another column in the table.
What I mean is: if a value of the grouping column (column name: mnt) has more than one row, I need to check the second column (column name: zel) and keep only the records where zel = 'X'. If the mnt value has only one row, the record should pass through unfiltered.
I have written a query for this using row_number as well as the rank and dense_rank functions, but I noticed that rank, dense_rank and row_number all return the same value within a group.
Please see the code below:
#standardsql
with t1 as (SELECT mnt,
case when rank() over (partition by ltrim(rtrim(mnt)) order by
ltrim(rtrim(mnt)) asc) >1 then 'Y' else 'N' end
as flag,
rank() over (partition by mnt order by mnt) as rn,
dense_rank() over (partition by mnt order by mnt) as drn, FROM
projectname.datasetname.tablename1),
t2 as ( SELECT
mnt,
rel,
lif,
lts,
lokez FROM projectname.datasetname.tablename2
WHERE lts <> "" AND _PARTITIONTIME = TIMESTAMP(CURRENT_DATE()) ) ,
t3 as (SELECT
lif,
lifn,
lts,
par FROM `projectname.datasetname.tablename3`)
,t4 as (SELECT rcv FROM `projectname.datasetname.tablename4` WHERE mes
= 'PRO')
select * from (
SELECT t1.mnt as mnt,
t1.flag,
t1.rn,
t1.drn,
t2.rel as zel,
t2.lokez as ZLOEKZ,
t4.rcv as Zrcv
FROM t1 left join t2 on replace(t1.mnt, '00000000', '') =
REPLACE(t2.mnt, '00000000', '') AND t1.lif = t2.lif and t2.lts <> ""
and
case when t1.flag = 'Y' and t2.rel ='X' then 1
when (t1.flag ='N' and t2.rel=t2.rel) or (t1.flag ='N' and t2.rel
is null) then 1
when t1.flag = 'Y' and t2.rel <>'X' then 2
else 3
end = 1
left join t3 ON t1.lif = t3.lif AND t2.lts = t3.lts AND
t3.par = 'BA' left join t4 on t4.rcv = t3.lifn and t2.lokez is null )
where ZLOEKZ is null order by mnt
As you can see, I am using a CASE expression, and even that does not seem to work correctly. Here is the CASE condition again:
case when t1.flag = 'Y' and t2.rel ='X' then 1
when (t1.flag ='N' and t2.rel=t2.rel) or (t1.flag ='N' and
t2.rel
is null) then 1
when t1.flag = 'Y' and t2.rel <>'X' then 2
else 3
end = 1
The record count did not match what I expected, so I added the following lines to check whether my analytic functions were giving the results I wanted:
rank() over (partition by mnt order by mnt) as rn,
dense_rank() over (partition by mnt order by mnt) as drn
Strangely, for the same mnt value, the rank, dense_rank and row_number functions all assign the same value. What am I doing wrong here? This is my output:
mnt  flag  rn  drn  rel   lokez  rcv
100  N     1   1    X     abc    123
100  N     1   1    null  xyz    123
100  N     1   1    null  def    234
In other words, for the same mnt value the flag is set to N instead of Y, and rank and dense_rank both give 1 for all three rows instead of 1, 2, 3. (I understand this for rank, but dense_rank should not do that.)
I have tried to describe the issue as clearly as I can; please let me know if I can clarify anything.
Any help is appreciated. Thanks.
SELECT * EXCEPT(ct) FROM (
  SELECT *, COUNT(*) OVER(PARTITION BY mnt) AS ct
  FROM (...)  -- placeholder: your joined query, exposing mnt and zel
) WHERE ct=1 or zel='X'
This is a snippet for the problem you described; adapt it to your own query. It keeps a row when its mnt group contains only one row, or when zel = 'X'.
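To make the filter rule concrete, here is a small Python sketch of the same logic (count the rows per mnt, then keep a row if its group has a single row or its zel is 'X'). It is only an illustration of the rule, not BigQuery code. The first three rows are taken from the output in the question, while the mnt 200 row and the filter_rows name are made up for the example. As a side note on the ranking question: when PARTITION BY and ORDER BY both use mnt, every row in a partition ties on the ordering key, and tied rows share a rank, so rank and dense_rank are expected to return 1 throughout.

```python
from collections import Counter

# First three rows follow the question's output; the mnt 200 row is invented.
rows = [
    {"mnt": "100", "zel": "X", "lokez": "abc"},
    {"mnt": "100", "zel": None, "lokez": "xyz"},
    {"mnt": "100", "zel": None, "lokez": "def"},
    {"mnt": "200", "zel": None, "lokez": "ghi"},
]


def filter_rows(rows):
    """Keep rows whose mnt group has one row, or whose zel is 'X'."""
    # Equivalent of COUNT(*) OVER (PARTITION BY mnt).
    counts = Counter(row["mnt"] for row in rows)
    return [
        row for row in rows
        if counts[row["mnt"]] == 1 or row["zel"] == "X"
    ]


kept = filter_rows(rows)
print([(row["mnt"], row["lokez"]) for row in kept])
# [('100', 'abc'), ('200', 'ghi')]
```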
I have an SQL query running in a loop. Two values, FINGER and index_str, both need to advance in parallel.
FINGER: (numpy array)
[['1012_8']
['10214_5']
['10409_9']
index_str: (pandas dataframe)
0 14,38,51,65,84,85
1 3,34,58,65,66,75
2 3,15,68,70,80,82
Above are the first 3 examples. There are over 1000 of each in reality.
for i in range(len(FINGER)):
    print i
    print FINGER[i]
    for x in index_str[i]:
        yy = FINGER[i][0]
        #print range(len(FINGER))
        index_str = str(x)
        query = "SELECT finger, ind, x,y, CAST( (direction*180/3.142)as INT),CAST(quality*100 as INT) from UNIL_fingerprints where finger = '" + yy + "' and ind IN (" + index_str + ") order by ind "
        print query
        c.execute(query)
        rows = c.fetchall()
        print rows
Above is the loop and query in question.
So far the loop runs through all the values of index_str, but only for the first FINGER value. To elaborate, the queries generated for the first 3 examples are as follows.
SELECT finger, ind, x,y, CAST( (direction*180/3.142)as INT),CAST(quality*100 as INT) from UNIL_fingerprints where finger = '1012_8' and ind IN (14,38,51,65,84,85) order by ind
SELECT finger, ind, x,y, CAST( (direction*180/3.142)as INT),CAST(quality*100 as INT) from UNIL_fingerprints where finger = '1012_8' and ind IN (3,34,58,65,66,75) order by ind
SELECT finger, ind, x,y, CAST( (direction*180/3.142)as INT),CAST(quality*100 as INT) from UNIL_fingerprints where finger = '1012_8' and ind IN (3,15,68,70,80,82) order by ind
Whereas '1012_8' should be '10214_5' and '10409_9' respectively in the 2nd and 3rd query above.
Any ideas on how to get this to update properly would be helpful.
You want zip():
for finger, indexes in zip(FINGER, index_str):
    print("fingers : {}- indexes: {}".format(finger, indexes))
Also, you REALLY want to learn the db-api and use it properly, passing values as query parameters instead of concatenating them into the SQL string (well, unless you don't mind being hacked, that is).
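As a rough sketch of what pairing the two sequences with zip() and passing values as parameters could look like, the example below uses Python's built-in sqlite3 module with a cut-down, made-up version of the UNIL_fingerprints table. The column subset, the sample rows and the fingers/index_strs names are assumptions for illustration only. The "?" placeholder style is what sqlite3 uses; other DB-API drivers may expect a different marker such as "%s".

```python
import sqlite3

# Made-up stand-in for the real table, with only a few columns and rows.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute(
    "CREATE TABLE UNIL_fingerprints (finger TEXT, ind INTEGER, x INTEGER, y INTEGER)"
)
c.executemany(
    "INSERT INTO UNIL_fingerprints VALUES (?, ?, ?, ?)",
    [
        ("1012_8", 14, 1, 2),
        ("1012_8", 99, 3, 4),
        ("10214_5", 3, 5, 6),
        ("10214_5", 34, 7, 8),
    ],
)

fingers = ["1012_8", "10214_5"]
index_strs = ["14,38,51", "3,34,58"]

results = {}
# zip() pairs the n-th finger with the n-th index string.
for finger, index_str in zip(fingers, index_strs):
    indexes = [int(part) for part in index_str.split(",")]
    # One "?" per index value; only placeholders are formatted into the SQL.
    placeholders = ",".join("?" * len(indexes))
    query = (
        "SELECT finger, ind, x, y FROM UNIL_fingerprints "
        "WHERE finger = ? AND ind IN (%s) ORDER BY ind" % placeholders
    )
    c.execute(query, [finger] + indexes)
    results[finger] = c.fetchall()

print(results)
```

Because the values travel separately from the SQL text, a finger name containing a quote character cannot change the meaning of the query.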
I have a raw SQL statement in my views.py:
Message.objects.raw('''
SELECT s1.ID, s1.CHARACTER_ID, MAX(s1.MESSAGE) MESSAGE, MAX(s1.c) occurrences
FROM
(SELECT ID, CHARACTER_ID, MESSAGE, COUNT(*) c
FROM tbl_message WHERE ts > DATE_SUB(NOW(), INTERVAL %s DAY) GROUP BY CHARACTER_ID,MESSAGE) s1
LEFT JOIN
(SELECT ID, CHARACTER_ID, MESSAGE, COUNT(*) c
FROM tbl_message WHERE ts > DATE_SUB(NOW(), INTERVAL %s DAY) GROUP BY CHARACTER_ID,MESSAGE) s2
ON s1.CHARACTER_ID=s2.CHARACTER_ID
AND s1.c < s2.c
WHERE s2.c IS NULL
GROUP BY CHARACTER_ID
ORDER BY occurrences DESC''', [days, days])
The result of this SQL statement (tested on database directly) is:
ID | CHARACTER_ID | MESSAGE | OCCURENCES
----+--------------+---------+--------------
148 | 10 | test | 133
But all I get is an InvalidQuery exception with the message "Raw query must include the primary key".
Then I double-checked the docs and read:
There is only one field that you can’t leave out - the primary key field.... An InvalidQuery exception will be raised if you forget to include the primary key.
As you can see, my statement does include the primary key. What's wrong?
class Message(models.Model):
    character = models.ForeignKey('Character')
    message = models.TextField()
    location = models.ForeignKey('Location')
    ts = models.DateTimeField()
    class Meta:
        pass
    def __unicode__(self):
        return u'%s: %s...' % (self.character, self.message[0:20])
Include "1 as id" in your query:
Message.objects.raw('''
SELECT 1 as id , s1.ID, s1.CHARACTER_ID, MAX(s1.MESSAGE) MESSAGE, MAX(s1.c) occurrences
FROM
(SELECT ID, CHARACTER_ID, MESSAGE, COUNT(*) c
FROM tbl_message WHERE ts > DATE_SUB(NOW(), INTERVAL %s DAY) GROUP BY CHARACTER_ID,MESSAGE) s1
LEFT JOIN
(SELECT ID, CHARACTER_ID, MESSAGE, COUNT(*) c
FROM tbl_message WHERE ts > DATE_SUB(NOW(), INTERVAL %s DAY) GROUP BY CHARACTER_ID,MESSAGE) s2
ON s1.CHARACTER_ID=s2.CHARACTER_ID
AND s1.c < s2.c
WHERE s2.c IS NULL
GROUP BY CHARACTER_ID
ORDER BY occurrences DESC''', [days, days])
I reproduced the same problem using Python 2.7.5, Django 1.5.1 and MySQL 5.5.
I've saved the result of the raw call to the results variable, so I can check what columns it contains:
>>> results.columns
['ID', 'CHARACTER_ID', 'MESSAGE', 'occurrences']
ID is in uppercase, so in your query I changed s1.ID to s1.id and it works:
>>> results = Message.objects.raw('''
... SELECT s1.id, s1.CHARACTER_ID, MAX(s1.MESSAGE) MESSAGE, MAX(s1.c) occurrences
... FROM
... (SELECT ID, CHARACTER_ID, MESSAGE, COUNT(*) c
... FROM tbl_message WHERE ts > DATE_SUB(NOW(), INTERVAL %s DAY) GROUP BY CHARACTER_ID,MESSAGE) s1
... LEFT JOIN
... (SELECT ID, CHARACTER_ID, MESSAGE, COUNT(*) c
... FROM tbl_message WHERE ts > DATE_SUB(NOW(), INTERVAL %s DAY) GROUP BY CHARACTER_ID,MESSAGE) s2
... ON s1.CHARACTER_ID=s2.CHARACTER_ID
... AND s1.c < s2.c
... WHERE s2.c IS NULL
... GROUP BY CHARACTER_ID
... ORDER BY occurrences DESC''', [days, days])
>>> results.columns
['id', 'CHARACTER_ID', 'MESSAGE', 'occurrences']
>>> results[0]
<Message_Deferred_character_id_location_id_message_ts: Character object: hello...>
Make sure the primary key is part of the SELECT statement.
Example:
This will not work:
`Model.objects.raw("Select Min(id), rider_id from Table_Name group by rider_id")`
But this will work:
`Model.objects.raw("Select id, Min(id), rider_id from Table_Name group by rider_id")`
For anyone else stuck on this and wondering, as I was, why Django needs a pk when the query does not filter on one (e.g. you want multiple rows): Django just needs an id field to be returned. The pk does not need to be part of the WHERE clause. For example:
select * from table where foo = 'bar';
or
select id, description from table where foo = 'bar';
Both of these work, if there is a field id in the table. But this throws the error described by Thomas Schwärzl, because no id field is returned:
select description from table where foo = 'bar';
How do I do the following using DAO on a recordset?
SELECT TOP 1 * FROM foo WHERE id = 10 ORDER BY timestamp DESC
With SetCurrentIndex it seems you can only use one index; otherwise, indexing on id and timestamp and selecting the first record would work.
I am by no means sure of what you want.
Dim rs As DAO.Recordset
Dim db As DAO.Database
Dim sSQL As String
Set db = CurrentDb
sSQL = "SELECT TOP 1 * FROM foo WHERE id = 10 ORDER BY timestamp DESC"
Set rs = db.OpenRecordset(sSQL)
Find does not work with all recordsets. This will work:
Set rs = CurrentDb.OpenRecordset("select * from table1")
rs.FindFirst "akey=1 and atext='b'"
If Not rs.EOF Then Debug.Print rs!AKey
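This will not, because opening a local table by name gives a table-type recordset, which does not support the Find methods: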
Set rs = CurrentDb.OpenRecordset("table1")
rs.FindFirst "akey=1 and atext='b'"