pyodbc sql results getting cut off - python-2.7

I have a script as below
cursor = connection.cursor()
select_string = "SELECT * from mytable"
cursor.execute(select_string)
data = cursor.fetchall()
print(data)
print len(data)
Data is as listed below
number day info is_true
82 Monday quick "lazy fox" &amp bear true
12 Tuesday why did 'the frog' cross false
When I print the length of the data, the is_true column does not seem to be counted, apparently because of the quotation marks and special characters in the info column. Is there a way to select * from a table and ignore any quotation marks that might end the column processing early?

The string formatting shouldn't be a problem if you use Pandas for reading the table from the SQL connection. This should work:
import pandas as pd
import pyodbc
connection = pyodbc.connect('<SERVER>')
select_string = "SELECT * from mytable"
data = pd.read_sql(select_string , connection)
print(data)
print(data.shape)
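Note that `len(data)` on a `fetchall()` result counts rows, not columns, so on its own it cannot show whether is_true was dropped. A minimal sketch of how to check both, assuming an in-memory sqlite3 database as a stand-in for the pyodbc connection (the table layout is copied from the question; everything else is illustrative):

```python
import sqlite3

# In-memory SQLite stands in for the pyodbc connection (an assumption for this sketch).
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE mytable (number INTEGER, day TEXT, info TEXT, is_true TEXT)")

# Parameterized inserts keep quotes and ampersands inside the values intact.
sample_rows = [
    (82, "Monday", 'quick "lazy fox" &amp bear', "true"),
    (12, "Tuesday", "why did 'the frog' cross", "false"),
]
cursor.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", sample_rows)

cursor.execute("SELECT * FROM mytable")
data = cursor.fetchall()

row_count = len(data)        # number of rows returned
column_count = len(data[0])  # number of columns in the first row
column_names = [col[0] for col in cursor.description]

print(row_count)
print(column_count)
print(column_names)
```

`cursor.description` is part of the Python DB-API, so the same check should carry over to a pyodbc cursor.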

How do I save a table from big query into colab as a pandas dataframe?

I am new to BigQuery (BQ) and am trying to load my BQ tables into a pandas DataFrame in my Colab environment. This is the code I am using, but I am getting a "Bad Request" error. Any ideas on how I can troubleshoot it? I can't figure out what I am doing wrong.
My code is as follows:
from google.cloud import bigquery
client = bigquery.Client(project=project_id)
sample_count = 2000
row_count = client.query('''
SELECT
COUNT(*) as total
FROM `123.cleaned_sales`''').to_dataframe().total[0]
df = client.query('''
SELECT
*
FROM
`123.cleaned_sales`
WHERE RAND() < %d/%d
''' % (sample_count, row_count)).to_dataframe()
print('Full dataset has %d rows' % row_count)
Here is my error message (screenshot): https://i.stack.imgur.com/VQOox.png
The image you shared indicates that the projectId and datasetId must not be empty in this query:
row_count = client.query('''
SELECT
COUNT(*) as total
FROM `123.cleaned_sales`''').to_dataframe().total[0]
You have to set the projectId:
row_count = client.query('''
SELECT
COUNT(*) as total
FROM `your_project_id.123.cleaned_sales`''').to_dataframe().total[0]
Do the same with the second query.
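A small sketch of building the fully qualified table reference before passing the query to `client.query()`. The helper name `qualified_table` is hypothetical, and `your_project_id` is the placeholder from the answer above, not a real project:

```python
def qualified_table(project_id, dataset_id, table_id):
    """Return a backtick-quoted project.dataset.table reference."""
    parts = (project_id, dataset_id, table_id)
    # Fail early instead of sending a query with an empty project or dataset.
    if not all(parts):
        raise ValueError("project, dataset and table IDs must all be non-empty")
    return "`%s.%s.%s`" % parts


# "your_project_id" is a placeholder; "123" and "cleaned_sales" come from the question.
table_ref = qualified_table("your_project_id", "123", "cleaned_sales")
count_query = "SELECT COUNT(*) AS total FROM %s" % table_ref
print(count_query)
```

The same `table_ref` can then be reused in the sampling query so both queries name the table identically.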

Python MySQLdb print row(s) from Table A where not in Table B

I have two tables, sales_olap and resellers.
I need to print out which resellers from sales_olap do not exist in the resellers table.
For instance:
result = cursor.execute("SELECT SO.reseller_name FROM sales_olap AS SO WHERE SO.reseller_name!=(SELECT reseller FROM resellers)")
for row in result:
    print row
but I am getting an error of: 1242, Subquery returns more than 1 row
How can I get it to only print the reseller names from sales_olap table where they do not exist in resellers table?
If I try doing:
result = cursor.execute("SELECT reseller_name FROM sales_olap WHERE reseller_name NOT IN(SELECT reseller FROM resellers)")
for row in result:
    print row['reseller_name']
Then I get the error of: TypeError: 'long' object is not iterable
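I was able to get this working by fetching the rows with cursor.fetchall() instead of iterating over the return value of cursor.execute(), which in MySQLdb is a row count rather than the result set: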
cursor.execute("SELECT reseller_name FROM sales_olap LEFT JOIN resellers ON sales_olap.reseller_name=resellers.reseller WHERE resellers.reseller IS NULL")
result = cursor.fetchall()
for row, in result:
    print (row)
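For comparison, here is a minimal sketch of both anti-join forms side by side. It assumes an in-memory sqlite3 database as a stand-in for the MySQL server, and the sample reseller names are made up; only the table and column names come from the question:

```python
import sqlite3

# In-memory SQLite stands in for the MySQL server (an assumption for this sketch).
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE sales_olap (reseller_name TEXT)")
cursor.execute("CREATE TABLE resellers (reseller TEXT)")

# Made-up sample data: "Gamma" is the only name missing from resellers.
cursor.executemany("INSERT INTO sales_olap VALUES (?)", [("Alpha",), ("Beta",), ("Gamma",)])
cursor.executemany("INSERT INTO resellers VALUES (?)", [("Alpha",), ("Beta",)])

# Variant 1: NOT IN with a subquery.
cursor.execute(
    "SELECT reseller_name FROM sales_olap "
    "WHERE reseller_name NOT IN (SELECT reseller FROM resellers)"
)
missing_not_in = [name for (name,) in cursor.fetchall()]

# Variant 2: LEFT JOIN, keeping only rows that found no match.
cursor.execute(
    "SELECT reseller_name FROM sales_olap "
    "LEFT JOIN resellers ON sales_olap.reseller_name = resellers.reseller "
    "WHERE resellers.reseller IS NULL"
)
missing_left_join = [name for (name,) in cursor.fetchall()]

print(missing_not_in)
print(missing_left_join)
```

One difference worth knowing: if resellers.reseller can contain NULL, the NOT IN form returns no rows at all, while the LEFT JOIN form is unaffected.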

SQLite sorting order by asc

I am using SQLite with python-2.7. My SQLite database contains a date field that stores dates in dd/MM/yyyy format, like this:
31/02/2018
30/02/2017
01/06/2018
How can I sort it in ascending order?
Try this query:
SELECT date(substr(`date_column`,7,4)||'-'||substr(`date_column`,4,2)||'-'||substr(`date_column`,1,2)) as text_date FROM `table` order by text_date asc
First get all rows with
import sqlite3
import time
connect = sqlite3.connect('database_file')
cursor = connect.execute('select date from table_name')
rows = cursor.fetchall()
then add a converted_time column of type INT to your table.
Then convert each date to seconds since the epoch and, inside the same loop, write it back to its row
for row in rows:
    t = time.strptime(row[0],'%d/%m/%Y')
    converted_time = time.mktime(t)
    connect.execute('''UPDATE table_name SET converted_time = ? WHERE date = ?''',
                    (converted_time, row[0]))
connect.commit()
Then you can sort by converted_time with any program.
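The query-based approach can also be run straight from Python without adding a column. A minimal sketch, assuming an in-memory database and the hypothetical names `events` and `date_column`; the three date values are the ones from the question:

```python
import sqlite3

# In-memory database with a hypothetical table and column name.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE events (date_column TEXT)")
connection.executemany(
    "INSERT INTO events VALUES (?)",
    [("31/02/2018",), ("30/02/2017",), ("01/06/2018",)],  # sample values from the question
)

# Rebuild each dd/MM/yyyy value as yyyy-MM-dd; that text form sorts chronologically.
query = """
    SELECT date_column
    FROM events
    ORDER BY substr(date_column, 7, 4) || '-' ||
             substr(date_column, 4, 2) || '-' ||
             substr(date_column, 1, 2) ASC
"""
sorted_dates = [value for (value,) in connection.execute(query)]
print(sorted_dates)
```

The sort key is compared as plain text here, so it does not matter that 31/02 is not a real calendar date; `time.strptime`, by contrast, raises a ValueError for such values.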

UFT API TEST: Create SQL query based on values from previous step activity at run time

Steps to be performed in UFT API Test:
Get JSON RESPONSE from added REST activity in test flow
Add Open DB connection activity
Add Select Data activity with query string
SELECT Count(*) From Table1 Where COL1 = 'XXXX' and COL2 = ' 1234'
(here COL2 value has length of 7 characters including spaces)
In the above query, the values in the WHERE clause are received dynamically at run time from the JSON response.
When I try to link the query value using "link to data source" with a custom expression,
e.g.:
SELECT COUNT(*) FROM Table1 Where COL1 =
'{Step.ResponseBody.RESTACTIVITYxx.OBJECT[1].COL1}' and COL2 =
'{Step.ResponseBody.RESTACTIVITYxx.OBJECT[1].COL2}'
then the query changes (the spaces in COL2 are dropped) to:
SELECT Count(*) From Table1 Where COL1 = 'XXXX' and COL2 = '1234'
I even tried the Concatenate and Replace String activities, but the same thing happens.
Please help.
You can use the StringConcatenation action to build the query string.
Then use that string as the query in the database "Select Data" activity.

How to insert a particular column from a DataFrame into a new column of a database table?

I have the following DataFrame, where column 0 is an ID, column 1 is Name, and column 2 is Total. Column 2 is newly generated.
0 1 2
0 1 Name1 1
1 2 Name2 8
2 3 Name3 6
3 4 Name4 5
and so on..
ID is a primary key in an existing table in my database. I created a new column in my table (which originally has two columns, ID and Name), labeled "Total", and I want to insert the column 2 values into it for each corresponding ID.
I'm currently regenerating the existing table in the DataFrame with the new Total column at the end, then rewriting the whole table using df.to_sql(..., if_exists='replace').
Here is my full code for reference:
import sqlite3
from pandas import DataFrame
#access the database created
db = sqlite3.connect('database')
c = db.cursor()
c.execute("select ID, Name, count(*) from table1 as t1 join table2 as t2 on t1.ID=t2.ID group by ID")
df = DataFrame(c.fetchall())
df.to_sql('table1', db, if_exists='replace', index=False)
I get the following error:
AttributeError: 'numpy.int64' object has no attribute 'replace'
Proper way to do it:
import sqlite3
import pandas as pd
#access the database created
db = sqlite3.connect('database')
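# qualify ID and Name (both tables have an ID column) and name the count column Total
df = pd.read_sql("select t1.ID, t1.Name, count(*) as Total from table1 as t1 join table2 as t2 on t1.ID=t2.ID group by t1.ID", db)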
df.to_sql('table1', db, if_exists='replace', index=False)
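If the goal is only to fill the new Total column, an alternative to rewriting the whole table is to update the existing rows in place by ID. A rough sketch, assuming an in-memory database with made-up rows; the `totals` list stands in for (Total, ID) pairs taken from the DataFrame:

```python
import sqlite3

# In-memory database with made-up rows; table and column names follow the question.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE table1 (ID INTEGER PRIMARY KEY, Name TEXT)")
db.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, "Name1"), (2, "Name2"), (3, "Name3")],
)

# Add the new column; existing rows hold NULL there until they are updated.
db.execute("ALTER TABLE table1 ADD COLUMN Total INTEGER")

# (total, id) pairs, standing in for the DataFrame's Total and ID columns.
totals = [(1, 1), (8, 2), (6, 3)]
db.executemany("UPDATE table1 SET Total = ? WHERE ID = ?", totals)
db.commit()

rows = db.execute("SELECT ID, Name, Total FROM table1 ORDER BY ID").fetchall()
print(rows)
```

This leaves the table definition, including the primary key on ID, untouched, whereas replacing the table recreates it from the DataFrame.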