Python MySQL to Text File - Headings Required

I have some code that queries a MySQL database and sends the output to a text file.
The code below prints the first 7 columns of data to a text file called Test.
My question is: how do I also obtain the column headings from the database so they appear in the text file?
I am using Python 2.7 with a MySQL database.
import MySQLdb
import sys
connection = MySQLdb.connect(host="localhost", user="", passwd="", db="")
cursor = connection.cursor()
cursor.execute("select * from tablename")
data = cursor.fetchall()
# Raw string so the backslashes in the Windows path are taken literally
OutputFile = open(r"C:\Temp\Test.txt", "w")
for row in data:
    print >>OutputFile, row[0], row[1], row[2], row[3], row[4], row[5], row[6]
OutputFile.close()
cursor.close()
connection.close()
sys.exit()

One way to get the column names is to query INFORMATION_SCHEMA:
SELECT `COLUMN_NAME`
FROM `INFORMATION_SCHEMA`.`COLUMNS`
WHERE `TABLE_SCHEMA`='yourdatabasename'
AND `TABLE_NAME`='yourtablename';
or to use MySQL's SHOW command:
SHOW COLUMNS FROM `yourtablename`;
The SHOW command is specific to MySQL.
Either way, call .fetchall() on the cursor afterwards to retrieve the column names, just as you do for the data.
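Another option avoids the second query altogether: DB-API cursors, MySQLdb's included, expose a description attribute after execute(), and the first item of each entry is the column name. The sketch below uses the standard-library sqlite3 module as a stand-in so it runs without a MySQL server; the helper name, the people table and its rows are made up for illustration, and the code is written for Python 3.

```python
import io
import sqlite3


def dump_with_headings(cursor, out, sep=" "):
    """Write a heading line taken from cursor.description, then every row."""
    # Each description entry is a sequence whose first item is the column name.
    headings = [col[0] for col in cursor.description]
    out.write(sep.join(headings) + "\n")
    for row in cursor.fetchall():
        out.write(sep.join(str(value) for value in row) + "\n")
    return headings


def demo():
    """Run the helper against a throwaway in-memory table (hypothetical data)."""
    con = sqlite3.connect(":memory:")
    con.execute("create table people (id integer, name text)")
    con.executemany("insert into people values (?, ?)", [(1, "alice"), (2, "bob")])
    cur = con.cursor()
    cur.execute("select * from people order by id")
    buf = io.StringIO()  # stands in for the open text file
    headings = dump_with_headings(cur, buf)
    con.close()
    return headings, buf.getvalue()
```

With MySQLdb the same helper should work unchanged on the cursor from the question, with the open output file passed as out.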

Related

Python Sqlite3 attaching to a memory database

I am using Python 2.7 with sqlite3 version 2.6.0. I am trying to create an in-memory database, attach to it from a physical database and insert data, then query it back later. I am running into issues and would appreciate any help.
The following two calls fail with the error message "unable to open database file":
con = sqlite3.connect(":memory:?cache=shared")
con = sqlite3.connect("file::memory:?cache=shared")
The following works until I attempt to access a table in the attached DB. I can do this with physical databases with no problem. I suspect the issue is the missing cache=shared.
con = sqlite3.connect(":memory:")
cursor = con.cursor()
cursor.executescript("create table table1 (columna int)")
cursor.execute("select * from table1")
con2 = sqlite3.connect("anotherdb.db")
cursor2 = con2.cursor()
cursor2.execute("attach database ':memory:' as 'foo'")
cursor2.execute("select * from foo.table1")
The error from the last select is "no such table: foo.table1".
Thanks in advance.
The SQLite library shipped with Python 2.x does not have URI file names enabled, so it is not possible to open an in-memory database in shared-cache mode.
To get shared-cache in-memory databases, switch to APSW or to Python 3.
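For the Python 3 route, a minimal sketch might look like the following. It assumes the bundled SQLite was built with shared-cache support, uses the hypothetical name memdb1 for the shared in-memory database, and opens the second connection on a private in-memory database where the question uses anotherdb.db, so that nothing is written to disk. Both connections pass uri=True so that SQLite interprets the file: name as a URI, in connect() and in ATTACH alike.

```python
import sqlite3

# Named in-memory database in shared-cache mode; "memdb1" is an arbitrary name.
SHARED_URI = "file:memdb1?mode=memory&cache=shared"


def attach_shared_memory_db():
    """Create a shared in-memory table, then read it via ATTACH from another connection."""
    mem = sqlite3.connect(SHARED_URI, uri=True)
    mem.execute("create table table1 (columna int)")
    mem.executemany("insert into table1 values (?)", [(1,), (2,)])
    mem.commit()  # release the write lock so the other connection can read

    # Stand-in for sqlite3.connect("anotherdb.db", uri=True) in the question.
    other = sqlite3.connect(":memory:", uri=True)
    other.execute("attach database ? as foo", (SHARED_URI,))
    rows = other.execute("select columna from foo.table1 order by columna").fetchall()

    other.close()
    mem.close()  # the in-memory database disappears with its last connection
    return rows
```

The first connection has to stay open for as long as the data is needed, because an in-memory database is discarded when its last connection closes.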

Vertica Query Result Export and Import using Python

I want to copy certain data from one Vertica cluster (let's say a test cluster) to another Vertica cluster (let's say a QA cluster). Manually, I can do this by dumping the result of a query into a CSV file and then importing it on the other cluster. But how can I do it in a Python script without using os or system commands? I want to do it purely with a Python module or adapter. At the moment I am using the python-vertica adapter: I can connect to the test cluster and get the data into a Python list, but I cannot export it to a CSV file natively through the adapter (i.e. without using the Python csv module). Also, how can I import the CSV file into my QA cluster using the same adapter (or a different Vertica module for Python)?
For simple cases you can do it with COPY FROM VERTICA. Read here for more info.
For Python, you can use my template:
Environment:
python=2.7.x
vertica-python==0.7.3
Vertica Analytic Database v8.1.1-10
Source code example:
#!/usr/bin/env python2
# coding: UTF-8
import csv
import cStringIO
import vertica_python
# connection info: username, password, etc
SRC_DB_INFO = {...}
DST_DB_INFO = {...}
csvbuffer = cStringIO.StringIO()
csvwriter = csv.writer(csvbuffer, delimiter='|', lineterminator='\n', quoting=csv.QUOTE_MINIMAL)
# establish connection to source database
connection = vertica_python.connect(**SRC_DB_INFO)
cursor = connection.cursor()
cursor.execute('SELECT * FROM A')
# convert data to csv format
for row in cursor.iterate():
    csvwriter.writerow(row)
# cleanup
cursor.close()
connection.close()
# establish connection to destination database
connection = vertica_python.connect(**DST_DB_INFO)
cursor = connection.cursor()
# copy data
cursor.copy('COPY B FROM STDIN ABORT ON ERROR', csvbuffer.getvalue())
connection.commit()
# cleanup
cursor.close()
connection.close()
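To see what the buffer handed to COPY actually contains, the CSV step can be tried on its own. The sketch below is a Python 3 rendering of that step (io.StringIO instead of cStringIO) with made-up sample rows; the function name is hypothetical. Note that csv.QUOTE_MINIMAL wraps any field containing the delimiter in double quotes; whether the destination COPY statement honours those quotes depends on its options, which the template above does not set, so treat that as something to check against your data.

```python
import csv
import io


def rows_to_pipe_csv(rows):
    """Serialise an iterable of row tuples into pipe-delimited text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="|", lineterminator="\n",
                        quoting=csv.QUOTE_MINIMAL)
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()


# Hypothetical sample rows; the second one contains the delimiter itself.
sample = rows_to_pipe_csv([(1, "alpha"), (2, "be|ta")])
```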

Fetch a file(.csv) from S3 bucket and copy to an RDS

I want to connect to an S3 bucket, get the CSV files and copy the rows to an RDS DB. This script uses arcpy, a package I'm not that familiar with; I'm just trying to read the CSV file directly from the S3 bucket as the source, without downloading it to the server. The code is as follows:
import arcpy
from boto.s3.key import Key
import StringIO
import pandas as pd
import boto
import boto.s3.connection
access_key = ''
secret_key = ''
conn = boto.connect_s3(aws_access_key_id = access_key,aws_secret_access_key = secret_key,host = 's3.amazonaws.com')
b = conn.get_bucket('mybucket')
#for key in b.list:
b_key = b.get_key('file1.csv')
arcpy.env.overwriteOutput = True
b_url = b_key.generate_url(0, query_auth=False, force_http=True)
print b_url
##Read file
k = Key(b, 'file1.csv')
content = k.get_contents_as_string()
sourcefile_csv = pd.read_csv(StringIO.StringIO(content))
##CopyRows_management (in_rows, out_table, {config_keyword})
#http://pro.arcgis.com/en/pro-app/tool-reference/data-management/copy-rows.htm
arcpy.CopyRows_management(sourcefile_csv, "RDSTablePath", "")
print("copy rows done")
Error in CopyRows: arcgisscripting.ExecuteError: Failed to execute. Parameters are not valid.
If we use a path on the server as the source path, as below, it works fine:
sourcefile_csv = "D:\\DEV\\file1.csv"
arcpy.CopyRows_management(sourcefile_csv, "RDSTablePath", "")
Any help would be appreciated.
It looks like you are trying to use the pandas DataFrame as the table that CopyRows_management reads from. I don't think that is a valid input for the function, hence the "Parameters are not valid" error. The documentation says that in_rows should be "The rows from a feature class, layer, table, or table view to be copied." I think pandas is unnecessary here anyway.
So either save the CSV somewhere the script can access it (as you did when you used the path on the server) or, if you don't want to save the file anywhere, read the contents of the CSV and iterate through them, using an Insert Cursor to write to your table/feature class.
See this post on how to read a CSV from a string using the csv module. Then just loop through the rows of the CSV and use the Insert Cursor to write to your table.
If your RDS instance happens to be Aurora MySQL, take a look at its Loading Data from S3 feature, which lets you skip the code and load the file into your DB line by line.
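A rough sketch of that loop, written for Python 3: the CSV text is parsed in memory and each data row is handed to a callback. The function name and the insert_row parameter are hypothetical. With arcpy, the callback would presumably be the insertRow method of an arcpy.da.InsertCursor opened on the target table with the header fields; that part is not exercised here, and the content argument is assumed to be already-decoded text.

```python
import csv
import io


def copy_csv_rows(content, insert_row):
    """Parse CSV text held in memory and pass each data row to insert_row.

    Returns the header fields and the number of rows handed over.
    """
    reader = csv.reader(io.StringIO(content))
    header = next(reader)  # first line holds the field names
    count = 0
    for row in reader:
        insert_row(row)  # e.g. cursor.insertRow(row) on an arcpy insert cursor
        count += 1
    return header, count


# Demo with a plain list standing in for the insert cursor.
collected = []
header, count = copy_csv_rows("id,name\n1,a\n2,b\n", collected.append)
```

All values arrive as strings, so numeric fields would need converting before they are inserted into typed columns.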

Accessing Hive from remote server through Python

I have installed following necessary packages on the remote server to access Hive through Python.
Python 2.7.6,
Python development tools,
pyhs2,
sasl-0.1.3,
thrift-0.9.1,
PyHive-0.1.0
Here is the Python script to access Hive.
#!/usr/bin/env python
import pyhs2 as hive
import getpass
DEFAULT_DB = 'camp'
DEFAULT_SERVER = '10.25.xx.xx'
DEFAULT_PORT = 10000
DEFAULT_DOMAIN = 'xxx.xxxxxx.com'
# Get the username and password
u = raw_input('Enter PAM username: ')
s = getpass.getpass()
# Build the Hive Connection
connection = hive.connect(host=DEFAULT_SERVER, port=DEFAULT_PORT, authMechanism='LDAP', user=u + '#' + DEFAULT_DOMAIN, password=s)
# Hive query statement
statement = "select * from camp.test"
cur = connection.cursor()
# Runs a Hive query and returns the result as a list of list
cur.execute(statement)
df = cur.fetchall()
Here is the output I got:
File "build/bdist.linux-x86_64/egg/pyhs2/__init__.py", line 7, in connect
File "build/bdist.linux-x86_64/egg/pyhs2/connections.py", line 46, in __init__
File "build/bdist.linux-x86_64/egg/pyhs2/cloudera/thrift_sasl.py", line 74, in open
File "build/bdist.linux-x86_64/egg/pyhs2/cloudera/thrift_sasl.py", line 92, in _recv_sasl_message
File "build/bdist.linux-x86_64/egg/thrift/transport/TTransport.py", line 58, in readAll
File "build/bdist.linux-x86_64/egg/thrift/transport/TSocket.py", line 118, in read
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes
I don't see any error in the output after executing the script, however I don't see any query results on the screen. I'm not sure why it's not displaying any query results, Hive server IP, port, user and password are correct. I also verified connectivity between hive server and remote server, no issues with connectivity.
Try using this code:
import pyhs2
with pyhs2.connect(host='localhost',
                   port=10000,
                   authMechanism="PLAIN",
                   user='root',
                   password='test',
                   database='default') as conn:
    with conn.cursor() as cur:
        #Show databases
        print cur.getDatabases()
        #Execute query
        cur.execute("select * from table")
        #Return column info from query
        print cur.getSchema()
        #Fetch table results
        for i in cur.fetch():
            print i
I've managed to get access by using the following:
from pyhive import presto
DEFAULT_DB = 'XXXXX'
DEFAULT_SERVER = 'server.name.blah'
DEFAULT_PORT = 8000
# Username
u = "user"
# Build the Hive Connection
connection = presto.connect(host=DEFAULT_SERVER, port=DEFAULT_PORT, username=u)
# Hive query statement
statement = "select * from public.dudebro limit 5"
cur = connection.cursor()
# Runs a Hive query and returns the result as a list of list
cur.execute(statement)
df = cur.fetchall()
print df

cannot read tables from sqlite3 database attached in python

I can connect to a database in the sqlite3 shell, attach another database and run an inner join to retrieve records from two tables, one in each database. But when I try to do the same from a Python script run on the command line, I get no results: the error says that the table (in the attached database) does not exist.
import sqlite3 as lite
db_acts = '/full/path/to/activities.db'
db_sign = '/full/path/to/sign_up.db'
def join_tables():
    try:
        con = lite.connect(db_acts)
        cursor = con.cursor()
        cursor.execute("attach database 'db_sign' as 'sign_up'")
        cursor.execute("select users.ID, users.Email, users.TextMsg from sign_up.users INNER JOIN db_acts.alerts on sign_up.users.ID = db_acts.alerts.UID")
        rows = cursor.fetchall()
        for row in rows:
            print 'row', row
        con.commit()
        con.close()
    except lite.Error, e:
        print 'some error'
        sys.exit(1)
The response on localhost is the same as on the HostGator remote host where I just ran a test (it's a new site without user inputs at the moment). I have no problem reading rows from tables in the original database connection - only the tables in the attached database are not read. The attachment works at least partially - a print statement to attach it in the except clause shows that the database is in use.
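Two details in the script look like plausible causes. Inside the SQL string, 'db_sign' is literal text rather than the Python variable, so SQLite attaches (and creates, if missing) a file literally named db_sign, which has no users table. And db_acts is a Python variable, not a schema name; the database opened by connect() is always addressed as main. A minimal sketch under those assumptions, written for Python 3, with hypothetical helper names and made-up demo data:

```python
import os
import sqlite3
import tempfile


def join_across_databases(acts_path, sign_path):
    """Join main.alerts with users from an attached database file."""
    con = sqlite3.connect(acts_path)
    try:
        cur = con.cursor()
        # Bind the path so the variable's value is used, not the text 'db_sign'.
        cur.execute("ATTACH DATABASE ? AS sign_up", (sign_path,))
        # The database passed to connect() is referred to as "main".
        cur.execute(
            "SELECT u.ID, u.Email, u.TextMsg "
            "FROM sign_up.users AS u "
            "INNER JOIN main.alerts AS a ON u.ID = a.UID "
            "ORDER BY u.ID"
        )
        return cur.fetchall()
    finally:
        con.close()


def build_demo_files(directory):
    """Create two small database files with made-up rows for the demo."""
    acts_path = os.path.join(directory, "activities.db")
    sign_path = os.path.join(directory, "sign_up.db")
    con = sqlite3.connect(acts_path)
    con.execute("CREATE TABLE alerts (UID INTEGER)")
    con.execute("INSERT INTO alerts VALUES (1)")
    con.commit()
    con.close()
    con = sqlite3.connect(sign_path)
    con.execute("CREATE TABLE users (ID INTEGER, Email TEXT, TextMsg TEXT)")
    con.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        [(1, "a@example.com", "hi"), (2, "b@example.com", "yo")],
    )
    con.commit()
    con.close()
    return acts_path, sign_path
```

Printing the caught exception (for example its message) instead of a fixed 'some error' string would also show exactly which table SQLite could not find.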