C++ MySQL Connector no utf8 - c++

I have a problem getting UTF-8 strings from a MySQL database. I use MySQL Connector/C++ 1.1 and connect with the following code:
sql::ConnectOptionsMap connection_properties;
connection_properties["hostName"] = server;
connection_properties["userName"] = user;
connection_properties["password"] = password;
connection_properties["schema"] = database;
connection_properties["port"] = 3306;
connection_properties["OPT_CHARSET_NAME"] = "utf8";
connection_properties["characterSetResults"] = "utf8";
connection_properties["preInit"] = "SET NAMES utf8";
driver = get_driver_instance();
con = driver->connect(connection_properties);
con->setSchema(database);
As you can see, I have already tried several UTF-8 options.
If a statement should return database strings like "アフガニスタン", I only see characters like "アフガニスタン" in the Visual Studio debugger. The code being observed:
std::string name = res->getString(2);
After JSON encoding, it prints "ÒéóÒâòÒé¼ÒâïÒé╣Òé┐Òâ│" to the command line.
Other UTF-8 columns containing ordinary Latin characters are returned as expected. Only the translation columns with non-Latin characters are affected.
The same database call from PHP, with the same logic (DB connection and JSON encode) on the same PC, prints the following characters: "\u30a2\u30d5\u30ac\u30cb\u30b9\u30bf\u30f3".
Any ideas about that?

Actually, there is no problem. I wrote the returned data to a file and all UTF-8 characters are correct. The debugger and CMD simply cannot display UTF-8 data as expected.
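The conclusion above can be checked at the byte level. The following is a minimal Python sketch, not part of the original C++ program: it takes the UTF-8 bytes of the string from the question and decodes them with two single-byte code pages. The choice of cp1252 (for the debugger view) and cp850 (for the console) is an assumption; the actual code pages depend on the Windows locale.

```python
import json

name = "アフガニスタン"          # the text stored in the database
raw = name.encode("utf-8")      # the bytes a UTF-8 connection returns

# Reading UTF-8 bytes as a single-byte code page produces garbage.
# cp1252 and cp850 are assumed here; other locales use other code pages.
as_debugger = raw.decode("cp1252")
as_console = raw.decode("cp850")

# The bytes themselves are intact: decoding as UTF-8 restores the text.
restored = raw.decode("utf-8")

# json.dumps escapes non-ASCII characters by default, so this output is
# plain ASCII and independent of what the terminal can display.
print(json.dumps(restored))
```

If the data round-trips like this, the fix belongs in the display layer (for example, writing to a file and opening it in a UTF-8 aware editor) rather than in the connection options.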

Related

Qt: Safe parsing of Windows format data under Linux

I have a server-client application in which JSON data is sent between the two. The client has a Linux and a Windows version, while the server application runs under Linux.
The Linux client communicates just fine, but I have problems with the Windows client.
The problematic JSON data contains a text field with an apostrophe. Let's say the content is "a dog`s name", then the Windows client sends this as "a dog\x92s name", while the Linux client goes for "a dog\xE2\x80\x99s name", at least that is what qDebug() shows me.
I parse the JSON data with the lines
QJsonDocument document = QJsonDocument::fromJson(body);
if(document.isArray()) json_data = document.array();
if(document.isObject()) json_data.append(document.object());
where body is a QByteArray and json_data is a QJsonArray.
If the Windows data is fed into this, the Qt JSON parser does not seem to recognize it as valid JSON, and json_data ends up empty.
I really don't want to handle those specific characters manually, because this should work not only for that apostrophe but for any special characters a user might enter. Is there a general way to handle this? I assume the Windows client uses something like the Windows-1252 encoding?
I think the Windows client sends strings encoded in CP1251 or CP1252, while the JSON decoder expects UTF-8.
Maybe the source code is not saved as UTF-8 and contains string literals. Qt 4 has QTextCodec::setCodecForCStrings; Qt 5 assumes string literals are encoded in UTF-8.
$ echo -n "’" | iconv -f utf-8 -t cp1251 | xxd
00000000: 92
$ echo -n "’" | xxd
00000000: e280 99
If you don't want to fix the Windows client the proper way (fixing its output encoding), you can handle this on the server by converting all input from the Windows client to Unicode and re-encoding it as UTF-8 before building the QJsonDocument.
QByteArray bodycp1252;
QTextCodec* cp1252 = QTextCodec::codecForName("CP1252");
QTextCodec* utf8 = QTextCodec::codecForName("UTF-8");
QByteArray body = utf8->fromUnicode(cp1252->toUnicode(bodycp1252));
QJsonDocument document = QJsonDocument::fromJson(body);
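The same idea can be sketched in Python, which makes the byte-level behaviour easy to try out. This is an illustration only, not Qt code: the helper name parse_json_bytes and the cp1252 fallback are assumptions. It decodes strictly as UTF-8 first and falls back to the legacy code page only when that fails, so well-formed input from the Linux client is left untouched.

```python
import json

def parse_json_bytes(body, fallback="cp1252"):
    """Parse JSON from raw bytes, trying UTF-8 first, then a legacy code page."""
    try:
        text = body.decode("utf-8")    # strict: raises on invalid UTF-8
    except UnicodeDecodeError:
        text = body.decode(fallback)   # assumed encoding of the Windows client
    return json.loads(text)

# The two payloads from the question, as raw bytes.
linux_body = b'{"text": "a dog\xe2\x80\x99s name"}'
windows_body = b'{"text": "a dog\x92s name"}'

print(parse_json_bytes(linux_body) == parse_json_bytes(windows_body))
```

One caveat with this approach: a short legacy-encoded payload can happen to be valid UTF-8 as well, so the check is a heuristic rather than a guarantee.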
It's possible to check whether a QByteArray contains valid UTF-8 data with the QUtf8::isValidUtf8(const char *chars, qsizetype len) function. It is defined in private headers, so you need to add QT += core-private. Unfortunately, the implementation is not visible to the linker (it is not exported from QtCore.lib), so you need to add qutfcodec.cpp from the Qt sources to your project to resolve the linker errors.
////////////////// is-valid-utf8.pro
QT -= gui
QT += core core-private
CONFIG += c++11 console
CONFIG -= app_bundle
qt_src = "C:/Qt/5.15.1/Src"
SOURCES += \
main.cpp \
$$qt_src/qtbase/src/corelib/codecs/qutfcodec.cpp
////////////////// main.cpp
#include <QCoreApplication>
#include <private/qutfcodec_p.h>
#include <QTextCodec>
#include <QDebug>
bool isValidUtf8(const QByteArray& data) {
    return QUtf8::isValidUtf8(data.data(), data.size()).isValidUtf8;
}
int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    QTextCodec* utf8 = QTextCodec::codecForName("UTF-8");
    QTextCodec* cp1251 = QTextCodec::codecForName("CP1251");
    QByteArray utf8data1 = utf8->fromUnicode("Привет мир");
    QByteArray cp1251data1 = cp1251->fromUnicode("Привет мир");
    QByteArray utf8data2 = utf8->fromUnicode("Hello world");
    QByteArray cp1251data2 = cp1251->fromUnicode("Hello world");
    Q_ASSERT(isValidUtf8(utf8data1));
    Q_ASSERT(isValidUtf8(cp1251data1) == false);
    Q_ASSERT(isValidUtf8(utf8data2));
    Q_ASSERT(isValidUtf8(cp1251data2));
    qDebug() << "test passed";
    return 0;
}

Return Multiple Serial Numbers from Cisco Devices

I am parsing the output of the show version command for several pieces of information. Maybe there is an easier way, but I am trying to return all the serial numbers for devices in a stack. Currently I only get back the active switch's serial number. I also need to search in more than one place for the serial number: both Processor Board ID and System Serial Number.
I have tested the following Regex strings on https://regex101.com,
.*?^System\sSerial\sNumber\s
^System Serial Number\s([^,]+)
But in my code they do not seem to work. When I print my variable, it is empty for every iteration of the for loop.
#!/usr/bin/python
from getpass import getpass
import netmiko
import re
def make_connection (ip, username, password):
    return netmiko.ConnectHandler(device_type='cisco_ios', ip=ip,
                                  username=username, password=password)
def get_ip (input):
    return(re.findall(r'(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?).){3}'
                      r'(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)', input))
def get_ips (file_name):
    #with does all the cleanup and prework of file open for you
    with open(file_name, 'r') as in_file:
        for line in in_file:
            #this is probably supposed to be lineips = get_ip(line)
            #line = get_ip(line)
            lineips = get_ip(line)
            for ip in lineips:
                ips.append(ip)
def to_doc_a(file_name, varable):
    f=open(file_name, 'a')
    f.write(str(varable))
    f.write('\n')
    f.close()
def to_doc_w(file_name, varable):
    f=open(file_name, 'w', newline="\n")
    f.write(str(varable))
    f.close()
#This will be a list of the devices we want to SSH to
ips = []
#Pull the IPs.txt is a list of the IPs we want to connect to
#This function pulls those IPs out of the txt file and puts them into a
#list
get_ips('IPs.txt')
#list where informations will be stored
#devices = []
#Results string storage
strresults = ""
#Prompt user for account info
username = input("Username: ")
password = getpass()
file_name = "results.csv"
#Clearing all the old info out of the results.csv file
to_doc_w(file_name, "")
#Make a for loop to hit all the devices, for this we will be looking at
#the IOS it’s running
for ip in ips:
    #Connect to a device
    net_connect = make_connection(ip, username, password)
    #Run a command and set that to output
    output = net_connect.send_command('show version')
    #finding hostname in output using regular expressions
    regex_hostname = re.compile(r'(\S+)\suptime')
    hostname = regex_hostname.findall(output)
    #finding uptime in output using regular expressions
    regex_uptime = re.compile(r'\S+\suptime\sis\s(.+)')
    uptime = regex_uptime.findall(output)
    #finding version in output using regular expressions
    regex_version = re.compile(r'Cisco\sIOS\sSoftware.+Version\s([^,]+)')
    version = regex_version.findall(output)
    #finding serial in output using regular expressions
    regex_serial = re.compile(r'Processor\sboard\sID\s(\S+)')
    serial = regex_serial.findall(output)
    #finding serial in output using regular expressions
    regex_serial2 = re.compile(r'^System Serial Number\s([^,]+)')
    serial2 = regex_serial2.findall(output)
    print(serial2)
    #finding ios image in output using regular expressions
    #regex_ios = re.compile(r'System\s\image\s\file\sis\s"([^ "]+)')
    #ios = regex_ios.findall(output)
    #finding model in output using regular expressions
    regex_model = re.compile(r'[Cc]isco\s(\S+).*memory.')
    model = regex_model.findall(output)
    #append results to table [hostname,uptime,version,serial,ios,model]
    #devices.append([hostname[0], uptime[0], version[0], serial[0],
    #model[0]])
    results = (ip, hostname, version, serial, serial2, model)
    #Store results for later, reduce calls to append file, greatly
    #increase performance
    strresults = strresults + str(results) + "\n"
    #Next we will append the output to the results file
    #to_doc_a(file_name, results)
to_doc_w(file_name, strresults)
Regardless of the Cisco device, I would like this to pull the serial number and, if there are multiple devices in a stack, return the serial numbers of all of them. It should also return the IP, hostname, version of code, and model.
For the System Serial Number, your pattern ^System Serial Number\s([^,]+) has three problems: the ^ anchor asserts the start of the string (not of each line, unless re.MULTILINE is set), it spells Serial Number with uppercase letters, and it is missing the colon : after number.
You could update the pattern to use (\S+), a capturing group that matches one or more non-whitespace characters. Your pattern uses [^,]+ to match anything that is not a comma, but that also matches spaces and newlines.
System serial number:\s(\S+)
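A minimal sketch of how such a pattern behaves with findall on stacked output follows. The sample text and serial numbers are made up for illustration, and real show version output varies between platforms and IOS versions. The pattern is a slightly more tolerant variant of the one above (an assumption, not the answer's exact pattern): it allows padding around the colon and ignores case.

```python
import re

# Hypothetical excerpt of `show version` output from a two-member stack.
sample_output = """\
Processor board ID FOC0000AAAA
System serial number            : FOC0000AAAA
Switch 02
---------
System serial number            : FOC0000BBBB
"""

# \s*:\s* tolerates spaces around the colon; IGNORECASE accepts both
# "System serial number" and "System Serial Number".
serial_re = re.compile(r"System serial number\s*:\s*(\S+)", re.IGNORECASE)

# findall returns one captured serial per matching line, in order.
serials = serial_re.findall(sample_output)
print(serials)
```

Because the pattern has no ^ anchor, findall scans the whole output and returns one entry per stack member without needing re.MULTILINE.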

Escaping characters from unicode string

I am trying to store the string u'\U0001f381' in Azure SQL Server from Python 2.7.11 on Ubuntu 14.04 LTS. I have set the column type to nvarchar(MAX) so that it will accept Unicode strings.
The following is the Python script:
import pymssql
from creds import *
conn = pymssql.connect(host=HOST, user=USER, password=PASSWORD, database=DATABASE)
cursor = conn.cursor()
lst = [u'2017-07-04', u'\U0001f3e8', 1.0, 0.0, 0.0, 9.0]
print lst
placeholder = '%s,' * len(lst)
query = 'INSERT INTO Example_SearchAnalytics VALUES ( '+placeholder.rstrip(',')+ ')'
cursor.execute(query,tuple(lst))
conn.commit()
conn.close()
But I get the following error when I execute the above script in the Ubuntu environment.
pymssql.OperationalError: (105, "Unclosed quotation mark after the
character string ''.DB-Lib error message 20018, severity 15:\nGeneral
SQL Server error: Check messages from the SQL Server\nDB-Lib error
message 20018, severity 15:\nGeneral SQL Server error: Check messages
from the SQL Server\n")
I don't get any error when I execute the same script in a Windows environment. I think I need to escape some character in the Unicode string, but I am not sure which.
Please help.
Repeat quotation marks or use CHAR(39), as explained in the thread below:
Escaping single quote in SQL Server
Hope this helps.
Regards,
Alberto Morillo

Python MYSQL to Text File - Headings Required

I have some code that queries a MySQL database and sends the output to a text file.
The code below prints the first 7 columns of data and sends them to a text file called Test.
My question is: how do I also obtain the column headings from the database so that they appear in the text file?
I am using Python 2.7 with a MySQL database.
import MySQLdb
import sys
connection = MySQLdb.connect (host="localhost", user = "", passwd = "", db = "")
cursor = connection.cursor ()
cursor.execute ("select * from tablename")
data = cursor.fetchall ()
OutputFile = open("C:\Temp\Test.txt", "w")
for row in data :
    print>>OutputFile, row[0],row[1],row[2],row[3],row[4],row[5],row[6]
OutputFile.close()
cursor.close ()
connection.close ()
sys.exit()
The best way to get the column names is to query INFORMATION_SCHEMA:
SELECT `COLUMN_NAME`
FROM `INFORMATION_SCHEMA`.`COLUMNS`
WHERE `TABLE_SCHEMA`='yourdatabasename'
AND `TABLE_NAME`='yourtablename';
or to use MySQL's SHOW command:
SHOW columns FROM your-table;
This command is specific to MySQL.
After executing either query, call .fetchall() on the cursor to retrieve the column names.
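As an alternative to a second query, the Python DB-API (PEP 249) exposes the column names of the last SELECT through cursor.description, which MySQLdb also implements. The sketch below is written for Python 3 and uses the standard-library sqlite3 module as a stand-in so that it runs without a MySQL server. The table contents are made up for illustration.

```python
import sqlite3

# sqlite3 stands in for MySQLdb here; both follow the DB-API.
# The table and its rows are hypothetical.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE tablename (id INTEGER, name TEXT, city TEXT)")
cursor.execute("INSERT INTO tablename VALUES (1, 'Ann', 'Oslo')")

cursor.execute("SELECT * FROM tablename")
# The first item of each description entry is the column name.
headings = [column[0] for column in cursor.description]
rows = cursor.fetchall()

# Heading line first, then one line per data row.
lines = [" ".join(headings)]
lines += [" ".join(str(value) for value in row) for row in rows]
print("\n".join(lines))

connection.close()
```

With MySQLdb the same two lines (reading cursor.description after execute) should apply unchanged; only the connection setup and the file writing differ.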

Search for UTF encoded characters using QRegExp

I'm trying to check for characters §£¤ using QRegExp.
QString string = "§¤£";
int res = string.count(QRegExp("[§¤£]"));
But res is 0.
Edit your .pro file and set the following:
CODECFORSRC = UTF-8
CODECFORTR = UTF-8
Then add to your .cpp file:
QTextCodec::setCodecForCStrings(QTextCodec::codecForName("UTF-8"));
QTextCodec::setCodecForTr(QTextCodec::codecForName("UTF-8"));
That will give you UTF-8 support for your source and for internationalization, if you need it.