Cross-platform MySQL inconsistencies with character sets - C++

I have a script that creates a database, a couple of tables in it, and populates the data. The script was generated from an SQLite database on Windows. One of the tables in that script is named "abcß" ("abc" plus ALT+225).
Next I tried to load this script into MySQL through MySQL Workbench. Both the server and Workbench are running on Linux.
After fixing some syntax inconsistencies, the database was created successfully. I tried to query the database and tables. All tables could be queried successfully except the one above.
Querying information_schema.tables.table_name, I get that name back as "abc\0d-61\0d63", which differs from the original name. Because of this, my program crashes when I run it, since I pass the table name to a codecvt UTF-8 converter.
The database and tables are created with the default encoding.
Does anybody know why I'm not seeing the proper results?
But most importantly - I presume the program is crashing because some of the characters fall outside what the wchar_t/UTF-8 conversion can handle. So I'm curious - what should I use to convert that sequence to std::wstring?
TIA!
EDIT:
The code is as follows:
#include <codecvt>
#include <locale>
#include <string>

class MySQLDatabase
{
public:
    int LoadDatabaseData();
protected:
    struct MySQLImpl;
    MySQLImpl *m_pimpl;
};

struct MySQLDatabase::MySQLImpl
{
    std::wstring_convert<std::codecvt_utf8<wchar_t> > m_myconv;
};

int MySQLDatabase::LoadDatabaseData()
{
    const char *table_name;  // filled from the MySQL result set in the real code
    // from_bytes() throws std::range_error if table_name is not valid UTF-8,
    // which is where the crash happens
    std::wstring tableName = m_pimpl->m_myconv.from_bytes( table_name );
    return 0;
}
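For reference, std::wstring_convert can also be constructed with fallback strings; a converter built that way returns the wide fallback from from_bytes instead of throwing std::range_error on invalid input. A minimal sketch using the same codecvt_utf8 facet as above (the function name is just for illustration):
#include <codecvt>
#include <locale>
#include <string>

// Sketch: returns L"<invalid utf-8>" instead of throwing std::range_error
// when the input bytes are not valid UTF-8.
std::wstring ToWideOrFlag(const char *bytes)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t> > conv("<invalid utf-8>", L"<invalid utf-8>");
    return conv.from_bytes(bytes);
}
That at least keeps the program alive long enough to inspect what the server actually returned.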
EDIT2:
Do you think it will work if I add:
std::wstring_convert<std::codecvt_utf16<wchar_t> > m_myconv;
?
EDIT3:
Here is what I see in the Workbench:
# TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE, ENGINE, VERSION, ROW_FORMAT, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH, MAX_DATA_LENGTH, INDEX_LENGTH, DATA_FREE, AUTO_INCREMENT, CREATE_TIME, UPDATE_TIME, CHECK_TIME, TABLE_COLLATION, CHECKSUM, CREATE_OPTIONS, TABLE_COMMENT
'def', 'draft', 'abcÃ', 'BASE TABLE', 'InnoDB', '10', 'Compact', '0', '0', '16384', '0', '0', '0', NULL, '2016-12-09 00:15:27', NULL, NULL, 'utf8_general_ci', NULL, '', ''

Do not use utf-16 for anything.
Do not use "unicode".
Where the heck did \0d-61 come from?
Do not use any conversion subroutines; go back to the source and make sure it is encoded as UTF-8.
To verify that you are using UTF-8: "abcß" is hex 61 62 63 C3 9F.
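A minimal sketch (a hypothetical helper, not from the post) for doing that check from the C++ side, by dumping the raw bytes of the name the server returns:
#include <cstdio>

// Print each byte of a NUL-terminated string in hex; for a UTF-8 "abcß"
// this should print: 61 62 63 C3 9F
void DumpHex(const char *s)
{
    for (const unsigned char *p = reinterpret_cast<const unsigned char *>(s); *p; ++p)
        std::printf("%02X ", *p);
    std::printf("\n");
}
Calling DumpHex(table_name) just before the from_bytes call shows exactly which byte sequence MySQL hands back, and whether the problem is in the stored name or in the conversion.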

Related

Big Query job not found issue

I am trying to move some old code that used the older google-api-client gem to the Idiomatic Ruby client google-cloud-ruby.
The process is a simple query job that saves its results to another table. In the older gem, I used a config like this:
config = {
  "jobReference": {
    "projectId": GOOGLE_PROJECT,
    'location' => 'europe-west2'
  },
  'configuration' => {
    'query' => {
      'allowLargeResults' => true,
      'createDisposition' => 'CREATE_IF_NEEDED',
      'writeDisposition' => 'WRITE_TRUNCATE',
      'query' => sql,
      'destinationTable' => {
        'projectId' => GOOGLE_PROJECT,
        'datasetId' => 'my_dataset',
        'tableId' => table,
        'location' => 'europe-west2'
      }
    }
  },
}
Following the docs for the newer library, I am running this as a basic test (the sql is defined elsewhere):
bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset('my_dataset')
puts(dataset.location)
puts("1")
job = bigquery.query_job(sql, table: dataset.table(table), write: 'truncate', create: 'needed')
puts("2")
job.wait_until_done!
puts("3")
job.done?
This gets as far as the puts 2, failing on job.wait_until_done! with the error Google::Cloud::NotFoundError: notFound: Not found: Job my_project:job_hApg5h0NQQb4Xbv7Sr3zzIXm5RWF
If I puts the job.job_id, I see the same ID that it says it can't find. I've tried running this against datasets in both multi-region and single-region locations and still get the same error. Ultimately, I need this to run in the 'europe-west2' region only.
Can anyone help and/or point me to a working example? Thanks in advance!
As suggested by @Tlaquetzal, you can replace your SQL query with a simple SELECT 1, as in the sample query below, and see the results.
sql = "SELECT 1 FROM `project.dataset.table`"

Unparsable MOF Query When Trying to Register Event

Update 2
I accepted an answer and asked a different question elsewhere, where I am still trying to get to the bottom of this.
I don't think that one-lining this query is the answer, as I am still not getting the required results (and multi-line queries are allowed in .mof, as shown in the URLs in the comments to the answer) ...
Update
I rewrote the query as a one-liner as suggested, but still got the same error! As it was still talking about lines 11-19 I knew there must be another issue. After saving a new file with the change, I reran mofcomp and it appears to have loaded, but the event which I have subscribed to simply does not work.
I really feel that there is not enough documentation on this topic and it is hard to work out how I am meant to debug this - any help on this would be much appreciated, even if this means using a different more appropriate method.
I have the following .mof file, which I would like to use to register an event on my system:
#pragma namespace("\\\\.\\root\\subscription")
instance of __EventFilter as $EventFilter
{
    Name = "Event Filter Instance Name";
    Query = "Select * from __InstanceCreationEvent within 1 "
            "where targetInstance isa \"Cim_DirectoryContainsFile\" "
            "and targetInstance.GroupComponent = \"Win32_Directory.Name=\"c:\\\\test\"\"";
    QueryLanguage = "WQL";
    EventNamespace = "Root\\Cimv2";
};
instance of ActiveScriptEventConsumer as $Consumer
{
    Name = "TestConsumer";
    ScriptingEngine = "VBScript";
    ScriptText =
        "Set objFSO = CreateObject(\"Scripting.FileSystemObject\")\n"
        "Set objFile = objFSO.OpenTextFile(\"c:\\test\\Log.txt\", 8, True)\n"
        "objFile.WriteLine Time & \" \" & \" File Created\"\n"
        "objFile.Close\n";
    // Specify any other relevant properties.
};
instance of __FilterToConsumerBinding
{
    Filter = $EventFilter;
    Consumer = $Consumer;
};
But whenever I run the command mofcomp myfile.mof I am getting this error:
Parsing MOF file: myfile.mof
MOF file has been successfully parsed
Storing data in the repository...
An error occurred while processing item 1 defined on lines 11 - 19 in file myfile.mof:
Error Number: 0x80041058, Facility: WMI
Description: Unparsable query.
Compiler returned error 0x80041058
This error appears to be caused by incorrect syntax in the query, but I don't understand where I have gone wrong with this - is anyone able to advise?
There are no string concatenation or line continuation characters being used in building "Query". To keep it simple, you could put the entire query on one line.

Django query Unicode Issues

EDIT #2:
{'sql': 'SELECT "strains_terpene"."id", "strains_terpene"."name",
"strains_terpene"."short_desc", "strains_terpene"."long_desc",
"strains_terpene"."aroma", "strains_terpene"."flavor",
"strains_terpene"."effects" FROM "strains_terpene" WHERE
"strains_terpene"."name" = \'\xce±-Humulene\'', 'time': '0.000'}
On closer inspection, it appears that Django may be properly escaping the single quotes after all. I had to take a different angle to see this, using:
from django.db import connections
connections['default'].queries
So now the question remains: why, even though Python 3, Django, and Postgres are all set to UTF-8, is the Unicode being encoded to the local code page in the query?
Original Question:
Here is the runtime error:
strains.models.DoesNotExist: Terpene matching query does not exist.
Here is the str(Terpene.objects.filter(name='β-Caryophyllene').query):
SELECT "strains_terpene"."id", "strains_terpene"."name", "strains_terpene"."short_desc", "strains_terpene"."long_desc", "strains_terpene"."aroma", "strains_terpene"."flavor", "strains_terpene"."effects"
FROM "strains_terpene"
WHERE "strains_terpene"."name" = ß-Caryophyllene
Here is how postgres likes to see the query for it to work:
select * from strains_terpene where name = 'β-Caryophyllene'
Am I missing something here? Why is Django not wrapping my condition in single quotes?
PostgresDB is encoded with utf-8
Python 3 strings are unicode
EDIT:
I notice the query attribute is also converting the β to ß...
I thought this could be a conversion issue, considering I'm using the Windows cmd console for the Python shell.
So I did:
with open('log2.txt', 'w', encoding='utf-8') as f:
    print(Terpene.objects.filter(name='β-Caryophyllene').query, file=f)
And here are the results even when output directly to utf-8 plain text.
SELECT "strains_terpene"."id", "strains_terpene"."name", "strains_terpene"."short_desc", "strains_terpene"."long_desc", "strains_terpene"."aroma", "strains_terpene"."flavor", "strains_terpene"."effects"
FROM "strains_terpene"
WHERE "strains_terpene"."name" = ß-Caryophyllene
So now I am confused on two fronts: why does Django choose to omit the single quotes for the WHERE condition, and why is the lowercase beta (β) being turned into ß?
EXTRA INFO:
Here is the section of actual code.
Importing mass results via CSV.
The results dict stores the mapping between columns and Terpene Names
The first log.txt is for verifying the contents of results
The second log1.txt is to verify the key before using it as the lookup condition
Finally, log2.txt verifies the SQL being sent to the database
First the Code Snippet:
results = {
    u'α-Pinene': row[7],
    u'β-Pinene': row[8],
    u'Terpinolene': row[9],
    u'Geraniol': row[10],
    u'α-Terpinene': row[11],
    u'γ-Terpinene': row[12],
    u'Camphene': row[13],
    u'Linalool': row[14],
    u'd-Limonene': row[15],
    u'Citral': row[16],
    u'Myrcene': row[17],
    u'α-Terpineol': row[18],
    u'Citronellol': row[19],
    u'dl-Menthol': row[20],
    u'1-Borneol': row[21],
    u'2-Piperidone': row[22],
    u'β-Caryophyllene': row[23],
    u'α-Humulene': row[24],
    u'Caryophyllene Oxide': row[25],
}
with open("log.txt", "w") as text_file:
    print(results.keys(), file=text_file)
for r, v in results.items():
    if '<' not in v:
        value = float(v.replace("%", ""))
        with open("log1.txt", "w") as text2:
            print(r, file=text2)
        with open("log2.txt", "w", encoding="utf-8") as text3:
            print(Terpene.objects.filter(name=r).query, file=text3)
        TerpeneResult.objects.create(
            terpene=Terpene.objects.get(name=r),
            qa_sample=sample,
            result=value,
        )
And log.txt -- results.keys():
dict_keys(['dl-Menthol', 'Geraniol', 'Camphene', '1-Borneol', 'Linalool',
'α-Humulene', 'Caryophyllene Oxide', 'β-Caryophyllene', 'Citronellol',
'α-Pinene', '2-Piperidone', 'β-Pinene', 'd-Limonene', 'γ-Terpinene',
'Terpinolene', 'α-Terpineol', 'Myrcene', 'α-Terpinene', 'Citral'])
log1.txt -- α-Humulene
Lastly the sql being generated -- log2.txt:
SELECT "strains_terpene"."id", "strains_terpene"."name", "strains_terpene"."short_desc", "strains_terpene"."long_desc", "strains_terpene"."aroma", "strains_terpene"."flavor", "strains_terpene"."effects"
FROM "strains_terpene"
WHERE "strains_terpene"."name" = α-Humulene
Note the Unicode being lost at the last moment, when the SQL is generated.

Search Informatica for text in SQL override

Is there a way to search all the mappings, sessions, etc. in Informatica for a text string contained within a SQL override?
For example, suppose I know a certain stored procedure (SP_FOO) is being called somewhere in an INFA process, but I don't know exactly where. Somewhere, I think, there is a Post SQL on a source or target calling it. Could I search all the sessions for Post SQL containing SP_FOO? (Similar to what I could do with grep on source code.)
You can use repository queries to query the repository tables (if you have enough access) to get data related to all the mappings, transformations, sessions, etc.
Use the link below to find almost all kinds of repository queries; your answer can be found there.
https://uisapp2.iu.edu/confluence-prd/display/EDW/Querying+PowerCenter+data
select *--distinct sbj.SUBJECT_AREA,m.PARENT_MAPPING_NAME
from REP_SUBJECT sbj,REP_ALL_MAPPINGS m,REP_WIDGET_INST w,REP_WIDGET_ATTR wa
where sbj.SUBJECT_ID = m.SUBJECT_ID AND
m.MAPPING_ID = w.MAPPING_ID AND
w.WIDGET_ID = wa.WIDGET_ID
and sbj.SUBJECT_AREA in ('TLR','PPM_PNLST_WEB','PPM_CURRENCY','OLA','ODS','MMS','IT_METRIC','E_CONSENT','EDW','EDD','EDC','ABS')
and (UPPER(ATTR_VALUE) like '%PSA_CONTACT_EVENT%'
-- or UPPER(ATTR_VALUE) like '%PSA_MEMBER_CHARACTERISTIC%'
-- or UPPER(ATTR_VALUE) like '%PSA_REPORTING_HH_CHRSTC%'
-- or UPPER(ATTR_VALUE) like '%PSA_REPORTING_MEMBER_CHRSTC%'
)
--and m.PARENT_MAPPING_NAME like '%ARM%'
order by 1
Please let me know if you have any issues.
Another less scientific way to do this is to export the workflow(s) as XML and use a text editor to search through them for the stored procedure name.
If you have read access to the schema where the informatica repository resides, try this.
SELECT DISTINCT f.subj_name folder, e.mapping_name, object_type_name,
b.instance_name, a.attr_value
FROM opb_widget_attr a,
opb_widget_inst b,
opb_object_type c,
opb_attr d,
opb_mapping e,
opb_subject f
WHERE a.widget_id = b.widget_id
AND b.widget_type = c.object_type_id
AND ( object_type_name = 'Source Qualifier'
OR object_type_name LIKE '%Lookup%'
)
AND a.widget_id = b.widget_id
AND a.attr_id = d.attr_id
AND c.object_type_id = d.object_type_id
AND attr_name IN ('Sql Query')--, 'Lookup Sql Override')
AND b.mapping_id = e.mapping_id
AND e.subject_id = f.subj_id
AND a.attr_value is not null
--AND UPPER (a.attr_value) LIKE UPPER ('%currency%')
Yes. There is a small Java-based tool called Informatica Meta Query.
Using that tool, you can search for any information that is present in the Informatica metadata tables.
If you cannot find that tool, you can write queries directly against the Informatica metadata tables to get the required information.
Adding a few more lines to the solutions provided by Data Origin and Sandeep.
It is highly advised not to query the repository tables directly. Rather, you can create synonyms or views and then query those objects to avoid any damage to the repository tables.
In our dev/prod environments, application programmers are not granted any direct access to the repository tables.
As querying the Informatica database isn't the best idea, I would suggest exporting all the workflows in your folder to XML using Repository Manager. From Repository Manager you can select all of them and export them at once. Then write a Java program to search the XMLs for the pattern.
I have written a sample program here; please modify it as per your requirements:
First, make a spec file with the workflow names (specFileName).
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class WorkflowSearch   // wrapper class so the sketch compiles
{
    public static void main(String[] args)
    {
        String specFileName = "specfile.txt";  // the spec file with the workflow names
        String textToSearch = "<YourString>";  // e.g. the stored procedure name
        try {
            File inFile = new File(specFileName);
            BufferedReader reader = new BufferedReader(new FileReader(inFile));
            String currentLine;
            while ((currentLine = reader.readLine()) != null)
            {
                // trim the line before comparing with the search string
                String trimmedLine = currentLine.trim();
                if (trimmedLine.contains(textToSearch))
                {
                    System.out.println(specFileName); // spec file name
                }
            }
            reader.close();
        }
        catch (IOException ex)
        {
            System.out.println("Error reading file '" + specFileName + "'");
        }
    }
}

How Do You Call an MSSQL System Function From ADO/C++?

...specifically, the fn_listextendedproperty system function in MSSQL 2005.
I have added an Extended Property to my database object, named 'schemaVersion'. In my MSVC application, using ADO, I need to determine if that Extended Property exists and, if it does, return the string value out of it.
Here is the T-SQL code that does what I want. How do I write this in C++/ADO, or otherwise get the job done?
select value as schemaVer
from fn_listextendedproperty(default, default, default, default, default, default, default)
where name=N'schemaVersion'
Here's the code I tried at first. It failed with the error listed below the code:
_CommandPtr cmd;
cmd.CreateInstance(__uuidof(Command));
cmd->ActiveConnection = cnn;
cmd->PutCommandText("select value "
"from fn_listextendedproperty(default, default, default, default, default, default, default) "
"where name=N'schemaVersion'");
VARIANT varCount;
cmd->Execute(NULL, NULL, adCmdText);
...here are the errors I peeled out of the ADO errors collection. The output is from my little utility function which adds the extra text like the thread ID etc, so ignore that.
(Proc:0x1930, Thread:0x8A0) INFO : === 1 Provider Error Messages : =======================
(Proc:0x1930, Thread:0x8A0) INFO : [ 1] (-2147217900) 'Incorrect syntax near the keyword 'default'.'
(Proc:0x1930, Thread:0x8A0) INFO : (SQLState = '42000')
(Proc:0x1930, Thread:0x8A0) INFO : (Source = 'Microsoft OLE DB Provider for SQL Server')
(Proc:0x1930, Thread:0x8A0) INFO : (NativeError = 156)
(Proc:0x1930, Thread:0x8A0) INFO : ==========================================================
EDIT: Updated the call according to suggestions. Also changed "SELECT value AS schemaVer" to just "SELECT value".
EDIT: Changed the first parameter of Execute() to NULL per suggestion. This fixed my original problem, and I proceeded to the next. :)
Try specifying NULL rather than default for each parameter of fn_listextendedproperty. This should hopefully then execute without errors, just leaving you to retrieve the result as your next step.
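For reference, a minimal sketch of what that call might look like from ADO/C++, assuming the same setup as the OP's snippets (an #import of msado15.dll with rename("EOF", "adoEOF"), an initialized COM apartment, and an open _ConnectionPtr named cnn); the only change to the command text is NULL in place of default:
_CommandPtr cmd;
cmd.CreateInstance(__uuidof(Command));
cmd->ActiveConnection = cnn;
cmd->PutCommandText("select value "
    "from fn_listextendedproperty(NULL, NULL, NULL, NULL, NULL, NULL, NULL) "
    "where name=N'schemaVersion'");
_RecordsetPtr rs = cmd->Execute(NULL, NULL, adCmdText);
if (!rs->adoEOF)
{
    // value comes back as sql_variant; go through _bstr_t to get a narrow string
    _bstr_t ver(rs->Fields->GetItem(L"value")->GetValue());
    std::string schemaVer = (const char*)ver;
}
This is only a sketch under those assumptions; reading the recordset is the "retrieve the result" step the answer mentions, and the field access shown here is one way to do it.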
I still have not figured out how to do this directly. To get on with my life, I wrote a stored procedure which called the function:
set ANSI_NULLS ON
set QUOTED_IDENTIFIER ON
go
ALTER PROCEDURE [dbo].[mh_getSchemaVersion]
    @schemaVer VARCHAR(256) OUTPUT
AS
    select @schemaVer = CAST( (select value from fn_listextendedproperty(default, default, default, default, default, default, default) where name=N'schemaVersion') AS varchar(256) )
    return @@ROWCOUNT
...and then called that stored procedure from my ADO/C++ code:
_CommandPtr cmd;
cmd.CreateInstance(__uuidof(Command));
cmd->ActiveConnection = cnn;
cmd->PutCommandText("mh_getSchemaVersion");
_variant_t schemaVar;
_ParameterPtr schemaVarParam = cmd->CreateParameter("@schemaVer", adVarChar, adParamOutput, 256);
cmd->GetParameters()->Append((IDispatch*)schemaVarParam);
cmd->Execute(NULL, NULL, adCmdStoredProc);
std::string v = (const char*)(_bstr_t)schemaVarParam->GetValue();
ver->hasVersion_ = true;
...which works, but I didn't want to have to deploy a new stored procedure.
So if anyone can come up with a solution to the original problem and show me how to call the system function directly from ADO/C++, I will accept that as the answer. Otherwise I'll just accept this.