Doctrine: data too long for column (due to escaping) - doctrine-orm

I got the following error. The 'name' column is 255 characters long, but it seems that output escaping messes it up. What did I do wrong?
An exception occurred while executing 'INSERT INTO parts (deletedAt,
name, code, price, in_stock) VALUES (?, ?, ?, ?, ?)' with params
[null,
"\x50\x6f\x73\x74\x4e\x4c\x20\x50\x72\x69\x6f\x72\x69\x74\x79\x20\x50\x61\x6b\x6b\x65\x74\x20\x41\x61\x6e\x67\x65\x74\x65\x6b\x65\x6e\x64\x20\x42\x65\x6c\x67\x69\xeb",
"1101012", "12.00", 0]:
SQLSTATE[22001]: String data, right truncated: 1406 Data too long for
column 'name' at row 1 500 Internal Server Error - DriverException 2
linked Exceptions:
PDOException » PDOException »
foreach ($csv as $row) { # persist every part
    $part = new Part();
    $part->setCode($row[0]);
    $part->setName($row[1]);
    $part->setPrice(preg_replace("/[^0-9.]/", "", $row[2])); # extract decimal number
    $part->setInStock(($row[3]) ? 1 : 0);
    $em = $this->getDoctrine()->getManager();
    $em->persist($part);
    $count++; # get number of commits
}
$em->flush();
From my my.ini:
[client]
port = 3306
socket = /tmp/mysql.sock
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
no-auto-rehash
[mysqld]
port = 3306
collation-server = utf8mb4_unicode_ci
init-connect = 'SET NAMES utf8mb4'
character-set-server = utf8mb4
skip-character-set-client-handshake

So it turned out there were two problems.
One was that Doctrine did not respect the default collation set by the MySQL server, no matter which of the myriad options in config.yml etc. I tried. I finally ended up with the configuration below; I don't really understand why, but it works now.
charset: utf8mb4
default_table_options:
    charset: utf8mb4
    collate: utf8mb4_unicode_ci
options:
    charset: utf8mb4
    collate: utf8mb4_unicode_ci
    1002: "SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci'"
The second was that the source of the data I was trying to load into the database (a CSV file) was apparently not in UTF-8. Saving it as UTF-8 solved the problem. Now I wonder how I can make PHP convert the input into the desired encoding.
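In PHP the usual tool for that is mb_convert_encoding() (e.g. mb_convert_encoding($row[1], 'UTF-8', 'ISO-8859-1')) or iconv(). As a language-neutral sketch of the same idea, here is the conversion in Python; the assumption (suggested by the \xeb byte in the failed INSERT above, which is 'ë' in Latin-1) is that the CSV was saved as ISO-8859-1:

```python
# Re-encode text that was read as Latin-1 (ISO-8859-1) into UTF-8.
# 0xEB is 'ë' in Latin-1 -- the byte that broke the INSERT above.
raw = b"PostNL Priority Pakket Aangetekend Belgi\xeb"

text = raw.decode("latin-1")   # interpret the source bytes correctly
utf8 = text.encode("utf-8")    # re-encode as valid UTF-8

print(text)       # PostNL Priority Pakket Aangetekend België
print(utf8[-2:])  # b'\xc3\xab' -- the two-byte UTF-8 form of 'ë'
```

The same decode-then-encode shape applies per field when looping over the CSV rows.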

Related

How can I use code to export a SharePoint list to Excel

I found a previous question that looks to be what I'm looking for. However, when I run the code, I get a debug error (it highlights the last line, from "Set objMyList . . . . ("A1"))"). Below is the code I'm using, with the specific path & GUIDs. I tried adjusting the SharePoint address, but the one listed is the one that points to the library. I also tried just the home address (stopping at "TEP") and going all the way to including "All Items.aspx". I'm sure I'm missing something "simple", but thought I'd ask here.
Dim objMyList As ListObject
Dim objWksheet As Worksheet
Dim strSPServer As String

Const SERVER As String = "https://twdc.sharepoint.com/sites/WDPR-dclrecruiting/Test/TEP/Trip%20Event%20Planning%20Library"
Const LISTNAME As String = "{6B39FDF1-29AE-418C-9D99-92293FED5C81}"
Const VIEWNAME As String = "{CCFD1C7F-74CA-4921-A599-628C800C818A}"

strSPServer = "http://" & SERVER & "/_vti_bin"

Set objWksheet = Worksheets.Add
Set objMyList = objWksheet.ListObjects.Add(xlSrcExternal, _
    Array(strSPServer, LISTNAME, VIEWNAME), False, xlYes, Range("A1"))
The code below works for me locally:
Sub ExportList()
    Dim objMyList As ListObject
    Dim objWksheet As Worksheet
    Dim strSPServer As String

    Const SERVER As String = "sp/sites/team"
    Const LISTNAME As String = "{3e47ff9c-9aab-4a40-9d6a-c47e9b793484}" 'From source code
    Const VIEWNAME As String = "{67709eda-c975-4669-85e5-d95e263dadc6}" 'From source code

    ' The SharePoint server URL pointing to the SharePoint list to import into Excel.
    strSPServer = "http://" & SERVER & "/_vti_bin"

    Set objWksheet = Sheets("Sheet1")

    ' Add a list range to the worksheet
    ' and populate it with the data from the SharePoint list.
    Set objMyList = objWksheet.ListObjects.Add(xlSrcExternal, Array(strSPServer, LISTNAME, VIEWNAME), True, , Range("A1"))

    Set objMyList = Nothing
    Set objWksheet = Nothing
End Sub

spring mvc special characters

I'm trying to add a field with special characters like "Patrícia" or "José", and since that field is a name I used this regex pattern (tested on https://regexr.com/):
^[a-zA-Z\u00C0-\u00FF]+
This is the field inside the user model:
@Id
@GeneratedValue(strategy = GenerationType.AUTO)
@Column(name = "user_id")
private Integer id;

(etc)

@Column(name = "name")
@Pattern(regexp = "^[a-zA-Z\u00C0-\u00FF]+", message = "{user.name.pattern}")
@NotEmpty(message = "{user.name.empty}")
@Size(min = 1, max = 40, message = "{user.name.size}")
private String name;
(getters/setters/constructor)
If I add Patricia or Jose, for example, the validation succeeds and the user gets inserted. The regex seems OK in regExr using those names...
This is the error I get:
Hibernate: insert into user (f1, f2, f3, last_name, name, f6, f7, user_id) values (?, ?, ?, ?, ?, ?, ?, ?)
2018-05-22 17:40:34.622 WARN 8475 --- [nio-8080-exec-3] o.h.engine.jdbc.spi.SqlExceptionHelper : SQL Error: 1062, SQLState: 23000
2018-05-22 17:40:34.623 ERROR 8475 --- [nio-8080-exec-3] o.h.engine.jdbc.spi.SqlExceptionHelper : Duplicate entry '3' for key 'PRIMARY'
2018-05-22 17:40:34.624 INFO 8475 --- [nio-8080-exec-3] o.h.e.j.b.internal.AbstractBatchImpl : HHH000010: On release of batch it still contained JDBC statements
2018-05-22 17:40:34.626 ERROR 8475 --- [nio-8080-exec-3] o.h.i.ExceptionMapperStandardImpl : HHH000346: Error during managed flush [org.hibernate.exception.ConstraintViolationException: could not execute statement]
2018-05-22 17:40:34.664 ERROR 8475 --- [nio-8080-exec-3] o.a.c.c.C.[.[.[/].[dispatcherServlet] : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.springframework.dao.DataIntegrityViolationException: could not execute statement; SQL [n/a]; constraint [PRIMARY]; nested exception is org.hibernate.exception.ConstraintViolationException: could not execute statement] with root cause
com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '3' for key 'PRIMARY'
What does the primary key have to do with the name validation?
What could I be doing wrong?
Thank you,
Eunito.

Cross-platform mySQL inconsistencies with character sets

I have a script that creates a database, a couple of tables in it, and populates the data. This script was made from the SQLite database on Windows. One of the tables in that script is named "abcß" ("abc" + ALT+225).
Next I tried to load this script into MySQL through MySQL Workbench. Both the server and Workbench are running on Linux.
After fixing some syntax inconsistencies the database was successfully created. I tried to query the database and tables. All tables were queried successfully, except the one above.
Trying to query information_schema.tables.table_name, I get that name as "abc\0d-61\0d63", which gives a different result than the original name. Because of this my program crashes when I run it, since I send the table name to the codecvt_utf8 encoder.
The database and tables are created with the default encoding.
Does anybody know why I'm not seeing the proper results?
But most importantly: I presume the program is crashing because some of the characters are outside the wchar_t/UTF-8 encoding. So I'm curious, what should I use to convert that sequence to std::wstring?
TIA!
EDIT:
The code is as follows:
class MySQLDatabase
{
public:
    int LoadDatabaseData();
protected:
    struct MySQLImpl;
    MySQLImpl *m_pimpl;
};

struct MySQLDatabase::MySQLImpl
{
    std::wstring_convert<std::codecvt_utf8<wchar_t> > m_myconv;
};

int MySQLDatabase::LoadDatabaseData()
{
    const char *table_name;
    std::wstring tableName = m_pimpl->m_myconv.from_bytes( table_name );
}
EDIT2:
Do you think it will work if I add:
std::wstring_convert<std::codecvt_utf16<wchar_t> > m_myconv;
?
EDIT3:
Here is what I see in the Workbench:
# TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE, ENGINE, VERSION, ROW_FORMAT, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH, MAX_DATA_LENGTH, INDEX_LENGTH, DATA_FREE, AUTO_INCREMENT, CREATE_TIME, UPDATE_TIME, CHECK_TIME, TABLE_COLLATION, CHECKSUM, CREATE_OPTIONS, TABLE_COMMENT
'def', 'draft', 'abcÃ', 'BASE TABLE', 'InnoDB', '10', 'Compact', '0', '0', '16384', '0', '0', '0', NULL, '2016-12-09 00:15:27', NULL, NULL, 'utf8_general_ci', NULL, '', ''
Do not use utf-16 for anything.
Do not use "unicode".
Where the heck did \0d-61 come from?
Do not use any conversion subroutines; go back to the source and make sure it is encoded UTF-8.
To verify that you are using UTF-8: abcß is hex 61 62 63 C3 9F.
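That hex check can be confirmed mechanically; a minimal sketch in Python (chosen here purely for illustration):

```python
# 'abcß' in UTF-8: the ASCII letters are one byte each,
# ß (U+00DF) encodes to the two bytes C3 9F.
encoded = "abcß".encode("utf-8")
print(encoded.hex())  # 616263c39f
```

If the bytes coming back from the server don't match this sequence, the table name was not stored (or is not being returned) as UTF-8.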

How to deal with double data with two different systems (English and German) in google charts

I have my local system in English and my server is hosted on a German system. When I run my code for Google Charts it produces proper results on localhost. But when I run the code on the server (German system), my numeric values get German notation, and thus the chart fails to load.
For example:
var data = google.visualization.arrayToDataTable([
    ['Year', 'Enegry', { role: 'style' } ],
    ['2017', 15782.515, '#76A7FA'],
    ['2016', 0, '#76A7FA'],
    ['2015', 0, '#76A7FA'],
    ['2014', 0, '#76A7FA'],
    ['2013', 0, '#76A7FA']]);
This works fine since numbers have '.' as decimal separator.
Whereas here:
var data = google.visualization.arrayToDataTable([
    ['Year', 'Enegry', { role: 'style' } ],
    ['2017', 15782,515, '#76A7FA'],
    ['2016', 0, '#76A7FA'],
    ['2015', 0, '#76A7FA'],
    ['2014', 0, '#76A7FA'],
    ['2013', 0, '#76A7FA']]);
the numeric values have ',' as the decimal separator, so Google Charts fails to load with the error message: Row 0 has 4 columns, but must have 3.
(The values are stored in an ArrayList)
Any help on this would be appreciated.
You can specify a locale in the load statement:
google.charts.load('current', {'packages': ['corechart'], 'language': 'en'});
See Load the Libraries -- Locales.
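Alternatively, normalize the numbers server-side before emitting them into the JavaScript literal. A minimal sketch in Python, assuming the values arrive as German-formatted strings like the "15782,515" in the example above (the thousands-dot handling is an assumption about the input format):

```python
# Convert a German-formatted decimal string ("." thousands separator,
# "," decimal separator) into a plain float before writing it into
# the arrayToDataTable row.
def parse_german(value: str) -> float:
    return float(value.replace(".", "").replace(",", "."))

print(parse_german("15782,515"))   # 15782.515
print(parse_german("15.782,515"))  # 15782.515
```

Emitting the resulting float (which always prints with a '.') keeps the generated JavaScript locale-independent.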

Django query Unicode Issues

EDIT #2:
{'sql': 'SELECT "strains_terpene"."id", "strains_terpene"."name",
"strains_terpene"."short_desc", "strains_terpene"."long_desc",
"strains_terpene"."aroma", "strains_terpene"."flavor",
"strains_terpene"."effects" FROM "strains_terpene" WHERE
"strains_terpene"."name" = \'\xce±-Humulene\'', 'time': '0.000'}
Upon closer look, it appears that Django may be properly escaping the single quotes in the end. I had to take a different angle to see this, using:
from django.db import connections
connections['default'].queries
So now the question remains: why, even though Python 3, Django, and Postgres are all set to UTF-8, is the Unicode being encoded to the local encoding in the query?
Original Question:
Here is the runtime error:
strains.models.DoesNotExist: Terpene matching query does not exist.
Here is the output of str(Terpene.objects.filter(name='β-Caryophyllene').query):
SELECT "strains_terpene"."id", "strains_terpene"."name", "strains_terpene"."short_desc", "strains_terpene"."long_desc", "strains_terpene"."aroma", "strains_terpene"."flavor", "strains_terpene"."effects"
FROM "strains_terpene"
WHERE "strains_terpene"."name" = ß-Caryophyllene
Here is how postgres likes to see the query for it to work:
select * from strains_terpene where name = 'β-Caryophyllene'
Am I missing something here? Why is Django not wrapping my condition in single quotes?
The Postgres DB is encoded with UTF-8.
Python 3 strings are Unicode.
EDIT:
I noticed the query attribute is also converting the β to ß...
I thought this could be a conversion issue, considering I'm using the Windows cmd for the Python shell.
So I did:
with open('log2.txt', 'w', encoding='utf-8') as f:
    print(Terpene.objects.filter(name='β-Caryophyllene').query, file=f)
And here are the results, even when output directly to UTF-8 plain text.
SELECT "strains_terpene"."id", "strains_terpene"."name", "strains_terpene"."short_desc", "strains_terpene"."long_desc", "strains_terpene"."aroma", "strains_terpene"."flavor", "strains_terpene"."effects"
FROM "strains_terpene"
WHERE "strains_terpene"."name" = ß-Caryophyllene
So now I am confused on two fronts: why does Django choose to omit the single quotes for the WHERE condition, and why is the lowercase β being converted to ß?
EXTRA INFO:
Here is the section of actual code, importing mass results via CSV.
The results dict stores the mapping between columns and terpene names.
The first file, log.txt, verifies the contents of results.
The second, log1.txt, verifies the key before using it as the lookup condition.
Finally, log2.txt verifies the SQL being sent to the database.
First the code snippet:
results = {
    u'α-Pinene': row[7],
    u'β-Pinene': row[8],
    u'Terpinolene': row[9],
    u'Geraniol': row[10],
    u'α-Terpinene': row[11],
    u'γ-Terpinene': row[12],
    u'Camphene': row[13],
    u'Linalool': row[14],
    u'd-Limonene': row[15],
    u'Citral': row[16],
    u'Myrcene': row[17],
    u'α-Terpineol': row[18],
    u'Citronellol': row[19],
    u'dl-Menthol': row[20],
    u'1-Borneol': row[21],
    u'2-Piperidone': row[22],
    u'β-Caryophyllene': row[23],
    u'α-Humulene': row[24],
    u'Caryophyllene Oxide': row[25],
}
with open("log.txt", "w") as text_file:
    print(results.keys(), file=text_file)
for r, v in results.items():
    if '<' not in v:
        value = float(v.replace("%", ""))
        with open("log1.txt", "w") as text2:
            print(r, file=text2)
        with open("log2.txt", "w", encoding="utf-8") as text3:
            print(Terpene.objects.filter(name=r).query, file=text3)
        TerpeneResult.objects.create(
            terpene=Terpene.objects.get(name=r),
            qa_sample=sample,
            result=value,
        )
And log.txt -- results.keys():
dict_keys(['dl-Menthol', 'Geraniol', 'Camphene', '1-Borneol', 'Linalool',
'α-Humulene', 'Caryophyllene Oxide', 'β-Caryophyllene', 'Citronellol',
'α-Pinene', '2-Piperidone', 'β-Pinene', 'd-Limonene', 'γ-Terpinene',
'Terpinolene', 'α-Terpineol', 'Myrcene', 'α-Terpinene', 'Citral'])
log1.txt -- α-Humulene
Lastly the sql being generated -- log2.txt:
SELECT "strains_terpene"."id", "strains_terpene"."name", "strains_terpene"."short_desc", "strains_terpene"."long_desc", "strains_terpene"."aroma", "strains_terpene"."flavor", "strains_terpene"."effects"
FROM "strains_terpene"
WHERE "strains_terpene"."name" = α-Humulene
Note how the Unicode is lost at the last moment, when the SQL is generated.
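For what it's worth, the \xce± seen in the EDIT #2 query log is the classic signature of UTF-8 bytes being rendered as Latin-1 rather than data corruption; a minimal demonstration in Python (the β→ß rendering is plausibly the Windows console's best-fit mapping, which is an assumption here):

```python
# 'α' (U+03B1) encodes to the two UTF-8 bytes CE B1. Decoding those
# bytes as Latin-1 yields 'Î' (0xCE) followed by '±' (0xB1) --
# matching the '\xce±' prefix shown in the logged query.
name = 'α-Humulene'
mojibake = name.encode('utf-8').decode('latin-1')
print(mojibake)  # Î±-Humulene
```

So the underlying bytes sent to the database may still be correct UTF-8; it is the logging/display layer that is re-interpreting them.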