mysqldump with wrong vowels - search and replace?

I have a mysqldump from a customer with wrong vowels.
It is a backup, and I cannot get a new one.
E.g. instead of ü there is ü, instead of ö there is ö.
To solve this, can I do a search and replace in Notepad? Or could a global search and replace damage tables other than tt_content or pages?

I solved this by exporting and importing with different charset configurations.
Just import your existing MySQL dump on your local development server and try exporting/importing as follows.
Create a new MySQL dump with settings like:
mysqldump --default-character-set=latin1 --skip-set-charset --skip-extended-insert --skip-add-drop-table --no-create-info -u [USERNAME] -p [DBNAME] > [MYSQLDUMPNAME].sql
Import the newly created MySQL dump with settings like:
mysql --default-character-set=utf8 -u [USERNAME] -p [DBNAME] < [MYSQLDUMPNAME].sql
You will need some tests to find the correct transformation (latin1, utf8).
If you have a mix of correct and incorrect characters in your MySQL dump, you will probably want to exclude such tables and import them separately, like:
mysqldump --default-character-set=latin1 --skip-set-charset --skip-extended-insert --add-drop-table --ignore-table=[DBNAME].[TABLENAME] -u [USERNAME] -p [DBNAME] > [MYSQLDUMPNAME].sql
Replace [USERNAME], [DBNAME], [TABLENAME], [MYSQLDUMPNAME] with your values.
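If you are not sure which tables contain broken characters (useful for the --ignore-table step above), a quick scan of the dump can help. A minimal Python sketch, assuming the dump is UTF-8-readable and that the marker list fits your data:
import re
import sys

# List tables whose INSERT statements contain typical mojibake markers.
# Assumption: the markers below cover your data; extend them as needed.
MARKERS = ("Ã¼", "Ã¶", "Ã¤", "ÃŸ")  # broken forms of ü, ö, ä, ß
suspects = set()
table = None
with open(sys.argv[1], encoding="utf-8", errors="replace") as dump:
    for line in dump:
        match = re.match(r"INSERT INTO `?(\w+)`?", line)
        if match:
            table = match.group(1)
        if table and any(marker in line for marker in MARKERS):
            suspects.add(table)
print("\n".join(sorted(suspects)))
Run it as: python scan_dump.py [MYSQLDUMPNAME].sql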

This is mostly caused by wrong encoding settings used when dumping the backup (such as communicating with the server in UTF-8 while the database is in cp-1252). If you can find out which settings were used to create it, you can correctly import it on your local machine with those same settings and then create a new dump with correct settings to fix it.
You can attempt to fix it with search and replace, but you will probably miss a lot of symbols, unless it is a really small dump and you can actually check it completely by hand afterwards.
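For reference: if the damage is the classic "UTF-8 bytes read as Latin-1" double encoding, it can also be reversed programmatically instead of by hand. A minimal Python sketch, assuming exactly that mix-up:
# Reverse classic UTF-8-read-as-Latin-1 mojibake:
# the text contains "ü" where "ü" belongs.
broken = "für"
fixed = broken.encode("latin-1").decode("utf-8")
print(fixed)  # -> für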

Look at the following TYPO3 Wiki entry. It describes several methods for converting the data to UTF-8:
https://wiki.typo3.org/UTF-8_support#Possibility_2

Is there a way to perform accent-insensitive lookups using Django and MariaDB?

I would like to make an accent-insensitive lookup in a french database (with words containing accents):
>>> User.objects.filter(first_name="Jeremy")
['<User: Jéremy>', '<User: Jérémy>', '<User: Jeremy>']
After lots of research, I found that Django has an Unaccent lookup for PostgreSQL, but nothing for other databases (like MariaDB).
Is there a way to make this happen without changing the database to PostgreSQL?
Finally got it solved:
So first and foremost: get acquainted with SQL collations! (Thanks to @MartinBurch.)
Collations set the searching rules so that SQL can match your given keywords.
So after lots of head-smashes, here is how I solved it:
1- Head to your my.cnf file.
(If you already know where your my.cnf is, you can skip to step 5.)
2- Open your terminal and type:
$ which mysqld
That should return "/usr/sbin/mysqld" or another path depending on your configuration.
3- Use the path returned in step 2 as follows:
$ /your/path/to/mysqld --verbose --help | grep -A 1 "Default options"
That should return something that ends like:
Default options are read from the following files in the given order:
/etc/mysql/my.cnf ~/.my.cnf /usr/etc/my.cnf
4- Find the file that has a [mysqld] section. If none of them has one, just add it.
Note: writing the section's name is very important; the server won't start otherwise.
5- Add the following lines at the end of the file and Save:
[mysqld]
character-set-server = latin1
collation-server = latin1_general_ci
6- Restart your MariaDB server (if you have a brew version):
brew services stop mariadb
brew services start mariadb
7- Head to your Django project and use '__icontains' to make case-insensitive and accent-insensitive queries:
>>> User.objects.filter(first_name__icontains="jeremy")
['<User: Jéremy>', '<User: Jérémy>', '<User: Jeremy>']
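If changing the server-wide collation is not an option, a per-query alternative is sketched below (assuming Django 3.2+, where Collate is available, and a MariaDB/MySQL accent-insensitive collation such as utf8mb4_general_ci):
from django.db.models.functions import Collate

# User is the model from the question; compare first_name under an
# accent-insensitive collation for this query only.
User.objects.annotate(
    first_name_ai=Collate("first_name", "utf8mb4_general_ci"),
).filter(first_name_ai__icontains="jeremy")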

load data into power BI from relative path

I am trying to find a solution for loading an external data file from a relative path, so that when someone else opens my PBIX it will still work on his/her computer.
many thanks.
Relative paths are *not* currently supported by Power BI.
To ease the pain, you can create a variable that contains the path where the files are located, and use that variable to determine the path of each table. That way, you only have to change a single place (that variable) and all the tables will automatically point to the new location.
Create a Blank Query, give it a name (e.g. dataFolderPath) and type in the path where your files are (e.g. C:\Users\augustoproiete\Desktop)
With the variable created, edit each of your tables in the Advanced Editor and concatenate your variable with the name of the file.
e.g. instead of "C:\Users\augustoproiete\Desktop\data.xlsx", change it to dataFolderPath & "\data.xlsx"
You can also vote/watch this feature request to be notified when it gets implemented:
Support relative path to excel/csv sources
You can also use the "Parameters" feature.
1. Create a new parameter like "PathExcelFiles"
2. Edit your "Source" entry
Done!
I don't think this is possible yet.
Please add your support for this idea so the Microsoft Power BI team will be more likely to add this as a new feature.
I couldn't bear the fact that there is no way to use relative paths, but finally I had to accept it...
So I tried to find a half-decent, acceptable workaround.
Using a Python script, it is at least possible to get access to the user's %HOME% directory.
let
    PySource = Python.Execute("from pathlib import Path#(lf)import pandas as pd#(lf)dataset = pd.DataFrame([[str(Path.home())]], columns = [1])"),
    homeDir = Text.Trim(Lines.ToText(PySource{[Name="dataset"]}[Value][1])),
    ...
The same should be possible with an R script, but I didn't try it.
Does anybody know a better solution for getting the %HOME% directory inside "Power" Query? I would be glad to have one.
Then I created two scripts inside my working directory. install.bat:
@ECHO OFF
if exist "%HOME%\.pbiTemplatePath\filepath.txt" GOTO :ERROR
:: These are the key commands
mkdir "%HOME%\.pbiTemplatePath"
echo|set /p="%cd%" > "%HOME%\.pbiTemplatePath\filepath.txt"
GOTO :END
:: Just a little message box
:ERROR
SET msgboxTitle=There is already another working directory installed.
SET /p msgboxBody=<"%HOME%\.pbiTemplatePath\filepath.txt"
SET tmpmsgbox=%temp%\~tmpmsgbox.vbs
IF EXIST "%tmpmsgbox%" DEL /F /Q "%tmpmsgbox%"
ECHO msgbox "%msgboxBody%",0,"%msgboxTitle%">"%tmpmsgbox%"
WSCRIPT "%tmpmsgbox%"
:END
and uninstall_all.bat:
@ECHO OFF
if exist "%HOME%\.pbiTemplatePath\filepath.txt" RMDIR /S /Q "%HOME%\.pbiTemplatePath\"
So in "Power" BI I did this:
let
    PySource = Python.Execute("from pathlib import Path#(lf)import pandas as pd#(lf)dataset = pd.DataFrame([[str(Path.home())]], columns = [1])"),
    homeDir = Text.Trim(Lines.ToText(PySource{[Name="dataset"]}[Value][1])),
    workingDirFile = Text.Combine({homeDir, ".pbiTemplatePath\filepath.txt"}, "\"),
    workingDir = Text.Trim(Lines.ToText(Csv.Document(File.Contents(workingDirFile),[Delimiter=";", Columns=1, QuoteStyle=QuoteStyle.None])[Column1])),
    ...
Now my git repository contains the "Power" BI template file, some config files telling the template where to load the data from, and the install/uninstall scripts. install.bat has to be executed once, and nobody has to copy and paste any path.
I'd be glad about any suggestion of improvement. It's not the solution Gotham deserves... Gotham deserves a better one.
As mentioned by a few people, you can use a dataset parameter and reference that in your script. What I haven't seen mentioned is that you can change these values using an API call:
https://learn.microsoft.com/en-us/rest/api/power-bi/datasets/update-parameters
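A minimal sketch of such a call with Python and the requests library (the dataset ID, token, and parameter value are placeholders you must supply; the parameter must already exist in the dataset):
import requests

dataset_id = "YOUR_DATASET_ID"   # placeholder
access_token = "YOUR_AAD_TOKEN"  # placeholder, acquired via Azure AD

resp = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/datasets/{dataset_id}/Default.UpdateParameters",
    headers={"Authorization": f"Bearer {access_token}"},
    json={"updateDetails": [
        {"name": "PathExcelFiles", "newValue": r"D:\new\data\folder"},
    ]},
)
resp.raise_for_status()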

Copy in Postgresql: Absolute Path Interpreted as Relative Path

I am running this statement in a Django app:
c = connections['default'].cursor()
query = "copy (select * from analysis.\"{0}\") to STDOUT DELIMITER ',' CSV HEADER;".format(view_name)
with open(csvFile, 'w') as f:
    c.copy_expert(query, f)
f.close()
It does not create the correct CSV file. Some of the values appear to be in the wrong columns. I am trying to test the SQL statement by running it in PostgreSQL:
copy (select * from analysis."S03_2005_activity_140807_153431_with_geom") to 'C:/djangoProjects/web_output/csvfiles/S03_2005_activity_140807_153431_with_geom.csv' DELIMITER ',' CSV HEADER;
It gives me: "ERROR: relative path not allowed for COPY to file". I have looked into the issue, and it typically comes down to one of two things: 1. confusing '\' and '/' (my slashes should be correct), or 2. the server being on a different computer. I thought the latter might be my issue, as the database is located on an external computer, but I have the connection set up in my PostgreSQL config. It also runs from Django, so I'm not sure why it isn't working from pgAdmin.
If you want to store data / get data from your local machine and communicate with a Postgres server on a different, remote machine, you cannot simply use COPY.
Try the meta-command \copy in psql. It's a wrapper for the SQL COPY command and uses local files.
Your filename should work as-is on a Windows machine, but Postgres interprets it as a local filename on the server, which is probably a Unix derivative. There, the filename would have to start with '/'.
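For example, the failing server-side COPY from the question could be run client-side in psql like this (one line; adjust names and paths to your setup):
\copy (select * from analysis."S03_2005_activity_140807_153431_with_geom") to 'C:/djangoProjects/web_output/csvfiles/S03_2005_activity_140807_153431_with_geom.csv' DELIMITER ',' CSV HEADER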

What is the format of the django config file for manage.py?

I'm hooking up selenose (Selenium) tests and using the live server in the process. It appears that I immediately run into problems with ports being in use, so I want to configure the live server to use more than one port. I see how to do that via the command line (--liveserver=localhost:8100-8110) but would like to use a config file.
I have one I'm using for nose already and thought I might be able to reuse it, but I can't find anything to support that belief, and my test runs say it won't work. I was expecting to be able to add something like the following:
[???]
liveserver=localhost:8100-8110
but replace the '???' with an actual header.
For some reason Django uses an environment variable for this. You can set it in your settings if you want:
import os
os.environ['DJANGO_LIVE_TEST_SERVER_ADDRESS'] = 'localhost:8000-9000'

Django dumpdata UTF-8 (Unicode)

Is there an easy way to dump UTF-8 data from a database?
I know this command:
manage.py dumpdata > mydata.json
But in the resulting file mydata.json, Unicode data looks like this:
"name": "\u4e1c\u6cf0\u9999\u6e2f\u4e94\u91d1\u6709\u9650\u516c\u53f8"
I would like to see a real Unicode string like 全球卫星定位系统 (Chinese).
After struggling with similar issues, I've just found that the XML formatter handles UTF-8 properly.
manage.py dumpdata --format=xml > output.xml
I had to transfer data from Django 0.96 to Django 1.3. After numerous tries with dumpdata/loaddata, I finally succeeded using XML. No side effects so far.
Hope this helps someone, as I landed on this thread while looking for a solution.
django-admin.py dumpdata yourapp can dump the data for that purpose.
Or if you use MySQL, you could use the mysqldump command to dump the whole database.
And this thread has many ways to dump data, including manual methods.
UPDATE: because OP edited the question.
To convert from JSON encoding string to human readable string you could use this:
open("mydata-new.json","wb").write(open("mydata.json").read().decode("unicode_escape").encode("utf8"))
This solution from @Julian Polard's post worked for me.
Basically just pass -Xutf8 to py or python when running this command:
python -Xutf8 manage.py dumpdata > data.json
Please upvote his answer as well if this worked for you ^_^
You need to either find the call to json.dump*() in the Django code and pass the additional option ensure_ascii=False and then encode the result after, or you need to use json.load*() to load the JSON and then dump it with that option.
Here I wrote a snippet for that.
Works for me!
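A minimal sketch of that load-and-re-dump approach (file names are just examples):
import json

# Re-serialize an existing dump without ASCII escaping
with open("mydata.json", encoding="utf-8") as f:
    data = json.load(f)
with open("mydata-new.json", "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False, indent=2)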
You can create your own serializer which passes ensure_ascii=False argument to json.dumps function:
# serializers/json_no_uescape.py
from django.core.serializers.json import *

class Serializer(Serializer):
    def _init_options(self):
        super(Serializer, self)._init_options()
        self.json_kwargs['ensure_ascii'] = False
Then register the new serializer (for example in your app's __init__.py file):
from django.core.serializers import register_serializer
register_serializer('json-no-uescape', 'serializers.json_no_uescape')
Then you can run:
manage.py dumpdata --format=json-no-uescape > output.json
As YOU provided a good answer that was accepted, it should be added that Python 3 distinguishes between text and binary data, so both files must be opened in binary mode:
open("mydata-new.json","wb").write(open("mydata.json", "rb").read().decode("unicode_escape").encode("utf8"))
Otherwise, the error AttributeError: 'str' object has no attribute 'decode' will be raised.
I usually add the following lines to my Makefile:
.PHONY: dump
# make APP=core MODEL=Schema dump
dump:
	@python manage.py dumpdata --indent=2 --natural-foreign --natural-primary ${APP}.${MODEL} | \
	python -c "import sys; sys.stdout.write(sys.stdin.read().encode().decode('unicode_escape'))" \
	> ${APP}/fixtures/${MODEL}.json
This works for a standard Django project structure; adjust it if your project structure is different.
This problem has been fixed for both JSON and YAML in Django 3.1.
Here's a new solution.
I just shared a repo on GitHub: django-dump-load-utf8.
However, I think this is a bug in Django, and I hope someone can merge my project into Django.
It's not a bad solution, but fixing the bug in Django itself would be better.
manage.py dumpdatautf8 --output data.json
manage.py loaddatautf8 data.json
import codecs
src = "/categories.json"
dst = "/categories-new.json"
source = codecs.open(src, 'r').read().decode('string-escape')
codecs.open(dst, "wb").write(source)
I encountered the same issue. After reading all the answers, I came up with a mix of Ali and darthwade's answers:
manage.py dumpdata app.category --indent=2 > categories.json
manage.py shell
import codecs
src = "/categories.json"
dst = "/categories-new.json"
source = codecs.open(src, "rb").read().decode('unicode-escape')
codecs.open(dst, "wb","utf-8").write(source)
In Python 3, I had to open the file in binary mode and decode as unicode-escape. Also I added utf-8 when I open in write (binary) mode.
I hope it helps :)
Here is the solution from djangoproject.com:
You go to Windows Settings; under "Language" - "Administrative Language Settings" - "Change system locale" in "Region Settings" there is a "Use Unicode UTF-8 for worldwide language support" box. If we apply that and reboot, we get a sensible, modern, default encoding from Python.
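To verify what Python actually picks up after the reboot, a quick check (the expected value is an assumption based on the UTF-8 option being enabled):
import locale

# With "Use Unicode UTF-8 for worldwide language support" enabled, this
# should report a UTF-8 code page (e.g. cp65001) instead of a legacy one.
print(locale.getpreferredencoding())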