How to use aws cli with cli-input-json - amazon-web-services

I was having trouble using --cli-input-json while following the AWS documentation (here), which seems simple enough but, as usual, omits a key detail it assumes everyone knows: the JSON file must be UTF-8 encoded. By default, generating the skeleton file through PowerShell and Visual Studio Code produced it in UTF-16 LE encoding.
All the CLI reported was a JSON formatting error.
Props go to Sudarpo Chong, whose comment on a similar post pointed the way, although that post doesn't treat this as the key issue.
Summary:
ensure the JSON file is UTF-8 encoded.
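As a quick fix, the file can also be re-encoded after the fact. A minimal sketch in Python (skeleton.json is a placeholder name; older Windows PowerShell writes a BOM with its UTF-16 LE redirected output, which the utf-16 codec detects):
# Re-encode a --generate-cli-skeleton file that was saved as UTF-16 LE
# (the redirection default in older Windows PowerShell) into UTF-8.
with open("skeleton.json", "r", encoding="utf-16") as src:
    body = src.read()
with open("skeleton.json", "w", encoding="utf-8") as dst:
    dst.write(body)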

Related

'TypeError at /api/chunked_upload/ Unicode-objects must be encoded before hashing' error when using botocore in Django project

I have hit a dead end with this problem. My code works perfectly in development but when I deploy my project and configure DigitalOcean Spaces & S3 bucket I get the following error when uploading media:
TypeError at /api/chunked_upload/
Unicode-objects must be encoded before hashing
I'm using django-chunked-upload and it doesn't play well with botocore
I'm using Python 3.7
My code is taken from this demo: https://github.com/juliomalegria/django-chunked-upload-demo
Any help will be massively appreciated
This library was implemented for Python 2, so there might be a couple of things that don't work out of the box with Python 3.
The issue you're facing is one of them, since files in Python 3 are read directly as Unicode (py3's str is py2's unicode). The md5 hashing is the part of the code triggering this exception (this line), because it doesn't expect Unicode strings.
If you have created your own model inheriting from AbstractChunkedUpload, you can override the md5 property to encode the chunks before updating the hash. See this other SO question on how to solve this specific issue.
Hopefully this helped!
Disclaimer: I'm the creator of this library. However, I haven't maintained it in a long time to the point that it might be no longer usable.
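For illustration, a minimal sketch of such an override, assuming the base class reads the upload from self.file and the original property caches the digest on self._md5 (names based on my reading of the library; verify them against your installed version):
import hashlib

from chunked_upload.models import AbstractChunkedUpload


class MyChunkedUpload(AbstractChunkedUpload):
    @property
    def md5(self):
        # Same logic as the library's property, but encode str chunks
        # to bytes first, since Python 3 file reads may yield Unicode.
        if getattr(self, '_md5', None) is None:
            digest = hashlib.md5()
            for chunk in self.file.chunks():
                if isinstance(chunk, str):
                    chunk = chunk.encode('utf-8')
                digest.update(chunk)
            self._md5 = digest.hexdigest()
        return self._md5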

Google Cloud Dataflow removing accents and special chars with '??'

This is going to be quite a hit-or-miss question, as I don't really know which context or piece of code to give you; it's one of those "it works locally" situations (and it does!).
The situation here is that I have several services, and there's a step where messages are put in a PubSub topic, waiting for the Dataflow consumer to handle them and save them as .parquet files (I also have another consumer which sends the payload to an HTTP endpoint).
The thing is, the message in that service prior to being sent to the PubSub topic seems to be correct; Stackdriver logs show all the chars as they should be.
However, when I check the final output in .parquet or at the HTTP endpoint, I see, for example, h?? instead of hí, which seems pretty weird, as running everything locally produces the correct output.
I can only think of a server-side encoding issue when deploying Dataflow as a job rather than running it locally.
Hope someone can shed some light on something this abstract.
The strange thing is that it works locally.
But as a workaround, the first thing that comes to mind is to handle the encoding explicitly.
Are you at some point using a function to convert your string input to bytes?
If yes, you could try forcing getBytes() to use UTF-8 encoding by passing the charset as an argument, as in the following example from this Stackoverflow thread:
byte[] bytes = string.getBytes("UTF-8");
// feed bytes to Base64
// get bytes from Base64
String string = new String(bytes, "UTF-8");
Also:
- Have you tried setting the parquet.enable.dictionary option?
- Are your original files written in utf-8 before conversion?
Google Cloud Dataflow (at least the Java SDK) replaces Spanish characters like 'ñ' or accented characters such as 'á', 'é', etc. with the symbol �, since the default charset of the JVM installed on the service workers is US-ASCII. So, if UTF-8 is not explicitly declared when you instantiate strings or their byte-array transformations, the platform default encoding will be used.
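The same principle, shown in Python for brevity: pass the charset explicitly on both sides of the string/bytes boundary instead of relying on whatever the platform default happens to be.
# Pass the charset at every string/bytes conversion; the Java
# equivalents are string.getBytes("UTF-8") and new String(bytes, "UTF-8").
message = "hí"
payload = message.encode("utf-8")    # explicit on the way in
restored = payload.decode("utf-8")   # and on the way out
assert restored == message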

Export CSV to open as utf-8 in Excel on MAC using Python 2.7

We have users who need to be able to export data containing UTF-8 characters to a CSV, which they open in Excel on Mac machines.
NOTE: We don't want our users to have to go to the Data tab, click Import from Text, then... We want them to be able to open the file immediately after downloading it and have it display the correct info.
At first, I thought this was just an encoding/decoding problem, since we are using Python 2.7 (we're actively working on upgrading to Python 3.6), but after that was fixed, I discovered Excel was the cause of the problem (the CSV works fine when opened in a text editor or even Numbers). The solution I am trying involves adding the UTF-8 BOM to the beginning of the file, as I read somewhere that this would let Excel know that it requires UTF-8.
# Here response is just a variable that works when used like this; we
# can export CSVs fine that don't need UTF-8
writer = csv.writer(response)
# note: this writes the literal text "0xEF0xBB0xBF" - one character per
# CSV column - rather than the three BOM bytes, which is part of why it fails
writer.writerow("0xEF0xBB0xBF")
I was hoping that just adding the UTF-8 BOM to the beginning of the CSV file like this would let Excel realize it needed to use UTF-8 encoding when opening the file, but alas it does not work. I am not sure if this is because Excel for Mac doesn't support this or because I simply added the BOM incorrectly.
Edit: I'm not sure why I didn't mention it, as it was critical to the solution, but we are using Django. I found this Stack Overflow post that gave the solution (which I've included below).
Because we are using Django, we were able to just include:
response.write('\xEF\xBB\xBF')
before creating a csv writer and adding the content to the csv.
Another idea that probably would have led to a solution is opening the file normally, adding the BOM, and then creating a csv writer (note: I did not test this idea, but if the above solution doesn't work for someone, or they aren't using Django, it is an idea to try).
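For anyone on Python 3 (the upgrade target mentioned above), the standard library exposes this directly: the utf-8-sig codec writes the BOM for you. A minimal sketch with made-up filename and fields:
import csv

# 'utf-8-sig' prepends the UTF-8 BOM (EF BB BF), which is what tells
# Excel to read the file as UTF-8 when it is double-clicked.
with open("export.csv", "w", newline="", encoding="utf-8-sig") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "city"])
    writer.writerow(["José", "Málaga"])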

While writing records in a flat file using Informatica ETL job, greek characters are coming as boxes

While writing records to a flat file using an Informatica ETL job, Greek characters come out as boxes. We can see the original characters in the database. At the session level, we are using UTF-8 encoding. We have a multi-language application and need to process Chinese, Russian, Greek, Polish, Japanese, etc. characters. Please suggest.
Try changing your code page encoding. I also faced this kind of issue; we were using ANSI encoding, so we created a separate integration service with a different encoding and the file ran successfully.
There is an easy option. In session properties, select the target flat file, then click Set File Properties. There you can change the code page and choose UTF-8. By default it is ANSI, which is why you are facing this issue.

Storing UTF-8 XML using Word's CustomXMLPart or any other supported way

I am writing a Word add-in which is supposed to store some of its own XML data per document using the Word object model and its CustomXMLPart. The problem I am now facing is the lack of IStream-like functionality for reading/writing XML to/from a CustomXMLPart: it only provides a BSTR interface, and I am puzzled about how to handle UTF-8 XML with BSTRs. To my understanding, a UTF-8 XML file should really never have to undergo this sort of Unicode conversion, and I am not sure what to expect as a result here.
Is there another way of using Word automation interfaces to store arbitrary custom information inside a DOCX file?
The "package" is an OPC document (Open Packaging Convention), which is basically a structured zip folder with a different extension (e.g. .pptx, .docx, .xps, etc.). You can get that file in stream and manipulate it any which way you like - but not artibitrarily. It will not be recognized as valid docx if you put things in the wrong places (not just xml elements, but also files in the folders inside the zip file). But if you're just talking "artibitrary" meaning CustomXMLPart, then that's okay.
This is a good kicker page to learn more about the Open XML SDK, which, if you're up to it, allows for somewhat easier access to the file formats than using (.NET) System.IO.Packaging or a third-party zip library. To go deeper, grab the free eBook Open XML Explained.
With the Open XML SDK (again, this can all be done without the SDK) in .NET, this is what you'll want to do: How to: Insert Custom XML to an Office Open XML Package by Using the Open XML API.