Using the email.header module, I can do
the_text, the_charset = decode_header(inputText)[0]
to get the character set of the email header, where the inputText was retrieved by a command like
inputText = msg.get('From')
to use the From: header as an example.
In order to extract the header encoding for that header, do I have to do something like this?
the_header_encoding = email.charset.Charset(the_charset).header_encoding
That is, do I have to create an instance of the Charset class based on the name of the charset (and would that even work?), or is there a way to extract the header encoding more directly from the header itself?
An Encoded-Message header can consist of one or more lines, and each line can use a different encoding, or no encoding at all.
You'll have to parse the type of encoding out yourself, one per line. Using a regular expression:
import re

quopri_entry = re.compile(r'=\?[\w-]+\?(?P<encoding>[QB])\?[^?]+?\?=', flags=re.I)
encodings = {'Q': 'quoted-printable', 'B': 'base64'}

def encoded_message_codecs(header):
    used = []
    for line in header.splitlines():
        entry = quopri_entry.search(line)
        if not entry:
            used.append(None)
            continue
        used.append(encodings.get(entry.group('encoding').upper(), 'unknown'))
    return used
This returns a list with one entry per line: 'quoted-printable', 'base64' or 'unknown', or None if no Encoded-Message was used on that line.
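As a rough sketch of the same idea, assuming you also want the charset and that a single line may hold more than one encoded word, re.finditer can collect every match instead of only the first. The names ENCODED_WORD and encoded_words and the sample header are made up for illustration.

```python
import re

# One encoded word has the shape =?charset?encoding?text?=
ENCODED_WORD = re.compile(
    r'=\?(?P<charset>[^?\s]+)\?(?P<encoding>[QqBb])\?[^?\s]*\?='
)

def encoded_words(header):
    """Return a (charset, encoding) pair for every encoded word in header."""
    return [
        (match.group('charset').lower(), match.group('encoding').upper())
        for match in ENCODED_WORD.finditer(header)
    ]

# Hypothetical header mixing base64 and quoted-printable on one line.
sample = '=?utf-8?B?SsO8cmdlbg==?= =?iso-8859-1?Q?M=FCller?= <j@example.com>'
print(encoded_words(sample))
```

Note that email.charset.Charset(name).header_encoding reports which encoding the email package would choose when it writes a header in that charset, not which one an incoming header actually used, so it does not answer this particular question.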
We are using axios to send a GET request to our Django instance, which splits it into search terms and runs the search. This had been working fine until we ran into an edge case. We use urlencode to ensure that strings do not contain spaces or other such characters.
To generalize the issue: we have a TextField called "name" and we want to search for the term "A & B Company". However, the problem appears when the request reaches Django.
What we expected was that name=A%20&%20B%20Company&field=value would be parsed as name='A & B Company' and field='value'.
Instead, it is parsed as name='A ' 'B Company' and field='value'. The & symbol is incorrectly treated as a separator, despite being encoded.
Is there a way to tell Django that certain & symbols in a GET parameter are part of the value rather than separators between fields?
You can use the urllib library:
class ModelExample(models.Model):
    name = models.TextField()

# in view...
from urllib.parse import parse_qs
instance = ModelExample(name="name=A%20&%20B%20Company&field=value")
dict_qs = parse_qs(instance.name)
dict_qs contains a dict with the decoded query string.
You can find more information about urllib.parse here: https://docs.python.org/3/library/urllib.parse.html
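It may help to see what parse_qs does with the exact string from the question. The sketch below assumes the underlying issue is that the & inside the value is sent unescaped: a literal & always acts as a separator, and only %26 is read as part of the value. On the JavaScript side, encodeURIComponent produces %26 for &. The variable names are illustrative.

```python
from urllib.parse import parse_qs, urlencode

# The raw "&" splits the value; "%20B%20Company" has no "=" and is dropped.
raw = parse_qs("name=A%20&%20B%20Company&field=value")
print(raw)      # {'name': ['A '], 'field': ['value']}

# urlencode() escapes "&" as %26, so the value survives the round trip.
encoded = urlencode({"name": "A & B Company", "field": "value"})
print(encoded)  # name=A+%26+B+Company&field=value

decoded = parse_qs(encoded)
print(decoded)  # {'name': ['A & B Company'], 'field': ['value']}
```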
I'm trying to write my own Jekyll plugin to construct an API query from a custom tag. I've gotten as far as creating the basic plugin and tag, but I've run into the limits of my programming skills, so I'm looking to you for help.
Here's my custom tag for reference:
{% card "Arbor Elf | M13" %}
Here's the progress on my plugin:
module Jekyll
  class Scryfall < Liquid::Tag
    def initialize(tag_name, text, tokens)
      super
      @text = text
    end

    def render(context)
      # Store the name of the card, ie "Arbor Elf"
      @card_name =
      # Store the name of the set, ie "M13"
      @card_set =
      # Build the query
      @query = "https://api.scryfall.com/cards/named?exact=#{@card_name}&set=#{@card_set}"
      # Store a specific JSON property
      @card_art =
      # Finally we render out the result
      "<img src='#{@card_art}' title='#{@card_name}' />"
    end
  end
end

Liquid::Template.register_tag('cards', Jekyll::Scryfall)
For reference, here's an example query using the above details (paste it into your browser to see the response you get back)
https://api.scryfall.com/cards/named?exact=arbor+elf&set=m13
My initial attempt, after Googling around, was to use a regex to split @text at the |, like so:
@card_name = "#{@text}".split(/| */)
This didn't quite work, instead it output this:
[“A”, “r”, “b”, “o”, “r”, “ “, “E”, “l”, “f”, “ “, “|”, “ “, “M”, “1”, “3”, “ “]
I'm also then not sure how to access and store specific properties within the JSON response. Ideally, I can do something like this:
@card_art = JSONRESPONSE.image_uri.large
I'm well aware I'm asking a lot here, but I'd love to try and get this working and learn from it.
Thanks for reading.
Actually, your split should work; you just need to give it the correct regex (and you can call it on @text directly). You also need to escape the pipe character in the regex, because an unescaped pipe means alternation. You can use rubular.com to experiment with regexes.
parts = @text.split(/\|/)
# => ["Arbor Elf ", " M13"]
Note that they also contain some extra whitespace, which you can remove with strip.
@card_name = parts.first.strip
@card_set = parts.last.strip
This might also be a good time to answer questions like: what happens if the user inserts multiple pipes? What if they insert none? Will your code give them a helpful error message for this?
You'll also need to escape these values in your URL. What if one of your users adds a card containing an & character? Your URL will break:
https://api.scryfall.com/cards/named?exact=Sword of Dungeons & Dragons&set=und
That looks like a URL with three parameters, exact, set and Dragons. You need to encode the user input to be included in a URL:
require 'cgi'
query = "https://api.scryfall.com/cards/named?exact=#{CGI.escape(@card_name)}&set=#{CGI.escape(@card_set)}"
# => "https://api.scryfall.com/cards/named?exact=Sword+of+Dungeons+%26+Dragons&set=und"
What comes after that is a little less clear, because you haven't written the code yet. Try making the call with the Net::HTTP module and then parsing the response with the JSON module. If you have trouble, come back here and ask a new question.
I have a sequence with a string property that I append to the end of a URI string. If the string has special symbols like '?', '/', etc., they get URI-encoded and break the URI. For example:
api/res?param1=val1&param=val2
becomes
api/res?param1=val1%26param2%3Dval2
api/res?param1=val1 - the main part of the URI
&param=val2 - the uri.var.param part from the Parameter Mediator, which I added to the URI with a template like: uri-template="/api/res?param1=val1{uri.var.param}"
You can use legacy encoding for this purpose; the value is then appended without any changes.
For example, as follows:
uri-template="/api/res?param1=val1{+uri.var.param}"
Please note the + sign there.
Thanks
I have a URL encoded with URL encoding, namely: /filebrowser/?cd=bank/fran%E7ais/essais
The problem is that if I retrieve the argument through:
path = request.GET.get('relative_h', None)
I get :
/filebrowser/?cd=bank/fran�ais/essais
instead of:
/filebrowser/?cd=bank/français/essais
or :
/filebrowser/?cd=bank/fran%E7ais/essais
Yet, %E7 does correspond to 'ç', as you can see there.
And since the %E7 is decoded with the replacement character, I can't even use urllib.parse.unquote to get my 'ç' back...
Is there a way to get the raw argument or the correctly decoded string?
Switching the request encoding to latin-1 before accessing the parameter returned the correctly decoded string for me, when running your example locally.
request.encoding = 'latin-1'
path = request.GET.get('relative_h', None)
However, I'm not able to tell you why that would be, since I would have assumed that the default encoding of utf-8 would have handled that particular character.
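A likely explanation, sketched below with urllib.parse rather than Django itself: %E7 is the single Latin-1 byte for 'ç', while the UTF-8 encoding of 'ç' is the two-byte sequence %C3%A7. A lone 0xE7 is therefore not valid UTF-8, and decoding it with errors set to "replace" yields the replacement character. The variable names are illustrative.

```python
from urllib.parse import quote, unquote

raw = "bank/fran%E7ais/essais"

# unquote() defaults to encoding="utf-8", errors="replace":
# the lone 0xE7 byte is invalid UTF-8 and becomes U+FFFD.
as_utf8 = unquote(raw)

# Decoding the same byte as Latin-1 gives the intended character.
as_latin1 = unquote(raw, encoding="latin-1")

# A client that percent-encodes as UTF-8 would have sent %C3%A7 instead.
utf8_form = quote("bank/français/essais")

print(repr(as_utf8), as_latin1, utf8_form)
```

If that is what is happening, the lasting fix is on the side that builds the link, so that it percent-encodes the path as UTF-8.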
I am trying to use ConfigParser in the following situation. I am running some code, after which I have an object with several attributes. I pick out some of these attributes and write them to a .ini file with configparser. This works fine and my .ini file looks something like this:
[Section]
path = "D:\\"
number = 10.0
boolean = False
Then, with some other script, I want to read the file and add the items as attributes to another object (self) using:
parser.read('file.ini')
self.__dict__.update(dict(parser.items("Section")))
This fails because all values are read as strings by configparser and now all attributes are strings. parser.items("Section") looks like this:
[('path', '"D:\\"'), ('number', '10.0'), ('boolean', 'False')]
Now I could go and specify the floats, integers, and booleans by their keys and use the corresponding methods parser.getfloat, getint, and getboolean to get the right Python types out. However, that means making an extra list of keys and a loop to get them for each data type, which I don't want. Furthermore, even the path is now double-quoted and I need to start removing quotes.
This behavior makes ConfigParser almost completely worthless to me, and I doubt whether I am using it correctly and whether ConfigParser is the best option for my goal, which is simply to write object attributes to a file and at a later time read the file and add the parameters to a new object as attributes. Can I use ConfigParser for this effectively? Or should I start looking for a different method?
INI is not JSON. There are no data types. There are sections and key/value pairs. Stuff before the = is the key, stuff after it is the value. Everything is text.
There are no "strings", which means there is no need to double quote anything. The same goes for "escaped" backslashes. The concept of escaping does not exist in the INI file format.
So, first off, your file should be looking like this:
[Section]
path = D:\
number = 10.0
boolean = False
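A small sketch of reading that cleaned-up file, assuming the section and key names from the question. It loads the text with read_string so it runs without a file on disk, and uses the typed getters that configparser provides on a section.

```python
import configparser

# Same content as the file above; "\\" is a single backslash in the text.
INI_TEXT = "[Section]\npath = D:\\\nnumber = 10.0\nboolean = False\n"

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
section = parser["Section"]

path = section["path"]                # plain text, no quotes to strip
number = section.getfloat("number")   # text converted to float
flag = section.getboolean("boolean")  # understands true/false, yes/no, on/off, 1/0

print(repr(path), number, flag)
```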
Next, I consider this a dangerous operation
parser.read('file.ini')
self.__dict__.update(dict(parser.items("Section")))
because it potentially trashes your class instance with arbitrary values from a text file that might contain garbage, but when you can swear that the file will always be fine, well, do it if you must.
I would write a more explicit way of reading and validating config data. I sense your reason not to do that is laziness or a false sense of simplicity; both of these reasons are not very good.
If you want a semi-automated way of type conversion, tag the keys with a data type.
[Section]
s_path = D:\
f_number = 10.0
b_boolean = False
and write a function that knows how to handle them and throws when something is not right.
def type_convert(items):
    result = []
    for (key, value) in items:
        type_tag = key[:2]
        if type_tag == "s_":
            result.append((key[2:], value))
        elif type_tag == "f_":
            result.append((key[2:], float(value)))
        elif type_tag == "b_":
            # bool("False") is True, so compare the text explicitly
            if value.lower() not in ("true", "false"):
                raise ValueError('Invalid boolean "%s" found in ini file.' % value)
            result.append((key[2:], value.lower() == "true"))
        else:
            # alternatively: "everything else defaults to string"
            raise ValueError('Invalid type tag "%s" found in ini file.' % type_tag)
    return result
which you can use to make the operation somewhat more convenient:
self.__dict__.update(dict(type_convert(parser.items("Section"))))
Of course you still run the risk of trashing your class instance by overriding keys that should not be overridden.
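One way to limit that risk, sketched here under the assumption that the set of allowed keys is known up front, is to convert against an explicit schema and reject anything else. SCHEMA and load_settings are hypothetical names, not part of configparser; the getters they rely on are the standard ones on a section.

```python
import configparser

# Hypothetical schema: the only keys we accept, and the type each must have.
SCHEMA = {"path": str, "number": float, "boolean": bool}

def load_settings(parser, section_name, schema):
    """Return a dict of converted values, rejecting unknown or missing keys."""
    section = parser[section_name]
    getters = {
        str: section.get,
        int: section.getint,
        float: section.getfloat,
        bool: section.getboolean,
    }
    unknown = set(section) - set(schema)
    if unknown:
        raise ValueError("Unexpected keys in ini file: %s" % sorted(unknown))
    settings = {}
    for key, kind in schema.items():
        if key not in section:
            raise KeyError("Missing key in ini file: %s" % key)
        settings[key] = getters[kind](key)
    return settings

parser = configparser.ConfigParser()
parser.read_string("[Section]\npath = D:\\\nnumber = 10.0\nboolean = False\n")
settings = load_settings(parser, "Section", SCHEMA)
print(settings)
```

The returned dict holds only whitelisted keys, so passing it to self.__dict__.update cannot overwrite attributes that the schema does not name.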