I am consuming an HTTP service from ABAP.
The service returns JSON with the following data:
{
  "statusCode": 200,
  "message": "éxito",
  "data": [
    {
      "_id": "584e9469df829275019c4a74",
      "nombre": "COCHAMÓ",
      "Útil": "Si",
      "email": "supervisor#demo.com",
      "Sms Número de teléfono": "981363931",
      "Llamar al teléfono": "26944444",
      "Radio de búsquedaPedido Público(Km) 1": 3,
      "Radio de búsquedaPedido Público(Km) 2": 3,
      "Radio de búsquedaPedido Público(Km) 3": 3,
      "Tiempo de Descarga masa(min)": 10,
      "Radio de búsquedaPedido Privado(Km)": 1,
      "Cola de Pedidos(n)": 6,
      "Tiempo de Esperapara Asignar pedidos(Sgds)": 45,
      "Hora de finalización": "21:00"
    }
  ]
}
The code:
CALL METHOD cl_http_client=>create_by_url
  EXPORTING
    url    = lv_url
  IMPORTING
    client = DATA(lcl_client)
  EXCEPTIONS
    argument_not_found = 1
    plugin_not_active  = 2
    internal_error     = 3
    OTHERS             = 4.
IF sy-subrc NE 0.
  RAISE urlexception.
ELSE.
  DATA(lcl_request) = lcl_client->request.
  lcl_request->set_method( if_http_request=>co_request_method_post ).
  lcl_request->set_form_field( name = parametro1 value = lv_mail ).
  lcl_request->set_form_field( name = parametro2 value = lv_password ).
  IF idcomuna IS NOT INITIAL.
    lv_comunasap = idcomuna.
    lcl_request->set_form_field( name = parametro3 value = lv_comunasap ).
  ENDIF.
  IF idcomunagc IS NOT INITIAL.
    lv_comunamongo = idcomunagc.
    lcl_request->set_form_field( name = parametro4 value = lv_comunamongo ).
  ENDIF.
  cl_http_utility=>set_request_uri( request = lcl_request
                                    uri     = lv_url ).
  CALL METHOD lcl_client->send
    EXCEPTIONS
      http_communication_failure = 1
      http_invalid_state         = 2
      http_processing_failed     = 3
      http_invalid_timeout       = 4
      OTHERS                     = 5.
  IF sy-subrc NE 0.
    RAISE sendexception.
  ELSE.
    CALL METHOD lcl_client->receive
      EXCEPTIONS
        http_communication_failure = 1
        http_invalid_state         = 2
        http_processing_failed     = 3
        OTHERS                     = 4.
    IF sy-subrc <> 0.
    ELSE.
      lcl_client->response->get_status( IMPORTING code = DATA(lv_code) reason = DATA(lv_reason) ).
      DATA(lv_respuesta) = lcl_client->response->get_cdata( ).
    ENDIF.
  ENDIF.
ENDIF.
The JSON should come with accents (the names are Spanish). Characters such as é and ú arrive as strange characters instead of accented letters.
How can I get the full JSON data, accents included, in my ABAP program?
Whether it's output on a display or in a variable, @Jagger is right: the response is returned in UTF-8. As you use GET_CDATA (get characters), I think that SAP takes the explicit charset given in the response header (Content-Type: text/json;charset=utf-8), and so it should be converted correctly. If it's not, then maybe the charset is missing from the header.
So, if it's not given, then do the conversion yourself, the same way as for any other UTF-8 payload:
First of all, use GET_DATA (not GET_CDATA) to read the body as a string of bytes, then convert it into a string of characters by using the method CONVERT_FROM( codepage = `utf-8` ) of the class CL_ABAP_CODEPAGE.
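A quick Python illustration of the symptom - the same UTF-8 bytes decoded with the right and the wrong codepage:
raw = "éxito".encode("utf-8")   # b'\xc3\xa9xito' - the bytes on the wire

print(raw.decode("utf-8"))      # éxito  (correct codepage)
print(raw.decode("latin-1"))    # Ã©xito (wrong codepage: the "strange characters")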
Suppose I only have the mint address of a single NFT created by a specific candy machine.
How can I use the mint address to ultimately get the candy machine ID? Is it even possible?
A fast way to get the CM id using an NFT is fetching the first transaction the NFT has (the oldest one) and checking the fifth instruction: the first account on this instruction is the Candy Machine used to create and mint the NFT.
For example, let's take this NFT, 3GXHJJd1DfEn1PVip87uUJXjeW1jDgeJb3B7a6xHWAeJ. In its oldest transaction, the first account on the 5th instruction is H2oYLkXdkX38eQ6VTqs26KAWAvEpYEiCtLt4knEUJxpu (note that this CM account is empty because they withdraw and close the account after mint).
You can do it using an explorer or with code using solana/web3.js.
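A rough Python sketch of the same lookup with solana-py (untested; assumes the dict-style RPC responses of older client versions and a mint with at most 1000 transactions):
from solana.publickey import PublicKey
from solana.rpc.api import Client

client = Client("https://api.mainnet-beta.solana.com")
mint = PublicKey("3GXHJJd1DfEn1PVip87uUJXjeW1jDgeJb3B7a6xHWAeJ")

# Signatures come newest-first, so the last entry is the oldest transaction
# (page backwards with before= if the mint has more than 1000 of them).
sigs = client.get_signatures_for_address(mint, limit=1000)["result"]
oldest = sigs[-1]["signature"]

tx = client.get_transaction(oldest)["result"]
msg = tx["transaction"]["message"]
fifth_ix = msg["instructions"][4]                            # 5th instruction (0-based index 4)
candy_machine = msg["accountKeys"][fifth_ix["accounts"][0]]  # its first account
print(candy_machine)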
As per the official documentation:
https://docs.metaplex.com/guides/mint-lists
The typical method to create the mint list is to use a tool that finds all NFTs with a specific creator in the first position of the creators array. If your NFTs were minted with a candy machine, this will be the candy machine creator id by default. If you have multiple candy machines that are part of the collection, you can create a separate mint list for each candy machine and combine them together to create a single mint list which you provide to the marketplace(s) you are listing with.
You get the creators from a mint address by fetching the metadata associated with it.
The metadata is encoded in a specific format, which you can use the Metaplex libraries to decode.
Here is a simple Python example: https://github.com/michaelhly/solana-py/issues/48#issuecomment-1073077165
import base58
import struct

def unpack_metadata_account(data):
assert(data[0] == 4)
i = 1
source_account = base58.b58encode(bytes(struct.unpack('<' + "B"*32, data[i:i+32])))
i += 32
mint_account = base58.b58encode(bytes(struct.unpack('<' + "B"*32, data[i:i+32])))
i += 32
name_len = struct.unpack('<I', data[i:i+4])[0]
i += 4
name = struct.unpack('<' + "B"*name_len, data[i:i+name_len])
i += name_len
symbol_len = struct.unpack('<I', data[i:i+4])[0]
i += 4
symbol = struct.unpack('<' + "B"*symbol_len, data[i:i+symbol_len])
i += symbol_len
uri_len = struct.unpack('<I', data[i:i+4])[0]
i += 4
uri = struct.unpack('<' + "B"*uri_len, data[i:i+uri_len])
i += uri_len
fee = struct.unpack('<h', data[i:i+2])[0]
i += 2
has_creator = data[i]
i += 1
creators = []
verified = []
share = []
if has_creator:
creator_len = struct.unpack('<I', data[i:i+4])[0]
i += 4
for _ in range(creator_len):
creator = base58.b58encode(bytes(struct.unpack('<' + "B"*32, data[i:i+32])))
creators.append(creator)
i += 32
verified.append(data[i])
i += 1
share.append(data[i])
i += 1
primary_sale_happened = bool(data[i])
i += 1
is_mutable = bool(data[i])
metadata = {
"update_authority": source_account,
"mint": mint_account,
"data": {
"name": bytes(name).decode("utf-8").strip("\x00"),
"symbol": bytes(symbol).decode("utf-8").strip("\x00"),
"uri": bytes(uri).decode("utf-8").strip("\x00"),
"seller_fee_basis_points": fee,
"creators": creators,
"verified": verified,
"share": share,
},
"primary_sale_happened": primary_sale_happened,
"is_mutable": is_mutable,
}
return metadata
and
import base64
from solana.publickey import PublicKey

METADATA_PROGRAM_ID = PublicKey('metaqbxxUerdq28cj1RbAWkYQm3ybzjb6a8bt518x1s')
def get_nft_pda(mint_key):
return(PublicKey.find_program_address([b'metadata', bytes(METADATA_PROGRAM_ID), bytes(PublicKey(mint_key))],METADATA_PROGRAM_ID)[0])
def get_metadata(mint_key):
data = base64.b64decode(solana_client.get_account_info(get_nft_pda(mint_key))['result']['value']['data'][0])
return(unpack_metadata_account(data))
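With a client pointed at an RPC node (the endpoint URL below is an assumption), the call is simply:
from solana.rpc.api import Client

solana_client = Client("https://api.mainnet-beta.solana.com")  # hypothetical endpoint
print(get_metadata("Aw4RhpcW5rod2Afhp7dhv2XrMZyNJpzVdYkjJ7kkYzpS"))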
An example getting Candy Machine ID using this method for the "Aw4RhpcW5rod2Afhp7dhv2XrMZyNJpzVdYkjJ7kkYzpS" mint address would result in:
"update_authority": "DGNZDSvy6emDXvBuCDRrpLVxcPaEcvKiStvvCivEJ38X",
"mint": "Aw4RhpcW5rod2Afhp7dhv2XrMZyNJpzVdYkjJ7kkYzpS",
"data": {
"name": "Shadowy Super Coder #5240",
"symbol": "SSC",
"uri": "https://shdw-drive.genesysgo.net/8yHTE5Cz3hwcTdghynB2jgLuvKyRgKEz2n5XvSiXQabG/5240.json",
"seller_fee_basis_points": 500,
"creators": [
"71ghWqucipW661X4ht61qvmc3xKQGMBGZxwSDmZrYQmf",
"D6wZ5U9onMC578mrKMp5PZtfyc5262426qKsYJW7nT3p"
],
"verified": [
1,
0
],
"share": [
0,
100
]
},
"primary_sale_happened": true,
"is_mutable": true
}
In this case, for the Collection SSC, the candy machine ID is 71ghWqucipW661X4ht61qvmc3xKQGMBGZxwSDmZrYQmf
I wish to turn word:word into word:
For example amazon:amazon becomes amazon:
This is doable by hand using the Replace Values function, but I need a way to do it programmatically.
Thanks!
You can try this Transform, but if it doesn't work, provide detail as to:
the nature of the failure
examples of data on which it doesn't work
any error messages and the line which returns the error
remDups = Table.TransformColumns(#"Changed Type",{"Column1", each
let
sep = ":",
splitList = Text.Split(_, " "),
sepString = List.FindText(splitList,sep){0},
sepStringPosition = List.PositionOf(splitList,sepString),
//if the two halves around the separator are the same, keep only the first plus the separator
splitSep = Text.Split(sepString, sep),
replString = if splitSep{0} = splitSep{1} then splitSep{0} & sep else sepString,
//put the string back together
replList = List.ReplaceRange(splitList,sepStringPosition,1,{replString})
in
Text.Combine(replList," ")
})
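For comparison, the same rule sketched in Python with a backreference regex (dedup_around_sep is just an illustrative name):
import re

def dedup_around_sep(text, sep=":"):
    # Collapse "word<sep>word" into "word<sep>" when both sides match exactly.
    pattern = r"\b(\w+)" + re.escape(sep) + r"\1\b"
    return re.sub(pattern, r"\1" + sep, text)

print(dedup_around_sep("amazon:amazon"))  # -> amazon: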
Good afternoon all,
I'm trying to call all of the results within an API that has:
6640 total records
100 records per page
67 pages of results (total records / records per page)
This is an ever growing list so I've used variables to create the above values.
I can obviously use the $Skip OData expression to get any one of the 67 pages by adding the expression to the end of the URL like so (which would skip the first 100 records, therefore returning the 2nd page):
https://psa.pulseway.com/api/servicedesk/tickets/?$Skip=100
What I'm trying to do though is to create a custom function that will loop through each of the 67 calls, changing the $Skip value by an increment of 100 each time.
I thought I'd accomplished the goal with the below code:
let
Token = "Token",
BaseURL = "https://psa.pulseway.com/api/",
Path = "servicedesk/tickets/",
RecordsPerPage = 100,
CountTickets = Json.Document(Web.Contents(BaseURL,[Headers = [Authorization="Bearer " & Token],RelativePath = Path & "count"])),
TotalRecords = CountTickets[TotalRecords],
GetJson = (Url) =>
let Options = [Headers=[ #"Authorization" = "Bearer " & Token ]],
RawData = Web.Contents(Url, Options),
Json = Json.Document(RawData)
in Json,
GetPage = (Index) =>
let Skip = "$Skip=" & Text.From(Index * RecordsPerPage),
URL = BaseURL & Path & "?" & Skip,
Json = GetJson(URL)
in Json,
TotalPages = Number.RoundUp(TotalRecords / RecordsPerPage),
PageIndicies = {0.. TotalPages - 1},
Pages = List.Transform(PageIndicies, each GetPage(_))
in
Pages
I got all happy when it successfully made the 67 API calls and combined the results into a list for me to load in to a Power Query table, however what I'm actually seeing is the first 100 records repeated 67 times.
That tells me that my GetPage custom function which handles the $Skip value isn't changing and is stuck on the first one. To make sure the Skip index was generating them properly I duplicated the query and changed the code to load in the $Skip values and see what they are, expecting them all to be $Skip=0, what I see though is the correct $Skip values as below:
[Image showing the correct $Skip values]
It seems everything is working as it should be, only I'm getting the first page 67 times.
I've made a couple of posts on other community sites about this issue before, but I realise the problem I was (poorly) describing was far too broad to get any meaningful assistance. I think now I've gotten to the point where I understand what my own code is doing and have really zoomed in on the problem - I just don't know how to fix it when I'm at the final hurdle...
Any help/advice would be massively appreciated. Thank you.
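For reference, here is the loop I'm trying to express, sketched in Python with the requests library (Token is a placeholder):
import requests

BASE_URL = "https://psa.pulseway.com/api/"
PATH = "servicedesk/tickets/"
TOKEN = "Token"             # placeholder
RECORDS_PER_PAGE = 100

def get_page(index):
    # Each call must send a different $Skip value, i.e. a genuinely new request.
    resp = requests.get(
        BASE_URL + PATH,
        headers={"Authorization": "Bearer " + TOKEN},
        params={"$Skip": index * RECORDS_PER_PAGE},
    )
    resp.raise_for_status()
    return resp.json()

pages = [get_page(i) for i in range(67)]  # 67 pages = 6640 records / 100 per page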
Edit: Updated following @RicardoDiaz's answer.
let
// Define base parameters
Filter = "",
Path = "servicedesk/tickets/",
URL = "https://psa.pulseway.com/api/",
Token = "Token",
Limit = 100,
// Build the table based on record start and any filters
GetEntityRaw = (Filter as any, RecordStart as text, Path as text) =>
let
Options = [Headers=[ #"Authorization" = "Bearer " & Token ]],
URLbase = URL & Path & "?bearer=" & Token & "&start=" & RecordStart & "&limit=" & Text.From(Limit),
URLentity = if Filter <> null then URLbase & Filter else URLbase,
Source = Json.Document(Web.Contents(URLentity, Options)),
Result = Source[Result],
toTable = Table.FromList(Result, Splitter.SplitByNothing(), null, null, ExtraValues.Error)
in
toTable,
// Recursively call the build table function; the recursion stops when the next call errors and try/otherwise falls back to the rows fetched so far
GetEntity = (optional RecordStart as text) as table =>
let
result = GetEntityRaw(Filter, RecordStart, Path),
nextStart = Text.From(Number.From(RecordStart) + Limit),
nextTable = Table.Combine({result, @GetEntity(nextStart)}),
check = try nextTable otherwise result
in
check,
resultTable = GetEntity("0")
in
resultTable
As I couldn't test your code, it's kind of hard to provide you a concrete answer.
That said, please review the generic code I use to connect to an API and see if you can find where yours is not working.
EDIT: Changed api_record_limit type to number (removed the quotation marks)
let
// Define base parameters
api_url_filter = "",
api_entity = "servicedesk/tickets/",
api_url = "https://psa.pulseway.com/api/",
api_token = "Token",
api_record_limit = 500,
// Build the table based on record start and any filters
fx_api_get_entity_raw = (api_url_filter as any, api_record_start as text, api_entity as text) =>
let
api_url_base = api_url & api_entity & "?api_token=" & api_token & "&start=" & api_record_start & "&limit=" & Text.From(api_record_limit),
api_url_entity = if api_url_filter <> null then api_url_base & api_url_filter else api_url_base,
Source = Json.Document(Web.Contents(api_url_entity)),
data = Source[data],
toTable = Table.FromList(data, Splitter.SplitByNothing(), null, null, ExtraValues.Error)
in
toTable,
// Recursively call the build table function; the recursion stops when the next call errors and try/otherwise falls back to the rows fetched so far
fxGetEntity = (optional api_record_start as text) as table =>
let
result = fx_api_get_entity_raw(api_url_filter, api_record_start, api_entity),
nextStart = Text.From(Number.From(api_record_start) + api_record_limit),
nextTable = Table.Combine({result, @fxGetEntity(nextStart)}),
check = try nextTable otherwise result
in
check,
resultTable = fxGetEntity("0"),
expandColumn = Table.ExpandRecordColumn(
resultTable,
"Column1",
Record.FieldNames(resultTable{0}[Column1]),
List.Transform(Record.FieldNames(resultTable{0}[Column1]), each _)
)
in
expandColumn
QUESTION TO OP:
Regarding this line:
Result = Source[Result],
Does the JSON return a field called Result instead of data?
I have a sentence with two markers <e1> and </e1>. I need the indices of the sequence of words between these two markers. Note that the , and other punctuation characters are counted as tokens!
sent="Hi please help me to <e1>solve, this problem please</e1> Thank you."
What I need (the desired output):
[5, 6, 7, 8, 9]
If you count each word from the beginning of the sentence, I need the index of the sequence between two markers:
solve -> 5
, -> 6
this -> 7
problem -> 8
please -> 9
I tried these two solutions:
Solution 1:
sent="Hi please help me to <e1>solve, this problem please</e1> Thank you."
E1 = re.search('<e1>(.*)</e1>', sent).group(1)
sent = sent.replace('<e1>', '')
sent = sent.replace('</e1>', '')
sent = word_tokenize(sent)
E1_indx = []
E1_lis = word_tokenize(E1)
print(E1_lis)
for item in E1_lis:
E1_indx.append(sent.index(item))
print(E1_indx)
But the output is:
[5, 6, 7, 8, 1]
Solution 2:
sent="Hi please help me to <e1>solve, this problem please</e1> Thank you."
e1_st = re.findall(r'<e1>\w+', sent)
e1_end = re.findall(r'\w+</e1>', sent)
e1_st=(''.join(str(x) for x in e1_st))
e1_end=(''.join(str(x) for x in e1_end))
e1_st = e1_st.replace('<e1>', '')
e1_end = e1_end.replace('</e1>', '')
sent = sent.replace('<e1>', '')
sent = sent.replace('</e1>', '')
sent = word_tokenize(sent)
print(list(range(sent.index(e1_st), sent.index(e1_end)+1)))
Output:
[]
The problem arises when a word of the sequence is repeated earlier in the sentence (here "please").
Is there any straightforward solution?
It looks like this question.
If you compute the offsets as follows and remove the markers, you should get the expected result:
sub_b = sent.find('<e1>')
sent = sent.replace('<e1>', '')
sub_e = sent.find('</e1>')
sent = sent.replace('</e1>', '')
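Putting it together, a minimal sketch (assuming NLTK's word_tokenize splits the prefix the same way in isolation as in the full sentence):
from nltk.tokenize import word_tokenize

sent = "Hi please help me to <e1>solve, this problem please</e1> Thank you."

sub_b = sent.find('<e1>')
sent = sent.replace('<e1>', '')
sub_e = sent.find('</e1>')
sent = sent.replace('</e1>', '')

# Tokens before the span give the starting index; tokens inside give its length.
before = len(word_tokenize(sent[:sub_b]))
inside = len(word_tokenize(sent[sub_b:sub_e]))
print(list(range(before, before + inside)))   # [5, 6, 7, 8, 9]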
I'm trying to read in a large file (~8GB) using pandas read_csv. In one of the columns in the data, there is sometimes a list which includes commas but is enclosed by curly brackets, e.g.
"label1","label2","label3","label4","label5"
"{A1}","2","","False","{ "apple" : false, "pear" : false, "banana" : null}"
Therefore, when these particular lines were read in, I was getting the error "Error tokenizing data. C error: Expected 37 fields in line 35, saw 42". I found this solution which said to add
sep=",(?![^{]*})" to the read_csv arguments, which split the data correctly. However, the data now includes the quotation marks around every entry (this didn't happen before I added the sep argument).
The data looks something like this now:
"label1" "label2" "label3" "label4" "label5"
"{A1}" "2" "" "False" "{ "apple" : false, "pear" : false, "banana" : null}"
meaning I can't use, for example, .describe(), etc on the numerical data because they're still strings.
Does anyone know of a way of reading it in without the quotation marks but still splitting the data where it is?
Very new to Python so apologies if there is an obvious solution.
serialdev found a solution to removing the "s, but the data columns are objects and not what I would expect/want, e.g. the integer values aren't seen as integers.
The data needs to be split at "," explicitly (including the "s) - is there a way of stating that in the read_csv arguments?
Thanks!
To read in the data structure you specified, where the last element has an unknown length:
"{A1}","2","","False","{ "apple" : false, "pear" : false, "banana" : null}"
"{A1}","2","","False","{ "apple" : false, "pear" : false, "banana" : null, "orange": "true"}"
Change the separator to a regular expression using a negative lookahead assertion. This will enable you to separate on a ',' only when not immediately followed by a space.
import pandas as pd

df = pd.read_csv('my_file.csv', sep=r'[,](?!\s)', engine='python', thousands='"')
print(df)
0 1 2 3 4
0 "{A1}" 2 NaN "False" "{ "apple" : false, "pear" : false, "banana" :...
1 "{A1}" 2 NaN "False" "{ "apple" : false, "pear" : false, "banana" :...
Specifying the thousands separator as the quote is a bit of a hacky way to parse fields containing a quoted integer into the correct datatype. You can achieve the same result using converters, which can also remove the quotes from the strings should you need it to, and cast "True" or "False" to a boolean.
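For example, a sketch with converters (the column keys below assume header=None; with a header row, key on the still-quoted column names instead):
import pandas as pd

df = pd.read_csv(
    'my_file.csv',
    sep=r'[,](?!\s)',
    engine='python',
    header=None,
    converters={
        1: lambda s: int(s.strip('"')) if s.strip('"') else None,  # quoted int -> int
        3: lambda s: s.strip('"') == 'True',                       # "True"/"False" -> bool
    },
)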
If you need to remove " from a column, use the vectorized function str.strip:
import pandas as pd
mydata = [{'"first_name"': '"Bill"', '"age"': '"7"'},
{'"first_name"': '"Bob"', '"age"': '"8"'},
{'"first_name"': '"Ben"', '"age"': '"9"'}]
df = pd.DataFrame(mydata)
print (df)
"age" "first_name"
0 "7" "Bill"
1 "8" "Bob"
2 "9" "Ben"
df['"first_name"'] = df['"first_name"'].str.strip('"')
print (df)
"age" "first_name"
0 "7" Bill
1 "8" Bob
2 "9" Ben
If you need to apply str.strip() to all columns, use:
df = pd.concat([df[col].str.strip('"') for col in df], axis=1)
df.columns = df.columns.str.strip('"')
print (df)
age first_name
0 7 Bill
1 8 Bob
2 9 Ben
Timings:
mydata = [{'"first_name"': '"Bill"', '"age"': '"7"'},
{'"first_name"': '"Bob"', '"age"': '"8"'},
{'"first_name"': '"Ben"', '"age"': '"9"'}]
df = pd.DataFrame(mydata)
df = pd.concat([df]*3, axis=1)
df.columns = ['"first_name1"','"age1"','"first_name2"','"age2"','"first_name3"','"age3"']
#create sample [300000 rows x 6 columns]
df = pd.concat([df]*100000).reset_index(drop=True)
df1,df2 = df.copy(),df.copy()
def a(df):
df.columns = df.columns.str.strip('"')
df['age1'] = df['age1'].str.strip('"')
df['first_name1'] = df['first_name1'].str.strip('"')
df['age2'] = df['age2'].str.strip('"')
df['first_name2'] = df['first_name2'].str.strip('"')
df['age3'] = df['age3'].str.strip('"')
df['first_name3'] = df['first_name3'].str.strip('"')
return df
def b(df):
#apply str function to all columns in dataframe
df = pd.concat([df[col].str.strip('"') for col in df], axis=1)
df.columns = df.columns.str.strip('"')
return df
def c(df):
#apply str function to all columns in dataframe
df = df.applymap(lambda x: x.lstrip('\"').rstrip('\"'))
df.columns = df.columns.str.strip('"')
return df
print (a(df))
print (b(df1))
print (c(df2))
In [135]: %timeit (a(df))
1 loop, best of 3: 635 ms per loop
In [136]: %timeit (b(df1))
1 loop, best of 3: 728 ms per loop
In [137]: %timeit (c(df2))
1 loop, best of 3: 1.21 s per loop
Would this work since you have all the data that you need:
.map(lambda x: x.lstrip('\"').rstrip('\"'))
So simply clean up all the occurrences of " afterwards.
EDIT with example:
mydata = [{'"first_name"' : '"bill', 'age': '"75"'},
{'"first_name"' : '"bob', 'age': '"7"'},
{'"first_name"' : '"ben', 'age': '"77"'}]
IN: df = pd.DataFrame(mydata)
OUT:
"first_name" age
0 "bill "75"
1 "bob "7"
2 "ben "77"
IN: df['"first_name"'] = df['"first_name"'].map(lambda x: x.lstrip('\"').rstrip('\"'))
OUT:
0 bill
1 bob
2 ben
Name: "first_name", dtype: object
Use this sequence after selecting the column; it is not ideal, but it will get the job done:
.map(lambda x: x.lstrip('\"').rstrip('\"'))
You can change the Dtypes after using this pattern:
df['col'].apply(lambda x: pd.to_numeric(x, errors='ignore'))
or simply:
df[['col2','col3']] = df[['col2','col3']].apply(pd.to_numeric)
It depends on your file. Did you check whether there are commas inside cells? If you have something like "Banana : Fruit, Tropical, Eatable, etc." in a single cell, you're going to get this kind of bug. One basic solution is removing all commas in the file. Or, if you can read it in, you can remove the special characters:
>>>df
Banana
0 Hello, Salut, Salom
1 Bonjour
>>>df['Banana'] = df['Banana'].str.replace(',','')
>>>df
Banana
0 Hello Salut Salom
1 Bonjour