encodeForJavaScript() with JSON.parse, doublequote woes - coldfusion

In CF (9.0.2 with esapi-2.0_rc10.jar):
<cfset test = ['ha"ha"']>
<script>
x = JSON.parse('#encodeForJavaScript(serializeJSON(test))#');
y = JSON.parse('#replace(serializeJSON(test), '"', '\"', "all")#');
z = #serializeJSON(test)#;
j = JSON.parse('#jsStringFormat(serializeJSON(test))#');
</script>
Output:
<script>
x = JSON.parse('\x5B\x22ha\x22ha\x22\x22\x5D');
y = JSON.parse('[\"ha\\"ha\\"\"]');
z = ["ha\"ha\""];
j = JSON.parse('[\"ha\\\"ha\\\"\"]');
</script>
y, z and j are valid.
x actually fails: "Uncaught SyntaxError: Unexpected token h "
I thought encodeForJavaScript() in ESAPI was supposed to be the best and safest function to use in a situation like this. Why does it fail here?
Side question: if I'm only using serializeJSON(), even when the data is built dynamically from user input, does that mean I don't really need JSON.parse, since the JSON string is guaranteed to contain no functions?

If you use encodeForJavascript on a JSON string, then it is no longer valid JSON.

Quote from JSON.org:
A number is very much like a C or Java number, except that the octal and hexadecimal formats are not used.
The same holds for strings in JSON: the string grammar shown on json.org allows only the escapes \", \\, \/, \b, \f, \n, \r, \t and \uXXXX, so \xHH sequences are not valid there.
See json.org for more info.
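To see this concretely, here is a small JavaScript sketch (the helper isValidJson and the variable names are only illustrative, not part of ESAPI or ColdFusion). It checks that JSON.parse accepts \uXXXX but rejects \xHH inside a JSON string, and it shows what the literal from the generated page turns into once the JavaScript parser has resolved its own \xHH escapes.

```javascript
// Returns true when JSON.parse accepts the text, false when it throws.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (e) {
    return false;
  }
}

// JSON text "\u0041": \uXXXX is a legal JSON escape.
const unicodeEscape = isValidJson('"\\u0041"');
// JSON text "\x41": \xHH is not part of the JSON string grammar.
const hexEscape = isValidJson('"\\x41"');

// The literal from the output above. JavaScript resolves the \xHH escapes
// before JSON.parse runs, so the parser receives ["ha"ha""], in which the
// inner quotes are no longer escaped.
const decoded = '\x5B\x22ha\x22ha\x22\x22\x5D';
const decodedIsValid = isValidJson(decoded);

console.log(unicodeEscape, hexEscape, decoded, decodedIsValid);
```

The last check is consistent with the "Unexpected token h" error: the first string ends at the quote after "ha", and the parser then meets a bare h.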

Related

Remove operations using regex [duplicate]

I am getting a string back "1+2" and would like to remove the "+" and then add the numbers together.
Is this possible using Regex? So far I have:
let matches = pattern.exec(this.expression);
matches.input.replace(/[^a-zA-Z ]/g, "")
I am now left with two numbers. How would I add them together?
"this.a + this.b"
Assuming the returned string only contains the '+' operation, how about:
const sum = str.split('+').reduce((sumSoFar, strNum) => sumSoFar + parseInt(strNum), 0);
You cannot add two numbers using regex.
If what you have is a string of the form "1+2", why not simply split the string on the + symbol, and parseInt the numbers before adding them?
var str = "1+2";
var parts = str.split("+"); //gives us ["1", "2"]
console.log(parseInt(parts[0]) + parseInt(parts[1]));
If you don't always know what the delimiter between the two numbers is going to be you could use regex to get your array of numbers, and then reduce or whatever from there.
var myString = '1+2 and 441 with 9978';
var result = myString.match(/\d+/g).reduce((a,n)=> a+parseInt(n),0);
console.log(result); // 1 + 2 + 441 + 9978 = 10422
*Edit: If you actually want to parse the math operation contained in the string, there are a couple of options. First, if the string is from a trusted source, you could use a Function constructor. But this can be almost as dangerous as using eval, so it should be used with great caution. You should NEVER use this if you are dealing with a string entered by a user through the web page.
var myFormula = '1+2 * 441 - 9978';
var fn = new Function('return ' + myFormula);
var output = fn();
console.log(myFormula, ' = ', output); //1+2 * 441 - 9978 = -9095
A safer (but more difficult) course would be to write your own math parser which would detect math symbols and numbers, but would prevent someone from injecting other random commands that could affect global scope variables and such.
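As a rough sketch of that safer route, the following recursive-descent evaluator handles only numbers, + - * /, unary minus and parentheses. The name evaluateExpression is hypothetical, and the grammar is an assumption about what needs to be supported; any other character raises an error instead of being executed.

```javascript
// Minimal recursive-descent evaluator for + - * / and parentheses.
// Nothing is ever passed to eval or Function; unknown tokens throw.
function evaluateExpression(source) {
  // Numbers, operators and parentheses; any other non-space character
  // becomes its own token so that it is rejected later.
  const tokens = source.match(/\d+(?:\.\d+)?|[-+*\/()]|\S/g) || [];
  let pos = 0;

  // expression := term (('+' | '-') term)*
  function parseExpression() {
    let value = parseTerm();
    while (tokens[pos] === '+' || tokens[pos] === '-') {
      const op = tokens[pos++];
      const right = parseTerm();
      value = op === '+' ? value + right : value - right;
    }
    return value;
  }

  // term := factor (('*' | '/') factor)*
  function parseTerm() {
    let value = parseFactor();
    while (tokens[pos] === '*' || tokens[pos] === '/') {
      const op = tokens[pos++];
      const right = parseFactor();
      value = op === '*' ? value * right : value / right;
    }
    return value;
  }

  // factor := number | '-' factor | '(' expression ')'
  function parseFactor() {
    const token = tokens[pos++];
    if (token === '-') return -parseFactor();
    if (token === '(') {
      const value = parseExpression();
      if (tokens[pos++] !== ')') throw new SyntaxError('Expected )');
      return value;
    }
    if (token !== undefined && /^\d/.test(token)) return parseFloat(token);
    throw new SyntaxError('Unexpected token: ' + token);
  }

  const result = parseExpression();
  if (pos < tokens.length) throw new SyntaxError('Unexpected token: ' + tokens[pos]);
  return result;
}

console.log(evaluateExpression('1+2 * 441 - 9978'));
```

Because identifiers are never recognised as tokens, input such as alert(1) fails with a SyntaxError rather than running.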

How extract (changeable variable) word & number using regular expression matlab

I have more than 10k text files that look like this. They all share the same format but differ in size; some are bigger, some smaller.
[{u'language': u'english', u'area': 3825.8953168044045, u'class': u'machine printed', u'utf8_string': u'troia', u'image_id': 428035, u'box': [426.42422762784093, 225.33333055900806, 75.15151515151516, 50.909090909090864], u'legibility': u'legible', u'id': 1056659}, {u'language': u'na', u'area': 24201.285583103767, u'id': 1056660, u'image_id': 428035, u'box': [223.99998520359847, 249.57575480143228, 172.12121212121215, 140.6060606060606], u'legibility': u'illegible', u'class': u'machine printed'}]
I want to extract two variable fields from every file using a regular expression.
The output should be like this
box = [223.99998520359847, 249.57575480143228, 172.12121212121215, 140.6060606060606]
box1 = ... (sometimes there is more than one)
and the second output:
word = troia
word1 = ... (sometimes there is more than one word)
My code 1: for the word extraction
fid = fopen('text1.txt','r');
C = textscan(fid, '%s','Delimiter','');
fclose(fid);
C = C{:};
Lia = ~cellfun(@isempty, strfind(C,'utf8_string'));
output = [C{find(Lia)}];
expression = 'u''utf8_string'': u+'
matchStr = regexp(output, expression,'match');
Code 1 gives me only the text
utf8_string
My code 2: for the box number extraction
s = sprintf('text_.txt');
fid = fopen(s);
tline = fgetl(fid);
C = regexp(tline,'u''box'': +\[([0-9\. ,]+)\]','tokens');
C = cellfun(@(x) x{1},C,'UniformOutput',false)';
M = cell2mat(cellfun(@(x) x', cat(1,C2{:}),'UniformOutput',false));
Code 2 runs, but not with every text file; sometimes I get this error:
Error using cat Dimensions of matrices being concatenated are not consistent
If you do not insist on regexp: the input string looks like JSON, so the following short code does even more than you want:
% Read the whole file
s = fileread('test.txt');
% Remove the odd u'
s = strrep(s, 'u''', '''');
% Replace ' by "
s = strrep(s, '''', '"');
% See http://www.mathworks.com/matlabcentral/fileexchange/20565
t = parse_json(s);
Now t is a cell array containing structs with the data. So
word = t{1}.utf8_string;
box = cell2mat(t{1}.box);
will give you the first word and box. If you have a newer MATLAB version (R2016b or later) you can probably use jsondecode instead of parse_json.
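The normalise-then-parse idea is not specific to MATLAB. Below is the same approach sketched in JavaScript, purely for illustration: the function name parseRecords is hypothetical, and the sample is the data from the question trimmed to a few keys. It assumes the text contains no apostrophes inside values and no Python-only literals such as True or None.

```javascript
// Sample from the question, trimmed to the keys of interest.
const raw =
  "[{u'language': u'english', u'utf8_string': u'troia', " +
  "u'box': [426.42422762784093, 225.33333055900806, 75.15151515151516, 50.909090909090864]}, " +
  "{u'language': u'na', " +
  "u'box': [223.99998520359847, 249.57575480143228, 172.12121212121215, 140.6060606060606]}]";

// Turn the Python-style text into JSON and parse it.
function parseRecords(text) {
  const json = text
    .replace(/\bu'/g, "'") // drop the u prefix in front of each string
    .replace(/'/g, '"');   // JSON requires double quotes
  return JSON.parse(json);
}

const records = parseRecords(raw);
// Not every record has a word, so keep only those that do.
const words = records
  .filter((r) => 'utf8_string' in r)
  .map((r) => r.utf8_string);
const boxes = records.map((r) => r.box);

console.log(words, boxes);
```

Once the data is parsed, the number of boxes and words per file no longer matters, which avoids the concatenation error seen with the regexp approach.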

How to convert UTF→CP1251 and finally to URL-encoded %CA%F3%EF%E8%F2%FC+%EA%E2%E0%F0%F2%E8%F0%F3

I have a phrase in Russian "Купить квартиру". I need to convert it to
%CA%F3%EF%E8%F2%FC+%EA%E2%E0%F0%F2%E8%F0%F3
Encoding looks like ANSI
Note that if I call Uri.EscapeDataString("Купить квартиру"), I get
%D0%9A%D1%83%D0%BF%D0%B8%D1%82%D1%8C%20%D0%BA%D0%B2%D0%B0%D1%80%D1%82%D0%B8%D1%80%D1%83
But these strings are not equal.
Is there some correct way to convert?
Uri.EscapeDataString follows the URL spec RFC 3986, which says to use UTF-8 character encoding.
You'll need to write your own function in custom M, like this:
let
To1251URL = (input as text) as text => let
ToBytes = Binary.ToList(Text.ToBinary(input, 1251)),
ToText = Text.Combine(List.Transform(ToBytes, (b) => "%" & Number.ToText(b, "X2"))),
FixSpace = Text.Replace(ToText, "%20", "+")
in
FixSpace,
Applied = To1251URL("Купить квартиру")
in
Applied
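For comparison, here is the same conversion sketched in JavaScript, which has no built-in CP1251 encoder. The function names toCp1251Byte and encodeCp1251Url are hypothetical, and the mapping is deliberately partial: it covers ASCII, the letters А..я (U+0410..U+044F) and Ё/ё only, and throws for anything else. Unlike the M function above, it leaves unreserved ASCII characters unencoded, which is a design choice rather than a requirement.

```javascript
// Map one character to its CP1251 byte (partial mapping, see assumptions above).
function toCp1251Byte(ch) {
  const code = ch.codePointAt(0);
  if (code < 0x80) return code;                              // ASCII is unchanged
  if (code >= 0x0410 && code <= 0x044f) return code - 0x350; // А..я -> 0xC0..0xFF
  if (code === 0x0401) return 0xa8;                          // Ё
  if (code === 0x0451) return 0xb8;                          // ё
  throw new RangeError('No CP1251 mapping in this sketch for: ' + ch);
}

// Percent-encode text as CP1251 bytes, using + for spaces.
function encodeCp1251Url(text) {
  return Array.from(text, (ch) => {
    if (ch === ' ') return '+';                  // form-style space
    if (/^[A-Za-z0-9_.~-]$/.test(ch)) return ch; // unreserved characters stay as-is
    const hex = toCp1251Byte(ch).toString(16).toUpperCase().padStart(2, '0');
    return '%' + hex;
  }).join('');
}

console.log(encodeCp1251Url('Купить квартиру'));
```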

trying to find the value is numeric or integer from string

With the url string below, I need to find the value of the parameter named construction.
<cfset understand = "http://www.example.com/ops.cfm?id=code&construction=148&construction=150&Building=852&Building=665&Building=348&Building=619&Building=625&Building=626&_=1426353166006&action=SUBMIT">
<cfset understand2 = "http://www.example.com/ops.cfm?id=code&construction=AVENT+Construction+Site&construction=Signore+upper+constructions&Building=852&Building=665&Building=348&Building=619&Building=625&Building=626&_=1426353166006&action=SUBMIT">
I then want to check if the value is numeric or a string. I am doing this:
isDefined('understand') and isnumeric(understand)
But it always returns "NO".
Seems like a good case for regex, but that's not my strength. If you are always looking for the value of the same item (construction), you could take advantage of the underlying Java and use the String.split() method, then use the ColdFusion val() function to see what you get. The following solution assumes that 0 is not a valid value. If it is, you have more work to do.
<cfscript>
target=understand;
//target=understand2; //uncomment to try the second values
token="construction=";
potentialValues = target.split(token); //creates an array of values using the token as a separator
for (item in potentialValues )
{
writeoutput(val(item) & "<br />"); //val will grab the numerical value and ignore everything that follows. No number will become 0
}
</cfscript>
Try this:
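// find() takes the substring first, then the string to search
constructionIsAt = find("construction", understand);
// "construction=" is 13 characters long, so this is the first character of the value
characterAfterConstruction = mid(understand, constructionIsAt + 13, 1);
if (isNumeric(characterAfterConstruction)) {
    // code for numeric
}
else {
    // code for non numeric
}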
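Both answers work on the raw string. An alternative is to parse the query string properly and test each value. The sketch below does this in JavaScript with the standard URL API; it is an illustration of the idea rather than ColdFusion code, the function name classifyParam is hypothetical, and the URLs are shortened versions of the ones in the question.

```javascript
// Shortened versions of the two URLs from the question.
const understand =
  "http://www.example.com/ops.cfm?id=code&construction=148&construction=150&Building=852&action=SUBMIT";
const understand2 =
  "http://www.example.com/ops.cfm?id=code&construction=AVENT+Construction+Site&construction=Signore+upper+constructions&Building=852&action=SUBMIT";

// Return every value of the named parameter with a flag saying whether it
// consists of digits only.
function classifyParam(url, name) {
  return new URL(url).searchParams.getAll(name).map((value) => ({
    value,
    numeric: /^\d+$/.test(value),
  }));
}

console.log(classifyParam(understand, "construction"));
console.log(classifyParam(understand2, "construction"));
```

This also explains the original "NO": isNumeric was applied to the whole URL, not to the value of the parameter.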

Shortcut to get a statement with certain pattern in R

I have to write the following as it is.
('trial1' = Ozone1, 'trial2' = Ozone2, trial3 = Ozone3,...........trial1000 = Ozone1000)
I want to write this with one command in R. How do I do it?
I tried it using paste0
Let us take only 5 as number of repetitions:
paste0("trial",1:5,"= Ozone", 1:5)
I get this as result.
"trial1= Ozone1" "trial2= Ozone2" "trial3= Ozone3" "trial4= Ozone4" "trial5= Ozone5"
But it is not the way I wanted it. I want the output to come out as it is like (not even in inverted commas):
('trial1' = Ozone1, 'trial2' = Ozone2, 'trial3' = Ozone3, 'trial4' = Ozone4, 'trial5' = Ozone5)
Also as you can see, it is not a string i.e. output should not come between inverted commas as "........". I want it as it is exactly.
How do i do it?
This will generate the string you want:
paste0('(',paste0("'trial",1:1000,"' = Ozone",1:1000,collapse=', '),')')
This will print the string without quotes (shown here for the first 10):
print(paste0('(',paste0("'trial",1:10,"' = Ozone",1:10,collapse=', '),')'), quote=FALSE)
I hope it answered your question...
Use the collapse argument of paste0. Inside a double-quoted string the single quotes do not strictly need escaping, although writing \' is harmless:
paste0("(", paste0("\'trial",1:5,"\' = Ozone",1:5, collapse=", "), ")")
[1] "('trial1' = Ozone1, 'trial2' = Ozone2, 'trial3' = Ozone3, 'trial4' = Ozone4, 'trial5' = Ozone5)"