JSON breaking: using ColdFusion regex to remove some quotes and double quotes

I am going nuts here trying to remove some single and double quotes from my JSON response.
There are some other characters too, like periods and commas. I am trying this:
<cfset mystring = rereplace(mystring, '(['""])', '\\\1', 'all') />
but I am unable to fix it. Please guide me, thanks.

I think the problem is that you're enclosing your regex pattern string in single quotes, but then escaping the double quote inside that string, but not the single quote. You might try the following:
<cfset mystring = rereplace(mystring, "(['""])", "\\\1", "all") />
But I'm not sure that will actually do what you want. That will also escape double and single quotes where they don't need to be escaped -- such as the quotes surrounding names and values. For example, the JSON
[{"name":"value"}]
would become
[{\"name\":\"value\"}]
Surely that isn't what you want! Rather, you would need to determine where double quotes fall within strings surrounded by double quotes, and escape those (assuming they're not already escaped). I am not certain that ColdFusion regex, or any regex flavor, is up to that task. Rather, whatever service is producing the invalid JSON needs to be fixed.
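To illustrate the point, here is a minimal Python sketch (Python is used only for illustration, and the sample values are made up). It shows that blanket-escaping every quote breaks otherwise valid JSON, whereas a JSON serializer escapes only the quotes that sit inside string values.

```python
import json
import re

valid = '[{"name":"value"}]'

# Blanket escaping, as the rereplace call would do: every quote gets a backslash.
escaped = re.sub(r"""(['"])""", r"\\\1", valid)
print(escaped)  # [{\"name\":\"value\"}]

# The escaped text is no longer parseable JSON.
try:
    json.loads(escaped)
    still_valid = True
except json.JSONDecodeError:
    still_valid = False

# A serializer escapes only the quotes that occur inside a string value.
payload = json.dumps([{"name": 'say "hi", it\'s fine'}])
round_trip = json.loads(payload)
```

This is why fixing the producer, so that it serializes with a real JSON encoder, tends to be more reliable than patching the output afterwards.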

Related

Regex match on rubular but not in Ruby

I'm trying to return "A1RKKUPIHCS9HS" from string cheese:
cheese = "#<struct Peddler::Marketplace id=\"A1RKKUPIHCS9HS\",..."
I tried both scan and match like this:
cheese.match(/(?<=id=\\").{14}/)
cheese.scan(/(?<=id=\\")./)
It works on Rubular, but when I try it in Ruby, it doesn't. No idea why.
Enter the following as your test string at Rubular:
#<struct Peddler::Marketplace id="A1RKKUPIHCS9HS",...
That is, do not put the string in double quotes or escape double quotes within the string. Rubular will take care of that, just as it surrounds your regex with two forward slashes.
You want your regex to be /(?<=id=").{14}/. That's the same as /(?<=id=\").{14}/ since the double quote need not be escaped, but escaping it leaves it unchanged and therefore does no harm. Ruby treats double (and single) quotes within the regex as ordinary characters with no special meaning.
Just out of curiosity, using String#[]:
cheese = "#<struct Peddler::Marketplace id=\"A1RKKUPIHCS9HS\",..."
cheese[/(?<=id=").*?(?=")/]
#⇒ "A1RKKUPIHCS9HS"
You could do .scan(/id="(.{14})"/) as a simpler way.
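For comparison, the approaches above can be sketched in Python (used here only for illustration; the variable names other than cheese are made up). The \" in the string literal is an escape for the source code only, so the string itself contains a bare double quote and the pattern must not look for a backslash.

```python
import re

cheese = "#<struct Peddler::Marketplace id=\"A1RKKUPIHCS9HS\",..."

# The literal contains no backslash: \" is just how a quote is written in source.
has_backslash = "\\" in cheese

# Lookbehind with a fixed length of 14 characters.
fixed = re.search(r'(?<=id=").{14}', cheese).group(0)

# Lookbehind plus lookahead, any length up to the closing quote.
lazy = re.search(r'(?<=id=").*?(?=")', cheese).group(0)

# Plain capture group, no lookarounds.
captured = re.search(r'id="(.{14})"', cheese).group(1)

# The pattern from the question demands a literal backslash and finds nothing.
wrong = re.search(r'(?<=id=\\").{14}', cheese)
```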

Using regex to find a double quote within string encased in double quotes

I am using UltraEdit with regex. I would like to find (and replace) any embedded double quotes found within a string that starts and ends with a double quote. This is a text file with pipe | as the delimiter.
How do I find the embedded double quotes:
"This string is ok."|"This is example with a "C" double quoted grade in middle."|"Next line"
I eventually need to replace the double quotes in "C" to just have C.
The big trade off in CSV is correct parsing in every case versus simplicity.
This is a reasonably moderate approach. If you have really wily strings with quotes next to pipes in them, you had better use something like Perl and Text::CSV.
The trouble with a regex that requires a non-pipe character on each side of the quote (such as [^|]) is that the engine consumes the C along with the first quote and then cannot match the other quote next to the C.
This example will work pretty well as long as you don't have pipes and quotes next to each other in your actual CSV strings. The lookaheads and behinds are zero-width, so they do not remove any additional characters besides the quote.
(?<!^)(?<!\|)"(?!\|)(?!$)
1. (?<!^) Don't match quotes at the beginning of the line.
2. (?<!\|) Don't match quotes with a pipe in front.
3. (?!\|) Don't match quotes with a pipe afterwards.
4. (?!$) Don't match quotes at the end of a string.
Every quote thus matched can be removed. Don't forget to specify global replacement to get all of the quotes.
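As a rough check of the lookaround pattern, here is a Python sketch. The names EMBEDDED_QUOTE and strip_embedded_quotes are made up for the illustration, and it assumes the same limitation as above: no quotes sitting right next to pipes inside the data.

```python
import re

# Same four assertions as in the pattern above.
# MULTILINE makes ^ and $ apply to every line, not just the whole text.
EMBEDDED_QUOTE = re.compile(r'(?<!^)(?<!\|)"(?!\|)(?!$)', re.MULTILINE)


def strip_embedded_quotes(text: str) -> str:
    """Remove every quote that is neither at a line edge nor next to a pipe."""
    return EMBEDDED_QUOTE.sub("", text)


line = '"This string is ok."|"This is example with a "C" double quoted grade in middle."|"Next line"'
cleaned = strip_embedded_quotes(line)
```

Because sub replaces every match, no separate global flag is needed here.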
Try this find:
(["][^"]*)["]C["]([^"]*["])
and replace:
\1C\2
Turn on Regular Expressions in Perl mode.
[Screenshot: UltraEdit Professional Text/HEX Editor, Version 21.30.0.1005]
Trying it out.
Start with:
"This string is ok."|"This is example with a "C" double quoted grade in middle."|"Next line"
"This string is ok."|"This is example with a C double quoted grade in middle."|"Next line"
Ends with:
"This string is ok."|"This is example with a C double quoted grade in middle."|"Next line"
"This string is ok."|"This is example with a C double quoted grade in middle."|"Next line"
Breakdown of the regex FIND.
First part.
(["][^"]*)
from (["][^"]*)["]C["]([^"]*["])
This looks for a sequence of:
Double quote: ["].
Any number of characters that are not double quotes: [^"]*
The parentheses that surround ["][^"]* tell the regex engine to store this sequence of characters so that the REPLACE part can refer back to it (as a back reference).
Note that a mirrored group, ([^"]*["]), appears at the end of the pattern, meaning that there are two sequences stored.
Second part.
["]C["]
from (["][^"]*)["]C["]([^"]*["])
This looks for a sequence of:
Double quote: ["].
The capital letter C (which may or may not stand for Cookies).
Double quote: ["].
Breakdown of the regex REPLACE.
\1C\2
\1 is a back reference that means replace this with the first sequence saved.
The capital letter C (which may or may not stand for Cookies).
\2 is a back reference that means replace this with the second sequence saved.
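The same find and replace can be tried outside the editor. This is a small Python sketch of the capture-group technique described above; the names FIND and REPLACE are made up, and like the original pattern it only handles the literal "C" case.

```python
import re

# Group 1: opening quote plus text before "C"; group 2: text after "C" plus closing quote.
FIND = re.compile(r'("[^"]*)"C"([^"]*")')
REPLACE = r"\1C\2"

line = '"This string is ok."|"This is example with a "C" double quoted grade in middle."|"Next line"'
fixed = FIND.sub(REPLACE, line)

# A line that is already clean contains no "C" and is left untouched.
clean = '"This string is ok."|"This is example with a C double quoted grade in middle."|"Next line"'
unchanged = FIND.sub(REPLACE, clean)
```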
For the example you gave just "\w" works as the regex to find "C"
The replacing mechanism is probably built into UltraEdit.
You really don't want to do this with regex. You should use a CSV parser that can understand pipe delimiters. If I were to do this with just regex, I would use multiple replacements like this:
Find the good quotes and replace them with placeholder text. Start/end quote:
s/(^"|"$)/QUOTE/g
Quotes near pipe delimiters:
s/"\|"/DELIMITER/g
Now only embedded double quotes remain. To delete all of them:
s/"//g
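Now put the good quotes back. The delimiter placeholder stands for the whole "|" sequence, so it has to be restored separately:
s/QUOTE/"/g
s/DELIMITER/"|"/g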
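The placeholder steps can be sketched in Python as follows. This is an illustration under assumptions, not a robust parser: the function name is made up, the placeholders are control characters assumed never to occur in the data, and the sketch restores the delimiter placeholder to the full "|" sequence so that the pipes survive.

```python
import re

# Hypothetical placeholders: control characters assumed absent from the data.
QUOTE = "\x00"
DELIMITER = "\x01"


def remove_embedded_quotes(line: str) -> str:
    line = re.sub(r'^"|"$', QUOTE, line)   # protect the start/end quotes
    line = line.replace('"|"', DELIMITER)   # protect quotes around a pipe delimiter
    line = line.replace('"', "")            # anything left is an embedded quote
    line = line.replace(QUOTE, '"')         # restore the outer quotes
    return line.replace(DELIMITER, '"|"')   # restore the full "|" sequence


line = '"This string is ok."|"This is example with a "C" double quoted grade in middle."|"Next line"'
result = remove_embedded_quotes(line)
```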
nanny posted a good solution, but for a Perl script, not for usage in a text editor like UltraEdit.
In general it is possible to have double quotes within a field value, but each such double quote must be escaped with one more double quote. This is explained, for example, in the Wikipedia article about comma-separated values.
This very simple escaping algorithm makes reading in a CSV file character by character coded in a programming language very easy. But double quotes, separators and line breaks included in a double quoted value are a nightmare for a regular expression find and replace in a CSV file.
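As a small illustration of that escaping rule, Python's csv module (used here only as an example of a CSV parser that accepts a pipe delimiter) doubles embedded quotes when writing and undoes the doubling when reading. The sample row is taken from the question.

```python
import csv
import io

rows = [[
    "This string is ok.",
    'This is example with a "C" double quoted grade in middle.',
    "Next line",
]]

# Write with pipe delimiters, quoting every field; embedded quotes are doubled.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="|", quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerows(rows)
text = buffer.getvalue()

# Reading the text back restores the original field values.
parsed = list(csv.reader(io.StringIO(text), delimiter="|"))
```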
I have recorded several replaces into an UltraEdit macro
InsertMode
ColumnModeOff
Top
PerlReOn
Find MatchCase RegExp "^"|"$"
Replace All "QuOtE"
Find MatchCase ""|"
Replace All "QuOtE|"
Find MatchCase "|""
Replace All "|QuOtE"
Find MatchCase """"
Replace All "QuOtEQuOtE"
Find MatchCase """
Replace All """"
Find MatchCase "QuOtE"
Replace All """
The first replace is a Perl regular expression replace. Each double quote at beginning or end of a line is replaced by the string QuOtE by this replace. I'm quite sure that QuOtE does not exist in the CSV file.
Each double quote before and after the pipe character is also replaced by QuOtE by the next 2 non regular expression replaces.
Escaped double quotes "" in the CSV file are replaced next by QuOtEQuOtE with a non regular expression replace.
Now the remaining single double quotes are replaced by two double quotes to make them valid in CSV file. You could of course also remove those single double quotes.
Finally, all QuOtE are replaced back to double quotes.
Note: This is not the ultimate solution. These replaces could nevertheless produce a wrong result, for example for an already valid CSV line like this one
"first value with separator ""|"" included"|second value|"third value again with separator|"|fourth value contains ""Hello!"""|fifth value
as the result is
"first value with separator """|""" included"|second value|"third value again with separator|"|fourth value contains ""Hello!"""|fifth value
PS: The valid example line above should be displayed in a spreadsheet application as
first value with separator "|" included second value third value again with separator| fourth value contains "Hello!" fifth value
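The recorded replaces can be approximated in Python to check both the simple case and the problem line above. This is a sketch rather than UltraEdit's behavior verbatim: it assumes one line is processed at a time and that plain str.replace behaves like the editor's non-regex Replace All. The function name is made up.

```python
import re

PLACEHOLDER = "QuOtE"  # assumed not to occur in the file, as in the macro


def double_embedded_quotes(line: str) -> str:
    line = re.sub(r'^"|"$', PLACEHOLDER, line)       # quotes at line start/end
    line = line.replace('"|', PLACEHOLDER + "|")     # quote before a pipe
    line = line.replace('|"', "|" + PLACEHOLDER)     # quote after a pipe
    line = line.replace('""', PLACEHOLDER * 2)       # already escaped quotes
    line = line.replace('"', '""')                   # double the remaining quotes
    return line.replace(PLACEHOLDER, '"')            # put the protected quotes back


simple = '"This string is ok."|"This is example with a "C" double quoted grade in middle."|"Next line"'
simple_result = double_embedded_quotes(simple)

valid = ('"first value with separator ""|"" included"|second value|'
         '"third value again with separator|"|fourth value contains ""Hello!"""|fifth value')
valid_result = double_embedded_quotes(valid)
```

On the simple line the embedded quotes around C are doubled; on the already valid line the sketch reproduces the unwanted result shown above.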

Apostrophe treated as end of string in xslt

I have an XML document where I need to check for a condition:
[preceding-sibling::heading/title='ABC's Notes. --']
This expression throws the error "expected "]" found s". I need to search for the title "ABC's Notes. --", but I think 'ABC' is being interpreted as a separate string.
How should I write the above expression so that the apostrophe is not treated as the end of the string? Any input would be helpful.
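Both double ("…") and single ('…') quotes can be used to delimit strings in XSLT. To include either in a string, either use the other kind of quote to delimit the string (in your case switching to double quotes would work), or use the XML entities &apos; or &quot; for single and double quotes respectively. Because the XML parser expands entities before the XPath expression is parsed, the entity has to stand for the string delimiter rather than the embedded character: inside an attribute delimited by double quotes, write [preceding-sibling::heading/title=&quot;ABC's Notes. --&quot;].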

Regex for string with optional quotes

I'm trying to break apart a string I'm getting from a telnet service, which puts quotes at either end of a filename that has white space in it and doesn't include the quotes if there is no white space present. All the other fields are delimited by spaces, so no real issue there.
I'm trying (maybe too ambitiously!) to get the whole lot out in Regex groups. Not that it has much bearing on it, but I'm using Perl.
An example of a quoted string is:
"RAW Superleague backchat 0907 1531" movie/DV/DV100 63173952000 576000 15:21:35:24 16:34:43:01
and an unquoted string might be:
F0736584_02 movie/DV/DV100 9172224000 576000 16:04:19:00 16:14:55:24
I'm using the regex:
/^"?(.*)"$?\s(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)/
which returns the string with quotes very nicely in groups, but doesn't return the second without quotes. I thought that the optional flag would handle this, but it seems not. Any help appreciated.
Because the second line doesn't start with a whitespace. Try this, with \s? in place of the first \s:
/^"?(.*)"$?\s?(\S+)\s(\S+)\s(\S+)\s(\S+)\s(\S+)/
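Note that in the patterns above the ? follows $ rather than the closing quote, so the closing quote itself is still mandatory and a line without quotes has nothing to match it. One way around this, sketched here in Python rather than Perl and assuming a quoted filename never contains a double quote, is an alternation between the quoted and the bare form. The names LINE_RE and parse_line are made up for the illustration.

```python
import re

# Either a quoted filename (group 1) or a bare one (group 2), then five fields.
LINE_RE = re.compile(
    r'^(?:"([^"]*)"|(\S+))\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)$'
)


def parse_line(line: str):
    """Return [filename, field1, ..., field5], or None if the line does not fit."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    quoted, bare, *fields = match.groups()
    filename = quoted if quoted is not None else bare
    return [filename, *fields]


quoted_line = '"RAW Superleague backchat 0907 1531" movie/DV/DV100 63173952000 576000 15:21:35:24 16:34:43:01'
bare_line = 'F0736584_02 movie/DV/DV100 9172224000 576000 16:04:19:00 16:14:55:24'
```

The pattern uses no Python-specific syntax, so it should carry over to Perl, although that has not been checked here.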

Double quotes in single quotes and vice versa

Often I find myself inverting quotes:
from double quotes "" to single quotes '' and
from single quotes '' to double quotes "".
I know there is a way to switch single quotes to double quotes:
:%s/'\(\([^']*\)\)'/"\1"/g
And a way to switch double quotes to single quotes:
:%s/"\(\([^"]*\)\)"/'\1'/g
but how do I do both operations together without including the first swapped quotes in the 2nd swapping?
Typically, when you want to swap A & B like this, you need an intermediate step where you replace A with something entirely different and very likely to be unique within the document, whether an unusual character or something longer and crazier like |x-monkeyz-x|.
You can then convert all the Bs to As, and finally all the |x-monkeyz-x| to Bs.
For example,
Replace all ' with !X!
Replace all " with '
Replace all !X! with "
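The three steps can be sketched in Python like this. The function name is made up, and the sketch assumes, as the steps do, that the placeholder !X! never occurs in the text.

```python
PLACEHOLDER = "!X!"  # assumed not to occur anywhere in the text


def swap_quotes(text: str) -> str:
    text = text.replace("'", PLACEHOLDER)   # 1. park the single quotes
    text = text.replace('"', "'")           # 2. double quotes become single
    return text.replace(PLACEHOLDER, '"')   # 3. parked quotes become double


sample = """print("hello", 'world')"""
swapped = swap_quotes(sample)
```

Applying the function twice returns the original text, which is a quick way to check that nothing was swapped twice.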
EDIT
This is better: Easiest way to swap occurrences of two strings in Vim?
If there are no escaped quotes inside string literals and there is no need to ensure correct pairing of quotes, one can use the command
:%s/['"]/\="'\""[submatch(0)!='"']/g
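The same single-pass idea, where each matched quote is replaced by the other kind so that no placeholder is needed, can be sketched in Python. The names are made up, and like the Vim command the sketch ignores escaping and pairing.

```python
import re

SWAP = {"'": '"', '"': "'"}


def swap_quotes_single_pass(text: str) -> str:
    # Each quote is looked at once and replaced by its counterpart.
    return re.sub(r"""['"]""", lambda match: SWAP[match.group(0)], text)


# An equivalent without regex: a character translation table.
TABLE = str.maketrans("'\"", "\"'")


def swap_quotes_translate(text: str) -> str:
    return text.translate(TABLE)


sample = """print("hello", 'world')"""
```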
I usually use an intermediate string like my name that's unlikely to appear in the text:
Change single quote to UNLIKELY_STRING
Change double quote to single quote
Change UNLIKELY_STRING to double quote
Use \=:
:%s/'\([^']*\)'/\='"'.tr(submatch(1), '"', "'").'"'/g
This assumes that both characters serve only as quotes. Your initial code makes the same assumption, except that mine does not check for them being paired.