I have a column called rea that contains data like this:
oc'ean
I'm using the function tranwrd to remove the '
tranwrd(rea,'''','');
Can I use something other than '''' to represent the '?
You can quote single-quotes within double-quotes, and vice versa.
tranwrd(rea,"'",'');
But if you just want to remove the quotes, rather than replace them with another character, use the compress function:
compress(rea,"'");
Related
I want my form field input to pass through a validator that allows only letters, numbers, and three symbols: - ' /
r'^[A-Za-z0-9\s-/]+$';
I have handled everything except the ' symbol. Once I add the ' symbol, it is treated as the end of the string literal. How can I include the ' symbol?
If that single quote is still bothering you and nothing else has worked, there is another way to achieve it. It's a little tedious, but it works pretty well.
Try the regex below:
^[^\u0000-\u001f\u0021-\u0026\u0028-\u002c.\u003a-\u0040\u005b-\u0060\u007b-\uffff]+$
Basically, this regex excludes every character range that is not valid in your character set. I can add a detailed explanation once you confirm it works for you, and it should, since the regex contains no single quote, which was what caused the problem.
I had to use Unicode notation to prohibit matching non-ASCII Unicode characters.
Check this demo for valid and invalid matches
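As a rough check of the exclusion-based regex above, here is a minimal Python sketch. The names ALLOWED and is_valid are made up for illustration, and Python's re module is assumed as the engine, since the question uses Python-style r'...' literals. Note that the class only excludes code points up to \uffff, so characters beyond that range would still slip through.

```python
import re

# The exclusion-based pattern from the answer, split over two raw strings
# only for line length. It contains no single quote at all.
ALLOWED = re.compile(
    r"^[^\u0000-\u001f\u0021-\u0026\u0028-\u002c.\u003a-\u0040"
    r"\u005b-\u0060\u007b-\uffff]+$"
)


def is_valid(text: str) -> bool:
    """Return True if text consists only of characters the class does not exclude."""
    return ALLOWED.match(text) is not None


if __name__ == "__main__":
    for sample in ["O'Brien 12/3-4", "a.b", "a_b"]:
        print(repr(sample), is_valid(sample))
```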
First of all, always place - at the end of the character class; that is the safest way to use it inside brackets.
Next, adding ' to a single-quoted string literal is done with an escaped single quote, \'. Since this does not work, I suspect the problem is that you have curly quotes.
Also, consider using triple-quoted string literals, r"""<pattern>""". This is the most convenient way of writing patterns with quotes.
So you can consider using
pattern = r'''^[A-Za-z0-9\s/'‘’-]+$'''
If you get a warning, escape these special characters:
pattern = r'''^[A-Za-z0-9\s\/\'\‘\’\-]+$'''
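For illustration, the unescaped pattern above could be wired into a validator roughly like this. FIELD_PATTERN and validate_field are hypothetical names, and the pattern is written with triple double quotes so that the straight single quote needs no escaping.

```python
import re

# Letters, digits, whitespace, "/", straight and curly single quotes,
# and "-" placed last so it cannot be read as a range operator.
FIELD_PATTERN = re.compile(r"""^[A-Za-z0-9\s/'‘’-]+$""")


def validate_field(value: str) -> bool:
    """Return True if value contains only the allowed characters."""
    return FIELD_PATTERN.match(value) is not None


if __name__ == "__main__":
    for sample in ["O'Brien 12/3-4", "it’s fine", "a@b"]:
        print(repr(sample), validate_field(sample))
```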
I am attempting to use REGEXREPLACE in Google Sheets to remove the repeating special character \n.
I can't get it to replace all repeating instances of the characters with a single instance.
Here is my code:
REGEXREPLACE("Hi Gene\n\n\n\n\nHope","\\n+","\\n")
I want the results to be:
Hi Gene\nHope
But it always maintains the new lines.
Hi Gene\n\n\n\n\nHope
It has to be an issue with replacing the special characters because this:
REGEXREPLACE("Hi Gennnne\nHope","n+","n")
Produces:
Hi Gene\nHope
How do I remove repeating instances of special characters with a single instance of the special character in Google Sheets?
Edit
Just found an easier way:
=REGEXREPLACE("Hi Gene\n\n\n\n\nHope","(\\n)+","\\n")
Original solution
Try this formula:
=REGEXREPLACE(A1,REPT(F2,(len(A1)-len(REGEXREPLACE(A1,"\\n","")))/2),"\\n")
Put your text in A1.
How it works
It's a workaround; we want the final formula to look like this:
REGEXREPLACE("Hi Gene\n\n\n\n\nHope","\\n+\\n+\\n+\\n+\\n+","\\n")
The first step is to find how many times to repeat \\n+:
=(len(F1)-len(REGEXREPLACE(F1,F2,F3)))/2
Then just combine RegEx.
https://support.google.com/docs/answer/3098245?hl=en
REGEXREPLACE(text, regular_expression, replacement)
The problem seems to be how it interprets the "text". If I put this in a cell REGEXREPLACE("Hi Gene\n\n\n\n\nHope","","")
the output is Hi Gene\n\n\n\n\nHope as well.
If I place the text in a cell by itself with proper newlines and have this REGEXREPLACE(A1, "(\n)\n*", "$1") it works.
Note I could not just do s/\n+/\n/ as it still does not interpret the newline notation as anything special. It would just output \n instead of a newline.
I believe that you don't need to double escape the newlines, e.g. just search for \n:
REGEXREPLACE("Hi Gene\n\n\n\n\nHope", "\n+", "\n")
When you replace \\n you are searching for the literal text \n, rather than newline.
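The difference between the patterns in this thread can be reproduced with a small Python sketch. This assumes Python's re module as a stand-in for the engine behind REGEXREPLACE, and it assumes (as one answer above observes) that \n typed inside a formula string is a literal backslash followed by n, not a newline.

```python
import re

# Literal backslash + "n" pairs, as typed inside the formula's string argument.
literal = r"Hi Gene\n\n\n\n\nHope"

# \\n+ means: one backslash, then one or more "n". Each backslash-n pair
# matches on its own and is replaced by itself, so nothing changes.
unchanged = re.sub(r"\\n+", lambda m: r"\n", literal)

# (\\n)+ repeats the whole pair, so the run collapses to a single pair.
collapsed = re.sub(r"(\\n)+", lambda m: r"\n", literal)

# With real newline characters in the text, \n+ works directly.
real = re.sub(r"\n+", "\n", "Hi Gene\n\n\n\n\nHope")

if __name__ == "__main__":
    print(unchanged)
    print(collapsed)
    print(repr(real))
```

This would explain why grouping the pair as (\\n)+ fixes the original formula: without the group, the + applies only to the letter n.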
Using regexp_replace within PostgreSQL, I've developed (with a lot of help from SO) a pattern to match the first n characters, if the last character is not in a list of characters I don't want the string to end in.
regexp_replace(pf.long_description, '(^.{1,150}[^ -:])', '\1...')::varchar(2000)
I would expect that to simply end the string in an ellipsis. Instead, I get the first 150 characters plus the ellipsis, but then the string continues all the way to the end.
Why is all that content not being eliminated?
Because you haven't requested that. You've asked to have the first 2-151 characters replaced with those same characters and an ellipsis. If you modify the pattern to be (^.{1,150}[^ -:]).* (notice that the trailing .* makes regexp_replace work on the complete string, not just the prefix), you should get the desired effect.
Do you really want the range of characters between the space character and the colon: [^ -:]?
To include a literal - in a character class, put it first or last. Looks like you might actually want [^ :-] - that's just excluding the three characters listed.
Details about bracket expressions in the manual.
That would be (building on what @just already provided):
SELECT regexp_replace(pf.long_description, '(^.{1,150}[^ :-]).*$', '\1...');
But it should be cheaper to use substring() instead:
SELECT substring(pf.long_description, '^.{1,150}[^ :-]') || '...';
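The effect of the trailing .* can be seen in a small Python sketch. It uses Python's re module rather than PostgreSQL, so it only illustrates the regex logic; the function names are made up, and re.DOTALL is an assumption intended to let . match newlines as well.

```python
import re


def add_ellipsis_only(text: str) -> str:
    """Pattern without the trailing .*: only the prefix is replaced, the rest stays."""
    return re.sub(r"(^.{1,150}[^ :-])", r"\1...", text, count=1, flags=re.DOTALL)


def truncate(text: str) -> str:
    """Pattern with the trailing .*$: the match covers the whole string."""
    return re.sub(r"(^.{1,150}[^ :-]).*$", r"\1...", text, count=1, flags=re.DOTALL)


if __name__ == "__main__":
    long_text = "word " * 50  # 250 characters
    print(len(add_ellipsis_only(long_text)))  # full text plus the ellipsis
    print(len(truncate(long_text)))           # prefix plus the ellipsis
```

Note that both variants append the ellipsis even when the text is shorter than the limit, which matches the behavior of the SQL above.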
Often I find myself inverting quotes:
from double quotes "" to single quotes '' and
from single quotes '' to double quotes "".
I know there is a way to switch single quotes to double quotes:
:%s/'\(\([^']*\)\)'/"\1"/g
And a way to switch double quotes to single quotes:
:%s/"\(\([^"]*\)\)"/'\1'/g
but how do I do both operations together without including the first swapped quotes in the 2nd swapping?
Typically, when you want to swap A & B like this, you need an intermediate step where you replace A with something entirely different and very likely to be unique within the document, whether an unusual character or something longer and crazier like |x-monkeyz-x|.
You can then convert all the Bs to As, and finally all the |x-monkeyz-x| to Bs.
For example,
Replace all ' with !X!
Replace all " with '
Replace all !X! with "
EDIT
This is better: Easiest way to swap occurrences of two strings in Vim?
If there are no escaped quotes inside string literals and there is no need to ensure correct pairing of quotes, one can use the command
:%s/['"]/\="'\""[submatch(0)!='"']/g
I usually use an intermediate string like my name that's unlikely to appear in the text:
Change single quote to UNLIKELY_STRING
Change double quote to single quote
Change UNLIKELY_STRING to double quote
Use \=:
:%s/'\([^']*\)'/\='"'.tr(submatch(1), '"', "'").'"'/g
This assumes that both characters only serve as quotes, but your initial code does the same, except that mine does not check for them being paired.
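Outside Vim, the two approaches from this thread (a single pass that maps each quote to the other, and the three-step placeholder swap) can be sketched in Python. The function names and the default placeholder are illustrative choices, not anything from the answers, and like the Vim commands this ignores escaping and pairing.

```python
# Single pass: each quote character is mapped to the other one.
SWAP = str.maketrans({"'": '"', '"': "'"})


def swap_quotes(text: str) -> str:
    """Swap single and double quotes in one pass."""
    return text.translate(SWAP)


def swap_quotes_placeholder(text: str, placeholder: str = "\x00") -> str:
    """Three-step swap via a placeholder that must not occur in the text."""
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")
    return (
        text.replace("'", placeholder)  # ' -> placeholder
        .replace('"', "'")              # " -> '
        .replace(placeholder, '"')      # placeholder -> "
    )


if __name__ == "__main__":
    print(swap_quotes("say \"hi\" and 'bye'"))
```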
I've written a simple CSV file parser. But after looking at the wiki page on CSV formats I noticed some "extensions" to the basic format. Specifically embedded comma via double quotes. I've managed to parse those, however there is a second issue: embedded double quotes.
Example:
12345,"ABC, ""IJK"" XYZ" -> [12345] and [ABC, "IJK" XYZ]
I can't seem to find the correct way to distinguish between an embedded double quote and an enclosing one. So my question is: what is the correct way/algorithm to parse CSV formats such as the one above?
The way I normally think about this is to treat the value as either a single unquoted value or a sequence of double-quoted segments that form one value when joined by quotes. That is,
to parse the next atom in the row:
    read up to the first non-whitespace character
    if the current character is not a quote:
        mark the current spot
        read up to the next comma or newline
        return the text between the mark and the character before the comma (strip spaces if appropriate)
    if the current character is a quote:
        create an empty string buffer
        while the current character is a quote:
            mark the current position +1 (skip the quote character)
            read up to the next quote
            if this is not the first segment, append a quote to the buffer
            append to the buffer the text between the mark and the character before the current position (to strip both quotes)
            advance one character (past the just-read quote)
        read up to the next comma or newline
        return the buffer
Essentially, split the quoted string into its double-quoted segments and then concatenate them with quotes in between. Thus "ABC, ""IJK"" XYZ" is split into the segments [ABC, ], [IJK], and [ XYZ], which are joined into ABC, "IJK" XYZ.
I would do this using a single character look-ahead, so if you're scanning the string and find a double quote, look at the next character to see if it is also a double quote. If it is, then the pair represents a single doublequote character in the output. If it's any other character, you're looking at the end of the quoted string (and hopefully that next character is a comma!). Be sure to account for the end-of-line condition when looking at the next character, too.
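A minimal Python sketch of the single-character look-ahead described above might look like this. parse_csv_line is a hypothetical name, and the sketch assumes one record per call (no newlines inside quoted fields) and does no error reporting for unterminated quotes.

```python
def parse_csv_line(line: str) -> list[str]:
    """Split one CSV record into fields, honoring quotes and doubled quotes."""
    fields = []
    buf = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == '"':
                # Look ahead: a second quote means one literal quote character.
                if i + 1 < len(line) and line[i + 1] == '"':
                    buf.append('"')
                    i += 1  # skip the second quote of the pair
                else:
                    in_quotes = False  # closing quote (or end of line)
            else:
                buf.append(ch)
        elif ch == '"':
            in_quotes = True
        elif ch == ",":
            fields.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
        i += 1
    fields.append("".join(buf))
    return fields


if __name__ == "__main__":
    print(parse_csv_line('12345,"ABC, ""IJK"" XYZ"'))
```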
If you find an opening double quote, you should look for the matching double quote at the end of the word/string. If you can't find one, there is an error. The same goes for a single quote.
I suggest you try Flex/Bison in order to write a parser for the CSV file. Both tools will help you to generate a parser and then you can use the C files with the parser and call it from your C++ program.
On Flex, you create a scanner that can find your tokens, like "word" or ""word"". On Bison, you define the syntax.
A double double-quote ("") is a literal double-quote, while a lone double-quote (") is used for enclosing text (including commas).
Here's a regex for a csv field, if that makes things easier:
([^",\n][^,\n]*)|"((?:[^"]|"")+)"
Group 1 will contain the field if it isn't in quotes, group 2 will contain the field if it is in quotes, minus the surrounding quotes. In that case, just replace all instances of "" with ".
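As a sketch, the regex above could be applied in Python like this. split_fields is a made-up name, and the sketch inherits a limitation of the pattern: an empty field (two adjacent commas) matches neither alternative and is silently skipped.

```python
import re

# The field regex from the answer: group 1 is an unquoted field,
# group 2 is a quoted field without its surrounding quotes.
FIELD = re.compile(r'([^",\n][^,\n]*)|"((?:[^"]|"")+)"')


def split_fields(line: str) -> list[str]:
    """Extract the fields of one CSV line using the regex above."""
    fields = []
    for match in FIELD.finditer(line):
        if match.group(1) is not None:
            fields.append(match.group(1))
        else:
            # Quoted field: collapse each doubled quote into a single quote.
            fields.append(match.group(2).replace('""', '"'))
    return fields


if __name__ == "__main__":
    print(split_fields('12345,"ABC, ""IJK"" XYZ"'))
```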
I suggest reading Stop Rolling Your Own CSV Parser and the CSV RFC (RFC 4180). The first is really just someone who wants you to use their C# CSV parser, but it still explains many of the issues.
Your parser should examine one character at a time. I used a two-boolean strategy for my parser in D. Each quote toggles whether the string is quoted or not. When you hit a quote inside a quoted cell, you set a flag and turn off quoting. If the next character is a quote, quoting is turned back on, a quote is added to the result, and the flag is cleared. If the next character isn't a quote, the flag is cleared and quoting stays off.