How to remove last occurrence of a character in a string in Apex? - replace

I have a string as below.
*Two professional references
*Statement of purpose
*An up-to-date resume
*Official transcripts
*
Our next start is August 29, 2022
I need to remove last "*" character from this String

x.substringBeforeLast('*') + x.substringAfterLast('*');
should work well.
https://developer.salesforce.com/docs/atlas.en-us.apexref.meta/apexref/apex_methods_system_string.htm

Related

regex extract word group before the end of line

I have following samples of strings
commit: February 2021 Master DS Commit c285fcbf59b698ec5695d35989a420da7659
commit: March market Sales Volume PFJan2021
commit: August market Input PFJan2021
commit: February 2021 Master DS Commit c5695d359820da7659
I want to extract words which are in between commit: and Commit c285cb... or the end of line. The result is as follows:
February 2021 Master DS
March market Sales Volume PFJan2021
August market Input PFJan2021
February 2021 Master DS
I understand that using following regex, I can extract words after commit:
.*commit: (.*)
I have tried another regex: .*commit: (.*)Commit.* but it returns blank where Commit word doesn't exist (I believe that's the expected behaviour).
Is there a way to use some style of or condition which checks if Commit word exists return words before it and after commit:?
You can use
commit:\s*(.*?)(?:\s*Commit\b|$)
See the regex demo.
Details:
commit: - a literal string
\s* - zero or more whitespaces
(.*?) - Group 1: any zero or more chars other than line break chars as few as possible
(?:\s*Commit\b|$) - either zero or more whitespaces and Commit as a whole word or end of string.

Regex for date format dd Mmm yyyy from email header

I have the following regex that I have been working on:
^(\d\d)\s(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s(\d{4})?$
I am trying to grab the date from an email header that is formatted like so:
"Mon, 18 Nov 2019 09:19:17 -0700 (MST)"
and I want the result to be:
18 Nov 2019
It seems that the \s for whitespace could be the culprit, but I have yet to find another forum result that grabs dates with whitespace instead of "-" or "/".
Does anyone have any suggestions for getting this working to extract as described above? Thanks in advance.
The problem is that you have added the "^" and "$" symbol on the start and end of the regex.
"^n": The ^n quantifier matches any string with n at the beginning of it.
"n$": The n$ quantifier matches any string with n at the end of it.
Since the text is not start with 2 digit (\d\d) and end with 2 digit (\d{4}). You will not get any result from this regex.
You can simply remove those two symbol or use the following code to achieve that.
/(\d{2}\s(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s\d{4})/.exec("Mon, 18 Nov 2019 09:19:17 -0700 (MST)")[1]

Regexp to match only first occurrence starting from top of the text

I'm working on extracting specif string set after matching a pattern but results are not as expected. Instead of first occurrence starting from top of the text file the function picks very last occurrence.
Function:
[\n\r].*Sent:\s*([^\n\r]*)
Sample text:
From: Y Sent: Monday, November 6, 2018 6:38 AM To: X
BLA BLA
Thank you,
From: X Sent: Monday, November 5, 2018 8:38 AM To: Y
Hi Y BLA
Thanks,
Expected results:
Monday, November 6, 2018 6:38 AM
Curently returns:
Monday, November 5, 2018 8:38 AM
The first occurrence is not being matched because you start your regex with [\n\r] which matches a newline and is not present before the first line in your example data.
To get your matches, you can omit [\n\r].* from the beginning and add To: at the end. If you don't use the global flag you will get only the first occurence and your match is in the first capturing group.
Sent:\s*([^\n\r]*) To:
Regex demo
You're close. Try this:
Sent:\s?(.*?)\sTo:
This looks for 'Sent', a colon, an optional White Space, then it creates Group 1, matching any number of any char until it reaches a White Space and 'To:'.
If you set the global flag, it will match both dates, otherwise just the first.
The date will be in Group 1.

Parsing dates from string - regex

I'm terible with regex and I can't seem to wrap my head around this simple task.
I need to parse out the two dates in a string which always has one of two formats:
"Inquiry at your property for December 29, 2013 - January 03, 2014"
OR
"Inquiry at your property for 29 December , 2013 - 03 January, 2014"
the 2 different date formats are throwing me off. Any insights would be appreciated!
/(\d+ \w+, \d+|\w+ \d+, \d+)/ for example. Try it out on Rubular.
For sure, it would pickup more stuff, like 2013 NotReallyAMonth, 12345. But if you don't have things in the input that look like a date, but not actually a date this might work.
You could make the regexp stronger, but applying more restrictions on what is matched:
/(\d{2} (?:January|December), \d{4}|(?:January|December) \d{2}, \d{4})/
In this case the day is always two digits, the year is 4. Months are listed explicitly (you would have to list all of them).
Update: For ranges it would be a different regexp:
/((?:Jan|Dec) \d+ - \d+, \d{4})/
Obviously they can all be combined together:
/(\d{2} (?:January|December), \d{4}|(?:January|December) \d{2}, \d{4}|(?:Jan|Dec) \d+ - \d+, \d{4})/

Replacing multiple blank lines with one blank line using RegEx search and replace

I have a file that I need to reformat and remove "extra" blank lines.
I am using the Perl syntax regular expression search and replace functionality of UltraEdit and need the regular expression to put in the "Find What:" field.
Here is a sample of the file I need to re-format.
All current text
REPLACE with all the following:
Winter 2011 Class Schedule
Winter 2011 Class Registration Dates: Dec. 6, 2010 – Jan. 1, 2011
Winter 2011 Class Session Dates: Jan. 5 – Feb. 12, 2011
DANCE
Adventures in Ballet & Tap
3 – 6 years Instructor: Ann Newby
Tots ages 3 – 6 years old develop a greater sense of rhythm, flexibility and coordination as they explore the basic elements of movement.
Saturdays 9 - 10 a.m. Jan. 8 – Feb. 12 Six-week fees: $30
African Storytelling
3 – 6 years Instructor: Ann Newby
Tots ages 3 – 6 years old explore storytelling and fables through spoken word, music, movement and visual arts experiences.
Saturdays 10 – 11 a.m. Jan. 8 – Feb. 12 Six-week fee: $30
African Dance / Children
You'll notice that some of the double blank lines have spaces or tabs or both in them.
After the search and replace has been run I should have a file that looks like this.
All current text
REPLACE with all the following:
Winter 2011 Class Schedule
Winter 2011 Class Registration Dates: Dec. 6, 2010 – Jan. 1, 2011
Winter 2011 Class Session Dates: Jan. 5 – Feb. 12, 2011
DANCE
Adventures in Ballet & Tap
3 – 6 years Instructor: Ann Newby
Tots ages 3 – 6 years old develop a greater sense of rhythm, flexibility and coordination as they explore the basic elements of movement.
Saturdays 9 - 10 a.m. Jan. 8 – Feb. 12 Six-week fees: $30
African Storytelling
3 – 6 years Instructor: Ann Newby
Tots ages 3 – 6 years old explore storytelling and fables through spoken word, music, movement and visual arts experiences.
Saturdays 10 – 11 a.m. Jan. 8 – Feb. 12 Six-week fee: $30
African Dance / Children
Replacing
^(\s*\r\n){2,}
With
\r\n
Is what I ended up with.
This only selects blank lines in multiples of two or more and replaces them with one.
It depends what the line endings are. Assuming \n, replace this:
([ \t]*\n){3,}
with \n\n.
Try this perl oneliner perl -00pe0, if you want in place editing, just add -i option
Replacing
\n\s*\n\s*
with
\n\n
should do the trick
For completeness I want to reference here the large post Remove / delete blank and empty lines in the user forums of UltraEdit which contains at bottom after all the explanations for newbies the solution for reducing two or more lines with nothing (empty lines) or just whitespaces (blank lines) to one empty line independent on line terminator type.
And some words on what Alan Moore wrote in his answer:
UltraEdit's Perl regular expression support is not crippled by its line-based architecture. Perl regular expression engines have a flag which determine if a dot matches all characters except newline characters like carriage return (CR) and line feed (LF) or really all characters including CR and LF. This makes the difference if a text file is interpreted as large byte stream or as a sequence of lines for Perl regular expression finds/replaces. In UltraEdit the flag is set by default to not include \r (CR) and \n (LF) by a dot in the regular expression search string. But this behavior can be easily changed in UltraEdit by starting the regular expression string with (?s) which changes the value of the flag match_not_dot_newline as posted in UltraEdit user forums at topic "." in Perl regular expressions doesn't include CRLFs?
A Perl regular expression replace working for files with
carriage return + line feed (DOS/Windows) or
only line feed (Unix, Mac OS 10.0 and later versions) or
only carriage return (Mac OS 9 and previous versions)
as line ending with optionally trailing spaces and tabs at end of a paragraph (one or more lines) and with two or more lines without (empty line) or with whitespaces (blank line) below the paragraph could be done with search string \h*(\r?\n|\r)(?:\h*\1){2,} and \1\1 as replace string.
Explanation:
\h* matches any horizontal whitespace character according to Unicode 0 or more times. This first part of the search expression matches horizontal whitespace characters at end of a line like horizontal tabs, normal spaces, no-break-spaces and some other not often used spaces.
The usage of \s is not good as this character class matches any whitespace character including the vertical whitespace characters carriage return and line feed.
(\r?\n|\r) ... is an OR expression with two arguments in a marking group. The first argument matches a line feed optionally with a preceding carriage return while the second argument matches just a carriage return. So this expression matches all three common types of line terminations completely correct. It is important for the rest of the search and the replace to match always either CR+LF (both together) or just LF or just CR.
(?:\h*\1) ... is a non marking group which matches 0 or more horizontal whitespaces and the newline as found before back-referenced with \1, i.e. CR+LF or just LF or just CR. So this part of the expression finds an empty or blank line.
{2,} ... is a multiplier for the previous expression in the non marking group which means at least two times. So after end of a paragraph there must be two or more empty or blank lines. Only one empty or blank line below a paragraph is not enough for a positive match of search expression.
The replace string \1\1 references twice the first found line break.
The advantage of this regular expression in comparison to the others posted here is that the line ending type must not be known. The search expression finds that out and found line ending is referenced in the replace string. And probably existing trailing whitespaces at end of a paragraph and whitespaces on next line are removed also by this regular expression replace if there are two or more empty or blank lines below a paragraph.
{2,} can be replaced by + in search string if trimming whitespaces at end of a paragraph and on next empty or blank line should be also done on running this Perl regular expression replace. But please note that in this case the replace makes replaces which do not change anything at all if there are not trailing whitespaces at end of a paragraph and next line is an empty line.
In Vim, Using
:%!cat -s
I find this is the easiest way to delete extra empty line so far.
I'm not sure what UltraEdit lets you get away with in the "replace" area, but if you cannot use a newline (I've had this problem before) but can use capture references, this might work:
Find : \s*(\r\n)\s*(\r\n)\s*\r\n
Replace : $1$2
Not tested extensively, but seems to work on the sample you provided.
See this thread for what's causing the problem. As I understand it, UltraEdit regexes are greedy at the character level (i.e., within a line), but non-greedy at the line level (roughly speaking). I don't have access to UE, but I would try writing the regex so it has to match something concrete after the last blank line. For example:
search: (\r\n[ \t]*){2,}(\S)
replace: $1$2
This matches and captures two or more instances of a line separator and any horizontal whitespace that follows it, but it only retains the last one. The \S should force it to keep matching until it finds a line with at least one non-whitespace character.
I admit that I don't have a whole lot of confidence in this solution; UltraEdit's regex support is crippled by its line-based architecture. If you want an editor that does regexes right, and you don't want to learn a whole new regex syntax (like vim's), get EditPadPro.
Should also work with spaces on blank lines
Search - /\n^\s*\n/
Replace - \n\n
On my Intellij IDE what was search for \n\n and Replace it by \n