Detect the uppercase with Regex Yahoo Pipes - regex

I have trouble with Regex. I would like to detect the capitalized words, put a # in front of them, and also remove the spaces between them.
Example: Chris Pratt talks Jurassic World
#ChrisPratt talks #JurassicWorld
Any idea?

The regex to find two consecutive uppercased words separated by a space would be:
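As a hedged sketch in Python (the pattern and the helper name `hashtag_pairs` are assumptions for illustration, not necessarily the answerer's original regex), a capitalized word is taken to be one uppercase ASCII letter followed by lowercase letters:

```python
import re

# Assumed pattern: two adjacent capitalized words separated by a single space.
PAIR = re.compile(r"\b([A-Z][a-z]+) ([A-Z][a-z]+)\b")

def hashtag_pairs(text: str) -> str:
    """Join each pair of capitalized words and prefix the result with '#'."""
    return PAIR.sub(r"#\1\2", text)

print(hashtag_pairs("Chris Pratt talks Jurassic World"))
# prints: #ChrisPratt talks #JurassicWorld
```

Yahoo Pipes' regex module may use a different replacement syntax (for example `$1$2` instead of `\1\2`), so the replacement string would need adapting there.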


How to find capital letter using RegExp?

I need a simple solution. I have a text that is improperly punctuated and in many places a comma is followed by a capital letter. Example: Here you are, You sicko. A comma followed by a cap. Any string to find these? ,\w doesn't work. I only want caps.
I only know basic regex. I'll use it to search in Notepad++
Thank you.
Try this one:
, [A-Z]
In the general case, for any punctuation:
[.,!?-]+ [A-Z]+
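To check the idea outside Notepad++, here is a minimal Python sketch. The names `COMMA_CAP` and `PUNCT_CAP` are made up for this example, and the hyphen is written unescaped at the end of the character class:

```python
import re

# Comma, one space, then an uppercase ASCII letter.
COMMA_CAP = re.compile(r", [A-Z]")
# Generalization: a run of punctuation, one space, then one or more capitals.
PUNCT_CAP = re.compile(r"[.,!?-]+ [A-Z]+")

text = "Here you are, You sicko. A comma followed by a cap."
comma_hits = COMMA_CAP.findall(text)
punct_hits = PUNCT_CAP.findall(text)
print(comma_hits)  # [', Y']
print(punct_hits)  # [', Y', '. A']
```

In Notepad++ the "Match case" option generally has to be ticked, otherwise [A-Z] will also match lowercase letters.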

How can I extract an exact set of words from a string using a regex?

I've looked everywhere and haven't been able to find a question that answers this specific use case (maybe I've missed it). But basically I'm wanting to extract the following text from a string: Welcome James:
This text must be at the start of the string, e.g:
Welcome James: Now some text follows...blahblah - This would be a match
This is some text Welcome James: some more text... - This would not be a match.
So basically I'd hard code Welcome James: into the regex (I don't need any other variations of Welcome <name>:).
Is this possible? All I've been able to find is regexes that match single words without spaces or characters.
To search at the start of a string, just prefix the regex with the ^ (caret) character:
/^Welcome James/
Here is the answer :) But @charles gave it too!
^(Welcome James)
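A small Python sketch of the anchored match, assuming the trailing colon from the question should be part of the hard-coded text (`starts_with_welcome` is a hypothetical helper name):

```python
import re

# ^ anchors the literal text to the start of the string.
WELCOME = re.compile(r"^Welcome James:")

def starts_with_welcome(s: str) -> bool:
    """True only when the string begins with 'Welcome James:'."""
    return WELCOME.search(s) is not None

print(starts_with_welcome("Welcome James: Now some text follows...blahblah"))    # True
print(starts_with_welcome("This is some text Welcome James: some more text..."))  # False
```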

Using a Regex Pattern that finds Abbreviations

I am looking through volumes of data and need to identify certain patterns one of which is abbreviations. The basic rules to identify them in the content I am going through is
They are all in capital letters.
They are separated by dots.
They may be one or more letters.
They may or may not end with a dot.
I am looking at individual words therefore looking for multiple occurrences in the string is not required.
U.S., U.S, U.S.S.R., V.
Can someone help construct a regex search pattern for me?
Many thanks
You can use this regex:
This should do the trick:
I've used \p{Lu} (unicode uppercase letters) since you want to match any alphabet.
If you can't make \b unicode aware in your dialect, here's an alternative:
This will work. It also matches the ending dots.
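The patterns from these answers are not reproduced above, so the following is only a sketch built from the four rules in the question. It assumes ASCII capitals, because Python's built-in re module has no \p{Lu}; the name `is_abbreviation` is hypothetical:

```python
import re

# One capital, then zero or more ".<capital>" groups, then an optional final dot.
ABBREVIATION = re.compile(r"[A-Z](?:\.[A-Z])*\.?")

def is_abbreviation(word: str) -> bool:
    """Check a single word (not a whole sentence) against the rules."""
    return ABBREVIATION.fullmatch(word) is not None

for word in ["U.S.", "U.S", "U.S.S.R.", "V.", "US", "u.s."]:
    print(word, is_abbreviation(word))
```

Note that under these rules a lone capital such as "V" or "I" also counts as a match, which may or may not be wanted.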

Regular Expression to match sentences that end with special characters like . ! ? but ignore words like George W. Bush, Mr. etc

I'm looking for a regular expression to parse a text file in which the sentences end with special characters like ., ! and ? but ignore words like George W. Bush, Mr. Hopkins Mrs. Violet etc.
I tried (?!Mr|Mrs|[A-Za-z]\.\s)\S.+?[.!?](?=\s+|$) but this does not seem to be working.
English is a decidedly non-regular language. I don't think a regex will be sufficient: you'll probably need a full tokenizer, plus some kind of machine learning, possibly a Markov model, to detect where one sentence ends and the next begins. And even then it would only be a heuristic -- since human language use is sloppy, an exact solution may never be possible.
A regex cannot intelligently recognise what is an abbreviation and what is the end of a sentence.
What a regex can do is define a set of characters that mark the end of a sentence and are therefore not matched, and define a set of exceptions where those characters should be matched anyway.
See it here on Regexr.
This will not match the chars .!?
But will match those chars anyway when they are preceded by something out of this alternation etc|Dr|Mr|Mrs|\b[A-Za-z]|\s
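The exception-list idea can be sketched in Python roughly as follows. This is not the answerer's regex: it splits at sentence boundaries instead of matching sentences, the exception list is an assumption, and `split_sentences` is a hypothetical name. Python's re needs fixed-width lookbehinds, so each exception gets its own one:

```python
import re

# A boundary is whitespace that follows . ! or ? and precedes a capital,
# unless the punctuation belongs to a listed abbreviation or a single initial.
BOUNDARY = re.compile(
    r"(?<!\bMr\.)(?<!\bMrs\.)(?<!\bDr\.)(?<!\betc\.)"  # assumed exception list
    r"(?<!\b[A-Za-z]\.)"                               # single-letter initials
    r"(?<=[.!?])\s+(?=[A-Z])"
)

def split_sentences(text: str) -> list[str]:
    return BOUNDARY.split(text)

sample = ("I want to thank Mr. Shea for his work. Mr. Hugo helped as well. "
          "George W. Bush spoke! Did he?")
for sentence in split_sentences(sample):
    print(sentence)
```

As the answers point out, this remains a heuristic: any abbreviation missing from the list will still cause a wrong split.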
I'm no regex expert, but I found this regex to work well at identifying breaks between sentences.
It looks for sentence punctuation followed by a capital letter, excluding where there is a word beginning with a capital, because titles are capitalized.
Also note this is java regex, so \p{Upper} might not work.
Also, the title length of 4 is arbitrary: the regex requires a bounded length for the lookbehind, and I couldn't think of any title abbreviations longer than 4 characters.
Let me break it down for anyone learning regex.
# Don't match where we have a short word beginning with a capital (for titles)
(?=[.?!]\s*\p{Upper}) # Only match when followed by a capital. (for abbreviations)
[.?!] # match the punctuation
\s* # also match white space, so no trimming is required (optional)
And here's a nonsense testing paragraph that puts this regex through its paces:
This is a sentence. I really want to win, etc. and win more. This is pretty neat. I want to thank Mr. Shea for his work. Mr. Hugo helped as well. M. Thénardier is thankful as well. The wonderful Mr. Albert Einstien PhD. is a cool dude as well.
Edit: I've been thinking about this, and I've found one case where this regex doesn't work. Consider this phrase:
Joey loved talking to Max. This was because Max is his best friend.
In this example, Max. This is picked up as a name and title. This only happens with short names (under five characters with \w{0,4}; the 4 could be adjusted to something smaller to filter out longer titles). I can't think of any way to fix this other than learning which words are names or titles. I guess my method isn't perfect, but I think it's close enough for most circumstances.
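For anyone who wants to reproduce this limitation in Python, here is a rough port. It rests on two assumptions: the lookbehind is reconstructed from the description above as "a short capitalized word of up to five characters", and ASCII [A-Z] stands in for \p{Upper}. Python's re only accepts fixed-width lookbehinds, so \w{0,4} is expanded into five separate ones:

```python
import re

# \w{0,4} expanded into five fixed-width negative lookbehinds (0 to 4 word chars).
NOT_SHORT_CAPITALIZED = "".join(rf"(?<!\b[A-Z]\w{{{n}}})" for n in range(5))

SENTENCE_BREAK = re.compile(
    NOT_SHORT_CAPITALIZED       # not right after a short capitalized word
    + r"(?=[.?!]\s*[A-Z])"      # punctuation must be followed by a capital
    + r"[.?!]\s*"               # consume the punctuation and any whitespace
)

def split_sentences(text: str) -> list[str]:
    return SENTENCE_BREAK.split(text)

ok = split_sentences("This is pretty neat. I want to thank Mr. Shea for his work. "
                     "Mr. Hugo helped as well.")
missed = split_sentences("Joey loved talking to Max. This was because Max is his best friend.")
print(ok)
print(missed)  # one element: the break after "Max." is not detected
```

Because the punctuation is part of the match, splitting this way drops it from every sentence except the last.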

What regex can I use to match only letters, numbers, and one space between each word?

How can I create a regex expression that will match only letters and numbers, and one space between each word?
Good Examples:
Hello World
I am 500 years old
Bad Examples:
Hello world
I am 500 years old.
I am Chuck Norris
Many regex implementations (POSIX tools, PCRE and others) support named character classes:
^[[:alnum:]]+( [[:alnum:]]+)*$
You could be clever though a little less clear and simplify this to:
^([[:alnum:]]+ ?)*$
FYI, the second one allows a spurious space character at the end of the string. If you don't want that, stick with the first regex.
Also as other posters said, if [[:alnum:]] doesn't work for you then you can use [A-Za-z0-9] instead.
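A quick way to try the first pattern in Python, whose re module does not support [[:alnum:]], so [A-Za-z0-9] is used instead (`is_single_spaced` is a hypothetical helper name):

```python
import re

# One alphanumeric word, then zero or more " word" groups; fullmatch() plays
# the role of the ^ and $ anchors.
SINGLE_SPACED = re.compile(r"[A-Za-z0-9]+(?: [A-Za-z0-9]+)*")

def is_single_spaced(s: str) -> bool:
    return SINGLE_SPACED.fullmatch(s) is not None

print(is_single_spaced("I am 500 years old"))   # True
print(is_single_spaced("I am 500 years old."))  # False: trailing period
print(is_single_spaced("Hello  world"))         # False: two spaces
print(is_single_spaced("Hello "))               # False: trailing space
```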
([a-zA-Z0-9]+ ?)+?
It works.
(?:[a-zA-Z0-9]+[ ])+[a-zA-Z0-9]+
If I understand you correctly the above regex should work.
This would match a word
'[a-zA-Z0-9]+\ ?'