Notepad++ Capitalize Every First Letter of Every Word - regex

Input: "notepad++ capitalize every first letter of every word"
Output: "Notepad++ Capitalize Every First Letter Of Every Word"
I have been attempting to capitalize the first letter of every word using Ctrl+F and regex.
So far I have tried find: \b(.) or \<(.) with replace: \u\1, but this results in all of my letters being capitalized.
I have made do with ^(.) & \u\1 followed by \s\b(.) & \u\1.
However, this seems silly to me as there are many posts talking about using the start of word boundaries. I am just having difficulty making them work. Thanks for your consideration!

Background
According to Notepad++ specification (see Substitutions section), there are three operators that can be useful when turning substrings uppercase:
\u
Causes next character to output in uppercase
\U
Causes next characters to be output in uppercase, until a \E is found.
\E
Puts an end to forced case mode initiated by \L or \U.
Thus, you can either match a substring and turn its first character uppercase with \u to capitalize it, or match a character and use \U/\E.
Note that Unicode characters won't be turned uppercase, only ASCII letters are affected.
BOW (Beginning of Word) Bug in Notepad++
Note that currently (in Notepad++ v.6.8.8) the beginning of word does not work for some reason. A common solution that works with most engines (use it in Sublime Text and it will match) does not work:
\b(\w)
This regex matches all word characters irrespective of their position in the string.
I logged a bug: "Word boundary issue with a generic subpattern next to it" (#1404).
Solution #1 (for the current Notepad++ v.6.8.8)
The first solution is to search for \w+ and replace with \u$0 (no capturing groups needed). This does not match only the characters at the beginning of a word; the pattern matches whole chunks of word characters ([a-zA-Z0-9_] plus all Unicode letters/digits), and the replacement turns the first character of each chunk uppercase.
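As a rough cross-check outside Notepad++, the same idea can be sketched in Python. Python's re module has no \u replacement operator, so this sketch assumes a replacement function in its place; the name capitalize_words is made up for illustration.

```python
import re

def capitalize_words(text: str) -> str:
    """Uppercase the first character of every run of word characters."""
    # \w+ matches a whole chunk of word characters, as in the Notepad++ search;
    # the lambda stands in for the \u$0 replacement.
    return re.sub(r"\w+", lambda m: m.group(0)[0].upper() + m.group(0)[1:], text)

print(capitalize_words("notepad++ capitalize every first letter of every word"))
```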
Solution #2 (for the current Notepad++ v.6.8.8)
The second solution can be implemented with special boundaries defined with lookbehinds:
(?:(?<=^)|(?<=\W))\w
And replace with \U$0\E.
The regex (?:(?<=^)|(?<=\W))\w matches an alphanumeric only at the beginning of a line ((?<=^)) or after a non-word character ((?<=\W)).
The replacement - \U$0\E - contains a \U flag that starts turning letters uppercase and \E is a flag that tells Notepad++ to stop converting case.
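The lookbehind pattern itself is not specific to Notepad++. A minimal Python sketch of the same boundary trick follows; the names WORD_START and capitalize_word_starts are hypothetical, and a replacement function is assumed in place of \U$0\E because Python's re module has no case operators.

```python
import re

# A word character at the start of a line, or right after a non-word character.
WORD_START = re.compile(r"(?:(?<=^)|(?<=\W))\w", re.MULTILINE)

def capitalize_word_starts(text: str) -> str:
    # Only the single matched character is uppercased, like \U$0\E on a one-character match.
    return WORD_START.sub(lambda m: m.group(0).upper(), text)

print(capitalize_word_starts("notepad++ capitalize every first letter of every word"))
```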
Edge case
In case you have hyphenated words, like well-known, and you only want the first part to be capitalized, you can use [\w-]+ with \u$0 replacement. It will also keep strings like -v or --help intact.
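To see what the hyphen-aware pattern does to such strings, here is a small Python sketch. It is an illustration only: the function name is made up, and a replacement function is assumed in place of the \u$0 replacement.

```python
import re

def capitalize_hyphenated(text: str) -> str:
    # [\w-]+ treats "well-known" as one chunk, so only its first character changes.
    # For "-v" or "--help" the first character is a hyphen, which has no uppercase form.
    return re.sub(r"[\w-]+", lambda m: m.group(0)[0].upper() + m.group(0)[1:], text)

print(capitalize_hyphenated("a well-known option is --help or -v"))
```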

A simpler regex that worked for me:
Find: (\w+)
Replace: \u$0

There is a shortcut available in Notepad++ v7.3.2 to capitalize every first letter of every word.
ALT + U
Not sure about prior versions.

Uppercase The First Letter Of Every Word:
Use the shortcut: Alt + U
lowercase the first letter of every word:
Use the shortcut: Ctrl + U
Shortcut working in version 7.6.3

I have achieved something similar by recording a macro that uses the following replacement.
Find what: ([a-z])+
Replace with: \u$0\E
Tick 'In selection'
This is the resulting macro that I extracted from C:\Users\%USERNAME%\AppData\Roaming\Notepad++\shortcuts.xml.
<Macro name="Title Case" Ctrl="no" Alt="no" Shift="no" Key="0">
<Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
<Action type="3" message="1601" wParam="0" lParam="0" sParam="([A-Z])" />
<Action type="3" message="1625" wParam="0" lParam="2" sParam="" />
<Action type="3" message="1602" wParam="0" lParam="0" sParam="\L$0" />
<Action type="3" message="1702" wParam="0" lParam="898" sParam="" />
<Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />
<Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
<Action type="3" message="1601" wParam="0" lParam="0" sParam="([a-z])+" />
<Action type="3" message="1625" wParam="0" lParam="2" sParam="" />
<Action type="3" message="1602" wParam="0" lParam="0" sParam="\u$0\E" />
<Action type="3" message="1702" wParam="0" lParam="898" sParam="" />
<Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />
</Macro>
Extra: you can add this to your right-click context menu (contextMenu.xml) using:
<Item MenuEntryName="Macro" MenuItemName="Title Case" />

Related

The problem of using regular expressions in the shell script

I have a regex that removes the content in the Activity tag. The regex is \s*<activity .*>(?:\s|\S)*<\/activity>. It is possible in Java, but it will not work when written in the shell. The wording in the shell is as follows:
sed 's+\s*<activity .*>(?:\s|\S)*<\/activity>++g' AndroidManifest.xml
AndroidManifest.xml
<?xml version="1.0" encoding="utf-8"?>
<!-- GENERATED BY UNITY. REMOVE THIS COMMENT TO PREVENT OVERWRITING WHEN EXPORTING AGAIN-->
<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.unity3d.player" xmlns:tools="http://schemas.android.com/tools">
<application>
<activity android:name="com.unity3d.player.UnityPlayerActivity" android:theme="#style/UnityThemeSelector" android:screenOrientation="fullSensor" android:launchMode="singleTask" android:configChanges="mcc|mnc|locale|touchscreen|keyboard|keyboardHidden|navigation|orientation|screenLayout|uiMode|screenSize|smallestScreenSize|fontScale|layoutDirection|density" android:hardwareAccelerated="false">
<intent-filter>
<action android:name="android.intent.action.MAIN" />
<category android:name="android.intent.category.LAUNCHER" />
</intent-filter>
<meta-data android:name="unityplayer.UnityActivity" android:value="true" />
<meta-data android:name="android.notch_support" android:value="true" />
</activity>
<meta-data android:name="unity.splash-mode" android:value="0" />
<meta-data android:name="unity.splash-enable" android:value="True" />
<meta-data android:name="notch.config" android:value="portrait|landscape" />
<meta-data android:name="unity.build-id" android:value="07a923ed-bdbd-46ed-98bd-afef17a7904a" />
</application>
<uses-feature android:glEsVersion="0x00030000" />
<uses-feature android:name="android.hardware.vulkan.version" android:required="false" />
<uses-feature android:name="android.hardware.touchscreen" android:required="false" />
<uses-feature android:name="android.hardware.touchscreen.multitouch" android:required="false" />
<uses-feature android:name="android.hardware.touchscreen.multitouch.distinct" android:required="false" />
</manifest>
What should I do? Thanks.
Regular expression syntax is roughly classified into three
variants: BRE, ERE and PCRE. The last has the most features and
expressive power. Your regex is written in PCRE, while sed supports only up to ERE.
Another problem is that sed processes the input file line by line, so
it requires a trick to make a sed regex match across lines.
With sed please try the following:
sed -E '
:l # define a label "l"
N # append the next line of input into the pattern space
$!b l # repeat until the last line
# then whole lines are stored in the pattern space
s+[[:blank:]]*<activity .*>.*<\/activity>++g
# perform the replace command over the pattern space
' AndroidManifest.xml
The -E option enables ERE.
The script first slurps the whole file, then performs the replacement.
If Perl is an option, you can apply your regex as is:
perl -0777 -pe 's+\s*<activity .*>(?:\s|\S)*<\/activity>++g' AndroidManifest.xml
There is one caveat regarding the (?:\s|\S)* expression. The quantifier *
is greedy and tries to match as much as possible. If the XML file contains multiple <activity> .. </activity>
blocks, everything from the first opening tag to the last closing tag is removed, including the intermediate lines that should
be kept. It is better to rewrite it as (?:\s|\S)*? or,
more commonly, [\s\S]*?.
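The difference between the greedy and the lazy quantifier can be checked with a short Python sketch. The two-activity manifest fragment below is made up for this illustration and is not the asker's file.

```python
import re

# Hypothetical fragment with two <activity> blocks and one element between them.
manifest = """<application>
  <activity android:name="First">
  </activity>
  <meta-data android:name="keep.me" />
  <activity android:name="Second">
  </activity>
</application>"""

# Greedy: runs from the first <activity ...> to the LAST </activity>.
greedy = re.sub(r"\s*<activity .*>[\s\S]*</activity>", "", manifest)
# Lazy: each match stops at the nearest </activity>.
lazy = re.sub(r"\s*<activity .*?>[\s\S]*?</activity>", "", manifest)

print("keep.me" in greedy)  # the element between the blocks is lost
print("keep.me" in lazy)    # the element between the blocks survives
```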

How to Create Variable Definitions from Phrases using Regexes (Notepad++)

Suppose we have a list of phrases - words separated by spaces. And suppose we want to define a bunch of variables based on these phrases such that the following hold:
Phrases already exist and are surrounded by quotes (if not, you can easily use a regex to achieve this)
Phrases only contain letters (this actually isn't true for me in practice, but I can handle those cases manually)
Variable name, followed by an equals sign, should precede the phrase
Variable name should be a lowerCamelCase version of the phrase
Example
Input
"hello World"
"foo bAr"
Expected Output
helloWorld = "hello World"
fooBar = "foo bAr"
Use Case
Often in my line of work I am presented with a bunch of constants which come from an Excel spreadsheet and I need to define a bunch of variables in code for them. The phrases have spaces in them, but the variables can't. I'd usually like to keep the variable names as close to the phrases as I can. I'd like a way to do it in bulk, without having to individually type out each variable name.
Notes
I have come up with a way to do this, which I want to record here in case I need it in future and in case others might need it. I also want to post it here because I have a feeling there are optimizations that can be made to my process, or at least alternatives.
I haven't found a single find/replace step that'll do everything for you, but I have managed to do it using a sequence of regexes, applied one after another. The first pulls out the content for the variable name and inserts the "=". The next one does the main heavy lifting and removes spaces and applies the correct casing. The final one ensures all variables begin with lowercase letters. Apply them in sequence to achieve the desired result.
Regex #1
All we're doing here is pulling content out of the quotes and inserting it on the left hand side.
Note: here and below, I need to use this character for whitespace because SO doesn't render it correctly: ␣. So replace that with a space when you use this or other regexes in this answer.
Find: "(.+)"
Replace: ␣\1 = "\1" (note the leading space)
In our example, after this step, we end up with:
hello World = "hello World"
foo bAr = "foo bAr"
Regex #2
Here, we want to match each word on the left hand side, with the goal of removing the whitespace and simultaneously fixing the casing.
Find: ␣(\S)(\S+)(?=.*=) (note the leading space)
Replace: \u\1\L\2 (absence of space in replacement pattern achieves the removal of the space)
After this step, we end up with:
HelloWorld = "hello World"
FooBar = "foo bAr"
Correct, except for the first letter of each variable name.
Regex #3
This fixes the leading characters to be lowercase:
Find: ^(.)
Replace: \l\1
After this step, our output is as desired:
helloWorld = "hello World"
fooBar = "foo bAr"
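For comparison, the three steps can be approximated in a single pass in Python. This is only a sketch under the same assumption that phrases contain letters and spaces; the function name define_variable is hypothetical, and string methods stand in for the \u, \L and \l operators, which Python's re module does not have.

```python
import re

def define_variable(line: str) -> str:
    """Turn a quoted phrase into a 'lowerCamelCase = "phrase"' definition."""
    match = re.fullmatch(r'"(.+)"', line.strip())
    if match is None:
        return line  # not a quoted phrase: leave the line untouched
    phrase = match.group(1)
    words = phrase.split()
    if not words:
        return line  # phrase was only whitespace
    # Regex #2 equivalent: uppercase each word's first letter, lowercase the rest,
    # and drop the spaces between words.
    name = "".join(w[0].upper() + w[1:].lower() for w in words)
    # Regex #3 equivalent: lowercase the first character of the whole name.
    name = name[0].lower() + name[1:]
    return f'{name} = "{phrase}"'

for raw in ['"hello World"', '"foo bAr"']:
    print(define_variable(raw))
```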
Optional Regex #4 (Remove Invalid Characters)
Though the requirement assumed all letters, this is often not the case. First, you may want some numbers in there. Second, there may be junk like parentheses. In this case, just do a find/replace with the replace expression empty for the following find expression:
[^\w\r\n](?!=)(?=.*=)
This first matches any character that is not a letter, a digit, an underscore or an end-of-line character. It then ensures that the match is followed by an = further along the line but not immediately followed by an =, so the space before the = is preserved.
As a Macro
Rather than manually do all 4 steps above, you can record them as a macro and save it to your Notepad++. Or just paste the XML below inside the <Macros> XML element in the file shortcuts.xml inside %appdata%\Notepad++. If you do paste, the shortcut is ctrl+alt+shift+V, but you can change that to whatever you want:
<Macro name="DefineVariables" Ctrl="yes" Alt="yes" Shift="yes" Key="86">
<Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
<Action type="3" message="1601" wParam="0" lParam="0" sParam="&quot;(.+)&quot;" />
<Action type="3" message="1625" wParam="0" lParam="2" sParam="" />
<Action type="3" message="1602" wParam="0" lParam="0" sParam=' \1 = "\1"' />
<Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
<Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />
<Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
<Action type="3" message="1601" wParam="0" lParam="0" sParam=" (\S)(\S+)(?=.*=)" />
<Action type="3" message="1625" wParam="0" lParam="2" sParam="" />
<Action type="3" message="1602" wParam="0" lParam="0" sParam="\u\1\L\2" />
<Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
<Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />
<Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
<Action type="3" message="1601" wParam="0" lParam="0" sParam="^(.)" />
<Action type="3" message="1625" wParam="0" lParam="2" sParam="" />
<Action type="3" message="1602" wParam="0" lParam="0" sParam="\l\1" />
<Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
<Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />
<Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
<Action type="3" message="1601" wParam="0" lParam="0" sParam="[^\w\r\n](?!=)(?=.*=)" />
<Action type="3" message="1625" wParam="0" lParam="2" sParam="" />
<Action type="3" message="1602" wParam="0" lParam="0" sParam="" />
<Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
<Action type="3" message="1701" wParam="0" lParam="1609" sParam="" />
</Macro>

What does the following regex match?

I am trying to use IIS URL rewrite to take a user to a WWW domain instead of a non-WWW. I came across an article which uses the following regex to match domain names:
^[^\.]+\.[^\.]+$
I can't figure out what sort of domain is being matched with this regex. Here is the complete piece of code:
<rule name="www redirect" enabled="true" stopProcessing="true">
<match url="." />
<conditions>
<add input="{HTTP_HOST}" pattern="^[^\.]+\.[^\.]+$" />
<add input="{HTTPS}" pattern="off" />
</conditions>
<action type="Redirect" url="http://www.{HTTP_HOST}/{R:0}" />
</rule>
<rule name="www redirect https" enabled="true" stopProcessing="true">
<match url="." />
<conditions>
<add input="{HTTP_HOST}" pattern="^[^\.]+\.[^\.]+$" />
<add input="{HTTPS}" pattern="on" />
</conditions>
<action type="Redirect" url="https://www.{HTTP_HOST}/{R:0}" />
</rule>
^ # anchor the pattern to the beginning of the string
[^\.] # negated character class: matches any character except periods
+ # one or more of those characters
\. # matches a literal period
[^\.] # negated character class: matches any character except periods
+ # one or more of those characters
$ # anchor the pattern to the end of the string
The anchors are important to make sure that there is nothing around the domain that is not allowed.
As Tim Pietzker mentioned, the periods do not need to be escaped inside the character class.
To answer your question in the most basic way: it matches any string that contains exactly one ., which is neither the first nor the last character.
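A quick way to convince yourself is to run the pattern against a few host names. The Python sketch below does that; the sample hosts are made up for illustration.

```python
import re

BARE_DOMAIN = re.compile(r"^[^\.]+\.[^\.]+$")

# Hypothetical host names: one bare domain and several that should not match.
hosts = ["example.com", "www.example.com", "localhost", ".com", "example."]
results = {host: bool(BARE_DOMAIN.match(host)) for host in hosts}

for host, matched in results.items():
    print(f"{host!r}: {matched}")
```

Only the bare domain with a single interior period matches, which is why the rule redirects example.com but leaves www.example.com alone.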

Merging two lines into one - Notepad++

I have a line like this
assignee: Akebono Brake Industry Co. Ltd. ,
Fujitsu Limited application_no: 06/946,825
I want the output to be
assignee: Akebono Brake Industry Co. Ltd. , Fujitsu Limited
application_no: 06/946,825
To bring the application_no: 06/946,825 to the next line, I can find application_no: and replace it with \napplication_no: in my NOTEPAD++
But, how can I bring that string that spans to next line back to the first line? I mean what should I do to get the Fujitsu Limited to the line starting with assignee:
Any guidance please?
Since Extended is the only mode that handles the newlines correctly but you need to match with regular expressions, you will need to do this in two steps.
First, use a regex find and replace to add some recognizable token to the beginning of each line you want to move up, I used 'MATCH' but you could definitely change this.
Then, switch to Extended to search for a newline followed by the token, and replace it with an empty string to delete both the line break and the token.
Here is another solution. One step:
Search:
(^.*,.*$)\r\n([ A-Z]*[ ])
Replace:
\1\2\r\n
I am unfamiliar with Notepad++, but surely there is a "\n" after the comma? Could you not just remove the character that creates the line break, i.e. the inverse of what you are doing to application_no:?
You can't do that with regular expressions due to a flaw in the Scintilla engine which Notepad++ uses. However, it works in "extended" find mode, so use that. Search for ,\r\n and replace with ,.
Change the \r\n to only \n on Linux, or to only \r on Mac OS.
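Outside Notepad++, the two steps (join after the trailing comma, then break before application_no:) can be sketched in Python. This assumes, as in the example, that a continued line always ends with a comma; the function name rejoin_fields is made up, and the output uses \n line endings.

```python
import re

def rejoin_fields(text: str) -> str:
    # Step 1: join a line that ends in a comma with the line that follows it.
    text = re.sub(r",[ \t]*\r?\n", ", ", text)
    # Step 2: start a new line before every "application_no:" label.
    return re.sub(r"[ \t]+(application_no:)", r"\n\1", text)

sample = (
    "assignee: Akebono Brake Industry Co. Ltd. ,\n"
    "Fujitsu Limited application_no: 06/946,825"
)
print(rejoin_fields(sample))
```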
I've just written this macro, and it works with your example.
Add this macro to shortcuts.xml; if you are using Windows 7, the file is located at C:\Users\{username}\AppData\Roaming\Notepad++
Just open your text file, put the cursor on the first line, then run this macro.
<Macro name="stackoverflow" Ctrl="no" Alt="no" Shift="no" Key="0">
<Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
<Action type="3" message="1601" wParam="0" lParam="0" sParam="application_no" />
<Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
<Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
<Action type="3" message="1701" wParam="0" lParam="1" sParam="" />
<Action type="0" message="2302" wParam="0" lParam="0" sParam="" />
<Action type="0" message="2451" wParam="0" lParam="0" sParam="" />
<Action type="0" message="2306" wParam="0" lParam="0" sParam="" />
<Action type="0" message="2326" wParam="0" lParam="0" sParam="" />
<Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
<Action type="3" message="1601" wParam="0" lParam="0" sParam="application_no" />
<Action type="3" message="1625" wParam="0" lParam="0" sParam="" />
<Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
<Action type="3" message="1701" wParam="0" lParam="1" sParam="" />
<Action type="0" message="2308" wParam="0" lParam="0" sParam="" />
<Action type="1" message="2170" wParam="0" lParam="0" sParam="
" />
<Action type="1" message="2170" wParam="0" lParam="0" sParam="
" />
</Macro>

URLRewrite IIS 7

I am trying to perform a simple URL rewrite. If you visit azamsharp.com, it currently takes you to a folder browsing structure; it should go to http://www.azamsharp.com/AzamSharpWebApps/Default.aspx.
I don't want to see AzamSharpWebApps in the URL. Here is the URL rewrite I am using:
<system.webServer>
<rewrite>
<rules>
<rule name="Virtual Director" enabled="true" stopProcessing="false">
<match url=".*" />
<conditions>
<add input="{MyDomains:{HTTP_HOST}}" pattern="(.+)" />
</conditions>
<action type="Rewrite" url="{C:1}{REQUEST_URI}" />
</rule>
</rules>
<rewriteMaps>
<rewriteMap name="MyDomains">
<add key="azamsharp.com" value="/AzamSharpWebApps/default.aspx" />
<add key="www.azamsharp.com" value="/AzamSharpWebApps/default.aspx" />
</rewriteMap>
</rewriteMaps>
</rewrite>
</system.webServer>
I don't know how IIS UrlRewrite works, but at a general regex level you want to replace ^(?!/AzamSharpWebApps).+ with /AzamSharpWebApps/$0
The ^ is start of string, and the (?!..) is a negative lookahead, saying "make sure the following text is not /AzamSharpWebApps", and then the .+ matches any character until the end of the string.
On the replace side, the $0 indicates the entire matched text, so basically the regex is saying "if it doesn't already start with /AzamSharpWebApps, prefix it with that".
(You'll need to experiment with whether you need a / before the $0 or not.)
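At the general regex level, the behaviour can be checked with a few lines of Python. This is only an illustration of the lookahead, not of IIS itself; the function name add_prefix is made up, and the paths are assumed to start with a slash.

```python
import re

PREFIX = "/AzamSharpWebApps"

def add_prefix(path: str) -> str:
    # ^(?!...) lets the match start only when the path does not already begin with the prefix;
    # the lambda plays the role of "prefix + $0" in the replacement.
    return re.sub(r"^(?!/AzamSharpWebApps).+", lambda m: PREFIX + m.group(0), path)

print(add_prefix("/Default.aspx"))
print(add_prefix("/AzamSharpWebApps/Default.aspx"))
```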
Anyhow, it looks like IIS uses {R:0} instead of $0 for the text matched by the rule pattern ({C:0} refers to a condition match), and it matches the URL path without its leading slash, so I guess it will look something like this inside the rule:
<match url="^(?!AzamSharpWebApps).+" />
<action type="Rewrite" url="/AzamSharpWebApps/{R:0}" />
But possibly it needs more than just that.