Regex in Atom.io search field - regex

So, for example, I have this list:
10text
11text
12text
13text
14text
15text
16text
17text
18text
19text
Now I need to copy this and make it all into the 20-29 and 30-39 range with a Regex.
So only the very first 1 needs to be changed into a 2 and a 3 etc.
I can't seem to figure out what the regex is.
I tried: 1.text 1*text
Now I have been doing some reading and probably because it ain't in my native language it is still magic to me.
http://www.ntu.edu.sg/home/ehchua/programming/howto/regexe.html
https://www.regular-expressions.info/tutorial.html
http://2017.compciv.org/guide/topics/end-user-software/atom/how-to-use-regex-atom.html
Reference - What does this regex mean?
https://stackoverflow.com/tags/regex/info
What was I hoping for
When I fill in this in the search field:
1*text
and this in the replace field:
2*text
then on pressing change all the list becomes
20text
21text
22text
23text
24text
25text
26text
27text
28text
29text

I don't know exactly what you are trying to do here, but if you want to convert 1xtext to 2xtext, then try searching for this pattern:
^1
and then replace with just 2. Or, more generally, to match something like (1xtext) you could try using the pattern \b1, and again replace with 2.

Related

VB.Net Beginner: Replace with Wildcards, Possibly RegEx?

I'm converting a text file to a Tab-Delimited text file, and ran into a bit of a snag. I can get everything I need to work the way I want except for one small part.
One field I'm working with has the home addresses of the subjects as a single entry ("1234 Happy Lane Somewhere, St 12345") and I need each broken down by Street(Tab)City(Tab)State(Tab)Zip. The one part I'm hung up on is the Tab between the State and the Zip.
I've been using input=input.Replace throughout, and it's worked well so far, but I can't think of how to untangle this one. The wildcards I'm used to don't seem to be working, I can't replace ("?? #####") with ("??" + ControlChars.Tab + "#####")...which I honestly didn't expect to work, but it's the only idea on the matter I had.
I've read a bit about using Regex, but have no experience with it, and it seems a bit...overwhelming.
Is Regex my best option for this? If not, are there any other suggestions on solutions I may have missed?
Thanks for your time. :)
EDIT: Here's what I'm using so far. It makes some edits to the line in question, taking care of spaces, commas, and other text I don't need, but I've got nothing for the State/Zip situation; I've a bad habit of wiping something if it doesn't work, but I'll append the last thing I used to the very end, if that'll help.
If input Like "Guar*###/###-####" Then
input = input.Replace("Guar:", "")
input = input.Replace(" ", ControlChars.Tab)
input = input.Replace(",", ControlChars.Tab)
input = "C" + ControlChars.Tab + strAccount + ControlChars.Tab + input
End If
input = System.Text.RegularExpressions.Regex.Replace(" #####", ControlChars.Tab + "#####") <-- Just one example of something that doesn't work.
This is what's written to input in this example
" Guar: LASTNAME,FIRSTNAME 999 E 99TH ST CITY,ST 99999 Tel: 999/999-9999"
And this is what I can get as a result so far
C 99999/9 LASTNAME FIRSTNAME 999 E 99TH ST CITY ST 99999 999/999-9999
With everything being exactly what I need besides the "ST 99999" bit (with actual data obviously omitted for privacy and professional whatnots).
UPDATE: Just when I thought it was all squared away, I've got another snag. The raw data gives me this.
# TERMINOLOGY ######### ##/##/#### # ###.##
And the end result is giving me this, because this is a chunk of data that was just fine as-is...before I removed the Tabs. Now I need a way to replace them after they've been removed, or to omit this small group of code from a document-wide Tab genocide I initiate the code with.
#TERMINOLOGY###########/##/########.##
Would a variant on rgx.Replace work best here? Or can I copy the code to a variable, remove Tabs from the document, then insert the variable without losing the tabs?
I think what you're looking for is
Dim r As New System.Text.RegularExpressions.Regex(" (\d{5})(?!\d)")
Dim input As String = rgx.Replace(input, ControlChars.Tab + "$1")
The first line compiles the regular expression. The \d matches a digit, and the {5}, as you can guess, matches 5 repetitions of the previous atom. The parentheses surrounding the \d{5} is known as a capture group, and is responsible for putting what's captured in a pseudovariable named $1. The (?!\d) is a more advanced concept known as a negative lookahead assertion, and it basically peeks at the next character to check that it's not a digit (because then it could be a 6-or-more digit number, where the first 5 happened to get matched). Another version is
" (\d{5})\b"
where the \b is a word boundary, disallowing alphanumeric characters following the digits.

Complex regex situation

I have a results list that looks like this:
1lemon_king9mumu (2-1), YearofHell (2-0), kriswithak (2-1)0.44440.75000.4444
2mumu6lemon_king (1-2), MogwaiAC (2-0), Dathanja (2-1)0.66670.62500.5655
3MogwaiAC6Dathanja (2-0), mumu (0-2), Jebnarf (2-1)0.55560.57140.5417
4Jebnarf6YearofHell (2-1), kriswithak (2-0), MogwaiAC (1-2)0.44440.62500.4266
5YearofHell3Jebnarf (1-2), lemon_king (0-2), Mig82 (2-1)0.66670.37500.6012
6Dathanja3MogwaiAC (0-2), Mig82 (2-1), mumu (1-2)0.55560.37500.5417
7Mig823Bye, Dathanja (1-2), YearofHell (1-2)0.33330.42860.3750
8kriswithak0Jebnarf (0-2), lemon_king (1-2)0.83330.20000.6875
I want to be able to pull the username of the person AFTER the rank (first number) but it is mashed together with points gained by the player, as well as their first opponent.
For example, the first persons name is "Lemon_king", and his opponents were "Mumu", "YearofHell" and "Kriswithak". The numbers on the right are irrelevant for me, but the major problem I have is that the number of points won by the player is there. Lemon_King wins 9 points for first place. I would normally just get the name by looking for the string between 1 and 9, but players usernames can have a 9 in it as well.
Can anyone think of a good solution to this problem to be able to grab the persons username?
Thanks
I think you'd need a list of the usernames to compare against; it doesn't look like the results list is "regular" enough for a regular expression.
For example the line
7Mig823Bye, Dathanja
Could be "Mig82" 3 points vs "Bye, Dathanja", but it could also be "Mig8", 23 points, "Bye, Dathanja" or "Mig8", 2 points, "3Bye, Dathanja".
Is that correct? Because if it is, you aren't going to get away with a simple solution.
Edit: Wilson commented that getting the list of usernames might be an option. In that case, something like the following might work:
/^\d+?(username1|username2|username3)\d+?(username1|username2|username3)/
It will probably take some fiddling to get right.
Here's a plnkr demonstrating it with the data you provided: http://plnkr.co/edit/nJeGfbfHgvh5zJcTWRXS?p=preview
That said, a regex might not be the right tool for this job.
As far as I can tell, you want something like
(?x) # allow whitespace and comments just like
# any real programming language
^ # beginning of line
( \d+ ) # starts with one or more digits: CAPTURE 1
(?= \D ) # must have a non-digit following
( \w+ ) # capture one or more "word" characters: CAPTURE 2
( \d ) # next is a single digit: CAPTURE 3
(?= \D ) # must have a non-digit following
( \w+ ) # capture one or more "word" characters: CAPTURE 4
# now add things for the rest of the line if you want
Your username should now be in the second capture. I’ve been a tad more careful than strictly necessary, but if you end up munging this, you may need that. I’ve alos put all the captures in case you want to move stuff around or pull more stuff out.
Please provide a bit more information, if you want the thing between the first number and second number:
[0-9]+([^0-9])
The first group will contain the first username.
Please comment on this (so I check) an edit your question with more detail though.
I wouldnt use regex. It will be a pain to debug it, and you'll never be 100% certain you've covered all the edge cases.
Try doing 'manual' parsing using your language of choice's built in string manipulation functions.

How can I match all strings unless it contains a certain string?

So I want to match every string in this list, except the ones that contain the product SKU, which is /s7892632 <---- random string of numbers. I've been trying to do this for quite some time and have been unsuccessful. Any insight would be greatly appreciated.
/account/login?returnurl=/account/forgotpassword
/account/login?returnurl=/account/orders
/account/orders
/account/updateaddress
/account/updateemail
/account/updaterewardscard
/brands/havaianas
/careers
/Category List
/checkout
/checkout/addresses
/checkout/addresses/delivery
/checkout/addresses/deliverymethod
/checkout/affilinetbasket
/checkout/anonymous
/checkout/confirmation
/checkout/express
/checkout/login
/checkout/login?returnurl=/checkout/addresses
/checkout/null
/checkout/payment
/checkout/paypal
/checkout/quickshop/
/checkout/verify
/click-and-collect
/click-and-collect/click-and-collect-overview
/corporate/about-matalan
/corporate/careers
/corporate/cookies
/corporate/history
/customer-services/accessibility
/customer-services/contact
/customer-services/customer-services-home
/customer-services/delivery
/customer-services/faq
/customer-services/fitting-room
/customer-services/here-to-help
/customer-services/size-guides
/delivery
/events/mothers-day
/events/mothers-day/s2516241/tassle-detail-slouch-bag
/events/mothers-day/s2518752/waxed-jacket
/events/mothers-day/s2519237/fabric-buckle-tote-bag
/events/mothers-day/s2521182/heart-print-nightie
/events/mothers-day/s2521184/heart-print-dressing-gown
/events/mothers-day/s2521185/heart-print-pyjama-set
/events/mothers-day/s2521679/structured-tote-bag
/events/mothers-day/s2522143/chiffon-print-dress
/events/mothers-day/s2522347/butterfly-enamel-bowl-32cm-x-8cm
/events/mothers-day/s2526013/animal-print-jersey-blazer
/events/mothers-day/s2527624/croc-tote-bag
/events/mothers-day/s2529731/shift-dress
/events/mothers-day?page=1&size=120&cols=4&sort=&id=/events/mothers-day&priceRange[min]=2&priceRange[max]=59
/events/mothers-day?page=2&size=120&cols=4&sort=&id=/events/mothers-day&priceRange[min]=2&priceRange[max]=59
/events/mothers-day?page=2&size=36&cols=4&sort=&id=/events/mothers-day&priceRange[min]=2&priceRange[max]=59
/events/mothers-day?page=3&size=36&cols=4&sort=&id=/events/mothers-day&priceRange[min]=2&priceRange[max]=59
The following should work:
^(?!.*/s\d{7}/).*
Example: http://regexr.com?343nf
This assumes you have each string as a separate element in a list. If this is actually matching one big string with multiple lines you can use the same regex, but you may need to enable global and multiline options depending on the tool you are using (and make sure dotall/singleline is disabled).
Try this:
boolean noSku = !line.matches(".*/s\\d{5,}.*");
This uses {5,} which allows for any number of digits in the SKU greater than 4 (giving you flexibility with matching). You can change the number to whatever suits.
this matches lines that don't have the code....
^((?!s\d{7}).)*$

Regexp: Keyword followed by value to extract

I had this question a couple of times before, and I still couldn't find a good answer..
In my current problem, I have a console program output (string) that looks like this:
Number of assemblies processed = 1200
Number of assemblies uninstalled = 1197
Number of failures = 3
Now I want to extract those numbers and to check if there were failures. (That's a gacutil.exe output, btw.) In other words, I want to match any number [0-9]+ in the string that is preceded by 'failures = '.
How would I do that? I want to get the number only. Of course I can match the whole thing like /failures = [0-9]+/ .. and then trim the first characters with length("failures = ") or something like that. The point is, I don't want to do that, it's a lame workaround.
Because it's odd; if my pattern-to-match-but-not-into-output ("failures = ") comes after the thing i want to extract ([0-9]+), there is a way to do it:
pattern(?=expression)
To show the absurdity of this, if the whole file was processed backwards, I could use:
[0-9]+(?= = seruliaf)
... so, is there no forward-way? :T
pattern(?=expression) is a regex positive lookahead and what you are looking for is a regex positive lookbehind that goes like this (?<=expression)pattern but this feature is not supported by all flavors of regex. It depends which language you are using.
more infos at regular-expressions.info for comparison of Lookaround feature scroll down 2/3 on this page.
If your console output does actually look like that throughout, try splitting the string on "=" when the word "failure" is found, then get the last element (or the 2nd element). You did not say what your language is, but any decent language with string splitting capability would do the job. For example
gacutil.exe.... | ruby -F"=" -ane "print $F[-1] if /failure/"

Regex to replace string with another string in MS Word?

Can anyone help me with a regex to turn:
filename_author
to
author_filename
I am using MS Word 2003 and am trying to do this with Word's Find-and-Replace. I've tried the use wildcards feature but haven't had any luck.
Am I only going to be able to do it programmatically?
Here is the regex:
([^_]*)_(.*)
And here is a C# example:
using System;
using System.Text.RegularExpressions;
class Program
{
static void Main()
{
String test = "filename_author";
String result = Regex.Replace(test, #"([^_]*)_(.*)", "$2_$1");
}
}
Here is a Python example:
from re import sub
test = "filename_author";
result = sub('([^_]*)_(.*)', r'\2_\1', test)
Edit: In order to do this in Microsoft Word using wildcards use this as a search string:
(<*>)_(<*>)
and replace with this:
\2_\1
Also, please see Add power to Word searches with regular expressions for an explanation of the syntax I have used above:
The asterisk (*) returns all the text in the word.
The less than and greater than symbols (< >) mark the start and end
of each word, respectively. They
ensure that the search returns a
single word.
The parentheses and the space between them divide the words into
distinct groups: (first word) (second
word). The parentheses also indicate
the order in which you want search to
evaluate each expression.
Here you go:
s/^([a-zA-Z]+)_([a-zA-Z]+)$/\2_\1/
Depending on the context, that might be a little greedy.
Search pattern:
([^_]+)_(.+)
Replacement pattern:
$2_$1
In .NET you could use ([^_]+)_([^_]+) as the regex and then $2_$1 as the substitution pattern, for this very specific type of case. If you need more than 2 parts it gets a lot more complicated.
Since you're in MS Word, you might try a non-programming approach. Highlight all of the text, select Table -> Convert -> Text to Table. Set the number of columns at 2. Choose Separate Text At, select the Other radio, and enter an _. That will give you a table. Switch the two columns. Then convert the table back to text using the _ again.
Or you could copy the whole thing to Excel, construct a formula to split and rejoin the text and then copy and paste that back to Word. Either would work.
In C# you could also do something like this.
string[] parts = "filename_author".Split('_');
return parts[1] + "_" + parts[0];
You asked about regex of course, but this might be a good alternative.