Regex to parse the first letter of each line?

This is the list:
Work
Work
Fire
Global
And I want to extract the string WWFG from it. [(?).*\n] just gives me Global. What should I be using instead?
For context, I'm using Rainmeter's webparser plugin.

Try this: (?simU)^(.)
RainRegExp seems to lack the replacement feature, so it is impossible to get all the captures concatenated into one string.

You need to use the multiline flag with an anchor. I would use: /^(.)/gm (syntax differs from language to language)
See example here: http://regex101.com/r/uC1gV5
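For illustration, here is a minimal Python sketch of the multiline-anchor approach. The variable names are made up for the example, and Rainmeter's own regex handling may behave differently.

```python
import re

text = "Work\nWork\nFire\nGlobal"

# re.MULTILINE makes ^ match at the start of every line, not just the string.
# "." does not match a newline, so empty lines contribute nothing.
initials = "".join(re.findall(r"^(.)", text, flags=re.MULTILINE))
print(initials)  # WWFG
```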

The easiest way depends on what language you're using, but you want to replace
(.).*\n
with
$1
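A rough Python sketch of this replace approach, assuming the input has no trailing newline after the last line (as in the question). With the pattern exactly as written, the last line is left untouched because it has no \n to match, so the sketch also shows a variant with an optional newline.

```python
import re

text = "Work\nWork\nFire\nGlobal"

# Pattern as given: requires a newline, so the final line is not replaced.
strict = re.sub(r"(.).*\n", r"\1", text)

# Variant with an optional newline, so the final line is handled too.
with_optional_newline = re.sub(r"(.).*\n?", r"\1", text)

print(strict)                 # WWFGlobal
print(with_optional_newline)  # WWFG
```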

(?siU)(?(?=.)(.))(?(?=.*\n).*\n(.))(?(?=.*\n).*\n(.))(?(?=.*\n).*\n(.))
Answered by #moshi here. And it works perfectly with Rainmeter.

I have no idea what language you're using, but here's some Python!
>>> import re
>>> ''.join(re.findall("(.).*", "Work\nWork\nFire\nGlobal"))
'WWFG'

This will capture the first character from each line
([a-z])[^\n]+\n*
then replace with \1 or $1
Depending on what is in the text you might need to change [a-z] to something more all-encompassing

If that is Lua, try s:gsub("(.).-\n","%1").

Related

Why /^[a-zA-Z0-9]+#[a-zA-Z0-9]\.(com)|(edu)|(org)$/i does not work as expected

I have this regex for email validation (assume only x#y.com, abc#defghi.org, something#anotherhting.edu are valid)
/^[a-zA-Z0-9]+#[a-zA-Z0-9]\.(com)|(edu)|(org)$/i
But #abc.edu and abc#xyz.eduorg both pass the regex above. Can anyone explain why that is?
My approach:
there should be at least one character or number before #
then there comes #
there should be at least one character or number after # and before .
the string should end with either edu, com, or org.
Try this
/^[a-zA-Z0-9]+#[a-zA-Z0-9]+\.(com|edu|org)$/i
and it should become clear - you need to group those alternatives, otherwise you can match any string that has 'edu' in it, or any string that ends with org. To put it another way, your version matches any of these patterns
^[a-zA-Z0-9]+#[a-zA-Z0-9]\.(com)
(edu)
(org)$
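To see the three-way split in action, here is a small Python sketch comparing the ungrouped pattern with a grouped one. The sample strings come from the question; the variable names are only for illustration.

```python
import re

ungrouped = re.compile(r"^[a-zA-Z0-9]+#[a-zA-Z0-9]\.(com)|(edu)|(org)$", re.IGNORECASE)
grouped = re.compile(r"^[a-zA-Z0-9]+#[a-zA-Z0-9]+\.(com|edu|org)$", re.IGNORECASE)

samples = ["x#y.com", "abc#defghi.org", "#abc.edu", "abc#xyz.eduorg"]

# Map each sample to (passes ungrouped, passes grouped).
results = {s: (bool(ungrouped.search(s)), bool(grouped.search(s))) for s in samples}
for sample, outcome in results.items():
    print(sample, outcome)
```

The ungrouped pattern accepts all four strings because a bare "edu" anywhere, or an "org" at the end, is enough. The grouped one accepts only the first two.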
It's worth pointing out that the original poster is using this as a regex learning exercise. This would be a terrible regex for actual production use! It's a thorny problem - see Using a regular expression to validate an email address for a lot more depth.
Your grouping parentheses are incorrect:
/^[a-zA-Z0-9]+#[a-zA-Z0-9]+\.(com|edu|org)$/i
Can also just use one case as you're using the i modifier:
/^[a-z0-9]+#[a-z0-9]+\.(com|edu|org)$/i
N.B. you were also missing a + from the second set, I assume this was just a typo...
What you have written is the equivalent of matching something that:
Begins with [a-zA-Z0-9]+#[a-zA-Z0-9].com
contains edu
or ends with org
What you were looking for was:
/^[a-z0-9]+#[a-z0-9]+\.(com|edu|org)$/i
Your regex looks ok.
I guess you are using a search (find) function instead of a match function.
Without knowing what you use it is a bit difficult to say, but in Python you would write
import re
pattern = re.compile(r'^[a-zA-Z0-9]+#[a-zA-Z0-9]\.(com)|(edu)|(org)$')
pattern.match('#abc.edu')   # returns None; use this to validate an input
pattern.search('#abc.edu')  # matches, finds the edu
Try this:
[a-zA-Z0-9]+#[a-zA-Z0-9]+\.(com|edu|org)+$
You forgot the + quantifier, which you need if you want to match any combination of (com|edu|org).
Update: I see you also missed the + after the second [a-zA-Z0-9].

What regular expression can I use to extract the value of a query-string parameter?

This is my string: href="/store/apps/details?id=SomeString&.
How can I extract the SomeString with PerlRegEx? I'm using Delphi XE2.
You can use the following:
href="/store/apps/details\?id=([^&"]*)
SomeString will be captured in group 1.
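The question is about Delphi, but the same pattern can be tried quickly in Python. This is only a sketch to show what group 1 captures; the variable names are made up.

```python
import re

html = 'href="/store/apps/details?id=SomeString&.'

# [^&"]* stops at the next parameter separator or the closing quote.
match = re.search(r'href="/store/apps/details\?id=([^&"]*)', html)
app_id = match.group(1) if match else None
print(app_id)  # SomeString
```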
use the regex: \bid=.*(&)?
You can use the following regex
[?&]id=([^&]*)&?
http://rubular.com/r/6ivQNNBLxP
Mind you, depending on what is and is not allowable input, just about any conceivable regex can be tricked.
You can use : .*id=\([^&]*\)&
Depending on the language you are using, you might find tools cleaner than regexps to perform this task.
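As one example of such a tool, Python's standard library can parse the query string directly. This is a sketch only: the extra hl=en parameter is a hypothetical addition to make the URL complete, and Delphi would need its own equivalent.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical complete URL; only the id parameter comes from the question.
url = "/store/apps/details?id=SomeString&hl=en"

params = parse_qs(urlsplit(url).query)
print(params["id"][0])  # SomeString
```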

Issues with RegEx

I am trying to make an if-then-else statement using RegEx. I want to match the text if it contains Monty and also contains Python. Also the text should get matched if Monty is not present in the text.
RegEx
(?(?=Monty)(?(?=Python).*|)|^.*).*$
Kindly help!
How about this:
(^(?!.*Monty(?!.*Python.*).*).*$|^.*Python.*Monty.*$)
This passes my tests, but let me know if it works for you.
I am not versed in lookahead regex but just tried to build the regex from what I understood from above description. Check the link to see if this is what you are trying to do.
try this instead
((?=Monty)((?=Python).*|)|^.*).*$
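For comparison, the requirement can also be written as two plain lookahead branches instead of a conditional. This is an alternative sketch, not one of the patterns above. It is in Python, whose re module only supports conditionals on group references, and it assumes single-line input.

```python
import re

# Branch 1: both words occur somewhere. Branch 2: "Monty" never occurs.
pattern = re.compile(r"^(?:(?=.*Monty)(?=.*Python)|(?!.*Monty)).*$")

samples = ["Monty Python", "Python by Monty", "Just Monty", "Nothing relevant"]
matched = {s: bool(pattern.search(s)) for s in samples}
for sample, ok in matched.items():
    print(sample, ok)
```

Only "Just Monty" is rejected, since it contains Monty without Python.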

What is the difference between "a{1}" and "a" in regex?

Some string was matched with the following regex:
([0-9]\s+){1}
Why did the author use {1} at the end of the regex?
Can I safely remove it?
Yes, there is no difference at all. Possibly it was left over from tweaks made while the regex was being built and tested.
{1} makes the preceding group match exactly once: in your example, one digit followed by one or more whitespace characters.
It is probably a leftover from debugging/writing the query when the author experimented with {1,2} or so.
Yes, you can remove it.
If it is the output of interpreted code (a log or debug message coming from a script, for example), the 1 could be the value of a variable.
If it is written directly in a script, {1} is the default behavior, so it is the same (though the parser may do a little extra work to interpret it).
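A quick way to convince yourself is to run both forms over the same input and compare the matches. Here is a small Python sketch with a made-up test string:

```python
import re

text = "1 2  3\t4"

# Collect the full text of every match for each form of the pattern.
with_quantifier = [m.group(0) for m in re.finditer(r"([0-9]\s+){1}", text)]
without_quantifier = [m.group(0) for m in re.finditer(r"([0-9]\s+)", text)]

print(with_quantifier == without_quantifier)  # True
```

The final "4" is not matched by either form because no whitespace follows it.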

Regex: how do I capture the file extension?

How do I determine the file extension of a file name string?
lets say I have
I'm.a.file.name.tXt
the regex should return tXt
Something like \.[^.]*$ should do it (note that the match includes the leading dot).
You probably don't need regex - most languages will have the equivalent to this:
ListLast(Filename,'.')
(If you do need regex for some reason, Scharron's answer is correct.)
What language is this in? It's quite possible you don't want to actually use a regex - assuming the name is a String you'll probably just want to do something like split it over periods and then choose the last segment. Okay, that's sort of a regex answer, but not a proper one.
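Both ideas, the regex and the split on periods, look roughly like this in Python. This is a sketch that assumes the name contains at least one dot; the variable names are illustrative.

```python
import re

filename = "I'm.a.file.name.tXt"

# Regex: capture everything after the last dot.
match = re.search(r"\.([^.]*)$", filename)
ext_regex = match.group(1) if match else ""

# Split: cut once from the right and take the last piece.
# For a name without any dot this returns the whole name, so check first if that matters.
ext_split = filename.rsplit(".", 1)[-1]

print(ext_regex, ext_split)  # tXt tXt
```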
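/^(.*)\.(.*)$/
The first group is greedy, so it consumes everything up to the last dot. Your result will be in the second group. (With a lazy (.*?) the first group would stop at the first dot instead.)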