Regex to find specific character between two other characters - regex

I've been trying to find a way to find a single comma between inverted commas without much luck. Example: "text , text " - how do I isolate the "," between the inverted commas line by line in a flat file?
My attempt .["].[,].["].
Thanks in advance

This regex will work:
(?<=truck).*(?=car)
It finds, e.g., "plane" in the string
truckplanecar
So for test,test the regex would be:
(?<=test).*(?=test)
PS: Could you please provide a more detailed example of what you would like to do?
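As a quick check of the lookaround pattern above, here is a minimal Python sketch. The helper name between is hypothetical, and the sketch assumes the delimiters are fixed strings (Python's re module only supports fixed-width lookbehind):

```python
import re

def between(left: str, right: str, text: str):
    """Return the text between `left` and `right`, or None if nothing matches."""
    # re.escape keeps characters such as '"' or '.' in the delimiters literal.
    pattern = f"(?<={re.escape(left)}).*(?={re.escape(right)})"
    match = re.search(pattern, text)
    return match.group(0) if match else None

print(between("truck", "car", "truckplanecar"))  # plane
print(between("test", "test", "test,test"))      # ,
```

Note that .* is greedy, so with several occurrences of the right-hand delimiter the match runs up to the last one.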

Try using two groups, one at the start and one at the end of the string. The following regex should work:
(".*),(.*")
it does match the example you've shared:
"text , text "
Furthermore, using groups, you can isolate the text before and after the comma, in case you need it.
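A minimal Python sketch of how the two groups could be used. The function name split_at_quoted_comma is hypothetical, and the sketch assumes one quoted string per line:

```python
import re

PATTERN = re.compile(r'(".*),(.*")')

def split_at_quoted_comma(line: str):
    """Return (before, after, comma_index) for the matched comma, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    # Group 1 ends exactly where the comma starts.
    return match.group(1), match.group(2), match.end(1)

print(split_at_quoted_comma('"text , text "'))
```

Because the first .* is greedy, a string containing several commas is split at the last one.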

Related

Find commas in pattern

I have file with rows like this:
"B4P(6-3,5)-VH(LF)(SN)",JST,2018+,34000,SMD
893D226X0016C8W,VISHAY,2018+,"30,000",SMD
BL-BUF1V4V-AT-L,FOXLINK,2018+,1890,CONN
"TLP721F(D4-GR,M,F)",NSC,2001+,114,AUCDIP-16
How can I find all commas inside quotes? For example, I need to find this:
"B4P(6-3 >>,<< 5)-VH(LF)(SN)",JST,2018+,34000,SMD
893D226X0016C8W,VISHAY,2018+,"30 >>,<< 000",SMD
BL-BUF1V4V-AT-L,FOXLINK,2018+,1890,CONN
"TLP721F(D4-GR >>,<< M >>,<< F)",NSC,2001+,114,AUCDIP-16
So far I can only match the text in quotes. How can I select only the commas inside it, using a single regular expression?
("(?:\[??[^\[]*?"))
Here is a simplistic solution that works with your example.
It matches only quoted strings that contain one or more commas:
grep '"[^,]*,[^"]*"'
Hope it works for you.
Explanation:
"[^,]* : match " and the following non-comma characters
, : match the first comma
[^"]*" : match the following non-" characters up to the next "
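The grep command selects whole lines. To mark the commas themselves, one option is a two-step approach in a script: match each quoted string, then rewrite the commas inside it. Below is a minimal Python sketch, assuming quotes are balanced and never escaped; the function name mark_quoted_commas and the marker text are illustrative only:

```python
import re

# One double-quoted field; escaped or doubled quotes are not handled.
QUOTED = re.compile(r'"[^"]*"')

def mark_quoted_commas(line: str, marker: str = " >>,<< ") -> str:
    """Replace every comma inside a quoted string with `marker`."""
    return QUOTED.sub(lambda m: m.group(0).replace(",", marker), line)

rows = [
    '"B4P(6-3,5)-VH(LF)(SN)",JST,2018+,34000,SMD',
    '893D226X0016C8W,VISHAY,2018+,"30,000",SMD',
    'BL-BUF1V4V-AT-L,FOXLINK,2018+,1890,CONN',
]
for row in rows:
    print(mark_quoted_commas(row))
```

Commas outside quotes are left alone because the callback only ever sees the quoted substrings.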

Regex Find Spaces between single quotes and replace with underscore

I have a database table that I have exported. I need to replace the spaces in the image file names and would like to use Notepad++ and regex to do so. I have:
'data/green tea powder.jpg'
'data/prod_img/lumina herbal shampoo.JPG'
'data/ALL GREEN HERBS.jpeg'
'data/prod_img/PSORIASIS KIT (640x530) (2).jpg'
and need to make them look like this:
'data/green_tea_powder.jpg'
'data/prod_img/lumina_herbal_shampoo.JPG'
'data/ALL_GREEN_HERBS.jpeg'
'data/prod_img/PSORIASIS_KIT_(640x530)_(2).jpg'
I just want to change the spaces between the quotes (I don't want to change the capitalization). To be more specific I would like to replace any and all spaces between 'data/ and ' because there are other spaces between quotes in the DB, for example:
'data/ REPLACE ANY SPACE HERE '
I found this:
\s(?!(?:[^']*'[^']*')*[^']*$)
but there are other places where there are spaces between quotes, so I'd like to search for data/ at the beginning and not just a single quote, but I can't figure out how. I tried \s(?!(?:[^'data\/]*'[^']*')*[^']*$) but it didn't work, and I am not familiar enough with regex to make it do so.
An example of a full line from the database is:
(712, 'GRTE-P', '', 'data/green tea powder.jpg', '2014-03-12 22:52:03'),
I don't want to replace the spaces in the date and time stamp at the end of the line, just the ones in the image file names.
Thanks in advance for your help!
You have to use a \G based pattern to ensure that matches are contiguous.
search: (?:\G(?!^)|'data/)[^' ]*\K[ ]
replace: _
The first match uses the second branch of the alternation, then the next matches are contiguous and use the first branch.
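Outside Notepad++, the same result can be reached differently. Python's built-in re module supports neither \G nor \K, so the minimal sketch below swaps in another technique: match each whole 'data/...' string and replace the spaces inside it with a callback. The function name underscore_data_paths is hypothetical, and quotes inside file names are assumed not to occur:

```python
import re

# A single-quoted string that starts with data/.
DATA_PATH = re.compile(r"'data/[^']*'")

def underscore_data_paths(line: str) -> str:
    """Replace spaces with underscores, but only inside 'data/...' strings."""
    return DATA_PATH.sub(lambda m: m.group(0).replace(" ", "_"), line)

line = "(712, 'GRTE-P', '', 'data/green tea powder.jpg', '2014-03-12 22:52:03'),"
print(underscore_data_paths(line))
```

The timestamp keeps its space because it never matches the 'data/ prefix.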

Remove columns from CSV

I don't know anything about Notepad++ Regex.
This is the data I have in my CSV:
6454345|User1-2ds3|62562012032|324|148|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1
3305611|User2-42g563dgsdbf|22023001345|0|0|c36dedfa12634e33ca8bc0ef4703c92b73d9c433
8749412|User3-9|xgs|f|98906504456|1534|51564|411b0fdf54fe29745897288c6ad699f7be30f389
How can I use a Regex to remove the 5th and 6th column? The numbers in the 5th and 6th column are variable in length.
Another problem is that the User column can also contain a |, which makes it even worse.
I can use a macro to fix this, but the file is a few million lines long.
This is the final result I want to achieve:
6454345|User1-2ds3|62562012032|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1
3305611|User2-42g563dgsdbf|22023001345|c36dedfa12634e33ca8bc0ef4703c92b73d9c433
8749412|User3-9|xgs|f|98906504456|411b0fdf54fe29745897288c6ad699f7be30f389
I am open to suggestions on how to do this with another program or command-line utility, on either Linux or Windows.
Match \|[^|]+\|[^|]+(\|[^|]+$)
Replace $1
Basically, anchor to the end of the line and remove the two columns just before the last one. (I assume columns can't be empty. Replace + with * if they can.)
If you need finer control than that, I'd recommend writing a Java or Python script to manually parse and rewrite the file for you.
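A minimal sketch of such a script in Python, assuming every row has at least four |-separated fields and that the two columns to drop are always the ones just before the last. The function names and file paths are hypothetical:

```python
def drop_two_before_last(line: str, sep: str = "|") -> str:
    """Remove the second- and third-to-last fields of one row (no trailing newline)."""
    fields = line.rstrip("\n").split(sep)
    if len(fields) < 4:
        # Too short to contain the two columns; return the row unchanged.
        return sep.join(fields)
    del fields[-3:-1]
    return sep.join(fields)

def convert(src_path: str, dst_path: str) -> None:
    """Stream the file line by line so millions of rows never sit in memory at once."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(drop_two_before_last(line) + "\n")

print(drop_two_before_last(
    "8749412|User3-9|xgs|f|98906504456|1534|51564|411b0fdf54fe29745897288c6ad699f7be30f389"
))
```

Counting from the end of the row means extra | characters in the user name do not matter.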
I've captured three groups and given them names. If you use a replace utility like sed or Vim, you can replace the remove group with nothing. Or you can use a programming language to concatenate keep_before and keep_after for the desired result.
^(?<keep_before>(?:[^|]+\|){3})(?<remove>(?:[^|]+\|){2})(?<keep_after>.*)$
You may have to remove the group namings and use \1 etc. instead, depending on what environment you use.
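As one example of such an environment difference, Python spells named groups (?P<name>...) rather than (?<name>...). A minimal sketch of the concatenation approach follows; the function name remove_middle_columns is hypothetical:

```python
import re

# Same pattern as above, rewritten with Python's (?P<name>...) syntax.
PATTERN = re.compile(
    r"^(?P<keep_before>(?:[^|]+\|){3})"
    r"(?P<remove>(?:[^|]+\|){2})"
    r"(?P<keep_after>.*)$"
)

def remove_middle_columns(line: str) -> str:
    """Drop the 4th and 5th columns, counted from the left."""
    match = PATTERN.match(line)
    if match is None:
        return line
    return match.group("keep_before") + match.group("keep_after")

print(remove_middle_columns(
    "6454345|User1-2ds3|62562012032|324|148|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1"
))
```

Because this pattern counts columns from the left, it assumes the user name contains no |. A row such as the third one in the question would lose the wrong columns.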
In Notepad++, press Ctrl+H, then enter the following in the dialog:
Find what: \|\d+\|\d+(\|[0-9a-z]+)$
Replace with: $1
Search mode: Regular expression
Click Replace (or Replace All for the whole file) and you are done.
Regex explanation:
\|\d+ : matches the first field, a | followed by digits
\|\d+ : matches the second field, a | followed by digits
(\|[0-9a-z]+) : matches and captures the | and the string after the second number
$ : anchors the match to the end of the line
Replacement:
$1 : replaces the whole match with the contents of the capture group, i.e. whatever (\|[0-9a-z]+) matched

Perl script to match comma

I have a netlist generated from a schematic. This netlist includes power pins. I am trying to write a Perl script to remove the power pins from the netlist.
As part of this, I have to search for a string that matches the pattern shown below:
", );"
I have used the following code, but it is not working:
$line =~ s/,\s+\);//g
I have observed that patterns ending with a comma are matched, but patterns starting with a comma or with a comma in the middle are not.
Any suggestions on how to get this to work?
You need to use this instead:
s/,\s*\);//
You should be defensive and be able to handle no whitespace between the , and the ). You have to escape the ). See perldoc perlre for more info.
Thank you, everyone. I have found the problem: the pattern to be recognized is split across two lines. The "," is on one line, followed by ");" on the next line. I was removing the newline character first and assumed that the next line would get appended to the current line, which does not happen. Hence, the pattern matching did not work.
To resolve this, I have to read the file once again and then replace the pattern.
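The line-split problem can also be avoided by matching against the whole file contents at once, since \s matches newlines as well as spaces. Below is a minimal sketch of that idea in Python rather than Perl; the function name strip_dangling and the file name are hypothetical, and the sketch mirrors the substitution above by deleting the entire match:

```python
import re
from pathlib import Path

# A comma, optional whitespace (including newlines), then ");".
DANGLING = re.compile(r",\s*\);")

def strip_dangling(text: str) -> str:
    """Delete every ',<whitespace>);' sequence, even when it spans two lines."""
    return DANGLING.sub("", text)

def clean_file(path: str) -> None:
    """Read the whole file, apply the substitution, and write it back."""
    p = Path(path)
    p.write_text(strip_dangling(p.read_text()))

print(repr(strip_dangling("pin_a,\n);\nnext")))
# clean_file("netlist.txt")  # hypothetical file name
```

If the closing ); needs to stay in the netlist, the replacement string would be ");" instead of the empty string.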

Replace a comma in text values in CSV using regex in Notepad++

I searched a lot but couldn't find an exact solution.
I have a CSV in which some values contain a comma.
The following is a sample row:
"BEIAAGJIPAMBPJIF",2757,08042010,"13:53.59",09042010,"01:55.39","SIHAM","BEIAIGHEIPLGPJIF",20,"A",20,"S",0.00,0.00,0.00,"OLY
SPECIAL ORDER","IN STOCK , DESIGNER",0.00000,0,"","N","N",
Now if you look at the value "IN STOCK , DESIGNER", it contains a comma. Because of this, when the CSV is read in my .NET application and in the MS Dynamics CRM import file wizard, the value is broken into two separate values instead of one.
I need a regex that I can use in Notepad++ to match such strings and replace the comma with a hyphen "-".
Kindly help.
Thanks.
This solution worked for me, although it is a bit indirect:
1. Search the file to find a character that is not used anywhere in it, e.g. #
2. Use the following regex replace to replace all delimiters: find: (".*?"|.*?), replace: \1# (using the character from step 1)
3. Now the only commas left are those inside the quotes. Mass replace them with -
4. Replace all #'s back with commas
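The same four steps can be sketched in Python for a single row. The function name hyphenate_quoted_commas is hypothetical, and the placeholder is assumed to be a character that does not occur in the data:

```python
import re

# Step 2 pattern: a quoted field or a bare field, followed by the delimiter comma.
FIELD_DELIMITER = re.compile(r'(".*?"|.*?),')

def hyphenate_quoted_commas(row: str, placeholder: str = "#") -> str:
    """Replace commas inside quoted fields with '-' via a temporary placeholder."""
    # Step 1: make sure the placeholder really is unused.
    if placeholder in row:
        raise ValueError(f"placeholder {placeholder!r} already occurs in the row")
    # Step 2: turn every delimiter comma into the placeholder.
    marked = FIELD_DELIMITER.sub(lambda m: m.group(1) + placeholder, row)
    # Step 3: any comma still present sits inside quotes.
    hyphenated = marked.replace(",", "-")
    # Step 4: restore the delimiters.
    return hyphenated.replace(placeholder, ",")

row = '"OLY SPECIAL ORDER","IN STOCK , DESIGNER",0.00000,0,"","N","N",'
print(hyphenate_quoted_commas(row))
```

One limitation of this approach: it relies on every field being followed by a comma, as in the sample row. A quoted field containing a comma at the very end of a row, with no trailing comma, would not be handled correctly.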