When I try to add .jpg to the end of a cell's text in calc, using concatenate, multiple spaces are inserted - openoffice-calc

In column B I have a list of product names (e.g. AFFALL1), and in column C I need the same text, but with .jpg on the end.
I have tried to use concatenate as follows:
In cell C2:
=concatenate(B2;".jpg")
which partially works, except that it keeps inserting multiple spaces, whereas I need it to have no spaces.
It looks like this:
A4499FA .jpg
When I need it to look like:
A4499FA.jpg
I have no idea where these extra spaces are coming from. Any help would be appreciated.

@Barbara has the right idea but OpenOffice uses semicolon as the parameter separator, so:
=TRIM(B2)&".jpg"

Related

Regex Wrapping Quotes

I am trying to wrap quotes around certain section of content in a CSV file, the current layout is something like this:
###element1,element2,element3,element4,element5,element6,element7,element8, "element9,
element9,""element9"",element9,
element9,element9,""element9",element10,
###
The ### symbols mark the start of a new line, and each new line should have one. The problem is that I need to get all of element 9 into one set of double quotes, but there are multiple double quotes within that area, which break the element up into new fields and make my table expand beyond the fields I initially set. So I believe I need to remove all the " marks between the start and end of element9 and then reintroduce one set to enclose the whole section.
I first tried to select the 8th comma from the start and the 2nd comma from the end:
^((?:[^,]+,){8})(.+)((?:,[^,]*){2})$
and replacing with
$1"$2"$3
I tried to target the starting ### and ending ### to select those two elements, but with no success.
Any suggestions on how I can do this?
UPDATE
###BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,BLAHBLAH,
BLAHBLAH,
BLAHBLAH,
BLAHBLAH, BLAHBLAH,
BLAHBLAH, BLAHBLAH,
BLAHBLAH,
"BLAHBLAH""",E,
###
The last field always seems to contain a capital letter, while the fields before it vary in quotation placement. So to really target that whole section I need to work out how many commas along and how many back I need to go, remove the quotes, and then reinstate them in the correct positions.
###(?:[^,]*,){8}\K([\s\S]*?)(?=,[^,]*,[^,]*?###)
Try this. Replace by "\1" or "$1". See demo.
https://regex101.com/r/tD0dU9/13
/^(?:[^,]*,){8}([^#]*),[^,]*,[^,]*$/s
https://regex101.com/r/hU8yO6/1
I think the regexp you had is about right, except for needing the /s modifier.
For Notepad++, get the s modifier by ticking ". matches newline":
^(?:[^,]*,){8}([^#]*),[^,]*,[^,]*$
This looks like a good reference: http://docs.notepad-plus-plus.org/index.php/Regular_Expressions
You'll probably want to add parens appropriately to make capture groups also.
^#+[^"]+"([^#]+),[^,]+,[^,]+###\s*$
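The approach in the answers above (skip eight comma-terminated fields, capture everything up to the last two commas before the next ###, then re-quote) can be sketched in Python. This is only an illustration under assumptions: every record starts with ### and is followed by another ###, the first eight fields contain no commas, and the function name quote_ninth_field is hypothetical.

```python
import re

# head: the first 8 comma-terminated fields
# body: everything that belongs to element 9 (may span several lines)
# tail: the last field plus the trailing comma, up to the next ###
RECORD = re.compile(r'###((?:[^,]*,){8})([\s\S]*?)(,[^,]*,[^,]*?)(?=###)')

def quote_ninth_field(text):
    """Strip all double quotes from element 9 and wrap it in one pair."""
    def fix(match):
        head, body, tail = match.group(1), match.group(2), match.group(3)
        body = body.replace('"', '').strip()
        return '###' + head + '"' + body + '"' + tail
    return RECORD.sub(fix, text)

sample = '###a,b,c,d,e,f,g,h, "x,\ny,""z"",w,\nu,v,""t",E,\n###'
print(quote_ninth_field(sample))
```

Because the final ### is only checked with a lookahead, it is left in place and can start the next record.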

REGEXREPLACE in Google Spreadsheet

I am trying to use REGEX in Google Sheets to clean up form data arriving as comma delimited data with arbitrary leading commas and single spaces.
sample data from form:
,,Refrigerator,,,,, ,,Slide,,Dual Slide,,Microwave Oven,,Indoor Shower,Built in Stereo,Day/Night Switch,,BluRay/DVD
I want to use
REGEXREPLACE(text, regular_expression, replacement)
to remove multiple commas and single spaces that may occur between commas, replacing with a single comma so the line reads
Refrigerator,Slide,Dual Slide,Microwave Oven, . . . etc
The match string (^,+|(,+ ,)|,+) works properly in the Rubular.com simulator, but when used in the Google Spreadsheet as in example with raw data above pasted in at cell M12 as source text:
REGEXREPLACE("M12","(^,+|(,+ ,)|,+)",",")
it fails by not removing one of the leading commas.
,Refrigerator,,,,, ,,Slide,,Dual Slide,,Microwave Oven,,Indoor Shower,Built in Stereo,Day/Night Switch,,BluRay/DVD
The Google Sheets REGEX help points to https://github.com/google/re2/blob/master/doc/syntax.txt which seems to describe the operations the same as the simulator.
From what you're describing, Google is working as expected and the other site linked isn't. Your regex matches ^,+ among other things (i.e. one or more commas at the start) and replaces them with a single comma. If the input string has commas at the start, I would expect the output to have one too.
You could build on what you've done with another regular expression replace, and strip any leading commas:
REGEXREPLACE(REGEXREPLACE(M12,"((,+ ,)|,+)",","), "^,+", "")
This uses your original one, minus the leading commas part, to do the original replace, then wraps it in a second call looking for just leading commas, and replacing those with nothing.
Having said that, your original regex is not quite working as expected either: it doesn't strip all the commas and spaces down to a single comma in all circumstances. Instead, you can use this one:
REGEXREPLACE(REGEXREPLACE(M12,"( ?(, *)+)",","), "^,+", "")
This looks for an optional space, followed by one or more commas, each with zero or more spaces after it, and replaces the whole run with a single comma. The outer call still removes all commas at the start.
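To check the two-step replace outside the spreadsheet, here is a rough Python equivalent. It uses Python's re module rather than RE2, which should behave the same for these simple patterns; the function name collapse_commas is made up for the example.

```python
import re

def collapse_commas(text):
    """Mimic the nested REGEXREPLACE calls from the answer."""
    # Inner call: an optional space, then one or more commas,
    # each followed by zero or more spaces, becomes one comma.
    collapsed = re.sub(r' ?(, *)+', ',', text)
    # Outer call: drop whatever commas are left at the start.
    return re.sub(r'^,+', '', collapsed)

raw = (",,Refrigerator,,,,, ,,Slide,,Dual Slide,,Microwave Oven,,"
       "Indoor Shower,Built in Stereo,Day/Night Switch,,BluRay/DVD")
print(collapse_commas(raw))
```

Spaces inside a value such as "Dual Slide" are left alone, because the pattern only matches when a comma follows.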
One more good way to do this:
=TEXTJOIN(", ",1,SPLIT(A1,", "))

Word removal using re results in wrong words being removed

Given a text "article_utf8", I want to remove a list of words:
remove = "el|la|de|que|y|a|en|un|ser|se|no|haber|..."
regex = re.compile(r'\b('+remove+r')\b', flags=re.IGNORECASE)
article_out = regex.sub("", article_utf8)
However, this is incorrectly removing some words and parts of words, for example:
1- aseguro becomes seguro
2- sería becomes í
3- coma becomes com
4- miercoles becomes 'ercoles'
Technically parts of a word can match a regexp. To solve this you would have to make sure that whatever sequence of letters your regexp matches is a single word and not part of it.
One way would be to make the regexp contain leading and trailing spaces, but words could also be separated with periods or commas so you would have to take those into account too if you want to catch all instances.
Alternatively, you can try splitting the text into words first, using the built-in split method (https://docs.python.org/2/library/stdtypes.html#str.split). Then I would check each word in the resulting list, remove the ones I don't want, and rejoin the strings. This method, however, doesn't even need regexps, so it's probably not what you intended despite being simple and practical.
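A minimal sketch of that split-and-rejoin idea, assuming Python 3 (where strings are Unicode) and a shortened stop-word list, since the original list is elided. The names STOPWORDS and remove_words are hypothetical.

```python
# Only a subset of the words from the question; the full list is not shown there.
STOPWORDS = {"el", "la", "de", "que", "y", "a", "en", "un", "ser", "se", "no", "haber"}

def remove_words(text, stopwords):
    """Drop whole words found in stopwords; never touch parts of words."""
    kept = []
    for word in text.split():
        # Compare without surrounding punctuation and ignoring case.
        bare = word.strip(".,;:!?¿¡").lower()
        if bare not in stopwords:
            kept.append(word)
    return " ".join(kept)

print(remove_words("aseguro que el coma sería miercoles", STOPWORDS))
```

Note that punctuation attached to a removed word goes away with it, and runs of whitespace are normalised to single spaces.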
After much testing, the following will remove the small words in a natural language string, without removing them from parts of other words:
regex = re.compile(r'[\s]?\b(' + remove + r')[\s\.\,]', flags=re.IGNORECASE)

Replace a comma in text values in CSV using regex in Notepad++

I searched a lot but couldn't find an exact solution.
I have a CSV file in which some values contain a comma.
Following is a sample row
"BEIAAGJIPAMBPJIF",2757,08042010,"13:53.59",09042010,"01:55.39","SIHAM","BEIAIGHEIPLGPJIF",20,"A",20,"S",0.00,0.00,0.00,"OLY
SPECIAL ORDER","IN STOCK , DESIGNER",0.00000,0,"","N","N",
Now if you look at the value "IN STOCK , DESIGNER", it contains a comma. Because of this, when the CSV is read in my .NET application and in the MS Dynamics CRM import file wizard, it is broken into two separate values instead of one single value.
I need a regex that I can use in Notepad++ to match such strings and replace the comma with a hyphen "-".
Kindly help.
Thanks.
This solution worked for me, although it is a bit indirect:
1. By searching, find a character that is unused in the file, e.g. #
2. Use the following regex replace to replace all delimiters: find: (".*?"|.*?), replace: \1# (note the character from step 1)
3. Now all leftover commas are only those inside the quotes. Mass replace them with -
4. Replace all #'s back with commas
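The same steps can be tried out in Python before running them in Notepad++. This is a sketch under assumptions: it works on one line at a time, the line ends with a trailing comma as in the sample row, and quoted values do not span lines. The function name hyphenate_quoted_commas is hypothetical.

```python
import re

def hyphenate_quoted_commas(line, placeholder="#"):
    """Replace commas inside quoted values with '-' via a placeholder."""
    # Step 1: the placeholder must not already occur in the data.
    if placeholder in line:
        raise ValueError("placeholder already occurs in the line")
    # Step 2: turn every delimiter comma into the placeholder.
    marked = re.sub(r'(".*?"|.*?),', lambda m: m.group(1) + placeholder, line)
    # Step 3: the commas that are left sit inside quotes.
    marked = marked.replace(",", "-")
    # Step 4: restore the delimiters.
    return marked.replace(placeholder, ",")

row = '"OLY SPECIAL ORDER","IN STOCK , DESIGNER",0.00000,0,"","N","N",'
print(hyphenate_quoted_commas(row))
```

A quoted value at the very end of a line with no trailing comma would not be handled correctly by this pattern, so check the last column separately if your file looks like that.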

Notepad++ regex replace - replace all commas with \, within quotations

I am trying to import a csv file into mysql, and I need to convert it into a proper format before importing.
If there's a comma in a column, the csv encloses it within double quotations, here's an example of a row without a comma, and a row with a comma:
1,Superman
2,"Batman,Flash"
What I need to do is to convert all columns which have commas to escape the comma and remove the quotations... such as "Batman,Flash" to Batman\,Flash
Here's what I have so far
Find: "(.*),(.*)"
Replace: \1\\,\2
However, there are two cases in which this does not work:
It will only replace one comma if there's more than one comma within a quoted column. So something like "Batman,Flash,Robin" will be converted to Batman,Flash\,Robin
This doesn't work if the first column has a comma as well. For example, on a row such as "1,2,3","Batman,Robin"
How can I change the regexes to accommodate the two cases that don't yet work?
I'm sorry, but regex is not the tool for this. You must parse it.
Why?
Do you want to convert this?
"test\, w00t!"
Or what about this?
"test\\\\\, w00t!"
Heck, even this?
"tes\\","\"ing\,\\,"
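As a rough illustration of parsing instead of pattern matching, the sketch below uses Python's standard csv module to read the rows and then writes each field back with commas escaped. Escaping existing backslashes first is an assumption about what the importer expects, and the function name escape_commas is made up for the example.

```python
import csv
import io

def escape_commas(csv_text):
    """Parse CSV text and emit unquoted rows with commas backslash-escaped."""
    out_lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        fields = []
        for field in row:
            # Assumption: double existing backslashes so they stay literal.
            field = field.replace("\\", "\\\\")
            fields.append(field.replace(",", "\\,"))
        out_lines.append(",".join(fields))
    return "\n".join(out_lines)

data = '1,Superman\n2,"Batman,Flash"\n"1,2,3","Batman,Robin"'
print(escape_commas(data))
```

The parser handles quoted columns in any position and any number of commas per column, which covers both cases the question lists.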