I have a text file in which some lines contain GPS data that I need to shorten.
So I have
5|{"mResults":0.0|0.0|"mProvider":"fused"|"mDistance":0.0|"mTime":1395061255413|"mAltitude":161.0|"mLongitude":29.0459152|"mLon2":0.0|"mLon1":0.0|"mLatitude":41.0854122|"mLat1":0.0|"mLat2":0.0|"mInitialBearing":0.0|"mHasSpeed":true|"mHasBearing":false|"mHasAltitude":true|"mHasAccuracy":true|"mAccuracy":15.0|"mSpeed":0.425211|"mBearing":0.0}|1395061255413
and I need to extract only the coordinates,
so convert it into this :
29.0459152|41.0854122
Edit:
Turns out I need this:
GPS|29.0459152|41.0854122|0|1395061255413
Please note that I need to :
add GPS| in front, and |0 at the end.
and also I need to append the timestamp value (the last value in the original one) |1395061255413
How can I do this with Notepad++?
Thanks for any help !
Here is how you could do that in Notepad++:
use Ctrl+H to open the Replace pop-up
tick Regular expression in the Search mode section
search for this pattern: .*?"mLongitude":(\d+(?:\.\d+)?).*?"mLatitude":(\d+(?:\.\d+)).*
replace the matched string by \1|\2
click on Replace all ;)
How it works:
The regex pattern featured in my answer extracts the values for the longitude and latitude from each line and replaces the whole line by:
the mLongitude value,
a literal "|" and
the mLatitude value.
If you would like this pattern explained in more detail, please check out this permalink on regex101.
EDIT:
To include the timestamp you're referring to, you need to use this regex:
.*?"mLongitude":(\d+(?:\.\d+)?).*?"mLatitude":(\d+(?:\.\d+)).*\|(\d+)
Then, you just have to change the formatting of your replacement string to this:
GPS|\1|\2|0|\3
You probably got that already but let's still write down the complete usable solution ;)
I hope this helps!
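For anyone who wants to run the same replacement outside Notepad++, here is a minimal Python sketch. It assumes the pattern from the EDIT above behaves the same way under Python's re module (it uses only syntax common to both), and the function name shorten and the abbreviated sample line are made up for illustration.

```python
import re

# Pattern copied from the answer above: capture longitude, latitude,
# and the digits after the last "|" on the line (the trailing timestamp).
PATTERN = re.compile(
    r'.*?"mLongitude":(\d+(?:\.\d+)?)'
    r'.*?"mLatitude":(\d+(?:\.\d+))'
    r'.*\|(\d+)'
)

def shorten(line: str) -> str:
    """Return 'GPS|lon|lat|0|timestamp'; lines without a match are unchanged."""
    return PATTERN.sub(r'GPS|\1|\2|0|\3', line, count=1)

# Abbreviated version of the sample line from the question (assumption:
# the omitted fields do not matter for the match).
sample = ('5|{"mResults":0.0|0.0|"mProvider":"fused"|"mDistance":0.0'
          '|"mTime":1395061255413|"mAltitude":161.0|"mLongitude":29.0459152'
          '|"mLon2":0.0|"mLatitude":41.0854122|"mLat1":0.0|"mBearing":0.0}'
          '|1395061255413')

print(shorten(sample))  # GPS|29.0459152|41.0854122|0|1395061255413
```

The greedy .* before \|(\d+) is what makes the third group pick up the value after the last | rather than the mTime value in the middle of the line.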
I solved my problem by this :
search:
.*?"mTime":(\d+(?:\.\d+)?).*?"mLongitude":(\d+(?:\.\d+)).*?"mLatitude":(\d+(?:\.\d+)).*
replace :
GPS|\2|\3|0|\1
But this is because the same data (timestamp) was also inside the string. So this does not extract the timestamp from the end, but from the middle of the string.
So if @ccjmne edits the answer so that it extracts the timestamp from the end, I will accept that answer.
I don't know anything about Notepad++ Regex.
This is the data I have in my CSV:
6454345|User1-2ds3|62562012032|324|148|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1
3305611|User2-42g563dgsdbf|22023001345|0|0|c36dedfa12634e33ca8bc0ef4703c92b73d9c433
8749412|User3-9|xgs|f|98906504456|1534|51564|411b0fdf54fe29745897288c6ad699f7be30f389
How can I use a Regex to remove the 5th and 6th column? The numbers in the 5th and 6th column are variable in length.
Another problem is the User row can also contain a |, to make it even worse.
I can use a macro to fix this, but the file is a few millions lines long.
This is the final result I want to achieve:
6454345|User1-2ds3|62562012032|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1
3305611|User2-42g563dgsdbf|22023001345|c36dedfa12634e33ca8bc0ef4703c92b73d9c433
8749412|User3-9|xgs|f|98906504456|411b0fdf54fe29745897288c6ad699f7be30f389
I am open for suggestions on how to do this with another program, command line utility, either Linux or Windows.
Match \|[^|]+\|[^|]+(\|[^|]+$)
Replace $1
Basically, anchor to the end of the line and remove columns [-2] and [-3], i.e. the two columns just before the last one. (I assume columns can't be empty. Replace + with * if they can.)
If you need finer control than that, I'd recommend writing a Java or Python script to manually parse and rewrite the file for you.
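As a rough idea of what such a script could look like, here is a short Python sketch that applies the same end-anchored pattern line by line. The function name drop_two_before_last is hypothetical, and the rows are the sample data from the question; for a real file you would loop over the open file object instead of a list.

```python
import re

# Same pattern as above: two "|field" pairs followed by the last "|field",
# anchored at the end of the line. Only the last field (group 1) is kept.
TAIL = re.compile(r'\|[^|]+\|[^|]+(\|[^|]+)$')

def drop_two_before_last(line: str) -> str:
    """Remove the two columns that precede the final column."""
    return TAIL.sub(r'\1', line)

rows = [
    '6454345|User1-2ds3|62562012032|324|148|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1',
    '3305611|User2-42g563dgsdbf|22023001345|0|0|c36dedfa12634e33ca8bc0ef4703c92b73d9c433',
    '8749412|User3-9|xgs|f|98906504456|1534|51564|411b0fdf54fe29745897288c6ad699f7be30f389',
]

for row in rows:
    print(drop_two_before_last(row))
```

Because the match is anchored at the end of the line, extra | characters inside the user name (third row) do not affect which columns are removed.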
I've captured three groups and given them names. If you use a replace utility like sed or Vim's regex, you can replace the remove group with nothing. Or you can use a programming language to concatenate keep_before and keep_after for the desired result.
^(?<keep_before>(?:[^|]+\|){3})(?<remove>(?:[^|]+\|){2})(?<keep_after>.*)$
You may have to remove the group namings and use \1 etc. instead, depending on what environment you use.
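To illustrate the "concatenate keep_before and keep_after" route, here is a small Python sketch. Note that Python's re module spells named groups (?P<name>...) rather than (?<name>...), so the pattern is adapted accordingly; the function name drop_columns_4_and_5 is made up for this example.

```python
import re

# The named-group pattern from above, rewritten with Python's (?P<name>...) syntax.
FIELDS = re.compile(
    r'^(?P<keep_before>(?:[^|]+\|){3})'   # first three columns
    r'(?P<remove>(?:[^|]+\|){2})'         # the two columns to drop
    r'(?P<keep_after>.*)$'                # everything that follows
)

def drop_columns_4_and_5(line: str) -> str:
    """Join keep_before and keep_after; return the line unchanged if it does not match."""
    m = FIELDS.match(line)
    if m is None:
        return line
    return m.group('keep_before') + m.group('keep_after')

row = '6454345|User1-2ds3|62562012032|324|148|9c1fe63ccd3ab234892beaf71f022be2e06b6cd1'
print(drop_columns_4_and_5(row))
```

Because this pattern counts columns from the left, it assumes the user name itself contains no |. A row like the third one in the question would lose the wrong columns, which is the case the end-anchored patterns handle.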
In Notepad++, press Ctrl+H, then enter the following in the dialog:
Find what: \|\d+\|\d+(\|[0-9a-z]+)$
Replace with: $1
Search mode: Regular Expression
Click replace and done.
Regex explanation:
\|\d+ : matches the first column to remove: a | followed by digits
\|\d+ : matches the second column to remove: a | followed by digits
(\|[0-9a-z]+) : matches and captures the | and the string after the second number
$ : forces the match to end at the end of the line
Replacement:
$1 : replaces the matched text with the content of the capture group, i.e. whatever (\|[0-9a-z]+) matched
I want to replace each occurrence of a specific word, but it has to be in a line which begins with another certain word.
Example text:
This is some random text here
That is also some random text here
I want only to select lines beginning with "This" and change the "text" to e.g. "word".
The result of find & replace in Notepad++ would be:
This is some random word here
That is also some random text here
So far, I was able to select the line, no problem there: (This.+)
The problem is how to search for and replace the word "text", since I can't get the group/sub-pattern to work within itself, using \1.
I was able to select a string from and to a certain word, but can't figure out how to search within a line that is found.
I'm a regex rookie, so have patience. :)
Many thanks for sharing your brilliant thoughts!
^This\b.*?\K\btext\b
Try this. Replace by word. See demo.
https://regex101.com/r/jV9oV2/9
or
^(This\b.*?)\btext\b
Replace by \1word.
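If you want to try the capture-group variant outside Notepad++, here is a minimal Python sketch. Python's built-in re module does not support \K, so only the second pattern is used; the function name replace_in_this_lines is hypothetical.

```python
import re

# Capture everything from "This" up to the first standalone "text" on the line.
# re.MULTILINE makes ^ match at the start of every line, not only the first.
LINE_START = re.compile(r'^(This\b.*?)\btext\b', re.MULTILINE)

def replace_in_this_lines(text: str) -> str:
    """Replace the first 'text' with 'word' on lines that begin with 'This'."""
    # \g<1> is used instead of \1 so the group number cannot run into what follows.
    return LINE_START.sub(r'\g<1>word', text)

sample = ("This is some random text here\n"
          "That is also some random text here")

print(replace_in_this_lines(sample))
```

Only the first occurrence of "text" on each matching line is replaced, since the pattern is anchored to the start of the line.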
I am trying to use NotePad++ to do a search and replace using the regex function that replaces a string of characters but maintains one part of the string. My description isn't very good so perhaps it will be better if I just give you the example.
Throughout an XML doc I have the following elements...
<AddressLine3>addressLine3>
<AddressLine2>addressLine2>
I want to replace these with
<addressLine3> <addressLine2>
So I need to maintain the address line number.
I know that
AddressLine([0-9]{1})>addressLine([0-9]{1})
is a valid reg ex but I'm not sure what to put in the replace with section to tell it to maintain whatever value was found by ([0-9]{1}).
Thanks.
It's \{number of the group}, so \1, \2, ...
Edit following your clarification (I changed your regex slightly to get simpler groups):
(AddressLine[0-9]{1}>)(addressLine[0-9]{1}) is replaced by \2
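The same backreference idea can be checked quickly in Python. This is only a sketch, assuming the input looks exactly like the element in the question; the function name keep_second is made up.

```python
import re

# Group 1 is the part to discard, group 2 the part to keep (including the digit).
PAIR = re.compile(r'(AddressLine[0-9]>)(addressLine[0-9])')

def keep_second(text: str) -> str:
    """Replace each matched pair with the content of the second group only."""
    return PAIR.sub(r'\2', text)

print(keep_second('<AddressLine3>addressLine3>'))  # <addressLine3>
print(keep_second('<AddressLine2>addressLine2>'))  # <addressLine2>
```

The leading < and trailing > are outside the match, so they are left in place and the line number captured in group 2 is preserved.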
You can capture it in group and replace them
Find:(AddressLine[0-9])>(addressLine[0-9])
Replace:$1 <$2
Find what : (<AddressLine\d>)AddressLine\d
Replace by: $1
You have to select the Regular expression search mode.
I'm trying to select a substring using regex and I'm going round in circles. I need to select everything before the first "_".
Example URL - GI_2013_JUNE_10_VOL3_LASTCHANCE
So the result I'm looking for from the URL above would be "GI". The text before the first "_" can vary in length.
Any help would be much appreciated.
The regex would be:
^[^_]+
and grab the whole regex match. But as a comment says, using a substring function is more efficient!
^[^_]*
...is the expression you're looking for.
It basically says: Select everything that is not an underscore, starting at the beginning of the string.
http://regexr.com?356in
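For comparison, here is a small Python sketch showing both the regex and the plain substring approach mentioned above. The function names are hypothetical.

```python
import re

def before_first_underscore_regex(s: str) -> str:
    """Regex version: take the run of non-underscore characters at the start."""
    # re.match is anchored at the start of the string, so ^ is implicit.
    return re.match(r'[^_]*', s).group(0)

def before_first_underscore_plain(s: str) -> str:
    """Substring version: split once at the first underscore."""
    return s.partition('_')[0]

url = 'GI_2013_JUNE_10_VOL3_LASTCHANCE'
print(before_first_underscore_regex(url))  # GI
print(before_first_underscore_plain(url))  # GI
```

If the string contains no underscore at all, both versions return the whole string.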
I'm trying to match a pattern using RegEx in notepad++, but not having much luck. I'm able to match part but not all of it.
I need to search for this line:
<size value="Large" pax="13074"/>
And replace it with this:
<size value="Very_large" pax="41450" cargo="Largest" cargovolume="3227"/>
Essentially I need to find all patterns matching pax="n"/> and replace them with pax="n" cargo="Largest" cargovolume="0"/> while retaining the initial value of n.
So, ideas anyone?
Press Ctrl + F, move to tab Replace, in Find what do: pax="(\d+)" and in Replace with put this: pax="\1" cargo="Largest" cargovolume="0"
Remember to select the Regular expression search mode. That should retain the number and replace the content.
UPDATE: Hint about saving text for replacement.
Whenever you use a regex to do text replacement, wrap the content you want to keep in parentheses; you can then access it in the replacement using \i, where i is the position of that pair of parentheses in the pattern, counting from 1.
Hope it helps!
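The same capture-and-reuse replacement can be tried in Python. This is a minimal sketch assuming one size element per line as in the question; the function name add_cargo is made up.

```python
import re

# Capture the pax number so it can be written back unchanged via \1.
PAX = re.compile(r'pax="(\d+)"')

def add_cargo(xml_line: str) -> str:
    """Append the cargo attributes after pax, keeping the original pax value."""
    return PAX.sub(r'pax="\1" cargo="Largest" cargovolume="0"', xml_line)

print(add_cargo('<size value="Large" pax="13074"/>'))
# <size value="Large" pax="13074" cargo="Largest" cargovolume="0"/>
```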