How to get comments and string in regex?

I have created a programming language, KAGSA, and I need to write a syntax highlighter for it. I started with a VS Code highlighter. Everything else works, but I have a problem with the regexes for multi-line strings and multi-line comments. This is the code:
Comments:
"comments": {
"patterns": [{
"name": "comment.line.shebang.kagsa",
"match": "//..*|/\\*(.*?|\n)*\\*/|//|/\\**\\*"
}]
},
The problem is with the /*Comment*/ comment.
And the string code:
"strings": {
"name": "string.quoted.double.kagsa",
"patterns": [{
"name": "string.quoted.double.kagsa",
"match": "'(.*?)'|\"(.*?)\"|``(.*?|\n)*``"
}]
},
My problem is with ``String``.
And this is the color I get:
[the output color][https://i.stack.imgur.com/NPbS0.png]

You have this issue because match is applied to one line at a time, so it cannot match string literals or comments that span several lines.
I found a similar problem.
As said by Gama11 in his answer:
Try to use a begin / end pattern instead of a simple match.
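To illustrate why a begin / end pair helps, here is a rough Python sketch. It is not how VS Code is implemented; it is a toy scanner (the name scan_block_comments and the sample text are made up for this example) that mimics the idea: the text is processed line by line, a begin pattern opens a scope, the scope stays open across lines, and an end pattern closes it.

```python
import re

BEGIN = re.compile(r"/\*")   # plays the role of the grammar's "begin"
END = re.compile(r"\*/")     # plays the role of the grammar's "end"


def scan_block_comments(text):
    """Return (line_number, start_col, end_col) spans that lie inside /* ... */."""
    spans = []
    in_comment = False           # state carried over from one line to the next
    for lineno, line in enumerate(text.split("\n"), start=1):
        pos = 0                  # where to search next on this line
        start = 0                # where the current comment span starts on this line
        while True:
            if not in_comment:
                m = BEGIN.search(line, pos)
                if m is None:
                    break
                in_comment = True
                start, pos = m.start(), m.end()
            else:
                m = END.search(line, pos)
                if m is None:
                    # No end on this line: the scope stays open.
                    spans.append((lineno, start, len(line)))
                    break
                spans.append((lineno, start, m.end()))
                in_comment = False
                pos = m.end()
    return spans


sample = "x = 1 /* first\nsecond line\nthird */ y = 2"
print(scan_block_comments(sample))
```

A single-line pattern such as /\*.*?\*/ finds nothing on any of the three sample lines when each line is tested on its own, which is the situation a plain match is in.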

Related

How can I find and replace values bracket by bracket?

I'm trying to find every "color" value and replace it with a specific string, but only the "color" value of every "name" that has "Bismuthinite" in it.
[
{
"name": "Poor Gneiss Bismuthinite",
"blockName": "tfc:ore/poor_bismuthinite/gneiss",
"order": 789,
"color": 5015620,
"drawing": false
},
{
"name": "Slate Halite",
"blockName": "tfc:ore/halite/slate",
"order": 1046,
"color": 7153517,
"drawing": false
},
The information within the next brackets (block? I'm not sure what the terminology is, I'm very new to coding in general) should not be selected or altered in any way. Only the entries whose "name" includes "Bismuthinite" should be changed.
I've tried a multiline find and replace using the ToolBucket plugin for Notepad++, but either it can't do what I want, or I just don't know how.

How to match a string exactly OR exact substring from beginning using Regular Expression

I'm trying to build a regex query for a database and it's got me stumped. If I have a string with a varying number of elements and an ordered structure, how can I find out whether it matches another string exactly OR matches an exact prefix of it (a substring read from the left)?
For example I have these strings
Canada.Ontario.Toronto.Downtown
Canada.Ontario
Canada.EasternCanada.Ontario.Toronto.Downtown
England.London
France.SouthFrance.Nice
They are structured by most general location to specific, left to right. However, the number of elements varies with some specifying a country.region.state and so on, and some just country.town. I need to match not only the words but the order.
So if I want to match "Canada.Ontario.Toronto.Downtown", I want to get both the first and the second string (#1 and #2) and nothing else. How would I do that? Basically, I want to run through the string and reject it as soon as a different character comes up, while still allowing a substring that ends "early", like #2, to match.
I've tried making groups and using "?", like (canada)?.?(Ontario)?.? etc., but it doesn't work in all situations since it can also match nothing.
Edit as requested:
MongoDB database collection:
[
{
"_id": "doc1",
"context": "Canada.Ontario.Toronto.Downtown",
"useful_data": "Some Data"
},
{
"_id": "doc2",
"context": "Canada.Ontario",
"useful_data": "Some Data"
},
{
"_id": "doc3",
"context": "Canada.EasternCanada.Ontario.Toronto.Downtown",
"useful_data": "Some Data"
},
{
"_id": "doc4",
"context": "England.London",
"useful_data": "Some Data"
},
{
"_id": "doc5",
"context": "France.SouthFrance.Nice",
"useful_data": "Some Data"
},
{
"_id": "doc6",
"context": "",
"useful_data": "Some Data"
}
]
The user provides the values "Canada", "Ontario", "Toronto", and "Downtown" in that order, and I need to use them to query doc1 and doc2 and no others. So I need a regex pattern to put in here: collection.find({"context": {$regex: <pattern here>}}). If that's not possible, I'll just have to restructure the data and use different methods of finding those docs.

At each dot, start a nested optional group for the next term, and add start and end anchors:
^Canada(\.Ontario(\.Toronto(\.Downtown)?)?)?$
See live demo.
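Since the terms come from the user, the pattern probably has to be built at run time. Here is a minimal Python sketch of that construction, checked against the sample strings from the question. The helper name prefix_pattern is made up for this example, and passing the resulting string as the $regex value is an assumption about how it would be used, not something tested against MongoDB here.

```python
import re


def prefix_pattern(terms):
    """Build an anchored pattern with one nested optional group per extra term."""
    if not terms:
        raise ValueError("at least one term is required")
    nested = ""
    # Work from the innermost (last) term outwards, wrapping as we go.
    for term in reversed(terms[1:]):
        nested = r"(\." + re.escape(term) + nested + ")?"
    return "^" + re.escape(terms[0]) + nested + "$"


pattern = prefix_pattern(["Canada", "Ontario", "Toronto", "Downtown"])
print(pattern)  # ^Canada(\.Ontario(\.Toronto(\.Downtown)?)?)?$

contexts = [
    "Canada.Ontario.Toronto.Downtown",
    "Canada.Ontario",
    "Canada.EasternCanada.Ontario.Toronto.Downtown",
    "England.London",
    "France.SouthFrance.Nice",
    "",
]
print([c for c in contexts if re.search(pattern, c)])
```

Note that the first term is not optional, so the empty context does not match, while a bare "Canada" would.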

Replace specific part of a string

I've run into a problem while modifying a document. Let's say, for example, I have this chunk of text:
"id": "EFM",
"type": "Casual",
"hasBeenAssigned": false,
"hasRandomAssigned": false
},
I currently have roughly 73 - 80 occurrences of:
"id" : "somethingdifferent",
Using a regular expression in Notepad++, how can I select the entire string:
"id" : "",
but only change the contents between the second set of quotes?
Edit
An oversight made me leave this information out:
"equipedOutfit": {
"id": "MkIV",
"type": "Outfit",
"hasBeenAssigned": false,
"hasRandonAssigned": false
},
"equipedWeapon": {
"id": "EFM",
"type": "Casual",
"hasBeenAssigned": false,
"hasRandonAssigned": false
},
The text I am looking to select is:
"id" : "EFM",

You can use a regex like this:
("id": ").*?"
With a replacement string:
$1whatever"
  ^^^^^^^^--- replace 'whatever' with whatever you want
Working demo
Update: since you updated your question, I'm updating the answer. If you only want to replace "id": "EFM", you just have to search for that exact text and supply the replacement string you want.
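The same capture-and-reinsert idea can be tried outside Notepad++. Below is a small Python sketch, assuming the file content looks like the excerpt in the question. The function name replace_id and the shortened sample text are invented for this example, and Python's re module is not the engine Notepad++ uses, so treat it as an illustration of the technique only.

```python
import re

SAMPLE = '''"equipedOutfit": {
"id": "MkIV",
"type": "Outfit"
},
"equipedWeapon": {
"id": "EFM",
"type": "Casual"
},'''


def replace_id(text, old_id, new_id):
    """Replace the value of every "id" key whose current value is old_id."""
    # Group 1 keeps the key, colon and opening quote; group 2 keeps the closing quote.
    pattern = r'("id"\s*:\s*")' + re.escape(old_id) + r'(")'
    # A function is used as the replacement so that new_id is inserted literally.
    return re.sub(pattern, lambda m: m.group(1) + new_id + m.group(2), text)


print(replace_id(SAMPLE, "EFM", "whatever"))
```

Only the "EFM" entry changes; the "MkIV" entry is left alone because the old value is part of the pattern.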

"id":\s*"\K[^"]*
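You can use \K here and replace with whatever you want: \K drops everything matched so far from the reported match, so only the value between the quotes is replaced. See demo.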
https://regex101.com/r/sS2dM8/29
EDIT:
If you want only EFM then use
"id"\s*:\s*"\KEFM(?=")

Find what: ("id"\s?:\s?").*(")
Replace with: \1somethingdifferent\2
Options:
Regular expression, Wrap around

Find pattern with regex in Sublime text 2.02

I would like to create a new syntax rule in Sublime that searches for a string pattern so that the pattern is highlighted. The pattern I am looking for is IPC or TST, so I wrote the following Sublime syntax rule:
{ "name": "a3",
"scopeName": "source.a3",
"fileTypes": ["a3"],
"patterns": [
{ "name": "IPC",
"match": "\\b\\w(IPC|TST)\\w\\b "
}
],
"uuid": "c76f733d-879c-4c1d-a1a2-101dfaa11ed8"
}
But for some reason it doesn't work at all.
Could someone point me in the right direction?
Thanks in advance

After looking around and testing a lot, I found the issue. Apart from identifying the pattern, I also have to assign it a scope that gives it a colour, and for that I have to use "captures" (the pattern is also simplified to plain word boundaries around the two keywords). The rule is as follows:
{ "name": "IPC colour",
"match": "\\b(IPC|TST)\\b",
"captures": {
"1": { "name": "meta.preprocessor.diagnostic" }
}
},
Where "name": "meta.preprocessor.diagnostic" determines the colour assigned to the matched pattern.
regards!
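For anyone wondering why the pattern from the question never matched a standalone keyword, here is a quick Python check. Python's re module is not the regex engine Sublime uses, so this assumes \b and \w behave the same way for plain ASCII input; the sample strings are made up.

```python
import re

# Pattern from the question: needs one word character on each side of the
# keyword, plus a trailing space.
ORIGINAL = re.compile(r"\b\w(IPC|TST)\w\b ")
# Pattern from the answer: the keyword must stand alone as a whole word.
FIXED = re.compile(r"\b(IPC|TST)\b")

SAMPLES = ["call IPC here", "TST", "xIPCy ", "TSTING"]


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


for text in SAMPLES:
    print(repr(text), matches(ORIGINAL, text), matches(FIXED, text))
```

The original pattern only matches "xIPCy ", where the keyword is embedded in a longer word and followed by a space. The fixed one matches the standalone IPC and TST and rejects the embedded cases.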

sed - trying to replace first occurrence after a match

I am facing a situation that drives me nuts.
I am setting up an update server which uses a json file.
Don't ask why or how, it sucks and is my only possibility to achieve it.
I have been trying and researching for HOURS (many) because I went ballistic and wanted to crack this on my own. But I have to realize I got stuck and need help.
So sorry for this chunk but I think it is somewhat important to see...
The file is a one-liner that repeats the following sequence with changing values (of course).
"plugin_name_foo_bar": {"buildDate": "bla", "dependencies": [{"name": "bla", "optional": true, "version": "1.00"}], "developers": [{"developerId": "bla", "email": "bla@gmail.com", "name": "Bla bla2nd"}], "excerpt": "some text {excerpt} !bla.png|thumbnail,border=1! ", "gav": "bla", "labels": ["report", "scm-related"], "name": "plugin_name_foo_bar", "previousTimestamp": "bla", "previousVersion": "1.0", "releaseTimestamp": "bla", "requiredCore": "1", "scm": "github.com", "sha1": "ynnBM2jWo25ZLDdP3ybBOnV/Pio=", "title": "bla", "url": "http://bla.org", "version": "1.0", "wiki": "https://bla.org"}, "Exclusion": {"buildDate": "bla", "dependencies": [],
and the next plugin block is glued straight afterwards.
What I now want to do is search for "plugin_name_foo_bar": {" as this is the unique identifier for a new plugin description block.
I want to replace the first sha1 value occurring afterwards. That's where I keep failing. I always grab the first, last or any occurrence in the entire file and not the one in the block :(
"title" is the unique identifier after the sha1 value.
So I tried to make the .* less greedy but it ain't working out.
last attempt was heading towards:
sed -i 's/("name": "plugin_name_foo_bar.*sha1": ")([a-zA-Z0-9!@#\$%^&*()\[\]]*)(", "title"\)/\1blablabla\2/1' default.json
to find the sha1 value of that plugin but still no joy. I hope someone knows - preferably a simpler approach - before I now continue with trial and error until I have to puke and freakout.
I am working with SED on Windows, so Unix approach might help me to figure out how to achieve this in batch but please make it as one-liner if possible. Scripts are a real pain to convert.
And I just need SED and no other solution with other tools like AWK. That is absolutely out of discussion.
Any help is appreciated :)
Cheers
Jan

Don't use regex (sed) to parse JSON. Use a proper JSON parser instead, or JavaScript directly, as I do:
Using JavaScript and Node.js in a script:
The file /tmp/file.json is:
{
"plugin_name_foo_bar" : {
"excerpt" : "some text {excerpt} !bla.png|thumbnail,border=1! ",
"dependencies" : [
{
"name" : "bla",
"version" : "1.00",
"optional" : true
}
],
"title" : "bla",
"previousTimestamp" : "bla",
"releaseTimestamp" : "bla",
"sha1" : "ynnBM2jWo25ZLDdP3ybBOnV/Pio=",
"labels" : [
"report",
"scm-related"
],
"buildDate" : "bla",
"version" : "1.0",
"previousVersion" : "1.0",
"name" : "plugin_name_foo_bar",
"scm" : "github.com",
"url" : "http://bla.org",
"gav" : "bla",
"developers" : [
{
"email" : "bla@gmail.com",
"developerId" : "bla",
"name" : "Bla bla2nd"
}
],
"wiki" : "https://bla.org",
"requiredCore" : "1"
},
"Exclusion" : {
"dependencies" : [],
"buildDate" : "bla"
}
}
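The script script.js:
// Load and parse the JSON file (require parses .json files automatically)
var js = require('/tmp/file.json')
// Overwrite the sha1 of the target plugin
js.plugin_name_foo_bar.sha1 = "xxx"
// Print the whole document as valid JSON (console.log(js) would abbreviate deeply nested objects)
console.log(JSON.stringify(js, null, 2))
Usage: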
nodejs script.js
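The same parse, modify, serialize approach works in any language with a JSON library. Here is a minimal Python sketch of it, operating on a string rather than a file; the function name set_sha1 and the cut-down sample document are invented for this example, and reading or writing the actual file is left out.

```python
import json


def set_sha1(json_text, plugin, new_sha1):
    """Parse the JSON text, replace one plugin's sha1, and serialize it again."""
    data = json.loads(json_text)
    # Address the value by its keys instead of by its position in the text.
    data[plugin]["sha1"] = new_sha1
    return json.dumps(data)


sample = (
    '{"plugin_name_foo_bar": {"sha1": "ynnBM2jWo25ZLDdP3ybBOnV/Pio=", "title": "bla"}, '
    '"Exclusion": {"sha1": "unchanged", "title": "other"}}'
)
print(set_sha1(sample, "plugin_name_foo_bar", "xxx"))
```

Because the value is reached through its keys, it no longer matters how many other sha1 entries the file contains or in which order the keys appear.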

As sputnick points out parsing is a little beyond what sed's meant for. Still, sed's Turing-complete and bludgeoning it into doing what you want can satisfy that {sad,masoch}istic urge so many of us feel from time to time.
This one's even easy.
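sed '
# Temporarily turn every "sha1": key into a newline, which cannot otherwise occur inside a line
s/"sha1": /\n/g
# After the plugin name, run to the first newline marker and replace the quoted value behind it
s/\("name": "plugin_name_foo_bar"[^\n]*\n"\)[^"]*/\1thenewsha/
# Turn the newline markers back into "sha1": keys
s/\n/"sha1": /g
'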
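The trick in the sed script above is to swap the multi-character delimiter for a single character that cannot occur in the line, so that a negated character class can stop at its first occurrence. Here is a Python transcription of the same three steps, as a sketch for experimenting with the idea. The function name and the two-plugin sample line are made up, and in Python a lazy quantifier would normally be the simpler route.

```python
import re


def replace_first_sha1_after(line, plugin, new_sha1):
    """Replace the first sha1 value that follows the given plugin's name entry."""
    marker = "\n"
    if marker in line:
        raise ValueError("the marker must not occur in the input line")
    # Step 1: turn every '"sha1": ' key into the single-character marker.
    tmp = line.replace('"sha1": ', marker)
    # Step 2: [^\n]* cannot run past a marker, so it stops at the first one.
    pattern = r'("name": "' + re.escape(plugin) + r'"[^\n]*\n")[^"]*'
    tmp = re.sub(pattern, lambda m: m.group(1) + new_sha1, tmp, count=1)
    # Step 3: restore the keys.
    return tmp.replace(marker, '"sha1": ')


line = (
    '"a": {"name": "a", "sha1": "AAA", "title": "x"}, '
    '"b": {"name": "b", "sha1": "BBB", "title": "y"}'
)
print(replace_first_sha1_after(line, "a", "NEW"))
```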

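For the Windows command line, with escaped quotes, in-place replacement and extended regular expressions (note that sed has no lazy quantifiers, so .+? is still greedy here and, with several plugin blocks on one line, the match runs to the last sha1 on that line):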
sed -i -r "s/(plugin_name_foo_bar.+?sha1\": \")[^\"]+\"/\1abcdefghijkl\"/" default.json

sed -r "s/(plugin_name_foo_bar[^!]+sha1.: .)[^\"]+/\1abcdefghijkl/" file
