Minify JSON with regex


Problem Description
I want to minify a JSON. Meaning:
Desired Result
Before
{
"keyOne": "First Value",
"keyTwo": "Second Value"
}
After
{"keyOne": "First Value", "keyTwo": "Second Value"}
I want to achieve this using RegEx.
What I tried is to replace \s with an empty string. But this leads to the unwanted result that whitespace is also removed from the values:
Result of Solution attempt
Before
{
"keyOne": "First Value",
"keyTwo": "Second Value"
}
After
{"keyOne": "FirstValue", "keyTwo": "SecondValue"}
Research done / Solution attempts
Searching Google and Stack Overflow, without success since all found questions target other use cases
Honestly just fooling around with basic RegEx knowledge
To clarify the question: I do not want to do this in JavaScript. I know I can go to the console and run something like copy(JSON.stringify(<the-json>)).
I want to quickly do this in an editor, in this case WebStorm, using the Replace tool, without installing any plugins or switching tools.
Final solution
Two steps are needed:
Replace \n with an empty string. This removes the line breaks.
Replace \s+" with " to remove the whitespace in front of each double quote.

You need two steps to achieve that in WebStorm:
first replace \n with nothing (an empty string) to remove the line breaks;
then replace \s{2}" with " to remove the two spaces of indentation before each key.
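As a rough illustration of why the two-step replace works on this sample but is fragile in general, here is a minimal JavaScript sketch. The function names twoStepMinify and stringAwareMinify are hypothetical. The first function applies the two steps described above (using the \s+" variant); the second is an alternative pattern that is not taken from the answer: it matches whole quoted strings first and puts them back unchanged, so only whitespace outside strings is dropped. Both assume the input is valid JSON.

```javascript
// Two-step approach: drop line breaks, then drop whitespace that sits
// directly in front of a double quote.
function twoStepMinify(text) {
  return text.replace(/\n/g, "").replace(/\s+"/g, '"');
}

// Alternative sketch (an assumption, not the answer's method):
// capture each quoted string and keep it; delete any other whitespace run.
function stringAwareMinify(text) {
  return text.replace(/("(?:[^"\\]|\\.)*")|\s+/g, (match, quoted) =>
    quoted !== undefined ? quoted : ""
  );
}

const pretty = '{\n  "keyOne": "First Value",\n  "keyTwo": "Second Value"\n}';
console.log(twoStepMinify(pretty));
console.log(stringAwareMinify(pretty));
```

Both calls give the same result for this sample. They differ once a value ends with a space before its closing quote: the two-step version removes that space, while the string-aware pattern leaves the value alone.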

The way an object is displayed in JS isn't related to the way you can handle it;
{\n
"keyOne": "First Value",\n
"keyTwo": "Second Value"\n
}
the \n characters are shown here only to make it more readable for humans; they don't actually exist in the object itself;
const data = {
"keyOne": "First Value",
"keyTwo": "Second Value"
};
console.log(JSON.stringify(data))
thus you can't apply a regex to a regular JavaScript object;
however, in the (very unlikely) case that you have a string representation of an object (from somewhere?),
you can apply a regex to achieve the result you want, as shown in the snippet below:
const strObj = `{
"keyOne": "First Value",
"keyTwo": "Second Value"
}`;
// since it is a string, we can't access its properties like a normal JS object (this logs undefined)
console.log(strObj["keyOne"]);
console.log(typeof strObj, strObj);
// replace the newlines with nothing to put everything on one line
let result = strObj.replace(/\n/g, "");
console.log(typeof result, result);
// parse the result (a valid JSON string) into an actual JS object
let parsedResult = JSON.parse(result);
// it will be displayed in a human-readable way since it's a normal object :)
console.log(typeof parsedResult, parsedResult);
// access one of its properties, since it's a normal object now
console.log(parsedResult.keyOne);

Related

How to convert json property names from snake to camel case

I have a json document like so... and I'm trying to convert the property names (not values) from snake case to camel.
ex -
message_type_id to messageTypeId
and _id to id
and point_altitude to pointAltitude
{
"@version": "1",
"point_altitude": 530,
"_id": "3325",
"header": {
"raw_message": "",
"message_type_id": "ping_event"
}
}
I've tried find ((\w)[_]{1,1}([a-z]{1,1})) and replace $1\U$2
but that also changes the values as well. I've also tried using positive lookaheads by adding .+?(?=\:) to the end of the find but that stops finding any second underscores in the property names.
https://regex101.com/r/jK5mP3/14

Doing this with a single regex replace is possible but probably not the best choice. Try
(?<=[\w])(?:_([a-z]))([^_"]*+)(?!"\s)|"_([a-z]+)"
I would rather suggest parsing the JSON and simply iterating over the property names. Depending on your environment, you could use code or a library like camelize or a command-line tool like jq (e.g. this jq answer deals with a similar problem).
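The parse-and-iterate suggestion could look roughly like the following JavaScript sketch. The helper names toCamelCase and camelizeKeys are hypothetical, and the key rule (drop leading underscores, then uppercase the character after each underscore) is an assumption derived from the three examples in the question.

```javascript
// Convert one key: drop leading underscores, then turn "_x" into "X".
function toCamelCase(key) {
  return key
    .replace(/^_+/, "")
    .replace(/_+([a-z0-9])/g, (match, ch) => ch.toUpperCase());
}

// Walk the parsed value and rename keys only; values are left untouched.
function camelizeKeys(value) {
  if (Array.isArray(value)) {
    return value.map(camelizeKeys);
  }
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [toCamelCase(key), camelizeKeys(inner)])
    );
  }
  return value;
}

const snakeDoc = JSON.parse(`{
  "point_altitude": 530,
  "_id": "3325",
  "header": { "raw_message": "", "message_type_id": "ping_event" }
}`);
const camelDoc = camelizeKeys(snakeDoc);
console.log(JSON.stringify(camelDoc, null, 2));
```

Because only keys pass through toCamelCase, a value such as "ping_event" keeps its underscore, which is the part that is hard to guarantee with a single regex replace.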

Regex for getting content of a html property when another specific property doesn't exist

I'm struggling to find a solution for what is probably a pretty simple problem, and although I have gone through a lot of questions, I can't manage to make it work.
Here are 2 HTML elements:
Test1
Test2
I want to get ONLY the content of the 1st element's href property (#content1). It must match because the html element contains no "onclick" property.
This regex works for matching the 1st element only:
^<a href="#"((?!onclick).)*$
but I can't figure out how to get the HREF content.
I've tried this:
^<a href="#(.*)"((?!onclick).)*$
but in this case, both elements are matching.
Thanks for your help !

I strongly suggest that you should do that in two steps. For one thing, parsing arbitrary html with a regexp is a notoriously slippery and winding road. For the other: there is no achievement in doing everything with one illegible regex.
And there's more to it: "contains no "onclick" attribute" is not the same as "href attribute is not directly followed by onclick attribute". So, a one-regex-solution would be either very complicated or very fragile (html tags have arbitrary attributes order).
var a = [
'Test1',
'Test2'
];
console.log(
a.filter(i => i.match(/onclick/i) == null)
.map(i => i.match(/href="([^"]+)"/i)[1])
);
This assumes that your href attribute values are valid and do not contain quotes (which is, of course, technically possible).
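For a version that can be run as-is, here is a small sketch of the same two-step idea. The two anchor strings are hypothetical stand-ins, since the original markup is not preserved in the text above; the second filter, which skips tags without a matching href, is an added safeguard and not part of the answer.

```javascript
// Hypothetical markup: the original two elements are not shown above.
const anchors = [
  '<a href="#content1">Test1</a>',
  '<a href="#content2" onclick="doSomething()">Test2</a>'
];

// Step 1: keep only tags without an onclick attribute.
// Step 2: pull out the href value, skipping tags that have none.
const hrefs = anchors
  .filter(tag => !/\bonclick\s*=/i.test(tag))
  .map(tag => tag.match(/\bhref="([^"]*)"/i))
  .filter(found => found !== null)
  .map(found => found[1]);

console.log(hrefs);
```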

Regex is not made for this. JavaScript would work better. This code will store an array of the hrefs matching your requirements in the variable hrefArray.
var hrefArray = [];
for (var elem of document.getElementsByTagName('a')) {
// keep only the links that have no onclick attribute
if (!elem.hasAttribute('onclick')) hrefArray.push(elem.href)
}
An example with your HTML is in the snippet below:
var hrefArray = [];
for (var elem of document.getElementsByTagName('a')) {
// keep only the links that have no onclick attribute
if (!elem.hasAttribute('onclick')) hrefArray.push(elem.href)
}
console.log(hrefArray);

Custom validator to ban a specific wordlist

I need a custom validator to ban a specific list of banned words from a textarea field.
I need exactly this type of implementation, I know that it's not logically correct to let the user type part of a query but it's exactly what I need.
I tried with a regExp but it has a strange behaviour.
My RegExp
/(drop|update|truncate|delete|;|alter|insert)+./gi
my Validator
export function forbiddenWordsValidator(sqlRe: RegExp): ValidatorFn {
return (control: AbstractControl): { [key: string]: any } | null => {
const forbidden = sqlRe.test(control.value);
return forbidden ? { forbiddenSql: { value: control.value } } : null;
};
}
my formControl:
whereCondition: new FormControl("", [
Validators.required,
forbiddenWordsValidator(this.BAN_SQL_KEYWORDS)...
It works only in certain cases, and I don't understand why the same string is rejected one time but accepted if I delete a character and retype it; sometimes, if I type a whitespace, the validator returns OK.

There are several issues here:
The global g modifier leads to unexpected alternating results when used with RegExp#test and similar methods, because they advance the regex's lastIndex after a successful match. It must be removed.
The . at the end requires one more character (any character other than a line break) after the keyword, so it must be removed as well.
Use
/drop|update|truncate|delete|;|alter|insert/i
Or, to match the words as whole words use
/\b(?:drop|update|truncate|delete|alter|insert)\b|;/i
This way, insert in insertion and drop in dropout won't get "caught" (=matched).
See the regex demo.
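The effect of the g flag on RegExp#test can be reproduced outside the form with a few lines of JavaScript. This is only a sketch; the variable names and the sample input are made up for illustration.

```javascript
const withGlobal = /drop|update|truncate|delete|;|alter|insert/gi;
const withoutGlobal = /drop|update|truncate|delete|;|alter|insert/i;
const sample = "drop table users";

// With "g", test() resumes from lastIndex, so identical calls alternate.
const globalResults = [withGlobal.test(sample), withGlobal.test(sample)];
// Without "g", every call starts at index 0.
const plainResults = [withoutGlobal.test(sample), withoutGlobal.test(sample)];

// Whole-word variant: "dropout" and "insertion" are not caught.
const wholeWords = /\b(?:drop|update|truncate|delete|alter|insert)\b|;/i;
const wordResults = ["dropout", "insertion", "drop it", "a;b"].map(s => wholeWords.test(s));

console.log(globalResults); // [ true, false ]
console.log(plainResults);  // [ true, true ]
console.log(wordResults);   // [ false, false, true, true ]
```

Since the validator receives one shared RegExp instance and calls test on every change, the leftover lastIndex from the previous call explains why retyping a character flips the result.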

it's not a great idea to give such power to the user

Regular Expression If 2nd parameter is Enrollment

I have the response below
{
"id": "3452",
"enrollable_id": "3452",
"enrollable_type": "Enrollment"
}
{
"id": "3453",
"enrollable_id": "3453",
"enrollable_type": "Task"
}
{
"id": "3454",
"enrollable_id": "3454",
"enrollable_type": "Enrollment"
}
{
"id": "3455",
"enrollable_id": "3455",
"enrollable_type": "Task"
}
I would like to get id [3452 and 3454] only if enrollable_type= Enrollment. This is for jmeter regex extractor so it would be great if I can just use one liner regex to fetch 3452 and 3454.

The RegEx you are looking for is:
_id":\s*"([^"]+(?=[^\0}]+_type":\s*"E))
Explanation
_id":\s*" Finds the place where the enrollable_id is
[^"]+(?= Matches the ID if:
[^\0}]+_type":\s* Finds the place where enrollable_type is
"E Checks if the enrollable type begins with an uppercase E
) End if
( ) Captures the ID
It's important to note that this RegEx will match on valid entries and capture the valid ID. This means you will need to get each match's capture rather than just getting each match.
Disclaimer
The above RegEx contains backslashes, which you will need to escape if using the RegEx as a string literal.
This is the RegEx with all necessary-to-escape characters escaped:
_id":\\s*"([^"]+(?=[^\\0}]+_type":\\s*"E))
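To show the difference between the match and its capture, here is a short JavaScript sketch that runs the expression against a shortened copy of the response. The variable names are illustrative, and JMeter itself is not involved; in a regex literal the backslashes need no extra escaping.

```javascript
const response = `{
"id": "3452",
"enrollable_id": "3452",
"enrollable_type": "Enrollment"
}
{
"id": "3453",
"enrollable_id": "3453",
"enrollable_type": "Task"
}
{
"id": "3454",
"enrollable_id": "3454",
"enrollable_type": "Enrollment"
}`;

const pattern = /_id":\s*"([^"]+(?=[^\0}]+_type":\s*"E))/g;

// Each full match starts at _id": so the ID is read from capture group 1.
const enrollmentIds = Array.from(response.matchAll(pattern), found => found[1]);
console.log(enrollmentIds);
```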

It's usually a bad idea to parse structured data with just a regex, but if you're intent on going this route then here you go:
"(\d+)"\s*,\s*(?="enrollable_type":\s*"Enrollment")
This assumes that enrollable_type always follows enrollable_id and that everything is quoted consistently, with a little allowance for variance in whitespace. You should be able to handle a little more variance if necessary, such as if you're unsure whether you can depend on keys or data being quoted (["']?). However, if you can't depend on the order of the properties (such as if the type comes before the id), then you should abandon using a regex.
Here's a sample working in JavaScript
const text = `{ "id": "3452", "enrollable_id": "3452", "enrollable_type": "Enrollment" } { "id": "3453", "enrollable_id": "3453", "enrollable_type": "Task" } { "id": "3454", "enrollable_id": "3454", "enrollable_type": "Enrollment" } { "id": "3455", "enrollable_id": "3455", "enrollable_type": "Task" }`;
const re = /"(\d+)"\s*,\s*(?="enrollable_type":\s*"Enrollment")/g;
var match;
while(match = re.exec(text)) {
console.log(match[1]);
}

Your response seems to be a JSON one (however it's malformed). If this is the case and it's really JSON - I would recommend going for JSON Extractor instead as regular expressions are fragile, sensitive to markup change, new lines, order of elements, etc. while JSON Extractor looks only into the content.
The relevant JSON Path query would be something like:
$..[?(@.enrollable_type == 'Enrollment')].enrollable_id
More information: JMeter's JSON Path Extractor Plugin - Advanced Usage Scenarios
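To make clear what the JSON Path filter selects, here is a rough plain-JavaScript equivalent. It is not JMeter code. It assumes the response has been repaired into a valid JSON array (the sample in the question is a bare sequence of objects), and the variable names are illustrative only.

```javascript
// Assumption: the objects from the question, wrapped in an array and comma-separated.
const responseBody = `[
  { "id": "3452", "enrollable_id": "3452", "enrollable_type": "Enrollment" },
  { "id": "3453", "enrollable_id": "3453", "enrollable_type": "Task" },
  { "id": "3454", "enrollable_id": "3454", "enrollable_type": "Enrollment" },
  { "id": "3455", "enrollable_id": "3455", "enrollable_type": "Task" }
]`;

// Keep the objects whose enrollable_type is "Enrollment", then read enrollable_id.
const selectedIds = JSON.parse(responseBody)
  .filter(item => item.enrollable_type === "Enrollment")
  .map(item => item.enrollable_id);

console.log(selectedIds);
```

Unlike the regex approaches, this does not depend on the order of the properties or on the whitespace between them.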

You can extract the data in 2 ways:
Using the JSON Extractor.
To extract data using the JSON Extractor, the response data must follow JSON syntax rules.
Use the following JSON path in the JSON Extractor
$..[?(@.enrollable_type=="Enrollment")].id
and set the match number to -1.
Using the Regular Expression Extractor.
To extract data using the Regular Expression Extractor, use the following regex
id": "(.+?)",\s*(.+?)\s*"enrollable_type": "Enrollment
template : $1$2$3$4$
Match no -1
You can see the stored variables using the Debug Sampler.

Regular Expression to strip sensitive information from a JSON object

I have a JSON object something like below from which i wanted to strip out sensitive information like password, mobile no, etc. using Regular Expressions,
Example JSON
{
"username":"abc",
"password":"xyz123",
"Security":{
"SecurityQuestion":"what is your first pet name",
"SecurityAnswer": "snoopy"
}
}
From the above JSON object, I want to strip out sensitive information like "password" and "SecurityAnswer". I tried various regular expression patterns, but each of them removed only one of the items.
I need help or guidance on how to construct a regular expression in which I can include any names, so that those fields are stripped out of the JSON.
Expected Output:
{
"username":"abc",
"Security":{
"SecurityQuestion":"what is your first pet name"
}
}
Note: If a password is the last property, then the expression should be able to remove the comma (,) also from the previous property.
I tried the expression from Regex remove json property with various combinations but none were working as per my requirement.

If you want to get values from JSON, you don't need to use regex and make a very complex regular expression.
var data = {
"username":"abc",
"password":"xyz123",
"Security":{
"SecurityQuestion":"what is your first pet name",
"SecurityAnswer": "snoopy"
}
}
That is your object, now if you want to retrieve the data simply treat it like a json.
function retrieveData( Obj ) {
return {
username: Obj.username,
Security:{
SecurityQuestion: Obj.Security.SecurityQuestion
}
}
}
var extractedData = retrieveData(data);
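If the list of field names should be configurable, as the question asks, one possible sketch is to parse the text and let the replacer argument of JSON.stringify drop the unwanted keys. SENSITIVE_KEYS and stripSensitive are hypothetical names, and this is an alternative to the hand-written retrieveData above, not the answer's own method.

```javascript
// Hypothetical list of property names to remove at any nesting level.
const SENSITIVE_KEYS = new Set(["password", "SecurityAnswer"]);

function stripSensitive(jsonText) {
  const parsed = JSON.parse(jsonText);
  // Returning undefined from the replacer omits that property entirely.
  return JSON.stringify(
    parsed,
    (key, value) => (SENSITIVE_KEYS.has(key) ? undefined : value),
    2
  );
}

const original = `{
  "username":"abc",
  "password":"xyz123",
  "Security":{
    "SecurityQuestion":"what is your first pet name",
    "SecurityAnswer": "snoopy"
  }
}`;
const stripped = stripSensitive(original);
console.log(stripped);
```

Because the text is re-serialized, commas are placed correctly even when the removed property was the last one in its object.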
