I would like to send my users directly to a specific log group and filter but I need to be able to generate the proper URL format. For example, this URL
https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logsV2:log-groups/log-group/
%252Fmy%252Flog%252Fgroup%252Fgoes%252Fhere/log-events/$3FfilterPattern$3D$255Bincoming_ip$252C$2Buser_name$252C$2Buser_ip$2B$252C$2Btimestamp$252C$2Brequest$2B$2521$253D$2B$2522GET$2B$252Fhealth_checks$252Fall$2B*$2522$252C$2Bstatus_code$2B$253D$2B5*$2B$257C$257C$2Bstatus_code$2B$253D$2B429$252C$2Bbytes$252C$2Burl$252C$2Buser_agent$255D$26start$3D-172800000
will take you to a log group named /my/log/group/goes/here and filter messages with this pattern for the past 2 days:
[incoming_ip, user_name, user_ip , timestamp, request != "GET /health_checks/all *", status_code = 5* || status_code = 429, bytes, url, user_agent]
I can decode part of the URL, but I don't know what some of the other characters should be (see below), and this doesn't look like any standard URL encoding to me. Does anyone know of an encoder/decoder for this URL format?
%252F == /
$252C == ,
$255B == [
$255D == ]
$253D == =
$2521 == !
$2522 == "
$252F == /
$257C == |
$2B == +
$26 == &
$3D == =
$3F == ?
First of all, I'd like to thank the others for the clues. What follows is a complete explanation of how Logs Insights links are constructed.
Overall, it's just an oddly encoded serialization of an object structure, and it works like this:
The part after ?queryDetail= is the object representation, with {} represented by ~().
The object is walked down to its primitive values, which are transformed as follows:
encodeURIComponent(value), so that all special characters become %xx
replace(/%/g, "*"), so that this encoding is not affected by the outer encoding passes
if the value is a string, it is prefixed with an unmatched single quote
To illustrate:
"Hello world" -> "Hello%20world" -> "Hello*20world" -> "'Hello*20world"
Arrays of transformed primitives are joined using ~ and likewise wrapped in a ~() construct.
Then, once the primitives are transformed, the object's keys and values are joined using "~".
After that, the string is escape()d (note that escape() is used rather than encodeURIComponent(), because the latter does not transform ~ in JS).
After that, the ?queryDetail= prefix is added.
Finally, this string is encodeURIComponent()ed and, as a cherry on top, % is replaced with $.
Let's see how it works in practice. Say these are our query parameters:
const expression = `fields @timestamp, @message
    | filter @message not like 'example'
    | sort @timestamp asc
    | limit 100`;
const logGroups = ["/application/sample1", "/application/sample2"];
const queryParameters = {
end: 0,
start: -3600,
timeType: "RELATIVE",
unit: "seconds",
editorString: expression,
isLiveTrail: false,
source: logGroups,
};
Firstly primitives are transformed:
const expression = "'fields*20*40timestamp*2C*20*40message*0A*20*20*20*20*7C*20filter*20*40message*20not*20like*20'example'*0A*20*20*20*20*7C*20sort*20*40timestamp*20asc*0A*20*20*20*20*7C*20limit*20100";
const logGroups = ["'*2Fapplication*2Fsample1", "'*2Fapplication*2Fsample2"];
const queryParameters = {
end: 0,
start: -3600,
timeType: "'RELATIVE",
unit: "'seconds",
editorString: expression,
isLiveTrail: false,
source: logGroups,
};
Then the object is joined using ~, which gives us the object representation string:
const objectString = "~(end~0~start~-3600~timeType~'RELATIVE~unit~'seconds~editorString~'fields*20*40timestamp*2C*20*40message*0A*20*20*20*20*7C*20filter*20*40message*20not*20like*20'example'*0A*20*20*20*20*7C*20sort*20*40timestamp*20asc*0A*20*20*20*20*7C*20limit*20100~isLiveTrail~false~source~(~'*2Fapplication*2Fsample1~'*2Fapplication*2Fsample2))"
Now we escape() it:
const escapedObject = "%7E%28end%7E0%7Estart%7E-3600%7EtimeType%7E%27RELATIVE%7Eunit%7E%27seconds%7EeditorString%7E%27fields*20*40timestamp*2C*20*40message*0A*20*20*20*20*7C*20filter*20*40message*20not*20like*20%27example%27*0A*20*20*20*20*7C*20sort*20*40timestamp*20asc*0A*20*20*20*20*7C*20limit*20100%7EisLiveTrail%7Efalse%7Esource%7E%28%7E%27*2Fapplication*2Fsample1%7E%27*2Fapplication*2Fsample2%29%29"
Now we prepend the ?queryDetail= prefix:
const withQueryDetail = "?queryDetail=%7E%28end%7E0%7Estart%7E-3600%7EtimeType%7E%27RELATIVE%7Eunit%7E%27seconds%7EeditorString%7E%27fields*20*40timestamp*2C*20*40message*0A*20*20*20*20*7C*20filter*20*40message*20not*20like*20%27example%27*0A*20*20*20*20*7C*20sort*20*40timestamp*20asc*0A*20*20*20*20*7C*20limit*20100%7EisLiveTrail%7Efalse%7Esource%7E%28%7E%27*2Fapplication*2Fsample1%7E%27*2Fapplication*2Fsample2%29%29"
Finally, we URL-encode it and replace % with $, and voilà:
const result = "$3FqueryDetail$3D$257E$2528end$257E0$257Estart$257E-3600$257EtimeType$257E$2527RELATIVE$257Eunit$257E$2527seconds$257EeditorString$257E$2527fields*20*40timestamp*2C*20*40message*0A*20*20*20*20*7C*20filter*20*40message*20not*20like*20$2527example$2527*0A*20*20*20*20*7C*20sort*20*40timestamp*20asc*0A*20*20*20*20*7C*20limit*20100$257EisLiveTrail$257Efalse$257Esource$257E$2528$257E$2527*2Fapplication*2Fsample1$257E$2527*2Fapplication*2Fsample2$2529$2529"
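The walkthrough above can also be sketched in Python. This is only an illustration of the steps as described in this answer, not an official AWS algorithm. The function names (encode_uri_component, js_escape, encode_value, insights_fragment) are made up for this sketch, and js_escape only approximates JavaScript's escape() for ASCII input.

```python
from urllib.parse import quote


def encode_uri_component(text):
    # Leaves the same characters unescaped as JavaScript's encodeURIComponent.
    return quote(text, safe="-_.!~*'()")


def js_escape(text):
    # Approximation of JavaScript's escape() for ASCII input only (assumption).
    # Python's quote() never encodes "~", so that replacement is done by hand.
    return quote(text, safe="@*_+-./").replace("~", "%7E")


def encode_value(value):
    """Transform a primitive, list, or dict into the ~() representation."""
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # encode, swap % for *, prefix with an unmatched single quote
        return "'" + encode_uri_component(value).replace("%", "*")
    if isinstance(value, (list, tuple)):
        return "(" + "".join("~" + encode_value(item) for item in value) + ")"
    if isinstance(value, dict):
        pairs = (key + "~" + encode_value(item) for key, item in value.items())
        return "(" + "~".join(pairs) + ")"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def insights_fragment(query_parameters):
    """Build the part of the URL that follows '#logsV2:logs-insights'."""
    object_string = "~" + encode_value(query_parameters)
    with_query_detail = "?queryDetail=" + js_escape(object_string)
    return encode_uri_component(with_query_detail).replace("%", "$")


params = {
    "end": 0,
    "start": -3600,
    "timeType": "RELATIVE",
    "unit": "seconds",
    "editorString": "fields @timestamp",
    "source": ["/application/sample1", "/application/sample2"],
}
print(insights_fragment(params))
```

Keys are emitted in dictionary insertion order, so the parameters should be listed in the order the console is expected to accept.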
And putting it all together:
function getInsightsUrl(queryDefinitionId, start, end, expression, sourceGroup, timeType = 'ABSOLUTE', region = 'eu-west-1') {
const p = m => escape(m);
const s = m => escape(m).replace(/%/gi, '*');
const queryDetail
= p('~(')
+ p("end~'")
+ s(end.toUTC().toISO()) // converted using Luxon
+ p("~start~'")
+ s(start.toUTC().toISO()) // converted using Luxon
// Or use UTC instead of Local
+ p(`~timeType~'${timeType}~tz~'Local~editorString~'`)
+ s(expression)
+ p('~isLiveTail~false~queryId~\'')
+ s(queryDefinitionId)
+ p("~source~(~'") + s(sourceGroup) + p(')')
+ p(')');
return `https://${region}.console.aws.amazon.com/cloudwatch/home?region=${region}#logsV2:logs-insights${escape(`?queryDetail=${queryDetail}`).replace(/%/gi, '$')}`;
}
Of course, the reverse operation can be performed as well.
That's all, folks. Have fun, take care, and try to avoid doing such weird stuff yourselves. :)
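For the reverse direction, a minimal Python sketch could look like the following. It assumes the fragment was built exactly as described in this answer and that string values contain no literal ~, ( or ) characters, since those would be ambiguous in this scheme. The names decode_insights_fragment, _parse_value and _parse_primitive are hypothetical helpers, not part of any library.

```python
from urllib.parse import unquote


def _parse_primitive(token):
    if token.startswith("'"):
        # strings: drop the unmatched quote, turn * back into %, then decode
        return unquote(token[1:].replace("*", "%"))
    if token in ("true", "false"):
        return token == "true"
    try:
        return int(token)
    except ValueError:
        return float(token)


def _parse_value(text, pos):
    """Parse one value starting at text[pos]; return (value, next position)."""
    if text[pos] != "(":
        end = pos
        while end < len(text) and text[end] not in "~)":
            end += 1
        return _parse_primitive(text[pos:end]), end
    pos += 1  # skip "("
    if text[pos] == "~":  # array: (~item~item)
        items = []
        while text[pos] == "~":
            item, pos = _parse_value(text, pos + 1)
            items.append(item)
        return items, pos + 1  # skip ")"
    result = {}  # object: (key~value~key~value)
    while text[pos] != ")":
        key_end = text.index("~", pos)
        key = text[pos:key_end]
        result[key], pos = _parse_value(text, key_end + 1)
        if text[pos] == "~":
            pos += 1
    return result, pos + 1  # skip ")"


def decode_insights_fragment(fragment):
    """Decode the text that follows '#logsV2:logs-insights' in the URL."""
    once = unquote(fragment.replace("$", "%"))  # undo $ swap and outer encoding
    prefix = "?queryDetail="
    if not once.startswith(prefix):
        raise ValueError("fragment does not start with ?queryDetail=")
    object_string = unquote(once[len(prefix):])  # undo the escape() pass
    value, _ = _parse_value(object_string, 1)  # position 1 skips the leading "~"
    return value
```

Calling decode_insights_fragment on a fragment copied from the browser's address bar should give back a plain dictionary of query parameters, which is handy for checking what a generated link actually contains.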
I had to do a similar thing to generate a back link to the logs for a lambda and did the following hackish thing to create the link:
const link = `https://${process.env.AWS_REGION}.console.aws.amazon.com/cloudwatch/home?region=${process.env.AWS_REGION}#logsV2:log-groups/log-group/${process.env.AWS_LAMBDA_LOG_GROUP_NAME.replace(/\//g, '$252F')}/log-events/${process.env.AWS_LAMBDA_LOG_STREAM_NAME.replace('$', '$2524').replace('[', '$255B').replace(']', '$255D').replace(/\//g, '$252F')}`
A colleague of mine figured out that the encoding is nothing special: it is standard URI percent-encoding, applied twice. In JavaScript you can use the encodeURIComponent function to test this out:
let inp = 'https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logsV2:log-groups/log-group/'
console.log(encodeURIComponent(inp))
console.log(encodeURIComponent(encodeURIComponent(inp)))
This piece of JavaScript produces the expected output at the second encoding stage:
https%3A%2F%2Fconsole.aws.amazon.com%2Fcloudwatch%2Fhome%3Fregion%3Dus-east-1%23logsV2%3Alog-groups%2Flog-group%2F
https%253A%252F%252Fconsole.aws.amazon.com%252Fcloudwatch%252Fhome%253Fregion%253Dus-east-1%2523logsV2%253Alog-groups%252Flog-group%252F
Caution
At least some bits use the double encoding, not the whole link though. Otherwise all special characters would occupy 4 characters after double encoding, but some still occupy only 2 characters. Hope this helps anyway ;)
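The mix of single and double encoding can be checked by decoding a log-events link. The Python sketch below assumes the layout seen in the question's URL: $ stands in for %, the ?, = and & delimiters are encoded once, and the log group name and parameter values are encoded twice. The function name decode_log_events_fragment is made up for this illustration.

```python
from urllib.parse import parse_qs, unquote


def decode_log_events_fragment(fragment):
    """Decode the part of a CloudWatch console URL that follows the '#'."""
    # First pass: undo the $ substitution and the outer encoding layer.
    once = unquote(fragment.replace("$", "%"))
    path, _, query = once.partition("?")
    # Second pass on each path segment (the log group name is double-encoded).
    segments = [unquote(segment) for segment in path.split("/")]
    # parse_qs performs the second decoding pass on the parameter values
    # and also turns "+" back into spaces.
    params = {key: values[0] for key, values in parse_qs(query).items()}
    return segments, params


fragment = (
    "logsV2:log-groups/log-group/%252Fmy%252Flog%252Fgroup%252Fgoes%252Fhere"
    "/log-events/$3FfilterPattern$3D$255Bstatus_code$2B$253D$2B5*$255D"
    "$26start$3D-172800000"
)
segments, params = decode_log_events_fragment(fragment)
print(segments[2])
print(params)
```

Splitting on ? and & after the first pass is safe because any such character inside a value is still percent-encoded at that point.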
My complete JavaScript solution, based on @isaias-b's answer, which also adds a timestamp filter on the logs:
const logBaseUrl = 'https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logsV2:log-groups/log-group';
const encode = text => encodeURIComponent(text).replace(/%/g, '$');
const awsEncode = text => encodeURIComponent(encodeURIComponent(text)).replace(/%/g, '$');
const encodeTimestamp = timestamp => encode('?start=') + awsEncode(new Date(timestamp).toJSON());
const awsLambdaLogBaseUrl = `${logBaseUrl}/${awsEncode('/aws/lambda/')}`;
const logStreamUrl = (logGroup, logStream, timestamp) =>
`${awsLambdaLogBaseUrl}${logGroup}/log-events/${awsEncode(logStream)}${timestamp ? encodeTimestamp(timestamp) : ''}`;
I have created a bit of Ruby code that seems to satisfy the CloudWatch URL parser. I'm not sure why you have to double escape some things and then replace % with $ in others. I'm guessing there is some reason behind it but I couldn't figure out a nice way to do it, so I'm just brute forcing it. If you have something better, or know why they do this, please add a comment.
NOTE: The filter I tested with is kinda basic and I'm not sure what might need to change if you get really fancy with it.
require 'cgi'
# Basic URL that is the same across all requests
url = 'https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logsV2:log-groups/log-group/'
# CloudWatch log group
log_group = '/aws/my/log/group'
# Either specify the instance you want to search...
instance = '/log-events/i-xxxxxxxxxxxx'
# ...or leave it out to search all instances:
# instance = '/log-events'
# The filter to apply.
filter = '[incoming_ip, user_name, user_ip , timestamp, request, status_code = 5*, bytes, url, user_agent]'
# Start time. There might be an End time as well but my queries haven't used
# that yet so I'm not sure how it's formatted. It should be pretty similar
# though.
hours = 48
start = "&start=-#{hours*60*60*1000}"
# This will get you the final URL
final = url + CGI.escape(CGI.escape(log_group)) + instance + '$3FfilterPattern$3D' + CGI.escape(CGI.escape(filter)).gsub('%','$') + CGI.escape(start).gsub('%','$')
A bit late, but here is a Python implementation:
import re
import urllib.parse
def get_cloud_watch_search_url(search, log_group, log_stream, region=None):
    """Return a properly formatted URL string for searching CloudWatch logs.
    search = the filter pattern, e.g. {$.message: "You are amazing"}
    log_group = the log group you want to search
    log_stream = the stream of logs to search
    """
    url = f'https://{region}.console.aws.amazon.com/cloudwatch/home?region={region}'
    def aws_encode(value):
        """The heart of this is that AWS likes to quote things twice with some substitution"""
        value = urllib.parse.quote_plus(value)
        value = re.sub(r"\+", " ", value)
        return re.sub(r"%", "$", urllib.parse.quote_plus(value))
    bookmark = '#logsV2:log-groups'
    bookmark += '/log-group/' + aws_encode(log_group)
    bookmark += "/log-events/" + log_stream
    bookmark += re.sub(r"%", "$", urllib.parse.quote("?filterPattern="))
    bookmark += aws_encode(search)
    return url + bookmark
This then allows you to quickly verify it.
>>> real = 'https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logsV2:log-groups/log-group/$252Fapp$252Fdjango/log-events/production$3FfilterPattern$3D$257B$2524.msg$253D$2522$2525s$2525s+messages+to+$2525s+pk$253D$2525d...$2522$257D'
>>> constructed = get_cloud_watch_search_url(search='{$.msg="%s%s messages to %s pk=%d..."}', log_group='/app/django', log_stream='production', region='us-west-2')
>>> real == constructed
True
I encountered this problem recently when I wanted to generate a CloudWatch Logs Insights URL. TypeScript version below:
import { v4 } from "uuid"; // used below to generate a fresh queryId
export function getInsightsUrl(
start: Date,
end: Date,
query: string,
sourceGroup: string,
region = "us-east-1"
) {
const p = (m: string) => escape(m);
// encodes inner values
const s = (m: string) => escape(m).replace(/\%/gi, "*");
const queryDetail =
p(`~(end~'`) +
s(end.toISOString()) +
p(`~start~'`) +
s(start.toISOString()) +
p(`~timeType~'ABSOLUTE~tz~'UTC~editorString~'`) +
s(query) +
p(`~isLiveTail~false~queryId~'`) +
s(v4()) +
p(`~source~(~'`) +
s(sourceGroup) +
p(`))`);
return (
`https://console.aws.amazon.com/cloudwatch/home?region=${region}#logsV2:logs-insights` +
escape("?queryDetail=" + queryDetail).replace(/\%/gi, "$")
);
}
A Python solution based on @Pål Brattberg's answer:
cloudwatch_log_template = "https://{AWS_REGION}.console.aws.amazon.com/cloudwatch/home?region={AWS_REGION}#logsV2:log-groups/log-group/{LOG_GROUP_NAME}/log-events/{LOG_STREAM_NAME}"
log_url = cloudwatch_log_template.format(
AWS_REGION=AWS_REGION, LOG_GROUP_NAME=CLOUDWATCH_LOG_GROUP, LOG_STREAM_NAME=LOG_STREAM_NAME
)
Make sure to substitute illegal characters first (see OP) if you used any.
I encountered this problem recently when I wanted to generate a CloudWatch Logs Insights URL. PHP version below:
<?php
function getInsightsUrl($region = 'ap-northeast-1') {
// https://stackoverflow.com/questions/67734825/why-is-laravels-carbon-toisostring-different-from-javascripts-toisostring
$start = now()->subMinutes(2)->format('Y-m-d\TH:i:s.v\Z');
$end = now()->addMinutes(2)->format('Y-m-d\TH:i:s.v\Z');
$filter = 'INFO';
$logStream = 'xxx_backend_web';
$sourceGroup = '/ecs/xxx_backend_prod';
// $sourceGroup = '/aws/ecs/xxx_backend~\'/ecs/xxx_backend_dev'; // multiple source group
$query =
"fields @timestamp, @message \n" .
"| sort @timestamp desc\n" .
"| filter @logStream like '$logStream'\n" .
"| filter @message like '$filter'\n" .
"| limit 20";
$queryDetail = urlencode(
("~(end~'") .
($end) .
("~start~'") .
($start) .
("~timeType~'ABSOLUTE~tz~'Local~editorString~'") .
($query) .
("~isLiveTail~false~queryId~'") .
("~source~(~'") .
($sourceGroup) .
("))")
);
$queryDetail = preg_replace('/\%/', '$', urlencode("?queryDetail=" . $queryDetail));
return
"https://console.aws.amazon.com/cloudwatch/home?region={$region}#logsV2:logs-insights"
. $queryDetail;
}
A coworker came up with the following JavaScript solution.
import JSURL from 'jsurl';
const QUERY = {
end: 0,
start: -3600,
timeType: 'RELATIVE',
unit: 'seconds',
editorString: "fields @timestamp, @message, @logStream, @log\n| sort @timestamp desc\n| limit 200\n| stats count() by bin(30s)",
source: ['/aws/lambda/simpleFn'],
};
function toLogsUrl(query) {
return `#logsV2:logs-insights?queryDetail=${JSURL.stringify(query)}`;
}
toLogsUrl(QUERY);
// #logsV2:logs-insights?queryDetail=~(end~0~start~-3600~timeType~'RELATIVE~unit~'seconds~editorString~'fields*20*40timestamp*2c*20*40message*2c*20*40logStream*2c*20*40log*0a*7c*20sort*20*40timestamp*20desc*0a*7c*20limit*20200*0a*7c*20stats*20count*28*29*20by*20bin*2830s*29~source~(~'*2faws*2flambda*2fsimpleFn))
I HAVE to elevate @WayneB's answer above because it just works. No encoding required, just follow his template. I just confirmed it works for me. Here's what he said in one of the comments above:
"Apparently there is an easier link which does the encoding/replacement for you: https://console.aws.amazon.com/cloudwatch/home?region=${process.env.AWS_REGION}#logEventViewer:group=${logGroup};stream=${logStream}"
Thanks for this answer Wayne - just wish I saw it sooner!
Since the Python contributions so far relate to log groups rather than Logs Insights, here is my contribution. I could probably have done better with the inner functions, but it is a good starting point:
from datetime import datetime, timedelta
from urllib.parse import quote
def get_aws_cloudwatch_log_insights(query_parameters, aws_region):
    def quote_string(input_str):
        return quote(input_str, safe="~()'*").replace('%', '*')
    def quote_list(input_list):
        quoted_list = ""
        for item in input_list:
            if isinstance(item, str):
                item = f"'{item}"
            quoted_list += f"~{item}"
        return f"({quoted_list})"
    params = []
    for key, value in query_parameters.items():
        if key == "editorString":
            value = "'" + quote(value)
            value = value.replace('%', '*')
        elif isinstance(value, str):
            value = "'" + value
        if isinstance(value, bool):
            value = str(value).lower()
        elif isinstance(value, list):
            value = quote_list(value)
        params += [key, str(value)]
    object_string = quote_string("~(" + "~".join(params) + ")")
    escaped_object = quote(object_string, safe="*").replace("~", "%7E")
    with_query_detail = "?queryDetail=" + escaped_object
    result = quote(with_query_detail, safe="*").replace("%", "$")
    final_url = f"https://{aws_region}.console.aws.amazon.com/cloudwatch/home?region={aws_region}#logsV2:logs-insights{result}"
    return final_url
Example:
aws_region = "eu-west-1"
query = """fields @timestamp, @message
| filter @message not like 'example'
| sort @timestamp asc
| limit 100"""
log_groups = ["/application/sample1", "/application/sample2"]
query_parameters = {
"end": datetime.utcnow().isoformat(timespec='milliseconds') + "Z",
"start": (datetime.utcnow() - timedelta(days=2)).isoformat(timespec='milliseconds') + "Z",
"timeType": "ABSOLUTE",
"unit": "seconds",
"editorString": query,
"isLiveTrail": False,
"source": log_groups,
}
print(get_aws_cloudwatch_log_insights(query_parameters, aws_region))
Yet another Python solution:
from urllib.parse import quote
def aws_quote(s):
    return quote(quote(s, safe="")).replace("%", "$")
def aws_cloudwatch_url(region, log_group, log_stream):
    return "/".join([
        f"https://{region}.console.aws.amazon.com/cloudwatch/home?region={region}#logsV2:log-groups",
        "log-group",
        aws_quote(log_group),
        "log-events",
        aws_quote(log_stream),
    ])
aws_cloudwatch_url("ap-southeast-2", "/var/log/syslog", "process/pid=1")
https://ap-southeast-2.console.aws.amazon.com/cloudwatch/home?region=ap-southeast-2#logsV2:log-groups/log-group/$252Fvar$252Flog$252Fsyslog/log-events/process$252Fpid$253D1
I have data like this:
lm-sample prod
lm sample prod
lm-exit nonprod-shared
lm- value dev
I want to extract just the last value, i.e. everything after the first space counting from the right. So in this example:
prod
prod
nonprod-shared
dev
I tried:
Env =
Var FirstSpace = FIND(" ", 'AWS Reservations'[Account])
Return RIGHT('AWS Reservations'[Account],FirstSpace - 1)
But that is giving me odd results. Would appreciate some help on solving this. Thanks.
The lack of options for FIND or SEARCH to search from the end of the string makes this quite tricky.
You can use:
Env = TRIM(RIGHT(SUBSTITUTE('AWS Reservations'[Account], " ", REPT(" ", LEN('AWS Reservations'[Account]))), LEN('AWS Reservations'[Account])))
Or break it down for better understanding:
Env =
VAR string_length = LEN('AWS Reservations'[Account])
RETURN
TRIM(
RIGHT(
SUBSTITUTE(
'AWS Reservations'[Account],
" ",
REPT(" ", string_length)
),
string_length
)
)
Take lm-sample prod as an example.
First, we use REPT(" ", string_length) to create a run of spaces with the same length as the value lm-sample prod (14 characters).
Then we substitute every occurrence of " " with this extra-long run of spaces, so the string becomes lm-sample, followed by 14 spaces, followed by prod.
After that, we can safely take the rightmost string_length characters, i.e. 10 spaces followed by prod.
Finally, we trim the result to get what we want, prod.
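The same pad-and-trim trick can be traced outside DAX. The Python sketch below is only an illustration of the steps above: last_token is a made-up name, str.replace plays the role of SUBSTITUTE, slicing plays RIGHT, and strip plays TRIM.

```python
def last_token(text, separator=" "):
    """Return the part of text after the last separator, using padding."""
    length = len(text)
    # SUBSTITUTE + REPT: blow every separator up to `length` spaces
    padded = text.replace(separator, " " * length)
    # RIGHT: the last `length` characters can only contain the final token
    tail = padded[-length:] if length else ""
    # TRIM: remove the padding
    return tail.strip()


for account in ["lm-sample prod", "lm sample prod", "lm-exit nonprod-shared", "lm- value dev"]:
    print(repr(account), "->", repr(last_token(account)))
```

The trick works because each separator becomes at least as long as the whole original string, so the rightmost window can never reach back into an earlier token.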
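You were passing the start character index rather than the desired character count to the RIGHT() function, whose signature is:
Right(Text, NumberOfCharacters)
Try this (note that FIND locates the first space from the left, so this returns the last word only when the value contains exactly one space):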
Env =
Var FirstSpace = FIND(" ", 'AWS Reservations'[Account])
Return RIGHT('AWS Reservations'[Account], LEN('AWS Reservations'[Account]) - FirstSpace)
A minor clarification that might help others: if the parts are separated by something other than a space, say a period (.), e.g. to get "third" from "first.second.third", replace only the first " " with the period. Keep the second " " (the one inside REPT) as it is, because that is the "extra long" padding that TRIM removes.
I need a regular expression for the following rules:
should not start or end with a space
should contain just letters (lower / upper), digits, #, single quotes, hyphens and spaces (spaces just inside, but not at the beginning and the end, as I already said)
should contain at least one letter (lower or upper).
Thank you
I think
^(?=.*[a-zA-Z])[a-zA-Z0-9#'-](?:[a-zA-Z0-9#' -]*[a-zA-Z0-9#'-])?$
should help you.
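The three rules from the question can be checked against sample inputs with a short Python sketch. The pattern below is one possible way to express the rules and is an assumption of this sketch, as is the helper name is_valid. It uses fullmatch, so no ^ and $ anchors are needed.

```python
import re

# One letter somewhere (lookahead), an allowed non-space first character,
# then optionally any allowed characters ending in an allowed non-space one.
PATTERN = re.compile(
    r"(?=.*[A-Za-z])[A-Za-z0-9#'-](?:[A-Za-z0-9#' -]*[A-Za-z0-9#'-])?"
)


def is_valid(text):
    return PATTERN.fullmatch(text) is not None


samples = [
    "a1 - 'super' 1",   # valid
    " a1 - 'super' 1",  # leading space
    "a1 - 'super' 1 ",  # trailing space
    "a1 - 'super' 1?",  # disallowed character
    "1 - '123'",        # no letters
    "a",                # single letter
]
for sample in samples:
    print(repr(sample), is_valid(sample))
```

Writing the first and last character as explicit non-space classes avoids lookbehind, so the same pattern should carry over to most regex dialects.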
"Does it really matter guys?"
With regard to the dialect of regex: yes, it does matter. Different languages may have different dialects. One example off the top of my head is that the RegEx library in PHP supports lookbehinds, whereas JavaScript historically did not (lookbehind was only added in ES2018). This is why it is important to state the underlying language that you're using. Also, for future reference, it helps those wanting to answer your question if you provide sample input and the matches you expect from it.
Based on the information you provided, I feel this is a case where you should combine RegEx with plain JavaScript checks to validate the input. Take a look at this example:
window.onload = function() {
var valid = "a1 - 'super' 1";
var invalid1 = " a1 - 'super' 1"; //leading ws
var invalid2 = "a1 - 'super' 1 "; //trailing ws
var invalid3 = "a1 - 'super' 1?"; //invalid (?) char
var invalid4 = "1 - '123'"; //no letters
console.log(valid + ": " + validation(valid));
console.log(invalid1 + ": " + validation(invalid1));
console.log(invalid2 + ": " + validation(invalid2));
console.log(invalid3 + ": " + validation(invalid3));
console.log(invalid4 + ": " + validation(invalid4));
}
function validation(input) {
// matches any character outside the allowed set (letters, digits, #, single quote, hyphen, space)
var unacceptableChars = /[^a-zA-Z\d#' -]/;
var containsLetter = /[a-zA-Z]/;
return input.length > 0 && input.trim().length == input.length && !unacceptableChars.test(input) && containsLetter.test(input);
}
I need a regex for filtering out a query. For example, I get a query input as below.
state:CA AND country:US OR postalcode:8888
Here, I need to extract terms based on " AND ", " OR " (any case). Can someone please provide the regex with which I can extract terms like "state:CA", "country:US" etc?
I want to consider the spaces before and after the AND, OR as the other terms might contain "and", "or" as part of string.
Eg: state:OR AND country:US
UPDATE:
I have tried something like this
\sAND\s|\sOR\s
With this, I could find the patterns " AND ", " OR ". But, how to make it case-insensitive?
What flavor of regex are you using?
If the values in your key/value pairs will always consist of a single word, this would do:
\w+:\w+
Update:
Since your values can consist of more than one word, I think you should split the string into key/value pairs instead of matching them with a regex.
Here's how you could do it in javascript:
var s = 'state:New York AND country:US OR postalcode:8888'
var dataBlocks = s.replace(/\s+(?:AND|OR)\s+/gi, '|').split('|')
for(var i = 0; i < dataBlocks.length; i++) dataBlocks[i] = dataBlocks[i].trim()
//your resulting array would look like
//Array [ "state:New York", "country:US", "postalcode:8888" ]
The same solution, in C#:
Regex r = new Regex(@"\s+(?:AND|OR)\s+", RegexOptions.IgnoreCase);
var s = "state:New York AND country:US OR postalcode:8888";
var keyValuePairs = r.Replace(s, "|").Split(new char[] { '|' }).Select(z =>
{
var keyValue = z.Trim().Split(new char[] { ':' });
return new KeyValuePair<string, string>(keyValue.FirstOrDefault(), keyValue.LastOrDefault());
});
foreach (var keyValuePair in keyValuePairs)
Console.WriteLine("Key: {0}\tValue:{1}", keyValuePair.Key, keyValuePair.Value);
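To address the case-insensitivity part of the question directly: the pattern from the question only needs an ignore-case flag, and splitting on it yields the terms. Below is a minimal Python sketch of that idea; the names SEPARATOR, split_terms and to_pairs are made up for illustration.

```python
import re

# " AND " / " OR " in any letter case, with whitespace required on both sides
SEPARATOR = re.compile(r"\s+(?:AND|OR)\s+", re.IGNORECASE)


def split_terms(query):
    """Split a query such as 'state:CA AND country:US' into its terms."""
    return [term.strip() for term in SEPARATOR.split(query)]


def to_pairs(query):
    """Turn each 'key:value' term into a (key, value) tuple."""
    return [tuple(term.split(":", 1)) for term in split_terms(query)]


print(split_terms("state:CA AND country:US OR postalcode:8888"))
print(split_terms("state:OR and country:US"))
print(to_pairs("state:New York or country:US"))
```

Requiring whitespace around the keyword keeps values such as OR or York intact, but a value that itself contains a standalone "and" or "or" (for example "Trinidad and Tobago") would still be split, so such values would need quoting or a stricter grammar.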