I have a website configured using GCP load-balancer and GCP storage as backend service.
What I have now:
https://example.com/#/ --> works
https://example.com/#/path --> works
What I want:
https://example.com/#/ but in the backend it should hit /#/path.
I have tried GCP path mapping using host and path rules, but the # symbol is causing a problem: the browser converts # to %23 and the response says "key not found".
Any idea?
In a URL/URI, the hash symbol (#) has a special meaning: it is a reserved character used as a generic delimiter, just as the forward slash (/) and the at sign (@) are.
Actually, the hash symbol is interpreted as an anchor in the URL, so it is expected to point to an anchored part of your document. An example would be:
http://example.com/your_page.html#my_document
This links the URL directly to the my_document anchor in your_page.html.
So, if you use the hash character differently than this, the URL will be encoded for safety reasons. As stated in RFC 1738: The character "#" is unsafe and should always be encoded because it is used in the World Wide Web and in other systems to delimit a URL from a fragment/anchor identifier that might follow it.
Due to that, your URL string is being encoded by the browser mechanism.
Although it is possible to set up a URL mapping using a hash symbol, it is not recommended, so I kindly encourage you not to use a hash symbol in the URL map.
The reason it works as you mention is simply that the hash symbol (#) is ignored by your web application when it is not encoded. So /#/path, or even /#/#/#/path, appears to work, but in truth only /path is being interpreted.
Related
Using Rails' routing, for a URL like https://www.amazon.com/posts/1, I can do this:
get 'posts/:url', to: 'posts#search', constraints: { url: /.*/ }
Using Go's Gin framework, I didn't find a regex-constraint method for such a route:
r.GET("posts/search/:url", post.Search)
In the post controller
func Search(c *gin.Context) {
fmt.Println(c.Param("url"))
}
When I call http://localhost:8080/posts/search/https://www.amazon.com/posts/1, it returns a 404.
A reproduction is at https://play.golang.org/p/dsB-hv8Ugtn:
➜ ~ curl http://localhost:8080/site/www.google.com
Hello www.google.com%
➜ ~ curl http://localhost:8080/site/http://www.google.com/post/1
404 page not found%
➜ ~ curl http://localhost:8080/site/https%3A%2F%2Fwww.google.com%2Fpost%2F1
404 page not found%
➜ ~ curl http://localhost:8080/site/http:\/\/www.google.com\/post\/1
404 page not found%
Gin does not support regular expressions in the router. This is probably because it builds a tree of paths so that it never has to allocate memory while traversing, which results in excellent performance.
The parameter support for paths is also not very powerful, but you can work around the issue by using a catch-all parameter like
r.GET("/posts/search/*url", ...)
Now c.Param("url") could contain slashes. There are two unsolved problems though:
Gin's router decodes percent encoded characters (%2F) so if the original URL had such encoded parts, it would wrongly end up decoded and not match the original url that you wanted to extract. See the corresponding Github issue: https://github.com/gin-gonic/gin/issues/2047
You would only get the scheme+host+path part of the URL in your parameter; the query string would still be separate unless you also encode that. E.g. /posts/search/http://google.com/post/1?foo=bar would give you a "url" param of "/http://google.com/post/1"
As seen in the example above, catch-all parameters in Gin also (wrongly) always contain a slash at the beginning of the string.
I would recommend you pass the URL as an encoded query string instead; this will result in a lot less headache. Otherwise, I'd recommend looking for a different router or framework that is less restrictive, because I don't think Gin will resolve these issues anytime soon - they have been open for years.
I am trying to solve an email domain co-existence problem with Exchange Online. Basically, I need it so that when a message is sent to one tenant (domain.com) and forwarded to another tenant (newdomain.com), the To and/or CC headers are replaced with the endpoint (newdomain.com) email addresses before the message is delivered to its final destination.
For Example:
1) A Gmail (or any) user sends an email to sally.sue@domain.com; MX is looked up for that domain, and it is delivered to the Office 365 tenant for domain.com
2) That same Office 365 tenant is set to forward emails to sally.sue@newdomain.com (a different tenant)
3) When the message arrives for Sally Sue at newdomain.com and she hits "Reply All", the original sender AND her old address (sally.sue@domain.com) are added to the To: line in the email.
The way to fix that is to use Header Replacement with Proofpoint, which as mentioned below works on a single-user basis. The entire question below is me trying to get it to work using regex (as that's the only solution) for a large number of users.
I need to convert the following users email address:
username@domain.com to username@newdomain.com
This has to be done using Proofpoint, which is a cloud-hosted MTA. They have been able to provide some sort of an answer, but it's not working.
Proofpoint support has suggested using this:
Header Name : To
Find Value : domain\.com$
Replace : newdomain\.com$ or just newdomain.com
Neither of the above work. In both cases the values are just completely ignored.
This seems to find the values:
Header Name : To
Find Value : \b[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b
Replace : $1@fake.com
But the above simply replaces the To: line (in the email) with the literal string $1@fake.com
I would also need to be able to match lowercase letters and numbers in email addresses; I believe the above example only finds capitals.
I need it do the following:
Header Name : To
Find Value : \b[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b (find the user's email address and domain)
Replace : user.name@newdomain.com
This is for a large number of users so there is no way to manually update or create separate rules for each user.
If I do create an individual rule, then it works as expected, but as stated that requires manually typing out each user's To: address and their new desired To: address.
This solution here almost worked: Regex to replace email address domains?
I have a couple of observations from general experience, although I have not worked with Office365 specifically.
First, a regex used for replacement usually needs to have a "capture group". This is often expressed with parentheses, as in:
match : \b([A-Z0-9._%-]+)@domain\.com$
replacement : $1@newdomain.com
The idea is that the $1 in the replacement pattern is replaced with whatever was found within the () in the matching pattern.
Note that some regex engines use a different symbol for the replacement, so it might be \1@newdomain.com or some such. Note also that some regex engines need the parentheses escaped, so the matching pattern might be something like \b\([A-Z0-9._%-]+\)@domain\.com$
Second, if you want to include - inside a "character class" set (that is, inside square brackets []), then the - should be first; otherwise it's ambiguous because - is also used for a range of characters. The regex engine in question might not care, but I suggest writing your matching pattern as:
\b([-A-Z0-9._%]+)@domain\.com$
This way, the first - is unambiguously itself, because there is nothing before it to indicate the start of a range.
Third, for lowercase letters, it's easiest to just expand your character class set to include them, like so:
[-A-Za-z0-9._%]
I am following each step to create a signature for my URL requests to Amazon (or at least that's what I think), but it doesn't work.
I am trying to sign an example from Amazon's page (http://docs.aws.amazon.com/AWSECommerceService/latest/DG/rest-signature.html).
I have downloaded the s3-sigtester, a JavaScript file that creates the signatures. The string that I am signing is:
GET \necs.amazonaws.co.uk \n/onca/xml \nAWSAccessKeyId=AKIAJOCH6NNDJFTB4LYA&Actor=Johnny%20Depp&AssociateTag=memagio-21&Operation=ItemSearch&ResponseGroup=ItemAttributes%2COffers%2CImages%2CReviews%2CVariations&SearchIndex=DVD&Service=AWSECommerceService&Sort=salesrank&Timestamp=2014-10-19T21%3A21%3A55Z&Version=2009-01-01
The string above is the result from the sigtester. I am feeding it in hex. I get a signature, and then I try to access the following URL in order to get the XML values:
http://ecs.amazonaws.co.uk/onca/xml?AWSAccessKeyId=AKIAJOCH6NNDJFTB4LYA
&Actor=Johnny%20Depp&AssociateTag=memagio-21&Operation=ItemSe
arch&ResponseGroup=ItemAttributes%2COffers%2CImages%2CReviews%2CV
ariations&SearchIndex=DVD&Service=AWSECommerceService&Signature=vZK%2BhDqtcV1CoTf6%2FN1ohR3Da5M%3D&Sort=salesrank&Ti
mestamp=2014-10-19T21%3A21%3A55Z&Version=2009-01-01
The AWSAccessKeyId and Signature values come from the AWS keys that I have created. However, I get an error that the signatures do not match. I think I am following all the steps and I really don't know what's going on. Thanks.
The string that I am signing is:
GET \necs.amazonaws.co.uk \n/onca/xml \nAWSAccessKeyId=AKIAJOCH6NNDJFTB4LYA&Actor=Johnn
I assume \n denotes a newline character (Unicode U+000A). The problem is that there should not be spaces before the newlines - it needs to be GET\necs.amazonaws.co.uk\n...
Hi, this may apply to platforms/wikis outside of XWiki, but I am trying to embed a file by doing the following:
[[myfile>>file://C:/users/myfile.txt]]
where clicking on the newly created link does nothing.
I have tried a backslashed file path too, with no difference, and three slashes in front of "file:".
This should be pretty straightforward...
There should be three slashes in a URI like file:///C:/.
After the "protocol" part, the file URI scheme takes first a host name (which can be omitted in your case, because you are trying to access a local resource), then the path. Between host and path there is a slash. (This holds for other URI schemes, as well...)
The slash has to remain, even if the host part is omitted.
I have written a C++ program that allows URLs to be posted on YouTube. It works by taking the URL as input, either typed into the program or from direct input, and then replacing every '/' and '.' in the string with '*'. The modified string is then put on your clipboard (this is solely for Windows users).
Of course, before I can even call the program usable, the transformation has to be reversible: I need to know where '.' and '/' are used in URLs. I have looked at this article: http://en.wikipedia.org/wiki/Uniform_Resource_Locator, and know that '.' is used when dealing with the "master website" (in the case of this URL, "en.wikipedia.org") and that '/' is used afterwards. But I have been to other websites, such as http://msdn.microsoft.com/en-us/library/windows/desktop/ms649048%28v=vs.85%29.aspx, where this simply isn't the case (it even replaced '(' and ')' with "%28" and "%29", respectively!).
I also seem to have requested a .aspx file, whatever that is. Also, there is a '.' inside the parentheses in that URL. I have even tried to read regular expressions for URLs (I don't quite fully understand those yet...). Could someone tell me (or link me to) the rules regarding the use of '.' and '/' in URLs?
Can you explain why you are doing this convoluted thing? What are you trying to achieve? It may be that you don't need to know as much as you think, once you answer that question.
In the meantime, here is some information. A URL really comprises a number of sections:
http: - the "scheme" or protocol used to access the resource. "HTTP", "HTTPS",
"FTP", etc are all examples of a scheme. There are many others
// - separates the protocol from the host (server) address
myserver.org - the host. The host name is looked up against a DNS (Domain Name System)
service and resolved to an IP address - the "phone number" of the machine
which can serve up the resource (like "98.139.183.24" for www.yahoo.com)
www.myserver.org - the host with a prefix. Sometimes the same domain (`myserver.org`)
connects multiple servers (or ports) and you can be sent straight to the
right server with the prefix (mail., www., ftp., ... up to the
administrators of the domain). Conventionally, a server that serves content
intended for viewing with a browser has a `www.` prefix, but there's no rule
that says this must be the case.
:8080/ - sometimes, you see a colon followed by up to five digits after the domain.
this indicates the PORT on the server where you are accessing data
some servers allow certain specific services on just a particular port
they might have a "public access" website on port 80, and another one on 8080
the https:// protocol defaults to port 443, there are ports for telnet, ftp,
etc. Add these things only if you REALLY know what you are doing.
/the/pa.th/ this is the path relative to DOCUMENTROOT on the server where the
resource is located. `.` characters are legal here, just as they are in
directory structures.
file.html
file.php
file.asp
etc - usually the resource being fetched is a file. The file may have
any of a great number of extensions; some of these indicate to the server that
instead of sending the file straight to the requester,
it has to execute a program or other instructions in this file,
and send the result of that
Examples of extensions that indicate "active" pages include
(this is not nearly exhaustive - just "for instance"):
.php = contains a php program
.py = contains a python program
.js = contains a javascript program
(usually called from inside an .htm or .html)
.asp = "active server page" associated with a
Microsoft Internet Information Server
?something=value&somethingElse=%23othervalue%23
parameters that are passed to the server can be shown in the URL.
This can be used to pass parameters, entries in a form, etc.
Any character might be passed here - including '.', '&', '/', ...
But you can't just write those characters in your string...
Now comes the fun part.
URLs cannot contain certain characters (quite a few, actually). In order to get around this, there exists a mechanism called "escaping" a character. Typically this means replacing the character with its hexadecimal value, prefixed with a % sign. Thus you frequently see a space character represented as %20, for example. You can find handy lists of these codes online.
There are many functions available for converting "illegal" characters in a URL automatically to a "legal" value.
To learn about exactly what is and isn't allowed, you really need to go back to the original specifications. See for example
http://www.ietf.org/rfc/rfc1738.txt
http://www.ietf.org/rfc/rfc2396.txt
http://www.ietf.org/rfc/rfc3986.txt
I list them in chronological order - the last one being the most recent.
But I repeat my question -- what are you really trying to do here, and why?