How to rewrite the URL in Nginx Regex and debug? - regex

I am trying to strip /api/v1 from the request URL and pass only the remainder upstream. For example, /api/v1/test/ should become /test, and /api/v1/test/ready should become /test/ready.
Here is what I have tried. My assumption is that ^(/api/v1) is captured as $1 and the rest of the URL is captured by (.*?) as $2.
So I simply rewrite to $2 and break. However, it is not working, and I am unsure what I am doing wrong. I tried to debug it but did not get anywhere.
location /api/v1 {
rewrite ^(/api/v1)(.*?) $2; break;
# tried return 200 $2;
# but this will never hit since it is rewriting
# the rules will be evaluated again.
# I guess break will not allow it to re run again for the rules.
include uwsgi_params;
uwsgi_pass 10.0.2.15:3031;
}
I tried to debug it, but I couldn't: the request never reached location /. For example:
location / {
return 200 $request_uri;
}

Finally, I spotted the problem. Here is my solution.
The most direct way to see what was going wrong was to enable the rewrite log. rewrite_log writes at the notice level, so error_log has to be set to notice as well:
error_log /var/log/nginx/error.log notice;
rewrite_log on;
Here is the location context.
location /api/v1 {
# uncomment the below two lines to see the redirection.
# the re-direction happens from /api/v1/ to /
# error_log /var/log/nginx/error.log notice;
# rewrite_log on;
rewrite ^(/api/v1)(.*)/$ https://$host$2 break;
}
location / {
uwsgi_pass unix:///tmp/sb_web_wsgi.sock;
include uwsgi_params;
}
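One likely reason the first attempt passed nothing on: a lazy (.*?) with nothing after it is satisfied by zero characters, so $2 ends up empty. The sketch below uses Python's re module as a stand-in for nginx's PCRE engine (an assumption; the two agree on this behavior). capture_rest is a hypothetical helper, not part of nginx.

```python
import re

def capture_rest(pattern, uri):
    """Return the second capture group, or None when the pattern does not match."""
    m = re.search(pattern, uri)
    return m.group(2) if m else None

uri = "/api/v1/test/ready"

# Lazy group with nothing after it: it stops as early as possible, at zero characters.
lazy = capture_rest(r"^(/api/v1)(.*?)", uri)

# Greedy group: it takes the rest of the URI.
greedy = capture_rest(r"^(/api/v1)(.*)", uri)

# Anchoring the lazy group with $ also forces it to cover the rest.
anchored = capture_rest(r"^(/api/v1)(.*?)$", uri)

# The pattern from the solution requires a trailing slash, so this URI does not match.
trailing = capture_rest(r"^(/api/v1)(.*)/$", uri)

print(repr(lazy), repr(greedy), repr(anchored), repr(trailing))
```

Here lazy is an empty string, greedy and anchored are both /test/ready, and trailing is None. The pattern ending in /$ only matches URIs that end in a slash, such as /api/v1/test/.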

Related

Nginx internal location ignores django completely and allows free access instead

I want to have a private media folder on my Django website, accessible only to logged-in users. I learned that I should handle authentication on the Django side and file serving on the nginx side. However, following the internal location config examples, I cannot make it work. Nginx ignores Django completely (only in the internal location case). Even if the URL is not defined in my urls.py, listing it as an internal location in nginx leaves it freely accessible to everybody.
I am posting my nginx configuration in hope that someone can find a mistake in it.
My expectation is that everything in /internal/ folder will not be accessible to anonymous users and it will only be accessible by the django application through X-Accel-Redirect header. Right now if I go to /internal/test.png in an incognito window it will show me the picture.
I am not posting my django code for now, since it is ignored anyway by nginx, so it must be the nginx config problem.
server {
server_name XXX.XX.XX.XXX example.com www.example.com;
location = /favicon.ico {
access_log off;
log_not_found off;
alias /home/user/myproject/static/favicon4.ico;
}
location /static/ {
root /home/user/myproject;
}
location /media/ {
root /home/user/myproject;
}
location / {
include proxy_params;
proxy_pass http://unix:/run/gunicorn.sock;
}
location /internal/ {
internal;
root /home/user/myproject;
}
root /home/user/myproject;
location ~* \.(jpg|jpeg|png|webp|ico|gif)$ {
expires 30d;
}
location ~* \.(css|js|pdf)$ {
expires 1d;
}
client_max_body_size 10M;
# below in this server block is only my Certbot stuff
}
P.S. I swapped identifiable data to X characters and basic names.
I had 2 more problems in this config and I will show everything I did to make it work. The original problem why nginx was ignoring django was in how nginx chooses which location block to use, as suggested by Richard Smith.
From nginx.org we can read:
To find location matching a given request, nginx first checks locations defined using the prefix strings (prefix locations). Among them, the location with the longest matching prefix is selected and remembered. Then regular expressions are checked, in the order of their appearance in the configuration file. The search of regular expressions terminates on the first match, and the corresponding configuration is used. If no match with a regular expression is found then the configuration of the prefix location remembered earlier is used.
And also:
If the longest matching prefix location has the “^~” modifier then regular expressions are not checked.
So a matching regular expression location takes precedence over a prefix location. Putting the ^~ modifier before the prefix makes nginx use that prefix location and skip the regular expressions.
I changed the location /internal/ { line to location ^~ /internal/ {. After that I got 404 errors no matter how I tried to access the files, but at least I knew nginx was now using this location.
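The selection rules quoted above can be sketched as a small model. This is a simplified illustration in Python, not nginx's actual implementation; select_location and the tuple layout are hypothetical names chosen for the example, and the location list mirrors the config from the question.

```python
import re

def select_location(locations, uri):
    """Pick a location the way the quoted rules describe.

    Each location is (modifier, pattern) with modifier "", "=", "^~", "~" or "~*".
    """
    # 1. An exact "=" match wins immediately.
    for mod, pat in locations:
        if mod == "=" and uri == pat:
            return (mod, pat)
    # 2. Remember the longest matching prefix location.
    best = None
    for mod, pat in locations:
        if mod in ("", "^~") and uri.startswith(pat):
            if best is None or len(pat) > len(best[1]):
                best = (mod, pat)
    # 3. "^~" on the longest prefix suppresses the regex pass.
    if best is not None and best[0] == "^~":
        return best
    # 4. Regexes are tried in configuration order; the first match wins.
    for mod, pat in locations:
        if mod in ("~", "~*"):
            flags = re.IGNORECASE if mod == "~*" else 0
            if re.search(pat, uri, flags):
                return (mod, pat)
    # 5. Otherwise fall back to the remembered prefix.
    return best

original = [
    ("=", "/favicon.ico"),
    ("", "/static/"),
    ("", "/media/"),
    ("", "/"),
    ("", "/internal/"),
    ("~*", r"\.(jpg|jpeg|png|webp|ico|gif)$"),
    ("~*", r"\.(css|js|pdf)$"),
]
# Same list, but with the "^~" modifier on /internal/.
fixed = [("^~", p) if p == "/internal/" else (m, p) for m, p in original]

print(select_location(original, "/internal/test.png"))
print(select_location(fixed, "/internal/test.png"))
```

With the original list the image regex is selected, so the internal directive is never consulted. With ^~ on /internal/ the prefix location is selected instead.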
The 2nd mistake was thinking that I can get away with using the same url as the folder name, or in other words, that I can put in my urls.py
path('internal/<path>', views.internal_media, name='internal_media')
together with
location ^~ /internal/ {
internal;
root /home/user/myproject;
}
in my nginx config.
I can't. The url must be different, because otherwise the url doesn't lead to django urls.py - it still leads to /internal/ location through nginx (again, due to how nginx chooses locations).
I changed my urls.py line to point to private url instead:
path('private/<path>', views.internal_media, name='internal_media')
and in the views.py file I redirect to /internal/:
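from django.core.exceptions import PermissionDenied
from django.http import HttpResponse

def internal_media(request, path):
    # Only members of the 'team-special' group may fetch internal files.
    if request.user.groups.filter(name='team-special').exists():
        response = HttpResponse()
        # nginx serves the file from its internal location when it sees this header.
        response['X-Accel-Redirect'] = '/internal/' + path
        del response['Content-Type']  # without this your images will open as text
        return response
    else:
        raise PermissionDenied()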
Aaaand this still didn't work. 404 errors every time. The 3rd mistake was forgetting about the combo of those two:
location / {
include proxy_params;
proxy_pass http://unix:/run/gunicorn.sock;
}
location ~* \.(jpg|jpeg|png|webp|ico|gif)$ {
expires 30d;
}
Now if I went to the url /private/test.jpg nginx didn't let me go to django, because location / is lower in priority than regular expressions, so location ~* took precedence and I never got to django. I noticed it by accident after a lot of time being frustrated, when I put the url incorrectly in incognito mode. When I went to /private/test.jp now I got a 403 forbidden error instead of 404.
It started working immediately when I commented out this.
location ~* \.(jpg|jpeg|png|webp|ico|gif)$ {
expires 30d;
}
location ~* \.(css|js|pdf)$ {
expires 1d;
}
So now internal files worked nicely, but I didn't have caching...
To fix that, I modified my /static/ and /media/ locations, but maybe I won't go into that here, since it is a different topic. I'll just post my full nginx config that works :)
Well, what you might want to also know is that:
~* tells nginx that we are writing a regular expression that is case insensitive
~ would tell nginx that we were writing a regular expression that is case sensitive
server {
server_name XXX.XX.XX.XXX example.com www.example.com;
location = /favicon.ico {
access_log off;
log_not_found off;
alias /home/user/myproject/static/favicon4.ico;
expires 30d;
}
location /static/ {
root /home/user/myproject;
expires 30d;
}
location /media/ {
root /home/user/myproject;
expires 30d;
}
location ~* \/(static|media)\/\S*\.(css|js|pdf) {
root /home/user/myproject;
expires 1d;
}
location ^~ /internal/ {
root /home/user/myproject;
internal;
expires 1d;
}
location / {
include proxy_params;
proxy_pass http://unix:/run/gunicorn.sock;
}
client_max_body_size 10M;
# certbot stuff
}
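The difference between the ~ and ~* modifiers mentioned above comes down to one regex flag. A minimal sketch, using Python's re module as a stand-in for nginx's regex engine; regex_location_matches is a hypothetical helper and the file name is made up for illustration.

```python
import re

IMAGE_PATTERN = r"\.(jpg|jpeg|png|webp|ico|gif)$"

def regex_location_matches(modifier, pattern, uri):
    """Mimic '~' (case sensitive) and '~*' (case insensitive) regex locations."""
    flags = re.IGNORECASE if modifier == "~*" else 0
    return re.search(pattern, uri, flags) is not None

sensitive = regex_location_matches("~", IMAGE_PATTERN, "/media/PHOTO.JPG")
insensitive = regex_location_matches("~*", IMAGE_PATTERN, "/media/PHOTO.JPG")
print(sensitive, insensitive)
```

Only the ~* variant matches the upper-case extension.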

How to redirect nginx location to different rule?

I'm using NGINX and want every request under the FusionCharts subdirectory to be served from a special place: any URL of the form [ROOT_URL]/FusionCharts/ should be handled by # Rule 3 below.
However, I have an existing nginx rule stating that all static content should go to # Rule 2.
Nginx configuration:
server {
listen 80;
server_name domain2.com www.domain2.com;
access_log logs/domain2.access.log main;
# Rule 1
location ~ ^/(images|javascript|js|css|flash|media|static)/
{
root /var/www/virtual/big.server.com/htdocs;
expires 30d;
}
# Rule 2
location ~* \.(js|css|png|jpg|jpeg|gif|jqGrid|images|common|ico|map|woff|woff2|ttf|html)$ {
root /home/rcp/dev/public/others;
expires 10y;
}
# Rule 3
location ~ ^/(FusionCharts)/ {
root /home/rcp/dev/public/charts;
expires 10y;
}
location / {
proxy_pass http://127.0.0.1:8080;
}
}
The Tested URL:
http://domain2.com/FusionCharts/index.html
This request falls into # Rule 2. How do I modify the rules so that it lands in # Rule 3 instead?
Regex locations are checked from first to last, and the first match found is used to process the request. You can either swap the second and third locations or use the location ^~ { ... } syntax (check the documentation for details):
location ^~ /FusionCharts/ {
root /home/rcp/dev/public/charts/;
expires 10y;
}
Please note that with the location above, the index.html file for a /FusionCharts/index.html request will be looked up under the /home/rcp/dev/public/charts/FusionCharts directory. If that isn't the desired behavior, use the alias /home/rcp/dev/public/charts/; directive instead of root.
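The difference between root and alias can be sketched as two path computations. This is an illustrative Python model only; resolve_with_root and resolve_with_alias are hypothetical helpers, not nginx functions.

```python
def resolve_with_root(root, uri):
    """root: the full request URI is appended to the root directory."""
    # The trailing slash is trimmed here to avoid a doubled slash in the printed path.
    return root.rstrip("/") + uri

def resolve_with_alias(prefix, alias, uri):
    """alias: the matched location prefix is replaced by the alias path."""
    return alias + uri[len(prefix):]

uri = "/FusionCharts/index.html"
with_root = resolve_with_root("/home/rcp/dev/public/charts/", uri)
with_alias = resolve_with_alias("/FusionCharts/", "/home/rcp/dev/public/charts/", uri)
print(with_root)
print(with_alias)
```

With root the FusionCharts path segment stays in the file path; with alias it is dropped.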

Nginx regex to exclude certain paths

I am trying to exclude some paths from my nginx proxy_pass and send everything else to it.
That is, I don't want to proxy any URL that starts with 'tiny' or 'static', but I want everything else to go to my proxy_pass location.
I am using the following regex to achieve this:
~ ^((?!tiny|static).)*$
But I always get 404 error.
If I navigate to following url in browser
localhost:8080/xyz
I want it to go to
localhost:8000/api/tiny/records/xyz
Can someone please help me pinpoint the issue?
Here is my full nginx conf file:
server {
listen 8080;
server_name localhost;
location ~ ^((?!tiny|static).)*$ {
proxy_pass http://localhost:8000/api/tiny/records/$1;
}
location / {
proxy_pass http://localhost:8000;
}
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root html;
}
}
Thanks a lot.
You are missing a / and have the * in the wrong place. The regular expression should be:
^(/(?!tiny|static).*)$
But you do not need to use a regular expression with a negative lookahead assertion. Instead, place a normal regular expression on the other location block.
For example:
location / {
proxy_pass http://localhost:8000/api/tiny/records/;
}
location ~ ^/(tiny|static) {
proxy_pass http://localhost:8000;
}
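To see what the two expressions capture, here is a small check using Python's re module as a stand-in for nginx's PCRE engine (an assumption; both keep only the last iteration of a repeated group). first_group is a hypothetical helper.

```python
import re

original = re.compile(r"^((?!tiny|static).)*$")
corrected = re.compile(r"^(/(?!tiny|static).*)$")

def first_group(regex, uri):
    """Return capture group 1, or None when the URI does not match."""
    m = regex.search(uri)
    return m.group(1) if m else None

for uri in ("/xyz", "/tiny/abc", "/static/app.css", "/xyz/tiny"):
    print(uri, first_group(original, uri), first_group(corrected, uri))
```

For /xyz the original expression matches but its group holds only the final character, z, because the group sits inside the repetition. The corrected expression captures the whole path. The original also rejects any URI that contains tiny or static anywhere, such as /xyz/tiny, while the corrected one only rejects them directly after the leading slash.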

sed delete code block from file

I want to remove the server {...} code block from the default nginx configuration file.
sudo sed -i '/(\s*#?)server \s*{(?:[\s\S]+)\1}/ d' /opt/nginx/conf/nginx.conf
produces sed: -e expression #1, char 33: Invalid back reference
However using tools like Rubular the match works just fine. Essentially what I need to do is match the code block based on matched indentation otherwise too much will be deleted.
You can test this yourself in Rubular using the default nginx config as a test string:
#user nobody;
#Defines which Linux system user will own and run the Nginx server
worker_processes 1;
#Referes to single threaded process. Generally set to be equal to the number of CPUs or cores.
#error_log logs/error.log; #error_log logs/error.log notice;
#Specifies the file where server logs.
#pid logs/nginx.pid;
#nginx will write its master process ID(PID).
events {
worker_connections 1024;
# worker_processes and worker_connections allows you to calculate maxclients value:
# max_clients = worker_processes * worker_connections
}
http {
include mime.types;
# anything written in /opt/nginx/conf/mime.types is interpreted as if written inside the http { } block
default_type application/octet-stream;
#
#log_format main '$remote_addr - $remote_user [$time_local] "$request" '
# '$status $body_bytes_sent "$http_referer" '
# '"$http_user_agent" "$http_x_forwarded_for"';
#access_log logs/access.log main;
sendfile on;
# If serving locally stored static files, sendfile is essential to speed up the server,
# But if using as reverse proxy one can deactivate it
#tcp_nopush on;
# works opposite to tcp_nodelay. Instead of optimizing delays, it optimizes the amount of data sent at once.
#keepalive_timeout 0;
keepalive_timeout 65;
# timeout during which a keep-alive client connection will stay open.
#gzip on;
# tells the server to use on-the-fly gzip compression.
server {
# You would want to make a separate file with its own server block for each virtual domain
# on your server and then include them.
listen 80;
#tells Nginx the hostname and the TCP port where it should listen for HTTP connections.
# listen 80; is equivalent to listen *:80;
server_name localhost;
# lets you doname-based virtual hosting
#charset koi8-r;
#access_log logs/host.access.log main;
location / {
#The location setting lets you configure how nginx responds to requests for resources within the server.
root html;
index index.html index.htm;
}
#error_page 404 /404.html;
# redirect server error pages to the static page /50x.html
#
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root html;
}
# proxy the PHP scripts to Apache listening on 127.0.0.1:80
#
#location ~ \.php$ {
# proxy_pass http://127.0.0.1;
#}
# pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
#
#location ~ \.php$ {
# root html;
# fastcgi_pass 127.0.0.1:9000;
# fastcgi_index index.php;
# fastcgi_param SCRIPT_FILENAME /scripts$fastcgi_script_name;
# include fastcgi_params;
#}
# deny access to .htaccess files, if Apache's document root
# concurs with nginx's one
#
#location ~ /\.ht {
# deny all;
#}
}
# another virtual host using mix of IP-, name-, and port-based configuration
#
#server {
# listen 8000;
# listen somename:8080;
# server_name somename alias another.alias;
# location / {
# root html;
# index index.html index.htm;
# }
#}
# HTTPS server
#
#server {
# listen 443 ssl;
# server_name localhost;
# ssl_certificate cert.pem;
# ssl_certificate_key cert.key;
# ssl_session_cache shared:SSL:1m;
# ssl_session_timeout 5m;
# ssl_ciphers HIGH:!aNULL:!MD5;
# ssl_prefer_server_ciphers on;
# location / {
# root html;
# index index.html index.htm;
# }
#}
}
sed doesn't allow regex to span multiple lines, so you would need to use multiple commands to achieve what you want, e.g. something like:
'/(\s*#?)server\s*\{/,/\1\}/d'
But unfortunately sed doesn't allow back references from previous regexes, so the above doesn't work.
A couple of things here.
First, by default, sed uses BRE as its regular expression format. You either need to write your regexes in BRE, or you need to use an option for sed that tells it to interpret ERE. The option will depend on your platform, which you haven't shared as a tag, so read your man page for sed to see what to use.
Second, in order to process text over multiple lines, you need to have those multiple lines in your edit buffer. You do this by appending them to your hold buffer as you step through the file, then processing them all at once. This is highly advanced sed usage, and more difficult than most people can deal with. Even if we can put together something that works, it will read like line noise and be virtually unsupportable after-the-fact.
I'd suggest using awk instead.
#!/usr/bin/awk -f
# Start of an uncommented server block: begin counting braces and drop the line.
/^[[:space:]]*server[[:space:]]*\{/ { n=1; next }
# Inside the block, count an opening brace at the end of an uncommented line,
n>0 && /^[[:space:]]*([^#[:space:]].*)?\{[[:space:]]*$/ { n++ }
# and a closing brace at the start of a line, dropping that line too.
n>0 && /^[[:space:]]*\}/ { n--; next }
# While still within the block, skip to the next line.
n>0 { next }
# Shorthand for "print the current line".
1
Note that the conditions contain n>0 rather than just n because awk considers any non-zero value to evaluate to "true".
Note also that this will only work on files that contain at most one curly bracket per line. I'm not sure whether nginx requires this, but if it permits closing a section within a section using } }, beware that the script above will not parse that correctly.
YMMV. Not tested on animals. May contain nuts.
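The same brace-counting idea can be sketched in Python, which makes it easy to count several brackets on one line. This is an illustration under stated assumptions, not a full nginx parser: strip_server_blocks is a hypothetical name, and everything after a # is treated as a comment, which would be wrong for a # inside a quoted string or a regex.

```python
import re

def strip_server_blocks(text):
    """Drop uncommented `server { ... }` blocks by counting braces."""
    kept = []
    depth = 0  # 0 means we are outside any server block
    for line in text.splitlines():
        code = line.split("#", 1)[0]  # ignore anything after a comment marker
        if depth == 0:
            if re.match(r"\s*server\s*\{", code):
                # Enter the block; count every brace on the opening line.
                depth = code.count("{") - code.count("}")
                continue
            kept.append(line)
        else:
            # Inside the block: track nesting and drop the line.
            depth += code.count("{") - code.count("}")
    return "\n".join(kept)

sample = "\n".join([
    "http {",
    "    gzip on;",
    "    server {",
    "        listen 80;",
    "        location / {",
    "            root html;",
    "        }",
    "    }",
    "    #server {",
    "    #}",
    "}",
])
result = strip_server_blocks(sample)
print(result)
```

The commented-out #server block and the enclosing http block are kept; only the live server block is removed.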

Nginx: How to rewrite all URLs except of images?

I'm new to nginx and I want to migrate my website from Apache to nginx. My site has URLs like:
www.mywebsite.com/category/product/balloon
www.mywebsite.com/category/product/shoe
www.mywebsite.com/information/help
etc.
Since I'm using PHP I need to rewrite all URLs to index.php except if it's an image OR if it's a "fake-request". My nginx.config so far:
#block fake requests
location ~* \.(aspx|jsp|cgi)$ {
return 410;
}
#rewrite all requests if it's not a image
location / {
root html;
index index.php 500.html;
if (!-f $request_filename) {
rewrite ^(.*)$ /index.php?q=$1 last;
break;
}
}
error_page 404 /index.php;
# serve static files directly
location ~* ^.+.(jpg|jpeg|gif|css|png|js|ico)$ {
access_log off;
}
# redirect server error pages to the static page /50x.html
#
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root html;
}
# pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
#
location ~ \.php$ {
root html;
fastcgi_pass 127.0.0.1:9000;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME E:/test2/html/$fastcgi_script_name;
include fastcgi_params;
}
This configuration does not work because:
1. It doesn't block fake requests to .php files, and I can't add .php to (aspx|jsp|cgi)$
2. It doesn't rewrite the URL if the file exists, which is wrong: it should only serve static files directly if the file type is one of those defined in (jpg|jpeg|gif|css|png|js|ico)$
How can I solve these problems? I really appreciate every answer, clarification or feedback you can give me.
Thanks
Mike
You need to configure the HttpRewriteModule. This module makes it possible to change URI using regular expressions (PCRE), and to redirect and select configuration depending on variables.
If the directives of this module are given at the server level, then they are carried out before the location of the request is determined. If in that selected location there are further rewrite directives, then they also are carried out. If the URI changed as a result of the execution of directives inside location, then location is again determined for the new URI.