sed delete code block from file - regex

I want to remove the server {...} code block from the default nginx configuration file.
sudo sed -i '/(\s*#?)server \s*{(?:[\s\S]+)\1}/ d' /opt/nginx/conf/nginx.conf
produces sed: -e expression #1, char 33: Invalid back reference
However, in tools like Rubular the match works just fine. Essentially, I need to match the code block based on its indentation; otherwise too much will be deleted.
You can test this yourself in Rubular using the default nginx config as a test string:
#user nobody;
#Defines which Linux system user will own and run the Nginx server
worker_processes 1;
#Refers to single threaded process. Generally set to be equal to the number of CPUs or cores.
#error_log logs/error.log; #error_log logs/error.log notice;
#Specifies the file where server logs.
#pid logs/nginx.pid;
#nginx will write its master process ID(PID).
events {
worker_connections 1024;
# worker_processes and worker_connections allows you to calculate maxclients value:
# max_clients = worker_processes * worker_connections
}
http {
include mime.types;
# anything written in /opt/nginx/conf/mime.types is interpreted as if written inside the http { } block
default_type application/octet-stream;
#
#log_format main '$remote_addr - $remote_user [$time_local] "$request" '
# '$status $body_bytes_sent "$http_referer" '
# '"$http_user_agent" "$http_x_forwarded_for"';
#access_log logs/access.log main;
sendfile on;
# If serving locally stored static files, sendfile is essential to speed up the server,
# But if using as reverse proxy one can deactivate it
#tcp_nopush on;
# works opposite to tcp_nodelay. Instead of optimizing delays, it optimizes the amount of data sent at once.
#keepalive_timeout 0;
keepalive_timeout 65;
# timeout during which a keep-alive client connection will stay open.
#gzip on;
# tells the server to use on-the-fly gzip compression.
server {
# You would want to make a separate file with its own server block for each virtual domain
# on your server and then include them.
listen 80;
#tells Nginx the hostname and the TCP port where it should listen for HTTP connections.
# listen 80; is equivalent to listen *:80;
server_name localhost;
# lets you do name-based virtual hosting
#charset koi8-r;
#access_log logs/host.access.log main;
location / {
#The location setting lets you configure how nginx responds to requests for resources within the server.
root html;
index index.html index.htm;
}
#error_page 404 /404.html;
# redirect server error pages to the static page /50x.html
#
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root html;
}
# proxy the PHP scripts to Apache listening on 127.0.0.1:80
#
#location ~ \.php$ {
# proxy_pass http://127.0.0.1;
#}
# pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
#
#location ~ \.php$ {
# root html;
# fastcgi_pass 127.0.0.1:9000;
# fastcgi_index index.php;
# fastcgi_param SCRIPT_FILENAME /scripts$fastcgi_script_name;
# include fastcgi_params;
#}
# deny access to .htaccess files, if Apache's document root
# concurs with nginx's one
#
#location ~ /\.ht {
# deny all;
#}
}
# another virtual host using mix of IP-, name-, and port-based configuration
#
#server {
# listen 8000;
# listen somename:8080;
# server_name somename alias another.alias;
# location / {
# root html;
# index index.html index.htm;
# }
#}
# HTTPS server
#
#server {
# listen 443 ssl;
# server_name localhost;
# ssl_certificate cert.pem;
# ssl_certificate_key cert.key;
# ssl_session_cache shared:SSL:1m;
# ssl_session_timeout 5m;
# ssl_ciphers HIGH:!aNULL:!MD5;
# ssl_prefer_server_ciphers on;
# location / {
# root html;
# index index.html index.htm;
# }
#}
}

sed matches one line at a time by default, so a single regex cannot span multiple lines. You would need a range between two addresses to achieve what you want, e.g. something like:
'/(\s*#?)server\s*\{/,/\1\}/d'
Unfortunately, a back reference in sed cannot refer to a group captured by a different regex, so the above doesn't work.

A couple of things here.
First, by default, sed uses BRE (basic regular expressions) as its regex format. You either need to write your regexes in BRE, or you need to pass the option that tells sed to interpret ERE. That option depends on your platform (commonly -E, or -r on older GNU sed), which you haven't shared as a tag, so read your man page for sed to see what to use. Note that neither format supports PCRE constructs such as (?:...), which is why a pattern that works in Rubular can fail in sed.
Second, in order to process text over multiple lines, you need to have those lines in the pattern space at the same time. You do this by appending them to the hold space as you step through the file, then processing them all at once. This is highly advanced sed usage, and more difficult than most people can deal with. Even if we can put together something that works, it will read like line noise and be virtually unsupportable after the fact.
I'd suggest using awk instead.
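#!/usr/bin/awk -f
# Pay attention to our "start of server" line and drop it,
/^[[:space:]]*server[[:space:]]*\{/ { n=1; next }
# skip commented-out lines inside the block so their brackets are not counted,
n>0 && /^[[:space:]]*#/ { next }
# increment the bracket counter when a nested block opens at the end of a line,
n>0 && /\{[[:space:]]*$/ { n++ }
# decrement the bracket counter when a block closes, dropping that line as well,
n>0 && /^[[:space:]]*\}/ { n--; next }
# and if we're still within the block, skip to the next line.
n>0 { next }
# Short-hand for "print the current line".
1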
Note that the conditions contain n>0 rather than just n because awk considers any non-zero value, including a negative one, to evaluate to "true".
Note also that this will only work on files that contain a single curly bracket per line. I'm not sure whether nginx requires this, but if it permits closing a section within a section using } }, beware that the script above will not parse that correctly.
YMMV. Not tested on animals. May contain nuts.
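For readers who would rather not use awk, the same bracket-counting idea can be sketched in Python. This is only an illustration under the same assumption as above (at most one curly bracket per line); the function name strip_server_blocks and the shortened sample config are made up for this example.

```python
import re

# Matches an uncommented "server {" line, with optional leading whitespace.
SERVER_START = re.compile(r"^\s*server\s*\{")

def strip_server_blocks(text: str) -> str:
    """Return text with every uncommented `server { ... }` block removed."""
    kept = []
    depth = 0  # 0 means we are outside any server block
    for line in text.splitlines():
        if depth == 0:
            if SERVER_START.match(line):
                depth = 1  # entered a server block; drop its opening line
            else:
                kept.append(line)
            continue
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # commented-out lines never change the depth
        # Assumption: at most one bracket per line, as in the stock config.
        if stripped.endswith("{"):
            depth += 1
        elif stripped.startswith("}"):
            depth -= 1
    return "\n".join(kept)

# Hypothetical, shortened stand-in for nginx.conf.
sample = """http {
sendfile on;
server {
listen 80;
location / {
root html;
}
#location /php {
#}
}
#server {
#}
}"""

print(strip_server_blocks(sample))
```

The commented-out #server blocks are left in place because the start pattern only matches uncommented lines; adjust the pattern if those should go too.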

Related

How to rewrite the URL in Nginx Regex and debug?

I am trying to remove /api/v1 from the request URL and pass on only the rest of the path. For example, /api/v1/test/ should be passed as simply /test, and /api/v1/test/ready as simply /test/ready.
Here is what I have tried. My thinking is that the first group, $1, captures ^(/api/v1) and the second group, $2, captures the rest of the path via (.*?).
So I simply pass $2 and break. However, it is not working, and I am unsure what I am doing wrong. I tried some debugging but got nowhere.
location /api/v1 {
rewrite ^(/api/v1)(.*?) $2; break;
# tried return 200 $2;
# but this will never hit since it is rewriting
# the rules will be evaluated again.
# I guess break will not allow it to re run again for the rules.
include uwsgi_params;
uwsgi_pass 10.0.2.15:3031;
}
I tried to debug it, but the request never reached location /, for example:
location / {
return 200 $request_uri;
}
Finally, I spotted the problem. Here is my solution.
The most direct way to see what was going wrong was to enable the rewrite logs:
error_log /var/log/nginx/error.log notice;
rewrite_log on;
Here is the location context.
location /api/v1 {
# uncomment the below two lines to see the redirection.
# the re-direction happens from /api/v1/ to /
# error_log /var/log/nginx/error.log notice;
# rewrite_log on;
rewrite ^(/api/v1)(.*)/$ https://$host$2 break;
}
location / {
uwsgi_pass unix:///tmp/sb_web_wsgi.sock;
include uwsgi_params;
}
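One likely reason the pattern from the question produced nothing useful is the lazy (.*?) with no anchor after it: a lazy group at the end of a pattern is satisfied by matching zero characters, so $2 ends up empty. The sketch below uses Python's re module as a stand-in for the PCRE engine nginx uses (the two agree on this behaviour); the helper name second_group is made up for illustration.

```python
import re

def second_group(pattern: str, uri: str):
    """Return what $2 would hold for this pattern and URI, or None if no match."""
    m = re.search(pattern, uri)
    return m.group(2) if m else None

lazy = r"^(/api/v1)(.*?)"    # pattern from the question: lazy, not anchored at the end
greedy = r"^(/api/v1)(.*)$"  # greedy and anchored, so the group takes the rest of the path

print(repr(second_group(lazy, "/api/v1/test/ready")))    # ''
print(repr(second_group(greedy, "/api/v1/test/ready")))  # '/test/ready'
```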

Nginx try_files not working with domain that appends trailing slash

I have a dockerised Django app where nginx uses proxy_pass to send requests to the Django backend. I am looking to pre-publish certain pages that don't change often so Django doesn't have to deal with them.
I am trying to use try_files to check whether that page exists locally and pass the request to Django if not.
Our URL structure requires that all URLs end with a forward slash, and we don't use file type suffixes, e.g. a page might be www.example.com/hello/. This means the $uri param in nginx in this instance is /hello/, and when try_files looks at that it expects a directory because of the trailing slash. If I have a directory with a list of files, how do I get try_files to look at them without rewriting the URL to remove the slash, which Django requires?
My nginx conf is below.
server {
listen 443 ssl http2 default_server;
listen [::]:443 ssl http2 default_server;
server_name example.com;
root /home/testuser/example;
location / {
try_files $uri $uri/ @site_root;
}
location @site_root {
proxy_pass http://127.0.0.1:12345;
}
}
If I have a file "hello" at /home/testuser/example/hello and call https://www.example.com/hello/ how can I get it to load the file correctly?
P.S. the permissions on the static content folder and its contents are all 777 (for now to rule out permissions issues)
Cheers in advance!
You can point the URI /hello/ to a local file called hello or hello.html using try_files, but you must first extract the filename using a regular expression location. See this document for details.
The advantage of using .html is that you will not need to provide the Content-Type of the response.
For example, using hello.html:
root /path/to/root;
location / {
try_files $uri $uri/ @site_root;
}
location ~ ^(?<filename>/.+)/$ {
try_files $filename.html @site_root;
}
location @site_root {
proxy_pass ...;
}
If you prefer to store the local files without an extension, and they are all text/html, you will need to provide the Content-Type. See this document for details.
For example, using hello:
root /path/to/root;
location / {
try_files $uri $uri/ @site_root;
}
location ~ ^(?<filename>/.+)/$ {
default_type text/html;
try_files $filename @site_root;
}
location @site_root {
proxy_pass ...;
}
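To see which file the regular expression location ends up checking, here is a small Python sketch of the capture step. It is an approximation for illustration only: Python spells the named group (?P<filename>...) where nginx uses (?<filename>...), the helper candidate_path is a made-up name, and the root is the one from the question.

```python
import re
from typing import Optional

# Same pattern as the regex location, in Python's named-group syntax.
LOCATION = re.compile(r"^(?P<filename>/.+)/$")

def candidate_path(root: str, uri: str, suffix: str = ".html") -> Optional[str]:
    """Return the file try_files would look for, or None if the location does not match."""
    m = LOCATION.match(uri)
    if m is None:
        return None  # no trailing slash (or just "/"): another location handles it
    return root + m.group("filename") + suffix

root = "/home/testuser/example"
print(candidate_path(root, "/hello/"))             # /home/testuser/example/hello.html
print(candidate_path(root, "/hello/", suffix=""))  # /home/testuser/example/hello
print(candidate_path(root, "/hello"))              # None
```

The request URI itself is never rewritten, so when the file is missing the backend still receives /hello/ with its trailing slash.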
In my case using NextJS, leaving the final slash causes errors.
So here is the solution I found to make it work nicely:
root /path/to/static/files/directory; # no trailing slash
rewrite ^/(.*)/$ /$1 permanent; # automatically remove the trailing slash
location / {
try_files $uri $uri.html $uri/index.html /404.html;
# try the provided uri
# then try adding .html
# then try uri/index.html (ex: homepage)
# finally serve the 404 page
}

How to forward all paths that start with a specific location in nginx?

I want to forward all paths that start with /api/ (/api/* ??) to port 1000, but my current configuration either forwards only the path "/api/" itself with nothing after it (/api/login is not forwarded)
location /api/ {
proxy_pass http://localhost:1000/;
}
or it doesn't work at all
location ~ ^/api/(.*)$ {
proxy_pass http://localhost:1000/;
}
The server is configured as follows:
server {
listen 80;
keepalive_timeout 70;
server_name server_name;
location / {
root /var/www/html;
index index.html;
}
location /api/ {
proxy_pass http://localhost:1000/;
}
}
I would appreciate any help, Thank you!
Note that with:
location /api/ {
proxy_pass http://localhost:1000/;
}
If a request comes in for /api/foo, then your API server will see /foo.
If, on the other hand, you use the following (note there is no trailing slash in proxy_pass):
location /api/ {
proxy_pass http://localhost:1000;
}
Then for the same request, your API server will receive the request "as is": /api/foo.
So make sure you use the right approach (slash / no slash), which depends on how your API server handles URLs (if it is configured to handle /api/foo URLs, then you should not use a trailing slash in proxy_pass).
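The slash / no slash rule can be modelled in a few lines of Python. This is a simplified sketch of the documented behaviour for prefix locations only; upstream_path is a hypothetical helper, and its second argument is the path part of the proxy_pass URL ("" when there is none, "/" for a trailing slash).

```python
def upstream_path(location_prefix: str, proxy_pass_uri: str, request_uri: str) -> str:
    """Return the path the upstream server would see for a prefix location."""
    if not request_uri.startswith(location_prefix):
        raise ValueError("request does not match this prefix location")
    if proxy_pass_uri == "":
        # proxy_pass without a URI part: the request path is passed unchanged.
        return request_uri
    # proxy_pass with a URI part: the matched prefix is replaced by that URI.
    return proxy_pass_uri + request_uri[len(location_prefix):]

print(upstream_path("/api/", "/", "/api/foo"))  # /foo
print(upstream_path("/api/", "", "/api/foo"))   # /api/foo
```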

nginx: root chosen based on path

I want the following.
http://some.site/person1/some/path should access /home/person1/some/path (and http://some.site/person1 accesses /home/person1/index.html) and http://some.site/person2/some/path should access /home/person2/some/path (and http://some.site/person2 accesses /home/person2/index.html). There will be many personXes. It's important to use a regular expression to tell nginx where to find everything.
I tried coming up with a set of location, root and rewrite directives that would work. The closest I came was this for my sites-available/website.
server {
listen 80 default_server;
listen [::]:80 default_server;
server_name _;
root /some/default/root;
# Add index.php to the list if you are using PHP
index index.html index.htm index.nginx-debian.html;
location / {
# First attempt to serve request as file, then
# as directory, then fall back to displaying a 404.
try_files $uri.html $uri $uri/ =404;
}
location /person1 {
root /home/person1;
rewrite ^/person1(.*)$ $1 break;
}
}
This does what I want with all paths except for ones of the form http://some.site/person1. In this case, nginx doesn't access /home/person1/index.html like I want. Instead, the regex returns an empty string which nginx doesn't like (I see complaints in the nginx/error.log).
When all the roots share a common parent directory in /home, you can try:
location ~* /person\d+ {
root /home;
}
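The reason this works is that root is simply prepended to the full request URI, so no rewrite is needed. A rough Python sketch of that mapping follows; file_for is a made-up helper, and re.IGNORECASE stands in for the case-insensitive ~* modifier.

```python
import re
from typing import Optional

# `location ~* /person\d+` is an unanchored, case-insensitive regex match.
PERSON = re.compile(r"/person\d+", re.IGNORECASE)

def file_for(uri: str, root: str = "/home") -> Optional[str]:
    """Return root + URI when the location matches, else None."""
    if PERSON.search(uri) is None:
        return None  # handled by some other location
    return root + uri

print(file_for("/person1/some/path"))  # /home/person1/some/path
print(file_for("/person2"))            # /home/person2
print(file_for("/about"))              # None
```

Because the pattern is unanchored it also matches URIs such as /x/person3/y; anchoring it with ^ would restrict it to paths that start with /personN.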

Nginx: How to rewrite all URLs except of images?

I'm new to nginx and I want to migrate my website from Apache to nginx. My site has URLs like:
www.mywebsite.com/category/product/balloon
www.mywebsite.com/category/product/shoe
www.mywebsite.com/information/help
etc.
Since I'm using PHP I need to rewrite all URLs to index.php except if it's an image OR if it's a "fake-request". My nginx.config so far:
#block fake requests
location ~* \.(aspx|jsp|cgi)$ {
return 410;
}
#rewrite all requests if it's not a image
location / {
root html;
index index.php 500.html;
if (!-f $request_filename) {
rewrite ^(.*)$ /index.php?q=$1 last;
break;
}
}
error_page 404 /index.php;
# serve static files directly
location ~* ^.+.(jpg|jpeg|gif|css|png|js|ico)$ {
access_log off;
}
# redirect server error pages to the static page /50x.html
#
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root html;
}
# pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
#
location ~ \.php$ {
root html;
fastcgi_pass 127.0.0.1:9000;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME E:/test2/html/$fastcgi_script_name;
include fastcgi_params;
}
This configuration does not work because:
1. It doesn't block fake requests to .php files, and I can't add .php to (aspx|jsp|cgi)$
2. It doesn't rewrite the URL if the file exists, which is wrong: it should only serve static files directly if the file type is one of those defined in (jpg|jpeg|gif|css|png|js|ico)$
How can I solve these problems? I really appreciate every answer, clarification or feedback you can give me.
Thanks
Mike
You need to configure the HttpRewriteModule. This module makes it possible to change URI using regular expressions (PCRE), and to redirect and select configuration depending on variables.
If the directives of this module are given at the server level, then they are carried out before the location of the request is determined. If in that selected location there are further rewrite directives, then they also are carried out. If the URI changed as a result of the execution of directives inside location, then location is again determined for the new URI.