I have hundreds, if not thousands, of URLs indexed in Google that contain the name of the product. My new e-commerce site now replaces whitespace with hyphens when it constructs the URL, and I need an .htaccess rule that automatically redirects the old URLs to the new ones by replacing the whitespace with hyphens.
The example URL I'm using is
detalle.php?titulo=Zapatillas%20Salomon%20Xr%20Mission&codigo=040-8800-072
but the number of whitespaces to be replaced can vary widely.
The last iteration of rules that I have tried is:
RewriteCond %{QUERY_STRING} ^(.*)[\s|%20](.*)&codigo=(.*)
RewriteRule detalle.php detalle.php?%1-%2&codigo=%3 [N=20]
In a tester I found online this only replaces the last whitespace and leaves the others unchanged; on my development server it doesn't even do that.
I have spent almost a day on this and am getting nowhere, even though according to the Apache documentation this should work.
Thanks in advance.
Edit:
The solution given by @anubhava
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+(.*?)(?:\+|%20|\s)+(.+?)\sHTTP [NC]
RewriteRule ^ /%1-%2 [L,NE,R=302]
worked as requested, but somehow broke the lines in my .htaccess that had previously been working perfectly (apart from the whitespace)
RewriteCond %{QUERY_STRING} ^titulo=(.*)&codigo=(.*)$
RewriteRule detalle.php http://otherdomain/%1--det--%2? [R=301,L]
this is to transform the URLs with parameters into "friendly" URLs.
Edit2:
There was some kind of problem in the development server because it was in a subdirectory, I tried it on the production server and everything worked fine so I'm accepting the answer.
I'm adding this edit in case someone else has a similar situation.
You can use this rule as the first rule in your root .htaccess to replace all spaces with hyphens.
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+(.*?)(?:\+|%20|\s)+(.+?)\sHTTP [NC]
RewriteRule ^ /%1-%2 [L,NE,R=302]
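A rough way to see what this rule does to the example URL from the question is to replay it outside Apache. The Python sketch below is only an illustration: it reuses the regular expression from the RewriteCond, while the helper name follow_redirects and the hand-built "GET ... HTTP/1.1" request line are assumptions made for the demo. Each pass stands in for one 302 round trip, and each round trip replaces a single run of spaces.

```python
import re

# Same pattern as the RewriteCond on %{THE_REQUEST}; [NC] becomes IGNORECASE.
THE_REQUEST_RE = re.compile(
    r"^[A-Z]{3,}\s/+(.*?)(?:\+|%20|\s)+(.+?)\sHTTP", re.IGNORECASE
)


def follow_redirects(path, limit=20):
    """Return the list of redirect targets produced by re-applying the rule.

    `path` is the request path plus query string; the request line is
    rebuilt by hand here, which is an assumption of this sketch.
    """
    hops = []
    for _ in range(limit):
        match = THE_REQUEST_RE.match(f"GET {path} HTTP/1.1")
        if match is None:
            break  # no encoded space left, so the rule no longer fires
        # Equivalent of the substitution /%1-%2
        path = f"/{match.group(1)}-{match.group(2)}"
        hops.append(path)
    return hops


old_url = "/detalle.php?titulo=Zapatillas%20Salomon%20Xr%20Mission&codigo=040-8800-072"
for hop in follow_redirects(old_url):
    print(hop)
```

In this replay the example URL needs three hops before no "%20" is left, which suggests a URL with many spaces would trigger one redirect per space.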
Everything has to redirect to www.domain.com, except for test.domain.com, which will host a new version of the site for testing.
Both of the domains need to look within their web directory.
I've searched Stack Overflow, but none of the similar questions seem to provide a working solution for me, probably because I don't understand .htaccess / regex that well yet.
This is the current content of my .htaccess file.
RewriteEngine on
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
RewriteCond %{REQUEST_URI} !web/
RewriteRule (.*) /web/$1 [L]
Try this for your second line (where test is your subdomain):
RewriteCond %{HTTP_HOST} !^(www|test)\.
The ! negates the match.
The brackets are a regex group.
The pipe symbol is an or.
What this should say is "match me all subdomains except www. & test."
Regex101 link here which might help explain further and gives my test data:
https://regex101.com/r/5udsER/1/
Disclaimer:
I wrote this afk so this is lacking an end-to-end test but should work.
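To sanity-check the condition without touching the server, the pattern can be tried against a few host names. This is a minimal Python sketch; the function name needs_www_redirect and the sample hosts are made up for illustration, and only the regex itself comes from the rule above.

```python
import re

# Pattern from the suggested RewriteCond, without the leading "!".
HOST_RE = re.compile(r"^(www|test)\.")


def needs_www_redirect(host):
    """True when the negated condition (!^(www|test)\\.) would be satisfied."""
    return HOST_RE.search(host) is None


for host in ("domain.com", "www.domain.com", "test.domain.com", "testing.domain.com"):
    print(host, needs_www_redirect(host))
```

Because the pattern ends with an escaped dot, a host such as testing.domain.com is not treated as the test subdomain and would still be redirected.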
I am trying to rewrite my URLs to remove index.php? but I'm struggling a little to get it to work. The closest I can get is the answer here: remove question mark from 301 redirect using htaccess when the user enters the old URL
I need to convert the URLs to pretty URLs on the way out, and rewrite them back to the proper URL on the way in. The structure of the URLs is as follows:
https://sub.domain.com/index.php?/folder1/folder2-etc
Using the code from the referenced answer results in a double forward slash:
https://sub.domain.com//folder1/folder2-etc
The rewrite rules I'm using from the referenced answer are:
RewriteEngine On
RewriteCond %{THE_REQUEST} /index\.php [NC]
RewriteRule ^(.*?)index\.php$ /$1 [L,R=301,NC,NE]
RewriteCond %{THE_REQUEST} \s/+\?([^\s&]+) [NC]
RewriteRule ^ /%1? [R=301,L]
# internal forward from pretty URL to actual one
RewriteRule ^((?!web/)[^/.]+)/?$ /index.php?$1 [L,QSA,NC]
I suspect I know how to solve the first bit, but I'm struggling to understand the second rule for the internal forward.
Additionally, I'm wondering if this is the best way to do this. I'm currently running an Apache backend behind an Nginx reverse proxy. Would I be better doing the rewrite on the Nginx side and the internal forward on Apache?
EDIT:
Complication: I've noticed an additional structure to complicate things. Some URLs appear to have https://sub.domain.com/picture.php?/folder1/folder2-etc
For these, I'd be quite happy to keep 'picture' and just remove the .php? bit.
I'm guessing that for the first bit, I'd need to do something like the following:
RewriteCond %{THE_REQUEST} \s/+index\.php\?/([^\s&]+) [NC]
RewriteRule ^ /%1? [R=301,L]
RewriteCond %{THE_REQUEST} \s/+picture\.php\?/([^\s&]+) [NC]
RewriteRule ^(.*)$ /picture/%1 [R=301,L]
But I have no idea where to start with the opposite, i.e. converting pretty URLs back to standard ones. It would help if the following section could be explained to me:
^((?!web/)[^/.]+)/?$ /index.php?$1 [L,QSA,NC]
RewriteRule ^/*picture/(.*)$ /picture.php?/$1 [L]
RewriteRule ^/*(?!/*index\.php$)(.*)$ /index.php?/$1 [L]
should do the trick. I wasn't able to test it yet though.
I only used the [L] last flag to stop applying rules on match. The QSA query string append flag doesn't seem to make sense as you don't seem to use ?key=value&... syntax anyway. Also dunno if you actually need the NC case-insensitive flag...
Side note:
I hope your php files don't serve paths with .. in them, as that would allow people to read arbitrary files from disk, e.g. /picture/../../../etc/passwd
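Since the two rules above were posted untested, here is a small Python sketch that replays just their patterns. It assumes paths arrive without a leading slash, as they would in a .htaccess file, and the function name rewrite_once is invented for the demo. It applies the rules a single time and does not model how Apache may run the rule set again on the rewritten path.

```python
import re

# (pattern, substitution template) pairs copied from the two RewriteRule lines.
RULES = [
    (re.compile(r"^/*picture/(.*)$"), "/picture.php?/{}"),
    (re.compile(r"^/*(?!/*index\.php$)(.*)$"), "/index.php?/{}"),
]


def rewrite_once(path):
    """Return the internal target for `path`, or None if no rule matches.

    The first matching rule wins, mirroring the [L] flag.
    """
    for pattern, template in RULES:
        match = pattern.match(path)
        if match:
            return template.format(match.group(1))
    return None


for path in ("picture/folder1/folder2-etc", "folder1/folder2-etc", "index.php", "picture.php"):
    print(path, "->", rewrite_once(path))
```

One thing the replay shows: the path picture.php itself matches the second pattern, because the negative lookahead only excludes index.php. That may matter if the rules are evaluated again after the first rewrite.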
Apologies, but as it turns out, the main reason I can't get anything to work is due to the use of relative URLs and dynamically generated links within the PHP. Not something I can change unfortunately. The not perfect URLs are something I'm going to have to live with. For reference, the app I'm using is Piwigo
I'm trying to match a url using regex in a .htaccess file.
I am able to match the end of the string using "$" just fine, but when I try to match the start of the string using "^", the RewriteRule stops working.
Here is my .htaccess file:
RewriteEngine On
RewriteBase "/"
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule "^example-url$" "index.php"
Without the "^" the regex works, but is too lax. I've fiddled with this for a couple of hours now, but I'm not sure what else to try, since this is a very basic regex that has worked in every regex tester I've tried.
My best guess atm is that it has something to do with what the start of the string actually is, but I'm not sure how to test this, since it should start just after the top level domain name with the "/".
EDIT:
Moving the .htaccess to the html folder made all my problems go away; the above .htaccess file now works fine. Thanks so much for the help @Riad Loukili and @P0lT10n.
I'm guessing my problems were caused by the way my virtual directories are set up.
Not sure but try
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^example-url$ index.php
I tried it on This .htaccess tester and it worked.
Possible Explanation: You've already specified the RewriteBase, no need to start the url with / in the Regular Expression.
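The effect of the "^" anchor depends on which string the pattern is compared against. The Python sketch below tries the same pattern against a few candidate strings. The candidates are assumptions: in particular, "html/example-url" is a guess at what the rule might have seen while the .htaccess file sat one level above the html folder, since the thread does not spell out the directory layout.

```python
import re

ANCHORED = re.compile(r"^example-url$")
UNANCHORED = re.compile(r"example-url$")

# Hypothetical strings the RewriteRule pattern could be compared against.
candidates = ["example-url", "/example-url", "html/example-url"]

for candidate in candidates:
    print(
        candidate,
        "anchored:", bool(ANCHORED.search(candidate)),
        "unanchored:", bool(UNANCHORED.search(candidate)),
    )
```

Only the bare "example-url" satisfies the anchored pattern, while the unanchored one accepts all three. That would be consistent with the rule working without "^" in the wrong directory and with "^" once the file was moved.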
What is the difference in using ^ vs ^(.*)$ vs ^.*$ as wildcards in a RewriteRule?
My goal is to redirect http://carnarianism.com/ (anything) to the landing (default) page of http://carnarian.com/. I have found the following solutions, which all seem to work, so I wonder which is better for performance?
RewriteRule ^ http://carnarian.com/ [R=301,L]
RewriteRule ^.*$ http://carnarian.com/ [R=301,L]
RewriteRule ^(.*)$ http://carnarian.com/ [R=301,L]
All of these seem to work okay. This is my very first post on StackOverflow, most of the time I can find an answer just searching for it.
To be clear: ABOVE the questioned RewriteRule in my .htaccess is a RewriteCond and WWW Handler as follows:
RewriteEngine On
RewriteBase /
# FROM www. --TO-- NO www. See no-www.org
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
RewriteCond %{HTTP_HOST} carnarianism\.com$ [NC]
########## The Above Questioned RewriteRule ??? ##########
RewriteRule ^ http://carnarian.com/ [R=301,L]
Note: I started this search with the following, but I did not want the following because the path was also passed, and I want it to go to the landing page only. Therefore, I know you need the parentheses to be able to use the $1 variable. I do not want the $1 variable.
RewriteRule ^(.*)$ http://carnarian.com/$1 [R=301,L]
^ makes none of the original URL accessible as backreferences. $0 is an empty string.
^.*$ makes the entire original URL accessible as the $0 backreference (so you can do e.g. http://example.com/oldurl.php?url=$0)
^(.*) makes the entire original URL accessible as both the $0 and $1 backreferences; it's usually used when you want to actually use the old URL in the replacement since it's more explicit about the use.
All of them match the same thing, but produce different backreference groups.
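The difference is easy to see by listing the groups each pattern produces. Here is a short Python sketch; the sample path "old/page.php" and the helper name backreferences are invented for the example, and index 0 of the returned list plays the role of $0.

```python
import re


def backreferences(pattern, path):
    """Return [$0, $1, ...] for `pattern` matched against `path`."""
    match = re.match(pattern, path)
    return [match.group(i) for i in range(match.re.groups + 1)]


for pattern in (r"^", r"^.*$", r"^(.*)$"):
    print(pattern, backreferences(pattern, "old/page.php"))
```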
The one that is better performance wise is the one you have benchmarked yourself.
But since you are using a .htaccess file rather than having this configuration in the server directly (maybe via a VirtualHost?) which is parsed only once, it really doesn't matter. Parsing .htaccess files at every single request is much more time consuming than performing the regular expression by a factor of thousands.
If you care about performance you should never ever use .htaccess files and even disable their parsing with: AllowOverride None. Not disabling them, and having a request like: http://example.com/sites/css/theme/main.css Apache will still try to load all the following files:
.htaccess
sites/.htaccess
sites/css/.htaccess
sites/css/theme/.htaccess
It will generate system calls even if those files do not exist.
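The list above follows a simple pattern: one lookup per directory level between the document root and the requested file. This Python sketch reproduces it for any path. The function name htaccess_candidates is made up, and the sketch only builds the names relative to the document root; it does not model Apache's actual lookup or directories above the document root.

```python
from pathlib import PurePosixPath


def htaccess_candidates(url_path):
    """List the .htaccess files implied by each directory level of `url_path`."""
    directories = PurePosixPath(url_path.lstrip("/")).parent.parts
    candidates = [".htaccess"]  # the document root itself
    current = PurePosixPath()
    for part in directories:
        current = current / part
        candidates.append(str(current / ".htaccess"))
    return candidates


for name in htaccess_candidates("/sites/css/theme/main.css"):
    print(name)
```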
Trying therefore to improve your RewriteRule in an .htaccess file is like sneezing in the ocean in the hope of making it less salty. :)
Now, if you improved your setup to use server configuration, and to answer your original question: ^.*$ might be more efficient than ^(.*)$ as fewer references need to be created. Chances are high, however, that you can't measure it.
I posted this on Reddit (r/learnprogramming) and someone there PM'd me and told me to come here, so here I am!
I have been trying to learn regexes and I still suck at them. I seriously have difficulty grasping the pattern matching. I am solid in other OOP languages so I figured I would learn regex and it just evades me.
I have downloaded EditPad Pro so I can practice as http://www.regular-expressions.info/tutorial.html suggests. I can get expressions to match bulk text, but I am trying to parse URL's and I keep missing.
Here is what I am trying to do. I am writing my own permalink .htaccess file as a proof of concept study, so I can hopefully use this in future sites.
I need to return the following dynamic content from a URL:
I need everything other than http:// www.domain.com/ or http:// domain.com/ or domain.com/:
(I am adding a space after http:// because of the limits on new accounts)
http:// www.domain.com/asdjh324hj.jpg
http:// www.domain.com/asa45s.png
http:// www.domain.com/aser24hj.gif/
http:// www.domain.com/wer234dsfa/
http:// www.domain.com/k3kjk4
http:// www.domain.com/k3kasd4/
The matched part will then be appended to:
http:// www.domain.com/some_dir/som_subdir/some_file.php?querystring=$1
But, I don't want any of these urls in the results:
http:// www.domain.com/some_dir/some_file.php
http:// www.domain.com/some_dir/some_subdir/some_file.html
And I need to prevent hotlinking to images in the image_dir:
http:// www.domain.com/image_dir/some_dir/some_subdir/some_image.jpg (or png,gif,etc)
Hotlinked images would be redirected to a page with the passed image as a querystring.
So what RewriteRule regex would I setup to grab this? I understand RewriteRules and the flags, putting matched results into variables, etc, I just can't figure out what regex I should write to grab the actual result.
If this is too complex for RewriteRules, then please let me know as I am struggling here.
Usually I do these in PHP and would start with:
.com/[a-zA-Z0-9-_.]+
([^/]+)/?$
Then do good 'ol substrings and checks. It's hacking it to death and I should be doing better!
I am currently going through the regular-expressions.info tutorials and am making progress, but I keep grabbing the wrong things too.
Thanks for any help you can send my way!
Update: I was able to resolve everything with a ton of help and discussed more here: Mod_Rewrite conditions help for hotlinking but allow local requests
I need everything other than http:// www.domain.com/ or http:// domain.com/ or domain.com/:
RewriteCond %{REQUEST_URI} !^/$
But, I don't want any of these urls in the results:
RewriteCond %{REQUEST_URI} !^/some_dir/
The matched part will then be appended to:
RewriteRule ^(.*)$ /some_dir/som_subdir/some_file.php?querystring=$1 [L]
So in all it should look something like this:
RewriteCond %{REQUEST_URI} !^/$
RewriteCond %{REQUEST_URI} !^/some_dir/
RewriteRule ^(.*)$ /some_dir/som_subdir/some_file.php?querystring=$1 [L]
Which will make it so when you request something like http://www.domain.com/asa45s.html, it will get internally rewritten to some_dir/som_subdir/some_file.php?querystring=asa45s.html. As for the hotlinking bit:
RewriteCond %{REQUEST_URI} ^/image_dir/
RewriteCond %{REQUEST_URI} \.(png|gif|jpe?g|bmp|ico)$ [NC]
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?domain\.com/
RewriteRule ^(.*)$ /some_dir/som_subdir/some_file.php?querystring=$1 [L]
This checks that first the request is for something in the /image_dir/ directory, then that the requested resource ends with a png/gif/jpeg/bmp/ico extension, then that the HTTP referer [sic] does not start with http://www.domain.com/, https://domain.com/ or whatever combination of the 2. If all those are true, then it rewrites the request to the /some_dir/som_subdir/some_file.php file with the original URI as the querystring parameter.
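To check the logic of both rule sets against the sample URLs from the question, the conditions can be replayed in Python. This is a sketch under assumptions: the function names permalink_target and is_hotlink are invented, the referer values are made-up examples, and the dot in the domain is escaped in the referer pattern. Apache's own processing order and rule looping are not modelled.

```python
import re

TARGET = "/some_dir/som_subdir/some_file.php?querystring="
IMAGE_EXT = re.compile(r"\.(png|gif|jpe?g|bmp|ico)$", re.IGNORECASE)
LOCAL_REFERER = re.compile(r"^https?://(www\.)?domain\.com/")


def permalink_target(uri):
    """Mimic the first rule set: skip "/" and anything under /some_dir/."""
    if uri == "/" or uri.startswith("/some_dir/"):
        return None
    # In a .htaccess file, ^(.*)$ captures the path without its leading slash.
    return TARGET + uri[1:]


def is_hotlink(uri, referer):
    """Mimic the three RewriteCond lines of the hotlinking rule."""
    return (
        uri.startswith("/image_dir/")
        and IMAGE_EXT.search(uri) is not None
        and LOCAL_REFERER.search(referer) is None
    )


print(permalink_target("/asa45s.html"))
print(is_hotlink("/image_dir/some_dir/some_subdir/some_image.jpg", "http://elsewhere.example/page"))
```

Note that, as written, an empty referer also fails the "local referer" test, so a request that sends no Referer header at all would be treated as a hotlink in this replay.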