.htaccess RewriteRule Issue

  • Camo.Fish
    Member
    • Jul 2008
    • 42

    .htaccess RewriteRule Issue

    So, here _was_ my .htaccess:

    Code:
    RewriteEngine On
    # Remove www.
    RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
    RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
    # Send to SSL
    RewriteCond %{HTTPS} off
    RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}
    I am now trying to add some URL rewriting to the mix. The feature has been programmed into my CMS for a while, but I have not implemented it yet.

    Issue 1:
    All pages are accessed via /index.php. Currently, they also have ?id=\d{1,3}. I would like any page accessed with ?id= to be replaced with the semantic URL set in the database. I intend to use a rewrite map to do this, having the map updated each time a page's semantic URL is changed.
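
    A minimal sketch of how that rewrite-map idea might look, assuming Apache 2.x. RewriteMap itself can only be declared in the server or virtual-host configuration, not in .htaccess, though a map declared there can be used from .htaccess rules. The map name pagemap, the file path, and the sample entries are hypothetical.

    Code:
    # httpd.conf or <VirtualHost> (RewriteMap is not allowed in .htaccess)
    RewriteMap pagemap txt:/path/to/pagemap.txt
    # /path/to/pagemap.txt holds one "id slug" pair per line, for example:
    #   1 home
    #   2 a_page
    # .htaccess: capture a 1-3 digit id from the query string
    RewriteCond %{QUERY_STRING} (?:^|&)id=([0-9]{1,3})(?:&|$)
    # Look the id up; continue only if the map returns a non-empty slug
    RewriteCond ${pagemap:%1} ^(.+)$
    # Redirect to the slug; the trailing ? drops the old query string
    RewriteRule ^index\.php$ /%1? [R=301,L]
    Because a txt: map is a plain text file that Apache re-reads when it changes, the CMS could regenerate it whenever a page's semantic URL is edited.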

    Issue 2:
    I need all pages that are accessed semantically, like /a_page to be re-written to /index.php?url=a_page.

    I am decent but not stellar with regular expressions, and pretty early along my journey to master the rewrite module.

    I have tried several things and got close on issue two once, but I have not yet moved on to issue one. If anyone has any input for me, I would appreciate it. I'll be working on this until I get it figured out, so I will post any progress that I make.

    Thanks!
  • Camo.Fish
    Member
    • Jul 2008
    • 42

    #2
    Code:
    # Do not rewrite if sending the id via get
    RewriteCond %{QUERY_STRING} !id=
    # Do not rewrite if string has been rewritten already
    RewriteCond %{QUERY_STRING} !url=
    # Do not rewrite if in admin area
    RewriteCond %{REQUEST_FILENAME} !admin
    # Rewrite if request starts with a letter followed by anything
    RewriteRule ^([a-zA-Z].*) /index.php?url=$1
    Issue #2 complete. Not perfect, but it works.


    • ZYV
      Senior Member
      • Sep 2005
      • 315

      #3
      I don't think the first issue has anything to do with mod_rewrite. If I understand it correctly, you want to get your visitors redirected like this: index.php?id=123 -> /clean_url_here/. Why wouldn't you just use

      Code:
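      $myid = isset($_GET["id"]) ? intval($_GET["id"]) : 0;
      // $map is keyed by page id, so test the key rather than the values
      if (isset($map[$myid])) {
        header("Location: " . $map[$myid], true, 301);
        exit;
      }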
      or something like that at the beginning of the script?

      As for issue 2, it looks like it does not take into account regular files that exist with such a name, but if it works for you, then OK.


      • Camo.Fish
        Member
        • Jul 2008
        • 42

        #4
        Code:
        RewriteCond %{REQUEST_FILENAME} !-F
        RewriteRule ^(.*) /index.php?url=$1
        Hmm, thanks for the comment! Tossing in a file check allows me to remove the other checks!
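
        For comparison, a sketch of a commonly used form of this check, offered as an assumption rather than a drop-in fix: lowercase -f tests for an existing regular file directly, whereas -F performs an internal subrequest and is slower; -d skips real directories such as an admin area; QSA keeps any other query parameters.

        Code:
        # Skip requests for real files and directories
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        # Hand everything else to the CMS as an internal rewrite
        RewriteRule ^(.*)$ /index.php?url=$1 [QSA,L]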

        As for using PHP to redirect, I thought perhaps an Apache rewrite map would be more efficient than an external redirect. And I would learn some extra wizardry. Perhaps I will test both and see which is more efficient, unless someone out there already knows?


        • Camo.Fish
          Member
          • Jul 2008
          • 42

          #5
          Well, I went over to a friend's house and the FF on his wife's laptop brought up an Apache 302 page that needed to be clicked on, so I have taken out the redirect to SSL and everything works fine. Does anyone know if this is finicky for a reason? I added [R=301] to the SSL section and removed the L from the www. removal, but neither of those helped.
          Code:
          RewriteEngine On
          # Remove www.
          RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
          RewriteRule ^(.*)$ http://%1/$1 [R=301]
          # Send to SSL
          # RewriteCond %{HTTPS} off
          # RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}  [R=301]
          # Translate Semantic to Get
          RewriteCond %{REQUEST_FILENAME} !-F
          RewriteRule ^(.*) /index.php?url=$1
          Also, although mod_rewrite is supposed to support extended POSIX regex, I can never get it to work using things like \w or \d{1,3} in the patterns. :-/
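
          For what it is worth, the regex flavor depends on the Apache version: 1.3 used POSIX extended regular expressions, which have no \d or \w, while 2.x uses PCRE, where both are expected to work. A version-independent sketch spells the classes out; the two rules below, including the page/ prefix, are hypothetical illustrations and not part of the configuration above.

          Code:
          # [0-9]{1,3} is the portable spelling of \d{1,3}
          RewriteRule ^page/([0-9]{1,3})$ /index.php?id=$1 [L]
          # [A-Za-z0-9_]+ is the portable spelling of \w+
          RewriteRule ^([A-Za-z0-9_]+)$ /index.php?url=$1 [L]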
          Last edited by Camo.Fish; 08-23-2008, 08:51 PM. Reason: typo


          • ZYV
            Senior Member
            • Sep 2005
            • 315

            #6
            Hmm, I do not use it for SSL redirects so I am not sure what problems may arise... So, to make it clear, what happens if you uncomment those SSL-related lines which look perfectly valid btw? (screenshot would be nice)

            I wonder whether you were confused by Apache's warning for self-signed certificate or it was something else.

            As a side note you might use RewriteLog (on your own testing server I guess, because you need to set it at the httpd.conf level) to troubleshoot obscure mod_rewrite problems (that is what I usually do).
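
            A sketch of that logging setup, assuming Apache 2.2 and access to the main server or virtual-host configuration; the log path is hypothetical. In Apache 2.4 these two directives were removed in favor of LogLevel with a rewrite trace level.

            Code:
            # httpd.conf or <VirtualHost>, Apache 2.2 (not valid in .htaccess)
            RewriteLog "/var/log/apache2/rewrite.log"
            # 0 disables logging, 9 is the most verbose; keep it low in production
            RewriteLogLevel 3
            # Apache 2.4 equivalent, written to the error log:
            # LogLevel alert rewrite:trace3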

            As for regex, what is \d{1,3} actually supposed to do? Isn't it equivalent to ([0-9]{1,3})?


            • Camo.Fish
              Member
              • Jul 2008
              • 42

              #7
              Yes, \d and [0-9] are the same.. So I suppose it does not support \w type of things, because [a-zA-Z] worked fine. The thing is, lots of things that _look_ fine do not process fine. I guess that is what makes mod_rewrite voodoo. :>

              Redirecting all traffic to SSL raises a few issues, some internal to the site, some external.. most of which I resolved, but in the long run I am getting rid of it because who knows what other problems will be raised by platforms I do not have access to.

              It would be very nice if RewriteLog ran in per directory context as that would supply a lot of useful information for me. :-/

              Also, I was not confused.. I am familiar with accepting dubious certs in multiple browsers.. What would happen is it would work fine in my FF3, just give an ambiguous unreachable error (or work properly) in IE 7 depending upon how I tried to access the url, and on the laptop in question which I imagine is running FF2 or a beta of 3 it would bring up the standard 302 redirect ('The document has moved, click here to ..').

              The long and the short of it is that I do not really have to redirect everything to SSL, and it caused enough random issues over a couple of days of testing to bring me to the assumption that there would probably be some I would not find out about or address, possibly ever.

              I am, hopefully, going to leave the www. subtraction in.. I have not been typing it in unless a site won't load without it since about '95.. or the http... think of how many keystrokes that has saved my poor fingers! Plus in theory, it should bring log files and search engines into cohesion properly.

              The one thing I wonder about with regard to search engines, is .. well, here is a quote from one article:

              302 and 301 Redirects

              From a search engine perspective, 301 redirects are the only acceptable way to redirect URLs. In the case of moved pages, search engines will index only the new URL, but will transfer link popularity from the old URL to the new one so that search engine rankings are not affected. The same behavior occurs when additional domains are set to point to the main domain through a 301 redirect.
              However, you will note that in my .htaccess I do not have R=301 on the semantic url line, which means it is redirecting with a 302. Will the pages then not be crawled by the almighty goog? Redirecting them with a 301 actually puts the index.php?url= part into the address bar and in my opinion that is not human friendly and because it ends up having Get parameters may not be search engine friendly? The 301 also causes a visual flash during the redirect whereas the 302 is instantaneous.
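
              A sketch of the distinction that seems to be at play here, assuming standard mod_rewrite behavior: a rule with no R flag and a local path as its target is an internal rewrite, so browsers and crawlers receive no redirect at all and only ever see the semantic URL. A 302 is the default only when R is given without a status code, or when the target is an absolute URL that points at another scheme or host.

              Code:
              # Internal rewrite: address bar keeps /a_page, no 301 or 302 is sent
              RewriteCond %{REQUEST_FILENAME} !-f
              RewriteRule ^(.*)$ /index.php?url=$1 [L]
              # External redirect: the client is told to request the new URL
              # RewriteRule ^(.*)$ /index.php?url=$1 [R=301,L]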

              This whole mess is the reason that I used to create a directory for every page that I wanted.. but that was not as dynamic as I would have liked and when I re-wrote my CMS this winter I knocked it down to just one /index.php and an administration index.php.

              Oh what a tangled web we weave...
              Last edited by Camo.Fish; 08-24-2008, 08:21 PM.

