htaccess files

  • Thread starter melkior_inactive
You can't live with them, you can't live without them.
Whether you want it or not, creating a good web based application these days involves editing and creating these little buggers.
And they are ugly little brutes. If you want to really offend someone tell him: "You look like a .htaccess file!" That should do the trick.

But all in all, you have to tame them in order to get some results and this is where this thread steps in.
I'll try to do my best in explaining possibilities of these files and things you can do with them in a friendly way.

Now, I've seen hundreds of .htaccess tutorials on the net and some are OK, some look like they were written for people with 3 brains and 7 microprocessors in the head. But most of them fail to describe what a .htaccess file is. Sometimes even I'm not sure what they are but I'll try to explain.

First of all, they only work with the Apache web server! So it doesn't matter if your web server is running Linux, Mac OS or Windows, it needs to run Apache.
Now when we have that cleared up, what do they do?
Imagine the Apache server as the USA (United States of America in case you think it's an IT acronym). And like the USA, Apache has its rules. USA has laws, the constitution and whatnot. Apache has a configuration file (usually httpd.conf or apache2.conf but it can be named differently on your server). USA has amendments, and Apache has modules. These are the extensions and changes to the original set of rules.
Finally, each state in the USA has its own state laws. Each folder has its own "laws" and they can be found in the .htaccess file.
Laws are complicated to read, so is the htaccess file. But if lawyers can read laws, then webmasters should be able to read htaccess files.

OK, enough with the metaphor. A few more important things are:
If Apache can't find a .htaccess file in the folder, it uses the global rules set in the main configuration (which you usually don't have access to unless it's your own server). The main configuration also decides, through the AllowOverride directive, which settings a .htaccess file is allowed to change. And if you put a .htaccess file in the folder /yourserver/www/script/, it will apply to all the subfolders of script/ as well.
But sometimes you want some of the subfolders to have a different set of rules. That's fairly easy. Just create a new .htaccess file in the subfolder of your choice and its directives will override the matching ones from the .htaccess file in the parent folder.
In the next post I'll give you an overview of things that can be done with one of these files, and then we'll go on to writing our own rules.
 
One more thing I forgot to add: you should set the permissions of your .htaccess files (chmod) to 644 so the server can read them but only the owner can change them.
Exposing your .htaccess file poses a serious security risk so be warned!
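
On the exposure point: most Apache setups already refuse to serve files whose names start with .ht, but if you're not sure yours does, something like the sketch below can go in the .htaccess file of your web root. It assumes your host lets you use access control directives in .htaccess. The IfModule split is only there so the same file works on both Apache 2.4 and the older 2.2.

```apache
# Refuse to serve .htaccess, .htpasswd and friends over the web
<FilesMatch "^\.ht">
    <IfModule mod_authz_core.c>
        # Apache 2.4 and newer
        Require all denied
    </IfModule>
    <IfModule !mod_authz_core.c>
        # Apache 2.2 and older
        Order allow,deny
        Deny from all
    </IfModule>
</FilesMatch>
```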

OK, here's a list of things you can do with them:
1. Password protect your folders
2. Rewrite your URLs (the infamous mod_rewrite) -- I'm not going to write about this since I have already written a post about this here.
3. Change the default error documents
4. Block and allow users by IP addresses
5. Block referrers
6. Block bots you don't like and/or offline browsers, site downloaders etc...
7. Prevent listing of directories
8. Add MIME types
9. Redirect pages
10. Stop people hotlinking your files
11. Enable SSI
12. Change your default pages
 
1. Password protecting your web folders

So, you've finally decided to write a love song for your girlfriend who works as an IT technician, so you thought putting it on your website might be romantic. But you don't want your friends to see how you make a complete fool out of yourself.
Well, I have some good news and some bad news for you. The good news is, you can password protect your love song and send the access data to your girlfriend but the bad news is that you're still making a fool out of yourself since it's not romantic at all.

You still want to do it? Wow, you're persistent. Well here's how to do it:
you have a site: www.yoursite.com
and you've decided to put the love song in:
www.yoursite.com/lovesong/
And you want it protected.
Create the subfolder lovesong in your public_html folder on the web server.
Create a .htaccess file with this content in the lovesong folder:
Code:
AuthUserFile /home/myaccount/safedirectory/.htpasswd
AuthGroupFile /dev/null
AuthName EnterPassword
AuthType Basic

require user mydarling
The above code will produce a password protected directory which only your darling could access.
The first line specifies the filesystem path (not the URL) to the .htpasswd file which contains the username and the hashed password. If possible, put this file in a folder which can't be accessed over the Web (usually a folder called private/, or any folder which is not inside the public_html/ or www/ folder).
Now all you need to do is create the content for this file.
It should look like this:
Code:
username:hashedpassword
To hash the password you can use this tool.

Although there are lots of tools to do this on the net. This one is just an example.

The require user line specifies that only the user mydarling can access the content of the folder lovesong/.
If you have more than one girlfriend (you dirty dog! :D), add their user data to the .htpasswd file and in your .htaccess file change the line:
Code:
require user mydarling
to
Code:
require valid-user
This allows access to all authenticated users from the .htpasswd file.
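
Putting the pieces together, a complete file for the "everyone in the .htpasswd file" case might look like the sketch below. The path is just the example path used above, and the realm text in AuthName is whatever you want the login box to say.

```apache
# .htaccess in the lovesong/ folder
AuthType Basic
AuthName "Enter Password"
# Filesystem path, not a URL (example path from this tutorial)
AuthUserFile /home/myaccount/safedirectory/.htpasswd
# Any user listed in the .htpasswd file may log in
Require valid-user
```

If you have shell access, Apache's own htpasswd utility can create the password file for you: htpasswd -c /home/myaccount/safedirectory/.htpasswd mydarling (the -c creates the file, so leave it out when adding a second user).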

Now upload your love song (or whatever you're trying to hide) to the lovesong/ sub folder and you're good to go (get the boot from the girlfriend).
 
3. Changing error documents

So what's this? Well you must have seen HTTP errors from time to time. You know: "Error 404 - Not found" and stuff like that.
Instead of the default white pages with black text you can have flashy pages that go well with your design.
There's no real reason for doing this except the fact that it makes your website look more professional.
The usual errors you'll be creating new pages for are:
400 - Bad Request
401 - Authorization Required
403 - Forbidden
404 - Not Found
500 - Internal Server Error

There are more but this isn't the place or time to list them all and creating error pages for some isn't recommended (200 for example would create an infinite loop since it's a success code).

So first create your own custom error pages and give them names.
Put them all in one folder on your server, I'll use error/ in this example.
Add this to your .htaccess file (or create a new one if you don't have one already):
Code:
ErrorDocument 400 /error/400.html
ErrorDocument 401 /error/401.html
ErrorDocument 403 /error/403.html
ErrorDocument 404 /error/404.html
ErrorDocument 500 /error/500.html
You get the idea. Just be careful to get the names of your files right and you've done everything.

You can even specify HTML code in the htaccess file instead of linking to a file:
HTML:
ErrorDocument 404 "<body bgcolor=#FF0000><b>Not found!</b> But if you wait long enough someone might start looking for it. <img src='/smiley.gif' /></body>"

OK, you've now got custom error documents! You're way cool now! :D
 
4. Blocking and/or allowing IPs

Remember that (ex-)girlfriend of yours you wrote that love song for? Well, she ditched your ass but that's not all. Now she started spamming your site's forum, guestbook, blog. She's all over the place and you just don't have the time to delete her comments.
But you know she has a static IP. You're in luck!
Add these lines to your .htaccess file:
Code:
deny from 123.123.123.12
If her IP is 123.123.123.12 she won't be able to access your site.
You can also deny all users (even you) but the server will still be able to access the files in the folder:
Code:
deny from all

You can allow only certain users by adding an allow line after deny from all:
Code:
allow from 123.123.123.12

That's not enough?
Well you can ban or allow IP ranges. Let's say you want to ban all the users from 123.123.123.1 to 123.123.123.255
Do this:
Code:
deny from 123.123.123.
And you're set.

You can even allow or ban certain domains.
Example:
Code:
allow from www.ukwebmasterworld.com
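Which would allow access to that part of your site for visitors whose own hostname is www.ukwebmasterworld.com. Apache works this out with a reverse DNS lookup on the visitor's IP; it has nothing to do with which site linked to you.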
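
One thing to keep in mind: the allow/deny lines above are the Apache 2.2 style. On Apache 2.4 they only work if the compatibility module mod_access_compat is loaded, and the native way is the Require directive. A rough equivalent of the first example, assuming a 2.4 server:

```apache
# Apache 2.4: let everyone in except one address
<RequireAll>
    Require all granted
    Require not ip 123.123.123.12
</RequireAll>

# Variations (use one at a time):
#   Inside the RequireAll block, ban the whole 123.123.123.x range instead:
#       Require not ip 123.123.123
#   Instead of the whole block, allow only a single address:
#       Require ip 123.123.123.12
```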

Your site is now safe from the old hag! :D
 
5. Blocking referrers

Blocking links to your site which come from one domain has numerous reasons and I'm not going to list them here. You have your own reasons and I respect them.

This one relies on mod_rewrite, so you can only do this if the mod_rewrite module is enabled on your server.

This is what you write in your .htaccess:
Code:
RewriteEngine on
Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} siteyouareblocking\.com [NC]
RewriteRule .* - [F]

That will block the siteyouareblocking.com
To block multiple sites do this:
Code:
RewriteEngine on
Options +FollowSymlinks
RewriteCond %{HTTP_REFERER} siteyouareblocking\.com [NC,OR]
RewriteCond %{HTTP_REFERER} anothersite\.com [NC]
RewriteRule .* - [F]
The [NC] makes the match case insensitive and [OR] joins the conditions so that either referrer triggers the rule.
The [F] in the RewriteRule is to show the 403 Forbidden error to those who go to your site via the blocked site.
 
Thanks for your reply and want to know that I have visitied your indicated thread and happy for getting my result :)
You have done a good job for us :d
 
6. Blocking bots you don't like

What is HTTP_USER_AGENT?
I could go on the whole day about it but the main idea is:
it's the identifier of the app/user/bot/service accessing your site.
And your website can recognise it. OK, that's the first part.

Now, we all know that there are some bots that do you more harm than good. Site rippers too, they eat your bandwidth. And you want to block them.
The usual way to block bots is the robots.txt file. But some bots ignore it.
Well, let's just say we don't like those bots.

So, what do you do?
.htaccess file is your friend (again)!

Add this to your .htaccess file:
Code:
RewriteEngine On 
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR] 
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR] 
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR] 
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR] 
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR] 
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR] 
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR] 
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR] 
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR] 
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR] 
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR] 
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR] 
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR] 
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR] 
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR] 
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR] 
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR] 
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR] 
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR] 
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR] 
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR] 
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR] 
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR] 
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR] 
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR] 
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR] 
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR] 
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR] 
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR] 
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR] 
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR] 
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR] 
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR] 
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR] 
RewriteCond %{HTTP_USER_AGENT} ^Zeus 
RewriteRule ^.* - [F,L]
That's the list of some of the unwanted apps. You can expand it or arrange it the way you want.
If the user agent starts with the text after the ^ (or just contains it, for the lines without ^), the visitor gets a 403 Forbidden error.
Neat, isn't it?
This can save you a lot of trouble!
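
If the long list gets hard to maintain, the same idea can be written with one condition and a regular expression alternation. This is only a sketch with a handful of names picked from the list above; extend the group the same way. Note that I've added [NC] here, so unlike most lines in the long list the match ignores case.

```apache
RewriteEngine On
# One pattern, names separated by |; ^ still means "starts with"
RewriteCond %{HTTP_USER_AGENT} ^(BlackWidow|ChinaClaw|Teleport\ Pro|WebZIP|Wget) [NC]
RewriteRule .* - [F,L]
```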
 
7. Folder listing customization

Every now and then you end up with a folder on your site that doesn't have the index page. So if someone navigates to it they get the folder listing. All files included.
Most of the webservers prevent this from happening but some don't.
Here's what you do:
Add this line to your .htaccess file:
Code:
IndexIgnore *
IndexIgnore setting defines what files will not be listed. Since we have a * here, it's used as a wildcard so no files are listed.
But sometimes you want to hide only zip and rar files.
In that case you add this line to your .htaccess instead:
Code:
IndexIgnore *.zip *.rar
Now all the other files will be listed but zip and rar won't.
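
Keep in mind that IndexIgnore only hides entries; the visitor still gets a listing page, it's just shorter or empty. If the goal is no listing at all, switching the Indexes option off is the more direct route (assuming your host allows Options in .htaccess). Visitors then get a 403 Forbidden for folders without an index page.

```apache
# Turn directory listings off for this folder and its subfolders
Options -Indexes
```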

But what if you really want a folder to be listed and your server settings don't allow this?
Well, you can always add this to your .htaccess:
Code:
Options +Indexes
This will allow the folder to be listed but watch out that you don't allow access to some private data this way.
If you've set up your file with the last option you can customize the folder listing even more.
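Add two files in the listed folder. One is called README and the other one HEADER (the exact names the server looks for are set by its ReadmeName and HeaderName directives, often README.html and HEADER.html).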
Now the content of HEADER will be displayed before the folder listing, and README after the folder listing.
So if you're giving download links for software in this manner (just an example) you can display additional information and tips to your users with these two files.
 
8. Add MIME types

So you don't like clowns but you love mimes? Here's a tutorial for you!
Just kidding. Wrong mimes.

Anyway MIME is short for *drumroll* Multipurpose Internet Mail Extensions.
So you're asking yourself what does mail have to do with htaccess?
The answer is nothing. Well not in this tutorial anyway.
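But MIME isn't used for mail only. It plays an important role in HTTP. MIME is an Internet Standard for labelling content types. The server sends the type along with each file, and the browser uses it to decide whether to display the file, hand it to a plugin or offer it for download.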

Some servers don't have the MIME settings configured for some file types. We'll pretend that our fictional server doesn't work well with Flash.
So instead of opening the swf file in your browser it offers you to download it.
What you need to do is add this to your .htaccess file:
Code:
AddType application/x-shockwave-flash swf
List of MIMEs can be found here:
IANA | MIME Media Types

But there's another common problem. Sometimes you really want a file to be downloaded instead of being opened in the browser.
How do you do that? Easy! Replace the upper code with:
Code:
AddType application/octet-stream swf
So when someone clicks on the link to the swf file he'll be offered to download it instead of the file playing in the browser.
Also, use these settings with care. You don't want to make your users open JPEGs with their Calculator, OK?
 
9. Redirection

Okay. We've come to one of the most common applications of .htaccess files and also one of the simplest.

No matter what the reason for your redirect is, you always do it the same way. It could be that you moved some portions of your site to a different (sub)domain or you changed your directory structure. Whatever the reason is, instead of painfully changing dozens of links on your pages you can simply add the links to your .htaccess file and redirect them.

This is the code for redirection:
Code:
Redirect /folder/file.html http://www.site.com/newfolder/newfile.html
Use the path on your site (starting with /) for the old location and the full URL for the new one and you're good to go.
You can also redirect whole folders in the same way:
Code:
Redirect /myoldfolder http://www.site.com/mynewfolder

It will affect all the links on your site and it's the fastest way to change the links if you changed your folder structure.
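
One detail worth knowing: without a status, Redirect sends a temporary (302) redirect. If the move is for good and you want browsers and search engines to remember the new address, you can say so explicitly. A small sketch using the same example paths:

```apache
# 301 = moved permanently
Redirect 301 /folder/file.html http://www.site.com/newfolder/newfile.html
# "permanent" is the same thing spelled out
Redirect permanent /myoldfolder http://www.site.com/mynewfolder
```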
Ready, steady, redirect!
 
10. Hotlinking prevention

When someone is trying to copy the design of your site, you don't need to go haywire. It just means that your site is good and he likes the design.
But when someone starts using your images, CSS files and even JavaScript then it's time to say: "Bye bye!"
News sites have similar problems. Users on forums and blogs tend to use the pictures from news sites when posting news on their sites. That's ok if they download them and reupload them on their server or a free image host. But if they just use the path to your site they are actually stealing your bandwidth.

You can stop this easily with the help of .htaccess files.
Add this to your file:
Code:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yoursite\.com/.*$ [NC]
RewriteRule \.(gif|jpg|js|css)$ - [F]
This setting will prevent hotlinking of gif, jpg, js and css files. You can expand the list in the last line of the code as necessary.
Just don't forget to replace yoursite.com with your domain. :)

But there's a nice bonus to this too. You can make the hotlinkers look like idiots.
Use this in your .htaccess file:
Code:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yoursite\.com/.*$ [NC]
# Skip the replacement picture itself, otherwise the redirect loops
RewriteCond %{REQUEST_URI} !/stopstealing\.gif$
RewriteRule \.(gif|jpg)$ http://www.yoursite.com/stopstealing.gif [R,L]
Also create a default picture (in this example it's called stopstealing.gif). You can put anything you want in it, typically an angry message to the hotlinker. Then upload the file to your server.
Now when someone links to your gif and jpg files, stopstealing.gif will be displayed instead.
Note that the path to your replacement picture is in the last line of the code so you can customize it any way you like.
And as usual you can expand the number of filetypes you want to block.
 
I never knew you can do so much with .httaccess files, well done Melky loooad of rep coming your way :)
 
11. Enable SSI

SSI are Server Side Includes. Sounds complicated? Well it is.
And it's kinda dangerous. I'm not going to write much about SSI here. If you need it, you know what it is and how to use it.
If you've never heard of them you can skip this tutorial.

SSI is a way of serving dynamic content in HTML files without using PHP or CGI. They're good for adding small pieces of information to your HTML. But if you want to generate the whole page dynamically I suggest that you turn to PHP, Perl or whatever your preference is.
Most servers have SSI disabled and there's a good reason for it. It's server intensive if not used properly.
Before enabling this, contact your web host because there's a good chance you're not allowed to do so.

OK here's the code to enable SSI:
Code:
AddType text/html .shtml
AddHandler server-parsed .shtml
Options +Includes
This will get the server to parse all .shtml files for SSI.
You can replace .shtml with anything you like but don't use .html cause it will get the server to parse all your HTML files even if they don't have SSI. You might overload the server that way.

In case you still want to use .html files with SSI there's another way.
Chmod (change permissions) all your html files with SSI (only with SSI) to have the +X flag (that marks them as executable).
Now add this to your .htaccess:
Code:
XBitHack on
Now Apache will search for SSI only in html pages that have the +X flag.

And if you all behave well maybe I'll write you a tutorial on SSI someday. :D
 
12. Changing default directory files

You've probably seen that I used the DirectoryIndex directive in this tutorial.
Well let me explain it in this final part of the htaccess tutorial.

This directive is rather simple and sometimes very useful. Let's say that you are tired of using index.htm (or .php or .cgi or .whatever) as the default page in every folder on your site.

Let's say you want to use coolmonkey.htm instead.

Just use the aforementioned directive to change the default directory index page. Add this code to your .htaccess file:
Code:
DirectoryIndex coolmonkey.htm
Now when someone navigates to www.yoursite.com he'll get the content of coolmonkey.htm instead of index.htm (the address in the browser stays www.yoursite.com).
So you can make really creative sites now.
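
The directive also accepts more than one name, which is handy if not every folder has a coolmonkey.htm. A small sketch (the extra file names are just examples):

```apache
# Apache tries the names from left to right and serves the first one that exists
DirectoryIndex coolmonkey.htm index.php index.htm
```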
 
Ladies and gentlemen,
this concludes my little htaccess tutorial.
If you've found it useful add some reputation and I'll be most obliged.
Cheers!

Special note: During the making of this tutorial no .htaccess files were harmed.
 