  #1  
Old 29th August 2002, 09:34 AM
Board Addict
 

Join Date: Nov 2001
Location: Chav-ville Droylsden, innit!
Posts: 1,076
Posted by: Red Mancunian
Question robots.txt?

Ok, I know what this is, but can anyone please tell me what the robots.txt file should look like for my website?

I notice a lot of spiders looking for it, and I'm sure if I bothered to put one in then it would help with listing my site on the search engines.

When replying, please remember 2 things:

(1) I'm thick
(2) I'm thick

Thanks!
__________________
Signature temporarily removed.
Normal service will be resumed as soon as possible. Maybe.
  #2  
Old 30th August 2002, 04:32 AM
Contributing Member
 

Join Date: Oct 2001
Location: Near beaches, bars and rollercoasters!
Posts: 2,769
Interesting one. I'd not heard of the robots page before, so I've done a bit of digging. From the looks of things, the robots.txt page is an option only available if you run your own server (not just have webspace) and want to EXCLUDE certain areas from search engine spiders.

Quote:
You'll also want to check your robots.txt file. This is a standard file for Web servers that sits at the root of your site and excludes unwelcome indexing robots or restricts them from crawling certain directories. If you run your own server, you control this file; otherwise, your host server administrator controls it. Make sure that this file exists and that it allows at least your search indexer robot to access your directories.
Therefore it's not something you need to worry about. If you want to know more, follow this link for a few pages with some relevant information:

http://search.hotwired.com/webmonkey...ery=robots.txt
  #3  
Old 30th August 2002, 10:50 AM
Board Addict
 

Join Date: Nov 2001
Location: Chav-ville Droylsden, innit!
Posts: 1,076
Posted by: Red Mancunian

Sh*te. I think I'll take yer advice and just ignore the bugger.

Spiders? Web? Crawling?

Fekk me, is it a computer or a frikkin creepy-crawly house? :p
  #4  
Old 30th August 2002, 07:08 PM
Board Addict
 

Join Date: Oct 2001
Location: Sheffield / Astley
Gender: Male
Posts: 3,343
Posted by: fugjostle
Search engines use a technique called spidering (web... get it?). The program (or spider, as it's called) goes trawling through the web, following the links on every page it finds, and builds a database of all those pages (database? Well, yep, that's what Google is... and AltaVista... and Yahoo... and so on).
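
As a rough illustration of that link-following idea, here is a toy sketch in Python. It assumes a made-up, in-memory "web" (the PAGES dictionary and its page names are hypothetical) rather than real HTTP requests, and it is nowhere near how a real search engine works at scale.

```python
from collections import deque

# Hypothetical in-memory "web": each page maps to the links found on it.
PAGES = {
    "/index.html": ["/about.html", "/tunes.html"],
    "/about.html": ["/index.html"],
    "/tunes.html": ["/about.html", "/private/demo.html"],
    "/private/demo.html": [],
}


def spider(start):
    """Follow links breadth-first and return every page reached, in visit order."""
    seen = [start]
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in PAGES.get(page, []):
            if link not in seen:  # skip pages that are already in the "database"
                seen.append(link)
                queue.append(link)
    return seen


print(spider("/index.html"))
```

Note that this toy spider happily wanders into /private/, which is exactly the sort of thing a robots.txt file is meant to prevent.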

Some sites have pages that they don't want found in Google or AltaVista, for example development pages, private members' pages, etc.

The robots.txt file is a simple text file that tells the spider not to check the pages your text file specifies. Well-behaved spiders check for that file first, so that they don't include anything you don't want them to.

If your site is www.redmanc.com and you didn't want the pages within your private or tmp directory appearing in Google, then you could create a robots.txt file that would look like this:

User-agent: *
Disallow: /tmp/
Disallow: /private/

The web spider would check for www.redmanc.com/robots.txt and obey it by ignoring everything under www.redmanc.com/tmp/ and www.redmanc.com/private/
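
If you want to see how a spider might read those rules, here is a minimal sketch using Python's standard urllib.robotparser module. The rules are fed in as text instead of being fetched from the site, and the spider name "ExampleSpider" and the page names are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the robots.txt example above, supplied as a string.
RULES = """\
User-agent: *
Disallow: /tmp/
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path, agent="ExampleSpider"):
    """Return True if the (hypothetical) spider may fetch this path."""
    return parser.can_fetch(agent, "http://www.redmanc.com" + path)


for path in ["/index.html", "/tmp/scratch.html", "/private/members.html"]:
    print(path, allowed(path))
```

Only the first path should come back as allowed; the other two sit under the disallowed directories.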

The "user-agent" thing is quite handy as you can allow certain spiders and disallow others:

User-agent: WebCrawler
Disallow:

User-agent: *
Disallow: /

This would only allow the WebCrawler spider to index your site and deny all others.
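
The same standard-library parser can be used to check the per-spider rules. This is only a sketch: "SomeOtherSpider" is a made-up name standing in for any spider that isn't WebCrawler, and the page name is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# An empty Disallow means "nothing is off limits" for that spider.
RULES = """\
User-agent: WebCrawler
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

URL = "http://www.redmanc.com/index.html"

webcrawler_ok = parser.can_fetch("WebCrawler", URL)
other_ok = parser.can_fetch("SomeOtherSpider", URL)  # hypothetical spider name

print("WebCrawler allowed:", webcrawler_ok)
print("SomeOtherSpider allowed:", other_ok)
```

Bear in mind that robots.txt is a request, not a lock: it only keeps out spiders that choose to obey it.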

Have a look at http://www.robotstxt.org/wc/exclusion-admin.html for some more ideas of what you can do.

Hope this helps

---
Fug
__________________

Fug's pearl necklace of wisdom:
- "A cult is a religion with no political power"
- "Age is a high price to pay for maturity"
- "Always remember you're unique. Just like everyone else"
- "A gross ignoramus: 144 times worse than an ordinary ignoramus"
- "Depression is merely anger without enthusiasm"
- "All it takes to fly is to hurl yourself at the ground... and miss"
  #5  
Old 30th August 2002, 07:47 PM
Board Addict
 

Join Date: Nov 2001
Location: Chav-ville Droylsden, innit!
Posts: 1,076
Posted by: Red Mancunian
Cheerz fug fella.

I should have just kept my Commodore 64, shouldn't I? j/k

That site's got loads of useful reading on it mate.
  #6  
Old 30th August 2002, 07:54 PM
Board Addict
 

Join Date: Oct 2001
Location: Sheffield / Astley
Gender: Male
Posts: 3,343
Posted by: fugjostle
Good Luck mate

When you finish your site you should run it through the World Wide Web Consortium's (W3C) validator... it will check your HTML and tell you if it's compliant or not. Compliant HTML gives you the best chance of your pages displaying properly in any web browser.

http://validator.w3.org/

---
Fug
  #7  
Old 31st August 2002, 09:28 AM
Board Addict
 

Join Date: Nov 2001
Location: Chav-ville Droylsden, innit!
Posts: 1,076
Posted by: Red Mancunian
Nice one mate. I will.

Cheers

- Dedicated to the memory of Anthony ROCK-XC -