Dwebsites: HTML meta tag and robots.txt

Introducing a small, informal proposal for standards for Dwebsites.

Literally five minutes after we released the Almonit search engine, a question arrived: “How do I add my Dwebsite to the search engine?”

Well, here’s how.

Add a Dwebsite to the Almonit search engine

We crawl for Dwebsites constantly and find them on our own. The following two methods speed up the process and make sure we don’t skip your website.

It’s sufficient to use only one of those methods.

  1. Add a Dwebsite HTML meta tag inside index.html:
<meta name="Dwebsite" content="yourDwebsite.eth">  


Replace yourDwebsite.eth with the address of your Dwebsite.

  2. Add the following line to the robots.txt file in the base folder of the Dwebsite:
Dwebsite: yourDwebsite.eth


Simple, right?

FAQ

Here are some questions people may ask.

Q: What if I don’t put a meta tag or robots.txt in my Dwebsite?

Nothing dramatic happens. We continue as we do now, meaning we will probably index the Dwebsite, but we can’t guarantee it.

The meta tag and robots.txt are just ways to speed up the process and to ensure our crawler notices the Dwebsite.

Q: Will all Dwebsites using those methods be indexed?

No.

Why? Because at the moment there are about ten “hello world!” Dwebsites for every “proper” one. Almonit aims to give users a glimpse of the value Dwebsites offer, so we index only the ones with real content.

We intend this state to be temporary. Once there are enough Dwebsites, we aim to automate the process and even decentralize it: let the community index itself!

Q: What if the HTML meta tag or robots.txt is not consistent with the blockchain records?

By this we mean that the HTML meta tag or robots.txt stored under an IPFS CID points to x.eth, but x.eth does not point back to that IPFS CID.

In this case we won’t index the Dwebsite. You must connect, on the blockchain, your .eth name to the IPFS CID if you want it to appear in Almonit.
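
The check described above can be sketched in a few lines of Python. This is only an illustration, not necessarily how the Almonit crawler does it: the function name consistent_names is made up, the CIDs are placeholders, and the resolve argument stands in for a real blockchain lookup, which is not implemented here.

```python
def consistent_names(declared_names, site_cid, resolve):
    """Keep only the declared names whose record points back to site_cid.

    `resolve` is any callable mapping a .eth name to a CID (or None).
    It is a hypothetical stand-in for an actual on-chain lookup.
    """
    return [name for name in declared_names if resolve(name) == site_cid]


# Placeholder records instead of real blockchain data (assumption).
records = {"x.eth": "cid-A", "y.eth": "cid-B"}

# The content stored under cid-A declares three names; only x.eth points back.
print(consistent_names(["x.eth", "y.eth", "z.eth"], "cid-A", records.get))
# prints ['x.eth']
```

Names that point elsewhere (y.eth) or that have no record at all (z.eth) are simply dropped, which matches the rule that an inconsistent declaration is not indexed.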

Q: Can I mention several .eth names in my Dwebsite?

Yes! Just write a list of .eth names, separated by commas. You can do that either in index.html:

<meta name="Dwebsite" content="yourDwebsite1.eth, yourDwebsite2.eth">


or in robots.txt:

Dwebsite: yourDwebsite1.eth, yourDwebsite2.eth


Replace yourDwebsite1.eth, yourDwebsite2.eth with the addresses of your Dwebsite.
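
For readers curious how such declarations could be read, here is a minimal Python sketch using only the standard library. It is an illustration under assumptions, not the actual Almonit crawler: the function names are made up, and matching "Dwebsite" case-insensitively and ignoring "#" comments in robots.txt are our guesses at reasonable behavior.

```python
from html.parser import HTMLParser


class _DwebsiteMetaParser(HTMLParser):
    """Collects the content of every <meta name="Dwebsite"> tag."""

    def __init__(self):
        super().__init__()
        self.contents = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Case-insensitive match on the name is an assumption.
        if (attr.get("name") or "").lower() == "dwebsite" and attr.get("content"):
            self.contents.append(attr["content"])


def split_names(value):
    """Split a comma-separated list of .eth names, dropping blanks."""
    return [name.strip() for name in value.split(",") if name.strip()]


def names_from_html(html):
    """Return the .eth names declared in Dwebsite meta tags."""
    parser = _DwebsiteMetaParser()
    parser.feed(html)
    names = []
    for content in parser.contents:
        names.extend(split_names(content))
    return names


def names_from_robots(robots_txt):
    """Return the .eth names declared on Dwebsite: lines of robots.txt."""
    names = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "dwebsite":
            names.extend(split_names(value))
    return names


html = '<head><meta name="Dwebsite" content="yourDwebsite1.eth, yourDwebsite2.eth"></head>'
print(names_from_html(html))
# prints ['yourDwebsite1.eth', 'yourDwebsite2.eth']
print(names_from_robots("Dwebsite: yourDwebsite1.eth, yourDwebsite2.eth\n"))
# prints ['yourDwebsite1.eth', 'yourDwebsite2.eth']
```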

Q: Can I tell Almonit NOT to index my website?

You’re breaking our heart, really! But of course you can. Just tell our crawler not to index it by putting the following lines in robots.txt:

User-agent: Almonit
Disallow: /


You can also tell all crawlers not to index your website by using this robots.txt:

User-agent: *
Disallow: /


In other words, we are following the regular specifications of robots.txt.

Q: Can I ask that ONLY Almonit index my website?

Yes! Now you’re melting our heart ❤. You can do that with this robots.txt:

User-agent: Almonit
Disallow:

User-agent: *
Disallow: /


That’s a nice gesture for us!
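
If you want to double-check such a file before publishing it, Python’s standard urllib.robotparser module can evaluate it. The sketch below shows how a standard robots.txt parser reads the rules above; it is not the Almonit crawler itself, and the can_fetch helper and the "OtherBot" user agent are made up for the example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Almonit
Disallow:

User-agent: *
Disallow: /
"""


def can_fetch(robots_txt, user_agent, path):
    """Return True if user_agent may fetch path under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# An empty Disallow means "everything is allowed" for that user agent.
print(can_fetch(ROBOTS_TXT, "Almonit", "/index.html"))   # prints True
# Every other crawler falls under the "*" group and is blocked.
print(can_fetch(ROBOTS_TXT, "OtherBot", "/index.html"))  # prints False
```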

Q: What if I want Almonit to index only parts of my Dwebsite?

No problem. Our crawler parses robots.txt as any other crawler would.

So if you want, for example, Almonit not to index a folder called /noalmonit/, use the following robots.txt:

User-agent: Almonit
Disallow: /noalmonit/


Or if you want, for example, Almonit not to index a file called noalmonit.pdf, use the following robots.txt:

User-agent: Almonit
Disallow: /noalmonit.pdf
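
To see which paths such rules leave open, you can feed them to Python’s standard urllib.robotparser. This is a small sketch under assumptions: it combines the folder and file rules in one file for illustration, the allowed_paths helper and the sample paths are made up, and it shows how a standard parser behaves rather than the Almonit crawler’s own code.

```python
from urllib.robotparser import RobotFileParser

# Both example rules combined in one group (an assumption for illustration).
ROBOTS_TXT = """\
User-agent: Almonit
Disallow: /noalmonit/
Disallow: /noalmonit.pdf
"""


def allowed_paths(robots_txt, user_agent, paths):
    """Return the subset of paths that user_agent is allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if parser.can_fetch(user_agent, path)]


paths = ["/index.html", "/noalmonit/secret.html", "/noalmonit.pdf", "/blog/post.html"]
print(allowed_paths(ROBOTS_TXT, "Almonit", paths))
# prints ['/index.html', '/blog/post.html']
```

Rules match by path prefix, so everything under /noalmonit/ is skipped while the rest of the Dwebsite stays indexable.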


Find more info about robots.txt here.

What if I still have questions?

We love questions! Ask us on Twitter or write an email to contact@almonit.club.