While I don't need a robots.txt
file for this site, I understood why others might want one for theirs when I ran across this GitHub request for help on the topic: https://github.com/oleeskild/obsidian-digital-garden/issues/208
The user was asking how to mark their digital garden as noindex
so that search engines do not crawl it.
The way to do this is to put a robots.txt
file at the root of the website with the following content:
User-agent: *
Disallow: /
Then make sure the file is accessible via http://yourdomain.com/robots.txt.
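If you'd rather script that check than open the URL in a browser, a tiny Node script along these lines would work. This is only an illustration: yourdomain.com is a placeholder, and it assumes Node 18+ so fetch is built in and the file can run as an ES module.
// check-robots.mjs - run with: node check-robots.mjs
const res = await fetch('https://yourdomain.com/robots.txt');
console.log('status:', res.status); // expect 200
const body = await res.text();
console.log('blocks everything?', body.includes('Disallow: /')); // expect true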
While most major search engines respect robots.txt
directives, some minor or malicious bots may ignore them. If your site was previously indexed and you add the block later, it may take some time for search engines to remove the already-indexed pages. To expedite the removal of specific pages or the entire site, you can use tools provided by the search engines themselves, such as Google Search Console.
How to do this for an Obsidian Digital Garden?
The static site generator used here, Eleventy, isn't one I'm very familiar with, but I did figure it out. I tried adding a robots-test.txt
file to my garden repo under src/site/robots-test.txt
to see if that caused it to show up at /robots-test.txt
on the Vercel-hosted website - no luck.
Looking deeper, I found this blog post on adding robots.txt to an Eleventy site.
What is also needed is something like the snippet below to tell the static site generator, Eleventy, to pass that file through:
// Put robots.txt in root
eleventyConfig.addPassthroughCopy({ 'src/site/robots.txt': '/robots.txt' });
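For context, in a plain Eleventy project that call normally sits inside the exported configuration function in .eleventy.js, roughly like the sketch below. The input and output directories shown are assumptions for illustration, not necessarily what the digital garden template generates (its .eleventy.js is much larger, as the next step shows).
// .eleventy.js (minimal sketch, not the garden template's generated file)
module.exports = function (eleventyConfig) {
  // Copy src/site/robots.txt straight into the root of the built site
  eleventyConfig.addPassthroughCopy({ 'src/site/robots.txt': '/robots.txt' });

  return {
    dir: {
      input: 'src/site', // assumed source folder
      output: 'dist',    // assumed build output folder
    },
  };
};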
So I went to around line 540 in my site's .eleventy.js
file, right before the userEleventySetup(eleventyConfig);
line, and added the following above it:
// Put robots-test.txt in root
eleventyConfig.addPassthroughCopy({ 'src/site/robots-test.txt': '/robots-test.txt' });
That did the trick and my new robots-test.txt
showed up as expected.