Why single out Perplexity? Pretty much no crawler out there fetches robots.txt.

robots.txt is not a blocking mechanism; it's a hint indicating which parts of a site might be of interest for indexing.

People started using robots.txt to lie and declare that no part of their site is interesting, and so of course that gets ignored.



This is objectively wrong. Take it straight from the source: https://www.rfc-editor.org/rfc/rfc9309.html
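
For anyone who wants to see what honoring the file looks like in practice, here is a rough sketch using Python's standard-library `urllib.robotparser`. The bot names, the rules, and the `allowed` helper are made up for illustration, and the stdlib parser is only an approximation of RFC 9309 matching, not a reference implementation of it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler gets partial access,
# every other crawler is asked to stay out entirely.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow: /
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("ExampleBot", "https://example.com/public/page"))
    print(allowed("ExampleBot", "https://example.com/private/data"))
    print(allowed("OtherBot", "https://example.com/"))
```

A real crawler would fetch `/robots.txt` from the target host first (for example with `set_url()` and `read()`) and skip any URL for which `can_fetch()` returns False. Nothing enforces that check on the server side, which is why the whole thing depends on crawlers choosing to cooperate.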


That's not true, at all.



