Bug 150

Summary: bugzilla robots.txt blocking web crawlers such as archive.org
Product: Libre-SOC Website
Reporter: Jacob Lifshay <programmerjake>
Component: website
Assignee: Luke Kenneth Casson Leighton <lkcl>
Status: CONFIRMED
Severity: normal
CC: libre-soc-bugs
Priority: ---
Version: unspecified
Hardware: All
OS: All
NLnet milestone: ---
total budget (EUR) for completion of task and all subtasks: 0
budget (EUR) for this task, excluding subtasks' budget: 0
parent task for budget allocation:
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:

Description Jacob Lifshay 2019-12-22 01:24:13 GMT
We should switch robots.txt to allow more things. A good template to use would be Mozilla's Bugzilla robots.txt:
https://bugzilla.mozilla.org/robots.txt
Comment 1 Luke Kenneth Casson Leighton 2019-12-22 02:00:24 GMT
yep, this is apparently quite common: web-crawling of bugzilla can be pretty heavy, so mozilla set up a default that banned pretty much everything. i'm not so bothered, so i have set it to "Allow: /"
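
For anyone wanting to check what a given robots.txt actually permits, here is a minimal sketch using Python's standard `urllib.robotparser`. The two rule sets, the crawler name `ExampleArchiveBot` and the host `bugs.example.org` are hypothetical and only illustrate the difference between a block-everything policy and an allow-everything one. They are not the contents of the real robots.txt on this server or on Mozilla's.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical policies, for illustration only.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nAllow: /\n"

# Hypothetical crawler name and bug URL.
AGENT = "ExampleArchiveBot"
BUG_URL = "https://bugs.example.org/show_bug.cgi?id=150"

if __name__ == "__main__":
    print("block-all policy:", can_crawl(BLOCK_ALL, AGENT, BUG_URL))
    print("allow-all policy:", can_crawl(ALLOW_ALL, AGENT, BUG_URL))
```

The same helper can be pointed at the text of a deployed robots.txt to confirm that bug pages are reachable by archiving crawlers before and after a change.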
Comment 2 Luke Kenneth Casson Leighton 2019-12-22 02:01:42 GMT
(In reply to Jacob Lifshay from comment #0)
> a good template to use
> would be Mozilla's bugzilla robots.txt:
> https://bugzilla.mozilla.org/robots.txt

just copied it entirely, just... because :)
Comment 3 Jacob Lifshay 2019-12-22 02:05:52 GMT
(In reply to Luke Kenneth Casson Leighton from comment #2)
> (In reply to Jacob Lifshay from comment #0)
> > a good template to use
> > would be Mozilla's bugzilla robots.txt:
> > https://bugzilla.mozilla.org/robots.txt
> 
> just copied it entirely, just... because :)

Thanks, sounds good to me!

if you have more time, it would be nice to also fix #149