
This question pertains to publishing the Ubuntu Serverguide on help.ubuntu.com. For the 20.04 LTS cycle, there will (O.K., might) be significant changes to the source code workflow for the Ubuntu Serverguide. Currently, translations are not being considered for this migration. Our best feedback has been that server admin types prefer English, even if it is not their first language. Before the decision becomes irreversible, we want to test it by doing the next point release of the 18.04 Ubuntu Serverguide in U.S. English only. Everything is ready, except for one issue:

We know for certain that many links, bookmarks, etc. exist with the language extension. Example:

https://help.ubuntu.com/lts/serverguide/networking.html.en-CA

And we want to have that scenario return this page instead:

https://help.ubuntu.com/lts/serverguide/networking.html

because the language specific versions will no longer exist, but returning a 404 Not Found error is undesirable.

The current version of an .htaccess file, with commented out previous attempts, is:

# unable to make below method work.
#RedirectMatch permanent ^(*\.html)\.*$ $1
#
# enable rewriting
RewriteEngine on
#RewriteRule ^(*\.html)\.*$ $1 [R=301, L]
#RewriteRule ^(*\.html)\.*$ $1
RewriteRule ^(*\.html)\.*$ $1 [PT]
#RewriteRule ^(*.html).*$ $1

Resulting in:

500 Internal Server Error

to the client, and this in the test server logs:

[Thu Jun 20 11:57:07.647838 2019] [core:alert] [pid 16079] [client 192.168.111.101:62992] /home/doug/public_html/linux/ubuntu-docs/help.ubuntu.com/dev/lts/serverguide/.htaccess: RewriteRule: cannot compile regular expression '^(*\\.html)\\.*$', referer: http://my-test-website/~doug/linux/ubuntu-docs/help.ubuntu.com/dev/index.html
[Thu Jun 20 14:19:27.360334 2019] [core:alert] [pid 16079] [client 192.168.111.101:63908] /home/doug/public_html/linux/ubuntu-docs/help.ubuntu.com/dev/lts/serverguide/.htaccess: RewriteRule: cannot compile regular expression '^(*\\.html)\\.*$', referer: http://my-test-website/~doug/linux/ubuntu-docs/help.ubuntu.com/dev/index.html

Note that the current attempts use a wildcard for the language extension. If that is not possible, then the language list is:

ace ar ast be bg bn bs ca cs da de el en en_AU en_CA en_GB eo es et eu fa fi fr gl gu he hr hu id is it ja km ko ku lo lt lv mk ms nb nl oc pl ps pt_BR pt ro ru sk sl sq sr sv th tl tr ug uk ur vi zh_CN zh_TW

Can someone help with this?

Doug Smythies
  • A very well written question! Complete with rationale, examples, attempted solutions and error messages :) – vidarlo Jun 20 '19 at 22:02

1 Answer

RewriteRule ^(.*\.html)\..*$ $1

This rewrites /foobar.html.anything into /foobar.html, and seems to do what you want.

If it is intended to be permanent, you should probably send a 301:

RewriteRule ^(.*\.html)\..*$ $1 [R=301]
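
For reference, the attempts in the question fail because in `^(*\.html)\.*$` the `*` directly follows `(`, so the quantifier has nothing to repeat and the pattern cannot be compiled. The sketch below annotates the difference and shows the same pattern in a `RedirectMatch` directive, which the comments below report as working. It is only a sketch: it assumes mod_alias is loaded and that .htaccess overrides are allowed, the 302 status is a choice for the trial period, and the stricter locale-shaped pattern is my own suggestion inferred from the language list in the question.

```apache
# Assumes mod_alias is loaded and .htaccess overrides are permitted
# (e.g. AllowOverride FileInfo); neither is confirmed for the real server.

# Broken:  ^(*\.html)\.*$     "(*" gives the quantifier nothing to repeat
# Works:   ^(.*\.html)\..*$   ".*" is "any run of characters";
#                             "\." matches a literal dot

# RedirectMatch tests the whole URL-path, so $1 already starts with "/".
# 302 (temporary) while the English-only trial is being evaluated.
RedirectMatch 302 ^(.*\.html)\..*$ $1

# Stricter alternative (use instead of the line above, not alongside it):
# only strips suffixes shaped like the listed locales,
# e.g. .de, .ace, .en_AU, .en-CA
#RedirectMatch 302 ^(.*\.html)\.[a-z]{2,3}([_-][A-Z]{2})?$ $1
```

The target `/foobar.html` has nothing after `.html`, so it does not match the pattern again and the redirect cannot loop.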
vidarlo
  • It certainly does. Thanks so much. Wish I came here many many hours ago. – Doug Smythies Jun 20 '19 at 21:55
  • Yes, we actually want a 301 response. I took your answer, but put it into a `RedirectMatch permanent` stanza instead, and it is working. I know for certain that `RedirectMatch permanent` stanzas work on the real server, because we use them elsewhere, but I do not know for certain about `RewriteRule`. – Doug Smythies Jun 20 '19 at 22:04
  • Correction: I'll use a 302 response, until the decision is finalized, then switch to a 301 response. – Doug Smythies Jun 20 '19 at 22:40
  • The single language [18.04 Serverguide](https://help.ubuntu.com/lts/serverguide/index.html) point release has now been published on [help.ubuntu.com](https://help.ubuntu.com/) with the .htaccess file based on your answer. – Doug Smythies Jun 21 '19 at 15:00
  • @DougSmythies Note that an actual .htaccess is [bad for the performance](https://haydenjames.io/disable-htaccess-apache-performance/), as Apache has to check for modifications and presence of the .htaccess on every call. Moving it into the server config will improve performance. – vidarlo Jun 21 '19 at 17:02
  • Yes, I know, and that is what I do on my own server. However, the actual publication of [help.ubuntu.com](https://help.ubuntu.com/) is done automatically once every 24 hours (around 6 A.M. in my time zone) from the [master launchpad code](https://code.launchpad.net/~ubuntu-core-doc/help.ubuntu.com/help.ubuntu.com). We, the [Ubuntu Doc Team committers](https://launchpad.net/~ubuntu-core-doc), are not worthy of messing with the server configs. Nor do we have access to the server logs, which would be useful. So we use the .htaccess method. – Doug Smythies Jun 21 '19 at 18:36
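
For readers who do control the server configuration, the suggestion in the comments could look roughly like the sketch below. The directory path is hypothetical and is not the real help.ubuntu.com layout; the 302 status mirrors the trial-period choice mentioned above.

```apache
# Hypothetical path; substitute the directory that actually serves the guide.
<Directory "/srv/www/help.ubuntu.com/lts/serverguide">
    # Stop Apache from looking for .htaccess files under this directory.
    AllowOverride None

    # Same redirect as in the .htaccess version, now read once at startup.
    RedirectMatch 302 ^(.*\.html)\..*$ $1
</Directory>
```

With `AllowOverride None`, any .htaccess file under that directory is ignored, so the rule has to live in the server config only.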