
I want to monitor the log file of my application. However, the application doesn't run locally; it runs on a SaaS platform, and its log file is exposed over HTTP and WebDAV. So an equivalent of tail -f that works on URLs would do a great job for me.

P.S. If you know of any other tools that can monitor remote files over HTTP, that would also be of help. Thanks

munch
  • Is it shown as plain text on the remote server or as HTML? – terdon Dec 03 '12 at 12:46
  • Plain text with a specific format: [timestamp] Error_name ..... which I then intend to filter through grep – munch Dec 03 '12 at 12:54
  • You can use `wget -N http://somewhere/something`, which will download the file only if it's newer than the one you downloaded before, or use `wget -O - http://somewhere/something` to redirect the file to stdout (a minimal polling loop based on this is sketched below). – week Dec 03 '12 at 13:08
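A minimal sketch of the polling loop week suggests, assuming the server sends Last-Modified headers so that -N has something to compare against (URL and interval are placeholders):

while :; do
    # -N (timestamping) re-downloads the file only if the remote
    # copy is newer than the local one; -q keeps the loop quiet
    wget -qN http://somewhere/something
    sleep 5
done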

3 Answers


There may be a specific tool for this, but you can also do it using wget. Open a terminal and run this command:

while :; do
    sleep 2
    # -c resumes the download, so only new data is fetched and appended
    wget -c -O log.txt -o /dev/null http://yoursite.com/log
done

This will download the logfile every two seconds and append whatever is new to log.txt (-c means continue the download, so each run fetches only the bytes added since the last run, and -O gives the name of the output file). The -o redirects wget's status messages to /dev/null.

So, now you have a local copy of log.txt and can run tail -f on it:

tail -f log.txt 
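If you prefer a single self-contained script over running the loop and tail -f in separate terminals, here is a minimal sketch of the same idea (same placeholder URL):

#!/bin/bash
# fetch new data every two seconds in the background
while :; do
    sleep 2
    wget -c -O log.txt -o /dev/null http://yoursite.com/log
done &

# stop the background loop when tail is interrupted with ^C
trap 'kill $!' EXIT

tail -f log.txt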
terdon
  • I found out that I could use davfs2 to integrate with the WebDAV interface and then use the file like a regular file. This is what I really expected. But your solution is simpler and actually works (a davfs2 sketch follows these comments). – munch Dec 03 '12 at 16:05
  • I found that everything was being saved in the "log" file, not "log.txt". In my case this works: wget -ca -O log.txt -o /dev/null http://yoursite.com/log – yatsek Feb 18 '14 at 10:49
  • @munch davfs2 doesn't work that well. In my case I found that `tail -f` doesn't pick up file changes unless some other process actively asks the server for directory updates (a plain `ls` seems to be enough). The problem is that `tail -f` relies on inotify, and inotify doesn't seem to work over davfs2. – jesjimher Jan 09 '18 at 11:04
  • @jesjimher `tail` does not depend on inotify. It simply reads the file, seeks back and reads again. If it doesn't work well with davfs, that will be down to how davfs itself works. Presumably, it only updates information when something is actively reading the directory and since `tail` keeps the file open, that doesn't trigger it. Or something along those lines. – terdon Jan 09 '18 at 12:59
  • As far as I understand tail's code, it's not a hard dependency: it uses inotify if it's available and resorts to polling only if inotify is missing from the system. Since davfs can't know when a file has changed without making an explicit request, no inotify event is generated until some other process requests a directory refresh. It would be nice if tail had some way to force polling even when inotify is available, but I haven't found such a parameter. – jesjimher Jan 10 '18 at 14:02
  • Thanks, that's very helpful! You can shorten it into one line: `while true; do wget -ca -o /dev/null -O output.txt "$URL"; sleep 2; done` – Khaled AbuShqear Jun 12 '20 at 16:32
  • @KhaledAbuShqear yes, the `\ ` are not needed here at all. In my defense, this was written quite a few years ago :) – terdon Jun 12 '20 at 16:33
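For reference, a sketch of the davfs2 route discussed in these comments, with a manual polling loop to work around the missing inotify events (mount point, file name, and interval are examples; requires the davfs2 package):

# mount the WebDAV share (example URL and mount point)
sudo mount -t davfs https://yoursite.com/dav /mnt/dav

# poll the file instead of relying on tail -f's inotify events;
# listing the directory nudges davfs2 into refreshing its cache
size=0
while :; do
    ls /mnt/dav > /dev/null
    newsize=$(stat -c %s /mnt/dav/log)
    if [ "$newsize" -gt "$size" ]; then
        tail -c +$((size + 1)) /mnt/dav/log
        size=$newsize
    fi
    sleep 2
done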

I answered the same question over here with a complete shell script that takes the URL as its argument and tail -f's it. Here's a copy of that answer verbatim:


This will do it:

#!/bin/bash

# download into a temporary file that is removed on exit
file=$(mktemp)
trap 'rm "$file"' EXIT

# in the background, repeatedly request only the bytes we don't have
# yet (a range starting at the current file size) and append them
(while true; do
    # shellcheck disable=SC2094
    curl --fail -r "$(stat -c %s "$file")"- "$1" >> "$file"
done) &
pid=$!
trap 'kill $pid; rm "$file"' EXIT

tail -f "$file"

It's not very friendly to the web server. You could replace the true with sleep 1 to make it less resource intensive.

Like tail -f, you need to ^C when you are done watching the output, even when the output is done.
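For example, if the script is saved as tailf.sh (the name is just an example):

chmod +x tailf.sh
./tailf.sh 'http://yoursite.com/log'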

Brian

curl with the range option, in combination with watch, can be used to achieve this:

RANGES

HTTP 1.1 introduced byte-ranges. Using this, a client can request to get only one or more subparts of a specified document. Curl supports this with the -r flag.

watch -n <interval> 'curl -s -r -<bytes> <url>'

For example

watch -n 30 'curl -s -r -2000 http://yoursite.com/log'

This will retrieve the last 2000 bytes of the log every 30 seconds.

Note: for self-signed HTTPS certificates, use curl's --insecure option.
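For example, to watch the last 2000 bytes of a log served over HTTPS with a self-signed certificate (placeholder URL):

watch -n 30 'curl -s --insecure -r -2000 https://yoursite.com/log'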

ghm1014