Could someone explain what happens behind the scenes when characters are escaped in the Linux shell? I tried the following and searched a lot, but I still do not understand what is going on, or how:

root@sv01:~# echo -e "\ Hello!"
\ Hello!
root@sv01:~# echo -e "\\ Hello!"
\ Hello!
root@sv01:~# echo -e "\\\ Hello!"
\ Hello!
root@sv01:~# echo -e "\\\\ Hello!"
\ Hello!
root@sv01:~# echo -e "\\\\\ Hello!"
\\ Hello!
root@sv01:~# echo -e "\\\\\\ Hello!"
\\ Hello!
root@sv01:~# echo -e "\\\\\\\ Hello!"
\\ Hello!
root@sv01:~# echo -e "\\\\\\\\ Hello!"
\\ Hello!
root@sv01:~# echo -e "\\\\\\\\\ Hello!"
\\\ Hello!
root@sv01:~# echo -e "\n Hello!"

 Hello!
root@sv01:~# echo -e "\\n Hello!"

 Hello!
root@sv01:~# echo -e "\\\n Hello!"
\n Hello!

I am totally lost. For example, why do three backslashes give only one backslash? I would expect the first two to be escaped to one, and the third to find nothing to escape and so remain a backslash (as in the first experiment), but instead the third one just disappears.
Why do I get one backslash from four (\\\\ Hello!)? I would expect each pair to give one backslash, so two backslashes in total.

And why do I need three backslashes in the last case to get \n escaped? What is happening in the background to produce that, and how is it different from the \\n case?

I appreciate any explanation of what is going on in the previous lines.

Mohammed Noureldin
  • 1
    `echo -e` behavior is not standards-defined anyhow -- see http://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html. Output is *completely implementation-defined* if there's a literal backslash anywhere in the inputs, and the only allowed option is `-n` (meaning that a standards-compliant implementation will have `echo -e` print `-e` on its output). – Charles Duffy Sep 13 '17 at 14:41
  • ...even if you're 100% sure that your shell is bash, *even then* `echo -e` isn't safe: `echo` will behave in accordance with the standard if both `posix` and `xpg_echo` runtime options are enabled, or if compiled with equivalent build-time options. The safe practice is to use `printf` instead -- see the APPLICATION USAGE and RATIONALE sections of the above link describing how to make `printf` act as a replacement for `echo`. – Charles Duffy Sep 13 '17 at 14:43

1 Answer

This is the result of bash and echo -e acting together. From man 1 bash:

A non-quoted backslash (\) is the escape character. It preserves the literal value of the next character that follows, with the exception of <newline>. […]

Enclosing characters in double quotes preserves the literal value of all characters within the quotes, with the exception of $, `, \, […] The backslash retains its special meaning only when followed by one of the following characters: $, `, ", \, or <newline>.

The point is: a double-quoted backslash is not always special.
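To look at the bash step on its own, one option is to print the argument with printf '%s\n', which does not interpret backslashes in its argument. The following is only a minimal sketch; the helper name show_arg is hypothetical and not part of bash.

```shell
# show_arg is a hypothetical helper: it prints its first argument verbatim,
# so the output is what the shell handed over after quote processing.
show_arg() { printf '%s\n' "$1"; }

show_arg "\ Hello!"     # prints: \ Hello!   (backslash before a space is not special)
show_arg "\\ Hello!"    # prints: \ Hello!   (\\ collapses to one backslash)
show_arg "\\\ Hello!"   # prints: \\ Hello!  (one collapsed pair plus one untouched backslash)
show_arg "\$HOME"       # prints: $HOME      (\$ becomes a literal $, no expansion)
```

Nothing here involves echo at all, so whatever differs from the echo -e output in the question must come from the second step.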

There are various implementations of echo; in bash it is a builtin. The important thing here is this behavior:

If -e is in effect, the following sequences are recognized:
\\
backslash
[…]
\n
new line

Now we can decode:

  1. echo -e "\ Hello!" – nothing special to bash, nothing special to echo; \ stays.
  2. echo -e "\\ Hello!" – the first \ tells bash to treat the second \ literally; echo gets \ Hello! and acts as above.
  3. echo -e "\\\ Hello!" – the first \ tells bash to treat the second \ literally; the third is followed by a space, so it is not special; echo gets \\ Hello! and (because of -e) it recognizes \\ as \.
  4. echo -e "\\\\ Hello!" – the first \ tells bash to treat the second \ literally; the third tells the same about the fourth; echo gets \\ Hello! and (because of -e) it recognizes \\ as \.
  5. echo -e "\\\\\ Hello!" – the first \ tells bash to treat the second \ literally; the third tells the same about the fourth; the last one is not special; echo gets \\\ Hello! and (because of -e) it recognizes the initial \\ as \, the last \ stays intact.

And so on. As you can see, up to four consecutive backslashes give one in the result. That's why you need (at least) nine of them to get three: 9 = 4 + 4 + 1.
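The counting can be sketched as a small model. This is an illustration of the reasoning above, not a measurement: it assumes the backslashes are followed by a space inside double quotes, and that each step turns every pair into one backslash while keeping an unpaired trailing one. The function name halve_up is hypothetical.

```shell
# Model only: each step maps a run of n backslashes to ceil(n / 2),
# because every pair becomes one and an unpaired last one is kept.
halve_up() { printf '%s\n' "$(( ($1 + 1) / 2 ))"; }

n=1
while [ "$n" -le 9 ]; do
    after_bash=$(halve_up "$n")            # backslashes that echo receives
    after_echo=$(halve_up "$after_bash")   # backslashes that echo -e prints
    printf '%s typed -> %s passed to echo -> %s printed\n' "$n" "$after_bash" "$after_echo"
    n=$((n + 1))
done
```

The last column comes out as 1, 1, 1, 1, 2, 2, 2, 2, 3, which matches the nine experiments in the question.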

Now with \n:

  1. echo -e "\n Hello!" – there's nothing special to bash, echo gets the same string and (because of -e) it interprets \n as a newline.
  2. echo -e "\\n Hello!" – bash interprets \\ as \; echo gets \n Hello! and the result is the same as above.
  3. echo -e "\\\n Hello!" – bash interprets the initial \\ as \; echo gets \\n Hello! and (because of -e) it interprets \\ as a literal \ which needs to be printed.
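The two steps for these three cases can be made visible with printf. This is a sketch that uses printf '%b' as a stand-in for echo -e; for the two sequences involved here (\\ and \n) it expands them the same way. The helper name two_steps is hypothetical.

```shell
# %s shows what the shell passes on; %b then applies backslash escapes
# to that same argument, standing in for what echo -e would do.
two_steps() {
    printf 'passed on: %s\n' "$1"
    printf 'printed:   %b\n' "$1"
}

two_steps "\n Hello!"     # the shell leaves \n alone; the escape step makes it a newline
two_steps "\\n Hello!"    # the shell yields \n; same result as the first case
two_steps "\\\n Hello!"   # the shell yields \\n; the escape step prints a literal \n
```

In the third call the escape step consumes \\ first, so the n that follows is just an ordinary character by the time it is reached.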

The results would be different with ' instead of " (due to different bash behavior) or without -e (different echo behavior).
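A short sketch of both differences, again with printf instead of echo so that the result does not depend on the echo implementation. The variable names are only illustrative; %s applies no escapes (similar to bash's builtin echo without -e under default settings), and %b applies them once.

```shell
single='\\\n Hello!'   # single quotes: nothing inside is changed by the shell
double="\\\n Hello!"   # double quotes: the first pair has already collapsed

printf '%s\n' "$single"   # prints: \\\n Hello!
printf '%s\n' "$double"   # prints: \\n Hello!
printf '%b\n' "$double"   # prints: \n Hello!
printf '%b\n' "$single"   # prints a backslash, then a line break, then " Hello!"
```

With single quotes only one escaping step remains, which is usually easier to reason about.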

Kamil Maciorowski
  • 1
    Thank you very much! So there are two escaping steps: the first is done by bash, and the second, done by `-e`, processes the text already processed by bash. Is that right? If so, that removes the confusion. Could you maybe mention how `sed` behaves? Does it have its own built-in escaping technique? I mean, does it behave like echo with `-e`? – Mohammed Noureldin Sep 13 '17 at 01:23
  • 2
    @MohammedNoureldin Yes, there are two steps. Your original question is fine as it is, let's not make it overcomplicated with `sed`. You may ask another question though (about `sed` only), just do your own research first. – Kamil Maciorowski Sep 13 '17 at 01:29
  • 3
    @MohammedNoureldin `sed` will follow the same basic principles: bash will interpret the command line according to its rules, parsing (and maybe removing) quotes and escapes. The result of that gets passed to `sed`, which interprets it according to *its* escaping rules. BTW, you can simplify this by using a single-quoted string in `bash`, since bash does no escape interpretation inside it -- bash removes the single quotes and passes what's inside them directly to the command. – Gordon Davisson Sep 13 '17 at 03:15
  • I still have a small problem with the `echo -e "\\\n Hello!"` case. Here bash does its escaping and the string becomes `\\n`; then `-e` processes the resulting two backslashes in `\\n`, and they become `\n`. Now who interprets `\n` to make it a new line? Normally, in the case of `echo -e "\n Hello!"`, bash does nothing and `-e` interprets `\n` as a new line. But in the first situation the interpretation step had already run before `\n` appeared. Could you explain that please? – Mohammed Noureldin Sep 13 '17 at 09:50
  • @MohammedNoureldin "Who interprets `\n` to make it new line?" -- no program does. Your last example in the question shows clearly that in this case `\n` is printed, not interpreted. – Kamil Maciorowski Sep 13 '17 at 09:56
  • 1
    OK, my fault. I have been reading about this for two days and nights, so apparently I started to mix up the cases. I believe I got something else with `sed` which confused me last night (maybe I did something wrong yesterday), but now I tried the same again with `echo -e` and `sed`, and I got the same thing as expected, thanks! – Mohammed Noureldin Sep 13 '17 at 10:10
  • May I ask you to take a look at this? This is what caused the main confusion. Now I understand every detail you wrote (I think), but I still cannot explain what is going on here: https://superuser.com/questions/1249497/sed-needs-n-to-append-new-line – Mohammed Noureldin Sep 13 '17 at 10:33
  • @GordonDavisson, that principle applies pretty well to `sed substitution` but not to `appending`, and I do not understand why. You may check that in the link above. – Mohammed Noureldin Sep 13 '17 at 11:41
  • (Aside: Please don't use backtick-style quoting to escape technical terms -- they're for *code*, not English prose that describes code). – Charles Duffy Sep 13 '17 at 14:44
  • @KamilMaciorowski, that was re: prior comment (code-formatting `sed substitution` and `appending` despite neither of those being code), not your answer. – Charles Duffy Sep 13 '17 at 15:59