I am using GNU sed for find-and-replace on large files (up to 2 GB).

The find and replace strings can contain any characters, so I want both parameters to be treated as plain text, exactly as given.

I do not want sed to interpret either the find or the replace parameter as a regex.

I have experimented a lot, but I keep running into new combinations of regex characters that sed does not treat as plain text.

How can this be achieved?

Is there any formula to escape the special characters?

Note: I am using ~ as the command separator instead of /.

Below is an example:

sed -ne "s~^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$~Replace" -ne "w output.txt" "input.txt"

The above command does not work, because sed treats the find parameter as a regex (which it is). To find that text literally, I have to escape some of the special characters, as below:

sed -ne "s~\^\[-+\]?\[0-9\]\*\\.?\[0-9\]+(\[eE\]\[-+\]?\[0-9\]+)?\$~Replace" -ne "w output.txt" "input.txt"

In another example I have to change .*$ to .\*\$. But in (.*$) I do not want to modify the input.

So is there any universal rule for the escape sequences?

sagar
  • Could you be more specific? Sample input and expected output for example. – Thor Jan 11 '13 at 06:44
  • Use single-quotes instead of double-quotes, then the shell will leave those characters alone. – Thor Jan 11 '13 at 07:02
  • but it throws following error sed: -e expression #1, char 1: unknown command: `'' – sagar Jan 11 '13 at 07:09
  • You're missing a terminating `~`. Which version of sed is this? – Thor Jan 11 '13 at 07:41
  • sed -ne 's~a.d~sss~g' -ne 'w output.txt' 'input.txt' This is my command, which is giving the error. And sed version is => GNU sed version 4.2.1 – sagar Jan 11 '13 at 09:58
  • That command works here. – Thor Jan 11 '13 at 10:03
  • @Thor It's not working on my command prompt. Can the sed version be an issue? – sagar Jan 11 '13 at 10:14
  • I have the same version. Try re-typing the command. – Thor Jan 11 '13 at 11:47
  • sadly, it's not working:-( – sagar Jan 11 '13 at 12:19
  • I agree with @Thor on two points: (1) That `sed -ne 's~a.d~sss~g' -ne 'w output.txt' 'input.txt'` command looks perfectly valid, and it works on my system.  (2) You need to explain your problem better.  You say you want a “universal rule” right after saying that you want some characters to represent themselves literally while others implement their regex functionality. – Scott - Слава Україні Jan 11 '13 at 21:40
  • I just do not want regex behavior of sed command. I want all the arguments supplied to sed for find and replace, to be treated as plain text irrespective of anything – sagar Jan 14 '13 at 04:54
  • Hello guys, thanks for your replies! My problem got solved. I am running the above sed version on Windows 7. I used the syntax below: sed -nre "s~a\.d~sss~g;w output.txt" "input.txt". Since I am using the -r option, I am escaping all the special characters that are used in regular expressions. – sagar Jan 14 '13 at 11:48

1 Answer

Q: Is there any formula to escape the special characters?
Q: Is there any universal rule for escape sequence?

A: You can use the corresponding hex code for special characters, in cases where typing /, ., *, ?, $, etc. directly becomes annoying. For example:

sed -rn '/\x22/p' file

will print lines that contain double quotes, since \x22 represents ".

If you need to look up hex codes, you can conveniently save them all to a file with this command:

gawk 'BEGIN { for (i = 0; i < 256; i++) printf("%d\t%x\t%c\n", i, i, i) }' > chars.txt
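
As for a general escaping rule, one common approach (a different technique from the hex codes above) is to backslash-escape the find and replace strings before handing them to sed. The following is only a sketch under several assumptions: basic-regex mode (no -r), ~ as the delimiter as in the question, and single-line strings. The helper names escape_find and escape_repl are made up for illustration and are not part of sed.

```shell
#!/bin/sh
# Hypothetical helpers (not part of sed). Assumes basic regex mode,
# i.e. sed is called WITHOUT -r/-E, and ~ is the s command delimiter.

# Pattern side: in a basic regex only \ . * [ ^ $ are special,
# plus the delimiter ~ itself. Prefix each of them with a backslash.
escape_find() {
    printf '%s\n' "$1" | sed 's/[\.*^$~[]/\\&/g'
}

# Replacement side: only \ and & are special, plus the delimiter ~.
escape_repl() {
    printf '%s\n' "$1" | sed 's/[\&~]/\\&/g'
}

find='.*$'        # to be matched literally, not as "anything up to end of line"
repl='a&b~c\d'    # to be inserted literally

# Replace the literal text .*$ on the first input line; the second
# line does not contain it and passes through unchanged.
printf '%s\n' 'x.*$y' 'xabc' |
    sed "s~$(escape_find "$find")~$(escape_repl "$repl")~g"
```

With -r, the pattern side would need a longer escape set (for example + ? ( ) { } |), which is one reason the sketch stays in basic mode. Strings containing newlines are not handled here.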
simlev
Til