
I have the following tab-separated table:

NM_000057   0
NM_000059   0
NM_000060   0
NM_000061   0
NM_000062   0
NM_000063   0
NM_000063   0
NM_000063   3
NM_000063   2
NM_000063   0
NM_000063   0
NM_000063   0
NM_000064   0
NM_000065   0
NM_000066   0
NM_000067   0
NM_000068   0
NM_000069   0
NM_000070   0

I want to group the rows by the first column: when several rows share the same first value, I want to merge them into a single row and sum their second-column values. For the example above, the result would be:

NM_000057   0
NM_000059   0
NM_000060   0
NM_000061   0
NM_000062   0
**NM_000063 5**
NM_000064   0
NM_000065   0
NM_000066   0
NM_000067   0
NM_000068   0
NM_000069   0
NM_000070   0

Thank you!

αғsнιη
  • Just a question to make sure: where should the resulting line go? I see your entries are ordered. Do you want to preserve the order of the lines, or can the new line be added at the end of the file? – Sergiy Kolodyazhnyy Sep 26 '16 at 10:59
  • I would like to create a new file containing only the unique entries, each with the sum of the second-column values of the rows that share its name. Preserving the order of the lines would be nice too. – Joan Gibert Fernandez Sep 26 '16 at 14:36

2 Answers


Use `awk`:

awk '{seen[$1]+=$2} END{for (x in seen) print x, seen[x]}' infile > outfile

In the awk command above, the `seen[$1]+=$2` part does the main job: it uses the first field (`$1`) as the key of the `seen` array and adds the value of the second column (`$2`) to that key's running total, so rows with the same key are summed together.

In the `END` block, we loop over the `seen` array with `x` as the loop variable and print each key from the first column followed by its summed value, `seen[x]`.
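Since the comments ask for the original line order to be kept, note that awk does not specify the order in which `for (x in seen)` visits the keys. Below is a minimal sketch of one way to keep first-seen order. The `sum` and `order` arrays, the small sample file, and the `infile`/`outfile` names are placeholders chosen for illustration, and it assumes the input really is tab-separated:

```shell
# Hypothetical sample input (tab-separated); file names are placeholders.
printf 'NM_000062\t0\nNM_000063\t3\nNM_000063\t2\nNM_000064\t0\n' > infile

# Record each key the first time it appears, then print in that order.
awk -F'\t' -v OFS='\t' '
    !($1 in sum) { order[++n] = $1 }   # first occurrence: remember its position
    { sum[$1] += $2 }                  # accumulate the second column per key
    END { for (i = 1; i <= n; i++) print order[i], sum[order[i]] }
' infile > outfile
```

Setting `OFS` to a tab keeps the output tab-separated like the input; without it, `print` would join the two fields with a single space.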

αғsнιη

Having recently discovered GNU Datamash, I'm going to throw in

datamash groupby 1 sum 2 < input

If your data is not already sorted, you may need to add the `-s` option, and if it is separated by other whitespace (instead of tabs), add `-W`.

steeldriver
  • Hi! Thank you, but it seems that I do not have datamash on my macOS. Is it exclusive to Linux? – Joan Gibert Fernandez Sep 26 '16 at 14:24
  • @JoanGibertFernandez a version of it should be available via `brew` I think - see http://brewformulas.org/Datamash - or you can build it from source – steeldriver Sep 26 '16 at 15:39
  • @JoanGibertFernandez ... although you really shouldn't be asking OSX questions on AskUbuntu, for exactly the reason you have just discovered - the answers given may not always apply to your OS – steeldriver Sep 26 '16 at 16:49