awk - comparing 2 columns of 2 files and print common lines

Question

I know there are many similar questions already answered on this platform, but I have tried the solutions for several hours and cannot find my mistake. I would appreciate any hint about what I am doing wrong.

As in https://unix.stackexchange.com/questions/216511/comparing-the-first-column-of-two-files-and-printing-the-entire-row-of-the-secon and in "how can i compare data in 2 files to identify common and unique data?", I have two files, and I would like to keep only the lines of file 2 whose first column matches column 1 of file 1. In my opinion, the solutions proposed for those questions should work, but unfortunately they do not. My files are tab-separated.

file_1.txt

apple
great
see
tree

file_2.txt

apple    5.21      Noun
around   6.21      Adverb
great    2         Adjective
bee      1         Noun
see      7.43      Verb
tree     3         Noun

The output should look like:

apple    5.21      Noun
great    2         Adjective
see      7.43      Verb
tree     3         Noun

I tried comm -12 and awk (e.g. awk 'NR==FNR{a[$1];next} ($1 in a)' file_1.txt file_2.txt > output.txt), but neither gave the expected output.

I know this might be a stupid question, and I apologize in advance, but I do not seem to be able to figure it out.

`awk 'FNR==NR{a[$1];next}($1 in a){print}' file_1.txt file_2.txt > out.txt` doesn't work? — , Feb 04 '17 at 13:14
Strange, oddly works in the BSD awk. Typically it's the other way around, works in GNU not BSD. — , Feb 04 '17 at 13:23
Just checked on my system, running ubuntu 16.04, and the `awk 'FNR==NR{a[$1];next}($1 in a){print}' file_1.txt file_2.txt` seems to work fine (also works on osx). — Nick Sillito, Feb 04 '17 at 13:34
@Nick Sillito I tried again, but for some reason it does not work for me; this is really strange. Other awk commands work normally. Thank you for your help! — dani_anyman, Feb 04 '17 at 13:40
If you sort file_2.txt, you could use `join` i.e. `join file_1.txt <(sort file_2.txt)` — steeldriver, Feb 04 '17 at 13:57
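
The `join` suggestion in the last comment could be sketched roughly as follows. This assumes tab-separated input and that file_1.txt is already sorted, as in the question; the scratch directory and the `tab` and `result` variables are illustrative only. The sorted file is piped in on standard input (`-`) so the sketch does not depend on bash process substitution.

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' apple great see tree > "$tmpdir/file_1.txt"
printf '%s\t%s\t%s\n' \
    apple 5.21 Noun around 6.21 Adverb great 2 Adjective \
    bee 1 Noun see 7.43 Verb tree 3 Noun > "$tmpdir/file_2.txt"

# join needs both inputs sorted on the join field in the same collation,
# so sort file_2.txt on its first field and force the C locale for both tools.
tab=$(printf '\t')
result=$(LC_ALL=C sort -t "$tab" -k1,1 "$tmpdir/file_2.txt" |
    LC_ALL=C join -t "$tab" "$tmpdir/file_1.txt" -)

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Note that the output follows the sorted order of the join field, not the original order of file_2.txt.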

score 3 · Accepted Answer · answered Feb 04 '17 at 13:21

One way to do it would be like this:

awk '   # Load file_2.txt into REC, keyed by the first field of each line.
    BEGIN { while ((getline <"file_2.txt") > 0) {REC[$1]=$0}}
    # For each word in file_1.txt, print the stored file_2.txt line.
    {print REC[$1]}' <file_1.txt

The getline loop in the BEGIN block reads file_2.txt and stores each line in the array REC, indexed by the first field of that line.

The "main" section of the code then reads file_1.txt and uses the first word on each line to look up the corresponding line from file_2.txt, now stored in REC.
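
One caveat with the lookup above: a word in the first file that has no entry in the second file prints an empty line. A minimal sketch of a guarded variant, assuming the tab-separated sample data from the question; the extra word `missing`, the scratch directory, and the `lookup`, `line`, `f` and `result` names are hypothetical and only there for illustration:

```shell
# Recreate the sample files in a scratch directory, plus one word
# ("missing") that has no counterpart in the second file.
tmpdir=$(mktemp -d)
printf '%s\n' apple great see tree missing > "$tmpdir/file_1.txt"
printf '%s\t%s\t%s\n' \
    apple 5.21 Noun around 6.21 Adverb great 2 Adjective \
    bee 1 Noun see 7.43 Verb tree 3 Noun > "$tmpdir/file_2.txt"

# Load the second file into REC keyed by its first tab-separated field,
# then print a stored line only when the word from the first file is a key.
result=$(awk -F '\t' -v lookup="$tmpdir/file_2.txt" '
    BEGIN {
        while ((getline line < lookup) > 0) {
            split(line, f, FS)
            REC[f[1]] = line
        }
        close(lookup)
    }
    ($1 in REC) { print REC[$1] }
' "$tmpdir/file_1.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Passing the file name with `-v` instead of hard-coding it inside the program keeps the awk script reusable, and the `in` test does not create an array entry as a side effect.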

Example output:

apple    5.21      Noun
great    2         Adjective
see      7.43      Verb
tree     3         Noun

that works, thank you so much! it is great and you saved me a lot of hours! thank you! — dani_anyman, Feb 04 '17 at 13:34
