Is there a way to compare N files at once, and only leave lines unique to each file?
Background
I have five files that I am trying to make unique relative to each other. In other words, I want to end up with five files whose lines of text have nothing in common: no line should appear in more than one file.
Attempted solution
So far, I have been able to run the grep -vf command to compare one file against the other four, like so:
grep -vf file2.txt file1.txt
grep -vf file3.txt file1.txt
…
This prints the lines in file1 that are not in file2, nor in file3, and so on. However, this becomes cumbersome, because I would need to do it for every file against every other file. In other words, to truly reduce each file to the lines of text found only in that file, I would have to run the grep -vf command for every combination of files (roughly the loop sketched below). Given that this sounds cumbersome to me, I wanted to know…
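For concreteness, the brute-force version of this would look something like the loop below. This is only a sketch, assuming bash (for the process substitution) and that the five files are named file1.txt through file5.txt; the .unique.txt output names are an arbitrary choice.

for f in file1.txt file2.txt file3.txt file4.txt file5.txt; do
    # Collect the names of the other four files.
    others=$(ls file[1-5].txt | grep -vx "$f")
    # Keep only the lines of $f that do not appear, as whole fixed-string lines, in any of the others.
    grep -vxFf <(cat $others) "$f" > "${f%.txt}.unique.txt"
done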
Question
What command, or series of commands, in Linux will find the lines of text in each file that are mutually exclusive to all the other files?
Answer
You could just do:
awk '!a[$0]++ { out=sprintf("%s.out", FILENAME); print > out}' file*
This will write the unique lines to an output file named after each input file with a .out suffix (for example, file1.txt produces file1.txt.out). Each line will be written to the output file of the input file in which it first appears, and subsequent duplicates of that same line will be suppressed.
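If you need shared lines to be dropped from every output file rather than kept in the file where they first appear, a two-pass variant along the following lines should work. This is a sketch rather than part of the original answer: it assumes the inputs match file*.txt and relies on awk's standard var=value assignments between file operands to tell the two passes apart.

awk '
    pass == 1 { if (!seen[FILENAME, $0]++) count[$0]++; next }   # pass 1: count how many files contain each line
    count[$0] == 1 {                                             # pass 2: keep only lines found in exactly one file
        out = FILENAME ".out"
        print > out
    }
' pass=1 file*.txt pass=2 file*.txt

Unlike the one-liner above, this removes a shared line from every output file; a line repeated within a single file is still written once per occurrence.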