Wednesday, April 18, 2012

Remove duplicate lines in a text file with uniq


After sorting a file you will often find some duplicate data, or you may be given various lists that need de-duplicating. sort and uniq will quickly and easily remove duplicates, list only the duplicates, or list only the unique data:

sort myfile.txt | uniq
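
For example, given a hypothetical file fruits.txt with a few repeated lines, sorting it and piping it through uniq collapses the duplicates:

$ cat fruits.txt
apple
banana
apple
cherry
banana

$ sort fruits.txt | uniq
apple
banana
cherry

Note that uniq only removes adjacent duplicate lines, which is why the file is sorted first.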

List only the unique lines: sort myfile.txt | uniq -u
List only the duplicate lines: sort myfile.txt | uniq -d
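
Using the same hypothetical fruits.txt, -u returns only the line that appears once, while -d returns the lines that appear more than once:

$ sort fruits.txt | uniq -u
cherry

$ sort fruits.txt | uniq -d
apple
banana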

Get a count of how many times each line occurs by adding the -c option.
sort myfile.txt | uniq -uc
sort myfile.txt | uniq -dc
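
With the hypothetical fruits.txt again, -c prefixes each line with its occurrence count (the exact spacing of the count column may vary):

$ sort fruits.txt | uniq -dc
      2 apple
      2 banana

$ sort fruits.txt | uniq -uc
      1 cherry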
Skip fields: uniq -f 3 mylogfile skips the first 3 whitespace-separated fields when comparing lines. This could be useful with log files, to ignore the time stamp data (see the example below).
Skip characters: uniq -s 30 myfile.txt skips the first 30 characters of each line when comparing.
Compare characters: uniq -w 30 myfile.txt compares no more than the first 30 characters of each line.
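
As a sketch of the -f option, assume a hypothetical mylogfile with syslog-style lines, where the first three fields are the time stamp. Skipping those fields makes the two entries compare as duplicates, so only the first is printed:

$ cat mylogfile
Apr 18 10:01:02 web01 sshd[1234]: Failed password for root
Apr 18 10:05:09 web01 sshd[1234]: Failed password for root

$ uniq -f 3 mylogfile
Apr 18 10:01:02 web01 sshd[1234]: Failed password for root

The -s and -w options work the same way, but they count characters rather than whitespace-separated fields.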
