After sorting a file you will often find some duplicate data, or you may be given various lists that need de-duplicating. sort and uniq will quickly and easily remove duplicates, list only the duplicates, or list only the unique data. uniq only compares adjacent lines, which is why the input is sorted first. Remove duplicates:
sort myfile.txt | uniq
List only the unique lines (those that occur exactly once): sort myfile.txt | uniq -u
List only the duplicate lines (one copy of each line that occurs more than once): sort myfile.txt | uniq -d
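To make the difference between the three forms concrete, here is a small sketch. The fruit list and the temporary file are made up for illustration; any text file would do.

```shell
# Hypothetical sample data, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' pear apple pear fig apple pear > "$tmp"

all=$(sort "$tmp" | uniq)        # one copy of every distinct line
unique=$(sort "$tmp" | uniq -u)  # only lines that occur exactly once
dupes=$(sort "$tmp" | uniq -d)   # one copy of each line that is repeated

printf 'all:\n%s\n\n' "$all"
printf 'unique only:\n%s\n\n' "$unique"
printf 'duplicates only:\n%s\n' "$dupes"

rm -f "$tmp"
```

With this input, the plain form lists apple, fig and pear, -u lists only fig, and -d lists apple and pear.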
Prefix each line with the number of times it occurs by adding the -c option:
sort myfile.txt | uniq -uc
sort myfile.txt | uniq -dc
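A short sketch of what the counts look like, again using a made-up fruit list. The amount of padding uniq puts in front of each count varies between implementations, so awk is used here only to normalise it. The final sort -rn step is an extra that is not part of the commands above; it is a common way to rank lines by frequency.

```shell
# Hypothetical sample data, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' pear apple pear fig apple pear > "$tmp"

# Count every distinct line; awk strips the leading padding from the count.
counts=$(sort "$tmp" | uniq -c | awk '{print $1, $2}')

# Count only the lines that occur more than once.
dupe_counts=$(sort "$tmp" | uniq -dc | awk '{print $1, $2}')

# Extra step (an assumption, not from the text): most frequent lines first.
ranked=$(sort "$tmp" | uniq -c | sort -rn | awk '{print $1, $2}')

printf 'counts:\n%s\n\n' "$counts"
printf 'duplicate counts:\n%s\n\n' "$dupe_counts"
printf 'ranked:\n%s\n' "$ranked"

rm -f "$tmp"
```

Note that with -uc every count is 1 by definition, so -c is mostly informative on its own or together with -d.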
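Skip fields: uniq -f 3 mylogfile. This ignores the first 3 whitespace-separated fields of each line when comparing, which can be useful with log files to skip the time stamp data.
Skip characters: uniq -s 30 myfile.txt. This ignores the first 30 characters of each line when comparing.
Compare characters: uniq -w 30 myfile.txt. This compares no more than the first 30 characters of each line. The -w option is not part of POSIX, but GNU uniq supports it.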
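The following sketch shows how these options behave on a made-up log file in which the first two fields (20 characters including the trailing space) are a date and a time. The log format and the numbers 2, 20 and 10 are assumptions chosen to fit this sample, not values from the commands above. The file is deliberately not sorted, to show that uniq only collapses adjacent lines.

```shell
# Hypothetical log file: date, time, level, message.
log=$(mktemp)
cat > "$log" <<'EOF'
2024-01-01 10:00:01 ERROR disk full
2024-01-01 10:00:02 ERROR disk full
2024-01-01 10:00:03 INFO backup done
2024-01-01 10:00:04 ERROR disk full
EOF

# Ignore the first 2 fields (date and time) when comparing.
by_field=$(uniq -f 2 "$log")

# Ignore the first 20 characters ("2024-01-01 10:00:01 ") when comparing.
by_chars=$(uniq -s 20 "$log")

# Compare only the first 10 characters (the date); GNU uniq only.
by_width=$(uniq -w 10 "$log")

printf 'skip 2 fields:\n%s\n\n' "$by_field"
printf 'skip 20 chars:\n%s\n\n' "$by_chars"
printf 'compare 10 chars:\n%s\n' "$by_width"

rm -f "$log"
```

Here -f 2 and -s 20 both merge the first two lines but keep the last ERROR line, because it is not adjacent to the others. With -w 10 every line looks identical, so only the first one is printed.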