Grep All Email Addresses From File

Last updated on Dec 30, 2019 in Linux

Extracting all email addresses from a text file can be done with the grep tool in Linux.

grep File With Regexp

To list all the email addresses in our file index.html, we run grep with a regexp that matches email addresses:

$ grep -oE "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" index.html
  • -o tells grep to print only the matched part, not the whole line
  • -E tells grep to interpret the pattern as an extended regular expression
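To see what the two options do together, here is a minimal sketch that runs the pattern (written with `+@` between the two character classes) against a small throwaway file. The file contents and the addresses in it are made up for illustration, and `\b` is a GNU grep extension, so other grep implementations may behave differently.

```shell
# Create a small sample file (hypothetical content, for illustration only)
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
<p>Contact: alice@example.com or bob.smith@mail.example.org</p>
<a href="mailto:alice@example.com">Alice</a>
EOF

# -o prints only the matched text, one match per line
# -E enables extended regexp syntax (needed for the unescaped + quantifier)
emails=$(grep -oE "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" "$tmpfile")
printf '%s\n' "$emails"

# Clean up the temporary file
rm -f "$tmpfile"
```

Each match is printed on its own line, so a line containing two addresses yields two output lines, and an address that appears twice in the file is printed twice.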

The regexp pattern above is not perfect and will not match every valid email address. For example, it does not allow characters such as + or _ in the part before the @. If you need a pattern that covers more address formats, look for a more sophisticated one that fits your needs. This simple pattern will do the job in most cases.
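One way to see the limits of the simple pattern is to feed it an address whose local part contains `_` and `+`. The sketch below compares it with a slightly broader pattern. That second pattern is only an assumption about what "more sophisticated" might look like, not a complete email matcher, and the sample address is made up.

```shell
# A made-up input line with an address the simple pattern cannot fully match
line='Write to john_doe+tag@example.com'

# Simple pattern from the article: no _ or + allowed before the @
simple=$(printf '%s\n' "$line" \
  | grep -oE "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b")

# Broader pattern (illustrative assumption): also allows _ % + in the local
# part and requires a top-level domain of at least two letters
broader=$(printf '%s\n' "$line" \
  | grep -oE "\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\b")

echo "simple:  $simple"
echo "broader: $broader"
```

With GNU grep, the simple pattern returns only the tail of the address (tag@example.com), while the broader one returns john_doe+tag@example.com in full.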

Remove Duplicate Email Addresses

We may have an output with duplicate email addresses. Let’s dedup the output with sort and uniq:

$ grep -oE "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" index.html | sort | uniq

uniq removes only adjacent duplicate lines, so the output is sorted first to bring identical addresses together. Running sort -u gives the same result in a single command.
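Note that uniq compares only neighbouring lines, so duplicates that are not adjacent survive unless the input is sorted first. The following sketch, using a made-up list of addresses, shows the difference:

```shell
# Three made-up addresses; the duplicate is NOT adjacent to its first copy
addresses=$(printf '%s\n' alice@example.com bob@example.com alice@example.com)

# uniq alone only collapses neighbouring duplicates, so nothing is removed here
only_uniq=$(printf '%s\n' "$addresses" | uniq)

# Sorting first puts identical lines next to each other, so uniq can drop them
sorted_uniq=$(printf '%s\n' "$addresses" | sort | uniq)

# grep -c '' counts the lines in each result
echo "uniq alone:  $(printf '%s\n' "$only_uniq" | grep -c '') lines"
echo "sort | uniq: $(printf '%s\n' "$sorted_uniq" | grep -c '') lines"
```

Here uniq on its own leaves all three lines in place, while sort | uniq reduces them to two.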