Use a sed script to print only valid email entries


3. saman&
6. saman@mail@com
7. saman.desilva@yahoo com

I want to print only the valid email addresses, but I can't figure out how. So far I have this script, but its output is still incorrect.

sed -nr '/\w+@\w+\.\w+$/p' emaillist.txt

The output:


First of all, a regular expression that matches all valid email addresses is notoriously complex. I’m going to assume, given the test data, that you’re aiming for a much simpler concept of email address validity.

One issue with your regex is that it isn't anchored to the beginning of the line, which is signified with ^. This lets invalid entries like the one with an ampersand in the username through, because the pattern simply matches everything after the ampersand. So if we add the ^, we then get the following output:

$ sed -nr '/^\w+@\w+\.\w+$/p' emaillist.txt

Well, that's not right either. The problem now is that \w matches only letters, digits, and underscores. Periods are the other “valid” non-alphanumeric character for usernames in your test data, so we also need to tweak the pattern to allow them, and now we get the correct output:

$ sed -nr '/^(\w|\.)+@\w+\.\w+$/p' emaillist.txt
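
If you want to try this without your real file, here is a minimal self-contained sketch. The sample addresses are made up for illustration and are not the contents of your emaillist.txt. It also swaps in a bracket expression, [[:alnum:]_.], in place of the (\w|\.) alternation: \w is a GNU extension, while [[:alnum:]_] is the POSIX spelling of the same set, and a period inside brackets is literal. I'm using -E rather than -r on the assumption that your sed accepts it (recent GNU and BSD versions do).

```shell
# Hypothetical sample data, written to a temporary file.
tmpfile=$(mktemp)
printf '%s\n' \
  'alice@example.com' \
  'bob.smith@example.org' \
  'carol&dave@example.com' \
  'erin@mail@com' \
  'frank.jones@example com' > "$tmpfile"

# ^ and $ anchor the pattern to the whole line.
# Username: letters, digits, underscores, and periods.
# Domain: one word, a literal period, then one more word.
valid=$(sed -nE '/^[[:alnum:]_.]+@[[:alnum:]_]+\.[[:alnum:]_]+$/p' "$tmpfile")
printf '%s\n' "$valid"

rm -f "$tmpfile"
```

This prints only alice@example.com and bob.smith@example.org. Keep in mind that it is just as simplistic as the pattern above: a domain with more than one period, or with a hyphen, would be rejected.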