I have a large number of email addresses to validate. Initially I parse them with a regexp to throw out the completely crazy ones. I’m left with the ones that look sensible but still might contain errors.
I want to find which addresses have valid domains, so given an address at abcxyz.com I want to know whether it is even possible to send email to abcxyz.com.
I want to test that to see if it corresponds to a valid A or MX record – is there an easy way to do it using only Python standard library? I’d rather not add an additional dependency to my project just to support this feature.
There is no DNS interface in the standard library, so you will either have to roll your own or use a third-party library.
This is not a fast-changing concept though, so the external libraries are stable and well tested.
The one I’ve used successfully for the same task as in your question is PyDNS.
A very rough sketch of my code is something like this:
    import DNS
    import smtplib

    DNS.DiscoverNameServers()
    # hostname is the domain part of the address;
    # mxlookup returns a list of (preference, mail exchanger) tuples
    mx_hosts = DNS.mxlookup(hostname)

    # Just doing the mxlookup might be enough for you,
    # but do something like this to test for an SMTP server
    for preference, mx in mx_hosts:
        smtp = smtplib.SMTP()
        # ...if this doesn't raise an exception it is a valid MX host...
        try:
            smtp.connect(mx)
        except smtplib.SMTPConnectError:
            continue  # try the next MX server in the list
Another library that might be better/faster than PyDNS is dnsmodule, although it looks like it hasn’t had any activity since 2002, compared to PyDNS’s last update in August 2008.
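If adding a dependency really is not an option, a partial check is still possible with the standard library alone. The sketch below asks the system resolver for an address via socket.getaddrinfo. It is only an approximation of an A-record test: it cannot see MX records, and the system resolver may also consult local sources such as a hosts file. The function name domain_resolves and the injectable resolver parameter are my own, added so the logic can be exercised offline.

```python
import socket


def domain_resolves(domain, resolver=socket.getaddrinfo):
    """Return True if the system resolver finds an address for `domain`.

    This only covers address (A/AAAA-style) lookups; it cannot see MX
    records. `resolver` is a hypothetical hook for offline testing.
    """
    try:
        # Port None: we only care whether the name resolves at all.
        return len(resolver(domain, None)) > 0
    except (socket.gaierror, UnicodeError):
        # gaierror: the name did not resolve.
        # UnicodeError: the name could not be IDNA-encoded
        # (for example an empty or overlong label).
        return False
```

A domain that passes this check has some address on record, which is weaker than knowing it accepts mail, so treat a True result as "not obviously dead" rather than "deliverable".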
Edit: I would also like to point out that email addresses can’t be easily parsed with a regexp. You are better off using the parseaddr() function in the standard library email.utils module (see my answer to this question for example).
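As a minimal sketch of the parseaddr() suggestion, the helper below pulls the domain out of an address so it can be handed to whatever DNS check you settle on. The helper name extract_domain and the sample addresses in the usage note are made up for illustration. Note that parseaddr() is lenient, so this extracts a domain rather than proving the address is well formed.

```python
from email.utils import parseaddr


def extract_domain(raw):
    """Return the lower-cased domain of an address, or None if there is none."""
    # parseaddr returns a (display_name, address) pair; when it cannot
    # parse the input it returns a pair of empty strings.
    _name, address = parseaddr(raw)
    # Split on the last "@" so only the final part is treated as the domain.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return None
    return domain.lower()
```

For example, extract_domain("Jane Doe <jane@abcxyz.com>") gives "abcxyz.com", while input with no "@" at all gives None.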
The easy way to do this NOT in the standard library is to use the validate_email package:
    from validate_email import validate_email
    is_valid = validate_email('[email protected]', check_mx=True)
To process a large number of email addresses faster (e.g. a list emails), you could cache the domains you have already verified and only do a check_mx if the domain isn’t in the cache. Something like:
    emails = ["[email protected]", "[email protected]_domain", "[email protected]", ...]
    verified_domains = set()
    for email in emails:
        domain = email.split("@")[-1]
        domain_verified = domain in verified_domains
        is_valid = validate_email(email, check_mx=not domain_verified)
        if is_valid:
            verified_domains.add(domain)
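One limitation of the snippet above is that it only remembers domains that passed, so a bad domain shared by many addresses is looked up again every time. A possible variant, sketched here under my own assumptions, caches both outcomes. The names validate_by_domain and check_domain are hypothetical: check_domain stands for any callable that takes a domain string and returns a bool, such as a wrapper around an MX lookup. This version checks the domain only and leaves syntax validation to an earlier step.

```python
def validate_by_domain(emails, check_domain):
    """Map each address to True/False, calling check_domain once per domain.

    `check_domain` is a hypothetical callable: domain string in, bool out.
    """
    domain_ok = {}  # domain -> bool; remembers failures as well as successes
    results = {}
    for email in emails:
        # Take everything after the last "@" and normalise the case,
        # since domain names are case-insensitive.
        domain = email.rsplit("@", 1)[-1].lower()
        if domain not in domain_ok:
            domain_ok[domain] = check_domain(domain)
        results[email] = domain_ok[domain]
    return results
```

With a list dominated by a handful of large providers, this reduces the number of lookups to the number of distinct domains, whether or not they turn out to be valid.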
An easy and effective way is to use the Python package validate_email.
This package provides both facilities. Check this article, which will help you check whether an email address actually exists.