Getting TypeError: expected string or buffer in Python


I have some simple code:

#!/usr/bin/python

from bs4 import BeautifulSoup
import requests
import tldextract

def scrap(url):
    main_domain = tldextract.extract(url)
    r = requests.get(url)
    data = r.text
    soup = BeautifulSoup(data)
    list = []
    for href in soup.find_all('a'):
        link_domain = tldextract.extract(href.get('href'))
        print link_domain
        print

I get this error:

Traceback (most recent call last):
  File "cloud.py", line 20, in <module>
    scrap("--- url here -- ")
  File "cloud.py", line 14, in scrap
    link_domain = tldextract.extract(href.get('href'))
  File "/usr/lib/python2.6/site-packages/tldextract/tldextract.py", line 196, in extract
    return TLD_EXTRACTOR(url)
  File "/usr/lib/python2.6/site-packages/tldextract/tldextract.py", line 127, in __call__
    netloc = SCHEME_RE.sub("", url) \
TypeError: expected string or buffer

How can I fix it?

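Some of the a tags do not have an href attribute, so .get('href') returns None for them. tldextract then passes that None to a regular expression substitution, which raises the TypeError.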
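The failure can be reproduced without any scraping. This is only a minimal sketch, assuming Python 3 (where the message reads "expected string or bytes-like object" rather than "expected string or buffer"). The `scheme_re` pattern and the `attrs` dict are hypothetical stand-ins, not tldextract's real regex or a real bs4 tag.

```python
import re

# Hypothetical stand-in for a scheme-stripping regex; not tldextract's real pattern.
scheme_re = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")

# A plain dict standing in for the attributes of <a name="top">.
attrs = {"name": "top"}
value = attrs.get("href")  # the key is missing, so this is None

try:
    scheme_re.sub("", value)  # a regex cannot operate on None
    raised = False
except TypeError:
    raised = True

print(value is None, raised)  # True True
```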

Use:

link_domain = tldextract.extract(href.get('href', '')) 

to return an empty string in that case, or test for the attribute first:

href = href.get('href')
if not href:
    continue

link_domain = tldextract.extract(href)
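For anyone who wants to try the "test first" approach in isolation, here is a rough sketch that runs without network access or third-party packages. It assumes Python 3 and swaps in the standard library's `html.parser` for BeautifulSoup and `urllib.parse.urlsplit` for tldextract, so it returns the full host rather than a parsed domain. `LinkCollector`, `link_hosts` and `SAMPLE` are hypothetical names made up for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, or None when the attribute is missing."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # dict.get returns None for a missing key, much like tag.get('href') in bs4
            self.hrefs.append(dict(attrs).get("href"))


def link_hosts(html):
    """Return the host of each link, skipping anchors with no usable href."""
    parser = LinkCollector()
    parser.feed(html)
    hosts = []
    for href in parser.hrefs:
        if not href:  # None (no attribute) or an empty string
            continue
        hosts.append(urlsplit(href).netloc)
    return hosts


# Made-up sample markup: the middle anchor has no href attribute.
SAMPLE = (
    '<a href="http://example.com/page">with href</a>'
    '<a name="top">no href</a>'
    '<a href="https://docs.python.org/3/">another</a>'
)

print(link_hosts(SAMPLE))  # ['example.com', 'docs.python.org']
```

The same guard carries over unchanged to the original loop: check the value returned by .get('href') before handing it to tldextract.extract.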
