Spell checking - Python SpellCheck: 'str' object has no attribute 'read'


I am trying to get a spell checker that I came across online to work, but with no luck. Any help is appreciated. The original code is from http://norvig.com/spell-correct.html

import re, collections, codecs

def words(text): return re.findall('[a-z]+', text.lower())

def train(features):
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

file = codecs.open('c:\88888\88888\88888\88888\8888\a word.txt', encoding='utf-8', mode='r')

nwords = train(words(file.read()))

alphabet = 'abcdefghijklmnopqrstuvwxyz'

def edits1(word):
    splits     = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes    = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b)>1]
    replaces   = [a + c + b[1:] for a, b in splits for c in alphabet if b]
    inserts    = [a + c + b     for a, b in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)

def known_edits2(word):
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in nwords)

def known(words): return set(w for w in words if w in nwords)

def correct(word):
    candidates = known([word]) or known(edits1(word)) or known_edits2(word) or [word]
    return max(candidates, key=nwords.get)

Error:

  File "c:\8888\8888\8888\8888\88888\spellcheck.py", line 11
    file = codecs.open('c:\888\888\888\8888\88888\a word.txt', encoding='utf-8', mode='r')
                       ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

OK, let's try this. Take a string value containing '\x' and try:

string('\x.....') 

That returns an error, right?

So if you have a string defined as, say,

x = string('\y\o\u \c\a\n \n\e\v\e\r \c\h\a\n\g\e \t\h\i\s \i\n \p\y\t\h\o\n') 

then you are out of luck. That is a bummer if the user decides to type a '\' character as part of the input.
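For what it's worth, escape sequences such as '\x' or '\U' are only interpreted in string literals written in source code, so a backslash that arrives at run time (typed by a user, read from a file) is not re-parsed. A minimal sketch of the difference; the sample strings are made up for illustration, and chr(92) merely stands in for a backslash typed by a user:

```python
# A raw literal (r'...') keeps every backslash exactly as written.
raw = r'\you \can \never \change \this'

# In a normal literal, each doubled backslash produces one backslash.
doubled = '\\you \\can \\never \\change \\this'

# Both spellings yield the same string.
same = (raw == doubled)

# chr(92) is the backslash character; this simulates run-time input.
# No escape processing happens here, so the result is just two characters.
typed = chr(92) + 'x'

print(same)
print(len(typed))
```

So the literal in the source file is what needs fixing, not the data coming in.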

To fix the problem, try using looping or recursive code like the approach in: How to remove illegal characters from path and filenames?
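For the path in the question specifically, a simpler route may be to change how the literal is written: a raw string, doubled backslashes, or forward slashes. Here is a rough sketch, not the asker's actual setup. The directory names and the helper read_words_file are hypothetical, a temporary file stands in for the real "a word.txt", and the built-in open is swapped in for codecs.open (it accepts encoding directly in Python 3):

```python
import os
import tempfile

# Three ways to write a Windows path literal without triggering escapes.
# The directory names are invented for this example.
raw_path = r'c:\Users\demo\a word.txt'
doubled_path = 'c:\\Users\\demo\\a word.txt'
slash_path = 'c:/Users/demo/a word.txt'

def read_words_file(path):
    """Open a UTF-8 text file and return its contents as a str."""
    with open(path, encoding='utf-8', mode='r') as handle:
        return handle.read()

# Create a stand-in file in a temporary directory so the sketch is runnable.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'a word.txt')
with open(demo_path, encoding='utf-8', mode='w') as handle:
    handle.write('spelling speling')

text = read_words_file(demo_path)
print(text)
```

Note that .read() is called on the opened file object, not on the path string; calling it on the string itself is the usual cause of "'str' object has no attribute 'read'".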

