php - Strip only valid HTML


I'm trying to strip HTML tags from a piece of text. The trouble is that whatever I use (regex, strip_tags, etc.) runs into the same problem: it also strips text that is not HTML but looks like it.

some <foo@bar.com> content--> content <content looks -->  

Is there a way I can get around this?

The correct solution is a full-fledged HTML parser. See this legendary question for a full discussion.

A simple 80% solution is to look for known tags and strip only those.

regexp('</?(a|b|blockquote|cite|dd|dl|dt|...|u)\b.*?>') 

The code would be more readable if you used an array of tags and built the expression by looping through them. This will not handle comments nicely, but if you need more than hack quality, don't use a hack approach. If you need correctness, use an actual HTML parser (e.g. DOMDocument in PHP).
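As a rough sketch of the array-of-tags idea (not a definitive implementation), something like the following could work. The function name `stripKnownTags` and the short tag list are made up for illustration. The pattern uses `[^>]*` where the expression above uses `.*?`, so that a match can never run past the first `>`, including across newlines. It assumes PHP 7.4+ for the arrow function.

```php
<?php
// Hypothetical helper: removes only tags whose names appear in $tags,
// leaving any other "<...>" text (e.g. <foo@bar.com>) untouched.
function stripKnownTags(string $text, array $tags): string
{
    // With no tag names the alternation would be empty and match too much.
    if ($tags === []) {
        return $text;
    }

    // Escape each name so it is treated literally inside the pattern.
    $names = array_map(fn(string $t): string => preg_quote($t, '#'), $tags);

    // </?    optional closing slash
    // \b     stops "b" from matching the start of "<blockquote"
    // [^>]*> consumes any attributes up to the closing bracket
    $pattern = '#</?(?:' . implode('|', $names) . ')\b[^>]*>#i';

    // preg_replace returns null on error; fall back to the input then.
    return preg_replace($pattern, '', $text) ?? $text;
}

// Illustrative subset only; extend with the tags you actually expect.
$tags = ['a', 'b', 'blockquote', 'cite', 'u'];

echo stripKnownTags('<b>bold</b> and <foo@bar.com>', $tags), "\n";
```

This is still the 80% approach: text such as `<b@example.com>` would be stripped because it starts like a known tag, and comments are not handled at all.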

