DEVFYI - Developer Resource - FYI

Perl Questions and Answers


Assuming $_ contains HTML, which of the following substitutions will remove all tags in it?
1. s/<.*>//g;
2. s/<.*?>//gs;
3. s/<\/?[A-Z]\w*(?:\s+[A-Z]\w*(?:\s*=\s*(?:(["']).*?\1|[\w-.]+))?)*\s*>//gsix;

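You can't do that reliably with any of them.
The first is greedy and, without /s, works line by line: it deletes everything from the first < to the last > on each line, text included. The second is non-greedy, but it stops at the first > it sees, so a > inside a quoted attribute value or a comment cuts the match short. The third copes with quoted attribute values, yet it still trips over the same things as the others. If it weren't for HTML comments, improperly formatted HTML, and tags with interesting data like <SCRIPT>, you could do this. Alas, you cannot. It takes a lot more smarts, and quite frankly, a real parser.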
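
To make the failure modes concrete, here is a minimal sketch that runs the three substitutions from the question over a small sample. The sample HTML and the helper name strip_tags are made up for illustration; only the three patterns come from the question. The tag-free result one would hope for is "one and two", an empty line, and "kept".

```perl
use strict;
use warnings;

# Hypothetical sample input; each line targets one weakness.
my $html = <<'HTML';
<p>one</p> and <b>two</b>
<img alt="a > b" src="x.png">
<!-- if (a > b) -->kept
HTML

# Apply substitution number $which from the question to a copy of $text.
# strip_tags is an illustrative helper name, not part of any library.
sub strip_tags {
    my ($which, $text) = @_;
    # Pattern 3 contains [\w-.], which may trigger a "False [] range"
    # warning; silence that category so the output stays readable.
    no warnings 'regexp';
    if ($which == 1) {
        $text =~ s/<.*>//g;
    }
    elsif ($which == 2) {
        $text =~ s/<.*?>//gs;
    }
    else {
        $text =~ s/<\/?[A-Z]\w*(?:\s+[A-Z]\w*(?:\s*=\s*(?:(["']).*?\1|[\w-.]+))?)*\s*>//gsix;
    }
    return $text;
}

for my $n (1 .. 3) {
    print "--- substitution $n ---\n", strip_tags($n, $html);
}

# What to look for in the output:
# 1: the greedy .* runs from the first "<" to the last ">" on each line,
#    so the text "one and two" is deleted along with the tags.
# 2: the match stops at the ">" inside alt="a > b" and inside the comment,
#    leaving fragments of the tag and the comment behind.
# 3: the quoted attribute is handled, but the comment is never matched
#    and stays in the output untouched.
```

None of the three produces the hoped-for result on this input, and a script block or malformed markup would add further cases. In practice the usual route is a parser module from CPAN, such as HTML::Parser, rather than a longer regular expression.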
