This is great, thank you!
It has already been mentioned that self-closing tags match the next available </ when they don’t end with a trailing />. I will continue using this as it is very useful, but some improvements would be appreciated.
The most common tags this applies to are: link, meta, img, input, br, hr.
Less common: frame, area, col, base, basefont, param.
Perhaps the best way to ensure accuracy is to parse the file for the full tag name instead of the first unclosed </ (obviously including exceptions for the above tags). This would also avoid inadvertently matching those pesky overlapping tags like “”
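
To make the suggestion concrete, here is a rough sketch in Python of matching by full tag name with exceptions for the tags listed above. It is not how the tool currently works. The names VOID_TAGS, TAG_RE and find_closing_tag are made up for illustration, and the regex is a simplification that assumes no ">" inside attribute values and ignores comments and script blocks.

```python
import re

# Tags from the lists above that never take a closing tag (illustrative set).
VOID_TAGS = {"link", "meta", "img", "input", "br", "hr",
             "frame", "area", "col", "base", "basefont", "param"}

# Group 1 is "/" for a closing tag, group 2 is the tag name,
# group 3 is "/" for an explicit self-close such as <br />.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*?(/?)>")


def find_closing_tag(html, open_pos):
    """Return the index of the closing tag that matches the opening tag
    starting at open_pos, or None if the tag is void, self-closed,
    or has no matching closer."""
    first = TAG_RE.match(html, open_pos)
    if first is None or first.group(1) == "/":
        return None  # not positioned on an opening tag
    name = first.group(2).lower()
    if name in VOID_TAGS or first.group(3) == "/":
        return None  # nothing to match against
    depth = 1
    for m in TAG_RE.finditer(html, first.end()):
        if m.group(2).lower() != name:
            continue  # a different tag name: ignore it entirely
        if m.group(1) == "/":
            depth -= 1
            if depth == 0:
                return m.start()
        elif m.group(3) != "/":
            depth += 1  # nested tag of the same name
    return None
```

Because only tags with the same name are counted, an img or br between the opening tag and its closer no longer steals the match, and overlapping markup still pairs each opener with a closer of its own name.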