Add the remaining lowercase acute-accented vowel HTML entities to
parse_companies.pl. On an unknown entity, print an error to STDERR so
the maintainer can more clearly understand the failure.
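
A minimal sketch of the idea, not the actual code in parse_companies.pl:
the table %ENTITY and the subs entity_to_char and decode_entities are
hypothetical names chosen for illustration, and leaving an unknown entity
untouched in the output is an assumption.

```perl
use strict;
use warnings;

# Hypothetical entity table: the five lowercase acute-accented vowels.
my %ENTITY = (
    aacute => "\x{E1}",    # a with acute
    eacute => "\x{E9}",    # e with acute
    iacute => "\x{ED}",    # i with acute
    oacute => "\x{F3}",    # o with acute
    uacute => "\x{FA}",    # u with acute
);

# Map one entity name to its character. Unknown names are reported on
# STDERR and returned unchanged (assumed behaviour) so the maintainer
# can see exactly which entity needs to be added to the table.
sub entity_to_char {
    my ($name) = @_;
    return $ENTITY{$name} if exists $ENTITY{$name};
    print STDERR "Unknown HTML entity: &$name;\n";
    return "&$name;";
}

# Replace every named entity of the form &name; in the given text.
sub decode_entities {
    my ($text) = @_;
    $text =~ s/&([A-Za-z]+);/entity_to_char($1)/ge;
    return $text;
}
```

Reporting the offending name instead of dying keeps the rest of the page
parseable while still pointing at the missing table entry.
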
Several company identifier lines end not in </td> but in <br/>, a
newline, and then </td>. This dirty hack makes the parser more forgiving
of HTML quirks in the SIG's company identifiers page.
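
A hedged sketch of what such a forgiving match might look like. The cell
markup and the sub name cell_texts are assumptions for illustration; the
real page layout and the script's actual regex may differ.

```perl
use strict;
use warnings;

# Return the text of each <td> cell, tolerating an optional <br/>
# (plus any whitespace or newlines) just before the closing </td>.
sub cell_texts {
    my ($html) = @_;
    my @cells;
    # /s lets "." cross newlines, /i ignores tag case, /g iterates cells.
    while ($html =~ m{<td[^>]*>\s*(.*?)\s*(?:<br\s*/?>\s*)?</td>}gis) {
        push @cells, $1;
    }
    return @cells;
}
```

Making the <br/> group optional lets one pattern cover both the clean
cells and the ones with the trailing line break.
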
This patch adds tools/parse_companies.pl, a twisted Perl script that
parses the SIG's HTML page in poor taste using regex. Improvements also
include support for non-ASCII entities such as &eacute; as well as full
Unicode support for Chinese names.