Ticket #122 (closed defect: fixed)
UTF-8 support
| Reported by: | anonymous | Owned by: | mike |
|---|---|---|---|
| Priority: | normal | Milestone: | |
| Component: | Administration | Version: | |
| Severity: | normal | Keywords: | admin utf8 rss |
| Cc: | mike@… |
Description
I've realized that the UTF-8 character set isn't really supported in Plogger. I'm talking about version 2.1, but it seems that version 3 is affected too.
Although Plogger's _install.php script creates the MySQL tables with the UTF-8 character set properly, the data is not actually stored as UTF-8, as far as I can tell. I noticed this because the RSS feed generated with special characters (ñ, á, é, and so on) didn't work properly, and I got complaints about the use of invalid characters (ñ and so on). Looking in the database, I can see that the data itself isn't really being stored as UTF-8.
In my copy of Plogger I fixed this problem as follows:
1) The Admin backend XHTML is not served as UTF-8. To fix that I added the line
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
to the generated XHTML and, to make sure, also sent the header:
header("Content-Type: text/html; charset=utf-8");
With this we're sure the generated XHTML is UTF-8, but we can't be sure that the data from the forms is being sent to the MySQL server as UTF-8. See step 2.
2) To take care of that last point I added the following line:
$rs = run_query("SET NAMES 'utf8'");
(I'm not going to explain this command here, don't worry ;), but you can find more info at http://www.herongyang.com/php/non_ascii_mysql_2.html)
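Just to show where a call like this would sit, here is a rough sketch, assuming the mysqli extension. The function name and the connection parameters are made up for the example; Plogger itself goes through run_query(), so this is not how its code is actually organised.

```php
<?php
// Hypothetical helper: host, user, password and database name are placeholders.
function open_utf8_connection($host, $user, $password, $database) {
    // Depending on the PHP version, a failed connection either sets
    // connect_errno or throws an exception.
    $db = new mysqli($host, $user, $password, $database);
    if ($db->connect_errno) {
        return false;
    }
    // Same statement as in step 2, sent before any other query on this connection.
    if (!$db->query("SET NAMES 'utf8'")) {
        $db->close();
        return false;
    }
    return $db;
}
```

The point is only the ordering: the charset is set once, right after connecting, so every later INSERT and SELECT uses it.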
So, OK, we're now storing the UTF-8 data properly in the database, but... why the hell isn't my XML valid yet? Let's look at step 3.
3) In plog-rss.php I replaced the line:
$caption = htmlentities($row['caption']);
with the line:
$caption = xmlentities($row['caption']);
That's because htmlentities() doesn't handle UTF-8 out of the box (see http://www.php.net/htmlentities), and the named HTML entities it produces (&ntilde; and so on) aren't defined in XML. xmlentities() is a solution proposed in the comments on the PHP documentation page for htmlentities(), and is defined as follows:
function xmlentities($string) {
    return str_replace ( array ( '&', '"', "'", '<', '>' ), array ( '&amp;' , '&quot;', '&apos;' , '&lt;' , '&gt;' ), $string );
}
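As a quick illustration of what this does to a caption, here is a small standalone sketch. The caption text is made up, and the snippet assumes the file is saved as UTF-8.

```php
<?php
// Same xmlentities() helper as in this ticket, repeated so the snippet runs on its own.
function xmlentities($string) {
    return str_replace(
        array('&', '"', "'", '<', '>'),
        array('&amp;', '&quot;', '&apos;', '&lt;', '&gt;'),
        $string
    );
}

// Made-up caption; assumes this file is saved as UTF-8.
$caption = 'Niño & "niña" <2007>';

// Only the five XML special characters are escaped; ñ is left alone.
echo xmlentities($caption), "\n";
// Niño &amp; &quot;niña&quot; &lt;2007&gt;
```

Because '&' is the first entry in the search array, it is replaced before the other entities are introduced, so their ampersands are not escaped a second time.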
Although it's not strictly needed, I also added the charset to the header() call:
header("Content-Type: application/xml; charset=utf-8");
And that's it! Real UTF-8 support in Plogger! Note that I didn't include filenames because it looks like you've changed something in the Admin backend in version 3, so I prefer to leave the filenames out of the ticket. I'm sure you know where each of the changes I described fits ;)
Cheers,
Victor
