SVG Sanitization

A couple of days ago I was browsing through the WordPress core Trac looking for something to get involved in and stumbled upon the following issue: https://core.trac.wordpress.org/ticket/24251

The lack of ability to upload SVGs into WordPress has always been a slight annoyance to me so I decided to see what I could do about it. After reading through the thread, I realised that what PHP was really missing, was a decent SVG sanitizer.

I read around a few articles and read through the source at: https://github.com/cure53/DOMPurify to get some ideas of how it’s done by the professionals. I started off by mapping out what was needed:

  • An element whitelist
  • An attribute whitelist
  • A way to remove those not in the whitelist

The two whitelists I borrowed from DOMPurify, after all it’s built by people with more knowledge than me. I then put a basic PHP library together that allowed me to pass in a dirty SVG string and receive out an SVG string with all the non-whitelisted elements and attributes removed.

The strangest thing I came across whilst building this was the fact that DOMNodeList doesn’t work like an array. If you iterate from 0 to 10 and delete node 5, everything drops down 1 place, so when you move to node 6, that’s actually node 7, odd. This can be easily fixed of course by iterating backwards from 10 to 0.

Anyway, I have a basic POC up on Github (https://github.com/darylldoyle/svg-sanitizer) that I’m hopefully going to push a bit further in the next week or so. Lets hope all goes well!

 

WordPress InnoDB Issues

I’ve been working with a lot of WordPress plugins recently and have come across a bug that had me stumped for a while, therefore, I thought it’d be worth sharing it here.

After installing and activating a plugin that created new database tables, I realised that the tables had not been installed. I contacted the plugin support who suggested re-installing WordPress. Reluctantly I did this but it didn’t help at all, I therefore decided I’d dig into the issue and see if I could track it down.

A bit of work and a few var_dump()‘s later I found that there as an error with the database creation SQL (below)

CREATE TABLE IF NOT EXISTS wp_domain_mapping (
    id BIGINT NOT NULL AUTO_INCREMENT,
    blog_id BIGINT NOT NULL,
    domain VARCHAR(255) NOT NULL,
    active TINYINT DEFAULT 1,
    PRIMARY KEY (id),
    KEY blog_id (blog_id, domain, active)
) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci

I ran the query manually and realised it was returning the error Specified key was too long; max key length is 1000 bytes. I’d never seen this error before, so I dug into it a little more, this page informed me that InnoDB has a maximum key size of 1000 bytes, unless the innodb_large_prefix flag is set in MySQL in which case the maximum size is 3072 bytes.

That said, I found the following document where I could check the storage requirements of each field type. From this, I can tell that the BIGINT field, blog_id is 8 bytes, the TINYINT field, active is 1 byte and as UTF8 needs 3 bytes per character, the domain field is 255 * 3 bytes or 765 bytes. This gives us a total key length of 774 bytes.

It was then that I realised that I wasn’t using standard UTF8 but utf8mb4, which WordPress has set as it’s default character set as of 4.2. utf8mb4 requires 4 bytes per character and therefore the domain field calculation was wrong and was actually 255 * 4 bytes or 1,020 bytes in length. This is now an issue as our key is bigger than the size allowed by InnoDB hence the issue creating the tables.

Now as far as I can see, there is only one real way of getting around this and it’s the method WordPress has chosen, reducing the field length from 255 chars to something less, 247 in this case.

Now I know about this it shouldn’t be much of an issue, but I wonder how many plugin developers aren’t going to realise this has changed and are going to run into this. I foresee it being a major pain for a few months.

Anyway, onwards and upwards now!

Cheers for now!