Introducing Dumbquotes

This is slightly reinventing the wheel, but I have released a new package called Dumbquotes. The idea is to replace simple typographic techniques with their more correct forms, such as replacing a ' with ‘ or ’. It also gave me the chance to try writing a package, which meant making sure it is PSR-0 compliant and has associated unit tests that run with PHPUnit.

The package deals with apostrophes, quotes, dashes, and ellipses. There are certain issues. Ultimately it is designed for plain text such as a Markdown document; it does not work with HTML. Trying to parse HTML with regex will bring about the return of Cthulhu. However, once you deal with HTML directly, things get a little complicated.
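As a rough illustration of the kind of plain-text replacements described above, here is a minimal sketch. It is written in Python rather than the package's PHP, and it is not the Dumbquotes implementation: the function name `smarten` and the exact rules (for example, which dash each run of hyphens maps to) are my own assumptions.

```python
import re

def smarten(text: str) -> str:
    """Hypothetical helper: swap plain-text typography for typographic forms."""
    # Three dots become a single ellipsis character.
    text = text.replace("...", "\u2026")
    # Assumed rule: "---" is an em dash and "--" an en dash.
    # The longer run is replaced first so it is not consumed as "--" plus "-".
    text = text.replace("---", "\u2014").replace("--", "\u2013")
    # A pair of straight double quotes becomes opening and closing curly quotes.
    text = re.sub(r'"(.*?)"', "\u201c\\1\u201d", text)
    # A straight apostrophe between two word characters becomes a right single quote.
    text = re.sub(r"(?<=\w)'(?=\w)", "\u2019", text)
    return text

print(smarten('She said "it\'s done"...'))
```

Rules this simple only hold up on plain text, which is why the input format matters so much.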

Consider the following sentence that could appear in some HTML: `<p>Mary said "How <em>did</em> she do that?"</p>`. We want to turn this into `<p>Mary said “How <em>did</em> she do that?”</p>`. This is complicated by the fact that we can’t just search for a string of text containing two double quotes with a pattern like `/"(.*?)"/`. The sentence doesn’t appear as a single string in the HTML DOM. We actually have three blocks of text:

  • Mary said "How
  • did
  • she do that?"
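The split can be seen with a small sketch. It uses Python's standard `html.parser` as a stand-in for walking the DOM (it is an event parser, not a real DOM), and the class name `TextCollector` is made up for illustration. Each run of text arrives separately, and no single run contains both quote marks, so the quote-pair pattern matches none of them.

```python
import re
from html.parser import HTMLParser

class TextCollector(HTMLParser):
    """Hypothetical helper: collects each run of text found between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called once for each stretch of text between two tags.
        self.chunks.append(data)

collector = TextCollector()
collector.feed('<p>Mary said "How <em>did</em> she do that?"</p>')
collector.close()

# The same non-greedy quote-pair pattern as in the text above.
QUOTED = re.compile(r'"(.*?)"')
matches = [bool(QUOTED.search(chunk)) for chunk in collector.chunks]

print(collector.chunks)
print(matches)
```

The opening quote sits in the first chunk and the closing quote in the third, so any fix would have to track quote state across chunks.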

Concatenating those blocks into a single string and then putting the tags back in the right place seems a very difficult task. So I have decided to write the Dumbquotes parser so that it is applied before the Markdown transform.
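To see why running before the Markdown transform sidesteps the problem, consider the same sentence in its Markdown form. This is only a sketch in Python, not the package's code: the emphasis markers are ordinary characters in one string, so a single quote-pair substitution is enough.

```python
import re

# The sentence as it would appear in the Markdown source, before any HTML exists.
source = 'Mary said "How *did* she do that?"'

# Both quote marks are in the same string, so one substitution handles the pair.
curled = re.sub(r'"(.*?)"', "\u201c\\1\u201d", source)

print(curled)
```

The Markdown transform can then turn `*did*` into `<em>did</em>` without the quotes needing any further attention.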

*[HTML]: HyperText Markup Language
*[DOM]: Document Object Model