Autofilled PHP Forms
Implementing fillInFormValues
fillInFormValues uses preg_replace_callback to find and replace the bits of HTML that need modifying. For example, here's the regular expression used to find <label> tags that might need class="error" added to them:
/<label([^>]*)>/i
Reading from left to right, the first / just starts the regular expression. <label matches exactly that. The ([^>]*) in parentheses matches zero or more characters other than >, which captures the tag's attributes. The > after the parentheses matches that character, and the /i ends the regular expression--the i makes it case-insensitive so it finds <LABEL ...> or <label ...>.
This is the quick-and-dirty way to parse HTML. For example, it doesn't take into account tags that are surrounded by HTML comments, although in this case it's harmless if commented-out label tags change. It can also get confused if you pass it HTML 4 code with bizarre attribute values like <input value="<label for='foo'>">. Don't do that. If you need a real HTML parser, use XML_HTMLSax.
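To make the behavior of that pattern concrete, here is a small sketch that runs it against an ordinary label and against the pathological input mentioned above. The sample strings and the variable names $plain and $fooled are made up for illustration; only the pattern comes from the article.

```php
<?php
// The label pattern from the article.
$pattern = '/<label([^>]*)>/i';

// An ordinary label: index 0 holds the whole tag, index 1 the attributes
// captured by the parentheses. The /i flag lets <LABEL match too.
preg_match($pattern, '<LABEL for="email" class="req">', $plain);

// A label-like string hidden inside an attribute value: the pattern cannot
// tell that it is looking inside quotes, so it "finds" a label anyway.
preg_match($pattern, '<input value="<label for=\'foo\'>">', $fooled);

var_dump($plain[0], $plain[1], $fooled[0]);
```

The second call is the reason for the "Don't do that" warning: the callback would be handed a tag that is really part of an input's value.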
Now the code only needs a callback function that looks at the label tag and just returns it unchanged or returns it with class="error" inserted:
function fillInLabel($matches)
{
    global $formErrors;
    global $idToNameMap;
    $tag = $matches[0];
    $for = getAttributeVal($tag, "for");
    if (empty($for) or !isset($idToNameMap[$for])) { return $tag; }
    $name = $idToNameMap[$for];
    if (array_key_exists($name, $formErrors)) {
        return replaceAttributeVal($tag, 'class', 'error');
    }
    return $tag; // No error.
}
Callback functions passed to preg_replace_callback always have one argument--an array of stuff matched by the regular expression. $matches[0] is the entire match; $matches[1] is the stuff matched by the first set of parentheses; and so on. In this case, $matches[0] is the entire <label> tag. The code gets the label's for attribute and sees whether it corresponds to an entry in the $formErrors array; if so, it replaces the label's class attribute with class="error". If the label doesn't correspond to a form error, then the function returns the tag unchanged. I've used global variables to pass in the extra information that fillInLabel() needs.
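As a rough illustration of how $matches[0] and $matches[1] reach a callback, here is a self-contained sketch. It is not the article's fillInLabel: it assumes a hypothetical $errorIds array keyed by element id, passes it with a closure (PHP 5.3 or later) instead of globals, reads the for attribute with a simplified pattern that only handles double-quoted values, and ignores any class attribute the label already has.

```php
<?php
// Hypothetical input: ids of the fields that failed validation.
$errorIds = array('email' => true);

$html = '<LABEL for="email">Email</LABEL> <label for="name">Name</label>';

$result = preg_replace_callback(
    '/<label([^>]*)>/i',
    function ($matches) use ($errorIds) {
        // $matches[0] is the whole tag, $matches[1] the attributes inside it.
        // Simplified lookup: only for="..." with double quotes is recognized.
        if (preg_match('/\bfor\s*=\s*"([^"]*)"/i', $matches[1], $for)
            && isset($errorIds[$for[1]])) {
            // Keep the tag name as written (first six characters, "<label")
            // and put the class in front of the existing attributes.
            return substr($matches[0], 0, 6) . ' class="error"' . $matches[1] . '>';
        }
        return $matches[0]; // No error: return the tag unchanged.
    },
    $html
);

echo $result, "\n";
```

Whatever string the callback returns takes the place of the matched tag, so returning $matches[0] is how a callback says "leave this one alone."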
The patterns and callbacks for <input>, <select>, <textarea>, and <ul class="error"> are similar, just more complicated. The <select> callback is the trickiest--it uses preg_replace_callback recursively to find and replace <option> tags between the <select> and </select> tags. The regular expression for <input> tags is hairy because values like this are legal HTML 4:
<input name="foo" value="hello <smile>">
The > character in the quoted value means that the "anything besides a >" ([^>]*) regular expression won't work to get the guts of the input tag. Note that a raw < isn't legal in an XML/XHTML attribute value; it must be encoded as &lt;, and > is conventionally encoded as &gt; to match. Life will be so much simpler when HTML 4 is obsolete.
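The article doesn't reproduce the <input> pattern here, so the following is only a sketch of one way to make a pattern skip over quoted values. The $quoteAware pattern is an assumption for illustration, not necessarily the one fillInFormValues uses: it treats a double-quoted or single-quoted string as a single unit, so a > inside quotes no longer ends the tag.

```php
<?php
$tag = '<input name="foo" value="hello <smile>">';

// Naive pattern: stops at the first >, which is inside the quoted value.
$naive = '/<input([^>]*)>/i';

// Quote-aware sketch: the tag's guts are any mix of plain characters
// (not >, " or '), complete "..." strings, and complete '...' strings.
$quoteAware = '/<input((?:[^>"\']|"[^"]*"|\'[^\']*\')*)>/i';

preg_match($naive, $tag, $cut);
preg_match($quoteAware, $tag, $whole);

var_dump($cut[0], $whole[0]);
```

With the naive pattern the match ends right after <smile, leaving the closing quote and the real > behind; the quote-aware version matches the entire tag.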
The getAttributeVal and replaceAttributeVal functions use another powerful PHP regular expression function--preg_match_all. They use this regular expression to match attributes inside HTML tags:
/(\w+)((\s*=\s*".*?")|(\s*=\s*'.*?')|(\s*=\s*\w+)|())/s
It isn't quite as incomprehensible as it looks at first glance--for example, if passed a string like:
name="foo" value='123' style=purple checked
the first part of the regular expression (\w+) will match name, value, style, and checked--it matches one or more "word" characters. The rest of the regular expression matches one of the four ways you can specify attribute values in HTML. (\s*=\s*".*?") matches ="foo"; (\s*=\s*'.*?') matches ='123'; (\s*=\s*\w+) matches =purple; and () allows the checked attribute to match even though it's not followed by any value at all.
Because \w+ is "greedy," matching as many word characters as it can, and because the alternatives are tried from left to right with the empty () last, preg_match_all with the above regular expression will do exactly what you want, returning four matches when passed name="foo" value='123' style=purple checked.
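A quick way to check that claim is to run the pattern against the sample string and look at the first two capture groups. This sketch uses the article's pattern and sample; the variable names are arbitrary.

```php
<?php
$pattern = '/(\w+)((\s*=\s*".*?")|(\s*=\s*\'.*?\')|(\s*=\s*\w+)|())/s';
$attrs = 'name="foo" value=\'123\' style=purple checked';

// PREG_PATTERN_ORDER groups results by capture group:
// $matches[1] lists the attribute names, $matches[2] the raw "=value" parts.
$count = preg_match_all($pattern, $attrs, $matches, PREG_PATTERN_ORDER);

print_r($matches[1]);
print_r($matches[2]);
```

The raw values in $matches[2] still carry the = sign and any quotes, and the entry for checked is an empty string, which is why the calling code has to trim them.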
Here's the complete code for getAttributeVal:
/**
 * Returns value of $attribute, given guts of an HTML tag.
 * Returns false if attribute isn't set.
 * Returns empty string for no-value attributes.
 *
 * @param string $tag Guts of HTML tag, with or without the <tag and >.
 * @param string $attribute E.g. "name" or "value" or "width"
 * @return string|false Returns value of attribute (or false)
 */
function getAttributeVal($tag, $attribute) {
    $matches = array();
    // This regular expression matches attribute="value" or
    // attribute='value' or attribute=value or attribute
    // It's also constructed so $matches[1][...] will be the
    // attribute names, and $matches[2][...] will be the
    // attribute values.
    preg_match_all('/(\w+)((\s*=\s*".*?")|(\s*=\s*\'.*?\')|(\s*=\s*\w+)|())/s',
        $tag, $matches, PREG_PATTERN_ORDER);
    for ($i = 0; $i < count($matches[1]); $i++) {
        if (strtolower($matches[1][$i]) == strtolower($attribute)) {
            // Gotta trim off whitespace, = and any quotes:
            $result = ltrim($matches[2][$i], " \n\r\t=");
            // No-value attributes (e.g. checked) leave nothing to unquote,
            // and indexing an empty string would raise a warning.
            if ($result === '') { return $result; }
            if ($result[0] == '"') { $result = trim($result, '"'); }
            else { $result = trim($result, "'"); }
            return $result;
        }
    }
    return false;
}
Passing PREG_PATTERN_ORDER to preg_match_all returns the attribute names in $matches[1][$i] and the attribute values in $matches[2][$i], which is exactly what you want. replaceAttributeVal's code is very similar, except it passes in PREG_OFFSET_CAPTURE (available in PHP 4.3.0 or later) to determine where in the string all the attributes are, and then uses substr_replace to either replace an existing value or add the value to the HTML tag.
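Since replaceAttributeVal itself isn't listed, here is a hedged sketch of how the approach described above could look. The function name sketchReplaceAttributeVal and every detail of its body are assumptions, not the author's code: it assumes that a tag starting with < has its tag name as the first match (to be skipped), that new attributes go just before the closing > or />, and that the new value should be written with double quotes.

```php
<?php
// Hypothetical stand-in for replaceAttributeVal; not the article's code.
function sketchReplaceAttributeVal($tag, $attribute, $value)
{
    $pattern = '/(\w+)((\s*=\s*".*?")|(\s*=\s*\'.*?\')|(\s*=\s*\w+)|())/s';
    // With PREG_OFFSET_CAPTURE every entry becomes array(text, offset).
    preg_match_all($pattern, $tag, $matches,
        PREG_PATTERN_ORDER | PREG_OFFSET_CAPTURE);
    $quoted = '="' . htmlspecialchars($value, ENT_QUOTES) . '"';

    // Assumption: if the tag starts with <, the first match is the tag name.
    $start = (substr($tag, 0, 1) === '<') ? 1 : 0;
    for ($i = $start; $i < count($matches[1]); $i++) {
        if (strtolower($matches[1][$i][0]) === strtolower($attribute)) {
            // Overwrite the old "=value" part (possibly empty) in place.
            $old = $matches[2][$i][0];
            $offset = $matches[2][$i][1];
            return substr_replace($tag, $quoted, $offset, strlen($old));
        }
    }

    // Attribute not present: insert it before the closing > or />,
    // or at the very end if only the guts of the tag were passed in.
    $insertAt = strlen($tag);
    if (preg_match('/\s*\/?>\s*$/', $tag, $close, PREG_OFFSET_CAPTURE)) {
        $insertAt = $close[0][1];
    }
    return substr_replace($tag, ' ' . $attribute . $quoted, $insertAt, 0);
}

echo sketchReplaceAttributeVal('<label for="email">', 'class', 'error'), "\n";
echo sketchReplaceAttributeVal('<label class="plain" for="email">', 'class', 'error'), "\n";
```

The offsets are what make substr_replace usable here: the function never rebuilds the tag, it only overwrites or inserts at a known position, so the rest of the tag's text is left exactly as written.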
