Latest Posts

Zend Framework JSON-RPC Client Released

Posted in PHP, Zend Framework by Karl on

I recently wrote a JSON-RPC client for Zend Framework.

The code is designed to mirror the API for Zend’s own XML-RPC client as much as possible, so if you already know how to use that then this should be really simple to just plug in and start using.

Currently HTTP is supported, though I may also add support for TCP in a future version.

You can grab the code from my GitHub.

Convert Numbers to Words with PHP

Posted in PHP by Karl on

I was recently asked the best way to convert a number to its textual form using PHP. For example, how to convert the number 123 to the string “one hundred and twenty-three”.

There is no built-in PHP function to handle this, and the examples I found online tended to be too limited in some way or another, so I decided to write my own.

This function can accept numbers up to the system’s int size (2147483647 on 32-bit systems and 9223372036854775807 on 64-bit systems), both positive and negative, and can also handle real numbers (albeit with the inherent rounding inaccuracies involved in storing base 10 real numbers using base 2).

function convert_number_to_words($number) {
   
    $hyphen      = '-';
    $conjunction = ' and ';
    $separator   = ', ';
    $negative    = 'negative ';
    $decimal     = ' point ';
    $dictionary  = array(
        0                   => 'zero',
        1                   => 'one',
        2                   => 'two',
        3                   => 'three',
        4                   => 'four',
        5                   => 'five',
        6                   => 'six',
        7                   => 'seven',
        8                   => 'eight',
        9                   => 'nine',
        10                  => 'ten',
        11                  => 'eleven',
        12                  => 'twelve',
        13                  => 'thirteen',
        14                  => 'fourteen',
        15                  => 'fifteen',
        16                  => 'sixteen',
        17                  => 'seventeen',
        18                  => 'eighteen',
        19                  => 'nineteen',
        20                  => 'twenty',
        30                  => 'thirty',
        40                  => 'forty',
        50                  => 'fifty',
        60                  => 'sixty',
        70                  => 'seventy',
        80                  => 'eighty',
        90                  => 'ninety',
        100                 => 'hundred',
        1000                => 'thousand',
        1000000             => 'million',
        1000000000          => 'billion',
        1000000000000       => 'trillion',
        1000000000000000    => 'quadrillion',
        1000000000000000000 => 'quintillion'
    );
   
    if (!is_numeric($number)) {
        return false;
    }
   
    if (($number >= 0 && (int) $number < 0) || (int) $number < 0 - PHP_INT_MAX) {
        // overflow
        trigger_error(
            'convert_number_to_words only accepts numbers between -' . PHP_INT_MAX . ' and ' . PHP_INT_MAX,
            E_USER_WARNING
        );
        return false;
    }

    if ($number < 0) {
        return $negative . convert_number_to_words(abs($number));
    }
   
    $string = $fraction = null;
   
    if (strpos($number, '.') !== false) {
        list($number, $fraction) = explode('.', $number);
    }
   
    switch (true) {
        case $number < 21:
            $string = $dictionary[$number];
            break;
        case $number < 100:
            $tens   = ((int) ($number / 10)) * 10;
            $units  = $number % 10;
            $string = $dictionary[$tens];
            if ($units) {
                $string .= $hyphen . $dictionary[$units];
            }
            break;
        case $number < 1000:
            $hundreds  = (int) ($number / 100);
            $remainder = $number % 100;
            $string = $dictionary[$hundreds] . ' ' . $dictionary[100];
            if ($remainder) {
                $string .= $conjunction . convert_number_to_words($remainder);
            }
            break;
        default:
            $baseUnit = pow(1000, floor(log($number, 1000)));
            $numBaseUnits = (int) ($number / $baseUnit);
            $remainder = $number % $baseUnit;
            $string = convert_number_to_words($numBaseUnits) . ' ' . $dictionary[$baseUnit];
            if ($remainder) {
                $string .= $remainder < 100 ? $conjunction : $separator;
                $string .= convert_number_to_words($remainder);
            }
            break;
    }
   
    if (null !== $fraction && is_numeric($fraction)) {
        $string .= $decimal;
        $words = array();
        foreach (str_split((string) $fraction) as $number) {
            $words[] = $dictionary[$number];
        }
        $string .= implode(' ', $words);
    }
   
    return $string;
}

Example usage:

echo convert_number_to_words(123456789);
// one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine

echo convert_number_to_words(123456789.123);
// one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine point one two three

echo convert_number_to_words(-1922685.477);
// negative one million, nine hundred and twenty-two thousand, six hundred and eighty-five point four seven seven

// float rounding can be avoided by passing the number as a string
echo convert_number_to_words(123456789123.12345); // rounds the fractional part
// one hundred and twenty-three billion, four hundred and fifty-six million, seven hundred and eighty-nine thousand, one hundred and twenty-three point one two
echo convert_number_to_words('123456789123.12345'); // does not round
// one hundred and twenty-three billion, four hundred and fifty-six million, seven hundred and eighty-nine thousand, one hundred and twenty-three point one two three four five
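
The default branch of the function works by peeling off the largest power of 1000 and recursing on the remainder. The same grouping can be sketched with plain integer arithmetic. Note that split_into_thousands() is a hypothetical helper written for illustration only; it is not part of the function above, and it assumes PHP 7 or later for intdiv().

```php
<?php
// Hypothetical helper (not part of convert_number_to_words): splits a
// non-negative integer into groups of three digits, least significant first.
function split_into_thousands(int $number): array {
    $groups = array();
    do {
        $groups[] = $number % 1000;        // lowest three digits
        $number = intdiv($number, 1000);   // drop them and carry on
    } while ($number > 0);
    return $groups;
}

echo implode(', ', split_into_thousands(123456789));
// 789, 456, 123
```

Each group maps onto one scale word (thousand, million, billion and so on), which is why the dictionary only needs an entry for each power of 1000.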

XBMC PHP JSON-RPC Library Released: xbmc-php-rpc

Posted in PHP by Karl on

I’ve been playing around with XBMC quite a bit recently, and I decided to write a JSON-RPC library in PHP for interacting with an XBMC instance from another system.

XBMC supports JSON-RPC via HTTP and TCP, and in turn xbmc-php-rpc supports both mechanisms.
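
For a rough idea of what goes over the wire, the sketch below builds a JSON-RPC 2.0 request body with json_encode(). build_jsonrpc_request() is a hypothetical helper and not part of the xbmc-php-rpc API, and JSONRPC.Ping is used on the assumption that the XBMC version you are talking to exposes that method.

```php
<?php
// Hypothetical helper: builds a JSON-RPC 2.0 request body as a JSON string.
// This is NOT the xbmc-php-rpc API, just an illustration of the payload.
function build_jsonrpc_request($method, array $params = array(), $id = 1) {
    $request = array(
        'jsonrpc' => '2.0',
        'method'  => $method,
        'id'      => $id,
    );
    if (!empty($params)) {
        // params are optional in JSON-RPC 2.0, so only add them when given
        $request['params'] = $params;
    }
    return json_encode($request);
}

echo build_jsonrpc_request('JSONRPC.Ping');
// {"jsonrpc":"2.0","method":"JSONRPC.Ping","id":1}
```

The same body can be sent as an HTTP POST or written to a TCP socket; only the transport differs.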

I uploaded the initial release today. You can get a copy via my GitHub.

Full documentation is forthcoming, but for now the README contains instructions for getting started.

CakePHP Valid XHTML/XML Behavior

Posted in CakePHP, PHP by Karl on

If you have ever written a CMS-type application where you accept input from users to be stored as valid XHTML, you will probably have come up against some problems!

Generally this task is accomplished by using a javascript WYSIWYG real-time editor on the client side in order to keep things simple for content editors, and the resulting markup is stored on the server. Often though, content editors tend to work in Microsoft Word and paste their content into the javascript editor. That’s where the fun begins! Windows uses its own character set (thanks Microsoft!) known as code page 1252 which, whilst being mostly compatible with the much more common latin-1 character set, is not something you generally want to use on the web – UTF-8 is a much more sensible way to go. If the content is to be stored in a database, you also need to ensure it matches the character set used by your table.

Aside from Microsoft-induced headaches, you often have little control over the markup itself. Even the best javascript editors don’t get everything 100% correct all the time, and as well as technological issues there is also potential for human error (inserting unencoded html entities for example).

All in all then, you can’t really trust the markup you receive to be valid UTF-8 encoded XHTML. I found myself in this position during the development of a CMS using CakePHP, so I decided to write a Model Behaviour which can be used to clean up strings of markup to ensure they are valid and properly encoded. It ensures that the content is free of code page 1252 characters by converting them to UTF-8, replaces any unencoded HTML entities with their properly encoded equivalents (e.g. & -> &amp;), fixes any invalid XHTML, and cleans and tidies the source code nicely.
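
As a minimal sketch of the first two steps, assuming the pasted text has ended up with code page 1252 quote characters stored as UTF-8 encoded control characters, the snippet below swaps them for the real quotation marks and then encodes the stray ampersand. The two-entry $map and the sample string are made up for illustration; the behaviour itself uses a much fuller map.

```php
<?php
// Two illustrative entries: mis-decoded CP1252 bytes (left) mapped to the
// UTF-8 characters the author actually typed (right).
$map = array(
    "\xc2\x93" => "\xe2\x80\x9c", // left double quotation mark
    "\xc2\x94" => "\xe2\x80\x9d", // right double quotation mark
);

// Sample input as it might arrive after a paste from Word.
$pasted = "\xc2\x93Hello\xc2\x94 & goodbye";

// Step 1: replace the CP1252 leftovers with proper UTF-8 characters.
$fixed = strtr($pasted, $map);

// Step 2: encode special characters such as the bare ampersand.
$encoded = htmlspecialchars($fixed, ENT_QUOTES, 'UTF-8');

echo $encoded;
// “Hello” &amp; goodbye
```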

The behaviour’s configuration is pretty simple. You just need to specify which fields should be automatically processed before they are saved:

public $actsAs = array(
    'ValidXhtml' => array(
        'fields' => array('content')
    )
);

You can optionally specify whether to tidy the markup for each field using the PHP Tidy extension (default is true, so you only need to specify this to disable tidy):

public $actsAs = array(
    'ValidXhtml' => array(
        'fields' => array(
            'content' => array('tidy' => false)
        )
    )
);

Obviously you need to have the tidy extension available to use that feature, but the Behaviour checks for the extension and will automatically configure itself accordingly, so there is no need to explicitly disable tidy if you don’t have it installed.

<?php

class ValidXhtmlBehavior extends ModelBehavior {
   
    private $_defaults = array(
        'fields' => array()
    );
   
    private $_preProcessMap = array(
        // replace empty div tags with a div containing &nbsp;
        '/<div>\s*<\/div>/i' => '<div>&nbsp;</div>'
    );
   
    // Map of windows 1252 character points to utf-8 character points
    private $_cp1252Map = array(
        "\xc2\x80" => "\xe2\x82\xac", /* EURO SIGN */
        "\xc2\x82" => "\xe2\x80\x9a", /* SINGLE LOW-9 QUOTATION MARK */
        "\xc2\x83" => "\xc6\x92",     /* LATIN SMALL LETTER F WITH HOOK */
        "\xc2\x84" => "\xe2\x80\x9e", /* DOUBLE LOW-9 QUOTATION MARK */
        "\xc2\x85" => "\xe2\x80\xa6", /* HORIZONTAL ELLIPSIS */
        "\xc2\x86" => "\xe2\x80\xa0", /* DAGGER */
        "\xc2\x87" => "\xe2\x80\xa1", /* DOUBLE DAGGER */
        "\xc2\x88" => "\xcb\x86",     /* MODIFIER LETTER CIRCUMFLEX ACCENT */
        "\xc2\x89" => "\xe2\x80\xb0", /* PER MILLE SIGN */
        "\xc2\x8a" => "\xc5\xa0",     /* LATIN CAPITAL LETTER S WITH CARON */
        "\xc2\x8b" => "\xe2\x80\xb9", /* SINGLE LEFT-POINTING ANGLE QUOTATION */
        "\xc2\x8c" => "\xc5\x92",     /* LATIN CAPITAL LIGATURE OE */
        "\xc2\x8e" => "\xc5\xbd",     /* LATIN CAPITAL LETTER Z WITH CARON */
        "\xc2\x91" => "\xe2\x80\x98", /* LEFT SINGLE QUOTATION MARK */
        "\xc2\x92" => "\xe2\x80\x99", /* RIGHT SINGLE QUOTATION MARK */
        "\xc2\x93" => "\xe2\x80\x9c", /* LEFT DOUBLE QUOTATION MARK */
        "\xc2\x94" => "\xe2\x80\x9d", /* RIGHT DOUBLE QUOTATION MARK */
        "\xc2\x95" => "\xe2\x80\xa2", /* BULLET */
        "\xc2\x96" => "\xe2\x80\x93", /* EN DASH */
        "\xc2\x97" => "\xe2\x80\x94", /* EM DASH */
        "\xc2\x98" => "\xcb\x9c",     /* SMALL TILDE */
        "\xc2\x99" => "\xe2\x84\xa2", /* TRADE MARK SIGN */
        "\xc2\x9a" => "\xc5\xa1",     /* LATIN SMALL LETTER S WITH CARON */
        "\xc2\x9b" => "\xe2\x80\xba", /* SINGLE RIGHT-POINTING ANGLE QUOTATION*/
        "\xc2\x9c" => "\xc5\x93",     /* LATIN SMALL LIGATURE OE */
        "\xc2\x9e" => "\xc5\xbe",     /* LATIN SMALL LETTER Z WITH CARON */
        "\xc2\x9f" => "\xc5\xb8"      /* LATIN CAPITAL LETTER Y WITH DIAERESIS*/
    );
   
    // Map of utf-8 character points to special html entities
    private $_entMap = array(
        "\xe2\x80\x98" => '&lsquo;',
        "\xe2\x80\x99" => '&rsquo;',
        "\xe2\x80\x9c" => '&ldquo;',
        "\xe2\x80\x9d" => '&rdquo;',
        "\xe2\x82\xac" => '&euro;',
        "\xe2\x80\xa6" => '&hellip;'
    );
   
    /*
     For reference, these are other entity replacement codes which might be useful one day
    array(
        "\xe2\x80\x9a" => '&sbquo;',    // Single Low-9 Quotation Mark
        "\xe2\x82\xac" => '&euro;',     // Euro sign
        "\xc6\x92"     => '&fnof;',     // Latin Small Letter F With Hook
        "\xe2\x80\x9e" => '&bdquo;',    // Double Low-9 Quotation Mark
        "\xe2\x80\xa6" => '&hellip;',   // Horizontal Ellipsis
        "\xe2\x80\xa0" => '&dagger;',   // Dagger
        "\xe2\x80\xa1" => '&Dagger;',   // Double Dagger
        "\xcb\x86"     => '&circ;',     // Modifier Letter Circumflex Accent
        "\xe2\x80\xb0" => '&permil;',   // Per Mille Sign
        "\xc5\xa0"     => '&Scaron;',   // Latin Capital Letter S With Caron
        "\xe2\x80\xb9" => '&lsaquo;',   // Single Left-Pointing Angle Quotation Mark
        "\xc5\x92"     => '&OElig;',    // Latin Capital Ligature OE
        "\xe2\x80\x98" => '&lsquo;',    // Left Single Quotation Mark
        "\xe2\x80\x99" => '&rsquo;',    // Right Single Quotation Mark
        "\xe2\x80\x9c" => '&ldquo;',    // Left Double Quotation Mark
        "\xe2\x80\x9d" => '&rdquo;',    // Right Double Quotation Mark
        "\xe2\x80\xa2" => '&bull;',     // Bullet
        "\xe2\x80\x93" => '&ndash;',    // En Dash
        "\xe2\x80\x94" => '&mdash;',    // Em Dash
        "\xcb\x9c"     => '&tilde;',    // Small Tilde
        "\xe2\x84\xa2" => '&trade;',    // Trade Mark Sign
        "\xc5\xa1"     => '&scaron;',   // Latin Small Letter S With Caron
        "\xe2\x80\xba" => '&rsaquo;',   // Single Right-Pointing Angle Quotation Mark
        "\xc5\x93"     => '&oelig;',    // Latin Small Ligature OE
        "\xc5\xb8"     => '&Yuml;',     // Latin Capital Letter Y With Diaeresis
    );
    */

   
    public function setup($model, $config = array()) {
        $this->settings[$model->alias] = array_merge($this->_defaults, (array) $config);
    }
   
    public function beforeSave($model) {
        if (!empty($this->settings[$model->alias]['fields'])) {
            foreach ($this->settings[$model->alias]['fields'] as $key => $value) {
                if (is_array($value)) {
                    $options = $value;
                    $field = $key;
                } else {
                    // reset the options so settings from a previous field do not leak
                    $options = array();
                    $field = $value;
                }
                $options['tidy'] = isset($options['tidy']) ? $options['tidy'] : true;
                if (isset($model->data[$model->alias][$field])) {
                    $model->data[$model->alias][$field] =
                        $this->makeValid($model->data[$model->alias][$field], $options['tidy']);
                }
            }
        }
        return true;
    }
   
    public function makeValid($string, $tidy = true) {
       
        $string = trim($string);
       
        // apply the pre-process map
        $string = preg_replace(array_keys($this->_preProcessMap), $this->_preProcessMap, $string);
       
        // apply the windows > utf8 map
        $string = str_replace(array_keys($this->_cp1252Map), $this->_cp1252Map, $string);
       
        // get rid of any existing html entities to avoid double encoding
        $string = html_entity_decode($string, ENT_QUOTES, 'UTF-8');
       
        // break out any PHP sections since they should not be touched
        $parts = preg_split('/(<\?.+?\?>)/us', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
       
        // replace &, ", ', < and > with their entities, but only where they are not
        // part of an html tag or a comment
        $string = '';
        foreach ($parts as $part) {
            if (false === mb_strpos(trim($part), '<?')) {
                $string .= preg_replace_callback(
                    '/(?<=\>)((?![<](\?|\/)*[a-z][^>]*[>])[^<])+/ius',
                    // create_function() was removed in PHP 8; a closure does the same job
                    function ($matches) {
                        return htmlspecialchars($matches[0]);
                    },
                    $part
                );
            } else {
                $string .= $part;
            }
        }
       
        // apply the utf-8 > entities map
        $string = str_replace(array_keys($this->_entMap), $this->_entMap, $string);
       
        // trim whitespace from the end of each line and add a nice \n
        // tinymce in particular seems to have a bug where it will insert spaces
        // at the end of lines - this can cause problems with things like Revision
        // Behavior as the values of some fields will never be the same so a revision
        // is always saved even if the data itself has not changed.
        $parts = preg_split("/[\r\n]+/u", $string);
        foreach ($parts as &$part) {
            $part = rtrim($part);
        }
        $string = implode("\n", $parts);
       
        // tidy the output
        if ($tidy && extension_loaded('tidy')) {
            $tidy_config = array(
                'output-xhtml' => true,
                'show-body-only' => true,
                'indent' => true,
                'indent-spaces' => 4,
                'sort-attributes' => 'alpha',
                'wrap' => 80,
                'preserve-entities' => true,
                'join-styles' => false,
                'logical-emphasis' => true,
                'enclose-text' => true
            );
            $tidyDoc = tidy_parse_string($string, $tidy_config, 'UTF8');
            $tidyDoc->cleanRepair();
            // cast back to a string so callers do not receive a tidy object
            $string = (string) $tidyDoc;
        }
       
        return $string;
       
    }
   
}

?>

CakePHP Traceable Model Behavior

Posted in CakePHP, PHP by Karl on

Sometimes it is useful to be able to associate an authenticated user with an action carried out on a particular database table entry. For example, if you have a collection of recipes in a database which can be soft-deleted (i.e., marked as deleted but not actually removed from the database in order to allow for undo functionality), you might want to keep track of which user deleted a particular recipe. This is relatively easy to do in CakePHP; you can simply use association aliases to link a particular User to the deletion action by adding a corresponding foreign key to the Recipe table and defining the association accordingly:

public $belongsTo = array(
    'DeletedBy' => array(
        'className' => 'User',
        'foreignKey' => 'deleted_by'
    )
);

This is all well and good, but you have to manage the foreign keys for each of these associations somehow, which would typically involve either adding a hidden form field to the appropriate view containing the currently authenticated user’s id, or using the beforeSave model callback to add the user’s id into the data set. This can soon get cumbersome if you have many models which you need to trace in this way. To solve this issue I wrote a model behaviour which automates the process of associating the user performing an action with the model.

The behaviour works via a system of trigger and target fields. When the value of a trigger field is set, the currently authenticated user is automatically associated with the model using an alias which corresponds to the action being performed. If the value of a trigger field should be ‘unset’ (more on what constitutes unset in a minute), the association will be empty.

To clarify with an example, if you have a TINYINT(1) field named ‘deleted’ which indicates the soft-deletion state of a recipe (1 = deleted, 0 = not deleted), the behaviour will automatically provide an association with the user who performed the deletion, using an alias ‘DeletedBy’. The result of a find call to the Recipe model where the recipe’s deleted field = 1 would then be as follows:

Array (
    'Recipe' => Array (
        'id' => 1
        // etc
    ),
    'DeletedBy' => Array (
        'id' => 1,
        'username' => 'Karl'
        // etc
    )
)

For recipes whose deleted field = 0, the result would be:

Array (
    'Recipe' => Array (
        'id' => 1
        // etc
    ),
    'DeletedBy' => Array ()
)

Note that the association is still present, but it contains no data, because no user deleted the recipe.
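
The alias names follow directly from the target field names: deleted_by becomes DeletedBy. The behaviour uses CakePHP's Inflector::camelize() for this; the sketch below reproduces the same transformation in plain PHP so it can be run outside of Cake. alias_for_target_field() is a hypothetical name used only for this example.

```php
<?php
// Hypothetical stand-in for Inflector::camelize(), for illustration only:
// turns an underscored field name into a CamelCased association alias.
function alias_for_target_field($field) {
    // 'deleted_by' -> 'deleted by' -> 'Deleted By' -> 'DeletedBy'
    return str_replace(' ', '', ucwords(str_replace('_', ' ', $field)));
}

echo alias_for_target_field('deleted_by');
// DeletedBy
```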

When determining if the value of a trigger field such as ‘deleted’ is set or unset, the behaviour uses two mechanisms. The first is simply to test if the value is empty according to PHP’s empty() function – if it is, the value is considered unset, otherwise it is considered set. The second mechanism is to check the value against a list of values which would not be considered empty according to PHP’s empty() function, but which you want to treat as unset anyway. An example of this would be an empty datetime string (“0000-00-00 00:00:00”); if it represents the date a recipe was deleted, it should be considered unset. By default the behaviour is set up to treat empty datetime strings as unset, but other arbitrary values can be added to this.
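
The two checks can be sketched as a small standalone function. is_unset_value() and its $extraEmpty parameter are hypothetical names for this example; the behaviour performs the equivalent test inline. The sketch uses a strict in_array() comparison, which is an assumption made here to avoid PHP's loose string/number comparisons.

```php
<?php
// Hypothetical helper mirroring the two mechanisms described above.
function is_unset_value($value, array $extraEmpty = array('0000-00-00 00:00:00')) {
    // Mechanism 1: anything PHP's empty() treats as empty is unset.
    if (empty($value)) {
        return true;
    }
    // Mechanism 2: values that are not empty() but should count as unset anyway.
    return in_array($value, $extraEmpty, true);
}

var_dump(is_unset_value(0));                     // bool(true)
var_dump(is_unset_value('0000-00-00 00:00:00')); // bool(true)
var_dump(is_unset_value('2011-03-01 09:30:00')); // bool(false)
```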

When a trigger field is determined to be unset, the target field’s value is set to 0 by default. This can be configured on a per-model basis.
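
Putting the trigger and target fields together, the effect on the data being saved looks roughly like this. apply_trace() is a hypothetical, framework-free function operating on a plain array; the behaviour does the same thing against the model's data inside a beforeSave callback.

```php
<?php
// Hypothetical, framework-free sketch of the trigger -> target logic.
// $fields maps trigger field names to target field names.
function apply_trace(array $data, array $fields, $userId, $restoredValue = 0) {
    foreach ($fields as $trigger => $target) {
        if (!array_key_exists($trigger, $data)) {
            continue; // trigger not being saved, leave the target alone
        }
        $value = $data[$trigger];
        $isUnset = empty($value) || $value === '0000-00-00 00:00:00';
        // unset trigger -> restored value, set trigger -> current user's id
        $data[$target] = $isUnset ? $restoredValue : $userId;
    }
    return $data;
}

print_r(apply_trace(array('deleted' => 1), array('deleted' => 'deleted_by'), 42));
// Array ( [deleted] => 1 [deleted_by] => 42 )
```

Passing a different $restoredValue (null, for instance) corresponds to changing the per-model setting mentioned above.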

The behaviour can handle any number of associations simultaneously – you might want to keep track of who created or modified a particular row for example. There really is no limit: as long as you define a trigger field and a corresponding target field, the behaviour will automatically handle the associations.

By default the behaviour will use the trigger fields ‘created’, ‘modified’, ‘deleted’, ‘hidden’ and ‘locked’, using the target fields ‘created_by’, ‘modified_by’, ‘deleted_by’, ‘hidden_by’ and ‘locked_by’ (though these defaults can be easily configured). Extra trigger and target fields can be set on a per-model basis by defining them when adding the behaviour to the model (note that you will also need to re-include any default fields which you may need):

public $actsAs = array(
    'Traceable' => array(
        'fields' => array(
            'created' => 'created_by',
            'modified' => 'modified_by',
            'published' => 'published_by'
        )
    )
);

There are a few other settings which can be defined here too:

public $actsAs = array(
    'Traceable' => array(
        // The name of the user model used for Authentication. Defaults to 'User'
        'user_model' => 'User',
        // An array of trigger => target fields to trace for this model.
        'fields' => array(
            'created' => 'created_by',
            'modified' => 'modified_by',
            'published' => 'published_by'
        ),
        // The value to which target fields will be set when the corresponding trigger field is unset.
        // Defaults to 0.
        'restored_value' => 0
    )
);

Enough explanation, here is the code.

<?php
/**
 *
 * Automagically adds the logged in user's id to the specified fields when certain
 * events occur within the model.
 *
 * NB: If this behavior is not working, be sure you are not using saveField to save the
 * trigger field. If you do, no other fields (including the target field) will be saved.
 * Instead use a standard model->save call. This is a cake issue so nothing can be
 * done without altering the core.
 *
 * This works on a system of trigger fields and target fields. When a 'trigger' field
 * is saved, it updates a corresponding 'target' field. For example, if you have
 * a trigger field called 'deleted' with a corresponding target field of 'deleted_by',
 * when deleted is saved with an empty value (anything which causes empty() to return true)
 * then the deleted_by field is set to a specified value (by default, 0). When deleted
 * is saved with a non-empty value, deleted_by is updated with the id of the currently
 * Authed user.
 *
 * When using find queries on a model which implements this behaviour, a
 * user model will be generated for each target field and stored under a corresponding
 * name in the data array (assuming the recursive level permits of course).
 * For example, a User model for the user identified in the created_by field will
 * be stored under the CreatedBy key in the returned data array:
 *
 * Array (
 *     'Webpages' => Array (
 *         'id' => 1
 *         // etc
 *     ),
 *     'CreatedBy' => Array (
 *         'id' => 1,
 *         'username' => 'Karl'
 *         // etc
 *     )
 * )
 *
 * In the event that no user model can be associated with the target field (for
 * example, if the target field contains 0 or NULL), the target field data will
 * be set to an empty array:
 *
 * Array (
 *     'Webpages' => Array (
 *         'id' => 1
 *         // etc
 *     ),
 *     'CreatedBy' => Array ()
 * )
 *
 * @author Karl Rixon <karl@karlrixon.co.uk>
 * @version 1.0
 *
 **/

class TraceableBehavior extends ModelBehavior {

    /**
     * @var mixed An array of default configuration options
     * @access private
     */

    private $_defaults = array(
        // The name of the user model.
        'user_model' => 'User',
        // An array of fields, with a trigger field as key and a target field as value.
        // When the trigger field is set to a non empty() value, the user's id is
        // inserted into the corresponding target field. When it is set to an empty()
        // value, the target field is reset to restored_value.
        'fields'  => array(
            'created' => 'created_by',
            'modified' => 'modified_by',
            'deleted' => 'deleted_by',
            'hidden' => 'hidden_by',
            'locked' => 'locked_by'
        ),
        // The value to which target fields will be set when toggled back.
        'restored_value' => 0,
        // If true, a user model representing each target field will be automatically bound
        // to the model which implements this behaviour (using a belongsTo association).
        'auto_bind' => true,
        'map' => array()
    );
   
    /**
     * @var int Stores the id of the currently Authed user.
     * @access private
     */

    private $_userId = 0;
   
    /**
     * @var mixed An array of values which should indicate that a field has been emptied. There
     * is no need to specify the values which evaluate true using PHP's empty() function;
     * this is intended for custom values which would not normally be considered empty().
     * @access private
     */

    private $_empty = array(
        '0000-00-00 00:00:00' // empty datetime
    );
   
    private $_noTrace = false;

    /**
     * Initialises the behaviour.
     *
     * @param Model $model A reference to the model object to which this behaviour is attached.
     * @param mixed $config An array of behaviour configuration options to be merged with the defaults.
     * @return void
     * @author Karl Rixon <karlrixon@gmail.com>
     * @version 1.0
     * @since 1.0
     */

    function setup($model, $config = array()) {
       
        if (!$model->useTable) {
            // Model is not tied to a database table.
            return;
        }
       
        $this->settings[$model->alias] = array_merge($this->_defaults, (array) $config);

        if (empty($this->_userId)) {
            $this->_userId = $this->_getAuthedUserId($model);
        }
        $this->settings[$model->alias]['map'] = $this->_buildFieldMap($model);
       
        if ($this->settings[$model->alias]['auto_bind']) {
            $this->_bindModels($model);
        }
       
    }
   
    /**
     * Called before model data is saved.
     *
     * This method is just used as a place to attach a call to the _trace method which
     * does all the hard work of managing the trigger/target fields.
     *
     * @param Model $model A reference to the model object to which this behaviour is attached.
     * @return boolean Always returns true to allow the save to continue as normal.
     * @author Karl Rixon <karlrixon@gmail.com>
     * @version 1.0
     * @since 1.0
     */

    function beforeSave($model) {
        if (!empty($this->settings[$model->alias]['map'])) {
            $this->_trace($model);
        }
        return true;
    }
   
    /**
     * Cleans up the found data, changing empty association models to empty arrays.
     *
     * Without this, any empty association model data will contain all of the keys
     * from the User model's table. It seems neater to return an empty array for any
     * items which do not have a matching user model.
     *
     * @param Model $model A reference to the model object to which this behaviour is attached.
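     * @param mixed $results An array containing the data returned from the find.
     * @param boolean $primary Whether this model is the one the find was originally called on.
     * @return mixed The results array, with empty association data replaced by empty arrays.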
     * @author Karl Rixon <karlrixon@gmail.com>
     * @version 1.0
     * @since 1.0
    */

    function afterFind($model, $results, $primary) {
       
        if (empty($this->settings[$model->alias]['map'])) {
            return $results;
        }
       
        if ($model->recursive == -1) {
            return $results;
        }
       
        if (!empty($results) && $primary) {
            foreach ($results as &$result) {
                if (!isset($result[$model->alias])) {
                    continue;
                }
                foreach ($this->settings[$model->alias]['map'] as $associationName) {
                    if (empty($result[$associationName]['id'])) {
                        $result[$associationName] = array();
                    }
                }
            }
        }
       
        return $results;
       
    }
   
    /**
     * Checks for any triggered fields, and sets the corresponding target field accordingly.
     *
     * If a triggered field is found in the data, and its value is empty (anything
     * which evaluates to true using PHP's empty() function), its target field is
     * set to the restored_value. If it is not empty, its target field is set to the
     * id of the currently Authed user.
     *
     * @param Model $model A reference to the model object to which this behaviour is attached.
     * @return bool True if successful, false if an error occurred. Note that true will be
     *  returned even if no trigger fields were found. False will only be returned if
     *  an actual error occurred - the lack of trigger fields is not considered an error.
     * @author Karl Rixon <karlrixon@gmail.com>
     * @version 1.0
     * @since 1.0
     */

    private function _trace(&$model) {
       
        if (!isset($this->settings[$model->alias]['fields'])) {
            return false;
        } elseif (!$this->_userId) {
            return false;
        }
       
        foreach ($this->settings[$model->alias]['fields'] as $trigger => $target) {
            if (isset($model->data[$model->alias][$trigger])) {
                if (empty($model->data[$model->alias][$trigger]) || in_array($model->data[$model->alias][$trigger], $this->_empty)) {
                    $model->data[$model->alias][$target] = $this->settings[$model->alias]['restored_value'];
                } else {
                    $model->data[$model->alias][$target] = $this->_userId;
                }
            }
        }
       
        return true;
       
    }

    /**
     * Gets the id of the currently Authed user.
     */

    private function _getAuthedUserId($model) {
        App::import('Component', 'Session');
        $session = new SessionComponent();
        return $session->read('Auth.' . $this->settings[$model->alias]['user_model'] . '.id');
    }

    /**
     * Builds an array with field names as keys, and camelized versions of
     * the field names as values.
     */

    private function _buildFieldMap($model) {
       
        // Get an array of all fields which exist in both the model's table, and in
        // the behaviour settings for this model.
        $fields = array_values(array_intersect(
            $this->settings[$model->alias]['fields'],
            array_keys($model->_schema)
        ));
       
        $map = array();
        foreach ($fields as $field) {
            $map[$field] = Inflector::camelize($field);
        }

        return $map;
   
    }

    /**
     * Binds any models which should be bound in order to represent the
     * User who is to be traced.
     */

    private function _bindModels($model) {
       
        if (empty($this->settings[$model->alias]['map'])) {
            return false;
        }
       
        $models = array();
        foreach ($this->settings[$model->alias]['map'] as $foreignKey => $associationName) {
            $models['belongsTo'][$associationName] = array(
                'className'  => $this->settings[$model->alias]['user_model'],
                'foreignKey' => $foreignKey
            );
        }
        if (sizeof($models) > 0) {
            $model->bindModel($models, false);
        }
    }

}

?>
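
To make the field map built by _buildFieldMap() more concrete, here is a rough standalone sketch of the same idea that runs without CakePHP. The camelize_field() function is only a stand-in for Inflector::camelize(), and the field names (deleted, deleted_by, locked, locked_by) are hypothetical examples rather than anything the behaviour requires.

```php
<?php
// Stand-in for CakePHP's Inflector::camelize(), assumed here so that the
// sketch can run outside the framework.
function camelize_field($field) {
    return str_replace(' ', '', ucwords(str_replace('_', ' ', $field)));
}

// Keep only the configured target fields which exist in the schema, and map
// each one to a camelized name suitable for use as an association name.
function build_field_map(array $fields, array $schema) {
    $present = array_values(array_intersect($fields, array_keys($schema)));
    $map = array();
    foreach ($present as $field) {
        $map[$field] = camelize_field($field);
    }
    return $map;
}

// Hypothetical settings: trigger field => target field.
$fields = array('deleted' => 'deleted_by', 'locked' => 'locked_by');
// Hypothetical schema: only deleted_by exists in the table.
$schema = array('id' => array(), 'deleted' => array(), 'deleted_by' => array());

$map = build_field_map($fields, $schema);
// $map is array('deleted_by' => 'DeletedBy')
```

Each key in the map is later used as a foreign key and each value as the name of a belongsTo association, which is why fields missing from the table are filtered out first.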

PHP Sudoku Solver Class

Posted in PHP by Karl on

I wrote this small class to solve Sudoku puzzles. It also has a very basic Sudoku puzzle generator method. The code uses a backtracking algorithm to solve the puzzles and should be fairly fast even on slower computers.

class Sudoku {
   
    private $_matrix;
   
    public function __construct(array $matrix = null) {
        if (!isset($matrix)) {
            $this->_matrix = $this->_getEmptyMatrix();
        } else {
            $this->_matrix = $matrix;
        }
    }
   
    public function generate() {
        $this->_matrix = $this->_solve($this->_getEmptyMatrix());
        $cells = array_rand(range(0, 80), 30);
        $i = 0;
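        foreach ($this->_matrix as &$row) {
            foreach ($row as &$cell) {
                // Blank every cell which was not picked to remain as a clue.
                if (!in_array($i++, $cells)) {
                    $cell = null;
                }
            }
        }
        // Break the references left behind by the by-reference loops.
        unset($row, $cell);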
        return $this->_matrix;
    }
   
    public function solve() {
        $this->_matrix = $this->_solve($this->_matrix);
        return $this->_matrix;
    }
   
    public function getHtml() {
        // Build the table as a string and return it, so that callers can echo it.
        $html = '<table border="1">';
        for ($row = 0; $row < 9; $row++) {
            $html .= '<tr>';
            for ($column = 0; $column < 9; $column++) {
                $html .= '<td>' . $this->_matrix[$row][$column] . '</td>';
            }
            $html .= '</tr>';
        }
        $html .= '</table>';
        return $html;
    }
   
    private function _getEmptyMatrix() {
        return array_fill(0, 9, array_fill(0, 9, 0));
    }
   
    private function _solve($matrix) {
        while(true) {
            $options = array();
            foreach ($matrix as $rowIndex => $row) {
                foreach ($row as $columnIndex => $cell) {
                    if (!empty($cell)) {
                        continue;
                    }
                    $permissible = $this->_getPermissible($matrix, $rowIndex, $columnIndex);
                    if (count($permissible) == 0) {
                        return false;
                    }
                    $options[] = array(
                        'rowIndex' => $rowIndex,
                        'columnIndex' => $columnIndex,
                        'permissible' => $permissible
                    );
                }
            }
            if (count($options) == 0) {
                return $matrix;
            }
           
            usort($options, array($this, '_sortOptions'));
           
            if (count($options[0]['permissible']) == 1) {
                $matrix[$options[0]['rowIndex']][$options[0]['columnIndex']] = current($options[0]['permissible']);
                continue;
            }
           
            foreach ($options[0]['permissible'] as $value) {
                $tmp = $matrix;
                $tmp[$options[0]['rowIndex']][$options[0]['columnIndex']] = $value;
                if ($result = $this->_solve($tmp)) {
                    return $result;
                }
            }
           
            return false;
        }
    }
   
    private function _getPermissible($matrix, $rowIndex, $columnIndex) {
        $valid = range(1, 9);
        $invalid = $matrix[$rowIndex];
        for ($i = 0; $i < 9; $i++) {
            $invalid[] = $matrix[$i][$columnIndex];
        }
        // Top-left corner of the 3x3 box which contains this cell.
        $box_row = $rowIndex - $rowIndex % 3;
        $box_col = $columnIndex - $columnIndex % 3;
        $invalid = array_unique(array_merge(
            $invalid,
            array_slice($matrix[$box_row], $box_col, 3),
            array_slice($matrix[$box_row + 1], $box_col, 3),
            array_slice($matrix[$box_row + 2], $box_col, 3)
        ));
        $valid = array_diff($valid, $invalid);
        shuffle($valid);
        return $valid;
    }
   
    private function _sortOptions($a, $b) {
        $a = count($a['permissible']);
        $b = count($b['permissible']);
        if ($a == $b) {
            return 0;
        }
        return ($a < $b) ? -1 : 1;
    }
   
}

Usage is very simple. Either pass the constructor a two-dimensional array representing the grid (9 arrays of 9 elements, with 0 for each empty cell) and then call the solve() method, or pass nothing in and call generate() to create a new puzzle. Either way the results can be displayed as an HTML table by calling the getHtml() method.

$grid = array(
    array(0,0,0,0,0,0,2,0,3),
    array(8,0,7,0,0,0,0,6,0),
    array(0,0,2,6,5,0,0,0,8),
    array(0,3,0,0,0,0,0,0,0),
    array(7,5,0,2,0,0,1,0,0),
    array(0,0,1,0,3,0,5,0,0),
    array(4,0,0,5,0,0,8,7,0),
    array(6,0,0,0,4,2,0,0,0),
    array(0,9,5,0,6,0,0,2,0)
);
$s = new Sudoku($grid);
$s->solve();
echo $s->getHtml();

$s2 = new Sudoku();
$s2->generate();
echo $s2->getHtml();
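
If you want to sanity-check a grid returned by solve(), something along these lines should do. This is only a sketch: is_valid_sudoku_solution() is a hypothetical helper which is not part of the class above, and it assumes a complete 9x9 array of integers.

```php
<?php
// Hypothetical helper: returns true when every row, column and 3x3 box of
// $grid contains each of the digits 1 to 9 exactly once.
function is_valid_sudoku_solution(array $grid) {
    $expected = range(1, 9);
    for ($i = 0; $i < 9; $i++) {
        $row = $grid[$i];
        $column = array();
        $box = array();
        for ($j = 0; $j < 9; $j++) {
            $column[] = $grid[$j][$i];
            // Box $i has its top-left corner at row 3 * floor($i / 3),
            // column 3 * ($i % 3); $j walks the nine cells inside it.
            $box[] = $grid[3 * (int) ($i / 3) + (int) ($j / 3)][3 * ($i % 3) + $j % 3];
        }
        foreach (array($row, $column, $box) as $unit) {
            sort($unit);
            if ($unit != $expected) {
                return false;
            }
        }
    }
    return true;
}
```

Note that solve() returns false when it cannot find a solution, so check for that before passing the result to a helper like this.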

PHP Regular Expression Fails Silently on Long Strings

Posted in PHP by Karl on

I had an odd bug today which took me a while to track down. I was using preg_replace_callback to match blocks of code in a string and hand them off to GeSHi for syntax highlighting. However, I found that some blocks which should have matched were not being matched. I couldn’t find any explanation for it at all. I was using the following non-greedy-match-anything sub-pattern:

(.*?)

There really shouldn’t be any reason why that would fail to match. I gradually started removing bits of text from my string to try to find the cause, and suddenly after a few chunks were gone the pattern matched. I couldn’t see anything in what I had removed which could be causing an issue, so I assumed the string length itself was the issue, and this assumption proved correct.

As of PHP 5.2, a new ini setting was implemented called pcre.backtrack_limit. The documentation is very sparse for this setting, but it basically sets an upper limit on how many times the regular expression engine may backtrack while attempting a match. This affects things like non-greedy patterns, and I assume lookahead and lookbehind assertions (though I have not tested this). The default value for this setting is a meagre 100000. Strictly speaking this is a count of backtracking steps rather than a size in bytes, but a pattern like the one above backtracks roughly once per character, so in practice a block of more than about 100,000 characters (around 97KB) can fail to match. Prior to 5.2, this setting did not exist and longer strings would match without problem. The really annoying thing about all this is that the regex function will just fail silently, leaving you to start madly pulling your hair out while you try to see what could be preventing your pattern from matching. The failure is recorded (preg_last_error() will return PREG_BACKTRACK_LIMIT_ERROR), but only if you think to ask. A notice or warning error would have saved me a couple of hours!

The pcre.backtrack_limit setting can be altered either in your php.ini, or at runtime. I raised mine to 1048576 and have not had any issues.

ini_set('pcre.backtrack_limit', '1048576');
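
Raising the limit only moves the problem further away, so it can also be worth checking preg_last_error() after the call. Below is a minimal sketch of that idea. Both function names are hypothetical, and the wrapper simply turns any PCRE error into an exception instead of letting it pass unnoticed.

```php
<?php
// Hypothetical helper: returns the name of the last PCRE error, or null if
// the most recent preg_* call completed without one.
function pcre_last_error_name() {
    $names = array(
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
    );
    $code = preg_last_error();
    if ($code === PREG_NO_ERROR) {
        return null;
    }
    return isset($names[$code]) ? $names[$code] : 'PCRE error ' . $code;
}

// Hypothetical wrapper: behaves like preg_replace_callback(), but throws
// an exception rather than failing quietly.
function checked_replace_callback($pattern, $callback, $subject) {
    $result = preg_replace_callback($pattern, $callback, $subject);
    $error = pcre_last_error_name();
    if ($error !== null) {
        throw new RuntimeException('preg_replace_callback failed: ' . $error);
    }
    return $result;
}
```

With a wrapper like this, a match which trips the backtrack limit should surface as an exception naming PREG_BACKTRACK_LIMIT_ERROR rather than as a block which mysteriously goes unmatched.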

Find the System Temp Directory with PHP

Posted in PHP by Karl on

Sometimes it is useful to be able to save a file to the system’s temporary directory in a PHP application. Depending on the platform and its configuration, however, this directory can be in a variety of places. As of PHP 5.2.1, there is a native function, sys_get_temp_dir, which will do the job. For earlier versions, though, the following function will try to determine the temp directory for you and return its path as a string. Because it is only defined when the native function is missing, your code will switch to the native function automatically if you upgrade later.

if (!function_exists('sys_get_temp_dir')) {
    function sys_get_temp_dir() {
        // check environment variables.
        foreach (array('TMP', 'TEMP', 'TMPDIR') as $env_var) {
            if ($temp = getenv($env_var)) {
                return $temp;
            }
        }
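        // test for a temp directory by having PHP create a temporary file.
        // __FILE__ is not a directory, so tempnam() should fall back to
        // creating the file in the system temp directory instead.
        $temp = tempnam(__FILE__, '');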
        if (file_exists($temp)) {
            unlink($temp);
            return dirname($temp);
        }
        // couldn't find a temp directory.
        return null;
    }
}
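
As a quick usage sketch, the function pairs naturally with tempnam() for writing scratch files. The write_temp_file() helper below is hypothetical, and it works the same way whether sys_get_temp_dir() is the native function or the fallback above.

```php
<?php
// Hypothetical helper: writes $contents to a new, uniquely named file in the
// system temp directory and returns its path, or false on failure.
function write_temp_file($contents, $prefix = 'tmp') {
    $path = tempnam(sys_get_temp_dir(), $prefix);
    if ($path === false) {
        return false;
    }
    if (file_put_contents($path, $contents) === false) {
        // Do not leave an empty file behind if the write failed.
        unlink($path);
        return false;
    }
    return $path;
}
```

The caller is responsible for deleting the file with unlink() once it is finished with it.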

Tidy PHP: Fatal Error: Class ‘tidy’ Not Found

Posted in PHP by Karl on

I was playing around with Tidy on my development machine at work. However, even simple examples copied directly from the PHP manual were giving me errors such as:

Fatal error: Class 'tidy' not found in /var/www/dev.test.domain.org/tidy.php on line 149

I checked my PHP version, which is 5.2.6, and made sure I had the php5-tidy Ubuntu package installed. All was well on those fronts, yet I still had the problem.

I thought maybe the extension wasn’t loading, but running the following confirmed that it was indeed loaded:

<?php echo extension_loaded('tidy') ? "LOADED" : "NOT LOADED" ?>

I tested a bit further using the procedural syntax. More oddness ensued. I used the following code:

<?php ob_start() ?>

<html>
  <head>
   <title>test</title>
  </head>
  <body>
   <p>error<br>another ĨńtêrʼnåtȉΌnժlizǽtioǸ line</i>
  </body>
</html>

<?php

$buffer = ob_get_clean();

$tidy_config = array(
    'clean' => true,
    'output-xhtml' => true,
    'wrap' => 200,
);

$tidy = tidy_parse_string($buffer, $tidy_config, 'UTF8');
$tidy->cleanRepair();
echo $tidy;

?>

Which resulted in the following error:

Warning: tidy_parse_string() expects exactly 1 parameter, 3 given in /var/www/dev.test.domain.org/tidy.php on line 147

This made me suspicious. The fact that the tidy_parse_string function existed but did not expect the documented number of parameters made me suspect I had somehow got the wrong version installed. Sure enough, on further inspection I found I had at some point in the past installed the PECL version of tidy, which is 1.2. This was obviously taking precedence over the version installed via the php5-tidy package.
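
A mismatch like this can be spotted more quickly by asking PHP what the loaded extension actually provides. The sketch below uses the Reflection API for that. The describe_extension() function is a hypothetical helper, and you would pass it 'tidy' to investigate the problem described here.

```php
<?php
// Hypothetical helper: returns the version, class names and functions (with
// the number of parameters each one accepts) of a loaded extension, or null
// if the extension is not loaded.
function describe_extension($name) {
    if (!extension_loaded($name)) {
        return null;
    }
    $extension = new ReflectionExtension($name);
    $functions = array();
    foreach ($extension->getFunctions() as $functionName => $function) {
        $functions[$functionName] = $function->getNumberOfParameters();
    }
    return array(
        'version'   => $extension->getVersion(),
        'classes'   => $extension->getClassNames(),
        'functions' => $functions,
    );
}
```

An unexpected version number, an empty class list, or a function accepting fewer parameters than the manual documents would all point to an old or conflicting build of the extension.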

So the fix was pretty simple:

Remove the PECL version of Tidy:

# pecl uninstall tidy

For good measure I uninstalled the php5-tidy package too, but this is probably unnecessary:

# apt-get remove php5-tidy

Restart Apache (again probably not necessary at this point but I did it anyway):

# apache2ctl restart

Reinstall the php5-tidy package:

# apt-get install php5-tidy

Restart Apache:

# apache2ctl restart

After this everything worked as expected.

Replace HTML Special Characters With Entities – But Without Touching Tags

Posted in PHP by Karl on

I came across a problem during the development of a CMS at work where I had to take a string of HTML source code and make sure all special HTML characters were replaced with their entities. For example, & (ampersand) should become &amp;.

PHP has a couple of useful functions for this sort of thing, namely htmlentities and htmlspecialchars. However, running my string through either of these was no good to me because doing so would convert the characters used in the HTML tags too. For example, the following:

<p class="foo">This is a paragraph & that ampersand needs fixing</p>

Would become:

&lt;p class="foo"&gt;This is a paragraph &amp; that ampersand needs fixing&lt;/p&gt;

The ampersand is converted nicely, but now the HTML is useless. The first thought that struck me was to parse the string using PHP’s XML parser in order to get at the character data directly, but of course that idea didn’t last long since the very characters I was trying to fix would have broken the parser.

In the end I settled on using a regular expression to match content in between tags, but leave the tags themselves alone. I also added some functionality to leave anything between <? and ?> tags alone so I could pass through HTML with embedded PHP and not have it break.
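
The second part relies on preg_split with the PREG_SPLIT_DELIM_CAPTURE flag, which returns the captured delimiters (here, the PHP blocks) alongside the text between them. A small sketch of just that step, using a hypothetical split_php_blocks() helper which labels each part:

```php
<?php
// Hypothetical helper: splits markup into alternating chunks of plain HTML
// and embedded PHP blocks, labelling each chunk so they can be told apart.
function split_php_blocks($html) {
    $parts = preg_split('/(<\?.*?\?>)/us', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        // preg_split() failed, for example because of invalid UTF-8 input.
        return array();
    }
    $labelled = array();
    foreach ($parts as $index => $part) {
        // With a single capture group, the captured delimiters always land
        // at the odd indexes of the result.
        $labelled[] = array($index % 2 === 1 ? 'php' : 'html', $part);
    }
    return $labelled;
}

$chunks = split_php_blocks('<p>A & B <?php echo "x" ?> C</p>');
// $chunks[0] is array('html', '<p>A & B ')
// $chunks[1] is array('php', '<?php echo "x" ?>')
// $chunks[2] is array('html', ' C</p>')
```

Only the chunks labelled html then need to go through the entity replacement.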

Here is the function. It is coded to work with UTF-8, hence the multibyte functions and the /u modifier on the regex, but if you are working with a single byte character set you can just swap this out accordingly.

<?php
function clean_entities($string) {
   
    $string = htmlspecialchars_decode($string);
   
    $parts = preg_split('/(<\?.*?\?>)/us', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
   
    $string = '';
   
    foreach ($parts as $part) {
        if (false === mb_strpos(trim($part), '<?')) {
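            $string .= preg_replace_callback(
                '/(?<=\>)((?![<](\?|\/)*[a-z][^>]*[>]).)+/ius',
                // An anonymous function (PHP 5.3+) is used here because
                // create_function() was removed in PHP 8. The flags and
                // charset are passed explicitly since the defaults have
                // changed between PHP versions.
                function ($matches) {
                    return htmlspecialchars($matches[0], ENT_COMPAT, 'UTF-8');
                },
                $part
            );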
        } else {
            $string .= $part;
        }
    }
   
    return $string;
   
}
?>

This results in nice valid entities, but the tags and any embedded PHP are left alone:

<p class="foo">This is a paragraph &amp; that ampersand <?php echo "has been" ?> fixed!</p>