Latest Posts

Zend Framework JSON-RPC Client Released

Posted in PHP, Zend Framework by Karl on

I recently wrote a JSON-RPC client for Zend Framework.

The code is designed to mirror the API of Zend’s own XML-RPC client as closely as possible, so if you already know how to use that, this should be simple to plug in and start using.

Currently HTTP is supported, though I may also add support for TCP in a future version.

You can grab the code from my GitHub.

Convert Numbers to Words with PHP

Posted in PHP by Karl on

I was recently asked the best way to convert a number to its textual form using PHP. For example, how to convert the number 123 to the string “one hundred and twenty-three”.

There is no built-in PHP function to handle this, and the examples I found online tended to be too limited in one way or another, so I decided to write my own.

This function can accept numbers up to the system’s int size (2147483647 on 32-bit systems and 9223372036854775807 on 64-bit systems), both positive and negative, and can also handle real numbers (albeit with the inherent rounding inaccuracies involved in storing base 10 real numbers using base 2).

function convert_number_to_words($number) {
   
    $hyphen      = '-';
    $conjunction = ' and ';
    $separator   = ', ';
    $negative    = 'negative ';
    $decimal     = ' point ';
    $dictionary  = array(
        0                   => 'zero',
        1                   => 'one',
        2                   => 'two',
        3                   => 'three',
        4                   => 'four',
        5                   => 'five',
        6                   => 'six',
        7                   => 'seven',
        8                   => 'eight',
        9                   => 'nine',
        10                  => 'ten',
        11                  => 'eleven',
        12                  => 'twelve',
        13                  => 'thirteen',
        14                  => 'fourteen',
        15                  => 'fifteen',
        16                  => 'sixteen',
        17                  => 'seventeen',
        18                  => 'eighteen',
        19                  => 'nineteen',
        20                  => 'twenty',
        30                  => 'thirty',
        40                  => 'forty',
        50                  => 'fifty',
        60                  => 'sixty',
        70                  => 'seventy',
        80                  => 'eighty',
        90                  => 'ninety',
        100                 => 'hundred',
        1000                => 'thousand',
        1000000             => 'million',
        1000000000          => 'billion',
        1000000000000       => 'trillion',
        1000000000000000    => 'quadrillion',
        1000000000000000000 => 'quintillion'
    );
   
    if (!is_numeric($number)) {
        return false;
    }
   
    if (($number >= 0 && (int) $number < 0) || (int) $number < 0 - PHP_INT_MAX) {
        // overflow
        trigger_error(
            'convert_number_to_words only accepts numbers between -' . PHP_INT_MAX . ' and ' . PHP_INT_MAX,
            E_USER_WARNING
        );
        return false;
    }

    if ($number < 0) {
        return $negative . convert_number_to_words(abs($number));
    }
   
    $string = $fraction = null;
   
    if (strpos($number, '.') !== false) {
        list($number, $fraction) = explode('.', $number);
    }
   
    switch (true) {
        case $number < 21:
            $string = $dictionary[$number];
            break;
        case $number < 100:
            $tens   = ((int) ($number / 10)) * 10;
            $units  = $number % 10;
            $string = $dictionary[$tens];
            if ($units) {
                $string .= $hyphen . $dictionary[$units];
            }
            break;
        case $number < 1000:
            // cast to int so the value is a valid dictionary key
            $hundreds  = (int) ($number / 100);
            $remainder = $number % 100;
            $string = $dictionary[$hundreds] . ' ' . $dictionary[100];
            if ($remainder) {
                $string .= $conjunction . convert_number_to_words($remainder);
            }
            break;
        default:
            $baseUnit = pow(1000, floor(log($number, 1000)));
            $numBaseUnits = (int) ($number / $baseUnit);
            $remainder = $number % $baseUnit;
            $string = convert_number_to_words($numBaseUnits) . ' ' . $dictionary[$baseUnit];
            if ($remainder) {
                $string .= $remainder < 100 ? $conjunction : $separator;
                $string .= convert_number_to_words($remainder);
            }
            break;
    }
   
    if (null !== $fraction && is_numeric($fraction)) {
        $string .= $decimal;
        $words = array();
        foreach (str_split((string) $fraction) as $number) {
            $words[] = $dictionary[$number];
        }
        $string .= implode(' ', $words);
    }
   
    return $string;
}

Example usage:

echo convert_number_to_words(123456789);
// one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine

echo convert_number_to_words(123456789.123);
// one hundred and twenty-three million, four hundred and fifty-six thousand, seven hundred and eighty-nine point one two three

echo convert_number_to_words(-1922685.477);
// negative one million, nine hundred and twenty-two thousand, six hundred and eighty-five point four seven seven

// float rounding can be avoided by passing the number as a string
echo convert_number_to_words(123456789123.12345); // rounds the fractional part
// one hundred and twenty-three billion, four hundred and fifty-six million, seven hundred and eighty-nine thousand, one hundred and twenty-three point one two
echo convert_number_to_words('123456789123.12345'); // does not round
// one hundred and twenty-three billion, four hundred and fifty-six million, seven hundred and eighty-nine thousand, one hundred and twenty-three point one two three four five
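
The rounding in the last example comes from PHP's float-to-string conversion rather than from the function itself. Here is a minimal sketch of the effect, assuming the `precision` ini setting is 14 (a common default; it is set explicitly here so the result does not depend on local configuration):

```php
<?php
// Assumption: 14 significant digits, a common default for the "precision" setting.
ini_set('precision', '14');

$asFloat  = 123456789123.12345;   // stored as a binary double
$asString = '123456789123.12345'; // digits preserved exactly as typed

// Casting the float to a string keeps only 14 significant digits,
// so the fractional part is cut down to two digits.
$floatText = (string) $asFloat;

// Split each value on the decimal point, as convert_number_to_words() does.
list(, $floatFraction)  = explode('.', $floatText);
list(, $stringFraction) = explode('.', $asString);

echo $floatFraction, "\n";  // 12
echo $stringFraction, "\n"; // 12345
```

Passing the number as a string skips the float conversion entirely, which is why every digit after the decimal point survives.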

XBMC PHP JSON-RPC Library Released: xbmc-php-rpc

Posted in PHP by Karl on

I’ve been playing around with XBMC quite a bit recently, and I decided to write a JSON-RPC library in PHP for interacting with an XBMC instance from another system.

XBMC supports JSON-RPC via HTTP and TCP, and in turn xbmc-php-rpc supports both mechanisms.

I uploaded the initial release today. You can get a copy via my GitHub.

Full documentation is forthcoming, but for now the README contains instructions for getting started.

GTK Apache Log File Parser Written in Python

Posted in Python by Karl on

I needed to parse sets of large Apache log files, with the ability to filter results by HTTP status code as well as by URL regular expressions. Being primarily a PHP programmer, my first instinct was to build this as a web-based application, but I soon realised that PHP would be far too slow at parsing large (multiple GB) files. Instead I decided to write the program as a traditional desktop application using Python and GTK, utilising threads to keep it speedy and responsive. If you have any use for the program, you can download it here.

It should run on any OS, assuming you have Python and the GTK libraries installed. There’s no need to install anything; simply unzip and run parserGUI.py.

CakePHP Valid XHTML/XML Behavior

Posted in CakePHP, PHP by Karl on

If you have ever written a CMS-type application where you accept input from users to be stored as valid XHTML, you will probably have come up against some problems!

Generally this task is accomplished by using a JavaScript WYSIWYG real-time editor on the client side in order to keep things simple for content editors, and the resulting markup is stored on the server. Often though, content editors tend to work in Microsoft Word and paste their content into the JavaScript editor. That’s where the fun begins! Windows uses its own character set (thanks Microsoft!) known as code page 1252 which, whilst being mostly compatible with the much more common latin-1 character set, is not something you generally want to use on the web – UTF-8 is a much more sensible way to go. If the content is to be stored in a database, you also need to ensure it matches the character set used by your table.

Aside from Microsoft-induced headaches, you often have little control over the markup itself. Even the best JavaScript editors don’t get everything 100% correct all the time, and as well as technological issues there is also potential for human error (inserting unencoded HTML entities, for example).

All in all then, you can’t really trust the markup you receive to be valid UTF-8 encoded XHTML. I found myself in this position during the development of a CMS using CakePHP, so I decided to write a Model Behaviour which can be used to clean up strings of markup to ensure they are valid and properly encoded. It ensures that the content is free of code page 1252 characters by converting them to UTF-8, replaces any unencoded HTML entities with their properly encoded equivalents (e.g. & -> &amp;), fixes any invalid XHTML, and cleans and tidies the source code nicely.
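
As a rough illustration of the code page 1252 clean-up step, the sketch below maps a few stray characters to their proper UTF-8 equivalents with `strtr()`. The function name `fix_cp1252()` and its three-entry map are made up for this example; the behaviour itself uses a fuller map.

```php
<?php
// Hypothetical helper for illustration only, not part of the behaviour.
// Keys are the UTF-8 encodings of control code points (U+0093, U+0094, U+0085)
// that show up when code page 1252 text is mistaken for latin-1; values are
// the UTF-8 encodings of the characters the editor actually typed.
function fix_cp1252($string) {
    $map = array(
        "\xc2\x93" => "\xe2\x80\x9c", // left double quotation mark
        "\xc2\x94" => "\xe2\x80\x9d", // right double quotation mark
        "\xc2\x85" => "\xe2\x80\xa6", // horizontal ellipsis
    );
    return strtr($string, $map);
}

// Simulated paste from Word: curly quotes and an ellipsis arrive as C1 controls.
$pasted = "\xc2\x93Hello\xc2\x94 world\xc2\x85";
$fixed  = fix_cp1252($pasted);
echo $fixed, "\n";
```

Text that contains none of the mapped byte sequences passes through unchanged, so the step is safe to run on content that is already clean.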

The behaviour’s configuration is pretty simple. You just need to specify which fields should be automatically processed before they are saved:

public $actsAs = array(
    'ValidXhtml' => array(
        'fields' => array('content')
    )
);

You can optionally specify whether to tidy the markup for each field using the PHP Tidy extension (default is true, so you only need to specify this to disable tidy):

public $actsAs = array(
    'ValidXhtml' => array(
        'fields' => array(
            'content' => array('tidy' => false)
        )
    )
);

Obviously you need to have the tidy extension available to use that feature, but the Behaviour checks for the extension and will automatically configure itself accordingly, so there is no need to explicitly disable tidy if you don’t have it installed.

<?php

class ValidXhtmlBehavior extends ModelBehavior {
   
    private $_defaults = array(
        'fields' => array()
    );
   
    private $_preProcessMap = array(
        // replace empty div tags with a div containing &nbsp;
        '/<div>\s*<\/div>/i' => '<div>&nbsp;</div>'
    );
   
    // Map of windows 1252 character points to utf-8 character points
    private $_cp1252Map = array(
        "\xc2\x80" => "\xe2\x82\xac", /* EURO SIGN */
        "\xc2\x82" => "\xe2\x80\x9a", /* SINGLE LOW-9 QUOTATION MARK */
        "\xc2\x83" => "\xc6\x92",     /* LATIN SMALL LETTER F WITH HOOK */
        "\xc2\x84" => "\xe2\x80\x9e", /* DOUBLE LOW-9 QUOTATION MARK */
        "\xc2\x85" => "\xe2\x80\xa6", /* HORIZONTAL ELLIPSIS */
        "\xc2\x86" => "\xe2\x80\xa0", /* DAGGER */
        "\xc2\x87" => "\xe2\x80\xa1", /* DOUBLE DAGGER */
        "\xc2\x88" => "\xcb\x86",     /* MODIFIER LETTER CIRCUMFLEX ACCENT */
        "\xc2\x89" => "\xe2\x80\xb0", /* PER MILLE SIGN */
        "\xc2\x8a" => "\xc5\xa0",     /* LATIN CAPITAL LETTER S WITH CARON */
        "\xc2\x8b" => "\xe2\x80\xb9", /* SINGLE LEFT-POINTING ANGLE QUOTATION */
        "\xc2\x8c" => "\xc5\x92",     /* LATIN CAPITAL LIGATURE OE */
        "\xc2\x8e" => "\xc5\xbd",     /* LATIN CAPITAL LETTER Z WITH CARON */
        "\xc2\x91" => "\xe2\x80\x98", /* LEFT SINGLE QUOTATION MARK */
        "\xc2\x92" => "\xe2\x80\x99", /* RIGHT SINGLE QUOTATION MARK */
        "\xc2\x93" => "\xe2\x80\x9c", /* LEFT DOUBLE QUOTATION MARK */
        "\xc2\x94" => "\xe2\x80\x9d", /* RIGHT DOUBLE QUOTATION MARK */
        "\xc2\x95" => "\xe2\x80\xa2", /* BULLET */
        "\xc2\x96" => "\xe2\x80\x93", /* EN DASH */
        "\xc2\x97" => "\xe2\x80\x94", /* EM DASH */
        "\xc2\x98" => "\xcb\x9c",     /* SMALL TILDE */
        "\xc2\x99" => "\xe2\x84\xa2", /* TRADE MARK SIGN */
        "\xc2\x9a" => "\xc5\xa1",     /* LATIN SMALL LETTER S WITH CARON */
        "\xc2\x9b" => "\xe2\x80\xba", /* SINGLE RIGHT-POINTING ANGLE QUOTATION*/
        "\xc2\x9c" => "\xc5\x93",     /* LATIN SMALL LIGATURE OE */
        "\xc2\x9e" => "\xc5\xbe",     /* LATIN SMALL LETTER Z WITH CARON */
        "\xc2\x9f" => "\xc5\xb8"      /* LATIN CAPITAL LETTER Y WITH DIAERESIS*/
    );
   
    // Map of utf-8 character points to special html entities
    private $_entMap = array(
        "\xe2\x80\x98" => '&lsquo;',
        "\xe2\x80\x99" => '&rsquo;',
        "\xe2\x80\x9c" => '&ldquo;',
        "\xe2\x80\x9d" => '&rdquo;',
        "\xe2\x82\xac" => '&euro;',
        "\xe2\x80\xa6" => '&hellip;'
    );
   
    /*
     For reference, these are other entity replacement codes which might be useful one day
    array(
        "\xe2\x80\x9a" => '&sbquo;',    // Single Low-9 Quotation Mark
        "\xe2\x82\xac" => '&euro;',     // Euro sign
        "\xc6\x92"     => '&fnof;',     // Latin Small Letter F With Hook
        "\xe2\x80\x9e" => '&bdquo;',    // Double Low-9 Quotation Mark
        "\xe2\x80\xa6" => '&hellip;',   // Horizontal Ellipsis
        "\xe2\x80\xa0" => '&dagger;',   // Dagger
        "\xe2\x80\xa1" => '&Dagger;',   // Double Dagger
        "\xcb\x86"     => '&circ;',     // Modifier Letter Circumflex Accent
        "\xe2\x80\xb0" => '&permil;',   // Per Mille Sign
        "\xc5\xa0"     => '&Scaron;',   // Latin Capital Letter S With Caron
        "\xe2\x80\xb9" => '&lsaquo;',   // Single Left-Pointing Angle Quotation Mark
        "\xc5\x92"     => '&OElig;',    // Latin Capital Ligature OE
        "\xe2\x80\x98" => '&lsquo;',    // Left Single Quotation Mark
        "\xe2\x80\x99" => '&rsquo;',    // Right Single Quotation Mark
        "\xe2\x80\x9c" => '&ldquo;',    // Left Double Quotation Mark
        "\xe2\x80\x9d" => '&rdquo;',    // Right Double Quotation Mark
        "\xe2\x80\xa2" => '&bull;',     // Bullet
        "\xe2\x80\x93" => '&ndash;',    // En Dash
        "\xe2\x80\x94" => '&mdash;',    // Em Dash
        "\xcb\x9c"     => '&tilde;',    // Small Tilde
        "\xe2\x84\xa2" => '&trade;',    // Trade Mark Sign
        "\xc5\xa1"     => '&scaron;',   // Latin Small Letter S With Caron
        "\xe2\x80\xba" => '&rsaquo;',   // Single Right-Pointing Angle Quotation Mark
        "\xc5\x93"     => '&oelig;',    // Latin Small Ligature OE
        "\xc5\xb8"     => '&Yuml;',     // Latin Capital Letter Y With Diaeresis
    );
    */

   
    public function setup($model, $config = array()) {
        $this->settings[$model->alias] = array_merge($this->_defaults, (array) $config);
    }
   
    public function beforeSave($model) {
        if (!empty($this->settings[$model->alias]['fields'])) {
            foreach ($this->settings[$model->alias]['fields'] as $key => $value) {
                if (is_array($value)) {
                    $options = $value;
                    $field = $key;
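                } else {
                    // no per-field options: reset so an earlier field's settings do not carry over
                    $options = array();
                    $field = $value;
                }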
                $options['tidy'] = isset($options['tidy']) ? $options['tidy'] : true;
                if (isset($model->data[$model->alias][$field])) {
                    $model->data[$model->alias][$field] =
                        $this->makeValid($model->data[$model->alias][$field], $options['tidy']);
                }
            }
        }
        return true;
    }
   
    public function makeValid($string, $tidy = true) {
       
        $string = trim($string);
       
        // apply the pre-process map
        $string = preg_replace(array_keys($this->_preProcessMap), $this->_preProcessMap, $string);
       
        // apply the windows > utf8 map
        $string = str_replace(array_keys($this->_cp1252Map), $this->_cp1252Map, $string);
       
        // get rid of any existing html entities to avoid double encoding
        $string = html_entity_decode($string, ENT_QUOTES, 'UTF-8');
       
        // break out any PHP sections since they should not be touched
        $parts = preg_split('/(<\?.+?\?>)/us', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
       
        // replace &, ", ', < and > with their entities, but only where they are not
        // part of an html tag or a comment
        $string = '';
        foreach ($parts as $part) {
            if (false === mb_strpos(trim($part), '<?')) {
                $string .= preg_replace_callback(
                    '/(?<=\>)((?![<](\?|\/)*[a-z][^>]*[>])[^<])+/ius',
                    // a closure replaces create_function(), which was removed in PHP 8
                    function ($matches) {
                        return htmlspecialchars($matches[0]);
                    },
                    $part
                );
            } else {
                $string .= $part;
            }
        }
       
        // apply the utf-8 > entities map
        $string = str_replace(array_keys($this->_entMap), $this->_entMap, $string);
       
        // trim whitespace from the end of each line and add a nice \n
        // tinymce in particular seems to have a bug where it will insert spaces
        // at the end of lines - this can cause problems with things like Revision
        // Behavior as the values of some fields will never be the same so a revision
        // is always saved even if the data itself has not changed.
        $parts = preg_split("/[\r\n]+/u", $string);
        foreach ($parts as &$part) {
            $part = rtrim($part);
        }
        $string = implode("\n", $parts);
       
        // tidy the output
        if ($tidy && extension_loaded('tidy')) {
            $tidy_config = array(
                'output-xhtml' => true,
                'show-body-only' => true,
                'indent' => true,
                'indent-spaces' => 4,
                'sort-attributes' => 'alpha',
                'wrap' => 80,
                'preserve-entities' => true,
                'join-styles' => false,
                'logical-emphasis' => true,
                'enclose-text' => true
            );
            $doc = tidy_parse_string($string, $tidy_config, 'UTF8');
            $doc->cleanRepair();
            // return the repaired markup as a string rather than the tidy object
            $string = tidy_get_output($doc);
        }
       
        return $string;
       
    }
   
}

?>

CakePHP Traceable Model Behavior

Posted in CakePHP, PHP by Karl on

Sometimes it is useful to be able to associate an authenticated user with an action carried out on a particular database table entry. For example, if you have a collection of recipes in a database which can be soft-deleted (i.e., marked as deleted but not actually removed from the database in order to allow for undo functionality), you might want to keep track of which user deleted a particular recipe. This is relatively easy to do in CakePHP; you can simply use association aliases to link a particular User to the deletion action by adding a corresponding foreign key to the Recipe table and defining the association accordingly:

public $belongsTo = array(
    'DeletedBy' => array(
        'className' => 'User',
        'foreignKey' => 'deleted_by'
    )
);

This is all well and good, but you have to manage the foreign keys for each of these associations somehow, which would typically involve either adding a hidden form field to the appropriate view containing the currently authenticated user’s id, or using the beforeSave model callback to add the user’s id into the data set. This can soon get cumbersome if you have many models which you need to trace in this way. To solve this issue I wrote a model behaviour which automates the process of associating the user performing an action with the model.

The behaviour works via a system of trigger and target fields. When the value of a trigger field is set, the currently authenticated user is automatically associated with the model using an alias which corresponds to the action being performed. If the value of a trigger field should be ‘unset’ (more on what constitutes unset in a minute), the association will be empty.

To clarify with an example, if you have a TINYINT(1) field named ‘deleted’ which indicates the soft-deletion state of a recipe (1 = deleted, 0 = not deleted), the behaviour will automatically provide an association with the user who performed the deletion, using an alias ‘DeletedBy’. The result of a find call to the Recipe model where the recipe’s deleted field = 1 would then be as follows:

Array (
    'Recipe' => Array (
        'id' => 1
        // etc
    ),
    'DeletedBy' => Array (
        'id' => 1,
        'username' => 'Karl'
        // etc
    )
)

For recipes whose deleted field = 0, the result would be:

Array (
    'Recipe' => Array (
        'id' => 1
        // etc
    ),
    'DeletedBy' => Array ()
)

Note that the association is still present, but it contains no data, because no user deleted the recipe.

When determining if the value of a trigger field such as ‘deleted’ is set or unset, the behaviour uses two mechanisms. The first is simply to test if the value is empty according to PHP’s empty() function – if it is, the value is considered unset, otherwise it is considered set. The second mechanism is to check the value against a list of values which would not be considered empty according to PHP’s empty() function, but which you want to treat as unset anyway. An example of this would be an empty datetime string (“0000-00-00 00:00:00”); if it represents the date a recipe was deleted, for example, it should be considered unset. By default the behaviour is set up to treat empty datetime strings as unset, but other arbitrary values can be added to this.
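
The two checks can be sketched in a few lines of plain PHP. The helper name `is_trigger_unset()` and its signature are assumptions made for this example, and it uses a strict `in_array()` comparison, which is a choice made here rather than something the behaviour necessarily does.

```php
<?php
// Hypothetical helper; the name and signature are assumptions for this sketch.
function is_trigger_unset($value, array $alsoUnset = array('0000-00-00 00:00:00')) {
    // Mechanism 1: anything PHP's empty() treats as empty counts as unset.
    if (empty($value)) {
        return true;
    }
    // Mechanism 2: extra values that are not empty() but should count as unset.
    // Strict comparison is used here to avoid PHP's loose type juggling.
    return in_array($value, $alsoUnset, true);
}

var_dump(is_trigger_unset(0));                     // bool(true)
var_dump(is_trigger_unset('0000-00-00 00:00:00')); // bool(true)
var_dump(is_trigger_unset('2011-03-01 09:30:00')); // bool(false)
var_dump(is_trigger_unset(1));                     // bool(false)
```

Further values that should count as unset can be passed in the second argument.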

When a trigger field is determined to be unset, the target field’s value is set to 0 by default. This can be configured on a per-model basis.

The behaviour can handle any number of associations simultaneously – you might want to keep track of who created or modified a particular row, for example. There really is no limit: as long as you define a trigger field and a corresponding target field, the behaviour will automatically handle the associations.

By default the behaviour will use the trigger fields ‘created’, ‘modified’, ‘deleted’, ‘hidden’ and ‘locked’, using the target fields ‘created_by’, ‘modified_by’, ‘deleted_by’, ‘hidden_by’ and ‘locked_by’ (though these defaults can be easily configured). Extra trigger and target fields can be set on a per-model basis by defining them when adding the behaviour to the model (note that you will also need to re-include any default fields which you may need):

public $actsAs = array(
    'Traceable' => array(
        'fields' => array(
            'created' => 'created_by',
            'modified' => 'modified_by',
            'published' => 'published_by'
        )
    )
);

There are a few other settings which can be defined here too:

public $actsAs = array(
    'Traceable' => array(
        // The name of the user model used for Authentication. Defaults to 'User'
        'user_model' => 'User',
        // An array of trigger => target fields to trace for this model.
        'fields' => array(
            'created' => 'created_by',
            'modified' => 'modified_by',
            'published' => 'published_by'
        ),
        // The value to which target fields will be set when the corresponding trigger field is unset.
        // Defaults to 0.
        'restored_value' => 0
    )
);
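
To illustrate how a target field name such as deleted_by lines up with an association alias such as DeletedBy, here is a small sketch in plain PHP. The helper `target_field_to_alias()` is hypothetical and written so that it runs without CakePHP; the behaviour may derive its aliases differently.

```php
<?php
// Hypothetical helper, written in plain PHP so it runs without CakePHP.
function target_field_to_alias($field) {
    // 'deleted_by' -> 'deleted by' -> 'Deleted By' -> 'DeletedBy'
    return str_replace(' ', '', ucwords(str_replace('_', ' ', $field)));
}

// trigger => target pairs, in the same shape as the 'fields' setting
$fields = array(
    'created'   => 'created_by',
    'published' => 'published_by',
);

// Build a target field => alias map for every configured pair.
$aliases = array();
foreach ($fields as $trigger => $target) {
    $aliases[$target] = target_field_to_alias($target);
}

print_r($aliases);
```

With the two pairs above, the map pairs created_by with CreatedBy and published_by with PublishedBy.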

Enough explanation, here is the code.

<?php
/**
 *
 * Automagically adds the logged in user's id to the specified fields when certain
 * events occur within the model.
 *
 * NB: If this behavior is not working, be sure you are not using saveField to save the
 * trigger field. If you do, no other fields (including the target field) will be saved.
 * Instead use a standard model->save call. This is a cake issue so nothing can be
 * done without altering the core.
 *
 * This works on a system of trigger fields and target fields. When a 'trigger' field
 * is saved, it updates a corresponding 'target' field. For example, if you have
 * a trigger field called 'deleted' with a corresponding target field of 'deleted_by',
 * when deleted is saved with an empty value (anything which causes empty() to return true)
 * then the deleted_by field is set to a specified value (by default, 0). When deleted
 * is saved with a non-empty value, deleted_by is updated with the id of the currently
 * Authed user.
 *
 * When using find queries on a model which implements this behaviour, a
 * user model will be generated for each target field and stored under a corresponding
 * name in the data array (assuming the recursive level permits of course).
 * For example, a User model for the user identified in the created_by field will
 * be stored under the CreatedBy key in the returned data array:
 *
 * Array (
 *     'Webpages' => Array (
 *         'id' => 1
 *         // etc
 *     ),
 *     'CreatedBy' => Array (
 *         'id' => 1,
 *         'username' => 'Karl'
 *         // etc
 *     )
 * )
 *
 * In the event that no user model can be associated with the target field (for
 * example, if the target field contains 0 or NULL), the target field data will
 * be set to an empty array:
 *
 * Array (
 *     'Webpages' => Array (
 *         'id' => 1
 *         // etc
 *     ),
 *     'CreatedBy' => Array ()
 * )
 *
 * @author Karl Rixon <karl@karlrixon.co.uk>
 * @version 1.0
 *
 **/

class TraceableBehavior extends ModelBehavior {

    /**
     * @var mixed An array of default configuration options
     * @access private
     */

    private $_defaults = array(
        // The name of the user model.
        'user_model' => 'User',
        // An array of fields, with a trigger field as key and a target field as value.
        // When the trigger field is set to a non empty() value, the user's id is
        // inserted into the corresponding target field. When it is set to an empty()
        // value, the target field is reset to restored_value.
        'fields'  => array(
            'created' => 'created_by',
            'modified' => 'modified_by',
            'deleted' => 'deleted_by',
            'hidden' => 'hidden_by',
            'locked' => 'locked_by'
        ),
        // The value to which target fields will be set when toggled back.
        'restored_value' => 0,
        // If true, a user model representing each target field will be automatically bound
        // to the model which implements this behaviour (using a belongsTo association).
        'auto_bind' => true,
        'map' => array()
    );
   
    /**
     * @var int Stores the id of the currently Authed user.
     * @access private
     */

    private $_userId = 0;
   
    /**
     * @var mixed An array of values which should indicate that a field has been emptied. There
     * is no need to specify the values which evaluate true using PHP's empty() function;
     * this is intended for custom values which would not normally be considered empty().
     * @access private
     */

    private $_empty = array(
        '0000-00-00 00:00:00' // empty datetime
    );
   
    private $_noTrace = false;

    /**
     * Initialises the behaviour.
     *
     * @param Model $model A reference to the model object to which this behaviour is attached.
     * @param mixed $config An array of behaviour configuration options to be merged with the defaults.
     * @return void
     * @author Karl Rixon <karlrixon@gmail.com>
     * @version 1.0
     * @since 1.0
     */

    function setup($model, $config = array()) {
       
        if (!$model->useTable) {
            // Model is not tied to a database table.
            return;
        }
       
        $this->settings[$model->alias] = array_merge($this->_defaults, (array) $config);

        if (empty($this->_userId)) {
            $this->_userId = $this->_getAuthedUserId($model);
        }
        $this->settings[$model->alias]['map'] = $this->_buildFieldMap($model);
       
        if ($this->settings[$model->alias]['auto_bind']) {
            $this->_bindModels($model);
        }
       
    }
   
    /**
     * Called before model data is saved.
     *
     * This method is just used as a place to attach a call to the _trace method which
     * does all the hard work of managing the trigger/target fields.
     *
     * @param Model $model A reference to the model object to which this behaviour is attached.
     * @return boolean Always returns true to allow the save to continue as normal.
     * @author Karl Rixon <karlrixon@gmail.com>
     * @version 1.0
     * @since 1.0
     */

    function beforeSave($model) {
        if (!empty($this->settings[$model->alias]['map'])) {
            $this->_trace($model);
        }
        return true;
    }
   
    /**
     * Cleans up the found data, changing empty association models to empty arrays.
     *
     * Without this, any empty association model data will contain all of the keys
     * from the User model's table. It seems neater to return an empty array for any
     * items which do not have a matching user model.
     *
     * @param Model $model A reference to the model object to which this behaviour is attached.
     * @param mixed $results An array containing the data returned from the find.
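     * @param boolean $primary Whether this model is being queried directly (as opposed to as an association).
     * @return mixed The results array, with empty association data replaced by empty arrays.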
     * @author Karl Rixon <karlrixon@gmail.com>
     * @version 1.0
     * @since 1.0
    */

    function afterFind($model, $results, $primary) {
       
        if (empty($this->settings[$model->alias]['map'])) {
            return $results;
        }
       
        if ($model->recursive == -1) {
            return $results;
        }
       
        if (!empty($results) && $primary) {
            foreach ($results as &$result) {
                if (!isset($result[$model->alias])) {
                    continue;
                }
                foreach ($this->settings[$model->alias]['map'] as $associationName) {
                    if (empty($result[$associationName]['id'])) {
                        $result[$associationName] = array();
                    }
                }
            }
        }
       
        return $results;
       
    }
   
    /**
     * Checks for any triggered fields, and sets the corresponding target field accordingly.
     *
     * If a triggered field is found in the data, and its value is empty (anything
     * which evaluates to true using PHP's empty() function), its target field is
     * set to the restored_value. If it is not empty, its target field is set to the
     * id of the currently Authed user.
     *
     * @param Model $model A reference to the model object to which this behaviour is attached.
     * @return bool True if successful, false if an error occurred. Note that true will be
     *  returned even if no trigger fields were found. False will only be returned if
     *  an actual error occurred - the lack of trigger fields is not considered an error.
     * @author Karl Rixon <karlrixon@gmail.com>
     * @version 1.0
     * @since 1.0
     */

    private function _trace(&$model) {
       
        if (!isset($this->settings[$model->alias]['fields'])) {
            return false;
        } elseif (!$this->_userId) {
            return false;
        }
       
        foreach ($this->settings[$model->alias]['fields'] as $trigger => $target) {
            if (isset($model->data[$model->alias][$trigger])) {
                if (empty($model->data[$model->alias][$trigger]) || in_array($model->data[$model->alias][$trigger], $this->_empty)) {
                    $model->data[$model->alias][$target] = $this->settings[$model->alias]['restored_value'];
                } else {
                    $model->data[$model->alias][$target] = $this->_userId;
                }
            }
        }
       
        return true;
       
    }

    /**
     * Gets the id of the currently Authed user.
     */

    private function _getAuthedUserId($model) {
        App::import('Component', 'Session');
        $session = new SessionComponent();
        return $session->read('Auth.' . $this->settings[$model->alias]['user_model'] . '.id');
    }

    /**
     * Builds an array with field names as keys, and camelized versions of
     * the field names as values.
     */

    private function _buildFieldMap($model) {
       
        // Get an array of all fields which exist in both the model's table, and in
        // the behaviour settings for this model.
        $fields = array_values(array_intersect(
            $this->settings[$model->alias]['fields'],
            array_keys($model->_schema)
        ));
       
        $map = array();
        foreach ($fields as $field) {
            $map[$field] = Inflector::camelize($field);
        }

        return $map;
   
    }

    /**
     * Binds any models which should be bound in order to represent the
     * User who is to be traced.
     */

    private function _bindModels($model) {
       
        if (empty($this->settings[$model->alias]['map'])) {
            return false;
        }
       
        $models = array();
        foreach ($this->settings[$model->alias]['map'] as $foreignKey => $associationName) {
            $models['belongsTo'][$associationName] = array(
                'className'  => $this->settings[$model->alias]['user_model'],
                'foreignKey' => $foreignKey
            );
        }
        if (sizeof($models) > 0) {
            $model->bindModel($models, false);
        }
    }

}

?>

How To Fix Analog Error: “Warning F: Failed to open logfile”

Posted in Linux by Karl on

We use analog at work to perform analysis of our Apache server logs. I was setting up a new server recently and after installing and configuring analog I found I was getting an error every time I tried to run analog against our logfiles. The error was as follows:

Warning F: Failed to open logfile /var/log/httpd/access_log: ignoring it

If you are having a similar problem, hopefully this post will help you solve it.

There are a couple of basic causes of this error. The first and most obvious is that analog genuinely can’t open the logfile in question, either because it doesn’t exist at the location specified, or because it exists but cannot be opened by analog due to insufficient permissions. The second is that the logfile is too large to be processed (exactly what constitutes “too large” varies, but it typically means greater than either 2GiB or 4GiB on a 32-bit system).
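To check quickly whether the second cause might apply, you can list any logfiles above a given size. This is only a minimal sketch: the function name find_large_logs is just for illustration, and the 2GiB default is an assumption based on the 32-bit limit mentioned above, so adjust it to whatever applies on your system.

```shell
#!/bin/sh
# Sketch: list regular files under a directory that are larger than a
# given number of bytes. The default of 2147483648 bytes (2GiB) is an
# assumption; pass a different limit as the second argument if needed.

find_large_logs() {
    dir=$1
    limit_bytes=${2:-2147483648}
    # "-size +Nc" matches files larger than N bytes.
    find "$dir" -type f -size +"${limit_bytes}c"
}

# Example: find_large_logs /var/log/httpd
```

Any file it prints is a candidate for the splitting approach described further down.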

File doesn’t exist, or has insufficient permissions

This situation is easy to test for. First ensure you are logged in as the user who runs analog when it gives the error (if you run it manually you are fine, but if you run it via another user’s crontab for example you need to log in as that user to test). Next enter the following command (assuming analog is failing to open /var/log/httpd/access_log):

head /var/log/httpd/access_log

If you get the first few lines of your logfile printed to the terminal, then you don’t have a permissions error. Skip to the next section.

If you get an error indicating that the file could not be found, then you need to adjust the LOGFILE line(s) in your analog.cfg (typically located in /etc/analog.cfg) to point to the real location of your logfile(s).

If you get an error indicating that you have insufficient permissions, you will either need to run analog as a user who does have the correct permissions, or else alter the permissions of the logfile so that you can execute the command successfully. Remember that although the file itself may have adequate permissions, every directory in the path to the file also needs to be accessible. On my CentOS system for example, the logfile at /var/log/httpd/access_log is readable by everyone, but the /var/log/httpd directory is only accessible by root, which means that running analog on this logfile as any user other than root would fail.

If you need to alter permissions, it is only necessary to make the file readable, so the following is sufficient:

sudo chmod a+r /var/log/httpd/access_log

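Again be sure to check any parent directories of your logfile. A directory needs the execute (search) permission, rather than read, for you to reach the files inside it, so use chmod a+x on any directory in the path which you cannot enter.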
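If you would rather not check each level by hand, the same test can be scripted. The following is a minimal sketch which assumes an absolute path; the function name check_path_access is made up for illustration. It reports the first directory in the path which the current user cannot enter, or else whether the file itself can be read.

```shell
#!/bin/sh
# Sketch: report the first component of an absolute path that the
# current user cannot get through. Directories are tested for execute
# (search) permission, the file itself for read permission.

check_path_access() {
    target=$1
    dir=$(dirname "$target")
    current=""
    old_ifs=$IFS
    IFS=/
    # Walk the parent directories from / downwards.
    for part in $dir; do
        [ -z "$part" ] && continue
        current="$current/$part"
        if [ ! -x "$current" ]; then
            IFS=$old_ifs
            echo "missing or cannot enter directory: $current"
            return 1
        fi
    done
    IFS=$old_ifs
    if [ ! -r "$target" ]; then
        echo "missing or cannot read file: $target"
        return 1
    fi
    echo "ok: $target"
}

# Example: check_path_access /var/log/httpd/access_log
```

Run it as the same user who runs analog, otherwise the result tells you nothing about analog's view of the file.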

Log files are too large

This is the issue I had, and it caused me half an hour or so of frustration trying to find the problem! Luckily it is simple enough to fix. The cause of this issue is that the logfiles are not rotated frequently enough and grow too large. In my case, httpd logs were set to rotate at the default interval of one week. This is fine for most sites, but it happens that the site I am working on is fairly busy, and a week’s worth of access logs amounts to almost 5GiB.

To adjust the log rotation settings, I use logrotate, which is configured via files placed in /etc/logrotate.d on my system. The file which needs editing in this case is /etc/logrotate.d/httpd. I won’t go into the details of logrotate settings here, but I will give an example of my configuration. I set it to rotate logs daily, and keep 28 days’ worth of historical logs:

cat /etc/logrotate.d/httpd
/var/log/httpd/*log {
    missingok
    notifempty
    sharedscripts
    postrotate
    /sbin/service httpd reload > /dev/null 2>/dev/null || true
    endscript
    daily
    rotate 28
}

All future logfiles should now be well under the OS file size limit. However, if you have existing logfiles which are too big for analog to process, you can split them into smaller files. To do this, use wc to find the number of lines in your file (I’m using a file named access_log.1 in this example):

wc -l access_log.1

That should give you something like:

16542827 access_log.1

Next you need to decide into how many chunks you would like to split the file. My file is almost 5GiB, so I’m going to split it into 5 chunks. Take the number of lines returned by wc and divide it by the number of chunks (it doesn’t have to be exact). This gives me roughly 3300000 lines per file, so that’s the figure I’ll use for splitting:

split -l 3300000 access_log.1

This may take a while for large files, but once it’s done you should now find that you have some new files along with your old access logs:

ls -l
total 13411008
-rw-r--r-- 1 root root   16345797 Sep 16 11:29 access_log
-rw-r--r-- 1 root root 4582662485 Sep 16 11:05 access_log.1
-rw-r--r-- 1 root root 4488483066 Sep 12 04:11 access_log.2
-rw-r--r-- 1 root root      49594 Sep 16 11:28 error_log
-rw-r--r-- 1 root root  918092183 Sep 16 11:27 xaa
-rw-r--r-- 1 root root  911760823 Sep 16 11:27 xab
-rw-r--r-- 1 root root  912206727 Sep 16 11:28 xac
-rw-r--r-- 1 root root  912078937 Sep 16 11:28 xad
-rw-r--r-- 1 root root  917153546 Sep 16 11:29 xae
-rw-r--r-- 1 root root   11370269 Sep 16 11:29 xaf

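As you can see I actually ended up with 6 new files, because I rounded the number of lines per file down to 3300000 and the last new file contains the leftovers. At this point you can archive your original large logfile and rename your new files to match your log naming scheme. Note that mv silently replaces an existing file of the same name, so if you already have other rotated logs (such as access_log.2 in the listing above), move them out of the way first: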

mv access_log.1 old.access_log.1
mv xaa access_log.1
mv xab access_log.2
mv xac access_log.3
mv xad access_log.4
mv xae access_log.5
mv xaf access_log.6
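If you have several files to deal with, the counting, splitting and renaming can be wrapped up in one function. This is a rough sketch rather than a polished tool: the name split_log and the ".partN" suffix are my own invention for illustration, and it assumes there are no files already matching "<logfile>.split.*" in the directory. It rounds the lines per chunk up rather than down, so there is no small leftover file, and it refuses to overwrite anything which already exists.

```shell
#!/bin/sh
# Sketch: split a logfile into at most a given number of chunks and give
# the chunks numbered names. The ".partN" naming is illustrative only.

split_log() {
    file=$1
    chunks=$2
    total=$(wc -l < "$file")
    # Round up, so the lines fit into no more than $chunks files.
    per_chunk=$(( (total + chunks - 1) / chunks ))
    if [ "$per_chunk" -lt 1 ]; then
        echo "nothing to split in $file" >&2
        return 1
    fi
    # split names the pieces <prefix>aa, <prefix>ab, and so on.
    split -l "$per_chunk" "$file" "$file.split."
    i=1
    for piece in "$file".split.*; do
        # Never replace a file which is already there.
        if [ -e "$file.part$i" ]; then
            echo "refusing to overwrite $file.part$i" >&2
            return 1
        fi
        mv "$piece" "$file.part$i"
        i=$((i + 1))
    done
}

# Example: split_log access_log.1 5
```

With the 16542827 line file from earlier and 5 chunks, this would use 3308566 lines per file. The original file is left in place, so archive or remove it yourself once you are happy with the pieces.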

Repeat the above for any large logfiles and you should be able to run analog against the new, smaller files with no problems.

VirtualBox Bridged Adapter With Ubuntu/Debian Host (AKA Guest OS On Same LAN As Host)

Posted in Linux by Karl on

By default, VirtualBox uses NAT to provide network support for guest operating systems. This is fine if you just want to have basic network functionality such as internet access, but it means that the guest will be on its own network. It is often desirable to have the guest machine on the same LAN as the host (for example, if you want to connect to the guest via SSH).

This is actually very simple to set up. Just open a terminal on the host and install the necessary packages:

sudo apt-get install bridge-utils uml-utilities

Next open VirtualBox, select your guest and click Settings. Under Network, change ‘Attached to’ from NAT to Bridged Adapter, and click OK.

If you are using DHCP, that’s it. If you are using a static IP, you will need to configure the guest OS accordingly so that it can join your LAN.
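If you prefer the terminal to the Settings dialog, VirtualBox's VBoxManage tool can make the same change. Treat the following as a sketch: the function name set_bridged, the guest name my-guest and the interface eth0 are all placeholders, and as far as I know the guest needs to be powered off for modifyvm to apply.

```shell
#!/bin/sh
# Sketch: attach a guest's first network adapter to a bridged host
# interface. "my-guest" and "eth0" below are placeholders.

set_bridged() {
    vm=$1      # guest name as shown in the VirtualBox manager
    iface=$2   # host network interface to bridge to
    # VBOXMANAGE can be overridden (for example with "echo") to preview
    # the command instead of running it.
    ${VBOXMANAGE:-VBoxManage} modifyvm "$vm" --nic1 bridged --bridgeadapter1 "$iface"
}

# Example (guest powered off): set_bridged my-guest eth0
```

Setting VBOXMANAGE=echo first prints the command without touching the guest, which is a handy way to double check the names.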

Start rTorrent Automatically at Boot on Debian/Ubuntu

Posted in Linux by Karl on

I tend to use ruTorrent as a front-end for rTorrent, so I like to have rTorrent start silently and automatically on boot. Most of the guides on this subject seem to suggest using an init script placed in /etc/init.d, but this is more complicated than it needs to be in my opinion. You can call start-stop-daemon from /etc/rc.local instead, which is much simpler and yields the same result. Here’s how.

First install dtach:

sudo apt-get install dtach

Then edit your /etc/rc.local file:

sudo nano /etc/rc.local

By default this file looks something like this:

#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.

exit 0

Add the following before exit 0, making sure to replace karl with the user you would like to run rTorrent as, and /usr/local/bin/rtorrent with the location of rTorrent on your system (/usr/bin/rtorrent if you installed from the repo, /usr/local/bin/rtorrent if you compiled rTorrent from source with the default install path):

start-stop-daemon --start --chuid karl --name rtorrent --exec /usr/bin/dtach -- -n /tmp/rtorrent.dtach /usr/local/bin/rtorrent

Your /etc/rc.local should now look like this:

#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.

start-stop-daemon --start --chuid karl --name rtorrent --exec /usr/bin/dtach -- -n /tmp/rtorrent.dtach /usr/local/bin/rtorrent

exit 0

Press ctrl+o to save the file and ctrl+x to quit nano.

That’s it! To test you can either reboot, or else manually run the rc.local script:

sudo /etc/rc.local
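To confirm that the line did something, you can look for the socket which dtach creates at the path given with -n. This is just a sketch with a made-up function name, rtorrent_socket_status. Bear in mind that a socket file can be left behind after a crash, so its presence suggests rather than proves that rTorrent is running.

```shell
#!/bin/sh
# Sketch: report whether a dtach socket exists at the given path. The
# default matches the path used in the start-stop-daemon line above.

rtorrent_socket_status() {
    socket=${1:-/tmp/rtorrent.dtach}
    # -S is true only if the path exists and is a socket.
    if [ -S "$socket" ]; then
        echo "dtach socket present: $socket"
    else
        echo "no dtach socket at: $socket"
        return 1
    fi
}

# Example: rtorrent_socket_status
```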

If you don’t use an rTorrent front end, or want to view rTorrent in a terminal for whatever reason, you can reattach it to your terminal with the following command:

dtach -a /tmp/rtorrent.dtach

You can either press ctrl+\ (dtach’s default detach key) or simply close the terminal to detach again (rTorrent will keep running in the background).