One of the many things I love about Koha is the easy access I have to statistics. I’m a do-it-yourselfer, so I seldom use Koha’s built-in reports. Instead I go directly to the source: Koha’s statistics table.
Field | Type | Null | Key | Default | Extra |
---|---|---|---|---|---|
datetime | datetime | YES | MUL | NULL | |
branch | varchar(10) | YES | NULL | ||
proccode | varchar(4) | YES | NULL | ||
value | double(16,4) | YES | NULL | ||
type | varchar(16) | YES | NULL | ||
other | mediumtext | YES | NULL | ||
usercode | varchar(10) | YES | NULL | ||
itemnumber | int(11) | YES | NULL | ||
itemtype | varchar(10) | YES | NULL | ||
borrowernumber | int(11) | YES | NULL | ||
associatedborrower | int(11) | YES | NULL |
The statistics table is long-term memory storage for circulation transactions: checkouts, check-ins, renewals, and payments. The kind of reports we depend on most are circulation reports, so I usually focus on checkouts and renewals.
Just a few of these fields are relevant to most of my circulation reports:
- datetime stores the date and time of the transaction.
- branch stores the three-letter code for the branch where the transaction took place.
- type records whether the statistic is a checkout (“issue”), check-in (“return”), renewal (“renew”), payment (“payment”), or write-off (“writeoff”).
- itemnumber, which records the id of the item in Koha’s catalog.
Using these pieces we can put together a query which counts circulations per branch for a given month:
[sourcecode language=”SQL”]
SELECT branch,count(*) FROM statistics WHERE year(datetime) = 2009 AND month(datetime) = 12 AND (type = ‘issue’ OR type = ‘renew’) GROUP BY branch ORDER BY branch;
[/sourcecode]
branch | count(*) |
---|---|
ALB | 2762 |
APL | 21475 |
COV | 1746 |
CPL | 616 |
GPL | 2725 |
NPL | 5475 |
PPL | 5391 |
7 rows in set (37.16 sec)
One important thing to note about the results of that query: It took over thirty-seven seconds to execute. That’s ages in MySQL terms, and a cause for caution. We’ve been using Koha since 2003, so our statistics table is huge: almost nine million rows. For that reason I don’t run my statistics queries directly on our production Koha server. I back up the statistics table to a separate database where it won’t interfere with the performance of our Koha operations.
Getting fancy
Querying MySQL directly can get you many of the numbers you want, but if you want to aggregate data and manipulate it in fun ways you have to add a scripting layer to the process. For Koha that’s Perl, for me it’s PHP. The end result that I want to share is a report of circulations per month for all branches. This report shows not just the raw numbers from Koha but also formats them as a graph using Google’s Chart API.
This was just a quick look at the kind of work I’ve been doing recently. There are a lot of pieces of the puzzle that I’ve glossed over. I hope at the very least it’s an interesting glimpse of what is possible when you have easy access to your data and the tools to manipulate it.