Load testing (with loader.io): Find the bottleneck?


I tested a small c5 page with the free service http://loader.io, which simulates 250 concurrent users. Unfortunately, the c5 page (and the server) can only handle 100 users. If I test a simple .html file instead, the server can handle the load. (See screenshots.)

Max users:   250
Duration:   60 seconds
Avg response time:   137 ms
Timeout errors:   345
Network errors:   219
Errors (400/500):   0 / 0
Avg error rate:   3.37%

Max users:   250
Duration:   60 seconds
Avg response time:   2385 ms
Timeout errors:   140
Network errors:   958
Errors (400/500):   0 / 0

All cache settings are turned on and Statistics is deactivated. We use APC as the opcode cache (I can post the configuration if needed).
We're hosting on a hosteurope root server (package M). I can submit server performance details if needed. It is not a low-cost shared virtual server, though it is also not the best-performing server, because Plesk is installed on it.

My question: Does anybody have experience with finding the bottleneck?

htop tells me (see screenshot) that Apache accounts for most of the CPU usage, and the load average goes above 20.0 when loader.io hits our server.

I also thought about using XDebug and webgrind (https://code.google.com/p/webgrind/wiki/Installation) to profile it, but I don't think that this will really solve my problem.

Any advice for next steps?

Thanks in advance!

jshannon replied:

I don't think those numbers are too out of line for an application like C5 (vs a straight HTML page).

You're using the full-page caching in 5.6.2? My understanding is that that's basically a straight HTML page.

I think the way to go is xdebug. If you're familiar and comfortable with it, why not? It'll answer your problem pretty quickly. Well, I take that back. It'll tell you which areas are taking a lot of cycles, but not necessarily that, e.g., paging is your problem. But it's still a good place to start.

I've been playing with c5 profiling recently. Unfortunately, PHP doesn't seem to have any variables that give you info about bottlenecks. But I have started to create a c5Profiler which will give you a sense of where it's spending the most time (including DB calls). If you're comfortable with xdebug, I don't think it's got many additional benefits, though.
programmieraffe replied:
Hey jshannon,

Thanks for your reply! I was using the full page cache of 5.6.2, but with the setting "if blocks on particular page allow it". Testing again gives a much better error rate (there are still problems above 100 concurrent users, but no total breakdown). I attached some screenshots as well.

Full Page Caching: On - If blocks on particular page allows it
Block Cache: On
Overrides Cache: On 
Expire: Every 6 hours (default setting).
Max users: 200 (0-200 Cycling)
Duration: 60 seconds 
Avg response time: 3207 ms 
Success responses: 503
Timeout errors: 139
Network errors: 630 
Errors (400/500): 0 / 0 
Avg error rate: 60.46%

Full Page Caching: On - In all cases
Block Cache: On
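
For reference, the 60.46% error rate of the "If blocks on particular page allows it" run is consistent with counting timeouts and network errors as failures and dividing by all requests. That formula is an assumption about how loader.io defines the figure, not something taken from its documentation. It does reproduce the reported number:

```python
def error_rate(successes, timeouts, network_errors, http_errors=0):
    """Return the share of failed requests as a percentage.

    Assumption: every request ends up in exactly one of these buckets.
    """
    failures = timeouts + network_errors + http_errors
    total = successes + failures
    if total == 0:
        return 0.0
    return 100.0 * failures / total


# Figures from the "If blocks on particular page allows it" run.
rate = error_rate(successes=503, timeouts=139, network_errors=630)
print(f"{rate:.2f}%")  # prints 60.46%
```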

Further investigation showed that Apache logs the following error over and over:
[apc-warning] Unable to allocate memory for pool. in /var/www/vhosts/wissenschaft.magdeburg.de/httpdocs/concrete/dispatcher.php on line 108.

I attached the APC stats as a screenshot.


Does anyone know what to do about it?
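
This warning is commonly reported when APC's shared memory segment (the `apc.shm_size` setting) is full, though that is a general observation and not a confirmed diagnosis for this server. To see how often it fires and from which files, a small script can tally the warnings in the error log. This is a minimal sketch: the regular expression is modelled on the single line quoted above, and the real log layout may differ.

```python
import re
from collections import Counter

# Modelled on the warning quoted above; the exact log layout differs
# between setups, so treat this pattern as an assumption.
APC_POOL_WARNING = re.compile(
    r"\[apc-warning\] Unable to allocate memory for pool\. "
    r"in (?P<path>\S+) on line (?P<line>\d+)"
)


def count_apc_pool_warnings(log_lines):
    """Count APC pool-allocation warnings per (file, line) location."""
    counts = Counter()
    for text in log_lines:
        match = APC_POOL_WARNING.search(text)
        if match:
            counts[(match.group("path"), int(match.group("line")))] += 1
    return counts


sample = [
    "[apc-warning] Unable to allocate memory for pool. in "
    "/var/www/vhosts/wissenschaft.magdeburg.de/httpdocs/concrete/dispatcher.php"
    " on line 108.",
    "some unrelated log line",
]
for (path, line), hits in count_apc_pool_warnings(sample).most_common():
    print(hits, path, line)
```

Any iterable of strings works as input, so an open log file can be passed in directly instead of the sample list.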

My main problem is that the site will have a page list search module which will be used in almost all visits on the peak day. So I think full page caching alone will not save me here. I thought about caching the MySQL results with http://www.concrete5.org/documentation/developers/system/caching/...
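
To illustrate the idea of caching query results, here is a language-neutral sketch in Python. It is not the concrete5 caching API; `TTLCache` and `run_search_query` are hypothetical names made up for this example. The expensive query runs once per key, and its result is reused until it expires.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: skip the expensive query
        value = compute()  # miss or expired: run the query once
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


def run_search_query(term):
    # Hypothetical stand-in for the MySQL query behind the page list search.
    return [f"page matching {term!r}"]


cache = TTLCache(ttl_seconds=300)
results = cache.get_or_compute(
    ("search", "physics"), lambda: run_search_query("physics")
)
print(results)
```

On a real server the store would have to be shared between Apache worker processes, which this in-process sketch does not attempt.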

Regarding XDebug: Unfortunately I'm not familiar with it, so I don't know whether I should really spend the time to learn it right now. Is your c5Profiler ready to use? Sounds very interesting!

jshannon replied:
Yeah. I found the pagelist block was a HUGE problem on a previous site. I created a contact directory based on page types, with the overview page showing every single subpage (with javascript for sorting). It worked fine in testing, but once the client was up to 150 contacts, he was seeing 10+ second load times. But that block, more than most, should be easily solved with some intelligent caching.

Anyways, yes, the profiler is half-ready. The code that generates the statistics seems to be working, but there's no front-end to speak of. It dumps everything into a SQL table, so you've got to be comfortable with that, or I can assist if you send me the table contents. (In theory, this will be a useful tool, but until it proves itself in practice, I'm not going to bother with the fancy front-end.)

If you give me your email (via pm), I'll send it back to you.
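
The general approach described above (timing sections of a request and dumping the numbers into a SQL table) could look roughly like the following. This is only a sketch of the idea in Python, not the c5Profiler itself; the table layout, the `timed` helper, and the section labels are all assumptions made for illustration.

```python
import sqlite3
import time
from contextlib import contextmanager

# In-memory SQLite stands in for the SQL table; the schema is an assumption.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE timings (label TEXT, seconds REAL)")


@contextmanager
def timed(label):
    """Record how long the wrapped block took under the given label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        db.execute(
            "INSERT INTO timings VALUES (?, ?)",
            (label, time.perf_counter() - start),
        )


with timed("page_list_query"):
    sum(range(100_000))  # placeholder for real work, e.g. a DB call
with timed("render"):
    sum(range(10_000))  # placeholder for template rendering

# Sections with the largest total time come first.
for label, total, calls in db.execute(
    "SELECT label, SUM(seconds), COUNT(*) FROM timings "
    "GROUP BY label ORDER BY SUM(seconds) DESC"
):
    print(f"{label}: {total:.6f}s over {calls} call(s)")
```

Without a front-end, a grouped query like the last one is enough to see where most of the time goes.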

Responsive replied:
Hi programmieraffe

Are you still using this hosting for concrete5? I'm interested because I'm having problems with my current host.