Caching & concrete5 - my benchmark tests

I run the concrete5 cache on almost all of our websites...

I've set up another server today in the same datacentre purely to benchmark our production machine (we have been trying to work out how much traffic we can comfortably handle).

Test:
Run ab (ApacheBench) with one concurrent connection and count how many times we can request a particular file or page in a 60-second window.
Evaluate server load and real-world browsing impact.
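
The measurement described above (one request at a time for a fixed window, then a count and latency percentiles) can be sketched in a few lines of Python. This is only an illustration of the method, not the tool used in the test; with ApacheBench itself the run would be something like `ab -c 1 -t 60 <url>`. The `fetch` callable is a hypothetical stand-in for "perform one HTTP request".

```python
import time


def run_benchmark(fetch, duration_s):
    """Call fetch() back to back (one request at a time) for duration_s seconds.

    Returns the per-request latencies in milliseconds.
    """
    latencies_ms = []
    deadline = time.perf_counter() + duration_s
    while time.perf_counter() < deadline:
        start = time.perf_counter()
        fetch()  # hypothetical callable that performs one HTTP request
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def percentile(values, pct):
    """Smallest value such that at least pct percent of values are <= it.

    Assumes values is non-empty.
    """
    ordered = sorted(values)
    rank = max(1, -(-len(ordered) * pct // 100))  # ceiling division
    return ordered[int(rank) - 1]
```

The number of completed requests is `len(latencies_ms)`, and `percentile(latencies_ms, 99)` gives a figure comparable to the "99% within N ms" lines in the results.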

Results:

Static HTML File
>42,176 downloads in 60 seconds
>Latency 99% requests served within 2ms
>Server load impact - little
>Real world impact - no noticeable impact

Basic concrete5 site with cache enabled
>300 downloads in 60 seconds
>Latency 50% requests served within 180ms, 99% within 494ms
>Server load impact - double the static test, but not huge
>Real world impact - no noticeable impact

WordPress Site with Supercache Enabled
>14,224 downloads in 60 seconds
>Latency 99% requests served within 22ms
>Server load impact - little
>Real world impact - no noticeable impact

I've run these tests several times and each is getting fairly consistent results.

I'm amazed at how poorly concrete5 performs, even with cache on.

Is there a way we can build something like the 'supercache' plugin into concrete5 to generate static files so we can handle a lot of traffic?

I tested on concrete5 5.3.3.1. I'm aware that 5.4 uses APC; will this make much of a difference?

Thoughts?

myFullFlavour
andrew replied:
5.4 uses Zend Cache, which can have different layers, including the basic file cache (which is what we use), APC, memcached, and others. I just benchmarked 5.3.3.1 and 5.4.0 on our servers (the /about page in a standard install) and 5.3.3.1 served up 477 requests, and 5.4 served up 512 (compared with 300 in your tests.)

Obviously, the flexibility that concrete5 provides comes at a performance price, especially given its modular nature. It's no secret that even opening a database connection creates some semi-serious overhead, especially when comparing against static sites. I'm assuming supercache operates without a database (running a very lightweight PHP front-end and delegating requests to a full-HTML cached copy somewhere.) We would like to release something similar to this at some point in the future, and I believe Remo may be working on it already for inclusion in our add-ons area.
myFullFlavour replied:
Help us all if a concrete5 website ever gets dugg...

I think a caching solution like Supercache is the answer. I'd be keen to see what frz says about this as well.

I know a few of us developers were rather concerned when I released those figures yesterday.
Tony replied:
from tests that i've done before with other frameworks, 5-10 requests per second isn't bad for CMS systems and other dynamic sites. the database calls tend to be the constraint. concrete's running at the lower end of that range (the trade off for offering more functionality), but it's not really excessive either. yeah, concrete5 probably should get some kind of static caching option for sites that are getting pounded. the downside would be you couldn't display custom info for a specific user (login status, shopping cart contents, etc). being able to turn on static caching on a page by page basis would be cool though.
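
For reference, the 60-second counts quoted in this thread convert to per-second rates as shown in the short Python sketch below. Because the tests used a single connection, the window divided by the count is also the mean time per request. It assumes andrew's 477 and 512 figures were measured over the same 60-second window, which his post implies but does not state.

```python
WINDOW_SECONDS = 60

# Requests completed in the test window, as reported in the thread.
results = {
    "static HTML": 42176,
    "concrete5 5.3.3.1, cache on (original test)": 300,
    "WordPress + Supercache": 14224,
    "concrete5 5.3.3.1 (andrew)": 477,
    "concrete5 5.4.0 (andrew)": 512,
}


def per_second(requests, window=WINDOW_SECONDS):
    """Requests per second over the test window."""
    return requests / window


def mean_ms_per_request(requests, window=WINDOW_SECONDS):
    """Mean time per request in ms; valid for one connection at a time."""
    return window * 1000.0 / requests


for name, count in results.items():
    print(f"{name}: {per_second(count):.1f} req/s, "
          f"{mean_ms_per_request(count):.1f} ms/request")
```

The original concrete5 result works out to 5 requests per second (200 ms per request), which is the bottom of the 5-10 range mentioned above.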
frz replied:
Here's the deal..

5.4 is a lot faster in many ways, both in editing and in the front end. Lots of little things make that happen, as do Zend Cache & APC when they are /properly configured/. They can also slow things down and seem to be annoying to maintain.

What I would like for the digg moment is actually rather simple and I'd like it for other reasons as well.

I'd like any existing page in a concrete5 site to have an option to be "static published" to a directory somehow. Perhaps this is a page level command from the properties drop down? I tend to think you'd only use it across the top few levels of any site, even the biggest of ones, but I'd like to be able to say "right now I want static HTML and some image links that will work of THIS page and I want it OVER THERE."

Then I want dispatcher to be lightweight enough that it can pass through this static page with the absolute minimum of database calls. Ideally, none. I recognize that's not particularly likely and I don't want to start recreating the role of a database with some static file somewhere, but essentially I want to let Apache do what it does for sitename.com/bigPromo, with concrete5 getting out of the way as quickly as possible.

If that dynamic page that c5 has rendered as static content has interactive blocks on it, so be it. If there's a guestbook or survey or something, that block is essentially going to be slightly out of date as people see the static page. If they choose to interact with the guestbook, they'll be redirected over to the interactive version of the page with traditional concrete5 stuff going on.

In my experience, this would solve 99% of the problems that you have with most really big sites that get hammered. It's the top few pages that take most of the hits by far, and ideally these "pagelets" might even be encapsulated enough that you could spit them out to a mirror network like Akamai.

Now I get there's a LOT of holes in what I described, but fundamentally I think that's the right approach to take, because continually worrying about which one of these caching engines to use or compiling javascript to not include white space or something is like cleaning your car in the hopes it will turn into a pickup truck.

Perhaps some big corporate entity will come along and help finance these improvements.

Out of curiosity, what does WordPress without Supercache enabled look like in your tests?
mose replied:
This sounds intriguing. I like your idea, Franz, of making the highest-traffic pages static-ish. It would only be a few pages, such as the home page or a page being slashdotted. The page could be marked cacheable in the page properties.

One way this could be done, leveraging what is already in place, would be to write the rendered view to cache. When the request for a page arrives, the page is checked to see if it is cacheable. This information itself would be cached with the page information already being stored. If the page is cacheable, check the cache to see if the view is there. If it is, return the view and you are done. If the view is not in cache, things proceed normally, and the rendered view is written to cache at the end of the process so that it is available on the next request.

If there is dynamic information on the page, a view could have a low cache timeout of 15 minutes or even 5 minutes so that dynamic information is refreshed regularly, but at least the entire page won't be regenerated on every access.
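
The flow described above (check whether the page is cacheable, return the cached view if present, otherwise render and store it with a timeout) can be sketched as follows. This is a minimal Python illustration of the idea, not concrete5 code; `ViewCache`, `serve` and the `render` callable are hypothetical names.

```python
import time


class ViewCache:
    """In-memory cache of rendered pages with a per-entry lifetime."""

    def __init__(self, clock=time.monotonic):
        self._entries = {}  # page_id -> (expires_at, html)
        self._clock = clock

    def get(self, page_id):
        entry = self._entries.get(page_id)
        if entry is None:
            return None
        expires_at, html = entry
        if self._clock() >= expires_at:
            del self._entries[page_id]  # stale: force a re-render
            return None
        return html

    def set(self, page_id, html, ttl_seconds):
        self._entries[page_id] = (self._clock() + ttl_seconds, html)


def serve(page_id, cacheable, render, cache, ttl_seconds=300):
    """Return cached HTML when possible; otherwise render and store it."""
    if cacheable:
        cached = cache.get(page_id)
        if cached is not None:
            return cached
    html = render(page_id)  # the normal, expensive path
    if cacheable:
        cache.set(page_id, html, ttl_seconds)
    return html
```

With `ttl_seconds=300`, a page hit thousands of times in five minutes is rendered once in that window, and dynamic content is at most five minutes out of date.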

An alternative for dynamic information would be to include JavaScript in the page that would refresh the dynamic information after the page is loaded in the browser. That requires preparation, though.

If APC is being used, then all of the PHP will already be in cache. Since only a few high-traffic pages are being cached, it won't require much additional memory. Very few database calls would need to be made. This should provide a reasonable speed increase.

If this idea is generalized, then anything that has a view could be cached. You might not cache a page, then, but if there is an expensive block on a page, it could be cached. That would allow other dynamic information on the page to still be generated.

It might be efficient to include all of the JavaScript and CSS in the cached view, especially in a slashdot situation where a single page is being targeted. The browser would open one connection to get the web page, and everything would be shipped out in one bundle rather than having the browser open multiple connections for individual files.

Of course, things could be even faster by coming up with a way to generate a pure, static page and bypassing PHP altogether, but the approach above makes use of features already in Concrete5. Implementation would be transparent (i.e., no special configuration changes).
Remo replied:
Have you seen this post? http://www.concrete5.org/developers/beta/beta_bitching/cache-librar...

I didn't put too much effort into it but it should give you some basic information about apc in c5.
Remo replied:
Andrew's right too, I'm working on a different cache approach which caches a whole page (if possible). This has a huge impact on pages where you only use autonav, content and other static blocks.

I've posted a version of my addon here: http://www.concrete5.org/community/forums/customizing_c5/reconsider...

It's not finished and I probably won't be working on it in the near future. Other stuff to do at the moment..

Feel free to use it if it works for you, but keep in mind that it isn't finished! I'd appreciate any feedback, though.
mose replied:
I took a look at your code. Interesting idea with how you hook into things. I updated your code for 5.4.0. The package is attached to this message.

I used the built-in caching mechanism in Concrete5. If you are using APC, for example, then page content will be cached in memory. If you (or any other brave person) can do some tests to see how much of an improvement this is over no caching, that would be great to know.

This is a good proof of concept, but several things need to be addressed to make it a general solution. It would be best if this is a part of C5, or at least, there should be an API in C5 that would allow pluggable caching. While this solution works, it's not quite a perfect fit.

At the moment, pages are cached if all of the blocks on them are cacheable. If a site has thousands of public web pages and a robot comes along, it will load the entire site into cache, which is probably not what someone wants. There should be a mechanism to select a few specific pages to be cached. These would be the highest-traffic pages.

The default home page of a new installation won't be cached, because it has the YouTube block on it. That can probably be cacheable, because that information won't change very often. In any case, there should be an option to make a page cacheable, no matter what blocks are on it, if a small delay in updating dynamic information can be tolerated.

Currently, a web page is cached forever. If the page is edited and approved, the page is removed from cache, and the next request for that page will cache it, again. There should be an option to specify how long a page can be cached. That way the home page can be cached for 5 minutes, for example. That would help survive an onslaught of visitors but still display relatively current information, if it is dynamically updated.

Code needs to be added to determine if a POST is being made. In that case, the cache should be bypassed so that the information can be processed by the controller.
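
Two of the rules above (a page is cached only if every block on it is cacheable, and a POST must bypass the cache) fit in a small predicate. The Python sketch below only illustrates the logic; the whitelist contents and the `force_cacheable` flag are assumptions, the latter standing in for the proposed "make this page cacheable regardless of its blocks" option.

```python
# Hypothetical whitelist; the thread names content and autonav as static blocks.
CACHEABLE_BLOCKS = {"content", "autonav"}


def page_is_cacheable(block_types, force_cacheable=False):
    """A page qualifies when every block on it is on the approved list,
    unless the page has been explicitly marked cacheable."""
    return force_cacheable or all(b in CACHEABLE_BLOCKS for b in block_types)


def may_use_cache(method, block_types, force_cacheable=False):
    """Only GET requests use the cache; POST (and anything else) falls
    through so the controller can process the submitted data."""
    if method.upper() != "GET":
        return False
    return page_is_cacheable(block_types, force_cacheable)
```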

I'm interested in any other thoughts. Thanks for breaking the ground, Remo, with this code.

Edit: Attachment removed. See below.
mose replied:
I can't get to the beta page. A login page is displayed, and it won't accept my username/password. I requested access to the beta group several weeks ago, but I haven't heard anything.
frz replied:
Are you sure? I see you have the beta badge, so you're in the beta group.
mose replied:
I just checked, and I can get to the beta group, now.
Quaro replied:
>I'd like any existing page in a concrete5 site to have an option to be "static published" to a directory somehow. Perhaps this is a page level command from the properties drop down? I tend to think you'd only use it across the top few levels of any site, even the biggest of ones, but I'd like to be able to say "right now I want static HTML and some image links that will work of THIS page and I want it OVER THERE."

I know there's been a lot of work done on caching in the time since this 2010 post, but this sounds great. Did anything ever come of this idea?

The difference between a static HTML page and anything dynamic is still so huge. And for most of my sites, static content wouldn't even noticeably change the user experience.

Edit: And now I see you guys are looking towards Varnish as a solution, sounds cool!
frz replied:
there's some more thinking in the works in this direction.. yeah.

best wishes

Franz Maruna
CEO - concrete5.org
http://about.me/frz
adamjohnson replied:
It seems that Concrete5 could really use an add on similar to WP's Supercache. It would really round out the package.

Case in point: I showed a friend a site that I had converted to Concrete5 and the first thing he said was "It seems slower now." It appears people are noticing the differences in speed.
mose replied:
Attached is version 1.2 of the package. I discovered that the login page became cached, and then I couldn't log in, heh.

Code has been added to bypass the cache if a POST is made. The search, youtube, survey and form blocks have been added to the cacheable list. All of these blocks should be static until the next time they are edited. After a page is edited, it will be re-cached. Very few blocks, then, aren't cacheable.

Edit: Attachment removed.
mose replied:
Can't seem to attach a new file to a previous post.

Made an additional change to not cache HTTPS information. Added Flash as a cacheable block.
Remo replied:
hey sorry man, it seems like I didn't upload the latest version.. I fixed the "POST-Problem" a long time ago.

If you're interested in improving this, should we create a public repository? SourceForge, Google Code?

I also replaced some code to use the cache helper instead of writing files... Can't access the code right now, vpn isn't working at the moment ):
mose replied:
Before we go too much further, I would like to see if it would make sense to put some hooks into dispatcher.php or View so that a caching add-on could be called as part of the regular process. Sending the content to the output and calling exit() seems a bit abrupt. It would be nice to capture the rendered output and store it in cache instead of opening a TCP connection to get the content of the page.
Remo replied:
Yes, I tried to capture the output using output buffer but it didn't work, probably because c5 uses output buffering too..

That's not the biggest problem imho. It's certainly not perfect..

Putting some stuff into dispatcher would definitely improve a lot but at some point we'd have to integrate it into the core. If the dispatcher has to check all addons for a specific hook it would cause some overhead too.
Remo replied:
I'm not able to extract the zip file you've uploaded. There's only one subdirectory but no files in it?
mose replied:
Yes, you're right. This attachment should have everything.

Edit: See below.
Remo replied:
cool, thanks!

I've seen that you've moved the cache creation method into getPageCache. I don't think it really matters, but when you delete the cache, why are you still using on_page_version_approve, isn't on_page_delete more appropriate?
mose replied:
I'd have to check when each of the events is fired. The code you wrote used on_page_approve, and I just kept the event. When a page is approved, the current copy in cache is no longer valid, and it should be removed. on_page_delete should also be handled to remove the page from cache. If page approval is followed by a page delete and a page add, then on_page_add should be used to invalidate the cache information instead of on_page_approve.
Remo replied:
I think the "approve-event" is fine for this..

My approach was a bit different: I wanted to update the cache as soon as a new page (version) has been approved. By doing this, no end user would have to wait until the cache has been updated for the first time.

But it doesn't really change a lot..
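
The two strategies discussed here differ only in what the approve handler does: drop the cached copy (the next visitor pays for the render) or re-render immediately (no visitor waits). A minimal Python sketch follows. The event names are the ones mentioned in this thread; the `extend`/`fire` registry, `render` and `page_cache` are hypothetical stand-ins, not the concrete5 API.

```python
from collections import defaultdict

_handlers = defaultdict(list)


def extend(event, handler):
    """Register a handler for a named event (stand-in for an events API)."""
    _handlers[event].append(handler)


def fire(event, *args):
    """Call every handler registered for the event."""
    for handler in _handlers[event]:
        handler(*args)


page_cache = {}  # page_id -> rendered HTML


def render(page_id):
    # Placeholder for the real, expensive page render.
    return f"<html>page {page_id}</html>"


def invalidate(page_id):
    """Lazy strategy: drop the entry; the next visitor triggers the render."""
    page_cache.pop(page_id, None)


def refresh(page_id):
    """Eager strategy: re-render immediately so no visitor waits."""
    page_cache[page_id] = render(page_id)


extend("on_page_version_approve", refresh)  # swap in invalidate for the lazy variant
extend("on_page_delete", invalidate)
```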
mose replied:
Extending events is already a part of c5. My general thought was that an add-on could extend a page caching event. I'd have to look at the code more closely to see if something like that makes sense.
Remo replied:
the problem is that the dispatcher needs to include quite a few things before it knows about any package or event. You already have quite a lot of overhead before an event can get fired...

We'd have to create a different kind of event to handle this earlier in the dispatcher.

That's certainly possible but we'd have to update a few lines in the core. At this point we'd have to contact Andrew I guess. However, the addon is going to improve a lot, even with this overhead in the dispatcher..
arcanepain replied:
Hey guys,

Not going to pretend I understand more than about 20% of what you're saying and the code you're bouncing back and forth, but like the route you're going with this, and I really look forward to the add-on when it's ready. From the less technically capable, thanks for all the work you're putting into this!
Remo replied:
@arcanepain, Thanks, I'm sure you'll be able to use it pretty soon. But keep in mind that it isn't able to cache all pages. Only those with "static blocks". By "static block" I mean a block that doesn't change its output unless you go to the edit mode and change something. The guestbook for example can be changed, even if you don't have edit access. Therefore, this addon won't improve pages with guestbooks..

@mose, I don't really see why you'd want to specify a page as "cacheable", nor why you'd want to add a "lifetime". The cache gets updated as soon as the page changes if there are only static blocks. If you wanted to cache all pages, it would be possible to cache a page for x minutes. But keep in mind that you cannot cache a page with a form block, not even for 10 seconds. There's a unique id in it, which makes it pretty much uncacheable.
mose replied:
I'm thinking about those extremely high-traffic situations for pages that are dynamically updated. This is often referred to as the slashdot effect. Someone posts a link and then thousands or tens of thousands of people descend on that poor web page. The website dies a quick death.

If the web page could be cached for 5 minutes at a time, it might be able to keep up with a large number of accesses. It's likely that any dynamic information being out of date by 5 minutes will have a minimal impact.

If you aren't concerned about that kind of situation happening, then there is no need to mark a normally uncacheable page as cacheable or to specify a lifetime. As you said, the cache is updated whenever the page is changed.

However, it would still be helpful for large sites (i.e., thousands of pages) to identify the pages that can be cached. That way an entire website won't be loaded into memory if a robot walks the website.

Edit: I didn't realize that a form has a unique ID. I have removed it from the cacheable list in my copy of the code.
mose replied:
I just did some ab testing with very unexpected results. c5 is faster without page caching installed. That seems bizarre.

I did the test after viewing the page so that everything should have been loaded into cache. I'll have to see if I can track down what is happening.
Remo replied:
I never tested it with APC, but with simple file caches my sites were quite a bit faster!

About robots, if you don't want to load a whole site into memory you could probably check for the query string. If search robots dig into the deep web, they usually use complex query strings to get access to more data. This should work for a few more sites. Concrete5.org might be an exception since we really have lots of pages / content, no matter if there's a query string or not.
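
The query-string check suggested here, combined with the earlier idea of caching only a few selected pages, could look like the Python sketch below. The function name and the `allowed_paths` parameter are hypothetical.

```python
from urllib.parse import urlsplit


def cacheable_url(url, allowed_paths=None):
    """Reject URLs with a query string; optionally restrict to chosen paths."""
    parts = urlsplit(url)
    if parts.query:
        return False  # robots crawling deep links tend to arrive with query strings
    if allowed_paths is not None:
        return parts.path in allowed_paths  # only the selected high-traffic pages
    return True
```

For example, `cacheable_url("http://example.com/search?q=cache")` is rejected, while `cacheable_url("http://example.com/", allowed_paths={"/", "/about"})` is accepted.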

Might be a bit difficult to create an addon which works in all situations.. hmm.
mose replied (Best Answer, 1 Attachment):
OK, I tracked down the slowness problem. The world is right, again.

I ran some ab benchmarks with a single connection on a new, standard installation. Page caching with APC is roughly 2.5 times faster than no page caching when retrieving the default home page. Average time per request dropped from 63ms to 22ms. That equates to over 2,500 requests a minute (4-core 2GHz Xeon, 45% CPU used during the test).

Actual total page load time, of course, will be greater and will depend on the complexity of the web page, the number of additional files that need to be loaded (JavaScript, CSS, images, Flash), if you are running APC to cache PHP code, if your web server has been tuned and if your website is running in a shared environment.
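
As a quick check of the arithmetic, average latency on a single connection converts to throughput and speedup as in the Python sketch below. It uses only the figures quoted in this thread: 63 ms without page caching, 22 ms with it, and the 25 ms figure mentioned in a later post.

```python
def requests_per_minute(avg_ms):
    """Sequential requests that fit in one minute at the given average latency."""
    return 60_000 / avg_ms


def speedup(before_ms, after_ms):
    """How many times faster the 'after' latency is than the 'before' latency."""
    return before_ms / after_ms


for label, after_ms in (("22 ms", 22), ("25 ms", 25)):
    print(f"63 ms -> {label}: {speedup(63, after_ms):.2f}x, "
          f"{requests_per_minute(after_ms):.0f} requests/minute")
```

At 22 ms this gives about 2,727 requests a minute and a speedup of about 2.9; at 25 ms it gives 2,400 requests a minute and a speedup of about 2.5.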
arcanepain replied:
Forgive me if i'm totally misunderstanding the function of this but, worst case scenario, would this add-on be any use if my sites were running on a shared server without APC installed/enabled? Would the add-on even function without APC?
Remo replied:
sure, the cache library in concrete5 can use different cache backends.. APC, memcache, files, sqlite etc.

No problem there!
arcanepain replied:
Awesome...does it select one automatically or do you have to declare one in the add-on? Actually, this just extends the Concrete cache, right? I guess it handles the plugging in to one of these.
arcanepain replied:
Hi Mose / Remo. Perhaps against my better judgement, I've actually got your caching add-on running on one of my production sites right now (don't worry, it's thoroughly backed up!). I successfully upgraded it to 5.4 the other day, so I thought I'd take a chance on it, as it is an almost entirely static site and I've always found it annoyingly slow. I've got a few custom blocks on there, but following the pattern of your custom blocks in there, Remo, I added mine to the array (they're all static, so they shouldn't cause a problem). The site still runs fine after installation and I haven't spotted any problems, BUT nor have I spotted much of a speed increase.

I'm not asking for anything unrealistic, but would love to think it's running a little faster. Stupid question, but how can I actually tell if any of the pages are being correctly cached by your add-on? I'm trying to sift through all the incomprehensibly named files in my cache (using the basic file method...not APC or anything) but I don't really know what i'm looking for. Is it a file with all the meta/version/edits data + a few regular HTML sections which represent the blocks on the page -- there seem to be lots of these -- or should it look more like an actual HTML page (ie. with <html><head>.scripts, stylesheets.</head><body>.layers,content,images.</body></html> etc....)? You talk about capturing the whole page above, so i'm thinking it should look more like a normal static webpage...not a mis-mash of Concrete page/block data and the odd bit of HTML.

I know this isn't finished nor properly 'released' or anything, but if you could give me any hints that would be great.

Cheers
Remo replied:
You're right about everything. Whole page, there should be a speed increase and there should be stuff in the cache directory.

Are you sure you've added this line to config/site.php?

<?php
define('ENABLE_APPLICATION_EVENTS', true);
?>
mose replied:
I haven't actually looked at the cached information, but it should be the headers and all of the HTML of the web page. It might be easier to find the document by clearing cache, visiting a web page and then looking at cache to see what files have been created. That way you don't have to look through too many files. When you visit the next page, note the exact time, and then look for cache files that were created at that moment.
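
Looking for files created "at that moment" can be automated. The Python sketch below lists files under a directory that were modified within the last few seconds, newest first. The function name is hypothetical, and the directory you pass in is whatever your cache directory happens to be.

```python
import time
from pathlib import Path


def recently_modified(cache_dir, within_seconds=10, now=None):
    """Return files under cache_dir modified in the last within_seconds,
    newest first."""
    now = time.time() if now is None else now
    hits = []
    for path in Path(cache_dir).rglob("*"):
        if path.is_file():
            mtime = path.stat().st_mtime
            if now - mtime <= within_seconds:
                hits.append((mtime, path))
    return [path for _, path in sorted(hits, reverse=True)]
```

Clear the cache, load one page as a guest, then run `recently_modified("files/cache")` from the site root and open the files it returns.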

At this time, there is no easy way to list the pages that are in cache. Since you seem somewhat comfortable working with the code, you could put a logging command into the code to record the cID of a page when it is cached.

Find the following line in models/remocache.php

if ($writeCache) {

and add this statement on the next line.

Log::addEntry("cached page " . $cID);

Clear the cache to start fresh. Logout, access a page, login (or do this with two browsers, one logged in and the other one not), go to the dashboard, click Reports and then click the tab for Logs. You should see that the page was cached and its number.

Note that pages are only cached when they are accessed by a guest (i.e., someone who is not logged in). Pages for a registered user are not cached. As a result, this will not speed up things like the dashboard. As you also discovered, a page will not be cached if one or more of its blocks are not on the approved list.

Many things affect the speed of a website. APC is an in-memory cache and will be faster than a file-based cache. Since the Linux operating system itself does a good job of caching recent files in memory, a file-based cache on Linux should be an improvement over no caching, given sufficient memory.

This particular implementation of the page cache only caches the HTML. If CSS, JavaScript and images must be retrieved, that still must be done separately. If an image-heavy website is slow, it will still be slow after caching, because the amount of HTML will be tiny in comparison to a number of sizable images.

You could consider turning on caching in the web server itself for certain things like CSS, JavaScript and images. You have to be careful, because you don't want to cache things that only a registered user should see or that are dynamically updated. If the web server does caching, then when a request arrives for that information, the web server can respond without involving Concrete5 at all (again, assuming that the JavaScript or CSS is not dynamically generated).

Configure the web server to compress HTML, JavaScript and CSS. While it takes time to do the compression, sending less information across the network can make the whole process faster.
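
To get a feel for how much compression saves, the Python sketch below gzips a block of repetitive markup and compares sizes. The markup is made up for the example; real pages will compress by different amounts, and in production the web server does this work, not application code.

```python
import gzip

# Repetitive markup, as in a typical navigation list, compresses well.
items = "".join(f'<li><a href="/page/{i}">Page {i}</a></li>' for i in range(200))
html = f"<ul>{items}</ul>".encode("utf-8")

compressed = gzip.compress(html)
ratio = len(compressed) / len(html)
print(f"{len(html)} bytes -> {len(compressed)} bytes ({ratio:.0%} of original)")
```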

All of the things that make an ordinary web page fast apply here, as well. For example, if JavaScript is referenced at the top of the web page, that will cause other operations to stall while the JavaScript is being downloaded. If the JavaScript is placed at the bottom of the web page, the browser will open multiple connections to download images and other information in parallel. That makes it appear that things are faster, because items show up on the web page sooner, even though the total time is probably about the same either way.

Beyond those factors, more resources can help speed delivery. That includes a faster CPU, more memory, faster disk drives and a faster network. The next step after that is a content delivery network (CDN) to spread accesses among multiple servers in different locations on the Internet (e.g., referring to jQuery stored at Google).

In summary, page caching is only a small part of speeding content delivery. You need to analyze why a website is slow, and then make appropriate changes to resolve that specific issue. If you tweak something that isn't very slow to begin with, you won't notice much change.
mose replied:
As you may have noticed in this thread, people are using "ab" to automate testing the speed of page accesses. ab comes with Apache. You might want to try it, as well, to get more quantitative results.

Retrieving HTML without page caching here takes about 63ms, and with page caching, it takes about 25ms. No human can tell the difference between those two times just by looking at a web browser.

If you have something like Firefox + Firebug + YSlow, you can see the total time it takes for the web page and everything on it to load.

Good call, Remo, on defining events.
Remo replied:
It seems like performance is getting even more important: http://searchengineland.com/google-now-counts-site-speed-as-ranking...
frz replied:
A question of refresh time came up a bit ago..

In my mind, ideally you'd have:
- A page-specific attribute or something to turn "static caching" on or off for the page.

- A page-specific "refresh" attribute, with settings like:
    On Edit
    Every Night
    Every 12 hours
    Every 1 Hour
    Every 10 Mins
    Every 5 Mins

- Some site-wide dashboard controls for the moment you get "slashdotted". These should actually be in a config file somewhere so you can easily edit them from the OS even if the site's UI & DB are down. They should let you:
    Put the site in maintenance mode (exists already)
    Put the site in maintenance mode, except for cached pages, which render as expected.
    Ignore all cache refresh settings.
    Flush cache and force recreate (maybe; this is more of a command than a setting)

- Yes, ideally it would create results that were sanely formatted and encapsulated enough that you could pass them on to a mirror server somewhere else (think Akamai).

- This probably doesn't help you guys, but it should be a hook deep in core so it can bypass as much unneeded junk as possible. Before DB queries altogether, if at all possible. If we can get a solution working well here, I think it shouldn't be an add-on but should be deeply integrated.
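
The refresh settings and the "ignore all cache refresh settings" switch from the list above could be expressed as a small lookup plus one decision function. This Python sketch is only an illustration. Treating "Every Night" as a 24-hour interval is an assumption, and so is the choice that the emergency switch suppresses regeneration even after an edit.

```python
# Refresh options from the list above, in seconds.
# None means "refresh only on edit".
REFRESH_SECONDS = {
    "On Edit": None,
    "Every Night": 24 * 60 * 60,  # assumption: treated as a 24-hour interval
    "Every 12 hours": 12 * 60 * 60,
    "Every 1 Hour": 60 * 60,
    "Every 10 Mins": 10 * 60,
    "Every 5 Mins": 5 * 60,
}


def is_stale(setting, age_seconds, edited_since_cached, ignore_refresh=False):
    """Decide whether a cached page must be regenerated."""
    if ignore_refresh:        # emergency switch: serve whatever is cached
        return False
    if edited_since_cached:   # an edit always invalidates the copy
        return True
    interval = REFRESH_SECONDS[setting]
    return interval is not None and age_seconds >= interval
```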

I actually think Andy has prototyped some stuff along these lines, inspired by your direction here. It's reasonable to expect this list of desires to grow and something compelling to end up in the next big version of concrete5, with your help.
Remo replied:
1. static caching attribute. In 95% of all cases there's no need for that, I know if a page can be cached, no need to ask the user for this information. It might actually cause a few problems. If a user enabled static caching on a page with a form block it might cause some annoying support requests...

2. Refresh attribute. With the events we have, we usually know if something has been changed. Again, in 90% of all cases this isn't necessary. However, for a page like concrete5.org you could use such an attribute in about 90% of all cases because of the discussion package. Your event concept is a lot more efficient than any refresh functionality. I'd rather add a new event "page_content_updated" which all addons must call if they update page content printed by view.php.

3. Mirror server. This already works, zend_cache allows you to use memcache which is just perfect for this. My testsite already runs on two servers!

4. Database less. Yes, we can't do everything within an addon but the current code already improves it a bit... Andy actually asked me if this could be an addon but I agree, if you want to get the best result it shouldn't be an addon..
arcanepain replied:
Ok...thanks so much for the help and the clarification guys, but still no joy here. Pretty weird really...I added the log event following the permission/ability to cache in the add-on (as per Mose's post). This works fine and as expected...pages with valid blocks are cropping up in the log when I load the page in a separate browser. All well and good (although when clicking round even without edits I'm seeing one or two duplicated caching events...not worrying about that for now though). Problem is, having meticulously gone through every cache file created at exactly the log time (both tiny files and the bigger cache files, AND cache files a couple of seconds before and afterwards just to be sure) I can't find anything that looks like a proper cached page (i.e. basically an HTML document as I'd expect the page to look). Strangely, what I AM finding is a duplicated version of my CSS/stylesheet, which I didn't even think was a valid candidate for caching as per Mose's post above. This CSS is just added in to the page with an @import, so nothing too unusual going on there.

Any idea what might be happening there? Is the CSS supposed to get cached? I activated page events as per Remo's suggestion too. Needless to say, i've tried multiple cache clears and even tried manually deleting all files in cache before flicking through the site. Incidentally, should these files be going into the regular 'cache' folder in 'files' or to the 'cache_objects' folder that seems to be in there? I cleared my 'cache_objects' folder as part of one of my manual cache-clears, and nothing new seems to have made it back in there....it's just empty!
mose replied on at Permalink Reply
mose
5.4.0 caches more information than before, including some CSS. So, yes, you should be seeing CSS in the cache. Not all CSS is cached. Only the CSS that can be customized.

I use APC here for the cache, and I do not have a cache_objects directory on my server. Perhaps it is used by the Zend file cache, but I don't know.

If you are using the default Concrete5 cache, it is the Zend file cache. You should find the html files somewhere in the files/cache directory. It is a mystery why they are not apparent.

Note, again, that the file will begin with the headers normally sent by the web server, followed by the <html></html> document. The file will not start with <html>.
arcanepain replied on at Permalink Reply
arcanepain
Interesting. Think i've found out what's happening, although I couldn't hope to explain why! After exhaustively going through cached files over and over and still not finding ANYTHING that looks like a complete page despite the cache log entries, I started wondering why I was seeing the same page multiple times coming up in the cache log. This appears to be the problem...as I keep returning to the same page, it keeps triggering the cache (ie. your add-on is not finding the existing page in cache), despite no changes/approvals having been done. Makes me think that although we're getting successfully to the cache stage, the file isn't actually being written to cache...despite successfully reaching cache stage each time it never makes it in there. Explains why I can never see it, and why sometimes i'm not finding ANY files that have been written to the cache at the same time as the log indicates one has.

Any idea why this might happen? Pretty standard C5 install (was a fresh install at 5.3.3.1 and was upgraded to 5.4), except for the Pretty URL fix to work with my shared hosting server (no core change any more, just added: define('SERVER_PATH_VARIABLE', 'REDIRECT_URL'); to the site.php).

Cache must be doing its job apart from this, as lots of files are making their way in there - Concrete metadatas, content block content, etc... Just no actual pages! Is there something wrong with the $ca->set(xxx, xxx, xxx); method maybe? It works fine on a default C5 install i've got running on my desktop WAMP server and obviously fine for you guys, so i'm guessing it's fine, but i'm sort of stabbing in the dark here...
mose replied on at Permalink Reply
mose
If it is working fine on a default install, there must be something different about the production installation. If the cache module can't write the page to cache, that would explain why it doesn't find the page, again, and tries to write it the next time the page is accessed.

Are the files directory and all of the files and directories below it readable and writable by the web server (e.g., chmod 755 for directories and chmod 644 for files, all owned by the web server user and group)?
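
A quick way to check this from PHP itself is sketched below. findUnwritablePaths() is a hypothetical helper, not part of concrete5, and the path you pass in is whatever your install uses for its files directory.

```php
<?php
// Hypothetical helper (not part of concrete5): list every path under $dir
// that the current PHP process cannot write to.
function findUnwritablePaths($dir)
{
    // A missing directory is reported as a problem in its own right.
    if (!is_dir($dir)) {
        return array($dir);
    }

    $problems = array();
    if (!is_writable($dir)) {
        $problems[] = $dir;
    }

    // Walk all files and sub-directories, skipping "." and "..".
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::SELF_FIRST
    );
    foreach ($iterator as $item) {
        if (!$item->isWritable()) {
            $problems[] = $item->getPathname();
        }
    }

    return $problems;
}
```

Call it from a throwaway script requested through the web server rather than from the command line, since the command-line user is often not the user the web server runs as. An empty array means everything below the directory is writable.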
mose replied on at Permalink Reply
mose
@Remo: I mostly agree with your thoughts about static caching and refreshing. I don't know what the Zend file cache does, but APC has a fixed amount of memory with a default timeout of 2 hours. If the cache fills up before then, I think it uses LRU replacement.

If a robot came along and indexed a website, every page would eventually get loaded into cache. If the cache would fill, old pages would get tossed out. The most-visited pages would stay in cache (or make it back in pretty quickly). Over the next two hours after the robot leaves, the cache would normalize, again. That situation takes care of itself. You just need to decide how much memory to use for cache.

If we use events to tell when a page has changed, then the page never needs to expire from cache. (Although, APC has its own default timeout of 2 hours, as I mentioned.) This handles the general case nicely.

Now for the exception. Some pages have dynamically-generated content, but it would be nice to cache that information for a short time, maybe 5 minutes or even 1 minute, on a high-traffic site so that you aren't regenerating (nearly) the same dynamic content a hundred times a minute. That seems like a waste.

Take the forum summary page on this site, for example. I don't know how often it is visited, but it is the main page into the forums. It would be pretty cool if that page just splashed on the screen. That could happen if the content were cached. However, it needs to be cached by force, due to its dynamic block, and it should only be cached for a short time, because people are adding posts.

Maybe we need a combination of the Remo and frz ideas. Cache all pages by default, if they are cacheable, and let the cache sort things out in terms of timeout and replacement policy. That is what it is already designed to do.

If a page isn't cacheable, define a page attribute that forces it to be cached. If the page attribute name is something like cacheTimeout and its value is greater than zero, the page is always cached for that number of seconds, even if dynamic blocks are on the page.
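
As a rough sketch of that rule (the function name and its two parameters are made up for illustration; a real add-on would read them from the page and its blocks):

```php
<?php
// Hypothetical decision helper for the proposed cacheTimeout attribute.
// Returns:
//   null  => do not cache the page at all
//   0     => cache it and let events / the cache backend decide when it goes
//   N > 0 => cache it for N seconds, even with dynamic blocks on the page
function pageCacheLifetime($hasDynamicBlocks, $cacheTimeout)
{
    if ($cacheTimeout > 0) {
        // The attribute forces caching for a fixed number of seconds.
        return (int) $cacheTimeout;
    }
    if ($hasDynamicBlocks) {
        // Not cacheable by default and nothing forces it.
        return null;
    }
    // Ordinary cacheable page.
    return 0;
}
```

For example, a forum summary page with a dynamic block and cacheTimeout set to 60 would be cached for one minute, while the same page without the attribute would not be cached.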

@frz: I was looking at the dispatcher earlier today, and I mapped out a simple idea to retrieve page cache right after the database and cache were loaded. That would require a minimum number of database accesses and bypass most things that are loaded in the dispatcher.

If the database could be bypassed altogether, that would be fantastic. It seems like it could be a tricky problem, but perhaps there is a simple solution. I'll have to think about it.
Remo replied on at Permalink Reply
Remo
I see, didn't know about the apc memory limit but it makes sense of course..

In my case it looks a bit different. I use files & memcache with quite a lot of storage. Having about 10'000 in the cache isn't a big deal.

If I'd use apc I'd simply increase the memory
apc.shm_size="128" and I could cache all pages. But you're right, if a search bot generates some non-human like page hits we have a problem

If something like this makes it into the core: One important point - I tried to fetch the output using the already used output buffer. Didn't seem to work without rewriting some parts of the core. This addon uses an additional http call which is of course silly! But I couldn't find a way to put this into an addon without that..

If we create an attribute for this - how would we make sure that non-cachable pages aren't cached? Blocks with unique output can never be cached. So far it's only the form block. Please note, it's not about dynamically generated output, it's about unique content on each call which gets validated by the controller. Each client / session must have its own ID, you cannot even cache it for 1 second!
arcanepain replied on at Permalink Reply
arcanepain
Thanks Mose, but i'd already checked permissions...all fine there. I've done some more testing (trial & error is my speciality!) with the help of the log, and the problem seems to be with the getContents() function...it's returning null! I've checked the getCollectionURL() bit, and that's grabbing the URL fine...so it must be failing in the getContents bit. I've had a look at the file helper and there's actually a credit to you there Remo, so i'm guessing we've you to thank for it...file_get_contents on its own returns nothing too so I don't think the problem is there...I think the problem must be how Concrete is handling the URL when PHP goes to grab the page. Pretty URLs problem? I'm guessing so... I've tried manually entering the page path to one with index.php in there and one without. Makes no difference. Definitely works though, because I shoved http://www.google.com in there, and it grabbed that fine and started serving it on my site! Haha...glad we don't get much traffic on the weekend as that would definitely have confused some people!

I know this is knocking it back to my own personal install of C5 & server, but here are my .htaccess settings i'm using to make Pretty URLS work on my shared host (I found them after much trial & error and digging around these forums in the past).

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.php [L]
</IfModule>

Only other change is the in site.php.:
define('SERVER_PATH_VARIABLE', 'REDIRECT_URL');

any thoughts, as usual, most appreciated! :-)
Remo replied on at Permalink Reply
Remo
not sure why there's a credit but it might be because of a fallback solution in getContents. I think the method should be pretty solid as it uses curl and file_get_contents..

Can't imagine why it doesn't work.

Did you try to create a simpler script like echo $fh->getContents('c5-page');

?
arcanepain replied on at Permalink Reply
arcanepain
Yep...I did. Tried all sorts of things. No variant of the URL (even one with the page id appended as a cID query string off index.php) worked...none returned any data.

I think I may have traced the problem to cURL...i've tried disabling the timeout, but that hasn't helped. I've tried a bit of hunting on Google and i've discovered null return strings are not uncommon on shared hosts using cURL. Hmmm...what are my alternatives? I've determined that it definitely is using cURL (ie. is passing the 'function_exists' test) on my host, but is there another way to grab this? If I try and enable fopen for example?
arcanepain replied on at Permalink Reply
arcanepain
Actually...looking at my phpinfo.php, fopen IS allowed already. Hmm...why might ini_get('allow_url_fopen') not be working then?
mose replied on at Permalink Reply
mose
fopen is not enough. The URL wrappers for fopen must be enabled to allow fopen to retrieve a web page just as if it were reading a file. The php.ini file should have lines like the following.

;;;;;;;;;;;;;;;;;;
; Fopen wrappers ;
;;;;;;;;;;;;;;;;;;

; Whether to allow the treatment of URLs (like http:// or ftp://) as files.
allow_url_fopen = On
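
To see what the running PHP actually has enabled (rather than what the php.ini file on disk says), something along these lines can help. fetchCapabilities() is a made-up name for this sketch; ini_get() and function_exists() are standard PHP.

```php
<?php
// Hypothetical diagnostic: report which ways of fetching a URL
// this PHP build and configuration offer.
function fetchCapabilities()
{
    return array(
        // True when the cURL extension is loaded.
        'curl' => function_exists('curl_init'),
        // ini_get() returns the setting as a string ("1", "", "0", ...);
        // normalise it to a real boolean.
        'allow_url_fopen' => (bool) filter_var(
            ini_get('allow_url_fopen'),
            FILTER_VALIDATE_BOOLEAN
        ),
    );
}
```

Dumping the result with var_dump(fetchCapabilities()) from a page served by the web server shows the values for that environment. Both being true still does not guarantee a request succeeds if the host blocks it at the network level.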
mose replied on at Permalink Reply
mose
Check what URL is being returned. Just before the getContents() line, add the following logging statement.

Log::addEntry("URL is " . $nh->getCollectionURL($c));
mose replied on at Permalink Reply
mose
The rewriting rules are shown in Sitewide Settings in the dashboard when pretty URLs are turned on (in 5.4.0, anyway). Here they are.

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L]
</IfModule>

Your rewriting rules are missing some important pieces, but maybe the forum has messed that up. I will see in a moment when I post this. If your rules don't look like the rules above, you should update your rules.

I don't think SERVER_PATH_VARIABLE normally needs to be set, unless it is related to your shared hosting or something.
arcanepain replied on at Permalink Reply
arcanepain
Thanks for helping me out with this Mose. Ok...not too sure about the fopen wrappers bits (certainly nothing that looks like that in my php.ini!) but allow_url_fopen is already on when I look at my phpinfo.php.

Yes...I tried logging that already. Returns the URL perfectly -- the pretty url version of the address and everything.

In terms of the mod rewrite, the suggested code doesn't seem to work with my shared hosting unfortunately. I'm with Heart Internet in the UK...never got to the bottom of why it didn't work, but I know i'm not alone with the problem and, as I said, it's the same Mod Rewrite code that is posted in a couple of topics around this forum. Can't remember where I picked up the SERVER_PATH_VARIABLE bit, but it doesn't work without it...think I picked it up from a post of Frz's. The site just redirects every page request to the homepage if I turn it off.

Damn...think we've hit a bit of a brick wall here, haven't we. Out of interest, has this cache addon worked for you with pretty URLS turned on?
mose replied on at Permalink Reply
mose
Yes, caching works with pretty URLs. Pretty URLs was one of the first things I turned on after installing Concrete5.
mose replied on at Permalink Reply
mose
If your host is redirecting requests, maybe there is something that doesn't work when the request comes from the same host. That is, it can't or doesn't redirect local requests. That would explain why no page is being returned.
arcanepain replied on at Permalink Reply
arcanepain
Sounds plausible to me, and i'm really running out of ideas here. I've even had a play around with the infuriating Pretty URLs and that hasn't helped at all. I'm guessing you're on the money about the server not accepting requests from itself as I KNOW caching works if I plug in an external site manually as the page to cache, and no matter how specifically and directly I reference my Concrete pages, it still returns null.

Any idea how I would go about chasing my hosting provider up on this? A specific setting or variable or PHP config item they might have off that needs to be on? Haha...I just don't know how to describe/term it to have a go at getting them to fix it!
mose replied on at Permalink Reply
mose
Not a clue. They are already doing something funny by redirecting to your directory. You might just explain to them that your code is trying to retrieve a web page from your site so that it can store the page in cache, but no web page is returned. Point out that the request is coming from the web server itself, and ask them if the request is being redirected correctly like it is for requests coming from the Internet. They should be able to look in the web server log to see how the request is coming in and then figure out where it should go.
arcanepain replied on at Permalink Reply
arcanepain
This is promising! They actually got back to me!

"Our server firewalls intentionally block loopback connections to the same webserver (e.g. using cURL), so using this method is not actually possible. However, any information which can be requested via an external request (cURL), can also be requested via its absolute path. So rather than making a request for http://www.mysite.co.uk/somefile... you should instead use /home/sites/mysite.co.uk/public_html/somefile."

Hmm...I sort of follow them, but this poses two questions. 1, can the remo_cache.php be modified to grab this path and 2, would the absolute path even work with the concrete index.php or dispatcher.php or whatever handles the page requests?
arcanepain replied on at Permalink Reply
arcanepain
Hi guys,

Sorry...PROMISE i'll stop pestering you on this and put it to rest v. soon, but does my above post about the absolute server path spark any ideas? Can you think of any way to capture the page this way?

I've tried file_get_contents('absolute server path'), fiddled with fopen a bit (don't really understand that though)...er, and that's about it! Sort of extent of my knowledge on this, and repeated searches on google haven't turned up any suggestions at this point that work. Any thoughts?
mose replied on at Permalink Reply
mose
While their suggestion to you makes sense and would work under ordinary conditions, it does not work in this case. The absolute path could be used with a static page stored on the web server.

However, Concrete5 is a CMS. In order to get the web page out of it, a request must come through the web server and be passed to C5 for processing. At this point, it looks like there isn't anything more that you can do to make page caching work with your website the way the module is currently written.
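
The difference can be seen with a tiny throwaway script. The file name and contents below are invented for the demonstration and have nothing to do with concrete5's own files.

```php
<?php
// Throwaway demo file; the name and contents are made up for illustration.
$path = sys_get_temp_dir() . '/c5_demo_' . uniqid() . '.php';
file_put_contents($path, '<?php echo "<html>Hello</html>";');

// Reading by filesystem path returns the PHP source, unexecuted.
$source = file_get_contents($path);

// Executing the script and capturing its output gives the rendered page.
ob_start();
include $path;
$rendered = ob_get_clean();

unlink($path);
```

Here $source holds the raw PHP code while $rendered holds the HTML. Reading index.php by absolute path gives you the former; the page cache needs the latter, which for a CMS means the request has to be processed by C5.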

All is not lost, though. When page caching is incorporated into C5, C5 will generate the web page internally and cache it. That should work just fine with your web host.
myFullFlavour replied on at Permalink Reply
myFullFlavour
I kind of got lost with all this discussion... I have another cache'ing idea being looked at by another PHP developer... If that eventuates into anything I'll let you know.

But I did just spot this thread:
http://www.concrete5.org/community/forums/chat/wow-5-4-apc-fast/...

So am I correct in assuming

- If APC is installed on the server
- php.ini is adjusted to the following:
apc.shm_size = 64

- & on each 5.4 install this line is added
<?php  define('CACHE_LIBRARY', 'apc'); ?>
to the concrete5 config file

We are in business with fast loading websites?

Thoughts?
mose replied on at Permalink Reply
mose
You are pretty much correct in your assumption. Those are the right variables to define. A site should perform quite a bit faster with APC than without. APC is used to cache the opcode version of the PHP scripts so that they don't need to be loaded into memory and compiled each time a web page is accessed. That saves a lot of time. Concrete5 5.4.0 itself now caches more information than before.

While APC can result in significantly faster operation, it won't solve every problem. If you have a chunky web page with lots of images, Flash, JavaScript and CSS, the web page may still load slowly. Everything you would do to speed the loading of a normal web page still needs to be done here.
andrew replied on at Permalink Reply
andrew
That should definitely result in some performance improvements.

One thing to note, however: your site ought to be on a dedicated box or vps instance to use the code above. Multiple sites on the same server should not all use the APC caching layer on concrete5. APC itself can be installed fine (and will still result in a speedup for all sites on the box) but if you're actually defining the concrete5 CACHE_LIBRARY make sure there's only one site doing so on the box.
mose replied on at Permalink Reply
mose
The issue of multiple Concrete5 installs could be resolved with something like defining a site ID in site.php and then have cache incorporate the site ID into the key. All sites could use APC without a collision.
myFullFlavour replied on at Permalink Reply
myFullFlavour
OK Mose, what code do I need to put into my config file to achieve what you just mentioned?

Andrew, is this do-able / the right way forward?
myFullFlavour replied on at Permalink Reply
myFullFlavour
(note I'm hosting about 100 odd sites on one dedicated box)
mose replied on at Permalink Reply
mose
If SITE is defined, which it should always be, then that could be used as a unique identifier without requiring anything else to be added in site.php. This would require a change to the core code, though, to incorporate the identifier into the cache. From a quick scan of the code, it looks like a small change would solve this problem.

Edit ../concrete/libraries/cache.php and change these lines at the top

public function key($type, $id) {
                return md5($type . $id);
        }


to include SITE.

public function key($type, $id) {
                return md5(SITE . $type . $id);
        }


I have not tested this change. Use at your own risk, etc., etc. This should allow each site to store its own, unique entries in the same cache. If you use APC and give it sufficient memory, every site should experience a speed increase. Report back on the results if you try this.
andrew replied on at Permalink Reply
andrew
I'm a little concerned about this approach, only because SITE doesn't get defined at the very beginning of the process. It will be there eventually, but there are config values and things loaded from the cache before SITE, so early config values might be shared/cached across all your sites, which would lead to some really weird debugging issues.

I believe you could use BASE_URL . DIR_REL to accomplish this, however, and it might be more reliable as it's unlikely to be duplicated across sites. This could lead to some long source keys (before they get run through md5()) but that probably doesn't matter.

One last thing: There are a few times when APC flushes the cache. This will flush the cache for the entire server, NOT just the site that runs the flush command. This probably won't matter, but I thought I'd mention it. Certain other caching layers like Xcache's user cache make flush() something you can only run when you're logged in to the Xcache administration console, so it might be wise to test out clearing the sitewide cache from the dashboard after you put this change in place.
mose replied on at Permalink Reply
mose
Good call on the availability of SITE. I was trying to think of something short, but as you say, it probably doesn't matter. BASE_URL . DIR_REL should be guaranteed to be unique, and it will always be available.

If speed or length was a concern, you could create a constant, such as CACHE_ID and set it to BASE_URL . DIR_REL in site.php during installation. Someone who wanted to tweak things could manually change it to something shorter.

Having a unique value early in the process becomes even more important if caching is embedded in c5. Avoiding the database means that the unique value must be defined in site.php.
myFullFlavour replied on at Permalink Reply
myFullFlavour
So how do we achieve this then?
What code needs to be (replaced/added/removed) in the site config php file?
mose replied on at Permalink Reply
mose
As far as I know, no one has tested this, yet. You would add this line to site.php.

define('CACHE_ID', BASE_URL . DIR_REL);

Then, edit <root>/concrete/libraries/cache.php and change these lines near the top

public function key($type, $id) {
                return md5($type . $id);
        }


to include the CACHE_ID.

public function key($type, $id) {
                return md5(CACHE_ID . $type . $id);
        }


CACHE_ID could actually be defined as anything, but its value must be unique for each Concrete5 site running on the same web server. Using BASE_URL and DIR_REL is an easy way to make the configuration the same among all of the hosts and also ensure that the value is unique.
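
To see why this keeps sites apart, here is a standalone illustration. cacheKey() is a stand-in written for this example that takes the identifier as a parameter instead of a constant, and the two site URLs are invented.

```php
<?php
// Standalone illustration; cacheKey() is a hypothetical stand-in for the
// modified key() method, with the site identifier passed in explicitly.
function cacheKey($cacheId, $type, $id)
{
    return md5($cacheId . $type . $id);
}

// Two invented sites on the same server (BASE_URL . DIR_REL for each).
$siteA = cacheKey('http://site-a.example' . '', 'page', 1);
$siteB = cacheKey('http://site-b.example' . '/blog', 'page', 1);
```

The same type and ID now hash to different keys for each site, so both can share one APC cache without reading each other's entries.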
myFullFlavour replied on at Permalink Reply
myFullFlavour
This could be a winner. However, where do we mention APC? Your code mentions APC nil times, so it isn't being called?
mose replied on at Permalink Reply
mose
You always want to use APC, if you have that option, even if c5 can't take advantage of it. APC will be used by the web server to cache compiled PHP scripts, which will make things faster.

To tell c5 to use APC for its caching rather than use the default file cache, add the following line to site.php.

define('CACHE_LIBRARY', 'apc');
hursey013 replied on at Permalink Reply
hursey013
Has anyone tested this approach yet? I will try as soon as I get a second, just want to make sure I'm not duplicating efforts.
hursey013 replied on at Permalink Reply
hursey013
And just so I understand correctly, by giving each site a unique identifier, it will be caching all of the php files of EACH c5 instance? It would seem that if my one site is barely fitting within the 64mb of shared memory, doubling the number of pages APC caches by adding another instance will require more shared memory? Is that correct?
synlag replied on at Permalink Reply
synlag
How does this work with a shared core install?
Is there a whitepaper about that for hosting and service partners?
mose replied on at Permalink Reply
mose
Just a note to say that I installed a second, separate test site with C5 on the same web server running APC. Caching for both websites works correctly with the CACHE_ID modification.
cssninja replied on at Permalink Reply
cssninja
Unfortunately I do not have the APC on the server. But the solution that gave mose is very good. My website works much faster. The customer is also satisfied.
Thanks for your help!
PerryGovier replied on at Permalink Reply
PerryGovier
Mose: I should have read this through a bit more before posting a new thread in beta. I believe this solves my issue with multiple sites using APC.

C5 Crew: any plans of making this default? Maybe having the unique ID being the DB name?
katz515 replied on at Permalink Reply
katz515
Mose,

I just wanted to express my thank you.

Just uninstalled eAccelerator, and installed APC on my mediatemple (dv) 3.5 server.



My website

http://yokosonews.com/

is bullet fast now!



If you want to install APC on Media Temple (dv) 3.5 server, here are a couple instructions

***** You need to log-in as (mt) customer to view the post

https://forums.mediatemple.net/viewtopic.php?id=4212...
arcanepain replied on at Permalink Reply
arcanepain
Great that Mose's (and Remo's) solution (with or without APC) has been so helpful in speeding up our C5 sites, but has anyone else come across the same problem as me on shared hosting -- my hosting provider doesn't allow 'loopback' requests. ie. when the page cache addon decides a page can be cached and it requests the page back through cURL, it is blocked and, subsequently, returns nothing.

Anyone got any ideas? It's just that i'm wavering on the cusp of being able to justify a dedicated server, but the nightmare of moving my sites over and having to face the possibility of managing my own server is not appealing. Haha...anything for a simple life! Most of my sites are simple and fairly static, and I think the page cache add-on alone would be more than sufficient in keeping things chugging along nicely for the time being.
myFullFlavour replied on at Permalink Reply
myFullFlavour
Sending you a message..might have a solution for you..
mose replied on at Permalink Reply
mose
@hursey013: Several things about your understanding are not correct. APC will only cache an individual php file one time. In a shared-core installation, there is only one instance of the php file on the disk. As a result, that one php file will only be cached once, even though multiple websites are using the same file. That's a win for everyone.

If you have a multi-core installation, then there will be multiple copies of the same php file on disk, and APC will cache each copy when it is accessed. Theoretically, you could have 10 copies of index.php in cache, if there were 10 individual Concrete5 websites on the server.

Caching of php files has nothing to do with a unique cache identifier in Concrete5. Concrete5 has nothing to do with caching php files. The php module in the web server (or fastcgi) uses APC to cache php scripts. If you install APC, php scripts will be cached, even if you don't tell Concrete5 to use APC.

Concrete5 does not cache web pages (at this time). Concrete5 only caches a small amount of information about pages, and this information is typically dwarfed by the size of the cached php scripts.

If you barely have enough memory, now, APC is probably not a solution for you.

@synlag: The steps I described above are for a shared-core or multi-core installation on the same web server. There is no advantage to these changes for a single-core installation.

There is no whitepaper. The information I describe above is the sum total of what you need to know. This is an experimental procedure that requires the modification of a core file (or you could copy cache.php to a local directory). It is believed to be correct, but it has not yet been tested, as far as we know.
hursey013 replied on at Permalink Reply
hursey013
mose - thanks for the info. Shortly after I posted my question I realized it didn't make sense, I forgot that APC was caching files even if "define('CACHE_LIBRARY', 'apc');" wasn't included in the site.php. So when I was looking at the graphs in apc.php it was already representative of every php file on that server - I was thinking incorrectly that it was only using files for the c5 site that I enable APC with. Thanks for clearing that up.
bebop1065 replied on at Permalink Reply
This is a great addition.

My site loads almost instantaneously!

I'm really happy with this.

Thanks to the smart people that made this happen.
focus43 replied on at Permalink Reply
focus43
You guys who wrote this absolutely rock. I feel like this was the last big hurdle that may have been making people hesitant to launch enterprise-level, heavy duty sites in production with C5 - which, with this plugin, now should no longer be an issue.
focus43 replied on at Permalink Reply
focus43
Can we get a benchmark on number of pages served in 60 seconds now with this plugin installed?
arcanepain replied on at Permalink Reply
arcanepain
Hi all,

Been a while since I posted on this one (and the whole PageCache add-on as it exists here might be redundant in the forthcoming version of Concrete) but Jesse over at FullFlavour has hooked me up with a specially configured VPS (fully recommended! PM him if you're in the market!) so, shared hosting behind me, i'm back in business to test this!

Installed the PageCache on a couple of sites last night, which I had already APC-enabled with the switch in site.php. Pages took an age to load and by using the log, I could see that multiple attempts were being made on each page load to submit to cache. The cache submissions looked well formed, but nothing was actually being successfully submitted, hence it would try over and over again with the same page, and generate a dozen or so entries in the log (some blank, some complete).

Removing the APC switch in site.php instantly fixed this, and caching went ahead for the valid pages as intended. So, does this mean that APC and the page cache add-on can't exist side by side? Even without the switch in site.php APC is helping anyway, right? Maybe in that way you enjoy the benefits of both. If it makes any difference, I made the cache.php modification in the core as described above and as is clarified in FullFlavour's post on the topic elsewhere in the forums.

On the page cache add-on though, did run into one problem - a cache entry was being created for the 'Download File' page, which Concrete did NOT seem to like. download_file/view/### stopped working entirely, and completely tripped up the banner rotator I had going on my pages, embedded via Elyon's Flash Embed block. Easy enough fix - grabbed the collectionName() along with the $cID in the remocache.php, and told the add-on NOT to cache if it came across 'Download File' as a valid page for cache entry. Maybe not the best way of doing it, but all working fine now. Might be worth bearing in mind if people are having any funny problems with it. APC or not, sites are now working A LOT faster with this!
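
For anyone wanting to do the same, the check amounts to something like the sketch below. The function name and the $excluded list are made up for illustration; the add-on's real variable and method names may differ.

```php
<?php
// Hypothetical check; the add-on's real variable and method names may differ.
// Returns false for any page whose name is on the exclusion list.
function isCacheablePageName($pageName, array $excludedNames)
{
    return !in_array($pageName, $excludedNames, true);
}

// Page names that should never be written to the page cache.
$excluded = array('Download File');
```

The add-on would call isCacheablePageName() with the page's name just before writing to cache and skip the write when it returns false. Matching on the page path instead of the name would probably be more robust if pages get renamed.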
codecloud replied on at Permalink Reply
codecloud
I'm brand new to concrete5, but am fairly experienced with high traffic websites.

I think the best way to achieve high throughput for anonymous users (that is, those that haven't logged in) is to perform write-through caching at the webserver level.

Basically, the web server holds a cache of HTML files - either in memory or on the file system. When a request is made to the website, the web server checks to see if there is a valid cache entry for that URL. If there is, then it serves straight from the cache and doesn't ever push the request through to PHP. Obviously the performance benefits you get from caching at this level are huge.
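
In code terms the lookup is roughly the following. This is only a model of the decision, written in PHP for readability; in practice the check happens in the web server or a proxy in front of it, and the function name and cache layout here are invented.

```php
<?php
// Hypothetical model of a front cache lookup.
// $cache maps URL => array('html' => string, 'expires' => unix timestamp).
// Returns the cached HTML, or null when the request must go through to PHP.
function serveFromFrontCache(array $cache, $url, $isLoggedIn, $now)
{
    if ($isLoggedIn) {
        // Logged-in users may see personalised content: never serve from cache.
        return null;
    }
    if (!isset($cache[$url])) {
        // No entry for this URL yet.
        return null;
    }
    if ($cache[$url]['expires'] <= $now) {
        // Entry exists but is no longer valid.
        return null;
    }
    return $cache[$url]['html'];
}
```

Only a null result reaches PHP and the CMS; everything else is answered straight from the cache.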

I've used Varnish (http://www.varnish-cache.org/) before on other applications and have seen the incredible performance that it offers. Serving web pages from memory - so that you don't even have to wait for the disk reads on the webserver - is the way forward!

I'm planning to create a site in concrete5 soon and will be creating a plugin that will create a bridge between the CMS and varnish. When it's done I'll be happy to share the code with those who are interested.
andrew replied on at Permalink Reply
andrew
Thanks codecloud. We'd be very interested in that code. I agree about Varnish – we want to make sure concrete5 makes every effort to be usable in such a configuration, including the ability to flush varnish cache from concrete5, control Varnish caching from within concrete5 cache settings, etc... Any thoughts on that? Obviously since it's a server that sits in front of the web server you can't rely on the middleware layer even necessarily being accessed in a given request.
codecloud replied on at Permalink Reply
codecloud
Sure, varnish provides a CLI through which you can invalidate cache entries, flush the cache etc.

It's pretty straightforward to use. You just need to define a naming convention for the cache entries that the CMS also uses, then it's trivial to invalidate groups of pages, individual pages or the entire varnish cache.

As a sidenote, it's worth pointing out that this will only really work for anonymous traffic. If you've got a site that has lots of members that are signed in, all with personalised content, then it's unlikely you'll benefit from using varnish.
andrew replied on at Permalink Reply
andrew
Yeah – we're pretty aware of that. This is more for high-traffic, largely-static content sites, rather than a site like ours, for example.
frz replied on at Permalink Reply
frz
Hi,

We're actually working on some Varnish hooks now. PM me if you'd like to collaborate.

best wishes

Franz Maruna
CEO - concrete5.org
http://about.me/frz
sulate replied on at Permalink Reply
sulate
I think the first, relatively simple(?) and still a major step here would be just to utilize the full page cache functionality by setting appropriate Cache-Control and Expires headers for pages that can be cached externally.
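
A minimal sketch of what that could look like, assuming the CMS already knows how many seconds a page may be cached. externalCacheHeaders() is a made-up helper, not an existing concrete5 function; the header names themselves are standard HTTP.

```php
<?php
// Hypothetical helper: build the response headers for a page that may be
// cached externally for $lifetime seconds, starting at unix time $now.
function externalCacheHeaders($lifetime, $now)
{
    if ($lifetime <= 0) {
        // Page must not be reused without checking back with the server.
        return array('Cache-Control: no-cache, must-revalidate');
    }
    return array(
        'Cache-Control: public, max-age=' . (int) $lifetime,
        // Expires uses the HTTP date format, always in GMT.
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $lifetime) . ' GMT',
    );
}
```

Each returned string would be passed to PHP's header() before any output is sent, for example with foreach (externalCacheHeaders(300, time()) as $h) { header($h); }. Browsers and proxies in front of the site can then reuse the page without asking PHP again until it expires.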