It All Started with Files
In concrete5 5.2.0, we introduced a completely new file manager. Files were now full-fledged objects with extensible metadata, permissions and sets. Retrieving information about a particular file object was much more expensive (in terms of database queries) than it used to be.
All of a sudden, certain add-ons like galleries which gathered large amounts of files at once were quite slow. They had to run hundreds of queries just to render a page.
Caching to the Rescue
In web applications, caching typically means retrieving an object from a faster place (be it RAM or a serialized object on a hard disk) rather than reconstituting it entirely from the database. This lessens the load on the database and can dramatically improve performance. We added our own custom caching layer, which stored objects in the file system, and this problem went away. We liked this solution so much that we added more objects to this cache.
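To make the idea concrete, here is a minimal sketch of a file-based, read-through object cache. It is written in Python for illustration only and is not concrete5's actual code; `FileCache`, `load_file_record` and the `.cache` file naming are all hypothetical.

```python
import os
import pickle
import tempfile


class FileCache:
    """Hypothetical read-through cache: one pickled file per entry."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.directory, f"{key}.cache")

    def get(self, key, loader):
        path = self._path(key)
        if os.path.exists(path):
            # Cache hit: read the serialized object back from disk.
            with open(path, "rb") as fh:
                return pickle.load(fh)
        # Cache miss: rebuild the object, then store it for next time.
        value = loader()
        with open(path, "wb") as fh:
            pickle.dump(value, fh)
        return value


db_hits = 0


def load_file_record():
    """Stand-in for the expensive set of database queries."""
    global db_hits
    db_hits += 1
    return {"id": 7, "title": "photo.jpg", "sets": ["gallery"]}


cache = FileCache(tempfile.mkdtemp())
first = cache.get("file_7", load_file_record)   # miss: runs the loader
second = cache.get("file_7", load_file_record)  # hit: read from disk
```

Note that even a hit costs a file system lookup and a read, which is exactly the cost that matters on hosts with slow disk I/O.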
Enter Zend Cache
When we started working on localizing our add-ons (so they could be included in translation) we realized that it would make a lot of sense to use Zend Translate, a library included with the Zend Framework. Zend Translate makes extensive use of Zend Cache. We took a look at Zend Cache, and realized that without a lot of work, we could remove our custom caching layer, replace it with Zend Cache, and automatically get access to different cache backends, like APC, Xcache, memcached and more. It was a no-brainer.
Not entirely. Zend Cache has always had its share of quirks. The biggest one? The default cache, which stores objects in the filesystem, gets slower and slower the more objects you place in it, due to some garbage collection routines that Zend Cache uses. If you disable garbage collection, the cache is fast, but it fills up with millions of files. This was exacerbated by the fact that, as we added features like extensible attributes, layouts and full page caching, we found ourselves storing more and more objects in that cache.
And the worst part? While we were trying to lessen database access, we just shuttled the problem over to the file system, which now had to deal with thousands of small files on every request. Yes, this problem goes away if you configure an alternate cache backend, but most concrete5 sites don't do that. On web hosts with slow disk I/O this was a killer: hitting a site with no cache entries or out of date cache entries could take a really long time. Oftentimes the next access would be faster, but not always; the filesystem could be a significant bottleneck on shared hosts, and it was clear that caching was hurting as often as it was helping.
Cache Bigger Objects: Enter 5.6
We recognized this, so we tried to cache larger and larger objects in fewer and fewer cache entries. The idea was that if we could still use the cache but access it less often we'd get the benefits of the cache without as much of the lookup slowness.
This came to a head in concrete5 5.6.0. Our completely rebuilt permissions system was designed from scratch to be extensible, powerful and able to handle anything. Unfortunately, it requires quite a bit more database access to work with. Rather than cache each permission lookup, we ran all permission lookups every time a page was requested, and cached that in the page object. This worked, but performance was significantly degraded for sites that had caching off. And when I say significantly degraded, I mean significantly degraded. I think at one point I benchmarked around 700-800 queries running on a simple page when caching was disabled.
This was where we were in the Fall of 2012: concrete5 was more extensible than ever, but also slower than ever. Sites would fail due to slow disk I/O and the response was either "disable your cache" or "enable your cache." Disabling the cache would bloat the database queries by the hundreds, many of them duplicates. It was clear that caching was causing more problems than it was solving.
At this point, we had several layers of performance mechanisms:
- The in-memory cache: powered by a simple array, the in-memory cache (aka "CacheLocal", if you're curious) is a way of storing a requested object in memory, so that it isn't retrieved from the file system again if it has already been retrieved once in a given request.
- The basic cache: powered by Zend Cache, this was the object-based caching, usually performed by the file system, that I've described above.
- Block caching: introduced in 5.4.0, block caching is a powerful way of preserving all that is great about blocks while allowing developers to mark ways that their blocks can be cached, in order to preserve their performance on render.
- Full Page caching: also introduced in 5.4.0, full page caching is a way to store the entire contents of a page and render it ideally without accessing the database.
- Overrides Caching: introduced in 5.6.0, overrides caching was our way of saving the state of your overridden file system to a Config variable, so that we don't have to do as many file_exists() checks throughout our code.
While these were all noble attempts at solving the performance problems of a complex and flexible application, they were almost all failing:
#2 failed because too many items were being placed in the cache, the file system is slow on many web hosts, the item still has to be requested from the database and then written to the file system, and it just wasn't much faster to request from the file system than it was to get out of the database. Furthermore, by forcing more data to be written into the cache in order to create fewer entries, we actually did many more database queries to make #2 viable than we normally did. This killed performance especially when cache was disabled.
#3 was great in theory, but it used #2 as its backend, which meant all the problems that plagued #2 filtered down to #3. This was the same problem with #4, but in addition #4 suffered from firing far too late in the loading process. Practically, #4 never ran without still connecting to the database, which rendered its benefits minimal. Finally, #5 was a good idea in theory, but still required a database connection, and was confusing, because it couldn't be cleared by deleting the files/cache/ directory in the filesystem: its data was stored in a different way.
The Present: 5.6.1
It's no secret: we have great plans for concrete5 5.7.0. But these plans are somewhat contingent on concrete5 running everywhere, and running everywhere well. We need to work well on good servers, and work acceptably on even the slowest ones. We simply cannot afford and will not accept performance continuing to be such a problem.
So I set out to fix it. To do this, I decided to look at what worked, and throw out what didn't. The override cache worked and was a good idea, but storing its information in the database to get around the slowness of the cache was stupid. Full page caching was a great idea, and the options that triggered it were sensible. But relying on Zend Cache and firing late in the loading process? This was a bad idea. We had tried to make it so that controller actions and certain interactive items could still be used with full page caching. While this was neat in theory, it made full page caching much less effective, and most importantly just plain slower.
Block caching is a great idea. It occurred to me that if we had had block output caching first, when the gallery problems cropped up with the file block, we might never have implemented anything else.
What didn't work? The one-size-fits-all cache, using Zend Cache as a backend. At best, it still added uncertainty to concrete5 ("Something not working right? Clear the cache, I guess."). At worst, it degraded performance on first run and, for many sites, in perpetuity.
Let's be clear: Zend Cache is a fine piece of software. We will still continue to support its usage in our API and actively use it when interacting with Zend Framework components that use it (like Zend Translate). But shoving every object in there because we make too many database queries? That's like applying a bandaid to a cut with a pair of scissors.
You can currently download a beta version of concrete5 5.6.1 to see these changes in action. I would encourage you to do so, and to take it for a test drive on any web host. The cheaper the host the better.
What's been done? A lot.
We don't use the Cache anymore. It's still there and used for Zend Framework libraries, but rarely accessed anymore. I think the information about the dashboard picture of the day might use it but that's it.
We do use the in-memory cache. A lot. This is CacheLocal. The first time in a given request that a page, file, permissions record or other objects are requested, they are added to the local cache, and then subsequent requests for these items are retrieved from an array. This dramatically reduces database queries without incurring any disk I/O problems. Yes, you do occasionally run into odd cache-related bugs, and it could cause some problems for hosts with very restrictive memory settings, but these should both be rare, since CacheLocal doesn't persist across subsequent requests.
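As a rough illustration of the idea (not the actual CacheLocal code), a per-request in-memory cache can be sketched like this. The names `RequestLocalCache`, `load_page` and `handle_request` are hypothetical.

```python
class RequestLocalCache:
    """Hypothetical per-request cache backed by a plain dictionary."""

    def __init__(self):
        self._items = {}

    def get(self, kind, identifier, loader):
        key = (kind, identifier)
        if key not in self._items:
            # First request for this object: load it and remember it.
            self._items[key] = loader(identifier)
        return self._items[key]


queries = []  # records every simulated database query


def load_page(page_id):
    """Stand-in for the database query that builds a page object."""
    queries.append(page_id)
    return {"id": page_id, "name": f"Page {page_id}"}


def handle_request(page_ids):
    # A fresh cache per request: nothing persists to the next request.
    local = RequestLocalCache()
    return [local.get("page", pid, load_page) for pid in page_ids]


handle_request([1, 2, 1, 1])
queries_after_first = len(queries)   # pages 1 and 2 loaded once each
handle_request([1])
queries_after_second = len(queries)  # page 1 is loaded again
```

Because the dictionary is thrown away at the end of each request, there is nothing on disk to go stale, at the cost of repeating the first lookup on every request.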
We do use block-level caching, for records and output, but we store this information in the database. We're already requesting data from these database tables, so adding another column to the mix and displaying its contents is trivial (and much faster than looking in the file system for Zend Cache entries.) All the same block caching settings that were valid pre-5.6.1 are still honored. The best part about block-level caching: we save block output and information when a block is saved, which means the data is already there for subsequent requests, leading to much less slowness on initial request for a given page.
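Here is a small sketch of that approach, assuming a hypothetical `blocks` table with a `cached_output` column; the schema and function names are invented for illustration and use SQLite only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blocks ("
    "id INTEGER PRIMARY KEY, content TEXT, cached_output TEXT)"
)

render_calls = 0


def render(content):
    """Stand-in for the expensive work of rendering a block."""
    global render_calls
    render_calls += 1
    return f"<p>{content.upper()}</p>"


def save_block(block_id, content):
    # Render once at save time and store the output next to the record.
    conn.execute(
        "INSERT OR REPLACE INTO blocks (id, content, cached_output) "
        "VALUES (?, ?, ?)",
        (block_id, content, render(content)),
    )


def display_block(block_id):
    # The row is being read anyway; the cached column rides along.
    content, cached = conn.execute(
        "SELECT content, cached_output FROM blocks WHERE id = ?",
        (block_id,),
    ).fetchone()
    return cached if cached is not None else render(content)


save_block(1, "hello")
html = display_block(1)  # served from the column, no second render
```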
We do use the Environment cache, but it's stored as a serialized object in files/cache/environment.cache. Clearing the cache will remove this file, as will deleting files/cache/ yourself.
We have updated our CSS cache to store files in files/cache/css/. This is used whenever a theme uses customizable stylesheets. Files are generated and referenced directly without going through concrete5.
We have completely rewritten the full page caching library. Full Page caching is completely modular and extensible. It fires very early in the request. If a site enables full page caching, and the pages are in the cache, a site can completely lose its database connection and it will still render as though nothing is wrong.
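The key point is the ordering: check the page cache before anything touches the database. A minimal sketch follows, with a dictionary standing in for whatever storage the real library uses; `Database`, `page_cache` and `handle` are hypothetical names.

```python
class Database:
    """Stand-in for a database connection that may be unavailable."""

    def __init__(self, available=True):
        self.available = available

    def render_page(self, path):
        if not self.available:
            raise ConnectionError("database is down")
        return f"<html>rendered {path}</html>"


page_cache = {}  # assumption: stands in for the real cache storage


def handle(path, connect):
    # Check the full page cache first, before connecting to anything.
    cached = page_cache.get(path)
    if cached is not None:
        return cached
    # Only on a miss do we connect and do the expensive rendering.
    db = connect()
    html = db.render_page(path)
    page_cache[path] = html
    return html


first = handle("/about", lambda: Database(available=True))
# The database is now "down", but the cached page still renders.
second = handle("/about", lambda: Database(available=False))
```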
We have optimized and removed unnecessary queries. So many queries were structured in certain ways to appease the mighty Gods of cached objects that, when this was no longer a concern, they could be rewritten. Page and File custom attributes are an example of this: we used to retrieve all attributes about a particular file or page when that object was retrieved, in order to cache as much in the object as possible. Now we only retrieve this information on demand.
The results speak for themselves. concrete5 5.6.1 is much more responsive on many web hosts. Yes, it still makes robust use of a database (non-logged-in users will see around 50-70 database queries rendering a standard starting point page that's not cached in the full page cache) but this is a massive decrease from 188.8.131.52. And it's all being done without the cache adding extra disk I/O.
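On-demand retrieval of attributes can be sketched with a lazily evaluated property. This is an illustration of the general technique under assumed names (`FileRecord`, `fetch_attributes`), not the real API.

```python
class FileRecord:
    """Hypothetical file object that loads attributes only when asked."""

    def __init__(self, file_id, fetch_attributes):
        self.file_id = file_id
        self._fetch_attributes = fetch_attributes
        self._attributes = None  # nothing loaded yet

    @property
    def attributes(self):
        if self._attributes is None:
            # First access: run the attribute query and keep the result.
            self._attributes = self._fetch_attributes(self.file_id)
        return self._attributes


attribute_queries = 0


def fetch_attributes(file_id):
    """Stand-in for the database query that loads custom attributes."""
    global attribute_queries
    attribute_queries += 1
    return {"width": 640, "height": 480}


record = FileRecord(3, fetch_attributes)
queries_before = attribute_queries  # creating the object costs no query
width = record.attributes["width"]
height = record.attributes["height"]  # reuses the already loaded data
```

A page that never asks for attributes never pays for them, which is where the saved queries come from.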
We are still actively exploring how we might speed up some core block types like Auto-Nav and the PageList class. This may require some creative, not-necessarily-object-oriented problem solving, but we're up to the challenge, armed with some of the lessons we've learned.
One Size Doesn't Fit All
The features we'd added for the sake of performance weren't all hindrances but their impact was lessened because we tried to funnel them all through the same approach (a file-based cache.) Now, although we have some slightly more idiosyncratic approaches to solving these problems, the results are much, much better.
Keep the Good, Ditch the Bad
We've never been afraid to rebuild something in a better way.
Remember the Problem You're Trying to Solve
We should have realized we had a problem the moment we started trying to fix the cache. The cache isn't a feature of the software: the cache is a means to an end (faster software!) Focusing on actually reducing database usage throughout the application in sensible ways, rather than adding queries and shuttling their results into a cache once they've run, is going to serve us far better.
Let us know what you think.
(Reposted from Andy's Blog)
It's no secret: concrete5 is a beefy application. We pack in a lot of dynamic features, have a very extendable code base, include large libraries from the Zend Framework and offer powerful permissions and design flexibility. In turn, sometimes site performance suffers, especially on cheaper web hosts.
Unfortunately, somewhere along the line, performance problems started to become more than just a nuisance and threatened to become a meme. We had to do something about them.
It's more important what you say no to in life than what you say yes to. When we went open source, we stared at Google Analytics every day and rejoiced when we passed 100 visits a day. Now we average 10,000 visits in a day and we're the fastest growing open source CMS out there. In the early days we were just flattered that anyone cared, so we'd bend over backwards to provide any service we could to anyone who asked. As time went on, we tried out different ideas and dumped a few that turned out to be mistakes or just past their prime (pro accounts, support incidents to name just two). We're banking on 5.7 being a pretty big deal, so to make sure we're positioned correctly to support and promote it, we're focusing on a few things:
- No more budget hosting. To stay the size I want we have to choose where to put our hosting expertise, and we're focusing on the big boys. There's plenty of budget hosts willing to do a great job helping people get started with concrete5. We're going to have the free hosting with ads at concrete5.com, and those ads are going to suggest you get a paid hosting account to get rid of the ads. I don't expect to make much of anything on the ads (who knows) but I expect to make a lot on the hosting signup commissions. This would be awesome for us as it would help me track ROI on marketing spends.
- Enterprise Support SLAs. Since launching the enterprise site we've already sold a few of these guys with great success.
- Training. This is the #1 priority for growing the project. We've got to offer more training in more ways and get certification working. We're seeing huge system integrators looking for ways to train 1000 developers on concrete5, and what we've got today is only now starting to meet that need.
- Services. We're still helping a handful of clients with concrete5. We like having our feet in the water so to speak, as we believe it gives us insight into ways the tools need to improve. There's an old sales adage: "you can have it fast, cheap, and well done - now pick 1 (or 2)." It strikes me that we should be expensive and perfectly done. The more we move in that direction, the more room there is for the community to move up and be reasonably priced, fast, well done.
- The marketplace. For 2-3 years I talked about very little beyond the marketplace when people asked us how we were going to make money. I always got inquisitive looks when I was speaking to well informed folks, but I was passionate about it nevertheless. I believe we've made a more powerful solution to the module/add-on problem than all of our competition has today. I believe the PRB and community curated aspect of our marketplace is a key component to our project's success, and will always remain so. So many of us have been burned by Drupal modules that break other modules, or WordPress add-ons that just don't work. Bringing the barrier to entry up a bit and using Apple's App Store as a guide was a good idea and I have no regrets about it.
That being said, there are some real issues with it being the primary or only revenue model:
- Practically speaking, it has been flat for us for some time. It's just very dependable income each month but it never explodes and it doesn't seem to be that directly connected to our site traffic or demo signups. Practically the marketplace is not our primary revenue model today. Marketplace net profits cover of our monthly operating burn rate.
- We can't sell enterprise stuff there. People are outraged when we price an internationalization suite at $1,750 in the marketplace, but when I tell people through enterprise.concrete5.com that our enterprise suite costs $250k for an unlimited license no one blinks an eye. Moreover any enterprise that might want to grab a simple add-on out of there instantly has support issues. A large organization needs one point of contact for support issues and is willing to pay for that. Since we really can't promise failsafe support on add-ons we didn't create, even with the PRB, it's often difficult for the big boys to see the marketplace as anything more than a nice prototyping perk.
- It's a turnoff for the moral high ground open source crowd. The fact that we're MIT licensed instead of GPL is already a hard sell for some open source advocates, and then they look at our marketplace and exclaim that "all the good stuff costs money!" I can (and do) argue that actually about half the stuff in there is free and you can buy stuff for WordPress/Drupal/Joomla too. We just took the effort to make a curated destination with support and refund policies, but it still rubs folks the wrong way at a glance - and that is what it is. If someone thinks you smell, it doesn't matter if you know you don't: from their position, you smell. The idea of making something and selling it in the marketplace tends to appeal to the 1-2 man entrepreneurial cowboy webshop looking for different ways to augment their services revenue with something more dependable. Speaking as someone in that camp, I think that's great. Speaking as the leader of an open source project with the goal of being a ubiquitous building material for the web, I wonder if we may be letting ourselves get unfairly compared to more truly open source offerings out there by always saying the marketplace is our primary revenue concern.
- Our own add-ons are priced in a way to try to cover the costs of developing the core. What ends up happening is our add-ons aren't as good as they could be, because revenue that should be going back into improving them gets funneled into time on the core. We have too many small things in there which require too much support expertise. It also makes tracking ROI on any marketing spend difficult. The process of discovering concrete5 to choosing to buy an add-on is too squishy and hard to track for me to justify spending dollars marketing concrete5, so today I don't. I'd love to change that.
To address some of these issues, we want to change the way the marketplace works:
- 5% gross of any sale should go to a non-profit the customer picks from a list on checkout. I've wanted to do this from day one but foolishly never did. There's a clothing store here in Portland called Buffalo Exchange that gives you a wooden coin to put in one of three boxes if you don't need a bag. The boxes are each for a different charity that they cycle out monthly or so. They keep them very safe (no politics, religion, etc - stuff like shoes for kids or saving dogs). I believe that by doing this we will address two branding issues: First, I believe people's expectations about support from our add-on developers will be more reasonable. Second, I believe the "I can't believe this stuff costs MONEY" crowd will have the sense to keep their mouths shut, or risk looking very cheap.
- 2% gross of your purchases above $1,000 should go back to customers as rewards credits. Use them for more marketplace purchases or just cash it out and buy a beer. This gives me something to say to the shops looking for a reseller/affiliate/commission type deal.
- To make those happen, we'd ask marketplace developers to give up a 30% cut instead of the 25% we ask today. ;) Seriously though, this is one of those places where making it up on volume actually works. These changes and the ecosystem changes will make me feel comfortable spending money marketing concrete5 (which doesn't happen today.)
- We're not going to call it a Marketplace any more. It's simply the community, as we now will have concrete5.com the free hosting site as well.
- Our Add-ons - we're going to start selling fewer things in the marketplace. Some of our stuff should just be free at this point (superfish??). Some of our stuff will be rendered obsolete with 5.7 (discussions and calendar). With eCommerce we plan on having a basic free version, a $95 version that is a bit simpler to use than what we have today but is somewhat comparable, and a $295 version that includes complex product configuration and other goodies. We're going to start selling some themes as well.
We are speaking to many disparate audiences with one site at concrete5.org. Adding the enterprise.concrete5.com site has been very helpful, and was a no-brainer we should have done a long time ago. Enterprise CMS solutions are licensed in the low-mid 6 figure range, so it just doesn't make sense to put any of the tools we're working on for large clients into our marketplace at concrete5.org. Same goes for the support SLAs that are important.
A similar problem exists for consumers vs designers & developers at concrete5.org. If I'm a small business comparing concrete5 to the DIY web builder market, it just doesn't make as much sense at concrete5.org as it could. Sure, there's an instant setup link buried at the top of the page, but that is just a demo that we can't easily upsell into a hosting account.
To clear it all up:
- Redesign concrete5.org to feel more like a developer/designer community (bit cleaner design with a tad less whimsy). Improve the features around finding a community member and posting a job, but keep all that free. Focus more on documentation, including general cleanup and improved API documentation.
- Launch concrete5.com to be a one click install of concrete5 for free at Yoursite.concrete5.com. Much like WordPress, offer free hosting on a cloud server with ads forever. Want to lose the ads? Click a built in promo to one of our hosting customers and have your site packed up and moved automatically (see business changes).
- Make concrete5.org, .com, and the app itself work with Facebook and Twitter authentication. Also make 1-click installs at concrete5.com automatically connected to concrete5.org project pages without you having to do anything. Getting a site on concrete5.com would automatically make or use a user account on concrete5.org. There's just far too many logins required for people to understand today. Under the hood we kind of need it to work that way, but there's no reason we can't take some of the pain out of setting things up for the majority of new folks.
- We're debating just making concrete5.org a new site. With the new Conversations model we've got some real issues with our forums here (which are a one off). We're also starting to think that some of the legacy content in this site is going to be irrelevant with 5.7. Cleaning house is always appealing, and building a new site we can test in a sane way sounds nice too. There's certainly no perfect answer here, but we're entertaining any option at this point.
My Account & Social Networking
With the introduction of the Grid, all default concrete5 installs basically have a social feed built right in. To pull that off right we're going to bundle discussions and events right into concrete5 as well. Right now we sell add-ons for both, but these are just features we believe the majority of sites will benefit from and I'm willing to trade the revenue for growth. Discussions will get re-architected into Conversations which will have their own flat (read: performance optimized) structure in the core.
Just like Files, Users and Pages, conversations will have their own permissions and be easily centrally managed from the dashboard. Conversations will be organized with Topics - which amount to tags with hierarchy. Topics can have permissions applied. The result is you will be able to make topic relationships like this:
A conversation could be posted to Vegans, Fruit, Apples, etc. A person could be subscribed to any or all of those topics, or might not have permissions to see Meat at all. Conversations will be joined to pages through topics and a page ID reference.
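One way to picture topics as "tags with hierarchy" plus permissions is the sketch below. The tree shape (a Food root with Vegans, Fruit, Apples and Meat beneath it), the `Topic` class, and the rule that a denied topic also hides its descendants are all assumptions made for illustration; none of this is a confirmed 5.7 design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Topic:
    """Hypothetical topic: a tag that knows its parent."""

    name: str
    parent: Optional["Topic"] = None

    def ancestors_and_self(self):
        node = self
        while node is not None:
            yield node
            node = node.parent


# Assumed hierarchy, for illustration only.
food = Topic("Food")
vegans = Topic("Vegans", food)
fruit = Topic("Fruit", vegans)
apples = Topic("Apples", fruit)
meat = Topic("Meat", food)


def can_view(topic, denied):
    # Assumption: denying a topic also denies everything beneath it.
    return all(n.name not in denied for n in topic.ancestors_and_self())


def sees_conversation(conversation_topics, subscriptions, denied):
    # Assumption: one subscribed, viewable topic is enough to see it.
    return any(
        t.name in subscriptions and can_view(t, denied)
        for t in conversation_topics
    )


posted_to = [vegans, fruit, apples]
visible = sees_conversation(posted_to, {"Fruit"}, {"Meat"})
hidden = sees_conversation([meat], {"Meat"}, {"Meat"})
```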
Members will also have Relationships. These are typed connections between people. We will have a few out of the box and as a developer you can make more. Think Twitter follows vs Facebook confirmed friends, etc. Community points will be built in. Member profiles and member search will be more fleshed out and easy to use.
We started talking about this as "The zine" - a magazine made just for you - but everyone pronounced it "wine". I think the nod towards xeroxed zines from the '80s is appropriate. Take a Windows 8 or Flipboard style summary view, but give people enough control to curate their content. This should be a tool people want to tinker with.
We see using the grid layout for any number of UI challenges, so basically think of it as an alternate view layer for an improved page list block. We also want to make it so you can include 3rd party feeds in your grid. So you might include your company's Facebook, Twitter and YouTube feeds in a community page, along with press releases you're posting to the site. Interacting with any of the tiles in the grid makes a page for it on your site so we can store whatever data is required. You should be able to post back to your grid, and have it auto-post to the same services it ingests content from. Add-ons could/should create tile layouts so they can be aggregated in this view. Grids can have automatic sorting, filtering and sizing logic - combined with curated tiles. Tiles can have conversations and events attached to them as free tools in the core.
We will use grids for the main landing page on the demo site, and for your My Account area as much as possible.
All of these are very loose creative direction:
More in the 5.7 plans...
Editing & Page Creation
This new text editor is cross browser compatible, makes clean HTML, is bootstrappy in its styling, and generally is totally awesome. We've already integrated it with concrete5 so successfully that we can do text editing within the page instead of an overlay to our satisfaction.
This also works easily with the page in edit mode (skipping the click to put a specific block in edit mode), effectively giving us a middle edit state we're going to use for other things later.
The only challenge we're looking at here is what to do with legacy issues and TinyMCE. My own sense on this one is "hey, you upgrade, you lose old bad stuff and get sexy new stuff", so if there's a strong argument for why concrete5 needs to support both editors, I'd love to hear it.
As part of integrating redactor, we've been able to include a new snippets feature that amounts to Mail Merge in MS Word. You can pull in the current date, user name, page name, etc. We're architecting this in a flexible way so a developer might integrate with other data sources in handy ways.
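The mail-merge idea boils down to placeholder substitution. Here is a minimal sketch using Python's standard `string.Template`; the `$name` placeholder syntax, the `expand_snippets` function and the sample values are assumptions for illustration, not how the editor actually marks snippets.

```python
from datetime import date
from string import Template


def expand_snippets(text, context):
    """Replace known placeholders; leave unknown ones untouched."""
    return Template(text).safe_substitute(context)


# Sample values; a real system would pull these from the request.
context = {
    "date": date(2013, 1, 15).isoformat(),
    "user_name": "jane",
    "page_name": "About Us",
}

result = expand_snippets(
    "Hello $user_name, welcome to $page_name on $date. $unknown stays.",
    context,
)
```

Keeping the values in a plain mapping is what would let a developer plug in other data sources: anything that can supply a name and a value can become a snippet.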
Check out the video from a while ago...
Demoing editing a page on concrete5 always feels great: prospects just get it and quickly become clients. Demoing page creation is less awesome. Explaining to someone that your whole site is a tree and you have to go to the parent of where you want this page to be first is where they start to come back to earth and realize they need to pay attention. Composer starts to solve this problem a bit, but it also solves a completely different problem of making data entry screens for CRUD type problems. Adding more options rarely makes something less overwhelming. We're going to solve this problem by breaking it along more traditional lines:
1) What is composer today will mature into something that creates the scaffolding a developer needs to create a custom entry/edit UI in the dashboard. Use this for creating interfaces for inputting strongly typed content.
2) What is "Add a Sub Page" today will mature into something more geared for drafting as you go. You can add content from anywhere, and you start writing (with auto-save) before you have to decide what it is you're making or where you're going to want to publish it. There will be mobile and browser toolbar versions of this Create tool.
New Add Block Interface
Adding a block will be as simple as dragging its icon from the Add Block toolbar into an area at the exact point you wish it to display.
The fact that sometimes page types exist for layout and sometimes as an object model is not great. Someone should be able to add a calendar event or a blog post, and also switch its form factor. We're going to introduce a new concept called Feature that we can use to quickly determine if a page contains certain types of data for programmatic purposes. This should let us organically push page types towards the layout solution and de-couple functionality from requiring specific page types more. Blocks and add-ons could mark pages as implementing these Features, meaning that you'll never have to choose whether a page is a Calendar Event or a Product Detail page type.
We're currently rewriting Layouts completely. The next version of Layouts will honor permissions and permissions inheritance correctly, be better integrated into the concrete5 editing system, work correctly with area-specific developer methods, integrate better with themes, and be a bit more aware of modern style guidelines. Yes you can lock layouts and reuse them today, but it feels like a last minute feature. We want to make it easier for you to create snap-to points so a site owner's layout falls on common alignment lines.
We currently have three image editing solutions in concrete5: the avatar picker, the image sizer from composer, and the edit image from file manager. Some of them are Flash (ick). All of them suck. We're rebuilding all of these to be served by one attractive well thought out extendable image editor that the community can add filters/plugins to.
A sneak peek into the goals and vision for the next major release of concrete5...
We're back! We talk news, karma, review PRB submissions and answer your questions about our next big release, concrete5.7.
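A rough sketch of how Features might decouple "what a page contains" from "what page type it is" follows. The `Page` and `Block` classes, the feature names and the `has_feature` method are hypothetical, chosen only to illustrate the concept described above.

```python
class Block:
    """Hypothetical block that declares which features it provides."""

    def __init__(self, name, features):
        self.name = name
        self.features = set(features)


class Page:
    """Hypothetical page: its type only describes layout."""

    def __init__(self, name, page_type):
        self.name = name
        self.page_type = page_type
        self.features = set()

    def add_block(self, block):
        # Adding a block marks the page as implementing its features.
        self.features |= block.features

    def has_feature(self, feature):
        return feature in self.features


page = Page("Summer Sale Launch", page_type="full_width")
page.add_block(Block("Event Details", ["calendar_event"]))
page.add_block(Block("Product Info", ["product_detail"]))

# The same page can be found as an event and as a product,
# and switching its layout later does not change either answer.
is_event = page.has_feature("calendar_event")
is_product = page.has_feature("product_detail")
page.page_type = "left_sidebar"
still_event = page.has_feature("calendar_event")
```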
The Core Team is taking some time off for the holidays, so there will be no new Totally Random episodes until the new year. Check back in January to find out when the next live episode will air.