My team has been getting more and more into CMSes over the past couple of years. We have also been getting more and more into continuous integration. Reconciling the two has proven to be difficult. To make it even worse, we do both LAMP and .NET sites, so our scripts ideally need to work for both.
We have four environments for each site: local, integration, staging and production. Content entry and file upload regularly occur on the production site. Development obviously starts locally and works its way up.
What are some of the methods or techniques that I can implement on my build server to automatically push data and schema updates from development environments into production without overwriting user-generated content? Conversely, how (and when) can I automatically draw user-generated data down into the development environments?
You will have 3 sorts of things in your database you need to worry about.
1) The schema, which can be defined in DDL.
2) Static or lookup data, which can be defined in DML.
3) Dynamic (or user) data which also can be defined in DML.
Normally changes to (1) and (2) should just go up to production with the code they are mutually reliant on. (3) should never go up, but can be copied down to a development environment if that environment's schema is in sync with production at that time.
Of course it is much more complicated than that. To get (1) up may require converting existing schema / DDL into specific alter statements, and may also require data manipulation if data types or locations are changing. To get (2) up requires synchronisation with the code build, which can become hard in complex environments.
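To illustrate how (2) can go up without touching (3), here is a minimal sketch, assuming a relational DB where lookup rows have stable primary keys. The table names (`page_status`, `pages`), the row values and the `sync_lookup_data` helper are all hypothetical, and SQLite is only used so the example is self-contained.

```python
import sqlite3

# Hypothetical lookup rows kept in version control alongside the code.
LOOKUP_ROWS = [
    (1, "draft"),
    (2, "published"),
    (3, "archived"),
]


def sync_lookup_data(conn, rows):
    """Insert or update lookup rows by primary key.

    Only the lookup table is written to; tables holding user data are
    never touched, so the script is safe to re-run on every deploy.
    """
    with conn:  # commits on success, rolls back on error
        for row_id, name in rows:
            # Add the row if it is missing...
            conn.execute(
                "INSERT OR IGNORE INTO page_status (id, name) VALUES (?, ?)",
                (row_id, name),
            )
            # ...and bring an existing row up to date.
            conn.execute(
                "UPDATE page_status SET name = ? WHERE id = ?",
                (name, row_id),
            )


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page_status (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT, status_id INTEGER)")
    # Simulate production: a stale lookup label and one user-written page.
    conn.execute("INSERT INTO page_status VALUES (1, 'Draft (old label)')")
    conn.execute("INSERT INTO pages VALUES (1, 'Written by a user', 1)")
    conn.commit()

    sync_lookup_data(conn, LOOKUP_ROWS)
    sync_lookup_data(conn, LOOKUP_ROWS)  # second run changes nothing

    statuses = conn.execute("SELECT id, name FROM page_status ORDER BY id").fetchall()
    pages = conn.execute("SELECT id, title, status_id FROM pages").fetchall()
    return statuses, pages
```

The point of keying on the primary key is that the deploy step becomes idempotent: the build server can run it on every release without caring what state production is in.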
There are many tools out there and if you require automation then you probably need advice from someone familiar with them.
I use a very simple scheme where all schema changes are reflected in a SQL build script, and are also added to a changes SQL script which includes all SQL required to perform any required transformations. This works well for me but my scenario is very simple (1 person, 1 server) and is therefore atypical.
However, the key to any success is defining the work required for a change at the time the change is made. The naive method of just changing the development DB and then working out the remediation at deployment time is a disaster.
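As a rough sketch of the "changes script" idea, here is one way a build server could apply only the changes a given database has not yet seen. This is an illustration under assumptions, not the exact scheme described above: the `schema_changes` tracking table, the `CHANGES` list and the `apply_changes` function are all hypothetical names, and SQLite stands in for whatever DB you actually use.

```python
import sqlite3

# Hypothetical ordered change scripts, each written when the change was made.
CHANGES = [
    (1, "CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)"),
    (2, "ALTER TABLE pages ADD COLUMN slug TEXT"),
    # Data transformation needed by change 2, applied to existing rows.
    (3, "UPDATE pages SET slug = lower(replace(title, ' ', '-')) WHERE slug IS NULL"),
]


def apply_changes(conn, changes):
    """Run every change not yet recorded in this database, in version order.

    Returns the list of version numbers applied during this call.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_changes (version INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_changes")}

    newly_applied = []
    for version, sql in sorted(changes):
        if version in applied:
            continue  # this environment already has the change
        conn.execute(sql)
        # Record the change right after running it.
        conn.execute("INSERT INTO schema_changes (version) VALUES (?)", (version,))
        conn.commit()
        newly_applied.append(version)
    return newly_applied


def demo():
    conn = sqlite3.connect(":memory:")
    first = apply_changes(conn, CHANGES[:1])   # an older release
    conn.execute("INSERT INTO pages (title) VALUES ('Hello World')")  # user content
    conn.commit()
    second = apply_changes(conn, CHANGES)      # the new release
    third = apply_changes(conn, CHANGES)       # re-running is a no-op
    row = conn.execute("SELECT title, slug FROM pages").fetchone()
    return first, second, third, row
```

Because each database records what it has already received, the same script can be run against local, integration, staging and production, and existing user rows are transformed in place rather than overwritten.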
For the first question, the best thing is to use a priority system, where your content is the “default” and user content “overrides” it, with the two kept in different places. That way you can update your own stuff with no chance of having to deal with any kind of merging process. The user data can be in a different directory, or flagged differently in the DB, whatever. The downside of this is that you can’t just use basic out-of-the-box server functionality to serve the data up, unless you do the actual mapping when the links are generated.
That is, you can either intercept requests for /site/images/icon.png and serve up /site/images/client/icon.png if it exists and /site/images/default/icon.png otherwise, or you can do the check when you write the pages out and emit the proper URLs so the server can serve the files directly.
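A minimal sketch of that lookup, assuming the directory layout from the example paths (a `client` and a `default` folder next to each other). The `resolve_asset` function and the demo file names are hypothetical; the same check could sit in a request handler or in the code that generates links.

```python
from pathlib import Path
import tempfile


def resolve_asset(site_root, relative_path):
    """Return the file to serve for a logical path such as 'images/icon.png'.

    The client copy wins if it exists, otherwise the default copy is used.
    Returns None when neither exists.
    """
    root = Path(site_root)
    rel = Path(relative_path)
    for layer in ("client", "default"):  # priority order: overrides first
        candidate = root / rel.parent / layer / rel.name
        if candidate.is_file():
            return candidate
    return None


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Content you ship with each release.
        (root / "images" / "default").mkdir(parents=True)
        (root / "images" / "default" / "icon.png").write_bytes(b"default icon")
        (root / "images" / "default" / "logo.png").write_bytes(b"default logo")
        # Content uploaded on production; a deploy never writes here.
        (root / "images" / "client").mkdir(parents=True)
        (root / "images" / "client" / "icon.png").write_bytes(b"client icon")

        icon = resolve_asset(root, "images/icon.png")
        logo = resolve_asset(root, "images/logo.png")
        missing = resolve_asset(root, "images/nope.png")
        return (
            icon.relative_to(root).as_posix(),
            logo.relative_to(root).as_posix(),
            missing,
        )
```

Since a deploy only ever replaces the `default` tree, nothing uploaded by users can be overwritten by a release.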
For the second question, as to when you can use the client’s data, that’s more of a legal and IP issue: you’d need to ask the client whether they’re a) willing and b) even able to share their content with you. After that, the tech is straightforward.