Tuesday, March 27, 2012

DR Suggestions Required

Hi All,

Need some suggestions for senior management for DR Purposes:

Background:

WSS/MOSS2007 is being used as a Document Management solution.

17 servers geographically dispersed around the UK. Each server runs WSS 3.0, SQL Server 2005 and IIS. Each server is linked into a PiP cloud via a 2Mb MPLS link.

At each location we are looking at 20 core databases, each pre-sized to 10GB. If I take one site as an example, the previous night's backup totalled 135GB.

The company has taken a centralised view on backups, so SQL Server data and log files are replicated using Double-Take to a central location, whereby the files are taken onto tape daily (full backup of all files).

As a precaution, I take a full SQL Server backup daily and also tran log backups every 4 hours locally, and keep them there for 2 days; however, if the site goes boom I lose those, so for this purpose please forget they exist.

As expected, when I restore the mdf and ldf files from tape and attach them to SQL Server, I will get errors due to transactional inconsistencies, which I'm well aware of.

Other options I've considered are:

1) DB Mirroring. Not a bad option, but I still have to get the DB to the mirror server in the first place. Also, DB mirroring is not recommended for more than 10 mirrored databases.

2) Log Shipping. Same issue as above: I have to get the data here in the first place. Then, once log shipping is set up, if I have a failure I need to start the whole lot off again.

3) Transactional Replication. The issue is with the initial replication getting the data from A to B; then, if I need to use it in a DR situation, I will get errors saying tables are being used for replication. This can be worked around, but it's not a quick process...

4) 3rd-party backup compression, e.g. LiteSpeed, Red Gate SQL Backup, etc. Good option: tests have shown 42% compression for us. However, if I refer to the earlier example of 135GB, that compresses to 81GB. Throw in the theoretical max for a 2Mb link of 19GB per 24 hours, and this would take over 4 days to copy.
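A quick back-of-envelope sketch of the transfer sums in option 4, using the figures quoted in the thread (the 19GB/day figure is the thread's own estimate of practical throughput on a 2Mb link):

```python
# Figures from the thread; straight arithmetic on the 42% compression figure
# gives ~78GB rather than the quoted 81GB, but the ~4-day conclusion holds
# either way.
BACKUP_GB = 135          # previous night's backup for one site
COMPRESSION = 0.42       # 42% size reduction seen in testing
LINK_GB_PER_DAY = 19     # practical throughput of a 2Mb MPLS link

compressed_gb = BACKUP_GB * (1 - COMPRESSION)
days_to_copy = compressed_gb / LINK_GB_PER_DAY

print(f"compressed size: {compressed_gb:.1f} GB")   # ~78.3 GB
print(f"days to copy:    {days_to_copy:.1f}")       # ~4.1 days
```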

Other thoughts I've come up with are:

A) Split the tables into different filegroups; not sure how easy this would be as the DBs and tables already exist.

B) Full/Diff/Tran backups. Still have the issue of scheduling the full backup over the weekend and it taking 4 days to get here.

C) Local tape backups. The issue is relying on someone to change the tape on a daily basis. It's not centrally managed, and how do we restore in a DR situation?

Could someone give me some pointers please?

Thanks

Steve

SQL DBA.

I assume that the entire backup of all the dbs in one location is 135GB. You can try log shipping, where you can specify the option to generate a full backup of the primary db and restore it on the secondary; this creates the db on the secondary automatically, so you need not copy the files manually. Out of those 20 dbs, if you feel any are extremely mission critical (very specific), you can configure mirroring for those dbs alone, with automatic failover in high-availability mode; in log shipping, failover is manual. For the other dbs, take a direct tape backup.

You need to sit down and classify these 20 dbs into very critical, critical, and so on, and also consider their sizes. If a db grows enormously, you may need to adopt a different strategy such as log shipping or mirroring, rather than tape backup/full backup/differential backup.

Let us get a few more inputs from our friends in this community.

Rgds

Deepak

|||

So, my first thought is that Double-Take isn't magic - it still has to get the data from site A to site B somehow. How is it doing that?

If you're going to have a true DR site - one from which you could recover if your primary site became a smoking hole in the ground - you have no option but to somehow get the data replicated in a remote site. If you enable some sort of ongoing replication (mirroring, log shipping, transactional replication), then this is a one-time hit, and not an ongoing burden.

Sometimes if the links are not as fast as you'd like, the most efficient way to transfer data is "FFTP" - the Ford File Transfer Protocol. (put the tapes in the trunk of your Ford and drive them over...)

Certainly, compressing the backups will help with this, as the volume of data is reduced.

By the way, SQL Server 2008 will include backup compression as part of the native product.

|||

Thanks for your thoughts guys.

We are using Double-Take at the moment to get the data and log files here. We do take a one-time hit when we kick the replication set off for the first time on a new site.

We are using DT as our DR/user-recovery solution too. We have configured the Recycle Bins in WSS/MOSS to cover that angle, so the primary reason for the replication is DR.

Bear in mind, each new site consists of 20 x 10GB data files (pre-sized) and 20 x 5GB log files (pre-sized), giving a total of 300GB which needs to replicate. There is only 40MB or so of content in each db at this point, but Double-Take can't see this and so replicates the white space as well. If we were to do a SQL backup instead, it would be around 400MB per site, as the databases are not yet fully used.
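The gap between what Double-Take ships and what a SQL backup would ship is worth spelling out. Using the per-site figures above (file-level replication copies the whole pre-sized files; a backup only carries used pages):

```python
# Per-site figures from the thread. Double-Take replicates at the file level,
# so it copies the full pre-sized mdf/ldf files, white space and all.
N_DBS = 20
DATA_GB = 10      # pre-sized mdf per db
LOG_GB = 5        # pre-sized ldf per db
USED_MB = 40      # actual content per db at go-live

replicated_gb = N_DBS * (DATA_GB + LOG_GB)   # whole files go over the wire
used_gb = N_DBS * USED_MB / 1024             # roughly what a backup would carry

print(f"file replication: {replicated_gb} GB")   # 300 GB, mostly white space
print(f"actual content:   {used_gb:.2f} GB")     # under 1 GB
```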

We anticipate substantial volume growth in the next 12 months: we project 3.6TB across all sites once they are migrated onto the document management system, and then anything up to 10TB+ spread over 17 servers. Can you see a DR nightmare coming?

We can and have tried shrinking the DBs to 5GB apiece, but then we get into disk fragmentation issues later on, which is the reason behind the pre-sizing.

I've thought about database mirroring from the off: we could copy the 20 x 40MB of content over and then start mirroring; the trouble is, all sites now have content. Also, I was under the impression that SQL Server could not support more than 10 mirroring sessions from any one source server?

The value of the data is mission critical. Essentially they are documents, tenders, spreadsheets, drawings, emails, etc. It's a document management system in essence with a custom UI.

One question I've thought of is this:

If we are using log shipping, and we have a network blip and lose the site for a couple of hours, will those transactions queue and apply once the link is back, or will the process fall over?

My experience of log shipping in SQL 2000 is that it would fall over and require a complete resync. Not a small task when you have 135GB of content...

Thanks,

Steve

|||

I haven't seen that happen, and we've lost connectivity to DR sites before. If you have network problems and lose connectivity for a couple of hours, all that should do is cause the transaction logs not to be copied over to the DR site. Once it's back, you can make sure all the logs get copied over from where the copies stopped, restore all the logs and "catch up" with where you were supposed to be. It shouldn't affect the database in the DR site, as it would still be in a restoring state and able to apply more log backups.
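The catch-up behaviour Sue describes can be sketched as a toy model: the copy job simply resumes, and queued log backups are restored strictly in sequence, stopping (rather than falling over) if a file in the chain is missing. The function and sequence numbers below are illustrative, not anything SQL Server exposes:

```python
# Toy model of log-shipping catch-up after a network blip: queued log backups
# are applied in order; a gap in the chain pauses the restore rather than
# breaking the secondary, which stays in a restoring state.
def catch_up(restored_through, available_logs):
    """Apply queued log backups in sequence; stop at the first gap."""
    for seq in sorted(available_logs):
        if seq == restored_through + 1:
            restored_through = seq      # next log in the chain: restore it
        elif seq > restored_through + 1:
            break                       # gap in the chain: wait for the copy
    return restored_through

# Secondary had applied logs 1-3 before a two-hour outage; logs 4-8 queued on
# the primary and were copied across once the link came back.
assert catch_up(3, [4, 5, 6, 7, 8]) == 8
# If log 5 never arrived, the restore stops after 4 instead of falling over.
assert catch_up(3, [4, 6, 7, 8]) == 4
```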

-Sue

|||

Would it be possible to consolidate into fewer larger databases?

i.e. 20 10GB databases -> 10 20GB databases, etc.

This might make mirroring more feasible.

|||

Consolidating the DBs down is certainly an option; I had previously discounted it because I wanted a higher number of smaller DBs rather than a lower number of larger ones, purely for recovery purposes.

In your suggestion we would have 10 content databases, but we would also need to mirror the SharePoint config database, as this tracks what content lives in which database.

If I were to mirror 11 DBs, I don't know what effect it would have. Although the boxes can run x64 code, moving to it would be a management nightmare in that each server would have to be rebuilt; the inconvenience that would cause the business is something they would most likely try to avoid.
