The case for not Delphix
Delphix is positioned as a database virtualization product. Databases are virtual already: they aren’t physical things you can touch but logical things, of which you can have many on a single platform (the virtualization idea of “ex uno plures”, that is, out of one, many). Curt Monash likes Delphix’s idea of database virtualization, but I still don’t.
ScaleDB also use the term, and defend it by saying “Database virtualization means different things to different people; from simply running the database executable in a virtual machine, or using virtualized storage, to a fully virtualized elastic database cluster composed of modular compute and storage components that are assembled on the fly to accommodate your database needs”. Their implementation seems more like clustering or federation to me.
Virtualization is surely abstracting a physical concept into a logical idea which can then exist in software form. Any database is then a virtualization of a filing cabinet or a bunch of handwritten records in papyrus format.
Then there’s the implementation. The basic idea is to take a backup of your favourite database, recover it onto a Delphix appliance and connect to the original source to receive a stream of logical logs. The first recovered copy is called a dsource, and this is the thing you can make virtually any number of “virtual” copies of. The space saving is in the deduplication and compression. A time slider allows you to go to any point in time (as much as the logs cover) in a flash. The copies are presented back to your favourite database server via iSCSI and a special connector.
The first problem would be the performance. In the absence of 10 GbE, you’ve just attached a database via 1 Gbps ethernet & TCP/IP, that you’d otherwise access via, say, 2 x 8 Gbps FC. If this is still ok, consider that the point of Delphix is that you can have several, even a great many “virtual” copies of your database, making performance more of a problem with each new copy. Other than that, by definition, I/O to and from a deduplicated, compressed data source must be slower than normal. Deduplication (inline!) is a difficult thing to do at speed and at best some custom hardware appliances get it right, though for exclusively sequential access patterns only.
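To put rough numbers on the bandwidth point, here is a minimal sketch. It assumes nominal link speeds, that all copies share one link evenly, and that protocol overhead can be ignored; none of that is measured Delphix behaviour, and the function name is made up for illustration.

```python
def per_copy_gbps(link_gbps: float, copies: int) -> float:
    """Nominal bandwidth per copy if `copies` databases share one link evenly.

    Assumption: even sharing and no protocol overhead (a simplification).
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return link_gbps / copies


if __name__ == "__main__":
    ethernet_gbps = 1.0    # 1 Gbps ethernet, as in the text
    fc_gbps = 2 * 8.0      # 2 x 8 Gbps FC, as in the text
    for copies in (1, 5, 10):
        print(copies, per_copy_gbps(ethernet_gbps, copies), per_copy_gbps(fc_gbps, copies))
```

Even before deduplication and compression enter the picture, each extra copy on the shared link cuts the nominal share of every other copy.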
A second problem is that additional copies of any database add additional administrative overhead for your DBA teams. While storage consumption may scale at deduplicated and compressed rates, administrative effort scales linearly – two databases have twice the administrative effort of one.
A third problem is cost. Let’s assume some random numbers and say a single appliance would cost 800k USD (not a complete thumbsuck but please speak to your local vendor) and that you can buy storage at the inflated rate of 5k USD per TB (speak to your other vendor). Let us now plot database size against number of copies and cost for these figures using this formula:
Cost = Acq + (x + factor × x × y) × perTB
- “Acq” is cost to acquire (800k USD)
- “x” is the size of the database in TB
- “factor” is the compression/dedup factor you can get, let’s say 0.1
- “y” is the number of copies of the database, and
- “perTB” is what you pay, in USD, for storage capacity
We get this:
Clearly, at the 10 copies of 10 TB databases mark it becomes cost effective, a point where performance really might become an issue. This is if you considered storage cost only. So clearly it’s not a storage cost saving play. However, once you consider thin-provisioned copies at the storage controller layer (and no, you don’t always pay for them), the other, non-cost benefits have to weigh really heavily. Add the cost of decent virtual infrastructure to host your virtual Delphix appliances, 10 GbE and storage, plus the fact that only Oracle and SQL Server are supported and that you’ll need an appliance for every version (one for SQL2008, one for SQL2008 R2, …), and the costs stack up even more.
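For anyone who wants to play with the numbers, the cost formula above can be sketched as follows. The defaults are the made-up figures from this post. The comparison baseline (the source plus full-size copies on plain storage at the same price per TB) is my assumption, and the function names are hypothetical.

```python
def delphix_cost(size_tb, copies, acq=800_000, factor=0.1, per_tb=5_000):
    """Cost = Acq + (x + factor * x * y) * perTB, as defined in the post."""
    return acq + (size_tb + factor * size_tb * copies) * per_tb


def full_copy_cost(size_tb, copies, per_tb=5_000):
    """Assumed baseline: the source plus `copies` full-size copies, no appliance."""
    return (size_tb + size_tb * copies) * per_tb


def break_even_copies(size_tb, acq=800_000, factor=0.1, per_tb=5_000):
    """Number of copies at which both costs are equal under this baseline.

    Setting the two costs equal gives Acq = x * y * (1 - factor) * perTB.
    """
    return acq / (size_tb * (1 - factor) * per_tb)


if __name__ == "__main__":
    for size_tb in (1, 5, 10, 20):
        for copies in (1, 5, 10, 20):
            print(size_tb, copies,
                  delphix_cost(size_tb, copies),
                  full_copy_cost(size_tb, copies))
```

Under this baseline the crossover is where x × y × (1 − factor) × perTB equals Acq, so it is driven by the combined capacity of the copies rather than by database size or copy count alone. A different baseline, such as thin-provisioned copies, moves that point.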
Development is probably the target market, but they often don’t care about the cost and effort of making multiple copies of databases available (it’s just an incident or a call to the help desk, yes?).
I’m not convinced yet.
[Corrections and comments most welcome!]