When managing ECMSquared installs, there are times when a customer wants to migrate to a new piece of hardware.
One of the biggest chores is migrating all of the stored mailbox data. When we migrate a customer from EIMS (Eudora Internet Mail Server) or another product to ECM, the migration must go through an IMAP-to-IMAP process using imapsync. With a hardware-only change there is no format change, so it's just a matter of copying data. So how do you efficiently copy tens of gigabytes of mail data spread across tens of thousands of folders and hundreds of thousands of files?
rsync.
But not just any old rsync. Here is the command I ended up using after combing through a bunch of web sites looking for the most efficient way of making the connection:
rsync -av -e "ssh -c blowfish" --delete root@newserverIP:/var/mail/data/1* /var/mail/data/ &
rsync -av -e "ssh -c blowfish" --delete root@newserverIP:/var/mail/data/2* /var/mail/data/ &
rsync -av -e "ssh -c blowfish" --delete root@newserverIP:/var/mail/data/3* /var/mail/data/ &
Several items of note here:
- The -a switch is the “archive” option, meaning that rsync will preserve file permissions, ownership, and timestamps as best as possible.
- The -e switch and -c blowfish tell rsync to use ssh as its transport and to have ssh use the Blowfish cipher. Blowfish is a very fast block cipher, which keeps encryption overhead low on bulk data transfers.
- The --delete switch tells rsync to delete anything in the target that is no longer in the source. This becomes important on the second pass (see below).
- The multiple backgrounded rsync calls can all run simultaneously, because each numeric wildcard picks up a different set of directories (ECM Maildir data is stored primarily by site id), so each process has its own data set to work on. The system will settle down and balance network and disk activity among the processes, which keeps every part of the system as busy as possible.
- These calls assume that passwordless ssh has already been set up between the servers.
- We make two passes with this set of commands: once to pre-load the data onto the new server, and once again at the cutover point. Because rsync only transfers files that have changed, the second pass should have much less data to move over the network.
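
The parallel calls above can also be wrapped in a small script, which makes the second pass a matter of re-running the same file. What follows is only a rough sketch, not something taken from an ECM install: the names REMOTE, DATA_DIR, PREFIXES, SSH_CIPHER, and DRY_RUN are made up for illustration, and the source and destination are kept exactly as in the commands above. The cipher is a variable because recent OpenSSH releases no longer ship Blowfish, so a different cipher may be needed on newer systems. By default the script only prints the commands it would run.

```sh
#!/bin/sh
# Sketch: one rsync per site-id prefix, run in parallel, then wait for all.
# All variable names below are illustrative assumptions, not ECM settings.

REMOTE="${REMOTE:-root@newserverIP}"    # same placeholder as in the commands above
DATA_DIR="${DATA_DIR:-/var/mail/data}"  # mail store path from the commands above
PREFIXES="${PREFIXES:-1 2 3}"           # leading digits of the site id directories
SSH_CIPHER="${SSH_CIPHER:-blowfish}"    # newer OpenSSH may require another cipher
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands only, 0 = really copy

sync_prefix() {
    # $1 is the leading digit used in the wildcard (1*, 2*, ...)
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "rsync -av -e \"ssh -c ${SSH_CIPHER}\" --delete ${REMOTE}:${DATA_DIR}/${1}* ${DATA_DIR}/"
    else
        # The wildcard is quoted so the local shell leaves it for the remote side.
        rsync -av -e "ssh -c ${SSH_CIPHER}" --delete \
            "${REMOTE}:${DATA_DIR}/${1}*" "${DATA_DIR}/"
    fi
}

run_pass() {
    pids=""
    for p in $PREFIXES; do
        sync_prefix "$p" &      # background each transfer
        pids="$pids $!"
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1 # remember if any transfer failed
    done
    return "$status"
}

run_pass
```

Running it once with DRY_RUN=0 gives the pre-load pass; running it again at cutover gives the second pass. Waiting on each process ID means the script only returns once every transfer has finished, and it returns non-zero if any of them failed.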