CloudEndure Failback Procedure
How to Failback to On-Premise Servers from AWS
Business continuity and IT resilience are a click away with CloudEndure® Disaster Recovery technology. CloudEndure’s one-touch disaster recovery solution utilizes block-level continuous replication, application stack orchestration, and automated machine conversion to ensure near-zero RPO and RTO for all applications, while reducing traditional disaster recovery expenses by over 85%.
The purpose of this document is to define and clarify all the procedural and technical aspects for failing back into your on-premise environment after recovering from a disaster using CloudEndure.
Failing over all Machines
- Click each machine in the CloudEndure Console and confirm that its Blueprint is configured correctly for your needs (instance types, subnets, security groups, etc.). Once you have confirmed this, you are ready for a failover.
- To launch your DR site, please select all machines and click on Test. The process will create target machines for your source machines in your target DR region, based on the Blueprint you defined for each.
Please note that Test and Recovery are very similar. The difference is how the machine is marked in the CloudEndure Console. The Recovery action should be invoked in case of an actual disaster and Test is for drills.
- Confirm that all Target machines were created properly based on your requirements. Make adjustments in the Blueprint as necessary and re-Test the machines. Note that changes to the Blueprint do not affect machines that are currently running, only machines launched after the changes were saved.
- Confirm that the recovered applications work as expected. Your DR site is now ready.
- Click ‘Prepare For Failback’ under ‘PROJECT ACTIONS’ and choose to use existing machines. This step will allow your account to reverse replication direction in preparation for a fail back.
- Modify the DNS to route end-users to the DR site.
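After the DNS change in the last step, it can be useful to confirm that the application hostname really resolves to the DR site. The following is a minimal Python sketch and is not part of CloudEndure. The function names are illustrative, and the hostname and addresses you pass in are your own values.

```python
import socket


def resolved_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the set of IP addresses that hostname currently resolves to."""
    infos = resolver(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the IP address is the first element of sockaddr.
    return {info[4][0] for info in infos}


def points_to_dr_site(hostname, dr_addresses, resolver=socket.getaddrinfo):
    """True when hostname resolves, and only to addresses of the DR site."""
    resolved = resolved_addresses(hostname, resolver)
    return bool(resolved) and resolved <= set(dr_addresses)
```

For example, points_to_dr_site("app.example.com", {"203.0.113.10"}) returns True only when every address returned for the (placeholder) hostname is in the given set. Cached DNS records may delay the change, so a False result shortly after the update does not necessarily indicate a misconfiguration.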
Failing over a subset of your Machines
To fail back specific servers, another project needs to be created and configured identically to your original project (same target cloud/region, project type, credentials, and Replication Settings).
Once you have the Failback project ready, the failback process for a server is:
- Select the machine(s), click on Machine Actions and then Move Machines to Another Project. Select the other project to move the machine to.
- Once all the machines you moved appear in the other project, switch to that project and follow all six steps under ‘Failing over all Machines’.
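Because the second project must be configured identically to the original one, a side-by-side comparison can catch mismatches before any machines are moved. The sketch below assumes you copy the settings from the console by hand into two dictionaries. The field names are hypothetical labels chosen for illustration and are not CloudEndure API names.

```python
# Hypothetical labels for the settings the procedure says must be identical.
REQUIRED_MATCHING_FIELDS = (
    "target_cloud",
    "target_region",
    "project_type",
    "credentials",
    "replication_settings",
)


def settings_mismatches(original, failback):
    """Return the fields whose values differ between the two projects.

    A field missing from either dictionary counts as a mismatch unless it is
    missing from both.
    """
    return [
        field
        for field in REQUIRED_MATCHING_FIELDS
        if original.get(field) != failback.get(field)
    ]
```

An empty list means the two sets of recorded settings agree on every listed field. It only compares what you entered, so it does not replace checking the projects in the console.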
How to Fail back?
Using CloudEndure Bootable Replication Server Image
- Download the CloudEndure Bootable Replication Server Image, using the link from the Replication Settings section in the CloudEndure Console under Setup & Info.
- To initiate replication of a source server, boot the target server into the CloudEndure Bootable Replication Server Image downloaded in the previous step.
- If the networking settings cannot be fetched using DHCP, the software will request manual networking configuration needed for connectivity.
- The CloudEndure software will then connect to cloudendure.com on port 443 (TCP) and authenticate using the CloudEndure account credentials that you enter.
- Follow the instructions and provide all the necessary details, such as: the project name, machine ID of the source server, the disk mapping between source and target and the IP address of the target machine to which the source machine will send the replicated data.
- The on-premise server will connect to the Target server on port 1500 (TCP) to start replication.
- When initial replication completes, the console will indicate that replication is in Continuous Data Protection mode, at which point you can initiate a test of the migration or an actual cutover by using the corresponding buttons in the CloudEndure Console.
- Please note that machines must be in Continuous Data Protection mode before failback. Otherwise there is no consistent state, and CloudEndure will not allow the recovered on-premise machine to boot in an inconsistent state.
- When the test or the cutover is finished, the Target machine will eject the bootable media and reboot into its new Operating System.
- Once all machines are recovered, select Project Actions and click Return to Normal Operation. At the end of the operation, the Failback project is configured to replicate into AWS again. The machines will start replicating into AWS.
- If you failed over a subset of your machines, and used a separate project, select all the machines from that project, click on Machine Actions and then Move Machines to Different Project. Select your original project to fully resume normal operation.
- By default, when using the same CloudEndure account for failing over and failing back, the ROOT disk will not be replicated back, to prevent any changes from interfering with the boot process.
To force the failback of the root disk, please use the following instructions:
- When booting the source machine from the CloudEndure Bootable Replication Server Image, wait until the software prompts for the CloudEndure username, then press CTRL+C and run the command: ./start.sh --device-mapping="/dev/sda1:/dev/sda" (in general, the format is <AWS device path>:<local device name>)
- You can add more disks as follows: ./start.sh --device-mapping="/dev/sda1:/dev/sda,/dev/sdb:/dev/sdb"
- In both cases, the argument maps the device on the EC2 instance (which can be seen in the AWS console, in the details pane of the instance) to the local device of the on-premise machine.
- In case you need to force those settings on all your servers, we can edit the start.sh script on the vanilla image to include those arguments.
- When your target machine is Openstack-based, please create a standard disk from the Bootable Recovery Media ISO, and boot from that standard disk, instead of attaching the ISO directly.
- Currently, the replication direction is a project-level configuration. This means that when you click the ‘FAILOVER’ button, the entire project and all machines within it will be set up to replicate in the reverse direction (from target to source). Please note that this step will stop replication of all your machines, including those that may not have been tested.
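The --device-mapping argument shown above is a comma-separated list of <AWS device path>:<local device name> pairs. As an illustration, the following Python sketch assembles that argument from a list of pairs so that typos in a long mapping are easier to spot. The helper names are hypothetical, and the device-name pattern used for validation is an assumption rather than a documented CloudEndure rule.

```python
import re

# Assumption: device paths look like /dev/<name>, with no spaces, ':' or ','.
_DEVICE_PATTERN = re.compile(r"/dev/[A-Za-z0-9/_-]+")


def build_device_mapping(pairs):
    """Join (aws_device, local_device) pairs into one mapping string."""
    parts = []
    for aws_device, local_device in pairs:
        for value in (aws_device, local_device):
            if not _DEVICE_PATTERN.fullmatch(value):
                raise ValueError(f"unexpected device path: {value!r}")
        parts.append(f"{aws_device}:{local_device}")
    if not parts:
        raise ValueError("at least one device pair is required")
    return ",".join(parts)


def start_command(pairs):
    """Return the start.sh command line for the given device pairs."""
    return f'./start.sh --device-mapping="{build_device_mapping(pairs)}"'
```

Calling start_command([("/dev/sda1", "/dev/sda"), ("/dev/sdb", "/dev/sdb")]) produces the same two-disk command shown in the instructions above. The sketch only builds the string; it does not run anything on the server.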
Fail back Network Architecture
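The failback procedure relies on two TCP flows: port 443 to cloudendure.com for authentication, and port 1500 between the servers for replication. A quick reachability check before starting can save a failed boot cycle. The sketch below is a minimal, generic Python example and is not a CloudEndure tool. The function names are illustrative, and each check should be run from the machine that initiates the connection in the steps above.

```python
import socket


def tcp_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def failback_preflight(endpoints, timeout=5.0):
    """Check each (host, port) pair and return a {"host:port": bool} report."""
    return {
        f"{host}:{port}": tcp_port_open(host, port, timeout=timeout)
        for host, port in endpoints
    }
```

For example, failback_preflight([("cloudendure.com", 443), ("10.0.0.25", 1500)]) returns one True or False entry per endpoint, where 10.0.0.25 is a placeholder for the address of the server that receives replicated data. A successful TCP connection only shows that the port is reachable; it does not confirm that credentials or replication settings are correct.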