Portworx disaster recovery between on-prem and AWS EKS
Introduction
As a modern company you have started building cloud native applications and you’ve transformed to a DevOps way of working. But now you’ve been tasked with enabling disaster recovery for your Kubernetes environment. This procedure describes the steps required to set up disaster recovery between an on-premises environment and the public cloud (AWS EKS); however, the technology described can just as easily be used for on-prem to on-prem or from one cloud to another. Also, while we are describing DR here, a very similar approach can be used to migrate your apps between different environments and clouds.
More of a visual learner? Feel free to check out the recording I did of the installation below.
Prerequisites
For this procedure I am assuming that you already have a local Kubernetes cluster and an AWS EKS cluster configured, and that you have Portworx installed on both. If not, check out my blogs on Install Portworx on Kubernetes running on VMware vSphere and Install Portworx on Kubernetes running on EKS by AWS.
In addition to Portworx being installed, PX-DR requires that you have a license enabled on the cluster that includes the Portworx Disaster Recovery (PX-DR) feature.
Installing storkctl
The first step in the installation is to install storkctl, a command-line tool for interacting with Portworx to configure, amongst other things, PX-DR.
STORK_POD=$(kubectl get pods -n kube-system -l name=stork -o jsonpath='{.items[0].metadata.name}') &&
kubectl cp -n kube-system $STORK_POD:/storkctl/linux/storkctl ./storkctl
sudo mv storkctl /usr/local/bin &&
sudo chmod +x /usr/local/bin/storkctl
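If everything went well, the binary is now available on your path. As a quick sanity check (assuming your Stork release includes the version subcommand), you can run:
storkctl version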
Add AWS EKS credentials to Stork (source cluster)
Since we’re building DR to AWS EKS, the source cluster requires AWS credentials to be able to make some changes to the EKS environment. For this we will first create a secret containing the AWS credentials and then pass that secret into the stork pods.
Make sure you are logged in to AWS using the awscli utility. This will create a local file that stores your credentials here: $HOME/.aws/credentials
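If you have not configured the AWS CLI yet, the interactive aws configure command will create that credentials file for you:
# Prompts for the access key ID, secret access key, default region and output format
aws configure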
We can then use that file to create the secret in Kubernetes using the following command:
kubectl create secret generic --from-file=$HOME/.aws/credentials -n kube-system aws-creds
Next we are going to edit our StorageCluster CRD object on our source cluster using the following command:
kubectl edit StorageCluster -n kube-system
Add the lines below to the spec.stork section of the StorageCluster:
apiVersion: core.libopenstorage.org/v1alpha1
kind: StorageCluster
...
spec:
  ...
  stork:
    ...
    volumes:
    - mountPath: /root/.aws/
      name: aws-creds
      secret:
        secretName: aws-creds
Now save the StorageCluster object and the Portworx Operator will automatically reconfigure the stork pods.
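As an optional sanity check, you can confirm that the stork pods have been recreated and that the aws-creds volume shows up in their spec:
# The stork pods should be recently restarted and reference the aws-creds secret
kubectl get pods -n kube-system -l name=stork
kubectl describe pods -n kube-system -l name=stork | grep -A 3 aws-creds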
Enable load balancing on EKS (destination cluster)
For the source cluster to be able to connect to the Portworx services in the destination cluster, we need to expose the Portworx services using an Elastic Load Balancer (ELB).
For this we will edit the StorageCluster CRD object on our EKS cluster (the destination cluster) using the following command:
kubectl edit StorageCluster -n kube-system
Add the lines below to the metadata.annotations section of the StorageCluster:
apiVersion: core.libopenstorage.org/v1alpha1
kind: StorageCluster
metadata:
  annotations:
    portworx.io/service-type: "LoadBalancer"
Now save the StorageCluster object and the Portworx Operator will automatically reconfigure the services as LoadBalancer services.
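Once the change has been applied, you can look up the DNS name of the ELB that now fronts the Portworx API; note it down, as we will need it later when creating the ClusterPair (assuming the default service name portworx-service):
# The EXTERNAL-IP column will show the DNS name of the provisioned ELB
kubectl get svc portworx-service -n kube-system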
Create objectstore credentials
For the two clusters to exchange data for replication, an objectstore is used. This allows the async replication to take place over a high-latency connection. To specify the objectstore to use, we have to add the credentials for the objectstore to both our clusters.
EKS cluster (destination)
We will start on our destination cluster, where we will use pxctl status to identify the Cluster UUID of the target cluster.
PX_POD=$(kubectl get pods -l name=portworx -n kube-system -o jsonpath='{.items[0].metadata.name}')
kubectl exec $PX_POD -n kube-system -- /opt/pwx/bin/pxctl status
Note down the value shown under Cluster UUID, as we will need it for the next command. Enter the correct information about the objectstore that you want to use. You can either use an existing one by specifying the --bucket parameter, or you can allow Portworx to create one using the following:
PX_POD=$(kubectl get pods -l name=portworx -n kube-system -o jsonpath='{.items[0].metadata.name}')
kubectl exec $PX_POD -n kube-system -- /opt/pwx/bin/pxctl credentials create \
  --provider s3 \
  --s3-access-key <YOUR-ACCESS-KEY-ID> \
  --s3-secret-key <YOUR-SECRET-ACCESS-KEY> \
  --s3-region us-east-1 \
  --s3-endpoint s3.amazonaws.com \
  --s3-storage-class STANDARD \
  <NAME>
For the <NAME> we will want to use clusterPair_<Cluster UUID>, where <Cluster UUID> needs to be replaced by the UUID recorded in the previous step.
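You can verify that the credentials were stored correctly by listing them with pxctl:
kubectl exec $PX_POD -n kube-system -- /opt/pwx/bin/pxctl credentials list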
On-prem cluster (source)
Now use exactly the same command as you used on the EKS cluster to create the credentials on the on-prem/source cluster, pointing this cluster to exactly the same objectstore that you used for the EKS cluster.
PX_POD=$(kubectl get pods -l name=portworx -n kube-system -o jsonpath='{.items[0].metadata.name}')
kubectl exec $PX_POD -n kube-system -- /opt/pwx/bin/pxctl credentials create \
  --provider s3 \
  --s3-access-key <YOUR-ACCESS-KEY-ID> \
  --s3-secret-key <YOUR-SECRET-ACCESS-KEY> \
  --s3-region us-east-1 \
  --s3-endpoint s3.amazonaws.com \
  --s3-storage-class STANDARD \
  <NAME>
For the <NAME> we will want to use clusterPair_<Cluster UUID>, where <Cluster UUID> needs to be replaced by the UUID noted from the EKS cluster in the previous step.
Generate a ClusterPair spec on the destination cluster
Now use storkctl to generate a ClusterPair object on the destination cluster. For this procedure, I’m specifying that I want to create it for the petclinic namespace, to replicate that namespace from the source cluster to the EKS cluster. The petclinic namespace does not have to exist on the EKS cluster (for more detailed steps, check the official docs).
storkctl generate clusterpair -n petclinic remotecluster > clusterpair.yaml
Now edit the file clusterpair.yaml, replacing this part:
apiVersion: stork.libopenstorage.org/v1alpha1
kind: ClusterPair
...
spec:
  ...
  options:
    <insert_storage_options_here>: ""
With the following:
apiVersion: stork.libopenstorage.org/v1alpha1
kind: ClusterPair
...
spec:
  ...
  options:
    ip: <ip_of_remote_px_node>
    port: <port_of_remote_px_node_default_9001>
    token: <token>
    mode: DisasterRecovery
Replace <ip_of_remote_px_node> with the DNS name of the load balancer for the portworx-service.
Replace <port_of_remote_px_node_default_9001> with "9001".
Replace <token> with a token generated on the EKS cluster using this procedure.
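For reference, the cluster token can be displayed by running pxctl cluster token show inside one of the Portworx pods on the EKS (destination) cluster:
PX_POD=$(kubectl get pods -l name=portworx -n kube-system -o jsonpath='{.items[0].metadata.name}')
kubectl exec $PX_POD -n kube-system -- /opt/pwx/bin/pxctl cluster token show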
Create the ClusterPair on the source cluster
Copy the clusterpair.yaml
to the source cluster and apply it as follows:
kubectl apply -f clusterpair.yaml
Verifying the Pair status
Once you apply the above spec on the source cluster, you should be able to check the status of the pairing:
storkctl get clusterpair -n petclinic
Which should show the following output:
NAME STORAGE-STATUS SCHEDULER-STATUS CREATED
remotecluster Ready Ready 5 Nov 21 03:11 UTC
Schedule Policy
Now we will create a SchedulePolicy that we will later use in our MigrationSchedule to determine how often the replication should run. The example below shows a configuration that replicates every minute; save this as testpolicy.yaml.
apiVersion: stork.libopenstorage.org/v1alpha1
kind: SchedulePolicy
metadata:
  name: testpolicy
policy:
  interval:
    intervalMinutes: 1
And then apply it to the cluster:
kubectl apply -f testpolicy.yaml
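To confirm the policy was created, you can query the (cluster-scoped) CRD directly:
kubectl get schedulepolicies.stork.libopenstorage.org testpolicy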
Migration Schedule
Next we can create the actual migration schedule, as shown below. Save this as migrationschedule.yaml.
apiVersion: stork.libopenstorage.org/v1alpha1
kind: MigrationSchedule
metadata:
  name: petclinicmigrationschedule
  namespace: petclinic
spec:
  template:
    spec:
      clusterPair: remotecluster
      includeResources: true
      startApplications: false
      namespaces:
      - petclinic
  schedulePolicyName: testpolicy
And then apply it to the cluster:
kubectl apply -f migrationschedule.yaml
Check migration status
This will initiate the migrations. The status of the schedule can be checked using the following command:
kubectl describe migrationschedules.stork.libopenstorage.org -n petclinic
To check the actual migration tasks, use the following:
kubectl get migration -n petclinic
And finally to get more information on the actual migration, use the following:
kubectl describe migration -n petclinic
Execute failover
To execute a failover, first make sure that the replication has stopped (e.g. the source cluster is down or the MigrationSchedule has been suspended; see the sketch after the command below). Then, from the destination cluster, run the following command:
storkctl activate migrations -n petclinic
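For reference, if the source cluster is still reachable, one way to stop the replication before activating is to suspend the MigrationSchedule there; a sketch using the suspend flag in the MigrationSchedule spec, run against the source cluster:
# Sets spec.suspend to true so no new migrations are triggered
kubectl patch migrationschedules.stork.libopenstorage.org petclinicmigrationschedule -n petclinic \
  --type merge -p '{"spec":{"suspend":true}}'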
Conclusion
This completes the configuration of PX-DR between on-prem and AWS EKS. You can now start exploring the other options for migrations, including replicating multiple namespaces! I hope you have found this article useful and would love to hear your feedback below.