HCX Error – Adding Mobility Agent Host failed |SSL Exception

Working with an enterprise customer recently, I ran into an issue when deploying an HCX Service Mesh with Cross Cloud vMotion enabled.

At the end of the service mesh deployment there was an error message displayed:

ApplianceLifecycle job failed | interconnectConfigMA failed | error Adding Mobility Agent Host failed | SSL Exception.

When an IX appliance is deployed with Cross Cloud vMotion enabled, it registers itself with vCenter as a vMotion ‘target’ (the Mobility Agent). By default the IX uses self-signed certificates, and this particular vCenter rejected the registration. The investigation pointed to the likely cause: the customer had enabled ‘Custom’ mode on the destination vCenter for its ESXi hosts, meaning that it only trusted certificates issued by their own CA. Checking the destination vCenter confirmed this:

Destination vCenter Advanced Setting



The destination environment was a brand new VMware Cloud Foundation (VCF) cloud that was set up slightly differently from their existing legacy environments, which used the more typical ‘VMCA’ mode. Speaking with the customer and checking the source environment confirmed this:

Source vCenter Advanced Setting

According to the VMware KB for this issue, the procedure replaces both the IX appliance certificate and its private key. If you have more than one IX, it has to be performed for each Interconnect appliance that is deployed in a vCenter with ‘custom’ certificate management.

First, pay attention to the error message, which tells us which IP is having trouble. In this case I knew it was in my destination environment.

Resolution

  1. SSH to the HCX Manager for the destination and log in as ‘admin’.
  2. Enable ‘CCLI’ mode from the prompt.
  3. Within ‘CCLI’ mode, type ‘list’ to show the deployed devices.

In the screenshot below you can see that in this case the HCX Manager has deployed an IX and an NE device. The IP address for the IX is listed in the Address column. This is the IP we will use in the certificate request.

I created my CSR and private key in the usual way using Ubuntu. Below is the config file I used. Note that the CN and alt_names match the IP taken from the list above:

[req]
distinguished_name = req_distinguished_name
req_extensions = v3_req
prompt = no

[req_distinguished_name]
C = SE
ST = Stockholm Lan
L = Stockholm
O = Terataki
OU = Terataki Cloud
CN = 192.168.93.121

[v3_req]
keyUsage = keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth
subjectAltName = @alt_names

[alt_names]
IP.1 = 192.168.93.121
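
For reference, the ‘usual way’ might look something like the following on an Ubuntu workstation with OpenSSL installed. This is a minimal sketch rather than the exact commands used here: the function names, the file names in the usage line and the 2048-bit RSA key size are all assumptions.

```shell
# Hypothetical helper: create a new RSA key and a CSR from an OpenSSL config file.
# $1 = config file, $2 = private key to write, $3 = CSR to write.
# The 2048-bit key size is an assumption; follow your CA's policy.
make_ix_csr() {
  openssl req -new -newkey rsa:2048 -nodes \
    -config "$1" -keyout "$2" -out "$3"
}

# Hypothetical helper: print the CSR so the subject and the requested
# subjectAltName can be checked before it is submitted to the CA.
show_csr() {
  openssl req -in "$1" -noout -text
}

# Example (file names are placeholders):
#   make_ix_csr ix.cnf rui.key ix.csr && show_csr ix.csr
```

In the `show_csr` output, the IX IP should appear both in the subject CN and as an ‘IP Address’ entry under the requested Subject Alternative Name.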

I submit the CSR to the Certificate Authority and get back a certificate which I can now use to update the IX.
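
Before copying anything to the appliance, it can be worth confirming that the certificate returned by the CA actually belongs to the private key that was generated with the CSR. A minimal sketch, assuming OpenSSL on the workstation; the function name and file names are illustrative and not part of the KB procedure.

```shell
# Hypothetical helper: succeed only if the certificate ($1) and the private
# key ($2) carry the same public key.
cert_matches_key() {
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null) || return 1
  key_pub=$(openssl pkey -in "$2" -pubout 2>/dev/null) || return 1
  [ "$cert_pub" = "$key_pub" ]
}

# Example (file names are placeholders):
#   cert_matches_key rui.crt rui.key && echo "certificate and key match"
```

A mismatch here usually means the wrong key file was picked up, which is easier to fix on the workstation than after the services have been restarted.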

Back on the CCLI

4. Type Go 0 (or Go X, wherever your IX is in the list)
5. Change directory to /etc/vmware/ssl

In that directory you will see the current self-signed certificate ‘rui.crt’ and its private key ‘rui.key’.


6. Back up the existing certificate files:
mv rui.crt rui.crt.bak
mv rui.key rui.key.bak

7. Replace the files with the custom CA cert and key

To replace the certificate and key I used vi, as it lets me cut and paste from my desktop.

8. Reboot the IX appliance or restart the MA and authentication services
stc restart mobilityagent
stc restart authdlauncher

Now I can carry out a re-sync of the Service Mesh to trigger the MA deployment. In my case it reported that no configuration changes were found.

Clicking Resync again restarts the registration of the MA.


At this point there should be no error, and the ‘Service Mesh Modified’ message should appear in green along with a green status on the HCX services.

With the above complete I was then able to carry out vMotion based migrations as normal.

Caveats

The KB points these out. It says that ‘this workaround will not be persistent if the Service Mesh is re-sync’ed or after service updates. The same procedure will have to be performed to re-deploy the MA again.’ My own testing showed that a simple re-sync did not require the certificate to be replaced, but your mileage may vary.


Also, you must ‘Ensure the ‘CN’ and ‘SAN’ fields of the certificate contain the IP address that is intended to be used for the management IP of the IX appliance.’ As you can see in the CSR config above, this is what I did.
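
If you want to confirm this on the issued certificate rather than only on the request, a check along these lines may help. It is a sketch that assumes OpenSSL on a workstation and a PEM-encoded certificate; the function name and file name are hypothetical, and it only tests the SAN, with the CN left to a visual check.

```shell
# Hypothetical helper: succeed only if the certificate ($1) lists the IP ($2)
# as an "IP Address" entry in its subjectAltName extension.
cert_has_ip_san() {
  openssl x509 -in "$1" -noout -text 2>/dev/null |
    tr ',' '\n' |
    sed 's/^ *//; s/ *$//' |
    grep -Fx "IP Address:$2" >/dev/null
}

# Example (file name is a placeholder; the IP is the one from the CSR config):
#   cert_has_ip_san rui.crt 192.168.93.121 && echo "SAN contains the IX IP"
#   openssl x509 -in rui.crt -noout -subject   # prints the subject, including the CN
```

The exact-match comparison is deliberate, so that a certificate issued for a similar address (for example one that only shares a prefix with the IX IP) is not accepted by mistake.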

Gotchas

During the deployment of the customer's mesh with the CA certificates, it turned out that they had used different intermediate CAs to sign the IX certificates. It is important to ensure that the vCenter running in custom mode has the relevant certificate chains loaded in its Trusted Root Certificate store.
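
One way to catch a missing intermediate early is to check that the issued certificate chains back to the root using only the CA files you plan to load into vCenter. The sketch below assumes OpenSSL on a workstation and PEM-encoded files; the function name and file names are hypothetical, and it does not inspect vCenter's own trust store.

```shell
# Hypothetical helper: verify that the leaf certificate ($3) chains to the
# root CA ($1) through the intermediate CA file ($2).
# Succeeds only if openssl exits cleanly and reports "<file>: OK".
verify_chain() {
  chain_out=$(openssl verify -CAfile "$1" -untrusted "$2" "$3" 2>&1)
  chain_rc=$?
  printf '%s\n' "$chain_out"
  [ "$chain_rc" -eq 0 ] && printf '%s\n' "$chain_out" | grep ': OK$' >/dev/null
}

# Example (file names are placeholders):
#   verify_chain root-ca.pem intermediate-ca.pem rui.crt
```

If different IX certificates were signed by different intermediates, the check has to be repeated with the intermediate that matches each certificate.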

Documentation

The official VMware KB 79003 explains the process.
