-- PDL (Permanent Device Loss): ESXi considers the device loss permanent. It can be caused, for example, by making a LUN inaccessible to a host. ESXi infers the PDL state from the SCSI sense codes the array returns for the LUN, and it interprets certain codes as a permanent failure.

-- APD (All Paths Down): ESXi considers the loss of connectivity transient. This can occur when a host can't reach the storage array over the network. Through 5.x, vSphere HA had no way to respond to an APD condition.
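
The PDL/APD distinction above can be sketched in a few lines of Python. This is only an illustration, not how ESXi is implemented: the function name and the `paths_up` parameter are made up for the example, and the sense-code triples (sense key, ASC, ASCQ) are the ones commonly listed for PDL in VMware's knowledge base as far as I recall, so verify them against the current documentation before relying on them.

```python
# Sense-code triples (sense key, ASC, ASCQ) commonly cited as PDL indicators.
# Assumption: recalled from VMware KB material, listed here for illustration only.
PDL_SENSE_CODES = {
    (0x5, 0x25, 0x00),  # logical unit not supported
    (0x4, 0x4C, 0x00),  # logical unit failed self-configuration
    (0x4, 0x3E, 0x03),  # logical unit failed self-test
    (0x4, 0x3E, 0x01),  # logical unit failure
}


def classify_device_state(sense, paths_up):
    """Classify a device as "PDL", "APD" or "OK" (hypothetical helper).

    sense    -- (sense key, ASC, ASCQ) tuple returned by the array, or None
    paths_up -- number of paths to the device that still work
    """
    if sense in PDL_SENSE_CODES:
        # The array told us explicitly that the device is gone for good.
        return "PDL"
    if paths_up == 0:
        # No explicit answer from the array and no working path: treated as transient.
        return "APD"
    return "OK"
```

For example, `classify_device_state((0x5, 0x25, 0x00), 0)` gives "PDL", while `classify_device_state(None, 0)` gives "APD" because the array never returned a sense code.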

 

-- PDL support was introduced in 5.1, and it relied on the vSCSI layer to kill a VM when I/O failed during a PDL. Through 5.x the setup requirements were complex: (a) an advanced option had to be set to enable the vSCSI kill (per VM, per host); (b) the guest had to issue I/O to the PDL'ed datastore; (c) an advanced option controlled HA failover (enabled by default); (d) VM placement relied on stale accessibility data.

-- To handle APD and PDL, VMware introduced vSphere VMCP (VM Component Protection) in 6.0.

-- Now, if an APD or PDL condition occurs while a VM is running on a host that has lost connectivity to a datastore, HA kicks in and restarts that VM on a different host that still has connectivity to the same storage.

-- NFS doesn't have PDL, as it's not block-level storage from which LUN access can be removed. VMCP for APD works fine with NFS.

-- VMCP protects VMs against storage connectivity failures and misconfigurations. It covers all datastores used by a VM.

-- The recovery workflow is as follows: the host-defined wait before APD is declared is 140 seconds, and the user-defined delay that follows defaults to 3 minutes. Restarting the guest is required because, if storage comes back within the timeout period, the VM might still be in a zombie state.
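
To make the timing concrete, here is a small Python sketch of that timeline. It only does the arithmetic from the two values quoted above (140 seconds and 3 minutes). The function names and the returned labels are hypothetical, and the rule that a recovery between the APD declaration and the end of the delay leads to a guest reset is my reading of the text, not a statement of how HA is coded.

```python
APD_TIMEOUT_SEC = 140      # host-defined wait before APD is declared (from the text)
VMCP_DELAY_SEC = 3 * 60    # user-defined delay, default 3 minutes (from the text)


def apd_timeline(apd_start=0, apd_timeout=APD_TIMEOUT_SEC, vmcp_delay=VMCP_DELAY_SEC):
    """Return the two key moments, in seconds, for an APD that begins at apd_start."""
    declared_at = apd_start + apd_timeout
    return {
        "apd_declared_at": declared_at,
        "vmcp_acts_at": declared_at + vmcp_delay,
    }


def response_for(recovered_at, timeline):
    """Pick a response given when storage came back (None = it never did)."""
    if recovered_at is not None and recovered_at < timeline["apd_declared_at"]:
        # Storage returned before APD was even declared.
        return "no action"
    if recovered_at is not None and recovered_at < timeline["vmcp_acts_at"]:
        # Storage returned during the delay window; the guest may be a zombie.
        return "reset guest"
    # Storage did not return in time.
    return "terminate and restart"
```

With the defaults, `apd_timeline()` puts the APD declaration at 140 seconds and the VMCP action at 320 seconds after paths were lost.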

 

-- For APD specifically, VMCP offers two failover options, "conservative" and "aggressive". In conservative mode, HA first looks for a place to restart the VM and only then terminates it, whereas in aggressive mode, HA terminates the VM first and then looks for a place to restart it.
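
The ordering difference between the two modes can be sketched like this. Again, this is a toy model under stated assumptions: `plan_apd_failover`, the step names, and the `healthy_hosts` list are invented for the example, and leaving the VM alone in conservative mode when no host is available is my understanding of that mode rather than something the notes above state.

```python
def plan_apd_failover(mode, healthy_hosts):
    """Return the ordered steps HA would take for one VM (hypothetical model).

    mode          -- "conservative" or "aggressive"
    healthy_hosts -- hosts that can still reach the VM's datastore
    """
    if mode == "conservative":
        if not healthy_hosts:
            # Assumption: with nowhere to restart, the VM is left running.
            return []
        target = healthy_hosts[0]
        return [("find_host", target), ("terminate_vm", None), ("restart_vm", target)]

    if mode == "aggressive":
        # Terminate first, then look for a place to restart.
        steps = [("terminate_vm", None)]
        if healthy_hosts:
            target = healthy_hosts[0]
            steps += [("find_host", target), ("restart_vm", target)]
        return steps

    raise ValueError(f"unknown failover mode: {mode!r}")
```

Calling `plan_apd_failover("aggressive", [])` still returns a terminate step, which is the risk of that mode: the VM can end up powered off with nowhere to go.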