This blog shares our knowledge and expertise on Linux System Administration and VMware Administration.


Wednesday, February 17, 2016

An ESXi 5.x host running on HP server fails with a purple diagnostic screen and the error: hpsa_update_scsi_devices or detect_controller_lockup_thread

Whenever you find the symptoms below:

    Cannot run the host on Hewlett Packard (HP) hardware
    Running the host on HP hardware fails with a purple diagnostic screen
    You see the error:

    hpsa_update_scsi_devices@<None>#<None>+0x39c
    hpsa_scan_start@<None>#<None>+0x187
    hpsa_kickoff_rescan@<None>#<None>+0x20f
    kthread@com.vmware.driverAPI#9.2+0x185
    LinuxStartFunc@com.vmware.driverAPI#9.2+0x97
    vmkWorldFunc@vmkernel#nover+0x83
    CpuSched_StartWorld@vmkernel#nover+0xfa
    Your host fails with a purple diagnostic screen and you see the error:

    Panic: 892: Saved backtrace: pcpu X TLB NMI
    _raw_spin_failed@com.vmware.driverAPI#9.2+0x5
    detect_controller_lockup_thread@#+0x3a9
     kthread@com.vmware.driverAPI#9.2+0x185
     LinuxStartFunc@com.vmware.driverAPI#9.2+0x97
     vmkWorldFunc@vmkernel#nover+0x83                
     CpuSched_StartWorld@vmkernel#nover+0xfa
     PCPU X locked up. Failed to ack TLB invalidate (total of 1 locked up, PCPU(s): X)
    Before the host becomes unresponsive, in the /var/log/vmkernel.log file, you see entries similar to:

    WARNING: LinDMA: Linux_DMACheckConstraints:149: Cannot map machine address = 0xfffffffffff, length = 49160 for device 0000:03:00.0; reason = buffer straddles device dma boundary (0xffffffff)
    WARNING: Heap: 4089: Heap_Align(vmklnx_hpsa, 32768/32768 bytes, 8 align) failed.  caller: 0x41802dcb1f91
    cpu4:1696102)<4>hpsa 0000:09:00.0: out of memory in adjust_hpsa_scsi_table
    Before you see a purple diagnostic screen, in the /var/log/vmkernel.log file, you see entries similar to:

    Note: These are multiple memory error messages from the hpsa driver.

    out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
    out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
    out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
    out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
    WARNING: Heap: 3622: Heap vmklnx_hpsa (39113576/39121768): Maximum allowed growth (8192) too small for size (20480)
    cpu7:1727675)<4>hpsa 0000:06:00.0: out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
    cpu2:1727677)<4>hpsa 0000:0c:00.0: out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
    cpu4:1727676)<4>hpsa 0000:09:00.0: out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
    cpu3:1727738)WARNING: LinDMA: dma_alloc_coherent:726: Out of memory
    cpu3:1727738)<3>hpsa 0000:06:00.0: cmd_special_alloc returned NULL!

The resolution is as follows:

This is a known issue affecting VMware ESXi 5.x.

To resolve this issue, apply the updated driver supplied by HP. Always check the VMware Hardware Compatibility List (HCL) to determine the latest available driver update.

Note: For all BL685c G7 blades and DL360p Gen8 servers, HP recommends updating to the June 2014 release of ESXi 5.5 Update 1.

The reasons for the recommendation are:

    The smx-provider memory leak issue is fixed.
    Several hpsa driver issues are resolved in the .60 driver version included in the June 2014 release of ESXi 5.5 Update 1; the previous .50 driver was problematic.

For the DL360p Gen8 servers, the iLO firmware needs to be checked. If the iLO firmware is not at version 1.51, it is recommended to update the firmware on all servers to 1.51. This is a critical update that avoids the NMI events which cause a PSOD in your environment.

It is also recommended to check that the DL360p Gen8 servers are at least at the February 2014 system ROM. This corrects a possible IPMI issue.

If this issue persists after the driver upgrade:

    Open an HP support request and reference HP case 4648045806.
    If the issue persists, open a support request with VMware Support.
    Provide VMware Support with your HP case number.

ESXi 5.0 host experiences a purple diagnostic screen with the errors "Failed to ack TLB invalidate" or "no heartbeat" on HP servers with PCC support

Whenever an ESXi 5.0 host fails with a purple diagnostic screen:

The purple diagnostic screen or core dump contains messages similar to:

PCPU 39 locked up. Failed to ack TLB invalidate (total of 1 locked up, PCPU(s): 39).
0x41228efc7b88:[0x41800646cd62]Panic@vmkernel#nover+0xa9 stack: 0x41228efe5000
0x41228efc7cb8:[0x4180064989af]TLBDoInvalidate@vmkernel#nover+0x45a stack: 0x41228efc7ce8

@BlueScreen: PCPU 0: no heartbeat, IPIs received (0/1)....

0x4122c27c7a68:[0x41800966cd62]Panic@vmkernel#nover+0xa9 stack: 0x4122c27c7a98
0x4122c27c7ad8:[0x4180098d80ec]Heartbeat_DetectCPULockups@vmkernel#nover+0x2d3 stack: 0x0

NMI: 1943: NMI IPI received. Was eip(base):ebp:cs [0x7eb2e(0x418009600000):0x4122c2307688:0x4010](Src 0x1, CPU140)

Heartbeat: 618: PCPU 140 didn't have a heartbeat for 8 seconds. *may* be locked up

The cause might be that some HP servers experience a situation where the PCC (Processor Clocking Control, also known as Collaborative Power Control) communication between the VMware ESXi kernel (VMkernel) and the server BIOS does not function correctly.
As a result, one or more PCPUs may remain in SMM (System Management Mode) for many seconds. When the VMkernel notices a PCPU is not available for an extended period of time, a purple diagnostic screen occurs.

The solution is as follows:

This issue has been resolved as of ESXi 5.0 Update 2, where PCC is disabled by default.
To work around this issue in versions prior to ESXi 5.0 U2, disable PCC manually.
To disable PCC:

    Connect to the ESXi host using the vSphere Client.
    Click the Configuration tab.
    In the Software menu, click Advanced Settings.
    Select VMkernel.
    Deselect the vmkernel.boot.usePCC option.
    Restart the host for the change to take effect.
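If you prefer PowerCLI, the check and the change can also be scripted; a minimal sketch is shown below, assuming the option is exposed as the host advanced setting VMkernel.Boot.usePCC and using placeholder vCenter and host names:

    Connect-VIServer vcenter.example.local

    # Read the current value of the PCC boot option (assumed setting name)
    $pcc = Get-AdvancedSetting -Entity (Get-VMHost esx01.example.local) -Name 'VMkernel.Boot.usePCC'
    $pcc | Select-Object Name, Value

    # Disable PCC, then reboot the host for the change to take effect
    $pcc | Set-AdvancedSetting -Value $false -Confirm:$false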

Tuesday, December 29, 2015

Differences between upgraded and newly created VMFS-5 datastores:

  • VMFS-5 upgraded from VMFS-3 continues to use the previous file block size, which may be larger than the unified 1MB file block size. Copy operations between datastores with different block sizes can’t leverage VAAI. This is the primary reason I would recommend creating new VMFS-5 datastores and migrating virtual machines to them rather than performing in-place upgrades of VMFS-3 datastores.
  • VMFS-5 upgraded from VMFS-3 continues to use 64KB sub-blocks and not new 8K sub-blocks.
  • VMFS-5 upgraded from VMFS-3 continues to have a file limit of 30,720 rather than the new file limit of > 100,000 for newly created VMFS-5.
  • VMFS-5 upgraded from VMFS-3 continues to use MBR (Master Boot Record) partition type; when the VMFS-5 volume is grown above 2TB, it automatically switches from MBR to GPT (GUID Partition Table) without impact to the running VMs.
  • VMFS-5 upgraded from VMFS-3 will continue to have a partition starting on sector 128; newly created VMFS-5 partitions start at sector 2,048.

Based on the information above, the best approach for migrating to VMFS-5 is to create new VMFS-5 datastores, if you have the extra storage space, can afford the number of Storage vMotions required, and have a VAAI-capable storage array holding existing datastores with 2, 4, or 8 MB block sizes.
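To see which existing datastores still carry a legacy block size, a quick PowerCLI check can help; this is a sketch and assumes VMFS datastores (the property path does not apply to NFS volumes):

    # Report the VMFS version and block size of every VMFS datastore
    Get-Datastore | Where-Object { $_.Type -eq 'VMFS' } |
        Select-Object Name,
            @{N='VmfsVersion'; E={ $_.ExtensionData.Info.Vmfs.Version }},
            @{N='BlockSizeMB'; E={ $_.ExtensionData.Info.Vmfs.BlockSizeMb }}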

Difference between VMFS 3 and VMFS 5 -- Part1

  • This post explains the major differences between VMFS 3 and VMFS 5. VMFS 5 is available as part of vSphere 5 and introduces many performance enhancements.
  • A newly installed ESXi 5 host is formatted with VMFS 5, but if you upgraded ESX 4.0 or ESX 4.1 to ESXi 5, the datastore version remains VMFS 3.
  • You can upgrade VMFS 3 to VMFS 5 via the vSphere Client once the ESXi upgrade is complete.




How to Identify the virtual machines with Raw Device Mappings (RDMs) using PowerCLI

Open the vSphere PowerCLI command-line.
Run the command:

Get-VM | Get-HardDisk -DiskType "RawPhysical","RawVirtual" | Select Parent,Name,DiskType,ScsiCanonicalName,DeviceName | fl

This command produces a list of virtual machines with RDMs, along with the backing SCSI device for the RDMs.

The output looks similar to:

Parent            : Virtual Machine Display Name
Name              : Hard Disk n
DiskType          : RawVirtual
ScsiCanonicalName : naa.646892957789abcdef0892957789abcde
DeviceName        : vml.020000000060912873645abcdef0123456789abcde9128736450ab

If you need to save the output to a file, the command can be modified:

Get-VM | Get-HardDisk -DiskType "RawPhysical","RawVirtual" | Select Parent,Name,DiskType,ScsiCanonicalName,DeviceName | fl | Out-File -FilePath RDM-list.txt

Identify the backing SCSI device from either the ScsiCanonicalName or DeviceName identifiers.

Snapshot consolidation "error: maximum consolidate retries was exceeded for scsix:x"

Whenever you cannot perform snapshot consolidation in VMware ESXi 5.5 or ESXi 6.0.x, or performing a snapshot consolidation in ESXi 5.5 fails.

or

When attempting to consolidate snapshots using the vSphere Client, you see the error:

maximum consolidate retries was exceeded for scsix:x

Consolidate Disks message: The virtual machine has exceeded the maximum downtime of 12 seconds for disk consolidation.

 This issue occurs because ESXi 5.5 introduced a different behavior to prevent the virtual machine from being stunned for an extended period of time.

This message is reported if the virtual machine is powered on and the asynchronous consolidation fails after 10 iterations. An additional iteration is performed if the estimated stun time is over 12 seconds. This occurs when the virtual machine generates data faster than the consolidation rate.

To resolve this issue, turn off the snapshot consolidation enhancement in ESXi 5.5 and ESXi 6.0.x so that it works like earlier versions of ESX/ESXi. This can be done by setting snapshot.asyncConsolidate.forceSync to TRUE.

  Note: If the parameter is set to TRUE, the virtual machine is stunned for a long time to perform the snapshot consolidation, and it may not respond to ping during the consolidation.

To set the parameter snapshot.asyncConsolidate.forceSync to TRUE using the vSphere client:

    Shut down the virtual machine.
    Right-click the virtual machine and click Edit Settings.
    Click the Options tab.
    Under Advanced, click General.
    Click Configuration Parameters, then click Add Row.
    In the left pane, add this parameter: snapshot.asyncConsolidate.forceSync
    In the right pane, add this value: TRUE
    Click OK to save your change, and power on the virtual machine.

To set the parameter snapshot.asyncConsolidate.forceSync to TRUE without shutting down the virtual machine, run this PowerCLI command:

get-vm virtual_machine_name | New-AdvancedSetting -Name snapshot.asyncConsolidate.forceSync -Value TRUE -Confirm:$False
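To confirm that the setting took effect, or to remove it again once the consolidation has finished, a hedged PowerCLI sketch (using the same placeholder virtual machine name):

    # Verify the current value of the parameter
    Get-VM virtual_machine_name | Get-AdvancedSetting -Name snapshot.asyncConsolidate.forceSync

    # Remove the override after consolidation completes, so future consolidations
    # do not force a long stun of the virtual machine
    Get-VM virtual_machine_name | Get-AdvancedSetting -Name snapshot.asyncConsolidate.forceSync | Remove-AdvancedSetting -Confirm:$false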

How to resolve: Cannot take a quiesced snapshot of Windows 2008 R2 virtual machine

When creating a snapshot on a Windows 2008 R2 virtual machine on ESXi/ESX 4.1 and later versions, you may experience these symptoms:
  • The snapshot operation fails to complete.
  • Unable to create a quiesced snapshot of the virtual machine.
  • Unable to back up the virtual machine.
  • Cloning a Windows 2008 R2 virtual machine fails.
  • In the Application section of the Event Viewer in the virtual machine, the Windows guest operating system reports a VSS error similar to:
           Volume Shadow Copy Service error: Unexpected error calling routine IOCTL_DISK_SET_SNAPSHOT_INFO(\\.\PHYSICALDRIVE1) fails with winerror 1168. hr = 0x80070490, Element not found.
  •  Any process that creates a quiesced snapshot fails.
  •  You see the error:
    Can not create a quiesced snapshot because the create snapshot operation exceeded the time limit for holding off I/O in the frozen virtual machine.

Backup applications, such as VMware Data Recovery, fail. You see the error:

  • Failed to create snapshot for vmname, error -3960 (cannot quiesce virtual machine)
  • This is a known issue with VSS application snapshots which is not caused by VMware software. It affects ESXi/ESX 4.1 and later versions.
  • Currently, there is no resolution.
  • To work around this issue, disable VSS quiesced application-based snapshots and revert to file system quiesced snapshots. You can disable VSS applications quiescing with either the VMware vSphere Client or with VMware Tools. Use one of these procedures:
 Disable VSS application quiescing using the vSphere Client:
  •  Power off the virtual machine.
  •  Log in to the vCenter Server or the ESXi/ESX host through the vSphere Client.
  •  Right-click the virtual machine and click Edit settings.
  •  Click the Options tab.
  •  Navigate to Advanced > General > Configuration Parameters.
  •  Add or modify the row disk.EnableUUID with the value FALSE.
  •  Click OK to save.
  •  Click OK to exit.
  •  Reboot the virtual machine for the changes to take effect.
Note: If this change is done through the command line using a text editor, running the vim-cmd command to reload the .vmx file is enough for the changes to take effect.
Alternatively, un-register the virtual machine from the vCenter Server inventory. To un-register, right-click the virtual machine and click Remove from Inventory.
        Re-register the virtual machine back to the inventory.

Disable VSS application quiescing using VMware Tools:

  • Open the C:\ProgramData\VMware\VMware Tools\Tools.conf file in a text editor, such as Notepad. If the file does not exist, create it.
  • Add these lines to the file:
            [vmbackup]
            vss.disableAppQuiescing = true
  • Save and close the file.
  • Restart the VMware Tools Service for the changes to take effect.
  • Click Start > Run, type services.msc, and click OK.
  • Right-click the VMware Tools Service and click Restart.
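If you prefer to make the disk.EnableUUID change from PowerCLI rather than the vSphere Client, a minimal sketch is below; the virtual machine name is a placeholder and the virtual machine should be powered off first:

    # Add (or overwrite) disk.EnableUUID = FALSE on the virtual machine
    Get-VM Win2008R2-VM | New-AdvancedSetting -Name disk.EnableUUID -Value FALSE -Force -Confirm:$false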

Taking a snapshot fails with the Error "Failed to take a memory snapshot, since the virtual machine is configured with independent disks"

When attempting to take a snapshot of a powered on virtual machine, you experience these symptoms:
You cannot take a snapshot with the Snapshot the virtual machine's memory option selected.

You see this error:

Failed to take a memory snapshot, since the virtual machine is configured with independent disks.

Resolution


This is expected behavior; virtual machines with independent disks cannot use memory or quiesced snapshots.

To resolve this issue, use one of these options:
When taking a snapshot of a virtual machine, deselect the Snapshot the virtual machine's memory and Quiesce Snapshot options.
Deselect the independent option in the virtual disk options.

To change the options for the virtual disk(s):

  • Open the vSphere Client.
  • Right-click the virtual machine and click Edit Settings.
  • Find the affected virtual disk(s) and deselect the Independent option.
  • Click OK to apply and save the changes to the virtual machine configuration.
 Note: This change requires the virtual machine to be powered off. If not, the option is grayed out.
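To find which virtual machines are affected before editing them one by one, a PowerCLI sketch that lists every independent disk:

    # List all virtual disks whose mode is IndependentPersistent or IndependentNonPersistent
    Get-VM | Get-HardDisk |
        Where-Object { $_.Persistence -like 'Independent*' } |
        Select-Object Parent, Name, Persistence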

How to Troubleshoot the NTP issue on ESX and ESXi 4.x / 5.x

Validate network connectivity between the ESXi/ESX host and the NTP server using the ping command.

Query ntpd service using ntpq

Use the NTP Query utility program ntpq to remotely query the ESXi/ESX host's ntpd service.
The ntpq utility is commonly installed on Linux clients and is also available in the ESX service console and the vSphere Management Assistant. For more information on the installation and use of the ntpq utility program on a given Linux distribution, see your Linux distribution's documentation.

For an ESXi 5.x host, the ntpq utility is included by default and does not need to be installed. It can be run locally from the ESXi 5.x host.


The ntpq utility is not available on ESXi 3.x/4.x. To query an ESXi host's NTP service ntpd, install ntpq on a remote Linux client and query the ESXi host's ntpd service from the Linux client.

To use the NTP Query utility ntpq to remotely query the ESX host's NTP service (ntpd) and determine whether it is successfully synchronizing with the upstream NTP server:

When using a Linux client, open a console session on the client where ntpq is installed.
Run this command:


When using an SSH shell or local console session on ESXi 5.5 and 5.1:
# watch "ntpq -p localhost_or_127.0.0.1"

When using a Linux client for ESXi/ESX 4.x:
# watch "ntpq -p ESX_host_IP_or_domain_name"

Monitor the output for 30 seconds and press Ctrl+C on your keyboard to stop the watch command.


Note: In ESXi 5.5 and 5.1, the output shows either localhost or the loopback address (127.0.0.1).

     remote           refid     st t  when poll reach  delay  offset  jitter
============================================================================
*10.11.12.130     1.0.0.0        1 u    46   64   377  43.76    5.58   40000
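The host-side NTP configuration can also be checked, and ntpd restarted, from PowerCLI; a sketch with a placeholder host name:

    $vmhost = Get-VMHost esx01.example.local

    # Show the configured NTP servers and the state of the ntpd service
    Get-VMHostNtpServer -VMHost $vmhost
    Get-VMHostService -VMHost $vmhost | Where-Object { $_.Key -eq 'ntpd' }

    # Restart ntpd if it is not synchronizing
    Get-VMHostService -VMHost $vmhost | Where-Object { $_.Key -eq 'ntpd' } | Restart-VMHostService -Confirm:$false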


How to resolve: vMotion fails with network errors

Network misconfiguration can cause random vMotion failures. Retrying the vMotion operation may succeed; to isolate and correct the problem, check the items below, as suggested by VMware.

To resolve this issue:
  • Check for IP address conflicts on the vMotion network. Each host in the cluster should have a vMotion vmknic, assigned a unique IP address.
  • Check for packet loss over the vMotion network. Try having the source host ping (vmkping) the destination host's vMotion vmknic IP address for the duration of the vMotion.
  • Check for connectivity between the two hosts (use the same ping test as above).
  • Check for potential interaction with firewall hardware or software that prevents connectivity between the source and destination hosts on TCP port 8000.
For the Connection refused error, after confirming a lack of IP address conflicts, check to see that the vmotionServer process is running. If it is running, it exists as a kernel process visible in the output of the ps or esxtop command.
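A quick way to compare the vMotion vmknic configuration across a cluster, and spot duplicate or missing IP addresses, is the following PowerCLI sketch; the cluster name is a placeholder:

    # List every vMotion-enabled VMkernel adapter with its IP address and subnet mask
    Get-Cluster 'Prod-Cluster' | Get-VMHost |
        Get-VMHostNetworkAdapter -VMKernel |
        Where-Object { $_.VMotionEnabled } |
        Select-Object VMHost, Name, IP, SubnetMask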

Remediating an ESXi 5.x and 6.0 host with Update Manager fails with the error: There was an error checking file system on altbootbank

Whenever you find the symptoms below:

You cannot remediate an ESXi 5.x or 6.0 host.
Remediation of an ESXi 5.x or 6.0 host using vCenter Update Manager fails.
You see the error:


The host returns esxupdate error code:15. The package manager transaction is not successful. Check the Update Manager log files and esxupdate log files for more details

In the /var/log/esxupdate.log file, you see entries similar to:

esxupdate: esxupdate: ERROR: InstallationError: ('', 'There was an error checking file system on altbootbank, please see log for detail.')

The resolution is as follows:

To resolve the issue, repair the altbootbank partition.

To repair the altbootbank partition:

    Run this command to determine the device for /altbootbank:

    vmkfstools -P /altbootbank

    You see output similar to:

mpx.vmhba32:C0:T0:L0:5


Run this command to repair the altbootbank filesystem:

dosfsck -a -w /dev/disks/device_name

For example:

# dosfsck -a -w /dev/disks/mpx.vmhba32:C0:T0:L0:5


 If remediation fails at this stage, reboot the host.

Esxupdate error code:15. The package manager transaction is not successful error While Remediating an ESXi 5.x or 6.0 host

Whenever you cannot remediate an ESXi 5.x or 6.0 host using vCenter Update Manager.

 Remediating ESXi 5.x or 6.0 hosts fails.


This can occur when a package is to be updated on the host, particularly when the VMware_locker_tools-light* package is corrupt.


error code:15. The package manager transaction is not successful. Check the Update Manager log files and esxupdate log files for more details.

To resolve this issue:

 Recreate the /locker/packages/version/ folder, where version is:
        ESXi 5.0 – /locker/packages/5.0.0/
        ESXi 5.1 – /locker/packages/5.1.0/
        ESXi 5.5 – /locker/packages/5.5.0/
        ESXi 6.0 – /locker/packages/6.0.0/

To verify the store folders contents and symbolic link:

 Connect to the ESXi host using an SSH session.
 Check for information in the /store folder by running this command:
        ls /store

This folder must contain the packages and var folders.
Run this command to verify that the symbolic link is valid:
        ls -l /

The /store folder should be linked to /locker and appear as:
        locker  -> /store

If that link is not displayed, run this command to add the symbolic link:
        ln -s /store /locker

To recreate the /locker/packages/version/ folder:
 Put the host in the maintenance mode.
 Navigate to the /locker/packages/version/ folder on the host.
 Rename /locker/packages/version/ folder to /locker/packages/version.old.
 Remediate the host using Update Manager.

The /locker/packages/version/ folder is recreated and the remediation should now be successful.
 Note: Verify that you can change to the other folders in /locker/packages/version/. If not, rename all three folders, including floppies.

An alternative resolution for ESXi:
Put the host in the maintenance mode.
Navigate to the /locker/packages/version/ folder on the host.
Rename the folder to:
       /locker/packages/version.old

Run this command as the root user to recreate the folder:
       mkdir /locker/packages/version/

For example:

In ESXi 5.0:
        mkdir /locker/packages/5.0.0/

In ESXi 5.1:
        mkdir /locker/packages/5.1.0/

In ESXi 5.5:
        mkdir /locker/packages/5.5.0/

In ESXi 6.0:
        mkdir /locker/packages/6.0.0/

Use WinSCP to copy the folders and files from the /locker/packages/version/ directory on a working host to the affected host.


If the preceding methods do not resolve the issue:
Verify and ensure that there is sufficient free space in the root folder by running this command:
        vdf -h

Check the locker location by running this command:
        ls -ltr /

If the locker is not pointing to a datastore:
Rename the old locker file by running this command:
        mv /locker /locker.old

Recreate the symbolic link by running this command:
        ln -s /store /locker

Monday, December 21, 2015

Troubleshooting Syslog Collector in VMware vSphere

Whenever syslog files aren’t updating in the repository on the vSphere Syslog Collector server.

Here are some basic steps that can be used to troubleshoot this problem.
VMware ESXi hosts

On the VMware ESXi hosts check the following settings:
– Syslog destination. Open the vSphere Client. On the ESXi server, open the Configuration tab and select Advanced Settings. Check the Syslog.global.logHost value. The format is protocol://FQDN:port, for example udp://syslog.beerens.local:514.
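The same value can be read and corrected from PowerCLI; a sketch, using a placeholder host name and the example syslog server above:

    # Check the current syslog destination on the host
    Get-AdvancedSetting -Entity (Get-VMHost esx01.example.local) -Name 'Syslog.global.logHost'

    # Point the host at the Syslog Collector
    Get-AdvancedSetting -Entity (Get-VMHost esx01.example.local) -Name 'Syslog.global.logHost' |
        Set-AdvancedSetting -Value 'udp://syslog.beerens.local:514' -Confirm:$false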

– Is the ESXi firewall port open for syslog traffic? Open the vSphere Client; on the ESXi server, open the Configuration tab, select Security Profile, and under Firewall select Properties. Check that the syslog service is enabled.
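The same check can be scripted from PowerCLI; a sketch, assuming the firewall exception is named syslog and using a placeholder host name:

    # Show the syslog firewall rule and enable it if it is disabled
    Get-VMHostFirewallException -VMHost (Get-VMHost esx01.example.local) -Name 'syslog'
    Get-VMHostFirewallException -VMHost (Get-VMHost esx01.example.local) -Name 'syslog' |
        Set-VMHostFirewallException -Enabled:$true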

 vSphere Syslog Collector
On the vSphere Syslog Collector server check the following settings:
– Is the syslog port 514 (default) listening:

– Reload and update the syslog configuration. On the ESXi host, use the following command:
esxcli system syslog reload
– Is the Syslog Collector service started? Restart the Syslog Collector service if needed.
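Both of these steps can also be driven from PowerShell; a sketch, where the host name and the collector's Windows service display name are assumptions:

    # Reload the syslog configuration on the ESXi host via Get-EsxCli
    $esxcli = Get-EsxCli -VMHost esx01.example.local
    $esxcli.system.syslog.reload()

    # On the Windows server running the Syslog Collector, restart its service
    Get-Service -DisplayName '*Syslog Collector*' | Restart-Service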

After reloading the syslog settings and restarting the Syslog Collector service, the files began to update again in the repository.

How to Enable Execute Disable/No Execute CPU Feature on ESXi


ESXi requires the Execute Disable/No Execute CPU feature to be enabled

Restart the host and press F9 to enter the BIOS setup.

Advanced Options --> Processor Options --> No-Execute Memory Protection, then configure: Enabled

Hope it helps.

VMware: How to rollback ESXi 5.1 to 5.0

Whenever you find issues after upgrading from ESXi 5.0 to 5.1, the rollback is as simple as below.

Reboot the host and press Shift+R during boot to start Recovery Mode.
Installed hypervisors:
HYPERVISOR1: 5.0.0-623860
HYPERVISOR2: 5.1.0-799733 (Default)
CURRENT DEFAULT HYPERVISOR WILL BE REPLACED PERMANENTLY
DO YOU REALLY WANT TO ROLL BACK?

Press Y to start the rollback.

Result:
The host is downgraded and back online again with VMware vSphere ESXi 5.0.0

How to Disable the interrupt remapping on ESXi

 ESXi/ESX 4.1

To disable interrupt remapping on ESXi/ESX 4.1, perform one of these options:

    Run this command from a console or SSH session to disable interrupt mapping:

    # esxcfg-advcfg -k TRUE iovDisableIR

    To back up the current configuration, run this command twice:

    # auto-backup.sh

    Note: It must be run twice to save the change.

    Reboot the ESXi/ESX host:

    # reboot

    To check if interrupt mapping is set after the reboot, run the command:

    # esxcfg-advcfg -j iovDisableIR

    iovDisableIR=TRUE
    In the vSphere Client:
        Click Configuration > (Software) Advanced Settings > VMkernel.
        Select the VMkernel.Boot.iovDisableIR option, then click OK.
        Reboot the ESXi/ESX host.

ESXi 5.x and ESXi 6.0.x

ESXi 5.x and ESXi 6.0.x do not provide this parameter as a configurable option in the GUI client. It can only be changed using the esxcli command or via PowerCLI.


    To set the interrupt mapping using the esxcli command:

    List the current setting by running the command:

    # esxcli system settings kernel list -o iovDisableIR

    The output is similar to:

    Name          Type  Description                              Configured  Runtime  Default
    ------------  ----  ---------------------------------------  ----------  -------  -------
    iovDisableIR  Bool  Disable Interrupt Routing in the IOMMU   FALSE        FALSE    FALSE

    Disable interrupt mapping on the host using this command:

    # esxcli system settings kernel set --setting=iovDisableIR -v TRUE

    Reboot the host after running the command.

    Note: If the hostd service fails or is not running, the esxcli command does not work. In such cases, you may have to use the localcli instead. However, the changes made using localcli do not persist across reboots. Therefore, ensure that you repeat the configuration changes using the esxcli command after the host reboots and the hostd service starts responding. This ensures that the configuration changes persist across reboots.
    To set the interrupt mapping through PowerCLI:

    Note: The PowerCLI commands do not work with ESXi 5.1. You must use the esxcli commands as detailed above.

    PowerCLI> Connect-VIServer -Server 10.21.69.233 -User Administrator -Password passwd
    PowerCLI> $myesxcli = Get-EsxCli -VMHost 10.21.69.111
    PowerCLI> $myesxcli.system.settings.kernel.list("iovDisableIR")

    Configured  : FALSE
    Default     : FALSE
    Description : Disable Interrupt Routing in the IOMMU
    Name        : iovDisableIR
    Runtime     : FALSE
    Type        : Bool

    PowerCLI> $myesxcli.system.settings.kernel.set("iovDisableIR","TRUE")
    true

    PowerCLI> $myesxcli.system.settings.kernel.list("iovDisableIR")

    Configured  : TRUE
    Default     : FALSE
    Description : Disable Interrupt Routing in the IOMMU
    Name        : iovDisableIR
    Runtime     : FALSE
    Type        : Bool
    After the host has finished booting, you see this entry in the /var/log/boot.gz log file confirming that interrupt mapping has been disabled:

    TSC: 543432 cpu0:0)BootConfig: 419: iovDisableIR = TRUE

How to resolve - vCenter Server task migration fails with the error: Failed to create journal file provider, Failed to open for write

For vCenter Server

The journal files for vCenter Server on Windows are located at:

    Windows 2003 and earlier – %ALLUSERSPROFILE%\Application Data\VMware\VMware VirtualCenter\journal\
    Windows 2008 and later – %PROGRAMDATA%\VMware\VMware VirtualCenter\journal\

Whenever you:

    Cannot perform provisioning operations, such as vMotion, Clone, storage DRS
    Cannot create new virtual machines.
    Cannot add RDM disk to a virtual machine.
    Provisioning operations (such as vMotion, Clone, or Migrate) fail.
    vCenter Server task migration fails. You see the error:

    A general system error occurred: Failed to create journal file providerFailed to open "<filename>" for write   

Cause

This issue occurs if there is not enough free disk space to store the journal information. The management components in vCenter Server and ESX/ESXi record transactions to journal files when tracking long-running operations. The path and filename cited in the error message indicate the layer that failed to create a journal file.

The resolution is as follows:

Delete or archive the unnecessary files on this filesystem to free up disk space. Depending on your vCenter Server implementation, it is recommended to have a minimum of 40GB of disk space free on the system.



Saturday, November 21, 2015

Brief about ESXTOP - Batch Mode

Batch mode: statistics can be collected and the output saved to a CSV file that can later be viewed and analyzed using Windows Perfmon and other tools.

To run esxtop in batch mode and save the output to a file for future analysis, use the command syntax below:

esxtop -b -d 10 -n 5 > /home/nagu/esxstats.csv

The -d switch specifies the number of seconds between refreshes.
The -n switch specifies the number of iterations to run esxtop.


In the above example, the esxtop command will run for about 50 seconds (10-second delay * 5 iterations), redirecting the esxtop statistics into a CSV file stored at:

/home/nagu/esxstats.csv




Once the command has completed, browse to /home/nagu to see the esxtop output file “esxstats.csv”. Transfer the CSV file using WinSCP to your Windows desktop and analyze it using Windows Perfmon or esxplot.

VMware Interview Questions and Answers on vMotion

1. What is vMotion?

      Live migration of a virtual machine from one ESX server to another with zero downtime.

2. What are the use cases of vMotion?
  • Balance the load on ESX servers (DRS)
  • Save power by shutting down ESX hosts using DPM
  • Perform patching and maintenance on an ESX server (Update Manager or hardware maintenance)
3. What are the prerequisites for vMotion to work?
  • ESX hosts must be licensed for vMotion.
  • ESX servers must be configured with vMotion-enabled VMkernel ports.
  • ESX servers must have compatible CPUs for vMotion to work.
  • ESX servers should have shared storage (FC, iSCSI or NFS), and the VMs should be stored on that storage.
  • ESX servers should have identical networks and network (port group) names.
4. What are the limitations of vMotion?
  • Virtual machines configured with Raw Device Mappings (RDMs) for clustering features cannot be migrated with vMotion.
  • The VM cannot be connected to a CD-ROM or floppy drive that is using an ISO or floppy image stored on a drive that is local to the host server. The device should be disconnected before initiating the vMotion.
  • A virtual machine cannot be migrated with vMotion unless the destination swapfile location is the same as the source swapfile location. As a best practice, place the virtual machine swap files with the virtual machine configuration file.
  • Virtual machine CPU affinity must not be set (that is, the VM must not be bound to physical CPUs).
5. What are the steps involved in VMware vMotion?
  • A request is made that VM-1 should be migrated (or “VMotioned”) from ESX A to ESX B.
  • VM-1’s memory is pre-copied from ESX A to ESX B while ongoing changes are written to a memory bitmap on ESX A.
  • VM-1 is quiesced on ESX A and VM-1’s memory bitmap is copied to ESX B.
  • VM-1 is started on ESX B and all access to VM-1 is now directed to the copy running on ESX B.
  • The rest of VM-1’s memory is copied from ESX A; in the meantime, any reads or writes of memory that has not yet been copied are serviced from ESX A when applications on VM-1 (now on ESX B) attempt to access it.
  • If the migration is successful, VM-1 is unregistered on ESX A.

PowerShell Script to List all VM’s with a connected CD-ROM/floppy device

This script reports all VMs with a connected CD-ROM/floppy device. It gives you information about the device status, e.g. connected, connected at power on, or client device.

Replace vCenter_name with your vCenter Server name in the first line:

Connect-VIServer vCenter_name

$vms = Get-VM
write "VMs with a connected CD-ROM:"
foreach ($vm in $vms | where { $_ | Get-CDDrive | where { $_.ConnectionState.Connected -eq "true"}}) {
write $vm.name
}
write "VMs with CD-ROM connected at power on:"
foreach ($vm in $vms | where { $_ | Get-CDDrive | where { $_.ConnectionState.StartConnected -eq "true"}}) {
write $vm.name
}
write "VMs with CD-ROM connected as 'Client Device':"
foreach ($vm in $vms | where { $_ | Get-CDDrive | where { $_.RemoteDevice.Length -ge 0}}) {
write $vm.name
}
write "VMs with CD-ROM connected to 'Datastore ISO file':"
foreach ($vm in $vms | where { $_ | Get-CDDrive | where { $_.ISOPath -like "*.ISO*"}}) {
write $vm.name
}
write "VMs with connected Floppy:"
foreach ($vm in $vms | where { $_ | Get-FloppyDrive | where { $_.ConnectionState.Connected -eq "true"}}) {
write $vm.name
}
write "VMs with floppy connected at power on:"
foreach ($vm in $vms | where { $_ | Get-FloppyDrive | where { $_.ConnectionState.StartConnected -eq "true"}}) {
write $vm.name
}
write "VMs with floppy connected as 'Client Device':"
foreach ($vm in $vms | where { $_ | Get-FloppyDrive | where { $_.RemoteDevice.Length -ge 0}}) {
write $vm.name
}

Note: Copy this code into Notepad and save the file with a .ps1 extension.
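If the goal is to clear these devices before a vMotion or a maintenance window, a hedged follow-up sketch that disconnects every currently connected CD-ROM drive (review the list produced by the script above before running it):

    # Disconnect every CD-ROM drive that is currently connected
    Get-VM | Get-CDDrive |
        Where-Object { $_.ConnectionState.Connected } |
        Set-CDDrive -Connected:$false -Confirm:$false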