Friday, February 26, 2016
Wednesday, February 24, 2016
Repeat a Linux Command Every X Seconds Forever
NAGARAJU AVALA
Wednesday, February 24, 2016
Run Linux Command Every Second
In this article, you will learn simple scripting techniques to keep an eye on a command by re-running it continuously at a fixed interval.
1. Use watch Command
watch is a Linux command that runs a command or program periodically and shows its output on the screen, so you can see the output change over time. By default, watch re-runs the command every 2 seconds. The interval can easily be changed to meet your requirements.
Monitor Memory Usage
watch is extremely easy to use. To test it, open a Linux terminal and type the following command:
# watch free -m
The above command will check your system free memory and update the results of the free command every two seconds.
Monitor Memory Usage in Linux
As the output shows, watch displays a header containing (from left to right) the update interval, the command being executed and the current time. If you wish to hide this header, use the -t option.
The next logical question is how to change the execution interval. For that purpose, use the -n option, which specifies the interval, in seconds, at which the command is executed. So let's say you want to run your script.sh file every 10 seconds; you can do it like this:
# watch -n 10 script.sh
Monitor Logged-In Users, Uptime and Load Average
Let's say you want to monitor logged-in users, server uptime and load average continuously, refreshed every few seconds. Use the following command:
# watch uptime
Watch Linux Load Average
To exit the command, press CTRL+C.
Here, the 'uptime' command will run and display the updated results every 2 seconds by default.
Monitor Progress of Copy Command
In Linux, the cp command does not show any progress while copying files from one location to another. To see how much data has been copied so far, you can use the watch command along with du -s to check disk usage of the destination in real time.
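As a rough sketch of this idea, the script below starts a copy in the background and prints the size of the destination once per second until cp exits. The source and destination directories are temporary, hypothetical paths created only to keep the example self-contained.

```shell
#!/bin/sh
# Hypothetical source and destination, created only for this demonstration.
src=$(mktemp -d)
dest=$(mktemp -d)
head -c 1048576 /dev/zero > "$src/sample.bin"   # 1 MiB test file

# Start the copy in the background and remember its process ID.
cp -r "$src/." "$dest/" &
cp_pid=$!

# Print the destination's disk usage once per second while cp is still running.
while kill -0 "$cp_pid" 2>/dev/null; do
    du -s "$dest"
    sleep 1
done

# Collect the exit status of cp and show the final size.
wait "$cp_pid"
copy_status=$?
du -s "$dest"
```

In an interactive terminal the same check is usually shorter: run the copy in one terminal and, in another, run watch -n 1 du -s followed by the real destination directory.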
2. Use sleep Command
sleep is often used to debug shell scripts, but it has many other useful purposes as well. For example, when combined with for or while loops, it can give you some very handy results.
In case this is the first time you hear about the "sleep" command, it is used to delay something for a specified amount of time. In scripts, you can use it to tell your script to run command 1, wait for 10 seconds and then run command 2.
With a for or while loop, you can tell bash to run a command, sleep for N seconds and then run the command again.
Below you can see examples of both loops:
for loop Example
# for i in {1..10}; do echo -n "This is a test in loop $i "; date ; sleep 5; done
The above one-liner runs the echo command and displays the current date a total of 10 times, with a 5-second sleep between executions. Here are the first lines of a sample output:
This is a test in loop 1 Wed Feb 17 20:49:47 EET 2015
This is a test in loop 2 Wed Feb 17 20:49:52 EET 2015
This is a test in loop 3 Wed Feb 17 20:49:57 EET 2015
This is a test in loop 4 Wed Feb 17 20:50:02 EET 2015
You can replace the echo and date commands with your own commands or script and change the sleep interval to suit your needs.
while loop Example
# while true; do echo -n "This is a test of while loop";date ; sleep 5; done
Here is sample output:
This is a test of while loopWed Feb 17 20:52:32 EET 2015
This is a test of while loopWed Feb 17 20:52:37 EET 2015
This is a test of while loopWed Feb 17 20:52:42 EET 2015
This is a test of while loopWed Feb 17 20:52:47 EET 2015
This is a test of while loopWed Feb 17 20:52:52 EET 2015
This is a test of while loopWed Feb 17 20:52:57 EET 2015
The above command will run until it is either killed or interrupted by the user. It can come in handy if you need to keep a command running in the background and you don't want to rely on cron.
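The two loops can be folded into one small helper. The function below is only a sketch: the name repeat_every and its argument order are made up for this example and are not a standard command.

```shell
# repeat_every INTERVAL COUNT COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND every INTERVAL seconds, COUNT times.
# A COUNT of 0 means "repeat until interrupted", like the while loop above.
repeat_every() {
    interval=$1
    count=$2
    shift 2
    runs=0
    while [ "$count" -eq 0 ] || [ "$runs" -lt "$count" ]; do
        "$@"
        runs=$((runs + 1))
        # Skip the final sleep so a bounded run returns as soon as it is done.
        if [ "$count" -eq 0 ] || [ "$runs" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example: print the date three times, one second apart.
repeat_every 1 3 date
```

Passing the command as separate arguments ("$@") avoids eval, so arguments containing spaces are handed to the command unchanged.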
How to Install Skype 4.3 on Arch Linux
NAGARAJU AVALA
Wednesday, February 24, 2016
Skype is one of the most popular VoIP (Voice over IP) applications available for Linux.
Install Skype in Arch Linux
What's new in this version of Skype for Linux:
- An enhanced user interface.
- A new cloud-based Group Chat experience.
- Improved support for file transfer when using multiple devices at the same time.
- Support for PulseAudio 3.0 and 4.0.
- The ALSA sound system is no longer supported without PulseAudio.
- Many bug fixes.
1. Before installing Skype 4.3 on Arch Linux, make sure that PulseAudio and all the required libraries are installed on your system by using the following commands.
On 32-bit Arch Linux
$ sudo pacman -S pulseaudio pulseaudio-alsa pavucontrol
On 64-bit Arch Linux
$ sudo pacman -S pulseaudio pulseaudio-alsa pavucontrol lib32-libpulse
2. Then stop and start PulseAudio server with the following commands.
$ pulseaudio -k
$ pulseaudio --start
3. Now install the old Skype package from the official Arch repository in order to pull in all the dependencies it needs to run smoothly.
$ sudo pacman -S skype
Install Old Skype in Arch
4. Now, to upgrade your software to the latest version, go to the official Skype web page using the following link, download the Dynamic package and extract it.
http://www.skype.com/en/download-skype/skype-for-linux/
$ cd Downloads
$ tar xjvf skype-4.3.0.37.tar.bz2
$ cd skype-4.3.0.37/
Download and Install Latest Skype
5. Without leaving the folder, use the following commands to upgrade Skype to the latest version, 4.3, from the extracted files.
$ sudo cp -r avatars/* /usr/share/skype/
$ sudo cp -r lang/* /usr/share/skype/
$ sudo cp -r sounds/* /usr/share/skype/
$ sudo cp skype /usr/bin/
$ sudo chmod +x /usr/bin/skype
6. Now reboot your system and open Skype. You should see the latest version running on your Arch Linux.
Skype 4.3 Login Screen
About Skype 4.3
7. To revert to the Skype version from the official Arch repository, run the following commands.
$ sudo pacman -R skype
$ sudo pacman -S skype
If you want to install Skype 4.3 on other Linux distributions like Ubuntu, Debian, Fedora and OpenSuse, visit the official Skype page and grab the binary specially built and packaged for those distributions by the Skype developers.
Remediating an ESXi 5.x or 6.0 host fails with the error: The host returns esxupdate error code:15. The package manager transaction is not successful
NAGARAJU AVALA
Wednesday, February 24, 2016
Symptoms
You cannot remediate an ESXi 5.x or 6.0 host using vCenter Update Manager.
Remediating ESXi 5.x or 6.0 hosts fails.
A package is to be updated on the host, particularly when VMware_locker_tools-light* is corrupt.
You see the error:
error code:15. The package manager transaction is not successful. Check the Update Manager log files and esxupdate log files for more details.
Cause
This issue occurs if the package files for floppies in the /locker/packages/version/ folder are corrupt or the folder is full.
For example:
In ESXi 5.0 systems – /locker/packages/5.0.0/
In ESXi 5.1 systems – /locker/packages/5.1.0/
In ESXi 5.5 systems – /locker/packages/5.5.0/
In ESXi 6.0 systems – /locker/packages/6.0.0/
Resolution
To resolve this issue, recreate the /locker/packages/version/ folder, where version is:
ESXi 5.0 – /locker/packages/5.0.0/
ESXi 5.1 – /locker/packages/5.1.0/
ESXi 5.5 – /locker/packages/5.5.0/
ESXi 6.0 – /locker/packages/6.0.0/
To verify the contents of the /store folder and the symbolic link:
Connect to the ESXi host using an SSH session.
Check for information in the /store folder by running this command:
ls /store
This folder must contain the packages and var folders.
Run this command to verify that the symbolic link is valid:
ls -l /
The /store folder should be linked to /locker and appear as:
locker -> /store
If that link is not displayed, run this command to add the symbolic link:
ln -s /store /locker
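The link check above can also be scripted. This is a minimal sketch that assumes a readlink utility is available in the shell you are using; the helper name check_link is made up for the example.

```shell
# check_link LINK EXPECTED_TARGET
# Succeeds only if LINK is a symbolic link whose target is EXPECTED_TARGET.
check_link() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
}

# On the ESXi host, the check from the steps above would be:
if check_link /locker /store; then
    echo "/locker -> /store is in place"
else
    echo "/locker is missing or does not point to /store"
fi
```

The function only reports the state of the link; it does not create or change anything.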
To recreate the /locker/packages/version/ folder:
Put the host in maintenance mode.
Navigate to the /locker/packages/version/ folder on the host.
Rename /locker/packages/version/ folder to /locker/packages/version.old.
Remediate the host using Update Manager.
The /locker/packages/version/ folder is recreated and the remediation should now be successful.
Note: Verify if you can change to the other folders in /locker/packages/version/. If not, rename all the three folders including floppies.
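The rename step can be sketched as a small shell function. The function name and its two arguments are hypothetical and only serve to illustrate the step; the version string has to match the host, as listed above.

```shell
# rename_version_dir PACKAGES_DIR VERSION
# Renames PACKAGES_DIR/VERSION to PACKAGES_DIR/VERSION.old and refuses to
# overwrite an existing backup.
rename_version_dir() {
    packages_dir=$1
    version=$2
    if [ ! -d "$packages_dir/$version" ]; then
        echo "no such folder: $packages_dir/$version" >&2
        return 1
    fi
    if [ -e "$packages_dir/$version.old" ]; then
        echo "backup already exists: $packages_dir/$version.old" >&2
        return 1
    fi
    mv "$packages_dir/$version" "$packages_dir/$version.old"
}

# On an ESXi 5.5 host in maintenance mode this would be called as:
#   rename_version_dir /locker/packages 5.5.0
```

Keeping the old folder as version.old, instead of deleting it, leaves a way back if the remediation still fails.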
An alternative resolution for ESXi:
Put the host in maintenance mode.
Navigate to the /locker/packages/version/ folder on the host.
Rename the folder to:
/locker/packages/version.old
Run this command as the root user to recreate the folder:
mkdir /locker/packages/version/
For example:
In ESXi 5.0:
mkdir /locker/packages/5.0.0/
In ESXi 5.1:
mkdir /locker/packages/5.1.0/
In ESXi 5.5:
mkdir /locker/packages/5.5.0/
In ESXi 6.0:
mkdir /locker/packages/6.0.0/
Use WinSCP to copy the folders and files from the /locker/packages/version/ directory on a working host to the affected host.
If the preceding methods do not resolve the issue:
Verify that there is sufficient free space on the root folder by running this command:
vdf -h
Check the locker location by running this command:
ls -ltr /
If the locker is not pointing to a datastore:
Rename the old locker file by running this command:
mv /locker /locker.old
Recreate the symbolic link by running this command:
ln -s /store /locker
Why VM creation in KVM is failing with error: libvirtError: Unable to read from monitor: Connection reset by peer on Red Hat Enterprise Linux 6.5?
NAGARAJU AVALA
Wednesday, February 24, 2016
Issue
While creating or starting a VM, the following error appears:
Unable to complete install: 'Unable to read from monitor: Connection reset by peer'
Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 44, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/create.py", line 1928, in do_install
    guest.start_install(False, meter=meter)
  File "/usr/lib/python2.6/site-packages/virtinst/Guest.py", line 1229, in start_install
    noboot)
  File "/usr/lib/python2.6/site-packages/virtinst/Guest.py", line 1297, in _create_guest
    dom = self.conn.createLinux(start_xml or final_xml, 0)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2686, in createLinux
    if ret is None:raise libvirtError('virDomainCreateLinux() failed', conn=self)
libvirtError: Unable to read from monitor: Connection reset by peer
Resolution
Add the display/video devices spice, virtio and qxl to the VM configuration.
Bring the loopback interface (eth-lo) up.
Root Cause
The display hardware (virtio, spice, qxl) was not enabled or added in the VM configuration.
The loopback interface eth-lo was down, so the localhost connection could not be made.
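To confirm the loopback part of this diagnosis, the state of the interface can be read from sysfs. The sketch below assumes the loopback interface is named lo, which is the usual name on Linux; the article calls it eth-lo, so adjust the iface variable if your system differs. The helper name flags_has_up is made up for the example.

```shell
# flags_has_up FLAGS
# FLAGS is the hexadecimal value read from /sys/class/net/<iface>/flags.
# Bit 0x1 is IFF_UP, so the interface is administratively up when it is set.
flags_has_up() {
    [ $(( $1 & 0x1 )) -ne 0 ]
}

iface=lo   # assumed interface name; adjust if yours differs
flags_file=/sys/class/net/$iface/flags

if [ -r "$flags_file" ] && flags_has_up "$(cat "$flags_file")"; then
    echo "$iface is up"
else
    echo "$iface is down or missing; as root, try: ip link set $iface up"
fi
```

The script only reads the state and prints a suggestion, so it is safe to run as an ordinary user.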
User execution of su fails with error "bash: /bin/su: Permission denied"
NAGARAJU AVALA
Wednesday, February 24, 2016
Issue
After updating the system, no one (including root) can use su without getting bash: /bin/su: Permission denied errors, but ssh and terminal logins work as normal. SELinux is disabled and no new log entries are generated in /var/log/secure or /var/log/messages when root tries to run /bin/su, which has appropriate permissions, file size, and md5sum. The /etc/nsswitch.conf, /etc/pam.d/system-auth, and /etc/pam.d/su files have all been replaced with default versions and still the problem remains.
[root@localhost ~]# ls -l /bin/su
-rwsr-xr-x 1 root root 28336 May 11 2011 /bin/su
[root@localhost ~]# /bin/su -
bash: /bin/su: Permission denied
[root@localhost ~]# strace -tvfs 2048 -o su_strace_root.log su -lc exit
strace: exec: Permission denied
<truncated strace output:>
28530 15:17:38 execve("/bin/su", ["su", "-lc", "exit"], ... "_=/usr/bin/strace"]) = -1 EACCES (Permission denied)
Resolution
The one customer that reported this issue eventually resolved it by realizing that the system had a 3rd-party LDAP application installed that was no longer being used but which hadn't been permanently disabled and was therefore initialized after the reboot. Customer quote:
Issue was basically with an LDAP client called TAMOS running on the system was causing this authorization issues.
I hit that because after reboot all the disabled TAMOS processes started and not allowing us to authorize to su -.
Now TAMOS is been uninstalled from the system and we are good now.
If seeing a similar problem, this particular cause could be confirmed or ruled out by checking if the kail kernel module is loaded or if there are TAMOS log entries on the system.
$ lsmod | grep kail
kail 124328 1 kaznmod,[permanent]
$ egrep 'TAMOS|kail' /var/log/dmesg
kail: no version for "struct_module" found: kernel tainted.
kail: no version magic, tainting kernel.
TAMOS: INFO kail_init_module kernel module initializing
TAMOS INFO: kail_kernel.c: init_module OK: Perm2rw
TAMOS: INFO kail_kernel kailPerm2rw
kail_kernel change_perm loop: 0
kail_kernel change_perm loop: 1
kail_kernel change_perm loop: 0
kail_kernel change_perm loop: 1
TAMOS INFO: nct_async.c TIMEDWAIT_THREAD NOT enabled
TAMOS INFO: nct_async.c TRACE_AREXIT enabled
TAMOS INFO: nct_asyncInit pre: FFFFFFFF88512A50:FFFFFFFF88512A68:FFFFFFFF88512A60
TAMOS INFO nct_init.. scanning procs
TAMOS INFO: nct_init LSM enabled
TAMOS INFO: ignoring NR call index: 1000 at kail index: 3
kail_kernel kailPerm2ro
kail_kernel change_perm loop: 0
kail_kernel change_perm loop: 1
TAMOS INFO:kail_kernel perm2ro done: 0
kail_kernel change_perm loop: 0
kail_kernel change_perm loop: 1
TAMOS INFO:kail_kernel perm2ro done: 0
TAMOS INFO: setting up as a security module
TAMOS INFO: nct_arThread entering base counter local: 0: FFFF810234A75DEC global: 0: FFFFFFFF88512A48
TAMOS INFO: nct_arThread entering base counter local: 0: FFFF810234A73DEC global: 1000: FFFFFFFF88512A48
TAMOS INFO: nct_arThread entering base counter local: 0: FFFF810234A71DEC global: 2000: FFFFFFFF88512A48
TAMOS INFO: nct_arThread entering base counter local: 0: FFFF810234A6FDEC global: 3000: FFFFFFFF88512A48
TAMOS INFO: kaznmod successfully inserted (30:(10:0) of 6) as a security framework and overlaying.
TAMOS INFO: Initialization complete
Wednesday, February 17, 2016
Collecting diagnostic information using the vm-support command in VMware ESX/ESXi
NAGARAJU AVALA
Wednesday, February 17, 2016
Purpose
VMware Technical Support routinely requests diagnostic information from you when handling a support request. This diagnostic information contains product-specific logs and configuration files from the host on which the product is running. This information is gathered using a specific script or tool within the product.
This article provides procedures for obtaining diagnostic information for a VMware ESXi/ESX host using the vm-support command line utility.
The diagnostic information obtained by using this article is uploaded to VMware Technical Support. To uniquely identify your information, use the Support Request (SR) number you receive when you create the new SR.
Resolution
The command-line vm-support utility is present on all versions of VMware ESXi/ESX, though some of the options available with the utility differ among versions.
Running vm-support in a console session on ESXi/ESX hosts
The traditional way of using the vm-support command-line utility produces a gzipped tarball (.tgz file) locally on the host. The resulting file can be copied off the host using FTP, SCP, or another method.
Open a console to the ESX or ESXi host.
Run the command:
vm-support
Note: Additional options can be specified to customize the log bundle collection. Use the vm-support -h command for a list of options available on a given version of ESXi/ESX.
A compressed bundle of logs is produced and stored in a file with a .tgz extension in one of these locations:
/var/tmp/
/var/log/
The current working directory
To export the log bundle to a shared vmfs datastore, use this command:
vm-support -f -w /vmfs/volumes/DATASTORE_NAME
Note: The -f option is not available in ESXi 5.x, ESXi/ESX 4.1 Update 3, and later.
After the log bundle is collected and downloaded to a client, upload the logs to the SFTP/FTP site.
Streaming vm-support output from an ESXi 5.x and 6.0 host
Starting with ESXi 5.0, the vm-support command-line utility supports streaming content to the standard output. This allows you to send the content over an SSH connection without saving anything locally on the ESXi host.
Enable SSH access to the ESXi shell.
Using a Linux or Posix client, such as the vSphere Management Assistant appliance, log in to the ESXi host and run the vm-support command with the streaming option enabled, specifying a new local file. A compressed bundle of logs is produced on the client at the specified location. For example:
ssh root@ESXHostnameOrIPAddress vm-support -s > vm-support-Hostname.tgz
Note: This requires you to enter a password for the root account, and cannot be used with lockdown mode.
You can also direct the support log bundle to a desired datastore location using the same command (mentioning the destination path). For example:
ssh root@ESXHostnameOrIPAddress 'vm-support -s > /vmfs/volumes/datastorexxx/vm-support-Hostname.tgz'
After the log bundle has been collected and downloaded to a client, upload the logs to the SFTP/FTP site.
HTTP-based download of vm-support output from an ESXi 5.x and 6.0 host
Starting with ESXi 5.0, the vm-support command-line utility can be invoked via HTTP. This allows you to download content using a web browser or a command line tool like wget or curl.
Using any HTTP client, download the resource from:
https://ESXHostnameOrIPAddress/cgi-bin/vm-support.cgi
For example, download the resource using the wget utility on a Linux or other Posix client, such as the vSphere Management Assistant appliance. A compressed bundle of logs is produced on the client at the specified location:
wget https://10.11.12.13/cgi-bin/vm-support.cgi
After the log bundle is collected and downloaded to a client, upload the logs to the SFTP/FTP site.
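Whichever collection method is used, it can be worth checking that the downloaded file really is a readable gzipped tarball before uploading it. The following is a small sketch of such a check; the helper name bundle_is_readable is made up, and the file name is only the placeholder used in the streaming example.

```shell
# bundle_is_readable FILE
# Succeeds if FILE can be listed as a gzip-compressed tar archive.
bundle_is_readable() {
    tar -tzf "$1" > /dev/null 2>&1
}

bundle=vm-support-Hostname.tgz   # placeholder name; use your own file
if bundle_is_readable "$bundle"; then
    echo "$bundle looks like a valid log bundle"
else
    echo "$bundle is missing, truncated or not a .tgz archive" >&2
fi
```

Listing the archive reads it from start to end, so a transfer that was cut off part-way is usually caught here and not after the upload.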
An ESXi 5.x host running on HP server fails with a purple diagnostic screen and the error: hpsa_update_scsi_devices or detect_controller_lockup_thread
NAGARAJU AVALA
Wednesday, February 17, 2016
Symptoms
Cannot run the host on Hewlett Packard (HP) hardware
Running the host on HP hardware fails with a purple diagnostic screen
You see the error:
hpsa_update_scsi_devices@<None>#<None>+0x39c
hpsa_scan_start@<None>#<None>+0x187
hpsa_kickoff_rescan@<None>#<None>+0x20f
kthread@com.vmware.driverAPI#9.2+0x185
LinuxStartFunc@com.vmware.driverAPI#9.2+0x97
vmkWorldFunc@vmkernel#nover+0x83
CpuSched_StartWorld@vmkernel#nover+0xfa
Your host fails with a purple diagnostic screen and you see the error:
Panic: 892: Saved backtrace: pcpu X TLB NMI
_raw_spin_failed@com.vmware.driverAPI#9.2+0x5
detect_controller_lockup_thread@#+0x3a9
kthread@com.vmware.driverAPI#9.2+0x185
LinuxStartFunc@com.vmware.driverAPI#9.2+0x97
vmkWorldFunc@vmkernel#nover+0x83
CpuSched_StartWorld@vmkernel#nover+0xfa
PCPU X locked up. Failed to ack TLB invalidate (total of 1 locked up, PCPU(s): X)
Before host becomes unresponsive, in the /var/log/vmkernel.log file, you see entries similar to:
WARNING: LinDMA: Linux_DMACheckConstraints:149: Cannot map machine address = 0xfffffffffff, length = 49160 for device 0000:03:00.0; reason = buffer straddles device dma boundary (0xffffffff)
WARNING: Heap: 4089: Heap_Align(vmklnx_hpsa, 32768/32768 bytes, 8 align) failed. caller: 0x41802dcb1f91
cpu4:1696102)<4>hpsa 0000:09:00.0: out of memory in adjust_hpsa_scsi_table
Before you see a purple diagnostic screen, in the /var/log/vmkernel.log file, you see entries similar to:
Note: These are multiple memory error messages from the hpsa driver.
out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
WARNING: Heap: 3622: Heap vmklnx_hpsa (39113576/39121768): Maximum allowed growth (8192) too small for size (20480)
cpu7:1727675)<4>hpsa 0000:06:00.0: out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
cpu2:1727677)<4>hpsa 0000:0c:00.0: out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
cpu4:1727676)<4>hpsa 0000:09:00.0: out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
cpu3:1727738)WARNING: LinDMA: dma_alloc_coherent:726: Out of memory
cpu3:1727738)<3>hpsa 0000:06:00.0: cmd_special_alloc returned NULL!
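To check a host for these entries before it fails, the log can be searched for the hpsa memory messages. The following is a rough sketch rather than an official diagnostic. The function names are hypothetical, and the patterns are derived only from the sample entries shown above.

```shell
#!/bin/sh
# Count log lines that report an hpsa driver out-of-memory condition, e.g.
#   out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562
#   hpsa 0000:09:00.0: out of memory in adjust_hpsa_scsi_table
count_hpsa_oom() {
    grep -c -E 'hpsa.*out of memory|out of memory.*hpsa' "$1"
}

# Print the heap warnings that mention the vmklnx_hpsa heap, if any.
show_hpsa_heap_warnings() {
    grep -E 'WARNING: Heap: .*vmklnx_hpsa' "$1"
}

# Example (not run here):
#   count_hpsa_oom /var/log/vmkernel.log
#   show_hpsa_heap_warnings /var/log/vmkernel.log
```

A count above zero only indicates that the log contains similar messages. It does not by itself confirm this issue.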
Resolution
This is a known issue affecting VMware ESXi 5.x.
To resolve this issue, apply the updated driver supplied by HP. Always check the HCL to determine the latest available driver update.
Note: For all BL685c G7 blades and DL360p Gen8 servers, HP recommends updating to the June 2014 version of ESXi 5.5 Update 1. For more information, see
The reasons for the recommendation are:
The smx-provider memory leak issue is resolved.
Several hpsa driver issues are resolved in the .60 version included in the June 2014 version of ESXi 5.5 Update 1. The previous hpsa driver version, .50, was problematic.
For the DL360p Gen8 servers, the iLO firmware needs to be checked. If the iLO firmware is not at 1.51, it is recommended to update the firmware on all servers to 1.51. This is a critical update that avoids NMI events, which would cause a PSOD in your environment.
It is also recommended to check the DL360p Gen8 servers to make sure that they are at least at Feb 2014 system ROM. This is to correct a possible IPMI issue.
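To see which hpsa driver a host currently has, the installed VIB list can be filtered. This is a small sketch, assuming the driver VIB name contains "hpsa" and that awk is available where the filter runs. The function name is hypothetical.

```shell
#!/bin/sh
# Read `esxcli software vib list` output on stdin and print the name and
# version columns of any VIB whose name contains "hpsa".
hpsa_vib_version() {
    awk 'tolower($1) ~ /hpsa/ { print $1, $2 }'
}

# On the ESXi host (not run here):
#   esxcli software vib list | hpsa_vib_version
```

The printed version string can then be compared with the driver version listed in the HCL for your server model.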
If this issue persists after the driver upgrade:
Open an HP support request and reference HP case 4648045806.
If the issue still persists, open a support request with VMware Support.
Provide VMware Support with your HP case number.
Tags
# VMware
ESXi 5.0 host experiences a purple diagnostic screen with the errors "Failed to ack TLB invalidate" or "no heartbeat" on HP servers with PCC support
NAGARAJU AVALA
Wednesday, February 17, 2016
0
An ESXi 5.0 host fails with a purple diagnostic screen
The purple diagnostic screen or core dump contains messages similar to:
PCPU 39 locked up. Failed to ack TLB invalidate (total of 1 locked up, PCPU(s): 39).
0x41228efc7b88:[0x41800646cd62]Panic@vmkernel#nover+0xa9 stack: 0x41228efe5000
0x41228efc7cb8:[0x4180064989af]TLBDoInvalidate@vmkernel#nover+0x45a stack: 0x41228efc7ce8
@BlueScreen: PCPU 0: no heartbeat, IPIs received (0/1)....
0x4122c27c7a68:[0x41800966cd62]Panic@vmkernel#nover+0xa9 stack: 0x4122c27c7a98
0x4122c27c7ad8:[0x4180098d80ec]Heartbeat_DetectCPULockups@vmkernel#nover+0x2d3 stack: 0x0
NMI: 1943: NMI IPI received. Was eip(base):ebp:cs [0x7eb2e(0x418009600000):0x4122c2307688:0x4010](Src 0x1, CPU140)
Heartbeat: 618: PCPU 140 didn't have a heartbeat for 8 seconds. *may* be locked up
Cause: Some HP servers experience a situation where the PCC (Processor Clocking Control or Collaborative Power Control) communication between the VMware ESXi kernel (VMkernel) and the server BIOS does not function correctly.
As a result, one or more PCPUs may remain in SMM (System Management Mode) for many seconds. When the VMkernel notices a PCPU is not available for an extended period of time, a purple diagnostic screen occurs.
Resolution
This issue is resolved in ESXi 5.0 Update 2, where PCC is disabled by default.
To work around this issue in versions earlier than ESXi 5.0 U2, disable PCC manually.
To disable PCC:
Connect to the ESXi host using the vSphere Client.
Click the Configuration tab.
In the Software menu, click Advanced Settings.
Select vmkernel.
Deselect the vmkernel.boot.usePCC option.
Restart the host for the change to take effect.
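The same change can probably be made from the ESXi Shell instead of the vSphere Client. The sketch below is hedged: it assumes the boot option is exposed to esxcli under the setting name usePCC, which is not confirmed by the steps above. The function name and the DRY_RUN variable are hypothetical. By default it only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Print or run the command that turns off the PCC boot option.
# ASSUMPTION: the kernel setting is named "usePCC" under esxcli. Verify with
#   esxcli system settings kernel list
# before relying on this.
disable_pcc() {
    cmd='esxcli system settings kernel set --setting=usePCC --value=FALSE'
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"    # default: only show what would be run
    else
        $cmd           # DRY_RUN=0: apply the change on the host
    fi
}

# Example (not run here):
#   disable_pcc              # prints the command
#   DRY_RUN=0 disable_pcc    # applies it; restart the host afterwards
```

As with the vSphere Client steps, the host still has to be restarted for the change to take effect.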
Tags
# VMware