This Blog is to share our knowledge and expertise on Linux System Administration and VMware Administration

Wednesday, June 15, 2016

How to enable the Name Service cache Daemon (NSCD)

By enabling the operating system's Name Service Cache Daemon (nscd), a significant performance improvement can be achieved when using naming services such as DNS, NIS, NIS+, or LDAP.

Benefit of name service cache daemon (NSCD) for ClearCase
Example:

Without NSCD:

[user@host]$ time cleartool co -nc "/var/tmp/file"
Checked out "/var/tmp/file" from version "/main/10".
real    0m3.355s
user    0m0.020s
sys     0m0.018s

With NSCD

[user@host]$ time cleartool co -nc "/var/tmp/file"
Checked out "/var/tmp/file" from version "/main/11".
real    0m0.556s
user    0m0.021s
sys     0m0.016s
Enabling NSCD

Solaris:

/etc/init.d/nscd start

Linux

service nscd start

AIX:

startsrc -s netcd

Note: In addition to starting nscd, make sure the service is configured to start automatically after a reboot. For instance, on Red Hat and SUSE you can run:

chkconfig nscd on
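
On newer systemd-based distributions, the init script and chkconfig commands above are replaced by systemctl. A minimal sketch, assuming the nscd package is installed:

systemctl start nscd
systemctl enable nscd

You can confirm the daemon is running and watch its cache statistics with:

nscd -g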

For more details on how to configure and/or enable NSCD, refer to your operating system vendor's manpage.

Useful TSM client commands for UNIX Admins

Wednesday, June 15, 2016
TSM (Tivoli Storage Manager) is a centralized, policy-based, enterprise-class data backup and recovery package from IBM Corporation. The software enables the user to insert objects not only via backup, but also through space management and archive tools. It also allows retrieval of the same data via similar restore, recall, and retrieve methods.

As Unix admins we get a lot of requests from application teams for TSM backup restores. I would like to discuss the 14 most useful TSM client commands.

Let's discuss them category-wise: Query, Backup & Restore.

Generally we use dsmc/dsm for the TSM client commands.

In this article we are going to discuss the following topics with practical examples.

  1) Querying the server

    A. Querying your scheduled backup slot

    B. Querying what files are included / excluded for backup

    C. Querying what partitions have been backed up

    D. Querying what files have been backed up

 2) Backing Up data

    A. Backing up your local filesystems

    B. Backing up selected files
 
 3) Restore Data

    A. Restore a file to its original directory

    B. Restore the most recent backup version of a file

    C. Display a list of active and inactive backup versions of files from which you can select versions to restore

    D. Restore a directory including its subdirectories

    E. Restore a file under a new name and directory

    F. Restore all files in a directory to a point in time

    G. Restore all files from a directory that end with .bak to another directory

    H. Restore files specified in the text file to a different location

1) Querying the server
A. Querying your scheduled backup slot

To query your scheduled backup slot enter dsmc q sched (which is short for query schedule). The output should look similar to that below:

#dsmc q sched

    Schedule Name: WEEKLY_UM
    Description: UM weekly incremental backup
   Schedule Style: Classic

         Action: Incremental
        Options:
        Objects:
         Priority: 5

   Next Execution: 135 Hours and 25 Minutes

         Duration: 20 Minutes
          Period: 1 Week

      Day of Week: Thursday
           Expire: Never

B. Querying what files are included / excluded for backup

"q inclexcl" to list output similar to the following:

#dsmc q inclexcl

*** FILE INCLUDE/EXCLUDE ***

Mode Function  Pattern (match from top down)  Source File
---- --------- ------------------------------ -----------------
Excl Filespace /var/run                       /opt/tivoli/tsm/client/ba/bin/incl.excl
Excl Filespace /tmp                           /opt/tivoli/tsm/client/ba/bin/incl.excl
Excl Directory /.../.opera/.../cache4         /opt/tivoli/tsm/client/ba/bin/incl.excl
Excl Directory /.../.mozilla/.../Cache        /opt/tivoli/tsm/client/ba/bin/incl.excl
Excl Directory /.../.netscape/.../cache       /opt/tivoli/tsm/client/ba/bin/incl.excl
Excl Directory /var/tmp                       /opt/tivoli/tsm/client/ba/bin/incl.excl
Excl All       /.../dsmsched.log              /opt/tivoli/tsm/client/ba/bin/incl.excl
Excl All       /.../core                      /opt/tivoli/tsm/client/ba/bin/incl.excl
Excl All       /.../a.out                     /opt/tivoli/tsm/client/ba/bin/incl.excl

C. Querying what partitions have been backed up
Use "dsmc q fi" to list which partitions have been backed up:

** Unix/Linux  **

#dsmc q fi
  #     Last Incr Date      Type    File Space Name
---     --------------      ----    ---------------
  1   02-05-2013 02:13:13   UFS     /       
  2   25-07-2012 12:26:09   UFS     /export/home
  3   02-05-2013 02:13:26   UFS     /home   
  4   16-01-2013 11:26:37   UFS     /scratch 
  5   02-05-2013 02:13:54   UFS     /usr/local
  6   12-02-2013 02:52:41   UFS     /var   

** Netware **
  #     Last Incr Date      Type       File Space Name
---     --------------      ----       ---------------
  1   02-05-2013 00:23:46   NTW:LONG   Oracle_data\usr:
  2   02-07-2013 00:22:42   NDS        Oracle_data\bin:
  3   02-07-2013 00:25:33   NTW:LONG   Oracle_data\apps:
  4   02-07-2013 00:25:11   NTW:LONG   Oracle_data\usr:
D. Querying what files have been backed up

To query files or directories that were backed up earlier, use "dsmc q ba".
The example below returns only the directory information.
#dsmc q ba /home/oraadmin
   Size      Backup Date                Mgmt Class           A/I File
   ----      -----------                ----------           --- ----
   1024  B  15-10-2013 02:52:09          STANDARD             A  /home/oraadmin

If you just add a trailing * (star) as a wildcard to the above query, TSM will only return those files and directories backed up immediately below the directory path given in the query:
#dsmc q ba /home/oraadm/*
   Size      Backup Date        Mgmt Class A/I File
   ----      -----------        ---------- --- ----
    512  12-09-2012 19:57:09    STANDARD    A  /home/oraadm/data1.dtf
  1,024  08-12-2012 02:46:53    STANDARD    A  /home/oraadm/data2.dtf
    512  12-09-2012 19:57:09    STANDARD    A  /home/oraadm/data3.dtf
    512  24-04-2002 00:22:56    STANDARD    A  /home/oraadm/data4.dtf

If you want to query all the current files and directories backed up under a directory and all its sub-directories you need to add the -subdir=yes option as below:
#dsmc q ba /home/oraadm/* -subdir=yes
   Size      Backup Date        Mgmt Class A/I File
   ----      -----------        ---------- --- ----
    512  12-09-2012 19:57:09    STANDARD    A  /home/oraadm/data1.dtf
  1,024  08-12-2012 02:46:53    STANDARD    A  /home/oraadm/data2.dtf
    512  12-09-2012 19:57:09    STANDARD    A  /home/oraadm/data3.dtf
    512  24-04-2002 00:22:56    STANDARD    A  /home/oraadm/data4.dtf
  1,024  12-09-2012 19:57:09    STANDARD    A  /home/oraadm/datasmart1/test
  1,024  12-09-2012 19:57:09    STANDARD    A  /home/oraadm/datasmart1/test/test2
 12,048  04-12-2012 02:01:29    STANDARD    A  /home/oraadm/datasmart2/tables
 50,326  30-04-2013 01:35:26    STANDARD    A  /home/oraadm/datasmart3/data_file1
 50,326  27-04-2013 00:28:15    STANDARD    A  /home/oraadm/datasmart3/data_file2
 11,013  24-04-2013 00:22:56    STANDARD    A  /home/oraadm/datasmart3/data_file3

2. Backing Up data
A. Backing up your local filesystems
The syntax for this is "dsmc backup-type filesystem", where backup-type is either incremental or selective.

Incremental backup: backs up only the data that has changed since the last backup, whether that backup was full or incremental.

Selective backup: a type of backup where only user-specified files and directories are backed up. A selective backup is commonly used for files that change frequently or in situations where the space available to store backups is limited. Also called a partial backup.

I would suggest you always go with incremental. The command is "dsmc incremental" or "dsmc incr", where "incr" is an abbreviation for incremental.

Perform an incremental backup of your client server.

#dsmc incr

Note that this will omit the filesystems mentioned in the exclude file.
To incrementally back up specific file-systems enter:

#dsmc incr /  /usr  /usr/local  /home

To back up an entire filesystem irrespective of whether files have changed since the last backup, use the selective command with a wildcard and -subdir=yes as below:

#dsmc sel /*  /usr/*   /home/*  -su=yes
B. Backing up selected files

The procedure for backing up selected files is similar to that for backing up filesystems. Be aware, however, that you cannot use wildcards in directory / folder names:

#dsmc incr /home/oradm/data*/* -su=yes
ANS1071E Invalid domain name entered: '/home/oradm/data*/*'

#dsmc sel /home/oradm/data*/* -su=yes

Selective Backup function invoked.
ANS1081E Invalid search file specification '/home/oradm/data*/*' entered

You can, however, enter several file specifications on the command line, as below:

#dsmc incr /home/surya/*  /usr/bin/* -su=yes
3) Restore Data

We use the "restore" command to restore files.
 A. Restore a file to its original directory

 Restore the /home/oraadm/data.txt  file to its original directory.

 #dsmc restore /home/oraadm/data.txt

 If you do not specify a destination, the files are restored to their original location.
B. Restore the most recent backup version of a file

Here is an example of restoring the /home/oraadm/data.txt file, even if the backup version is inactive.

#dsmc restore /home/oraadm/data.txt -latest

If the file you are restoring no longer resides on your client machine, and you have run an incremental backup since deleting the file, there is no active backup of the file on the server. In this case, use the latest option to restore the most recent backup version. Tivoli Storage Manager restores the latest backup version, whether it is active or inactive.

C. Display a list of active and inactive backup versions of files from which you can select versions to restore

#dsmc restore "/home/oraadmin/*"-pick -inactive
D. Restore a directory including its subdirectories

Restore the files in the /oradata1 directory and all of its sub-directories (-subdir=yes):
#dsmc restore /oradata1/ -subdir=yes

When restoring a specific path and file, Tivoli Storage Manager recursively restores all sub-directories under that path, and any instances of the specified file that exist under any of those sub-directories.

E. Restore a file under a new name and directory
To restore the /home/oraadm/data.txt file under a new name and directory:

#dsmc restore /home/oraadm/data.txt /tmp/data-renamed.txt
F. Restore all files in a directory to a point in time

Restore all files in the /usr/oradata/docs directory to their state as of 5:00 PM on October 16, 2013.

#dsmc restore -pitd=10/16/2013 -pitt=17:00:00 /usr/oradata/docs/

Use the pitdate option with the pittime option to establish a point in time for which you want to display or restore the latest version of your backups. Files that were backed up on or before the date and time you specified, and which were not deleted before the date and time you specified, are processed. Backup versions that you create after this date and time are ignored.

G. Restore all files from a directory that end with .bak to another directory
Restore all files from the /usr/oradata/docs/ directory that end with .bak to the /usr/oradata/projects/ directory.

# dsmc restore "/usr/oradata/docs/*.bak" /usr/oradata/projects/

If the destination is a directory, specify the delimiter (/) as the last character of the destination. If you omit the delimiter and your specified source is a directory or a file spec with a wildcard, you will receive an error. If the projects directory does not exist, it is created.
H. Restore files specified in the text file to a different location

Restore files specified in the restorelist.txt file to a different location.

# dsmc restore -filelist=/tmp/restorelist.txt /usr/ora_backup/

The files (entries) listed in the filelist must adhere to the following rules:

    Each entry must be a fully or partially qualified path to a file or directory or a relative path.
    Each entry must be on a new line.
    Do not use wildcard characters.
    Each entry results in the processing of only one object (file or directory).
    If the file name contains any spaces, enclose the file name with quotes.
    The filelist can be an MBCS file or a Unicode file with all Unicode entries.
    Tivoli Storage Manager ignores any entry that is not valid.
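
As an illustration of these rules, a hypothetical /tmp/restorelist.txt might contain entries like the following (one fully qualified object per line, no wildcards, quotes around any name containing spaces):

/home/oraadm/data1.dtf
/home/oraadm/datasmart3/data_file1
"/home/oraadm/reports/month end.txt"

Running the restore command shown above would then restore each of these objects under /usr/ora_backup/.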

LVM export and import: How to move a VG to another Machine or Group

Wednesday, June 15, 2016
 It is quite easy to move a whole volume group to another system if, for example, a user department acquires a new server. To do this we use the vgexport and vgimport commands.

    Unmount the file system
    Mark the volume group inactive
    Export the volume group
    Import the volume group
    Activate the volume group
    Mount the file system
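
The detailed walkthrough below uses a volume group named vg-nagavg with a logical volume lvm-naga mounted on /lvm-naga. As a quick reference, the whole sequence condenses to the following sketch (same names as the walkthrough; substitute your own):

# umount /lvm-naga
# vgchange -an vg-nagavg
# vgexport vg-nagavg
(shut down, move the disks to the new machine, boot it, then:)
# pvscan
# vgimport vg-nagavg
# vgchange -ay vg-nagavg
# mount /dev/vg-nagavg/lvm-naga /LVM-import
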
Exporting Volume Group

1. Unmount the file system

First make sure no users are accessing files on the active volume, then unmount it:

# df -h

Filesystem                         Size  Used Avail Use% Mounted on
/dev/sda1                           25G  4.9G   19G  21% /
tmpfs                              593M     0  593M   0% /dev/shm
/dev/mapper/vg--nagavg-lvm--naga   664M  542M   90M  86% /lvm-naga

# umount /lvm-naga

2. Mark the Volume Group inactive

Marking the volume group inactive removes it from the kernel and prevents any further activity on it.

# vgchange -an vg-nagavg

 0 logical volume(s) in volume group "vg-nagavg" now active

3. Export the VG

It is now necessary to export the volume group; this prevents it from being accessed on the "old" host system and prepares it to be removed.

# vgexport vg-nagavg

  Volume group "vg-nagavg" successfully exported

When the machine is next shut down, the disk can be unplugged and then connected to its new machine.
Importing the Volume Group (VG)

When plugged into the new system, the disk may appear as /dev/sdb or another device name depending on the system, so an initial pvscan shows:

1. # pvscan

PV /dev/sda3   is in exported VG vg-nagavg [580.00 MB / 0 free]
PV /dev/sda4   is in exported VG vg-nagavg [484.00 MB / 312.00 MB free]
PV /dev/sda5   is in exported VG vg-nagavg [288.00 MB / 288.00 MB free]
 Total: 3 [1.32 GB] / in use: 3 [1.32 GB] / in no VG: 0 [0]

2. We can now import the Volume Group and mount the file system.

If you are importing on an LVM2 system, run:

# vgimport vg-nagavg

Volume group "vg-nagavg" successfully imported
If you are importing on an LVM1 system, add the PVs that need to be imported:

# vgimport vg-nagavg /dev/sda3 /dev/sda4 /dev/sda5

3. Activate the Volume Group

You must activate the volume group before you can access it

# vgchange -ay vg-nagavg

1 logical volume(s) in volume group "vg-nagavg" now active

Now mount the file system:

# mount /dev/vg-nagavg/lvm-naga /LVM-import/

# mount

/dev/sda1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/mapper/vg--nagavg-lvm--naga on /LVM-import type ext3 (rw)

[root@localhost ~]# df -h

Filesystem                         Size  Used Avail Use% Mounted on
/dev/sda1                           25G  4.9G   19G  21% /
tmpfs                              593M     0  593M   0% /dev/shm
/dev/mapper/vg--nagavg-lvm--naga   664M  542M   90M  86% /LVM-import

Using vgscan

# pvs

  PV         VG         Fmt   Attr  PSize    PFree
  /dev/sda3  vg-nagavg  lvm2  ax-   580.00M  0
  /dev/sda4  vg-nagavg  lvm2  ax-   484.00M  312.00M
  /dev/sda5  vg-nagavg  lvm2  ax-   288.00M  288.00M

pvs shows which disks are attached to the VG.

# vgscan

Reading all physical volumes.  This may take a while...
Found exported volume group "vg-nagavg" using metadata type lvm2

# vgimport vg-nagavg
Volume group "vg-nagavg" successfully imported

# vgchange -ay vg-nagavg

1 logical volume(s) in volume group "vg-nagavg" now active

# mkdir /LVM-vgscan
# mount /dev/vg-nagavg/lvm-naga /LVM-vgscan

# df -h

Filesystem                         Size  Used Avail Use% Mounted on
/dev/sda1                           25G  4.9G   19G  21% /
tmpfs                              593M     0  593M   0% /dev/shm
/dev/mapper/vg--nagavg-lvm--naga   664M  542M   90M  86% /LVM-vgscan


# mount

/dev/sda1 on / type ext3 (rw)

proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/mapper/vg--nagavg-lvm--naga on /LVM-vgscan type ext3 (rw)


vgscan is used when we have not exported the VG; i.e., first unmount the logical volume, move the disk and attach it to the other system, and then run # vgscan.

It will detect the volume group on the disk so that it can be imported and mounted on the new system.

Tuesday, June 14, 2016

NIC bonding in Vmware ESXi and ESX


To utilize NIC teaming, two or more network adapters must be uplinked to a virtual switch. The main advantages of NIC teaming are:
  • Increased network capacity for the virtual switch hosting the team.
  • Passive failover in the event one of the adapters in the team goes down.
Observe these guidelines to choose the correct NIC Teaming policy:
  • Route based on the originating port ID: Choose an uplink based on the virtual port where the traffic entered the virtual switch.
  • Route based on an IP hash: Choose an uplink based on a hash of the source and destination IP addresses of each packet. For non-IP packets, whatever is at those offsets is used to compute the hash.
  • Route based on a source MAC hash: Choose an uplink based on a hash of the source Ethernet MAC address.
  • Use explicit failover order: Always use the highest order uplink from the list of Active adapters which passes failover detection criteria.
  • Route based on physical NIC load (Only available on Distributed Switch): Choose an uplink based on the current loads of physical NICs.
Before you begin :
  • The default load balancing policy is Route based on the originating virtual port ID. If the physical switch is using link aggregation, Route based on IP hash load balancing must be used.
  • LACP support has been introduced in vSphere 5.1 on distributed vSwitches and requires additional configuration. 
  • Ensure VLAN and link aggregation protocol (if any) are configured correctly on the physical switch ports.

To configure NIC teaming for standard vSwitch using the vSphere / VMware Infrastructure Client:
  1. Highlight the host and click the Configuration tab.
  2. Click the Networking link.
  3. Click Properties.
  4. Under the Network Adapters tab, click Add.
  5. Select the appropriate (unclaimed) network adapter(s) and click Next.
  6. Ensure that the selected adapter(s) are under Active Adapters.
  7. Click Next > Finish.
  8. Under the Ports tab, highlight the name of the port group and click Edit.
  9. Click the NIC Teaming tab.
  10. Select the correct Teaming policy under the Load Balancing field.
  11. Click OK.
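
Alternatively, the same policy can be set from the ESXi Shell with esxcli. This is a minimal sketch assuming a standard vSwitch named vSwitch0; valid load-balancing values are portid, iphash, mac and explicit:

# esxcli network vswitch standard policy failover get --vswitch-name=vSwitch0
# esxcli network vswitch standard policy failover set --vswitch-name=vSwitch0 --load-balancing=iphash

As noted above, choose iphash only when the physical switch ports are configured for link aggregation.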
To configure NIC teaming for standard vSwitch using the vSphere Web Client:
  1. Under vCenter Home, click Hosts and Clusters.
  2. Click on the host.
  3. Click Manage > Networking > Virtual Switches.
  4. Click on the vSwitch.
  5. Click Manage the physical network adapters.
  6. Select the appropriate (unclaimed) network adapter(s) and use the arrow to move the adapter(s) to Active Adapters.
  7. Click Edit settings.
  8. Select the correct Teaming policy under the Load Balancing field.
  9. Click OK.
To configure NIC teaming for Distributed portgroup for VMware vSphere Distributed Switch (VDS) using the vSphere/VMware Infrastructure Client:
  1. From Inventory, go to Networking.
  2. Click on the Distributed switch.
  3. Click the Configuration tab.
  4. Click Manage Hosts. A window pops up.
  5. Click the host.
  6. From the Select Physical Adapters option, select the correct vmnics.
  7. Click Next for the rest of the options.
  8. Click Finish.
  9. Expand the Distributed switch.
  10. Right-click the Distributed Port Group.
  11. Click Edit Settings.
  12. Click Teaming and Failover.
  13. Select the correct Teaming policy under the Load Balancing field.
  14. Click OK.
To configure NIC teaming for Distributed portgroup for VDS using the vSphere Web Client:
  1. Under vCenter Home, click Networking.
  2. Click on the Distributed switch.
  3. Click the Getting Started tab.
  4. Under Basic tasks, click Add and manage hosts. A window pops up.
  5. Click Manage host networking.
  6. Click Next > Attached hosts.
  7. Select the host(s).
  8. Click Next.
  9. Select Manage physical adapters and deselect the rest.
  10. Click Next
  11. Select the correct vmnics.
  12. Click Assign Uplink > Next.
  13. Click Next for the rest of the options.
  14. Click Finish.
  15. Expand the Distributed Switch.
  16. Click the Distributed Port Group > Manage > Settings.
  17. Under Properties, click Edit.
  18. Click Teaming and failover.
  19. Select the correct Teaming policy under the Load Balancing field.
  20. Click OK.

Explain about NIC Teaming in vmware

Tuesday, June 14, 2016

NIC Teaming

Let’s take a well-deserved break from networking math for a moment and shift into the fun world of NIC teaming. The concept of teaming goes by many different names: bonding, grouping, and trunking to name a few. Really, it just means that we’re taking multiple physical NICs on a given ESXi host and combining them into a single logical link that provides bandwidth aggregation and redundancy to a vSwitch. You might think that this sounds a little bit like port channels from earlier in the book. And you’re partially right—the goal is very similar, but the methods are vastly different.



  Let’s go over all of the configuration options for NIC teaming within a vSwitch. These options are a bit more relevant when your vSwitch is using multiple uplinks but are still valid configuration points no matter the quantity of uplinks.

Load Balancing

The first point of interest is the load-balancing policy. This is basically how we tell the vSwitch to handle outbound traffic, and there are four choices on a standard vSwitch:
  1. Route based on the originating virtual port
  2. Route based on IP hash
  3. Route based on source MAC hash
  4. Use explicit failover order
Keep in mind that we’re not concerned with the inbound traffic because that’s not within our control. Traffic arrives on whatever uplink the upstream switch decided to put it on, and the vSwitch is only responsible for making sure it reaches its destination.

The first option, route based on the originating virtual port, is the default selection for a new vSwitch. Every VM and VMkernel port on a vSwitch is connected to a virtual port. When the vSwitch receives traffic from either of these objects, it assigns the virtual port an uplink and uses it for traffic. 

The chosen uplink will typically not change unless there is an uplink failure, the VM changes power state, or the VM is migrated around via vMotion.

The second option, route based on IP hash, is used in conjunction with a link aggregation group (LAG), also called an EtherChannel or port channel. When traffic enters the vSwitch, the load-balancing policy will create a hash value of the source and destination IP addresses in the packet. The resulting hash value dictates which uplink will be used.

The third option, route based on source MAC hash, is similar to the IP hash idea, except the policy examines only the source MAC address in the Ethernet frame. To be honest, we have rarely seen this policy used in a production environment, but it can be handy for a nested hypervisor VM to help balance its nested VM traffic over multiple uplinks.


The fourth and final option, use explicit failover order, really doesn’t do any sort of load balancing. Instead, the first Active NIC on the list is used. If that one fails, the next Active NIC on the list is used, and so on, until you reach the Standby NICs. Keep in mind that if you select the Explicit Failover option and you have a vSwitch with many uplinks, only one of them will be actively used at any given time. 

Use this policy only in circumstances where using only one link rather than load balancing over all links is desired or required.

Network Failure Detection

When a network link fails (and they definitely do), the vSwitch is aware of the failure because the link status reports the link as being down. This can usually be verified by seeing if anyone tripped over the cable or mistakenly unplugged the wrong one. 

In most cases, this is good enough to satisfy your needs and the default configuration of “link status only” for the network failure detection is good enough.
But what if you want to determine a failure further up the network, such as a failure beyond your upstream connected switch? This is where beacon probing might be able to help you out. Beacon probing is actually a great term because it does roughly what it sounds like it should do. 

A beacon is regularly sent out from the vSwitch through its uplinks to see if the other uplinks can “hear” it.


The image below shows an example of a vSwitch with three uplinks. When Uplink1 sends out a beacon that Uplink2 receives but Uplink3 does not, it is because upstream aggregation switch 2 is down, and the traffic is therefore unable to reach Uplink3.
 
An example where beacon probing finds upstream switch failures
Are you curious why we use an example with three uplinks? Imagine you only had two uplinks and sent out a beacon that the other uplink did not hear. Does the sending uplink have a failure, or does the receiving uplink have a failure? It’s impossible to know who is at fault. Therefore, you need at least three uplinks in order for beacon probing to work.


Notify Switches

The Notify Switches configuration is a bit mystifying at first. Notify the switches about what, exactly? By default, it’s set to “Yes,” and as we cover here, that’s almost always a good thing.

Remember that all of your upstream physical switches have a MAC address table that they use to map ports to MAC addresses. This avoids the need to flood their ports—which means sending frames to all ports except the port they arrived on (which is the required action when a frame’s destination MAC address doesn’t appear in the switch’s MAC address table).

But what happens when one of your uplinks in a vSwitch fails and all of the VMs begin using a new uplink? The upstream physical switch would have no idea which port the VM is now using and would have to resort to flooding the ports or wait for the VM to send some traffic so it can re-learn the new port. 

Instead, the Notify Switches option speeds things along by sending Reverse Address Resolution Protocol (RARP) frames to the upstream physical switch on behalf of the VM or VMs so that upstream switch updates its MAC address table. This is all done before frames start arriving from the newly vMotioned VM, the newly powered-on VM, or from the VMs that are behind the uplink port that failed and was replaced.

These RARP announcements are just a fancy way of saying that the ESXi host will send out a special update letting the upstream physical switch know that the MAC address is now on a new uplink so that the switch will update its MAC address table before actually needing to send frames to that MAC address. It’s sort of like ESXi is shouting to the upstream physical switch and saying, “Hey! This VM is over here now!”

Failback

Since we’re already on the topic of an uplink failure, let’s talk about Failback. If you have a Standby NIC in your NIC Team, it will become Active if there are no more Active NICs in the team. Basically, it will provide some hardware redundancy while you go figure out what went wrong with the failed NIC.

 When you fix the problem with the failed Active NIC, the Failback setting determines if the previously failed Active NIC should now be returned to Active duty.

If you set this value to Yes, the now-operational NIC will immediately go back to being Active again, and the Standby NIC returns to being Standby. Things are returned back to the way they were before the failure.

If you choose the No value, the replaced NIC will simply remain inactive until either another NIC fails or you return it to Active status.

Failover Order

The final section in a NIC team configuration is the failover order. It consists of three different adapter states:
  • Active adapters: Adapters that are Actively used to pass along traffic.
  • Standby adapters: These adapters will only become Active if the defined Active adapters have failed.
  • Unused adapters: Adapters that will never be used by the vSwitch, even if all the Active and Standby adapters have failed.
While the Standby and Unused statuses do have value for some specific configurations, such as with balancing vMotion and management traffic on a specific pair of uplinks, it’s common to just set all the adapters to Active and let the load-balancing policy do the rest.
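
To confirm how a standard vSwitch is actually configured for the settings discussed above (load balancing, failure detection, notify switches, failback and the failover order), you can query it from the ESXi Shell. A minimal sketch, assuming a vSwitch named vSwitch0 with uplinks vmnic0, vmnic1 and vmnic2:

# esxcli network vswitch standard policy failover get --vswitch-name=vSwitch0
# esxcli network vswitch standard policy failover set --vswitch-name=vSwitch0 --active-uplinks=vmnic0,vmnic1 --standby-uplinks=vmnic2

The second command is one way to express an Active/Standby failover order from the command line rather than the UI.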

VMFS volume on the VMware ESX/ESXi host is locked due to an I/O error.

Tuesday, June 14, 2016
If naa.60060160b3c018009bd1e02f725fdd11:1 represents one of the partitions used in a VMFS volume, you see this message when the VMFS volume is inaccessible:
volume on device naa.60060160b3c018009bd1e02f725fdd11:1 locked, possibly because remote host 10.17.211.73 encountered an error during a volume operation and couldn’t recover.

If this issue occurs, the VMFS volume (and the virtual machines residing on the affected volume) are unavailable to the ESX/ESXi host.
In the /var/log/vmkernel.log file, you may see a similar message indicating the same issue:
WARNING: LVM: 13127: The volume on the device naa.6000eb3b3638efa50000000000000258:1 locked, possibly because some remote host encountered an error during a volume operation and could not recover.
LVM: 11786: Failed to open device naa.6000eb3b3638efa50000000000000258:1 : Lock was not free

To resolve this issue, remove the lock on the indicated volume.
  1. Log in to the ESX/ESXi console.
    • For information on how to log in to ESXi 4.1 and 5.x hosts
    • For information on how to log in to ESXi 4.0, see 
  2. Log in to the terminal of the VMware ESX or ESXi host and run these commands:

    To break the lock:
    1. Break the existing LVM lock on the datastore by running this command:
      # vmkfstools -B vmfs_device

      Note: You can also use the parameter --breaklock instead of -B with the vmkfstools command.

      From the preceding error message, this command is used:

      # vmkfstools -B /vmfs/devices/disks/naa.60060160b3c018009bd1e02f725fdd11:1

      You see output similar to:

      VMware ESX Question:

      LVM lock on device /vmfs/devices/disks/naa.60060160b3c018009bd1e02f725fdd11:1 will be forcibly broken. Please consult vmkfstools or ESX documentation to understand the consequences of this.

      Please ensure that multiple servers aren't accessing this device.

      Continue to break lock?
      0) Yes
      1) No

      Please choose a number [0-1]:
    2. Enter 0 to break the lock.
    3. Re-read and reload VMFS datastore metadata to memory by running this command:

      # vmkfstools -V
    4. From the vSphere UI, refresh the Storage Datastores View under Configuration tab.
Note: This issue can also be resolved by restarting all the hosts in the cluster.
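
Once the lock has been broken and the metadata re-read, you can verify from the ESXi Shell that the datastore is visible and mounted again. A minimal sketch for ESXi 5.x and later (on ESX/ESXi 4.x check the datastore view in the vSphere Client instead):

# esxcli storage filesystem list | grep -i vmfs

The affected volume should be listed as mounted; if not, rescan the storage adapters and refresh the view before retrying.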

vSphere 6.0 important log files and their locations

Tuesday, June 14, 2016
vSphere 6.0 has made some significant changes to the logging locations for its contained vCenter and PSC services. Everything has been condensed into a common area of the directory structure and labeled with the service name. In short, it makes a LOT more sense now than it did before.

This is an overview of what the structure looks like now and where to find what you need.

Windows Log Locations

C:\ProgramData\VMware\vCenterServer\logs


vCenter Appliance Log Locations

/var/log/vmware


vCenter Service

vmware-vpx\vpxd.log
Use this to troubleshoot issues relating directly to the operation of vCenter. Everything from DB connectivity problems to vCenter crashes is logged here. This log will have a LOT of information in it and is a good place to start on many issues.
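
For day-to-day troubleshooting it is usually enough to tail this log while reproducing the problem. A minimal sketch on the appliance, combining the log root with the relative path listed above (adjust the sub-directory if your build lays the logs out differently):

# tail -f /var/log/vmware/vmware-vpx/vpxd.log
# grep -i error /var/log/vmware/vmware-vpx/vpxd.log | tail -20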
 

Inventory Service

invsvc\inv-svc.log
Formerly the ds.log in 5.x. The format and location have changed.
invsvc\wrapper.log
Used to troubleshoot why the inventory service will not start.


Single Sign on

sso\vmware-sts-idmd.log
This is a good log to use as a “one-stop-shop” for SSO authentication issues. Authentication requests/failures as well as problems with an identity source will post here. 

vmafd\vdcpromo.log
Contains installation errors during configuration of vmdir. Especially useful for errors when adding another PSC to the same SSO domain.

vmdird\vmdird-syslog.log
Has information concerning the SSO LDAP instance named vmdir. Problems with ldap operations and replication within SSO can be found here.


vPostgres Service

vpostgres\postgresql-##.log
 
Operational information about the local vPostgres instance. This is just a renamed version on pg_log in normal Postgresql.


vSphere Web Client

vsphere-client\logs\vsphere_client_virgo.log
 
An excellent source of information when troubleshooting errors within the Web Client. If you receive errors from simply clicking on objects, begin chasing them down here!

vsphere-client\wrapper.log
 
Entries in here can help determine why your vSphere Web Client service won't start, or if it suddenly crashes. This log will not have as much on issues encountered while inside the Web Client.
 

VMware System and Hardware Health Manager

vws\wrapper.log
This service is used to poll ESXi hosts for IPMI information for the Hardware Status tab. Entries in here can determine why the service won’t start, is malfunctioning, or if it suddenly crashes.


Performance Charts

perfcharts\stats.log
 
Has information on the Performance Charts section of the vCenter. If the charts fail to load, look here first.