Kmaiti

Sunday, 12 May 2013

NDMP communication failure error

Posted on 10:15 by Unknown
Guys,

Issue :

The NetBackup server sends an "NDMP communication failure" alert once every day, but scheduled backup jobs run without any issue.

Environment : The NetBackup server runs RHEL 6.2 and also works as the media server; NetBackup 7.1 is installed on it. The NDMP host is a NetApp filer 2240. Backups are written to an HP MSL 4049 tape library.

Troubleshooting steps followed :

I receive one NDMP communication error alert every day, and I don't know why. I tried to troubleshoot it but had no luck. Symantec has no answer beyond what is in their KB. They said the password authentication might be wrong, but it is correct for our setup; this has been verified multiple times. I also enabled all logging, and network communication is fine. These are the errors I could see in the bptm log file :

avrd[15164]: ndmp_public_session_create() failed: Unexpected NDMP message - not a connected message
tldd[24890]: TLD(0) ndmp_public_session_create_wCred failed with error code -1005


I don't see any suspicious messages on the filer.

Solution : I still haven't found one. Please share if anyone knows.

FYI: I observed that the tape drives periodically go offline, so I had to bring them up again. I have deployed a workaround script that checks whether a drive is down and brings it up if it is.

Check on the NetBackup server : tpconfig -l    or vmoprcmd -d
Up it :  vmoprcmd -up drive_index
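
A minimal Python sketch of such a workaround script follows. It is not the script deployed above. It assumes that vmoprcmd -d prints one row per drive, with the drive index in the first column and a control field that starts with DOWN (for example DOWN-TLD) in the third column when the drive is down. Please check that against the output on your own server before relying on it. The function names and the sample text are made up for illustration.

```python
import subprocess


def find_down_drives(status_text):
    """Return the indexes of drives whose control column starts with DOWN.

    Assumption: each drive row looks like '<index> <type> <control> ...'.
    """
    down = []
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0].isdigit() and fields[2].startswith("DOWN"):
            down.append(int(fields[0]))
    return down


def bring_up_down_drives(vmoprcmd="vmoprcmd"):
    """Run 'vmoprcmd -d', then 'vmoprcmd -up <index>' for every down drive."""
    result = subprocess.run([vmoprcmd, "-d"], capture_output=True, text=True, check=True)
    for index in find_down_drives(result.stdout):
        subprocess.run([vmoprcmd, "-up", str(index)], check=True)


# Hypothetical sample in the assumed layout, not real output from this setup.
SAMPLE = """\
Drv Type   Control  User      Label  RecMID  ExtMID  Ready   Wr.Enbl.  ReqId
  0 hcart    TLD                -                     No       -         0
  1 hcart    DOWN-TLD           -                     No       -         0
"""

print(find_down_drives(SAMPLE))
```

Calling bring_up_down_drives() from cron every few minutes on the NetBackup server would be one way to run it; the parsing is kept in its own function so it can be tried on saved output first.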



Sunday, 28 April 2013

CISCO UCS makes it easy to manage an IT setup and optimizes resource usage

Posted on 12:47 by Unknown
CISCO launched UCS (Unified Computing System) a few years back, and it makes an IT environment really easy to manage. A CISCO UCS 5k chassis (containing 8 blades), fabric interconnects (6k) and fabric extenders (2k), i.e. FEX, make up a UCS environment in which the FCoE protocol is used. In a typical setup, two FEX are connected to one chassis. Each FEX has 8 Ethernet and 2 FC ports. Each blade gets two downlink connections, one from each FEX. You don't need to do any cabling manually for this; you only need to insert the FEX module, which is also called an IO module.

Each FEX then has an FCoE uplink to a CISCO fabric interconnect switch (Fabric A and Fabric B, for redundancy). Fabric A and Fabric B are also connected to each other so that failover works. That is the UCS setup. The number of cables is dramatically reduced, and so is power consumption.

To send traffic further upstream, a Nexus 5k switch can be used; it supports FCoE and other features too. In a DC we generally prefer the Nexus 5k. All zoning, VLAN and trunking configuration is done on this switch. There is redundancy here as well, i.e. NX A and NX B. Your storage systems are connected to the Nexus switch, and so is the backup system. To send traffic out from the Nexus there are many other switches to choose from. In our environment we use a Cat 6K. You can use a Nexus 7k too, and I have heard that MDS switches are also used. This is basically the core switch, which interfaces with the ISP routers.

To manage UCS there is UCS Manager (UCSM), which is accessible through a GUI. You can also manage everything from the backend: log in to the primary fabric interconnect switch and execute commands. Everything from server installation to patching and upgrading (firmware and software) can be done there. UCSM has nice features.

If VMware ESX is set up on the blades and they form an ESX cluster with FT enabled, then your IT environment stays up even when a blade fails, which gets you close to 100% uptime. I'll explain and provide more unusual details from time to time.

So, stay tuned....take care :)

Wednesday, 10 April 2013

multipath details on RHEL 6

Posted on 12:58 by Unknown
Guys,

Here is a sample configuration file (/etc/multipath.conf).

Environment : RHEL 6
Default file : /etc/multipath.conf [the comments should be removed]

-------------
#multipath.conf
#NetApp recommended settings


defaults
{
        user_friendly_names yes
        max_fds max
        queue_without_daemon no
        bindings_file "/var/lib/multipath/bindings"
        # attribute and value are separated by whitespace, not "="
        uid 500
        gid 500
}
blacklist
{
        # DevId is a placeholder: put the WWID of the device to skip here
        wwid DevId
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z]"
        devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
}
devices
{
        device
        {
                vendor "NETAPP"
                product "LUN"
                # RHEL 6 form; the RHEL 5 form was
                # "/sbin/scsi_id -g -u -s /block/%n"
                getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
                # RHEL 6 uses prio; the RHEL 5 form was
                # prio_callout "/sbin/mpath_prio_ontap /dev/%n"
                prio "ontap"
                features "1 queue_if_no_path"
                hardware_handler "0"
                path_grouping_policy group_by_prio
                failback immediate
                rr_weight uniform
                rr_min_io 128
                path_checker directio
                flush_on_last_del yes
        }
}

--------------


Example of paths :
$ multipath -ll
mini_p (360a98000572d45394b34715579354446) dm-23 NETAPP,LUN
[size=1.0T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=8][active]
 \_ 1:0:0:0  sda        8:0     [active][ready]
 \_ 2:0:1:0  sdca       68:224  [active][ready]
\_ round-robin 0 [prio=2][enabled]
 \_ 2:0:2:0  sdct       70:16   [active][ready]
 \_ 1:0:1:0  sdq        65:0    [active][ready]


    Explanations:


mini_p (360a98000572d45394b34715579354446) dm-23 NETAPP,LUN
------  ---------------------------------  ---- --- ---------------
   |               |                         |    |          |-------> Product
   |               |                         |    |------------------> Vendor
   |               |                         |-----------------------> sysfs name
   |               |-------------------------------------------------> WWID of the device
   |-----------------------------------------------------------------> User defined Alias

[size=1.0T][features=1 queue_if_no_path][hwhandler=0][rw]
 ---------  ---------------------------  ----------------
     |                 |                        |--------------------> Hardware Handler
     |                 |---------------------------------------------> Features supported
     |---------------------------------------------------------------> Size of the DM


Path Group 1:
\_ round-robin 0 [prio=8][active]
-- -------------  ------  ------
 |    |              |      |----------------------------------------> Path group state
 |    |              |-----------------------------------------------> Path group priority
 |    |--------------------------------------------------------------> Path selector
 |-------------------------------------------------------------------> Path group level

First path on Path Group 1:
  \_ 1:0:0:0  sda        8:0     [active][ready]
    -------- --- ----   ------  -----
      |      |     |        |      |---------------------------------> Physical Path state
      |      |     |        |----------------------------------------> DM Path state
      |      |     |-------------------------------------------------> Major, minor numbers
      |      |-------------------------------------------------------> Linux device name
      |--------------------------------------------------------------> host,chan,scsiid,lun

Second path on Path Group 1:
  \_ 2:0:1:0  sdca       68:224  [active][ready]

Path Group 2:
\_ round-robin 0 [prio=2][enabled]
 \_ 2:0:2:0  sdct       70:16   [active][ready]
 \_ 1:0:1:0  sdq        65:0    [active][ready]
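
The field breakdown of a path line can also be written as a small parser. This is only a sketch in Python for the output layout shown above (newer multipath versions print a different layout); the names PathLine and parse_path_line are made up for illustration.

```python
import re
from typing import NamedTuple


class PathLine(NamedTuple):
    hctl: str         # host:channel:scsi_id:lun
    device: str       # Linux device name, e.g. sda
    major_minor: str  # major:minor numbers
    dm_state: str     # DM path state
    phys_state: str   # physical path state


# "\_", then H:C:I:L, device name, major:minor and the two bracketed states.
PATH_RE = re.compile(
    r"\\_\s+(\d+:\d+:\d+:\d+)\s+(\S+)\s+(\d+:\d+)\s+\[(\w+)\]\[(\w+)\]"
)


def parse_path_line(line):
    """Split one path line of 'multipath -ll' into its five fields."""
    match = PATH_RE.search(line)
    if match is None:
        raise ValueError("not a path line: %r" % line)
    return PathLine(*match.groups())


print(parse_path_line(r" \_ 2:0:1:0  sdca       68:224  [active][ready]"))
```

A path group line such as "\_ round-robin 0 [prio=8][active]" does not match the pattern, so it raises ValueError instead of being misread as a path.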


-----------------

polling_interval : Specifies the interval between two path checks in seconds.
udev_dir : The directory where udev device nodes are created. The default value is /dev.
multipath_dir : The directory where the dynamic shared objects are stored. The default value is system dependent, commonly /lib/multipath.

path_selector     : Specifies the default algorithm to use in determining what path to use for the next I/O operation.

Possible values include:

    round-robin 0: Loop through every path in the path group, sending the same amount of I/O to each.
    queue-length 0: Send the next bunch of I/O down the path with the least number of outstanding I/O requests.
    service-time 0: Send the next bunch of I/O down the path with the shortest estimated service time, which is determined by dividing the total size of the outstanding I/O to each path by its relative throughput.

The default value is round-robin 0.
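
To make the difference between the three selectors concrete, here is a rough Python model of how each one would choose the next path. It only illustrates the descriptions above and is not the kernel implementation; the Path class, its fields and the sample numbers are assumptions.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Path:
    name: str
    outstanding_requests: int   # I/O requests currently in flight
    outstanding_bytes: int      # total size of the in-flight I/O
    relative_throughput: float  # relative weight, higher means faster


def round_robin(paths):
    """Iterator that loops through every path in turn."""
    return cycle(paths)


def pick_queue_length(paths):
    """Path with the fewest outstanding I/O requests."""
    return min(paths, key=lambda p: p.outstanding_requests)


def pick_service_time(paths):
    """Path with the smallest outstanding size divided by relative throughput."""
    return min(paths, key=lambda p: p.outstanding_bytes / p.relative_throughput)


paths = [
    Path("sda", outstanding_requests=4, outstanding_bytes=4096, relative_throughput=1.0),
    Path("sdq", outstanding_requests=2, outstanding_bytes=65536, relative_throughput=1.0),
    Path("sdca", outstanding_requests=6, outstanding_bytes=8192, relative_throughput=4.0),
]

print(pick_queue_length(paths).name)
print(pick_service_time(paths).name)
```

With these made-up numbers the two load-aware selectors disagree: sdq has the shortest queue, but its queued I/O is large, so the service-time estimate prefers sdca.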

path_grouping_policy     : Specifies the default path grouping policy to apply to unspecified multipaths. Possible values include:

    failover: 1 path per priority group.
    multibus: all valid paths in 1 priority group.
    group_by_serial: 1 priority group per detected serial number.
    group_by_prio: 1 priority group per path priority value. Priorities are determined by callout programs specified as global, per-controller, or per-multipath options.
    group_by_node_name: 1 priority group per target node name. Target node names are fetched in /sys/class/fc_transport/target*/node_name.

The default value is failover.
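
The grouping policies are easier to compare on a concrete set of paths. The Python sketch below models three of them on the four paths from the multipath -ll example; the per-path priorities 4 and 1 are assumed values for illustration, and group_paths is a made-up name, not a multipath function.

```python
from collections import defaultdict


def group_paths(paths, policy):
    """Group (name, priority) pairs into path groups for a given policy."""
    if policy == "failover":
        # 1 path per priority group
        return [[name] for name, _prio in paths]
    if policy == "multibus":
        # all paths in 1 priority group
        return [[name for name, _prio in paths]]
    if policy == "group_by_prio":
        # 1 priority group per priority value, highest priority first
        groups = defaultdict(list)
        for name, prio in paths:
            groups[prio].append(name)
        return [groups[prio] for prio in sorted(groups, reverse=True)]
    raise ValueError("policy not modelled in this sketch: %s" % policy)


# Assumed priorities: 4 for the first two paths, 1 for the other two.
paths = [("sda", 4), ("sdca", 4), ("sdct", 1), ("sdq", 1)]

for policy in ("failover", "multibus", "group_by_prio"):
    print(policy, group_paths(paths, policy))
```

With group_by_prio the result has two groups of two paths each, which matches the shape of the example output above.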

getuid_callout     :

Specifies the default program and arguments to call out to obtain a unique path identifier. An absolute path is required.
The default value is /lib/udev/scsi_id --whitelisted --device=/dev/%n.

prio     : Specifies the default function to call to obtain a path priority value. For example, the ALUA bits in SPC-3 provide an exploitable prio value. Possible values include:
    const: Set a priority of 1 to all paths.
    emc: Generate the path priority for EMC arrays.
    alua: Generate the path priority based on the SCSI-3 ALUA settings.
    tpg_pref: Generate the path priority based on the SCSI-3 ALUA settings, using the preferred port bit.
    ontap: Generate the path priority for NetApp arrays.
    rdac: Generate the path priority for LSI/Engenio RDAC controller.
    hp_sw: Generate the path priority for Compaq/HP controller in active/standby mode.
    hds: Generate the path priority for Hitachi HDS Modular storage arrays.
The default value is const.


path_checker     :

Specifies the default method used to determine the state of the paths. Possible values include:
    readsector0: Read the first sector of the device.
    tur: Issue a TEST UNIT READY to the device.
    emc_clariion: Query the EMC Clariion specific EVPD page 0xC0 to determine the path state.
    hp_sw: Check the path state for HP storage arrays with Active/Standby firmware.
    rdac: Check the path state for LSI/Engenio RDAC storage controller.
    directio: Read the first sector with direct I/O.
The default value is directio.

failback     :
Manages path group failback.
    immediate : A value of immediate specifies immediate failback to the highest priority path group that contains active paths.
    manual : A value of manual specifies that there should not be immediate failback but that failback can happen only with operator intervention.
    followover : A value of followover specifies that automatic failback should be performed when the first path of a path group becomes active. This keeps a node from automatically failing back when another node requested the failover.
A numeric value greater than zero specifies deferred failback, expressed in seconds.
The default value is manual.


rr_min_io : Specifies the number of I/O requests to route to a path before switching to the next path in the current path group. This setting is only for systems running kernels older than 2.6.31. Newer systems should use rr_min_io_rq. The default value is 1000.

rr_min_io_rq : Specifies the number of I/O requests to route to a path before switching to the next path in the current path group, using request-based device-mapper-multipath. This setting should be used on systems running current kernels. On systems running kernels older than 2.6.31, use rr_min_io. The default value is 1.


rr_weight : If set to priorities, then instead of sending rr_min_io requests to a path before calling path_selector to choose the next path, the number of requests to send is determined by rr_min_io times the path's priority, as determined by the prio function. If set to uniform, all path weights are equal. The default value is uniform.
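
The rr_weight arithmetic can be worked through in a few lines of Python. This is only a sketch of the rule described above; requests_before_switch is a made-up name and the path priority of 4 is an assumed value.

```python
def requests_before_switch(rr_min_io, path_prio, rr_weight="uniform"):
    """Number of requests sent to a path before the next path is chosen."""
    if rr_weight == "priorities":
        # rr_min_io times the path's priority
        return rr_min_io * path_prio
    if rr_weight == "uniform":
        # all path weights are equal
        return rr_min_io
    raise ValueError("unknown rr_weight: %s" % rr_weight)


# rr_min_io 128 as in the sample configuration, assumed path priority 4
print(requests_before_switch(128, 4, "uniform"))
print(requests_before_switch(128, 4, "priorities"))
```

So with rr_weight priorities, a higher priority path receives proportionally more requests per turn than a lower priority one in the same group.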

no_path_retry     : A numeric value for this attribute specifies the number of times the system should attempt to use a failed path before disabling queueing.

    fail : A value of fail indicates immediate failure, without queueing.
    queue: A value of queue indicates that queueing should not stop until the path is fixed.
The default value is 0.

user_friendly_names : If set to yes, specifies that the system should use the /etc/multipath/bindings file to assign a persistent and unique alias to the multipath, in the form of mpathn. If set to no, specifies that the system should use the WWID as the alias for the multipath. In either case, what is specified here will be overridden by any device-specific aliases you specify in the multipaths section of the configuration file. The default value is no.

queue_without_daemon : If set to no, the multipathd daemon will disable queueing for all devices when it is shut down. The default value is no.

flush_on_last_del : If set to yes, the multipathd daemon will disable queueing when the last path to a device has been deleted. The default value is no.

max_fds : Sets the maximum number of open file descriptors that can be opened by multipath and the multipathd daemon. This is equivalent to the ulimit -n command. As of the Red Hat Enterprise Linux 6.3 release, the default value is max, which sets this to the system limit from /proc/sys/fs/nr_open. For earlier releases, if this is not set the maximum number of open file descriptors is taken from the calling process; it is usually 1024. To be safe, this should be set to the maximum number of paths plus 32, if that number is greater than 1024.
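
The sizing rule for max_fds on releases where the default is not max can be written down as a tiny Python helper. It is only a sketch of the rule above; suggested_max_fds is a made-up name and the path counts are example values.

```python
USUAL_LIMIT = 1024  # the usual limit inherited from the calling process


def suggested_max_fds(max_paths):
    """Maximum number of paths plus 32, if that is greater than 1024."""
    needed = max_paths + 32
    return needed if needed > USUAL_LIMIT else USUAL_LIMIT


print(suggested_max_fds(100))
print(suggested_max_fds(2000))
```

For a host with a few hundred paths the usual limit is already enough; only large setups need an explicit value.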

checker_timeout : The timeout to use for path checkers that issue SCSI commands with an explicit timeout, in seconds. The default value is taken from /sys/block/sdx/device/timeout.

fast_io_fail_tmo : The number of seconds the SCSI layer will wait after a problem has been detected on an FC remote port before failing I/O to devices on that remote port. This value should be smaller than the value of dev_loss_tmo. Setting this to off will disable the timeout. The default value is determined by the OS.

dev_loss_tmo : The number of seconds the SCSI layer will wait after a problem has been detected on an FC remote port before removing it from the system. Setting this to infinity will set this to 2147483647 seconds, or 68 years. The default value is determined by the OS.
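
The two numbers for infinity can be checked quickly, along with the rule that fast_io_fail_tmo should be smaller than dev_loss_tmo. The Python below is a sketch; timeouts_consistent is a made-up helper and the values 5 and 30 passed to it are arbitrary examples.

```python
INFINITY_SECONDS = 2**31 - 1             # largest signed 32-bit integer
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def timeouts_consistent(fast_io_fail_tmo, dev_loss_tmo):
    """True when fast_io_fail_tmo is smaller than dev_loss_tmo."""
    return fast_io_fail_tmo < dev_loss_tmo


print(INFINITY_SECONDS)
print(int(INFINITY_SECONDS / SECONDS_PER_YEAR))
print(timeouts_consistent(5, 30))
```

The first line prints 2147483647 and the second prints 68, which is where the "68 years" figure comes from.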

How to verify UDP packet communication between two linux systems?

Posted on 12:13 by Unknown
Guys,

Today, I had to check UDP packet communication between a linux system and a windows system. The main purpose of the windows system was to capture syslog data from various linux systems in ArcSight, so remote log forwarding was enabled on the clients. These are the steps I followed :

Sending UDP packets from client :

$ nc -uv   IP_of_system_receiving_UDP   port_on_which_UDP_is_received
Hello
This is test UDP packet
Are you capturing it
Please let me know


Example :

A$ nc -uv   192.1.2.10   514
Hello
This is test UDP packet
Are you capturing it
Please let me know


If you capture the packets with wireshark / tshark or tcpdump on the receiving windows or linux system, you'll see the lines above arrive.

If you want a linux box to listen for and receive UDP packets, you can execute this :

$ nc -luv port

Example :

B$ nc -luv   514
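
If nc is not available on either side, the same send and listen test can be sketched with Python's standard socket module. The example below runs both ends on the loopback address and lets the OS pick a free port, so it needs no root rights; the function name udp_roundtrip, the address and the timeout are choices made for this illustration. For a real test, the receiver would bind to the real port (for example 514) and the sender would use the remote IP.

```python
import socket


def udp_roundtrip(message, host="127.0.0.1", timeout=2.0):
    """Send one UDP datagram to a local listener and return what it received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind((host, 0))        # port 0: the OS picks a free port
        receiver.settimeout(timeout)    # do not wait forever if nothing arrives
        port = receiver.getsockname()[1]
        sender.sendto(message, (host, port))   # like typing a line into 'nc -u'
        data, _addr = receiver.recvfrom(4096)  # like the 'nc -lu' side
        return data


print(udp_roundtrip(b"This is test UDP packet"))
```

Because UDP has no connection and no acknowledgement, the sender never learns whether the packet arrived. That is why the check always has to be done on the receiving side, either with a listener like this or with a packet capture.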

If you don't have the nc command, install it like this :

$ yum install nc -y

Try at your own risk :)

Tuesday, 26 February 2013

New posts are coming soon..

Posted on 10:45 by Unknown
Hi Guys,

It's been a long time since I posted any article or issue here. There were a few transitions in my career and I was a bit busy, so I didn't get time to update or post. There are more technologies that I'll discuss. The following are in the pipeline :

Linux Kernel
A bit about Red Hat Company and Product
Linux troubleshooting steps and basic concepts
CISCO Unified Computing System (UCS)
CISCO Fabric Interconnect Switch, VLAN, port channeling etc
FCoE, FC protocol
Red Hat Cluster
Veritas Cluster
Veritas Volume Manager
DMP or Veritas Dynamic Multipathing
SAN/NAS
VMware vSphere Virtualization, ESX 5.0.0
vCenter and vCenter HeartBeat
vMotion, Update manager
NetApp Storage Filers, Volume, LUN, masking, mapping, exporting, system log analysis etc
SnapVault
Symantec NetBackup Technology
Scripting : Python and Perl

So, stay tuned...Good Luck :)



Posted in ESX, Linux, Storage, UCS