Monthly Archives: February 2013

the operation failed because no suitable resource was found

Today I ran into an annoying vCloud Director issue: I could not start a VM and kept getting the error ‘the operation failed because no suitable resource was found’.

I had been up from 03:00 to 07:00, so I really could not read long community threads and went straight to the first Google search result :)
http://www.vmware.com/support/vcd/doc/rel_notes_vcloud_director_51.html

OVF template creation and media upload operations sometimes fail on organization vDCs backed by datastore clusters
Sometimes, when you attempt to upload media or create an OVF template on an organization that is backed by a datastore cluster, the operation fails. This occurs when the datastore cluster threshold has been exceeded. In the case of OVF template creation, the error message that displays is misleading, as it states that “The operation failed because no suitable resource was found.”

I’m not using a datastore cluster, but I looked at the warnings on my datastores and saw that the disk usage threshold was yellow on the datastore that held the VM. I increased the LUN size on the storage and in vCenter, then reconnected vCenter to vCloud Director (I’m not sure that step was needed) and started the VM.

It worked. I hope you do not face this issue, but if your problem is like mine you now have a solution. Upgrading to the latest version is of course also a good idea, although vCloud Director has become quite problematic these days …
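A quick way to sanity-check a datastore against a usage threshold is plain integer arithmetic. The sketch below is only an illustration: the function name, the sample sizes and the 75%/85% levels are my assumptions (the levels are adjustable), not values taken from the environment described here.

```shell
#!/bin/sh
# Hypothetical helper: classify datastore usage from capacity and free space (both in GB).
# The warning/alert percentages are assumed values and can be overridden.
datastore_status() {
    capacity_gb=$1
    free_gb=$2
    warn_pct=${3:-75}
    alert_pct=${4:-85}
    used_pct=$(( (capacity_gb - free_gb) * 100 / capacity_gb ))
    if [ "$used_pct" -ge "$alert_pct" ]; then
        echo "red $used_pct"
    elif [ "$used_pct" -ge "$warn_pct" ]; then
        echo "yellow $used_pct"
    else
        echo "green $used_pct"
    fi
}

# Example: a 2048 GB datastore with 400 GB free
datastore_status 2048 400
```

With the sample numbers the datastore is 80% used, which falls into the yellow band of this sketch.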


IBM Tivoli Storage Productivity Center and Storwize v7000 Performance Monitoring

Tivoli version 5.1.0.0
V7000 version 6.3.0.3 (I also tried 6.4.1.3)

First, log in to the v7000 management GUI, open the “Access->Users” section on the left side (lock icon) and click New User:

Set any name you want; I created admin.
Set the authentication mode to Local.
Set any user group you want; I selected SecurityAdmin.
Do not set a password, because we will use a public key instead. To create a key, download PuTTYgen, open it and click Generate, then move your mouse over the blank area (or grab a coffee) until generation finishes. Save both the public key and the private key, then supply the public key when creating the user.

On the user creation screen, provide the generated SSH public key.

Now go to the Tivoli GUI.

Step 1

Under Administrative Services -> Data Sources -> Storage Subsystems, click Add.
Device type: IBM SAN Volume Controller / IBM Storwize V7000
Software Version: 5+
IP Address: your v7000 IP
Select Key: Upload New Key
Administrator User Name: I used superuser
Administrator Password: the superuser password
User Name: I used the user created above
Private SSH Key: I used the private key generated by PuTTYgen (.ppk extension)
Click Add and wait a little.
An additional table will appear showing the storage; click Next below it.

The discovery process will start; it should succeed, then click Next again.

A new page appears with the storage already selected; click Next.

The next page is about data collection. You can choose a custom group or a ready-to-use one; choose Subsystem Advanced Group.

On the Summary page, click Next.

Click Finish, then click View Job History and wait for the running job to complete successfully.

Go to IBM Tivoli Storage Productivity Center -> Monitoring -> Probes -> TPCUser.Subsystem Advanced Probe. You will see that the newly added storage subsystem is listed under Current selections. If you want, you can remove it or add the storage subsystem to another monitoring probe.

You can also see the schedule under the When to Run tab, and you can create alerts too!

Step 2

Let's get ready to collect performance data from the newly added storage subsystem.

In the Disk Manager tree, go to Monitoring and open Subsystem Performance Monitors, then right-click it and select Create Subsystem Performance Monitor.
Choose the storage and move it to the right side. Click Sampling and Scheduling and change nothing except the duration, which should be set to “Continue indefinitely”.
Save the configuration, give the performance monitor a name, confirm creation, wait for the job to finish, and then allow several 5-minute intervals so that enough data accumulates.

That's it!

vCloud Director Datastore Recognition Issue

Newly added datastores do not appear to be assigned to the provider, and their usage info shows N/A.

Reconnecting vCenter and refreshing the storage profiles, unmounting and remounting the datastore, and disabling and re-enabling the datastore did not help.

Together with support we did the following:

Back up the vCloud Director SQL database.

Stop the cells.

Execute the following SQL commands:
delete from dbo.cluster_compute_resource_inv;
delete from dbo.compute_resource_inv;
delete from dbo.custom_field_manager_inv;
delete from dbo.datacenter_inv;
delete from dbo.datacenter_network_inv;
delete from dbo.datastore_inv;
delete from dbo.datastore_profile_inv;
delete from dbo.dv_portgroup_inv;
delete from dbo.dv_switch_inv;
delete from dbo.folder_inv;
delete from dbo.managed_server_inv;
delete from dbo.managed_server_datastore_inv;
delete from dbo.managed_server_network_inv;
delete from dbo.network_inv;
delete from dbo.resource_pool_inv;
delete from dbo.storage_pod_inv;
delete from dbo.storage_profile_inv;
delete from dbo.task_inv;
delete from dbo.vm_inv;
delete from dbo.property_map;

DELETE FROM qrtz_simple_triggers;
DELETE FROM qrtz_fired_triggers;
DELETE FROM qrtz_cron_triggers;
DELETE FROM qrtz_job_listeners;
DELETE FROM qrtz_scheduler_state;
DELETE FROM qrtz_blob_triggers;
DELETE FROM qrtz_paused_trigger_grps;
DELETE FROM qrtz_triggers;
DELETE FROM qrtz_job_details;

Start the cells.

Reconnect vCenter from vCloud Director.

Hopefully all the problems are gone!
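In a long list like the one above, a missing semicolon or a forgotten table is easy to overlook, so the statements can be generated from a table list instead of typed by hand. This is only a sketch under my own assumptions: the function name is made up, only three of the tables are shown for brevity, and wrapping the statements in a transaction is my addition, not something support asked for.

```shell
#!/bin/sh
# Hypothetical generator: prints one DELETE statement per table name,
# wrapped in a SQL Server transaction (the transaction is an assumption of this sketch).
gen_inventory_cleanup() {
    echo "BEGIN TRANSACTION;"
    for table in "$@"; do
        echo "DELETE FROM $table;"
    done
    echo "COMMIT;"
}

# Example with three of the tables listed above; review the output before using it.
gen_inventory_cleanup dbo.datastore_inv dbo.datastore_profile_inv dbo.storage_pod_inv
```

The output is plain text, so it can be reviewed and saved to a file before anything touches the database, and only after the backup step.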

CentOS & Red Hat 6 and NIC Bonding

Today I learned that on CentOS 6 (and of course on Red Hat 6, and on Oracle Linux 6 as well) the modprobe configuration has changed.

Part 1 

Let's quickly configure the bonding.

As before, you have to create an ifcfg-bond0 file and configure the NICs that will join the bond.

vi /etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0
IPADDR=your.ip.add.ress
NETWORK=your.net.work.0
NETMASK=255.255.255.0
USERCTL=no
BOOTPROTO=none
ONBOOT=yes

Then configure the NICs that will join the bond:

vi /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none

Do the same for the other NICs that will be slaves of bond0.

Now we come to the modprobe configuration. On older CentOS/Red Hat versions, bonding was configured in /etc/modprobe.conf, but on CentOS 6 it is a little different: go to the /etc/modprobe.d directory and create a file like the one below.
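Since every slave file has the same content apart from the device name, it can be produced in a loop. A minimal sketch, assuming two slaves named eth0 and eth1; the function name is hypothetical, and the example only prints the content instead of writing files.

```shell
#!/bin/sh
# Hypothetical helper: print the ifcfg content for one slave NIC of a bond.
slave_cfg() {
    nic=$1
    bond=$2
    cat <<EOF
DEVICE=$nic
USERCTL=no
ONBOOT=yes
MASTER=$bond
SLAVE=yes
BOOTPROTO=none
EOF
}

# Example: preview the files for eth0 and eth1 instead of writing them.
for nic in eth0 eth1; do
    echo "# /etc/sysconfig/network-scripts/ifcfg-$nic"
    slave_cfg "$nic" bond0
done
```

Once the preview looks right, the output of slave_cfg can be redirected into the matching ifcfg file for each NIC.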

vi /etc/modprobe.d/bonding.conf

alias bond0 bonding
options bond0 mode=balance-alb miimon=100

Make sure the cables are connected to the NICs.

Some articles suggest running modprobe bonding and restarting the network service. That did not work for me, so please try rebooting the server.

The following commands show the resulting state:

cat /proc/net/bonding/bond0
ifconfig
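To see at a glance which slaves are up, the bonding status file can be reduced to one line per slave. This is a sketch that assumes the usual "Slave Interface:" and "MII Status:" lines of the bonding driver's status file; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list each slave of a bond with its MII link status.
# Reads the bonding status file given as argument (default /proc/net/bonding/bond0).
bond_slaves() {
    awk '
        /^Slave Interface:/ { slave = $3; next }
        /^MII Status:/ && slave != "" { print slave, $3; slave = "" }
    ' "${1:-/proc/net/bonding/bond0}"
}

# Example: bond_slaves /proc/net/bonding/bond0
```

The bond's own "MII Status" line comes before the first slave section, so it is skipped and only per-slave states are printed.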

For the balancing modes, please see this article: http://www.cyberciti.biz/howto/question/static/linux-ethernet-bonding-driver-howto.php

Part 2

This is an update written after finishing the article above.

Please make sure the NetworkManager service is disabled:

chkconfig NetworkManager off

Second, and this is very important: move the options line from /etc/modprobe.d/bonding.conf into the bond's own file under /etc/sysconfig/network-scripts (ifcfg-bond0, or ifcfg-bond1 in the example below) as a BONDING_OPTS line, like below

[root@kahin02-11g ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
IPADDR=10.79.79.2
NETWORK=10.79.79.0
NETMASK=255.255.255.0
USERCTL=no
ONBOOT=yes
BONDING_OPTS="mode=6 miimon=100"
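Note that mode=6 here and mode=balance-alb in Part 1 name the same bonding mode; the driver accepts either the name or the number. The lookup below is a small sketch of that mapping (the function name is hypothetical).

```shell
#!/bin/sh
# Map a bonding mode name to its numeric value as used by the Linux bonding driver.
bond_mode_number() {
    case "$1" in
        balance-rr)    echo 0 ;;
        active-backup) echo 1 ;;
        balance-xor)   echo 2 ;;
        broadcast)     echo 3 ;;
        802.3ad)       echo 4 ;;
        balance-tlb)   echo 5 ;;
        balance-alb)   echo 6 ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Example: look up the mode used in Part 1
bond_mode_number balance-alb
```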

Also, if you disable NetworkManager you have to take care of DNS yourself, like below

DEVICE=bond0
IPADDR=xxx.yyy.zzz.ttt
NETWORK=xxx.yyy.zzz.ttt
NETMASK=255.255.255.0
USERCTL=no
ONBOOT=yes
DNS2=8.8.8.8
DNS1=8.8.4.4
BONDING_OPTS="mode=6 miimon=100"

One more thing: I couldn’t fix the default gateway issue through the config files, so to solve it I added the following line to rc.local

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don’t
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local
route add default dev bond0 gw your.gateway.ip.address

I would also like to add the output below. If you see something like this, you have made a mistake: xxx.yyy.ttt.fff should not appear everywhere, only under the bonding interfaces.
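Because rc.local runs on every boot, it can be useful to check whether a default route already exists before adding one. A minimal sketch, assuming `ip route show` style text as input; the function name and the 192.0.2.1 sample address are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the default gateway found in `ip route` style output on stdin.
# Prints nothing when there is no default route with a "via" gateway.
default_gw() {
    awk '$1 == "default" { for (i = 2; i < NF; i++) if ($i == "via") { print $(i + 1); exit } }'
}

# Example with sample text; on a real host you would pipe in `ip route show`.
printf 'default via 192.0.2.1 dev bond0\n10.79.79.0/24 dev bond1 proto kernel scope link src 10.79.79.2\n' | default_gw
```

In rc.local the route add line could then be run only when this helper prints nothing.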

bond0 Link encap:Ethernet HWaddr E4:1F:13:68:6E:20
inet addr:xxx.yyy.ttt.fff Bcast:81.21.160.255 Mask:255.255.255.0
inet6 addr: fe80::e61f:13ff:fe68:6e20/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:359 errors:0 dropped:0 overruns:0 frame:0
TX packets:520 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:55635 (54.3 KiB) TX bytes:37727 (36.8 KiB)

bond1 Link encap:Ethernet HWaddr 00:15:17:CF:7F:A0
inet addr:10.79.79.2 Bcast:10.79.79.255 Mask:255.255.255.0
inet6 addr: fe80::215:17ff:fecf:7fa0/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:256 errors:0 dropped:1 overruns:0 frame:0
TX packets:410 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:20137 (19.6 KiB) TX bytes:26388 (25.7 KiB)

eth0 Link encap:Ethernet HWaddr E4:1F:13:68:6E:20
inet addr:xxx.yyy.ttt.fff Bcast:81.21.160.255 Mask:255.255.255.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:359 errors:0 dropped:0 overruns:0 frame:0
TX packets:520 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:55635 (54.3 KiB) TX bytes:37727 (36.8 KiB)

eth1 Link encap:Ethernet HWaddr E4:1F:13:68:6E:22
UP BROADCAST SLAVE MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

eth2 Link encap:Ethernet HWaddr 00:15:17:CF:7F:A1
UP BROADCAST SLAVE MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Interrupt:26 Memory:97c60000-97c80000

eth3 Link encap:Ethernet HWaddr 00:15:17:CF:7F:A0
inet addr:xxx.yyy.ttt.fff Bcast:81.21.160.255 Mask:255.255.255.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:256 errors:0 dropped:0 overruns:0 frame:0
TX packets:410 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:20137 (19.6 KiB) TX bytes:26388 (25.7 KiB)
Interrupt:25 Memory:97c20000-97c40000

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:26 errors:0 dropped:0 overruns:0 frame:0
TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2028 (1.9 KiB) TX bytes:2028 (1.9 KiB)

usb0 Link encap:Ethernet HWaddr E6:1F:13:5A:6E:23
inet addr:xxx.yyy.ttt.fff Bcast:81.21.160.255 Mask:255.255.255.0
inet6 addr: fe80::e41f:13ff:fe5a:6e23/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:56 errors:0 dropped:0 overruns:0 frame:0
TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3640 (3.5 KiB) TX bytes:468 (468.0 b)

APD (All Paths Down), Hanging Tasks and Solutions

We faced an APD (All Paths Down) issue and tried rescanning the storage/datastores/LUNs from the GUI, the command line and storageRM. Nothing helped; only restarting the ESXi host did.
You can see the logs and an article about APD and PDL below.

========================================================

vmkernel.all:2013-02-06T09:40:02.033Z cpu3:8732)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c3" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:02.033Z cpu3:8732)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c7" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:02.033Z cpu1:8732)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c6" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:02.033Z cpu1:8732)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c5" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:02.033Z cpu2:14962)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c9" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:02.033Z cpu1:8732)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c8" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:03.032Z cpu1:8732)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c0" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:03.033Z cpu2:14962)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c4" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:03.033Z cpu2:14962)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c6" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:03.033Z cpu2:14962)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c9" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:07.032Z cpu5:14962)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.6005076801870534980000000000011e" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:07.032Z cpu5:14962)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c5" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:07.032Z cpu4:8732)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c2" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:08.033Z cpu1:8732)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c2" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:08.033Z cpu1:8732)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.600507680187053498000000000000c5" - failed to issue command due to Not found (APD), try again...
vmkernel.all:2013-02-06T09:40:08.033Z cpu1:8732)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.6005076801870534980000000000011e" - failed to issue command due to Not found (APD), try again...

================================================
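When the log is full of these warnings, it helps to know which devices are affected and how often. The pipeline below is a sketch that assumes the message text shown above ("Not found (APD)" and a naa. device id on the same line); the function name and the log path in the example are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: count APD failover warnings per device in vmkernel log text.
# Reads log lines on stdin and prints "<device> <count>", busiest device first.
apd_by_device() {
    grep 'Not found (APD)' \
        | grep -o 'naa\.[0-9a-f]*' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $2, $1 }'
}

# Example: apd_by_device < /var/log/vmkernel.log
```

If every LUN from the same storage shows up, the problem is more likely in the path to the array than in a single volume.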

Related article : http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2004684

We also faced hanging tasks and locking issues.

If a task such as restarting a VM hangs, you need to restart the vpxd service.

If a task such as vMotion hangs, you need to restart the vCenter service.

As for locking issues: when you try to start a VM, it can fail with an error such as "cannot read vmdk file" or a lock error. In our case we used the standard Linux lsof command. Interestingly, to bring the VM up we moved it between different ESXi nodes, but it still did not start. After investigating, we saw that on one of the ESXi nodes the VM's files appeared to be open, and this caused the lock. After we killed the process holding them, everything was OK.