IBM Power Systems: VIO issue

VIOS data is not collected through the REST API when the VIOS does not communicate with the HMC. We have seen this many times; check the points below.
I would start with point 4, then continue with point 3. Both have helped a few users recently.

  1. Make sure the perfprovider service is started; it should have this entry in
    /etc/inittab:
    perfprovider:2:once:/usr/bin/startsrc -s perfprovider > /dev/null 2>&1
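
The check in point 1 can be sketched as a small helper. The function name has_perfprovider_entry and its optional file argument are illustrative only and not part of any IBM tooling.

```shell
# Sketch: check that an inittab-style file contains a perfprovider entry.
# has_perfprovider_entry is a hypothetical helper name; the file argument
# defaults to /etc/inittab and exists only to make the check easy to try out.
has_perfprovider_entry() {
    inittab_file="${1:-/etc/inittab}"
    # Active entries start with the identifier followed by a colon.
    grep -q '^perfprovider:' "$inittab_file"
}
```

As root on the VIOS, "has_perfprovider_entry && lssrc -s perfprovider" then shows whether the subsystem is active; if it is reported as inoperative, "startsrc -s perfprovider" starts it without a reboot.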
    
  2. IBM can provide a script, cleanup_cmdb_with_logging.sh, that cleans up the CMDB, which resolves the issue.

  3. To resolve the issue, see www-01.ibm.com/support/docview.wss?uid=isg3T1024482
    Restarting the VIOS daemon should also work: www.ibm.com/support/pages/when-using-hmc-gui-you-see-message-unable-connect-database-error-occurred

  4. There might be a problem with vio_daemon being stuck or not communicating with the HMC: forum.xorux.com/discussion/comment/3450#Comment_3450
    www.ibm.com/support/pages/node/629995
    It can also be related to DNS resolution of the VIOS hostname(s).
    This might also help: follow point 2 of the IBM Tech Note below to stop and start vio_daemon: www.ibm.com/support/pages/when-using-hmc-gui-you-see-message-unable-connect-database-error-occurred

    Basically, it should be enough to start or restart vio_daemon as root:
      ls -l /usr/ios/db/bin/solid*
      ps -ef | egrep "vio_daemon|solid|db"
      lssrc -ls vio_daemon  
      stopsrc -s vio_daemon
      startsrc -s vio_daemon
      sleep 10
    
      ps -ef | egrep "vio_daemon|solid|db"
      lssrc -ls vio_daemon
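
To compare the state before and after the restart without reading the whole listing, the status column can be pulled out of the short "lssrc -s" output. This is only a sketch: src_status is a hypothetical helper name, and it assumes the usual layout where the status is the last field on the subsystem's line.

```shell
# Sketch: extract the Status column for one subsystem from `lssrc -s` output.
# src_status is a hypothetical helper; it reads the lssrc output on stdin so
# that it can also be tried against saved output.
src_status() {
    # The status (active, inoperative, ...) is the last field on the
    # subsystem's line; the PID column is empty when it is not running.
    awk -v name="$1" '$1 == name { print $NF }'
}
```

Usage: "lssrc -s vio_daemon | src_status vio_daemon" should print active after the startsrc and the short sleep.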
    
  5. www.ibm.com/support/pages/when-using-hmc-gui-you-see-message-unable-connect-database-error-occurred
    After running the script cleanup_cmdb_with_logging.sh it works correctly.
    It helped another user as well; he saw this error on the HMC under Virtual Networks: Error occurred while querying for SharedEthernetAdapter from VIOS ....

  6. Here is what resolved our internal issue on our P10 machine: proper DNS setup and a vio_daemon restart.
    # tail -1 /etc/hosts
    	10.x.x.x        p10-vios p10-vios.int.xorux.com
    # tail -1 /etc/netsvc.conf
    hosts=local,bind4
    # cat /etc/resolv.conf
    domain int.xorux.com
    nameserver 10.x.x.x   
    nameserver 1.1.1.1
    --> Make sure DNS is working properly (check with nslookup/ping).
    # stopsrc -s vio_daemon
    # startsrc -s vio_daemon
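
The lookup order from the setup above can be verified with a short helper. It is a sketch only: netsvc_hosts_order is a hypothetical name, and printing the last hosts= line simply mirrors the "tail -1" check shown above.

```shell
# Sketch: print the hosts= lookup order from a netsvc.conf-style file.
# netsvc_hosts_order is a hypothetical helper; the file argument defaults to
# /etc/netsvc.conf.
netsvc_hosts_order() {
    conf_file="${1:-/etc/netsvc.conf}"
    # Drop comment lines, strip whitespace, and print the value of the
    # last hosts= line found in the file.
    sed -n -e '/^[[:space:]]*#/d' -e 's/[[:space:]]//g' -e 's/^hosts=//p' "$conf_file" | tail -1
}
```

With the configuration from this point, "netsvc_hosts_order" prints local,bind4. Follow it with nslookup and ping against the VIOS hostname before restarting vio_daemon.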
    
  7. Can you test the same solution as described at the end of this thread: forum.xorux.com/discussion/comment/5744#Comment_5744

  8. The problem was with the VIO servers: certain "work with virtual networks" functions would throw an error on the HMC. The solution was to force the VIOS to use IPv4 name resolution. We received this from IBM:
    # vi /etc/netsvc.conf
    hosts=local4,bind4      <===== change "hosts=local,bind" to this
    
    Then
    /usr/bin/stopsrc -s vio_daemon
    Wait 300 seconds or until vio_daemon has stopped.
    /usr/sbin/slibclean
    rm -rf /home/ios/CM
    /usr/bin/startsrc -s vio_daemon -a '-d 4'
    ps -ef | grep vio_chgmgt | grep -v grep | awk '{print $2}'
    kill -1 <PID printed by the previous command>
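
The "wait 300 seconds or until vio_daemon has stopped" step can be scripted as a polling loop. This is a sketch under assumptions: wait_until_stopped is a hypothetical helper, the 300-second default mirrors the wait mentioned above, and the 5-second poll interval is an arbitrary choice.

```shell
# Sketch: wait until an SRC subsystem is no longer reported as running.
# wait_until_stopped is a hypothetical helper: SUBSYSTEM [TIMEOUT] [INTERVAL].
wait_until_stopped() {
    subsystem="$1"
    timeout="${2:-300}"
    interval="${3:-5}"
    waited=0
    # Keep looping while lssrc lists the subsystem with any status other
    # than "inoperative". If lssrc prints no line for it, the loop ends.
    while lssrc -s "$subsystem" | awk -v name="$subsystem" '$1 == name && $NF != "inoperative" { found = 1 } END { exit !found }'; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "$subsystem still not stopped after ${waited}s" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}
```

Usage: run "stopsrc -s vio_daemon", then "wait_until_stopped vio_daemon", and continue with slibclean only when it returns successfully.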
    
  9. IBM support:
    The VIOS version that you are using has an issue with log collection, so I'm missing a copy of the CMDB, which is the database running on the VIOS that the HMC queries.
    I do not see any clear error in the logs, but I can't check the DB to see if it's properly populated.
    The good news is that we can recreate this DB without any impact on the running LPARs with the following steps:
    $ oem_setup_env
    # stopsrc -s vio_daemon
    # /usr/sbin/slibclean
    # rm -rf /home/ios/CM
    # rm /home/ios/logs/viod_bkps/*
    # startsrc -s vio_daemon -a '-d 3'
    # kill -1 <vio_daemon's PID>
    
    Then wait a few minutes and retry the operation that was failing.
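
The final kill in the steps above needs the PID of vio_daemon. One way to look it up is sketched here; pids_matching is a hypothetical helper that reads "ps -ef" output on stdin and assumes the PID is in the second column.

```shell
# Sketch: print the PIDs of processes whose ps -ef line mentions a given name.
# pids_matching is a hypothetical helper; it reads `ps -ef` output on stdin.
pids_matching() {
    # Column 2 of ps -ef is the PID. Skip the header line and any grep/awk
    # helper processes that carry the name on their own command line.
    awk -v name="$1" 'NR > 1 && index($0, name) && $0 !~ /(grep|awk) / { print $2 }'
}
```

Usage: pid=$(ps -ef | pids_matching vio_daemon), then run kill -1 "$pid" only if exactly one PID was printed. Signal 1 is SIGHUP.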

  10. forum.xorux.com/discussion/comment/6489#Comment_6489