Sometimes, when you delete an instance on OpenStack, it gets stuck in the deleting state. There are a few ways to fix this:

Reset instance state

Following the official manual's advice, first reset the instance's state, then run the delete again.

nova reset-state c6bbbf26-b40a-47e7-8d5c-eb17bf65c485 --active
nova delete c6bbbf26-b40a-47e7-8d5c-eb17bf65c485

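--active can sometimes be left out, but without it nova reset-state puts the instance into the error state. In my experience, adding --active goes wrong less often.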

Also, when an instance is stuck in a hard reboot, reset-state --active can bring it back to its original state first, after which you can reboot it again.
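The reboot recovery above could be wrapped up like this. It is only a sketch: the function name recover_stuck_reboot is made up for illustration, and using nova reboot --hard for the second step is my assumption (a plain nova reboot may be enough in your case).

```shell
# Reset a stuck instance back to ACTIVE, then reboot it again.
# Assumption: a hard reboot is wanted for the retry; drop --hard for a soft one.
recover_stuck_reboot() {
    uuid=${1-}
    if [ -z "$uuid" ]; then
        echo "usage: recover_stuck_reboot <instance-uuid>" >&2
        return 1
    fi
    # Only reboot if the state reset succeeded.
    nova reset-state --active "$uuid" &&
        nova reboot --hard "$uuid"
}
```

Call it with the instance UUID, for example recover_stuck_reboot c6bbbf26-b40a-47e7-8d5c-eb17bf65c485.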

Set running_deleted_instance_action in nova.conf

In nova.conf on the compute node, add the following to the [DEFAULT] section:

running_deleted_instance_action=reap

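Then restart nova-compute. With reap, nova-compute cleans up instances that are marked as deleted in the database but still running on the host, so deleting instances from the dashboard is less likely to get stuck afterwards.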
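Before restarting, it may be worth checking that the option is really set and not commented out. Here is a small sketch: the function name running_deleted_action is hypothetical, and /etc/nova/nova.conf is only assumed as the default path.

```shell
# Print the value of running_deleted_instance_action from a nova.conf-style file.
# Prints nothing if the option is absent or commented out.
# Assumption: the config lives at /etc/nova/nova.conf unless a path is given.
running_deleted_action() {
    conf=${1:-/etc/nova/nova.conf}
    # Strip "key =" from uncommented assignments, trim trailing spaces,
    # and keep only the last assignment found in the file.
    sed -n 's/^[[:space:]]*running_deleted_instance_action[[:space:]]*=[[:space:]]*//p' "$conf" |
        sed 's/[[:space:]]*$//' |
        tail -n 1
}
```

If this prints reap, the setting is in place and nova-compute can be restarted. The service name differs between distributions, so I leave the restart command out.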

Delete instance data from database

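The case I ran into recently: a user tried to delete a broken instance, and it stayed stuck in deleting. On the compute node, /var/lib/nova/instances no longer held any files for the instance, and there was no VM process on the system either.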

nova-compute.log showed the following errors:

root@localhost:~# grep 'During wait destroy, instance disappeared' /var/log/nova/nova-compute.log.1 
2014-12-22 14:50:34.704 13317 ERROR nova.virt.libvirt.driver [-] [instance: 4bacd52b-01d7-4137-9aa9-76133ec58978] During wait destroy, instance disappeared.
2014-12-22 15:51:56.907 13317 ERROR nova.virt.libvirt.driver [-] [instance: 4bacd52b-01d7-4137-9aa9-76133ec58978] During wait destroy, instance disappeared.
2014-12-22 15:51:57.306 13317 ERROR nova.virt.libvirt.driver [-] [instance: 43056c16-29ea-4728-b40e-e566a4d4fbb8] During wait destroy, instance disappeared.
2014-12-22 17:30:43.580 13317 ERROR nova.virt.libvirt.driver [-] [instance: 4bacd52b-01d7-4137-9aa9-76133ec58978] During wait destroy, instance disappeared.
2014-12-22 17:30:43.916 13317 ERROR nova.virt.libvirt.driver [-] [instance: 43056c16-29ea-4728-b40e-e566a4d4fbb8] During wait destroy, instance disappeared.
2014-12-22 17:42:39.099 23348 ERROR nova.virt.libvirt.driver [-] [instance: 4bacd52b-01d7-4137-9aa9-76133ec58978] During wait destroy, instance disappeared.
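The same error repeats for only a couple of instances, so it helps to boil the log down to the distinct UUIDs involved. A rough sketch, where the function name stuck_instance_ids is made up and the log format is assumed to match the lines above:

```shell
# Read nova-compute log lines on stdin and print each distinct instance UUID
# that hit the "instance disappeared" error, one per line, sorted.
stuck_instance_ids() {
    grep 'During wait destroy, instance disappeared' |
        sed -n 's/.*\[instance: \([0-9a-f-]*\)\].*/\1/p' |
        sort -u
}
```

Usage: stuck_instance_ids < /var/log/nova/nova-compute.log.1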

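After some googling, the fixes found online come down to just two:

  1. Reset instance state
  2. Delete the instance's data from the database

I tried the first one several times with no effect, so I decided to go straight into the database and delete the data.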

Using phpMyAdmin's search function, I found that the following tables in the nova database contain instance data:

  1. block_device_mapping
  2. instance_actions
  3. instance_faults
  4. instance_id_mappings
  5. instance_info_caches
  6. instance_system_metadata
  7. instances

Of these, instance_actions can be left alone; delete the instance's rows from the rest.
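For anyone who prefers SQL over clicking through phpMyAdmin, the cleanup could be sketched as below. The function only prints the statements so they can be reviewed before being run. The function name nova_purge_sql is hypothetical, and the column names (instance_uuid, and uuid for instances and instance_id_mappings) are my assumption about the nova schema, so check them against your own database and take a backup first.

```shell
# Print DELETE statements for one instance UUID; nothing is executed.
# Assumption: the column names below match the nova schema in use.
nova_purge_sql() {
    uuid=${1-}
    # Refuse anything that does not look like a UUID, since it ends up in SQL.
    case $uuid in
        ''|*[!0-9a-f-]*) echo "invalid uuid: $uuid" >&2; return 1 ;;
    esac
    # Dependent tables first, instances last, in case foreign keys reference it.
    for table in block_device_mapping instance_faults \
                 instance_info_caches instance_system_metadata; do
        echo "DELETE FROM $table WHERE instance_uuid = '$uuid';"
    done
    echo "DELETE FROM instance_id_mappings WHERE uuid = '$uuid';"
    echo "DELETE FROM instances WHERE uuid = '$uuid';"
}
```

Usage: nova_purge_sql 4bacd52b-01d7-4137-9aa9-76133ec58978, then read the output carefully before feeding it to the mysql client against the nova database.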

References

  1. Reset the state of an instance
  2. Instance marked as deleted but still present on host?
  3. Unable to delete instances in 'ERROR' state
