System hangs when block device fails, Amazon EC2/EBS
I'm running Ubuntu 12.04 LTS Precise in Amazon EC2.
I'm testing how to handle a failed EBS volume (block device), which I simulate by "force-detaching" it through the API. I've tested this with software RAID (mdadm), LVM and GlusterFS. I want the failure to be handled gracefully; with RAID, for instance, mdadm should detect the error and mark the disk as failed.
The procedure is (raid):
1. Attach 2 drives
2. Set them up as a RAID1 array
3. Force-detach one of the drives
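For anyone who wants to reproduce this, steps 2 and 3 might look roughly like the sketch below. It only builds and prints the commands, it does not run them. The volume ID, the device names and /dev/md0 are placeholders, and the use of the AWS command line tool for the force-detach is an assumption; the original test only says "through the API".

```python
def reproduction_commands(volume_id, dev_a, dev_b, md_device="/dev/md0"):
    """Return the commands for steps 2 and 3 as argument lists (nothing is executed)."""
    # Step 2: create a two-disk RAID1 array from the attached volumes.
    create_raid1 = [
        "mdadm", "--create", md_device,
        "--level=1", "--raid-devices=2",
        dev_a, dev_b,
    ]
    # Step 3: force-detach one EBS volume to simulate a failed disk.
    # Assumption: the AWS CLI is used; any client that calls the EC2
    # DetachVolume API with the force option should behave the same way.
    force_detach = [
        "aws", "ec2", "detach-volume",
        "--volume-id", volume_id, "--force",
    ]
    return [create_raid1, force_detach]


if __name__ == "__main__":
    # Placeholder values; substitute the real volume ID and device names.
    for cmd in reproduction_commands("vol-EXAMPLE", "/dev/xvdf", "/dev/xvdg"):
        print(" ".join(cmd))
```

The commands are printed rather than executed because running them on the wrong devices destroys data.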
Access to the mounted RAID1 array now hangs, and mdadm --detail and other commands don't respond. They simply hang, and the system ends up in a state where a reboot is the only option.
What I expected was that the system and mdadm would see the error, mark the drive as failed and continue running in degraded mode.
I've also tested this on Ubuntu 10.04.4 LTS (Lucid Lynx), with the same result.
However, when I tested it on a Red Hat distribution, everything seemed to work as expected. So what's the difference? Why does Ubuntu handle this so poorly?
Question information
- Language: English
- Status: Open
- For: Ubuntu
- Assignee: No assignee
Can you help with this problem?
Provide an answer of your own, or ask Rudi Meyer for more information if necessary.