503 error running multiple-server

Asked by Shengjie Min

I've been following the instructions at http://swift.openstack.org/howto_installmultinode.html#configure-the-proxy-node. I have three storage servers and a proxy server running as 4 VMs (host-only networking).
proxy server: 192.168.175.128
storage 1: 192.168.175.130
storage 2: 192.168.175.131
storage 3: 192.168.175.133

The rings build and rebalance without errors:
rm -f *.builder *.ring.gz backups/*.builder backups/*.ring.gz

swift-ring-builder object.builder create 18 3 1
swift-ring-builder object.builder add z1-192.168.175.130:6000/sda3 1
swift-ring-builder object.builder add z2-192.168.175.131:6000/sda3 1
swift-ring-builder object.builder add z3-192.168.175.133:6000/sda3 1
swift-ring-builder object.builder rebalance

swift-ring-builder container.builder create 18 3 1
swift-ring-builder container.builder add z1-192.168.175.130:6001/sda3 1
swift-ring-builder container.builder add z2-192.168.175.131:6001/sda3 1
swift-ring-builder container.builder add z3-192.168.175.133:6001/sda3 1
swift-ring-builder container.builder rebalance

swift-ring-builder account.builder create 18 3 1
swift-ring-builder account.builder add z1-192.168.175.130:6002/sda3 1
swift-ring-builder account.builder add z2-192.168.175.131:6002/sda3 1
swift-ring-builder account.builder add z3-192.168.175.133:6002/sda3 1
swift-ring-builder account.builder rebalance
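
For readers unfamiliar with the three numbers passed to `create`, here is a small Python sketch of how they are usually read: partition power, replica count, and minimum hours between moves of a partition. The helper name `ring_create_args` is hypothetical and only illustrates the arithmetic; it is not part of swift-ring-builder.

```python
def ring_create_args(part_power, replicas, min_part_hours):
    """Summarise the numeric arguments of a ring `create` command.

    Hypothetical helper for illustration: the ring is split into
    2 ** part_power partitions, each stored `replicas` times.
    """
    return {
        "partitions": 2 ** part_power,
        "replicas": replicas,
        "min_part_hours": min_part_hours,
    }


if __name__ == "__main__":
    # Mirrors "create 18 3 1" from the commands above.
    print(ring_create_args(18, 3, 1))
```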

When I tried to start the proxy server, tempauth was not working. I then noticed that swauth is still installed with Swift (I can see all the swauth command files, such as swauth-prep and swauth-add-user, under /usr/bin), so I changed my proxy-server.conf as below:
===============================================
[DEFAULT]
#cert_file = /etc/swift/cert.crt
#key_file = /etc/swift/cert.key
bind_port = 8080
workers = 8
user = swift

[pipeline:main]
pipeline = healthcheck cache swauth proxy-server

[app:proxy-server]
use = egg:swift#proxy
allow_account_management = true

[filter:swauth]
use = egg:swift#swauth
default_swift_cluster = local#http://192.168.175.128:8080/v1#http://localhost:8080/v1
super_admin_key = swauthkey

[filter:healthcheck]
use = egg:swift#healthcheck

[filter:cache]
use = egg:swift#memcache
memcache_servers = 192.168.175.128:11211
========================================================

Then when I try to run
================================
root@ubuntu:/etc/swift# swauth-prep -K swauthkey
Auth subsystem prep failed: 500 Server Error
================================
I get this error in sys.log

Jun 17 04:08:32 ubuntu proxy-server Account PUT returning 503 for [503, 503, 503] (txn: tx9fadfec6-1f85-4e8f-9982-2987bc44465d)
Jun 17 04:08:32 ubuntu proxy-server - - 17/Jun/2011/11/08/32 PUT /v1/AUTH_.auth HTTP/1.0 503 - - - - - - tx9fadfec6-1f85-4e8f-9982-2987bc44465d - 0.0125
Jun 17 04:08:32 ubuntu proxy-server STDOUT: EXCEPTION IN handle: Traceback (most recent call last):#012 File "/usr/lib/pymodules/python2.6/swift/common/middleware/swauth.py", line 320, in handle#012 return self.handle_request(req)(env, start_response)#012 File "/usr/lib/pymodules/python2.6/swift/common/middleware/swauth.py", line 383, in handle_request#012 req.response = handler(req)#012 File "/usr/lib/pymodules/python2.6/swift/common/middleware/swauth.py", line 402, in handle_prep#012 (path, resp.status))#012Exception: Could not create the main auth account: /v1/AUTH_.auth 503 Internal Server Error#012: {'SCRIPT_NAME': '/auth/v2', 'webob.adhoc_attrs': {'start_time': 1308308912.1579649, 'bytes_transferred': '-', 'client_disconnect': False}, 'REQUEST_METHOD': 'POST', 'PATH_INFO': '/.prep', 'SERVER_PROTOCOL': 'HTTP/1.0', 'QUERY_STRING': '', 'eventlet.posthooks': [(<bound method Swauth.posthooklogger of <swift.common.middleware.swauth.Swauth object at 0x93d22ac>>, (<Request at 0x93dde0c POST http://127.0.0.1:8080/auth/v2/.prep>,), {})], 'SERVER_NAME': '127.0.0.1', 'REMOTE_ADDR': '127.0.0.1', 'eventlet.input': <eventlet.wsgi.Input object at 0x93da34c>, 'HTTP_X_AUTH_ADMIN_KEY': 'swauthkey', 'wsgi.url_scheme': 'http', 'SERVER_PORT': '8080', 'HTTP_X_AUTH_ADMIN_USER': '.super_admin', 'HTTP_X_CF_TRANS_ID': 'tx9fadfec6-1f85-4e8f-9982-2987bc44465d', 'wsgi.input': <eventlet.wsgi.Input object at 0x93da34c>, 'HTTP_HOST': '127.0.0.1:8080', 'swift.cache': <swift.common.memcached.MemcacheRing object at 0x93d23cc>, 'wsgi.multithread': True, 'wsgi.version': (1, 0), 'GATEWAY_INTERFACE': 'CGI/1.1', 'wsgi.run_once': False, 'wsgi.errors': <swift.common.utils.LoggerFileObject object at 0x936948c>, 'wsgi.multiprocess': False, 'CONTENT_TYPE': None, 'HTTP_ACCEPT_ENCODING': 'identity'} (txn: tx9fadfec6-1f85-4e8f-9982-2987bc44465d)

================================
root@ubuntu:/etc/swift# swauth-add-user -K swauthkey -a shengjie shengjie passw0rd
Account creation failed: 500 Server Error
User creation failed: 500 Server Error
================================

Jun 17 04:09:40 ubuntu proxy-server - - 17/Jun/2011/11/09/40 HEAD /v1/AUTH_.auth/shengjie HTTP/1.0 404 - - - - - - txd8c7a9fe-7e99-4d87-9f7c-9b404b1ab643 - 0.0130
Jun 17 04:09:40 ubuntu proxy-server - - 17/Jun/2011/11/09/40 PUT /v1/AUTH_.auth/shengjie HTTP/1.0 404 - - - - - - txd8c7a9fe-7e99-4d87-9f7c-9b404b1ab643 - 0.0008
Jun 17 04:09:40 ubuntu proxy-server STDOUT: EXCEPTION IN handle: Traceback (most recent call last):#012 File "/usr/lib/pymodules/python2.6/swift/common/middleware/swauth.py", line 320, in handle#012 return self.handle_request(req)(env, start_response)#012 File "/usr/lib/pymodules/python2.6/swift/common/middleware/swauth.py", line 383, in handle_request#012 req.response = handler(req)#012 File "/usr/lib/pymodules/python2.6/swift/common/middleware/swauth.py", line 612, in handle_put_account#012 'account: %s %s' % (path, resp.status))#012Exception: Could not create account within main auth account: /v1/AUTH_.auth/shengjie 404 Not Found#012: {'SCRIPT_NAME': '/auth/v2/shengjie', 'webob.adhoc_attrs': {'start_time': 1308308980.1818759, 'bytes_transferred': '-', 'client_disconnect': False}, 'REQUEST_METHOD': 'PUT', 'PATH_INFO': '', 'SERVER_PROTOCOL': 'HTTP/1.0', 'QUERY_STRING': '', 'eventlet.posthooks': [(<bound method Swauth.posthooklogger of <swift.common.middleware.swauth.Swauth object at 0x93d22ac>>, (<Request at 0x93dde2c PUT http://127.0.0.1:8080/auth/v2/shengjie>,), {})], 'SERVER_NAME': '127.0.0.1', 'REMOTE_ADDR': '127.0.0.1', 'eventlet.input': <eventlet.wsgi.Input object at 0x93da34c>, 'HTTP_X_AUTH_ADMIN_KEY': 'swauthkey', 'wsgi.url_scheme': 'http', 'SERVER_PORT': '8080', 'HTTP_X_AUTH_ADMIN_USER': '.super_admin', 'HTTP_X_CF_TRANS_ID': 'txd8c7a9fe-7e99-4d87-9f7c-9b404b1ab643', 'wsgi.input': <eventlet.wsgi.Input object at 0x93da34c>, 'HTTP_HOST': '127.0.0.1:8080', 'swift.cache': <swift.common.memcached.MemcacheRing object at 0x93d23cc>, 'wsgi.multithread': True, 'wsgi.version': (1, 0), 'GATEWAY_INTERFACE': 'CGI/1.1', 'wsgi.run_once': False, 'wsgi.errors': <swift.common.utils.LoggerFileObject object at 0x936948c>, 'wsgi.multiprocess': False, 'CONTENT_TYPE': None, 'HTTP_ACCEPT_ENCODING': 'identity'} (txn: txd8c7a9fe-7e99-4d87-9f7c-9b404b1ab643)
Jun 17 04:09:40 ubuntu proxy-server - - 17/Jun/2011/11/09/40 HEAD /v1/AUTH_.auth/shengjie HTTP/1.0 404 - - - - - - txc417a4df-7cd5-4eca-94a2-063a6a60de57 - 0.0118
Jun 17 04:09:40 ubuntu proxy-server STDOUT: EXCEPTION IN handle: Traceback (most recent call last):#012 File "/usr/lib/pymodules/python2.6/swift/common/middleware/swauth.py", line 320, in handle#012 return self.handle_request(req)(env, start_response)#012 File "/usr/lib/pymodules/python2.6/swift/common/middleware/swauth.py", line 383, in handle_request#012 req.response = handler(req)#012 File "/usr/lib/pymodules/python2.6/swift/common/middleware/swauth.py", line 891, in handle_put_user#012 (path, resp.status))#012Exception: Could not retrieve account id value: /v1/AUTH_.auth/shengjie 404 Not Found#012: {'HTTP_X_CF_TRANS_ID': 'txc417a4df-7cd5-4eca-94a2-063a6a60de57', 'SCRIPT_NAME': '/auth/v2/shengjie/shengjie', 'webob.adhoc_attrs': {'start_time': 1308308980.2007799, 'bytes_transferred': '-', 'client_disconnect': False}, 'REQUEST_METHOD': 'PUT', 'PATH_INFO': '', 'SERVER_PROTOCOL': 'HTTP/1.0', 'QUERY_STRING': '', 'eventlet.posthooks': [(<bound method Swauth.posthooklogger of <swift.common.middleware.swauth.Swauth object at 0x93d22ac>>, (<Request at 0x93dde8c PUT http://127.0.0.1:8080/auth/v2/shengjie/shengjie>,), {})], 'SERVER_NAME': '127.0.0.1', 'REMOTE_ADDR': '127.0.0.1', 'eventlet.input': <eventlet.wsgi.Input object at 0x93da30c>, 'HTTP_X_AUTH_ADMIN_KEY': 'swauthkey', 'wsgi.url_scheme': 'http', 'SERVER_PORT': '8080', 'HTTP_X_AUTH_USER_KEY': 'passw0rd', 'HTTP_X_AUTH_ADMIN_USER': '.super_admin', 'HTTP_X_AUTH_USER_ADMIN': 'true', 'wsgi.input': <eventlet.wsgi.Input object at 0x93da30c>, 'HTTP_HOST': '127.0.0.1:8080', 'swift.cache': <swift.common.memcached.MemcacheRing object at 0x93d23cc>, 'wsgi.multithread': True, 'wsgi.version': (1, 0), 'GATEWAY_INTERFACE': 'CGI/1.1', 'wsgi.run_once': False, 'wsgi.errors': <swift.common.utils.LoggerFileObject object at 0x936948c>, 'wsgi.multiprocess': False, 'CONTENT_TYPE': None, 'HTTP_ACCEPT_ENCODING': 'identity'} (txn: txc417a4df-7cd5-4eca-94a2-063a6a60de57)

Question information

Language: English
Status: Answered
For: OpenStack Object Storage (swift)
Assignee: No assignee
gholt (gholt) said :
#1

You'll need to track down what's happening on your account servers. This line:

Account PUT returning 503 for [503, 503, 503]

Indicates all three account servers are 503ing.
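
As an illustration of reading that line, here is a minimal Python sketch that pulls the per-backend status codes out of a proxy log message of this shape. The function name `backend_statuses` and the regular expression are assumptions made for this example; they are not part of Swift.

```python
import re

# Hypothetical helper, not part of Swift: extracts the list of backend
# status codes from a proxy-server log message such as
# "Account PUT returning 503 for [503, 503, 503]".
_PATTERN = re.compile(r"returning (\d{3}) for \[([^\]]*)\]")


def backend_statuses(line):
    """Return (proxy_status, [backend statuses]) or None if no match."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    proxy_status = int(match.group(1))
    # An empty "[]" yields an empty list: no backend status was recorded.
    backends = [int(code) for code in match.group(2).split(",") if code.strip()]
    return proxy_status, backends


if __name__ == "__main__":
    sample = "Account PUT returning 503 for [503, 503, 503]"
    print(backend_statuses(sample))
```

Once the backend codes are known, the next place to look is the log on the servers that returned them.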

Shengjie Min (shengjie-min) said :
#2

I checked sys.log on one of the storage nodes (storage1, 192.168.175.130) and got this:
================================
root@ubuntu:/etc/swift# swauth-prep -K swauthkey
Auth subsystem prep failed: 500 Server Error
================================
Jun 17 06:34:36 ubuntu account-server 192.168.175.128 - - [17/Jun/2011:13:34:36 +0000] "PUT /sda3/138541/AUTH_.auth" 507 - "tx23101681-ed3b-4ab2-bea3-8496164f016f" "-" "-" 0.0008 ""

================================
root@ubuntu:/etc/swift# swauth-add-user -K swauthkey -a shengjie shengjie passw0rd
Account creation failed: 500 Server Error
User creation failed: 500 Server Error
================================
Jun 17 06:36:18 ubuntu account-server 192.168.175.128 - - [17/Jun/2011:13:36:18 +0000] "HEAD /sda3/138541/AUTH_.auth" 507 - "tx6f3c1dbb-cb91-44cb-ab40-02bebc461fe7" "-" "-" 0.0010 ""
Jun 17 06:36:19 ubuntu account-server 192.168.175.128 - - [17/Jun/2011:13:36:19 +0000] "HEAD /sda3/138541/AUTH_.auth" 507 - "txc01b3611-70a7-4e46-acef-ac82f95ccebe" "-" "-" 0.0001 ""

John Dickinson (notmyname) said :
#3

The 507 on the account server indicates that the drive sda3 is unmounted.
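
A rough way to see what a mount check of this kind looks at is Python's `os.path.ismount`. The sketch below is an assumption about the general idea, not Swift's actual code; the helper name `looks_mounted` is hypothetical and the default `/srv/node` root is taken from the paths in this thread.

```python
import os


def looks_mounted(device, root="/srv/node"):
    """Return True if root/device is a mount point on this host.

    Illustrative helper only: a directory that exists but is not a mount
    point (for example a plain folder on the OS disk) returns False.
    """
    return os.path.ismount(os.path.join(root, device))


if __name__ == "__main__":
    print("sda3 mounted:", looks_mounted("sda3"))
```

Note that `os.path.ismount` also returns False when the path cannot be stat'ed at all, so a False result does not by itself prove the device is missing from the mount table.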

gholt (gholt) said :
#4

If sda3 isn't really a disk on its own but is instead just a directory on another disk (such as if you were setting up a test system and not an actual production system) you can disable the mount checking with:

mount_check = false

in each of these configs:

account-server.conf
container-server.conf
object-server.conf
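
For reference, a minimal sketch of where the option can go, using Python's `configparser` to parse a small example fragment of an account-server.conf. Placing `mount_check` under `[DEFAULT]`, the sample values, and the helper name `mount_check_enabled` are assumptions for illustration; the `default=True` fallback assumes the check is on unless it is disabled.

```python
import configparser

# Example fragment only; a real account-server.conf has more settings.
SAMPLE_CONF = """
[DEFAULT]
bind_ip = 192.168.175.130
workers = 2
mount_check = false

[pipeline:main]
pipeline = account-server

[app:account-server]
use = egg:swift#account
"""


def mount_check_enabled(conf_text, default=True):
    """Read mount_check from [DEFAULT]; fall back to `default` if unset."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.getboolean("DEFAULT", "mount_check", fallback=default)


if __name__ == "__main__":
    print(mount_check_enabled(SAMPLE_CONF))
```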

Shengjie Min (shengjie-min) said :
#5

Actually, they are mounted; all three of them are mounted.

root@ubuntu:/etc/swift# mount /srv/node/sda3
mount: /dev/sda3 already mounted or /srv/node/sda3 busy
mount: according to mtab, /dev/sda3 is already mounted on /srv/node/sda3

According to GParted, all three storage nodes show the same:

Partition: /dev/sda3
File System: xfs
Mount Point: /srv/node/sda3
Size: 4.99GiB
Used: 17.69MiB
Unused: 4.98GiB

gholt (gholt) said :
#6

Can you paste your account-server.conf and perhaps also an "ls -la /srv/node/sda3" on one of the account servers?

Shengjie Min (shengjie-min) said :
#7

@gholt, I've tried putting mount_check = false in all the conf files you mentioned. Now I'm getting this in sys.log on the account servers:

Jun 17 07:05:16 ubuntu account-server ERROR __call__ error with PUT /sda3/138541/AUTH_.auth : [Errno 5] Input/output error: '/srv/node/sda3/accounts/138541' (txn: tx43000d3d-75de-4bf7-8d4e-9aa6cff4d5df)
Jun 17 07:05:16 ubuntu account-server 192.168.175.128 - - [17/Jun/2011:14:05:16 +0000] "PUT /sda3/138541/AUTH_.auth" 500 798 "tx43000d3d-75de-4bf7-8d4e-9aa6cff4d5df" "-" "-" 0.0023 ""
Jun 17 07:05:21 ubuntu kernel: [190617.489812] Filesystem "sda3": xfs_log_force: error 5 returned.

Proxy server error looks the same.

Shengjie Min (shengjie-min) said :
#8

I think I am close to the issue now:

root@ubuntu:/srv/node# ls -lsa
ls: cannot access sda3
total 8
4 drwxr-xr-x 3 swift swift 4096 2011-06-14 01:45 .
4 drwxr-xr-x 3 root root 4096 2011-06-14 01:45 ..
? d????????? ? ? ? ? ? sda3

root@ubuntu:/srv/node/sda3# ls -lsa
ls: cannot open directory .: Input/output error

root@ubuntu:/etc/swift# xfs_check /srv/node/sda3
xfs_check: cannot open /srv/node/sda3: Input/output error

@gholt, here is my account-server.conf:
[DEFAULT]
bind_ip = 192.168.175.130
workers = 2
mount_check=false

[pipeline:main]
pipeline = account-server

[app:account-server]
use = egg:swift#account

[account-replicator]

[account-auditor]

[account-reaper]

gholt (gholt) said :
#9

Yeah, definitely something up with either hardware or the kernel version or something. You'll want to re-enable that mount_check btw; it is a nice safety feature so you don't end up writing data to your os drive. :)
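
To spot this kind of failure from a script rather than from `ls`, one option is to try listing the directory and look at the errno. The sketch below is illustrative only; the function name `probe_directory` is hypothetical and the path comes from this thread.

```python
import errno
import os


def probe_directory(path):
    """Try to list `path` and return a short status string.

    Hypothetical helper: "ok" if listing works, "io-error" for EIO
    (the "Input/output error" seen in this thread), otherwise the
    symbolic errno name such as "ENOENT".
    """
    try:
        os.listdir(path)
    except OSError as exc:
        if exc.errno == errno.EIO:
            return "io-error"
        return errno.errorcode.get(exc.errno, "unknown")
    return "ok"


if __name__ == "__main__":
    print(probe_directory("/srv/node/sda3"))
```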

Shengjie Min (shengjie-min) said :
#10

Thanks very much, guys.

I ran xfs_repair on sda3 on all the storage nodes, restarted the storage servers, reloaded the proxy server, and voilà, it's working!

The error messages had me worried, though. If you hadn't told me that 507 means something is wrong with the device, I would never have figured out this message:

Jun 17 06:36:18 ubuntu account-server 192.168.175.128 - - [17/Jun/2011:13:36:18 +0000] "HEAD /sda3/138541/AUTH_.auth" 507 - "tx6f3c1dbb-cb91-44cb-ab40-02bebc461fe7" "-" "-" 0.0010 ""

In general, I found that the error messages and logging are not up to a very good standard. Is that because of the nature of Python, or is it the way the exceptions/errors are handled? Are there plans to improve them? Also, the response codes, like 201, 204, 500, 503, 507, etc., are not standard HTTP codes, right? Are they defined internally in Swift somewhere? Or could we have an error code reference to check when we run into issues next time?

gholt (gholt) said :
#11

There is work planned to improve logging and documentation, etc. But all that takes time. We mostly throw stack traces because, though they're ugly, they tend to help more than "error reading file". :)

The HTTP codes are almost always standard, or at least /real/ close. 507 means "Insufficient Storage" and though we're using it for "Unmounted", that's a pretty close match.

The only non-standard codes I'm aware of that Swift logs are:

498 Rate-Limiting In Effect
499 Client Disconnected Early
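
One quick way to check which of the codes mentioned in this thread are registered HTTP statuses is Python's standard `http.HTTPStatus` enum. This is a sketch for illustration; the helper name `describe` is made up, and the result reflects what the Python standard library knows, not Swift itself.

```python
from http import HTTPStatus


def describe(code):
    """Return the standard reason phrase, or None if the enum lacks it."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        return None


if __name__ == "__main__":
    for code in (201, 204, 500, 503, 507, 498, 499):
        print(code, describe(code) or "not in http.HTTPStatus")
```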

Jun Hu (juhu2) said :
#12

I also hit this problem when running "swift stat". I found some errors in /var/log/message:

swift Node error limited 192.168.1.30:6002 (node) (txn: tx395f4fc90fb44e368aace4db659045d4)
swift Account HEAD returning 503 for [] (txn: tx395f4fc90fb44e368aace4db659045d4) (client_ip: 192.168.1.30)

After reading gholt's reply from 2011-06-17, I appended "mount_check = false" to account-server.conf, container-server.conf, and object-server.conf.
The issue then disappeared.

Thanks, gholt.
