Just for fun, I installed Linux kernel 4.0.

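As a result, when I ran the command gluster peer probe node2, it hung for a long time and finally returned without printing any message.
But when I then ran gluster peer status, the new node had not been added: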

node1:~$ sudo gluster peer probe node2
node1:~$ sudo gluster peer status
Number of Peers: 0
node1:~$

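At first I assumed it was a network problem, but checking with netstat and tcpdump showed that the service was listening and that packets were being sent out.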

  • netstat

    node1:~$ sudo netstat -tlnuop
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name Timer
    tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      1858/glusterd    off (0.00/0/0)
    tcp        0      0 0.0.0.0:46283           0.0.0.0:*               LISTEN      1593/rpc.statd   off (0.00/0/0)
    tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      1579/rpcbind     off (0.00/0/0)
    tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1826/sshd        off (0.00/0/0)
    udp        0      0 0.0.0.0:40520           0.0.0.0:*                           1593/rpc.statd   off (0.00/0/0)
    udp        0      0 0.0.0.0:111             0.0.0.0:*                           1579/rpcbind     off (0.00/0/0)
    udp        0      0 0.0.0.0:853             0.0.0.0:*                           1579/rpcbind     off (0.00/0/0)
    udp        0      0 127.0.0.1:921           0.0.0.0:*                           1593/rpc.statd   off (0.00/0/0)
    
  • tcpdump

    node1:~$ sudo tcpdump -i eth0 dst 192.168.0.2
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
    15:10:24.791642 ARP, Request who-has node2 tell node1, length 28
    15:10:24.791993 IP node1.1023 > node2.24007: Flags [S], seq 831525720, win 29200, options [mss 1460,sackOK,TS val 4294852857 ecr 0,nop,wscale 7], length 0
    15:10:25.793703 IP node1.1023 > node2.24007: Flags [S], seq 831525720, win 29200, options [mss 1460,sackOK,TS val 4294853860 ecr 0,nop,wscale 7], length 0
    15:10:27.797709 IP node1.1023 > node2.24007: Flags [S], seq 831525720, win 29200, options [mss 1460,sackOK,TS val 4294855864 ecr 0,nop,wscale 7], length 0
    15:10:31.805723 IP node1.1023 > node2.24007: Flags [S], seq 831525720, win 29200, options [mss 1460,sackOK,TS val 4294859872 ecr 0,nop,wscale 7], length 0
    15:10:39.821772 IP node1.1023 > node2.24007: Flags [S], seq 831525720, win 29200, options [mss 1460,sackOK,TS val 4294867888 ecr 0,nop,wscale 7], length 0
    15:10:45.004918 ARP, Request who-has node2 tell 10.228.91.237, length 46
    15:10:55.837832 IP node1.1023 > node2.24007: Flags [S], seq 831525720, win 29200, options [mss 1460,sackOK,TS val 4294883904 ecr 0,nop,wscale 7], length 0
    15:11:00.845735 ARP, Request who-has node2 tell node1, length 28
    15:11:27.901912 IP node1.1023 > node2.24007: Flags [S], seq 831525720, win 29200, options [mss 1460,sackOK,TS val 4294915968 ecr 0,nop,wscale 7], length 0
    15:11:32.909734 ARP, Request who-has node2 tell node1, length 28
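
A note on reading the capture above: the filter dst 192.168.0.2 only shows outgoing packets, so replies from node2 would not appear here either way. Still, the same SYN (seq 831525720) is resent at roughly 1, 2, 4, 8, 16 and 32 second intervals, which suggests node1 never completed the TCP handshake on port 24007. A minimal sketch for testing that directly from the shell follows. port_open is a made-up helper name, and it assumes bash (for /dev/tcp) and the coreutils timeout command are available.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port
# can be opened within 3 seconds, fail otherwise.
port_open() {
    local host="$1" port="$2"
    # bash's /dev/tcp pseudo-device attempts a TCP connect on open
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Usage (node name from this post; 24007 is the glusterd port
# seen in the netstat output):
# port_open node2 24007 && echo "reachable" || echo "no answer"
```

If this fails while glusterd is listening on the other side, the problem is likely below GlusterFS's own peer logic, but that is only a hint, not a diagnosis.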
    

/var/log/glusterfs/cli.log showed these error messages:

[2015-08-05 07:41:13.411424] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.5/rpc-transport/socket.so(+0x4e5b) [0x7f3739490e5b] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.5/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x3a) [0x7f37394973aa] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.5/rpc-transport/socket.so(client_fill_address_family+0x20b) [0x7f373949707b]))) 0-dict: data is NULL
[2015-08-05 07:41:13.411456] W [dict.c:1055:data_to_str] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.5/rpc-transport/socket.so(+0x4e5b) [0x7f3739490e5b] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.5/rpc-transport/socket.so(socket_client_get_remote_sockaddr+0x3a) [0x7f37394973aa] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.5/rpc-transport/socket.so(client_fill_address_family+0x216) [0x7f3739497086]))) 0-dict: data is NULL
[2015-08-05 07:41:13.411468] E [name.c:147:client_fill_address_family] 0-glusterfs: transport.address-family not specified. Could not guess default value from (remote-host:(null) or transport.unix.connect-path:(null)) options
[2015-08-05 07:41:13.537520] I [input.c:36:cli_batch] 0-: Exiting with: 110
[2015-08-05 07:41:38.527991] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-08-05 07:41:38.528019] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread
[2015-08-05 07:41:38.528162] I [socket.c:3645:socket_init] 0-glusterfs: SSL support is NOT enabled
[2015-08-05 07:41:38.528181] I [socket.c:3660:socket_init] 0-glusterfs: using system polling thread
[2015-08-05 07:41:38.632487] I [socket.c:2321:socket_event_handler] 0-transport: disconnecting now
[2015-08-05 07:41:38.634165] I [input.c:36:cli_batch] 0-: Exiting with: 0
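
Most of the lines above are warnings (W) or info (I); only one is an actual error (E). A small sketch for pulling just the error-level entries out of a log like this follows. gluster_errors is a made-up helper name, and it assumes the "[timestamp] LEVEL [source] ..." layout seen in the excerpt above.

```shell
# Hypothetical helper: print only error-level ("E") entries
# from a GlusterFS log file given as the first argument.
gluster_errors() {
    # Match "[<timestamp>] E [" at the start of the line
    grep -E '^\[[^]]+\] E \[' "$1"
}

# Usage:
# gluster_errors /var/log/glusterfs/cli.log
```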

Solution

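After more than two hours of searching Google, I suddenly noticed that the latest official release was 3.7, while I was still on 3.5.5. (Inner voice: your information was just out of date, wasn't it? Orz)

So I upgraded glusterfs to 3.7.3.

And then... the problem was solved. Orz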

sudo add-apt-repository ppa:gluster/glusterfs-3.7
sudo apt-get update
sudo apt-get dist-upgrade
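
After the upgrade it is worth confirming that the new version is actually the one installed before retrying the peer probe. A minimal sketch follows. version_ge is a made-up helper name that relies on GNU sort's -V option, and the usage lines assume the first line of glusterfs --version looks like "glusterfs 3.7.3 built on ...".

```shell
# Hypothetical helper: succeed if version $1 >= version $2.
# Uses GNU sort -V (version sort): if the smaller of the two
# versions is $2, then $1 is at least as new.
version_ge() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage (output format of `glusterfs --version` is an assumption):
# installed="$(glusterfs --version | head -n 1 | awk '{print $2}')"
# version_ge "$installed" 3.7 && echo "new enough" || echo "still old"
```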
