[root@ceph-mon01 ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:3rKj4eAISBB5T/B95dTCDhbyFmllUjFsJ9mkEbWNI6c root@ceph-mon01
The key's randomart image is:
+---[RSA 2048]----+
|.... . +B@Oo     |
|...... o=BB+++   |
|.. o. .o++++= .  |
|. . o .+ .       |
| . S E           |
|o . .            |
|o . . o .        |
| . o o ..o       |
| . . o...        |
+----[SHA256]-----+
[root@ceph-mon01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub ceph-mon01
[root@ceph-mon01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub ceph-mon02
[root@ceph-mon01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub ceph-mon03
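The three ssh-copy-id calls above can be collapsed into a short loop; a minimal sketch, assuming the monitor hostnames resolve and root SSH login is allowed:

```shell
# Distribute the public key to every monitor node in one pass.
# The node list is taken from the commands above; adjust for your cluster.
for node in ceph-mon01 ceph-mon02 ceph-mon03; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub "root@${node}"
done
```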
For data devices, it can be an existing logical volume in the format of: vg/lv, or a device. For other OSD components like wal, db, and journal, it can be logical volume (in vg/lv format) or it must be a GPT partition.
positional arguments:
  {list,create}
    list                List OSD info from remote host(s)
    create              Create new Ceph OSD daemon by preparing and activating a device
optional arguments:
  -h, --help            show this help message and exit
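Given the help text above, creating an OSD from a raw device typically looks like this (a sketch; ceph-osd01 and /dev/sdb are placeholder host and device names, not from the original):

```shell
# Wipe any previous signatures from the device, then prepare and
# activate it as a new OSD using the "create" subcommand shown above.
ceph-deploy disk zap ceph-osd01 /dev/sdb
ceph-deploy osd create --data /dev/sdb ceph-osd01
```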
Traceback (most recent call last):
  File "/usr/bin/ceph-deploy", line 18, in <module>
    from ceph_deploy.cli import main
  File "/usr/lib/python2.7/site-packages/ceph_deploy/cli.py", line 1, in <module>
    import pkg_resources
ImportError: No module named pkg_resources
Solution
Install python-setuptools on the ceph-deploy node:
yum install python-setuptools -y
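To confirm the fix took effect, the module named in the traceback can be imported directly on the ceph-deploy node:

```shell
# Should exit 0 and print the module name instead of raising ImportError.
python -c 'import pkg_resources; print(pkg_resources.__name__)'
```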
RuntimeError: Failed to execute command: ceph --version
[ceph-mon][INFO  ] Running command: ceph --version
[ceph-mon][ERROR ] Traceback (most recent call last):
[ceph-mon][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/process.py", line 119, in run
[ceph-mon][ERROR ]     reporting(conn, result, timeout)
[ceph-mon][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/log.py", line 13, in reporting
[ceph-mon][ERROR ]     received = result.receive(timeout)
[ceph-mon][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/lib/vendor/execnet/gateway_base.py", line 704, in receive
[ceph-mon][ERROR ]     raise self._getremoteerror() or EOFError()
[ceph-mon][ERROR ] RemoteError: Traceback (most recent call last):
[ceph-mon][ERROR ]   File "/usr/lib/python2.7/site-packages/ceph_deploy/lib/vendor/remoto/lib/vendor/execnet/gateway_base.py", line 1036, in executetask
[ceph-mon][ERROR ]     function(channel, **kwargs)
[ceph-mon][ERROR ]   File "<remote exec>", line 12, in _remote_run
[ceph-mon][ERROR ]   File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
[ceph-mon][ERROR ]     errread, errwrite)
[ceph-mon][ERROR ]   File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
[ceph-mon][ERROR ]     raise child_exception
[ceph-mon][ERROR ] OSError: [Errno 2] No such file or directory
[ceph-mon][ERROR ]
[ceph-mon][ERROR ]
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: ceph --version
Solution
Log in to the node that reported the error and run the following command to install Ceph:
yum install ceph ceph-radosgw -y
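Once the packages are installed, the exact command ceph-deploy was unable to run should now succeed on that node:

```shell
# This is the command that previously failed with "No such file or directory";
# it should now print the installed release string.
ceph --version
```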
stderr: wipefs: error
Symptom
When running ceph-deploy disk zap, or when zapping a replacement disk during a Ceph OSD drive swap, the following error appears:
[ceph-mon01][INFO  ] Running command: /usr/sbin/ceph-volume lvm zap /dev/sdb
[ceph-mon01][WARNIN] --> Zapping: /dev/sdb
[ceph-mon01][WARNIN] --> --destroy was not specified, but zapping a whole device will remove the partition table
[ceph-mon01][WARNIN]  stderr: wipefs: error: /dev/sdb: probing initialization failed: Device or resource busy
[ceph-mon01][WARNIN] --> failed to wipefs device, will try again to workaround probable race condition
[ceph-mon01][WARNIN] --> RuntimeError: could not complete wipefs on device: /dev/sdb
[ceph-mon01][ERROR ] RuntimeError: command returned non-zero exit status: 1
Solution
Log in to the node that reported the error, use lsblk to check whether the disk still has leftover device-mapper mappings, and remove them manually with dmsetup:
# lsblk
NAME                                                                                                    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdb                                                                                                       8:16   0  100G  0 disk
└─ceph--5a9ad8a8--a449--4086--ac27--e18cf778e831-osd--journal--f77ff517--9fbc--4caa--be33--0ce675e07a19 253:0    0  100G  0 lvm
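The stale mapping shown by lsblk can then be removed by name; a sketch using the mapping from the output above (the device-mapper name on your system will differ):

```shell
# List ceph-related device-mapper entries to confirm the mapping name.
dmsetup ls | grep ceph
# Remove the leftover journal mapping that keeps /dev/sdb busy,
# then retry the zap.
dmsetup remove ceph--5a9ad8a8--a449--4086--ac27--e18cf778e831-osd--journal--f77ff517--9fbc--4caa--be33--0ce675e07a19
ceph-deploy disk zap ceph-mon01 /dev/sdb
```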
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable koenli/disk01 object-map fast-diff deep-flatten".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address
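The message itself names the fix: disable the image features the kernel rbd client does not support, then retry the map (using the pool/image from the error, koenli/disk01):

```shell
# Disable the three features the kernel client rejected, then map again.
rbd feature disable koenli/disk01 object-map fast-diff deep-flatten
rbd map koenli/disk01
```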
After enabling the RGW service, ceph -s shows no rgw daemons under services, and systemctl status ceph-radosgw@rgw.<hostname> on the RGW node shows the service failed to start.
Starting the RGW service manually produces the following error:
# /usr/bin/radosgw -d --cluster ceph --name client.rgw.ceph-mon01 --setuser ceph --setgroup ceph --debug-rgw=20
2021-09-03 21:51:03.372 7f4f900a5900  0 deferred set uid:gid to 167:167 (ceph:ceph)
2021-09-03 21:51:03.372 7f4f900a5900  0 ceph version 14.2.22 (ca74598065096e6fcbd8433c8779a2be0c889351) nautilus (stable), process radosgw, pid 12559
2021-09-03 21:51:03.394 7f4f79cd5700 20 reqs_thread_entry: start
2021-09-03 21:51:03.401 7f4f900a5900 20 rados->read ofs=0 len=0
2021-09-03 21:51:03.987 7f4f900a5900  0 rgw_init_ioctx ERROR: librados::Rados::pool_create returned (34) Numerical result out of range (this can be due to a pool or placement group misconfiguration, e.g. pg_num < pgp_num or mon_max_pg_per_osd exceeded)
2021-09-03 21:51:03.987 7f4f900a5900 20 get_rados_obj() on obj=.rgw.root:default.realm returned -34
2021-09-03 21:51:03.987 7f4f900a5900  0 failed reading realm info: ret -34 (34) Numerical result out of range
2021-09-03 21:51:03.987 7f4f900a5900  0 ERROR: failed to start notify service ((34) Numerical result out of range
2021-09-03 21:51:03.987 7f4f900a5900  0 ERROR: failed to init services (ret=(34) Numerical result out of range)
2021-09-03 21:51:03.989 7f4f900a5900 -1 Couldn't init storage provider (RADOS)
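The -34 (Numerical result out of range) from pool_create matches the hint in the log: creating the .rgw.root pool would push some OSD past mon_max_pg_per_osd. A hedged sketch of the usual checks and remedies (the values 400 and 8 are illustrative, not recommendations):

```shell
# Check how many placement groups each OSD already carries (PGS column).
ceph osd df
# Option 1: raise the per-OSD PG limit so RGW can create its pools.
ceph config set global mon_max_pg_per_osd 400
# Option 2: shrink the default PG count used for newly created pools.
ceph config set global osd_pool_default_pg_num 8
ceph config set global osd_pool_default_pgp_num 8
# Then restart the gateway on the RGW node.
systemctl restart ceph-radosgw@rgw.ceph-mon01
```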