[Ceph] Troubleshooting an OSD failure caused by lost LVM metadata

I. Background

1. Overview

References:
RHEL / CentOS : How to rebuild LVM from Archive (metadata backups)
Red Hat Enterprise Linux 7: Logical Volume Manager Administration
Analysis of OSD auto-start at boot under BlueStore

This article walks through diagnosing and repairing a failed OSD, covering two parts: LVM recovery and bringing the OSD back up.

2. Problem description

  • Checking the cluster status shows osd.1 is down
root@node163:~# ceph -s
  cluster:
    id:     9bc47ff2-5323-4964-9e37-45af2f750918
    health: HEALTH_WARN
            too many PGs per OSD (256 > max 250)

  services:
    mon: 3 daemons, quorum node163,node164,node165
    mgr: node163(active), standbys: node164, node165
    mds: ceph-1/1/1 up  {0=node165=up:active}, 2 up:standby
    osd: 3 osds: 2 up, 2 in

  data:
    pools:   3 pools, 256 pgs
    objects: 46 objects, 100MiB
    usage:   2.20GiB used, 198GiB / 200GiB avail
    pgs:     256 active+clean

root@node163:~# ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF 
-1       0.29306 root default                             
-5       0.09769     host node163                         
 1   hdd 0.09769         osd.1      down        0 1.00000 
-3       0.09769     host node164                         
 0   hdd 0.09769         osd.0        up  1.00000 1.00000 
-7       0.09769     host node165                         
 2   hdd 0.09769         osd.2        up  1.00000 1.00000 
  • On node163, which hosts osd.1, the LVM metadata on the disk is gone and the OSD data directory is not mounted
root@node163:~# lvs
root@node163:~# vgs
root@node163:~# pvs
root@node163:~# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0  100G  0 disk 
vda    254:0    0  100G  0 disk 
├─vda1 254:1    0  487M  0 part /boot
├─vda2 254:2    0 54.4G  0 part /
├─vda3 254:3    0    1K  0 part 
├─vda5 254:5    0 39.5G  0 part /data
├─vda6 254:6    0  5.6G  0 part [SWAP]
└─vda7 254:7    0  105M  0 part /boot/efi

root@node163:~# df -h
Filesystem                                   Size  Used Avail Use% Mounted on
udev                                         2.0G     0  2.0G   0% /dev
tmpfs                                        394M   47M  347M  12% /run
/dev/vda2                                     54G   12G   40G  23% /
tmpfs                                        2.0G     0  2.0G   0% /dev/shm
tmpfs                                        5.0M     0  5.0M   0% /run/lock
tmpfs                                        2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/vda1                                    464M  178M  258M  41% /boot
/dev/vda5                                     39G   48M   37G   1% /data
/dev/vda7                                    105M  550K  105M   1% /boot/efi
tmpfs                                        394M     0  394M   0% /run/user/0

root@node163:~# dd if=/dev/sda bs=512 count=4 | hexdump -C
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
4+0 records in
4+0 records out
2048 bytes (2.0 kB, 2.0 KiB) copied, 0.000944624 s, 2.2 MB/s
00000800

II. Recovery procedure

From the output above, the LVM metadata on the disk has been lost and the disk is not mounted, which is why the OSD fails to start.
The recovery is therefore done in two parts: repairing the LVM metadata, then bringing the OSD back up.

1. LVM recovery

1.1 Overview

The LVM configuration directory is laid out as follows. Whenever a VG or LV configuration changes, LVM creates a backup and an archive copy of the metadata:

/etc/lvm/              main LVM configuration directory
/etc/lvm/archive       metadata archives (one file per change, written before each operation such as vgcreate, lvcreate or lvchange)
/etc/lvm/backup        metadata backups (the most recent complete metadata for each VG)
/etc/lvm/lvm.conf      main LVM configuration file; controls metadata backup and archiving
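
The backup and archive behaviour is driven by the backup section of lvm.conf. The keys below are standard LVM2 options shown with their usual defaults; the values on a given system may differ, so treat this as a sketch and check your own lvm.conf:

# Print the effective settings (lvmconfig ships with LVM2; older releases use "lvm dumpconfig backup")
lvmconfig backup
# Typical defaults:
# backup {
#     backup = 1                        # write a backup after each metadata change
#     backup_dir = "/etc/lvm/backup"
#     archive = 1                       # archive the old metadata before each change
#     archive_dir = "/etc/lvm/archive"
#     retain_min = 10                   # keep at least this many archive files
#     retain_days = 30                  # keep archives for at least this many days
# }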

vgcfgrestore --list <vg_name> lists the archived metadata for a VG. By reading through the recorded operations you can identify the archive file with the most complete LVM configuration, i.e. the latest archive, which is the one to use when recovering accidentally deleted LVM metadata.

root@node163:/etc/lvm/archive# vgcfgrestore --list ceph-07e80157-b488-41e5-b217-4079d52edb08

  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00000-999427028.vg
  Couldn't find device with uuid UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN.
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/vgcreate --force --yes ceph-07e80157-b488-41e5-b217-4079d52edb08 /dev/sda'
  Backup Time:    Wed Jun 29 14:53:47 2022


  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00001-98007334.vg
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/lvcreate --yes -l 100%FREE -n osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d ceph-07e80157-b488-41e5-b217-4079d52edb08'
  Backup Time:    Wed Jun 29 14:53:47 2022


  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00002-65392131.vg
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/lvchange --addtag ceph.type=block /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d'
  Backup Time:    Wed Jun 29 14:53:47 2022


  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00003-1190179092.vg
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/lvchange --addtag ceph.block_device=/dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d'
  Backup Time:    Wed Jun 29 14:53:47 2022


  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00004-1217184452.vg
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/lvchange --addtag ceph.vdo=0 /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d'
  Backup Time:    Wed Jun 29 14:53:48 2022


  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00005-2051164187.vg
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/lvchange --addtag ceph.osd_id=1 /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d'
  Backup Time:    Wed Jun 29 14:53:48 2022

By default, pvcreate stores the physical volume label in the second 512-byte sector of the disk; the label starts with the string LABELONE.
You can check whether a PV device is intact with dd if=<pv_disk_path> bs=512 count=2 | hexdump -C.
Note: the PV label contains, among other things, the PV UUID and the size of the block device.
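
As a quick check across several disks, a loop along these lines scans the second sector of each device for the LABELONE signature (a sketch, not part of the original procedure; adjust the device list to your environment):

# Scan whole disks for an LVM PV label in sector 1 (the second 512-byte sector)
for dev in /dev/sda /dev/vda; do
    if dd if="$dev" bs=512 skip=1 count=1 2>/dev/null | grep -q LABELONE; then
        echo "$dev: PV label present"
    else
        echo "$dev: no PV label"
    fi
done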

-- Faulty node --
root@node163:~# dd if=/dev/sda bs=512 count=2 | hexdump -C
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
2+0 records in
2+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000803544 s, 1.3 MB/s
00000400


-- Healthy node --
root@node164:/etc/lvm/archive# dd if=/dev/sda bs=512 count=2 | hexdump -C
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
2+0 records in
2+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000111721 s, 9.2 MB/s
00000200  4c 41 42 45 4c 4f 4e 45  01 00 00 00 00 00 00 00  |LABELONE........|
00000210  1c 9f f4 1e 20 00 00 00  4c 56 4d 32 20 30 30 31  |.... ...LVM2 001|
00000220  59 6c 6a 79 78 64 59 53  66 4e 44 54 4b 7a 36 64  |YljyxdYSfNDTKz6d|
00000230  41 31 44 56 46 79 52 78  5a 52 39 58 61 49 45 52  |A1DVFyRxZR9XaIER|
00000240  00 00 00 00 19 00 00 00  00 00 10 00 00 00 00 00  |................|
00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000260  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
00000270  00 f0 0f 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000280  00 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  |................|
00000290  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400

In addition, as long as the physical disk has not been overwritten with new data, the LVM metadata text still on disk can be read with dd if=<pv_disk_path> count=12 | strings.

root@node163:~# dd if=/dev/sda count=12 | strings 
 LVM2 x[5A%r0N*>
ceph-07e80157-b488-41e5-b217-4079d52edb08 {
id = "e1Ge2Y-6DAn-EZzA-6btK-MGMW-qVrP-ldcE9R"
seqno = 1
format = "lvm2"
status = ["RESIZEABLE", "READ", "WRITE"]
flags = []
extent_size = 8192
max_lv = 0
max_pv = 0
metadata_copies = 0
physical_volumes {
pv0 {
id = "UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN"
device = "/dev/sda"
status = ["ALLOCATABLE"]
flags = []
dev_size = 209715200
pe_start = 2048
pe_count = 25599
# Generated by LVM2 version 2.02.133(2) (2015-10-30): Wed Jun 29 14:53:47 2022
contents = "Text Format Volume Group"
version = 1
description = ""
creation_host = "node163"    # Linux node163 4.4.58-20180615.kylin.server.YUN+-generic #kylin SMP Tue Jul 10 14:55:31 CST 2018 aarch64
creation_time = 1656485627    # Wed Jun 29 14:53:47 2022
ceph-07e80157-b488-41e5-b217-4079d52edb08 {
id = "e1Ge2Y-6DAn-EZzA-6btK-MGMW-qVrP-ldcE9R"
seqno = 2
format = "lvm2"
status = ["RESIZEABLE", "READ", "WRITE"]
flags = []
extent_size = 8192
max_lv = 0
max_pv = 0
metadata_copies = 0
physical_volumes {
pv0 {
id = "UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN"
device = "/dev/sda"
status = ["ALLOCATABLE"]
flags = []
dev_size = 209715200
pe_start = 2048
pe_count = 25599
logical_volumes {
osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d {
12+0 records in
12+0 records out
id = "oV0BZG-WLSM-v2jL-god

1.2 Collecting the LVM information

Before repairing the LVM metadata, we first need the OSD's LV (normally named osd-block-<osd_fsid>) and VG (normally prefixed with ceph-).
Note: the osd_fsid can be obtained with ceph osd dump | grep <osd_id> | awk '{print $NF}'.

  • In /etc/lvm/archive, search for the OSD's LV and VG with grep `ceph osd dump | grep <osd_id> | awk '{print $NF}'` -R *
# The search shows that osd.1 uses LV osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d in VG ceph-07e80157-b488-41e5-b217-4079d52edb08

root@node163:/etc/lvm/archive# grep `ceph osd dump | grep osd.1 | awk '{print $NF}'` -R *
ceph-07e80157-b488-41e5-b217-4079d52edb08_00001-98007334.vg:description = "Created *before* executing '/sbin/lvcreate --yes -l 100%FREE -n osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d ceph-07e80157-b488-41e5-b217-4079d52edb08'"
  • Use vgcfgrestore --list <vg_name> to list the metadata archives and, from the recorded operations, pick the most complete archive file (the one containing the full LVM configuration); a quick way to compare file sizes is sketched after this listing
# Walking through the VG metadata operations and comparing the archive file sizes shows that the most complete archive is /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg; the PV UUID recorded in it is UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN

root@node163:/etc/lvm/archive# vgcfgrestore --list ceph-07e80157-b488-41e5-b217-4079d52edb08
  File:        /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg
  VG name:        ceph-07e80157-b488-41e5-b217-4079d52edb08
  Description:    Created *before* executing '/sbin/lvchange --addtag ceph.block_uuid=oV0BZG-WLSM-v2jL-godE-o6vd-fdfu-w7Ms5w /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d'
  Backup Time:    Wed Jun 29 14:53:48 2022

root@node163:/etc/lvm/archive# cat ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg | grep -A 5 physical_volumes 
    physical_volumes {

        pv0 {
            id = "UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN"
            device = "/dev/sda"    # Hint only
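
To compare the archive files by size, a plain size-sorted listing of the archive directory is usually enough (a sketch; the VG name is the one from this example):

# List this VG's archive files, largest first; the most complete metadata is typically the largest and newest file
ls -lS /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_*.vg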

1.3 Reconstructing the PV label

As the earlier checks showed, the PV information on osd.1's physical disk has been wiped, so the VG configuration cannot be restored directly with vgcfgrestore (see the failed attempt below).
Instead, take a spare disk on another machine, create a new PV on it using the original PV UUID and the archive file, then dd the first two sectors of that spare disk onto osd.1's original disk to bring the PV label back.

root@node163:/etc/lvm/archive# vgcfgrestore -f ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg ceph-07e80157-b488-41e5-b217-4079d52edb08
  Couldn't find device with uuid UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN.
  PV unknown device missing from cache
  Format-specific setup for unknown device failed
  Restore failed.
  • Copy the archive file obtained in section 1.2 to another node and create an identical PV there with pvcreate -ff --uuid <pv_uuid> --restorefile <archive_file> <pv_disk_path>
[root@node122 ~]# pvcreate -ff --uuid UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN --restorefile ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg /dev/sdb 
  Couldn't find device with uuid UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN.
  Physical volume "/dev/sdb" successfully created.
  • Dump the new PV's label into a file (file_label) with dd if=<pv_disk_path> of=<label_file> bs=512 count=2
[root@node122 ~]# dd if=/dev/sdb of=file_label bs=512 count=2
2+0 records in
2+0 records out
1024 bytes (1.0 kB) copied, 0.219809 s, 4.7 kB/s

[root@node122 ~]# dd if=./file_label | hexdump -C
2+0 records in
2+0 records out
1024 bytes (1.0 kB) copied, 6.1274e-05 s, 16.7 MB/s
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200  4c 41 42 45 4c 4f 4e 45  01 00 00 00 00 00 00 00  |LABELONE........|
00000210  2b a3 c4 46 20 00 00 00  4c 56 4d 32 20 30 30 31  |+..F ...LVM2 001|
00000220  55 6a 78 71 75 48 69 48  4a 65 4e 59 31 41 42 64  |UjxquHiHJeNY1ABd|
00000230  51 66 30 30 6f 44 6a 32  32 43 68 65 65 4f 54 4e  |Qf00oDj22CheeOTN|
00000240  00 00 00 00 19 00 00 00  00 00 10 00 00 00 00 00  |................|
00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000260  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
00000270  00 f0 0f 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000280  00 00 00 00 00 00 00 00  02 00 00 00 00 00 00 00  |................|
00000290  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400

1.4 Restoring the PV

  • Before doing anything, back up the first 1024 bytes of osd.1's physical disk locally with dd if=<pv_disk_path> of=/home/file_backup bs=512 count=2
root@node163:/etc/lvm/archive# dd if=/dev/sda bs=512 count=2 | hexdump -C
2+0 records in
2+0 records out
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000761583 s, 1.3 MB/s

root@node163:/etc/lvm/archive# dd if=/dev/sda of=/home/file_backup bs=512 count=2 
2+0 records in
2+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.000825143 s, 1.2 MB/s
  • Write the label constructed in section 1.3 onto osd.1's physical disk with dd if=<label_file> of=<pv_disk_path> bs=512 count=2
root@node163:/etc/lvm/archive# dd if=/home/file_label of=/dev/sda bs=512 count=2
2+0 records in
2+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.00122898 s, 833 kB/s

root@node163:/etc/lvm/archive# dd if=/dev/sda bs=512 count=2 | hexdump -C
2+0 records in
2+0 records out
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200  4c 41 42 45 4c 4f 4e 45  01 00 00 00 00 00 00 00  |LABELONE........|
00000210  2b a3 c4 46 20 00 00 00  4c 56 4d 32 20 30 30 31  |+..F ...LVM2 001|
00000220  55 6a 78 71 75 48 69 48  4a 65 4e 59 31 41 42 64  |UjxquHiHJeNY1ABd|
00000230  51 66 30 30 6f 44 6a 32  32 43 68 65 65 4f 54 4e  |Qf00oDj22CheeOTN|
00000240  00 00 00 00 19 00 00 00  00 00 10 00 00 00 00 00  |................|
00000250  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000260  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
00000270  00 f0 0f 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000280  00 00 00 00 00 00 00 00  02 00 00 00 00 00 00 00  |................|
00000290  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.00244905 s, 418 kB/s
  • Re-create the PV on osd.1's physical disk, keeping the original UUID, with pvcreate -ff --uuid <pv_uuid> --restorefile <archive_file> <pv_disk_path>
root@node163:/etc/lvm/archive# pvcreate -ff --uuid UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN --restorefile ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg /dev/sda 
  Couldn't find device with uuid UjxquH-iHJe-NY1A-BdQf-00oD-j22C-heeOTN.
  Physical volume "/dev/sda" successfully created

root@node163:/etc/lvm/archive# pvs
  PV         VG   Fmt  Attr PSize   PFree  
  /dev/sda        lvm2 ---  100.00g 100.00g
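
Before restoring the VG, it may be worth confirming that the re-created PV carries the UUID recorded in the archive file (a sketch; pv_uuid is a standard pvs output field):

# Compare the PV UUID on disk with the pv0 entry in the archive file
pvs -o pv_name,pv_uuid /dev/sda
grep -A 2 'pv0 {' /etc/lvm/archive/ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg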

1.5 Restoring the VG/LV

  • Restore the VG and LV metadata with vgcfgrestore -f <archive_file> <vg_name>; after this the PV, VG and LV are all back to normal
root@node163:/etc/lvm/archive# vgcfgrestore -f ceph-07e80157-b488-41e5-b217-4079d52edb08_00016-18371198.vg ceph-07e80157-b488-41e5-b217-4079d52edb08
  Restored volume group ceph-07e80157-b488-41e5-b217-4079d52edb08

root@node163:/etc/lvm/archive# vgs
  VG                                        #PV #LV #SN Attr   VSize   VFree
  ceph-07e80157-b488-41e5-b217-4079d52edb08   1   1   0 wz--n- 100.00g    0 

root@node163:/etc/lvm/archive# lvs
  LV                                             VG                                        Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d ceph-07e80157-b488-41e5-b217-4079d52edb08 -wi------- 100.00g

root@node163:~# ll /dev/mapper/
total 0
drwxr-xr-x  2 root root      80 Jul  1 17:28 ./
drwxr-xr-x 19 root root    4520 Jul  1 17:28 ../
lrwxrwxrwx  1 root root       7 Jul  1 17:33 ceph--07e80157--b488--41e5--b217--4079d52edb08-osd--block--8cd1658a--97d7--42d6--8f67--6a076c6fb42d -> ../dm-0
crw-------  1 root root 10, 236 Jul  1 17:28 control
  • Inspect the LV: a healthy BlueStore OSD starts with a label prefixed by the string "bluestore block" (an alternative check is sketched after the dump below)
    Note: if the LVM metadata has been restored but the LV does not show the bluestore block prefix, the disk has most likely been overwritten and the BlueStore label destroyed; in that case the OSD data is lost and no further recovery is possible
root@node163:~# dd if=/dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d bs=512 count=2 | hexdump -C
2+0 records in
2+0 records out
00000000  62 6c 75 65 73 74 6f 72  65 20 62 6c 6f 63 6b 20  |bluestore block |
00000010  64 65 76 69 63 65 0a 38  63 64 31 36 35 38 61 2d  |device.8cd1658a-|
00000020  39 37 64 37 2d 34 32 64  36 2d 38 66 36 37 2d 36  |97d7-42d6-8f67-6|
00000030  61 30 37 36 63 36 66 62  34 32 64 0a 02 01 16 01  |a076c6fb42d.....|
00000040  00 00 8c d1 65 8a 97 d7  42 d6 8f 67 6a 07 6c 6f  |....e...B..gj.lo|
00000050  b4 2d 00 00 c0 ff 18 00  00 00 fd f6 bb 62 ac 78  |.-...........b.x|
00000060  dc 18 04 00 00 00 6d 61  69 6e 08 00 00 00 06 00  |......main......|
00000070  00 00 62 6c 75 65 66 73  01 00 00 00 31 09 00 00  |..bluefs....1...|
00000080  00 63 65 70 68 5f 66 73  69 64 24 00 00 00 39 62  |.ceph_fsid$...9b|
00000090  63 34 37 66 66 32 2d 35  33 32 33 2d 34 39 36 34  |c47ff2-5323-4964|
000000a0  2d 39 65 33 37 2d 34 35  61 66 32 66 37 35 30 39  |-9e37-45af2f7509|
000000b0  31 38 0a 00 00 00 6b 76  5f 62 61 63 6b 65 6e 64  |18....kv_backend|
000000c0  07 00 00 00 72 6f 63 6b  73 64 62 05 00 00 00 6d  |....rocksdb....m|
000000d0  61 67 69 63 14 00 00 00  63 65 70 68 20 6f 73 64  |agic....ceph osd|
000000e0  20 76 6f 6c 75 6d 65 20  76 30 32 36 09 00 00 00  | volume v026....|
000000f0  6d 6b 66 73 5f 64 6f 6e  65 03 00 00 00 79 65 73  |mkfs_done....yes|
00000100  07 00 00 00 6f 73 64 5f  6b 65 79 28 00 00 00 41  |....osd_key(...A|
00000110  51 44 35 39 72 74 69 41  62 65 2f 4c 52 41 41 65  |QD59rtiAbe/LRAAe|
00000120  6a 4b 6e 42 6d 56 4e 6a  4a 75 37 4e 78 37 79 37  |jKnBmVNjJu7Nx7y7|
00000130  58 38 57 55 41 3d 3d 05  00 00 00 72 65 61 64 79  |X8WUA==....ready|
00000140  05 00 00 00 72 65 61 64  79 06 00 00 00 77 68 6f  |....ready....who|
00000150  61 6d 69 01 00 00 00 31  7e 77 c5 2d 00 00 00 00  |ami....1~w.-....|
00000160  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400
1024 bytes (1.0 kB, 1.0 KiB) copied, 0.00132415 s, 773 kB/s
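
Where ceph-bluestore-tool is available (it is used later in this procedure anyway), its show-label subcommand gives a friendlier view of the same on-disk label (a sketch; output abbreviated):

# Decode the BlueStore label instead of reading raw bytes with dd/hexdump
ceph-bluestore-tool show-label --dev /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d
# Expect JSON containing, among other fields, osd_uuid, ceph_fsid and whoami for this OSD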

2. Bringing the OSD back up

The OSD data directory mount is handled by ceph-volume. Once the LVM metadata has been repaired, run systemctl start ceph-volume@lvm-<osd_id>-`ceph osd dump | grep <osd_id> | awk '{print $NF}'` to re-create the LVM-related mounts and start the OSD.

root@node163:~# systemctl start ceph-volume@lvm-1-`ceph osd dump | grep osd.1 | awk '{print $NF'}`
root@node163:~# systemctl status ceph-volume@lvm-1-`ceph osd dump | grep osd.1 | awk '{print $NF'}`
● ceph-volume@lvm-1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d.service - Ceph Volume activation: lvm-1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d
   Loaded: loaded (/lib/systemd/system/ceph-volume@.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Fri 2022-07-01 17:54:49 CST; 4s ago
 Main PID: 55683 (code=exited, status=0/SUCCESS)

Jul 01 17:54:48 node163 systemd[1]: Starting Ceph Volume activation: lvm-1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d...
Jul 01 17:54:49 node163 sh[55683]: Running command: ceph-volume lvm trigger 1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d
Jul 01 17:54:49 node163 systemd[1]: Started Ceph Volume activation: lvm-1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d.

root@node163:~# ceph osd in osd.1
marked in osd.1. 

root@node163:~# ceph -s
  cluster:
    id:     9bc47ff2-5323-4964-9e37-45af2f750918
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node163,node164,node165
    mgr: node163(active), standbys: node164, node165
    mds: ceph-1/1/1 up  {0=node165=up:active}, 2 up:standby
    osd: 3 osds: 3 up, 3 in

  data:
    pools:   3 pools, 256 pgs
    objects: 46 objects, 100MiB
    usage:   3.21GiB used, 297GiB / 300GiB avail
    pgs:     256 active+clean

Note:
If the command above still fails to bring the OSD up, run ceph-volume lvm trigger <osd_id>-<osd_fsid> to watch each step it executes and pinpoint where the activation gets stuck.

root@node163:~# ceph-volume lvm trigger 1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d
Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-1
Running command: restorecon /var/lib/ceph/osd/ceph-1
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-1
Running command: ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d --path /var/lib/ceph/osd/ceph-1
Running command: ln -snf /dev/ceph-07e80157-b488-41e5-b217-4079d52edb08/osd-block-8cd1658a-97d7-42d6-8f67-6a076c6fb42d /var/lib/ceph/osd/ceph-1/block
Running command: chown -h ceph:ceph /var/lib/ceph/osd/ceph-1/block
Running command: chown -R ceph:ceph /dev/dm-0
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-1
Running command: systemctl enable ceph-volume@lvm-1-8cd1658a-97d7-42d6-8f67-6a076c6fb42d
Running command: systemctl enable --runtime ceph-osd@1
Running command: systemctl start ceph-osd@1
--> ceph-volume lvm activate successful for osd ID: 1
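
After activation, a quick sanity check confirms the tmpfs data directory is mounted and the daemon is running (a sketch, not from the original article):

# For BlueStore, /var/lib/ceph/osd/ceph-1 is a tmpfs populated by prime-osd-dir
df -h /var/lib/ceph/osd/ceph-1
ls -l /var/lib/ceph/osd/ceph-1/block    # should point at the restored LV
systemctl status ceph-osd@1
ceph osd tree                           # osd.1 should now report "up"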

Source: https://www.cnblogs.com/luxf0/p/16435630.html
