Freezing the filesystem with qemu-guest-agent
October 13, 2015 | Openstack

Reposted from: www.silenceli.com?p=250

This article covers:

  1. What is qemu-guest-agent?
  2. Why use qemu-guest-agent?
  3. How to freeze the filesystem with qemu-guest-agent
  4. How qemu-ga integrates with openstack

What is qemu-ga

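As I understand it, qemu-ga is an agent installed inside the virtual machine. The host communicates with the agent inside the vm over a channel (a unix socket), which gives the host a way to control the vm and query information about it from outside. For example, the host can send the vm a command to change its hostname, or a command to list all processes running inside the vm.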

qemu-ga listens on this unix socket at all times. When a command arrives, it parses the command, executes it, and returns the result over the same unix socket.
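The listen, parse, execute, reply cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_agent` is a hypothetical stand-in for the in-guest agent, `send_agent_command` and `demo` are names invented here, messages are assumed to be one JSON object per line, and a `socket.socketpair()` takes the place of the real host-to-guest channel.

```python
import json
import socket
import threading


def send_agent_command(sock, command, arguments=None):
    """Host side: send one JSON command and return the decoded JSON reply."""
    request = {"execute": command}
    if arguments:
        request["arguments"] = arguments
    sock.sendall(json.dumps(request).encode("utf-8") + b"\n")
    with sock.makefile("rb") as reader:
        # Assumption: the reply is a single newline-terminated JSON object.
        return json.loads(reader.readline().decode("utf-8"))


def toy_agent(sock):
    """Hypothetical agent: read one command, execute it, send the result."""
    with sock.makefile("rb") as reader:
        request = json.loads(reader.readline().decode("utf-8"))
    if request.get("execute") == "guest-ping":
        reply = {"return": {}}
    else:
        # Illustrative error shape only, not taken from the article.
        reply = {"error": {"desc": "unknown command"}}
    sock.sendall(json.dumps(reply).encode("utf-8") + b"\n")


def demo():
    """Wire a host-side caller to the toy agent over a socket pair."""
    host_side, guest_side = socket.socketpair()
    worker = threading.Thread(target=toy_agent, args=(guest_side,))
    worker.start()
    try:
        return send_agent_command(host_side, "guest-ping")
    finally:
        worker.join()
        host_side.close()
        guest_side.close()
```

Calling `demo()` sends a single `guest-ping` request and returns whatever the toy agent answers. A real deployment would talk to the agent through libvirt, as shown later with `virsh qemu-agent-command`.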

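With qemu-ga preinstalled in the vm, the cloud platform gains much stronger control over its virtual machines. As an example, Alibaba Cloud (阿里云) has a product called “云盾 安骑士” that automatically patches software vulnerabilities, detects and removes trojans, raises real-time alerts, and so on. In essence, it too works by installing some kind of agent inside the vm.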

Why use qemu-ga

Quite a few open-source agents are available today; the best known are qemu-ga and ovirt-guest-agent. A comparison of the two:

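  Implementation language
    qemu-ga:  C
    ovirt-ga: Python

  Channel protocol
    qemu-ga:  qmp protocol (QEMU Machine Protocol)
    ovirt-ga: custom

  Provider
    qemu-ga:  QEMU (official)
    ovirt-ga: Red Hat

  Operating system support
    qemu-ga:  windows/linux; rpm packages are provided for linux
    ovirt-ga: windows/linux; rpm packages are provided for linux

  Supported functions
    qemu-ga:
      guest-set-vcpus
      guest-get-vcpus
      guest-network-get-interfaces
      guest-suspend-hybrid
      guest-suspend-ram
      guest-suspend-disk
      guest-fstrim
      guest-fsfreeze-thaw
      guest-fsfreeze-freeze
      guest-fsfreeze-status
      guest-file-flush
      guest-file-seek
      guest-file-write
      guest-file-read
      guest-file-close
      guest-file-open
      guest-shutdown
      guest-info
      guest-set-time
      guest-get-time
      guest-ping
      guest-sync
      guest-sync-delimited
    ovirt-ga:
      1. Information reported by the agent (sent periodically; configurable)
         hostname
         operating system and version
         IP addresses
         installed software
         available memory
         logged-in users
         active user (details unclear)
      2. Triggered messages, sent by ovirt-ga to the host when something happens inside the vm
         power on
         heartbeat (sent periodically)
         active user switch
         windows screen lock
         windows log off
         windows log on
         ovirt-ga uninstalled
      3. Commands it can execute
         lock screen
         automatic login
         automatic log off
         shutdown

  Extensibility
    qemu-ga:  a dedicated mechanism is provided; each new function requires adding a corresponding file
    ovirt-ga: modify the ovirt-ga source directly, usually GuestAgentLinux2.py and OVirtAgentLogic.py

  openstack compatibility
    qemu-ga:  supported by openstack; related bp:
              https://blueprints.launchpad.net/nova/+spec/qemu-guest-agent-support
              https://blueprints.launchpad.net/nova/+spec/quiesced-image-snapshots-with-qemu-guest-agent
    ovirt-ga: not supported by openstack; the openstack code has to be modified by hand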

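The comparison shows the advantages of qemu-ga: it is maintained by the QEMU project itself, it is tightly integrated with openstack, its protocol and code are well structured, and new functions can be added relatively independently. The stock qemu-ga also supports freezing the filesystem out of the box. ovirt-guest-agent cannot match these advantages.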

How to use qemu-ga

  1. Install qemu-ga in the virtual machine (shown here for centos 6.X):

    yum install qemu-guest-agent
  2. Edit the qemu-ga configuration file that the package installed:

    # Edit /etc/sysconfig/qemu-ga to enable the fsfreeze hook
    Change
    # Enable fsfreeze hook. See the --fsfreeze-hook option in "qemu-ga --help".
    FSFREEZE_HOOK_ENABLE=0
    to
    # Enable fsfreeze hook. See the --fsfreeze-hook option in "qemu-ga --help".
    FSFREEZE_HOOK_ENABLE=1

    # Edit /etc/sysconfig/qemu-ga and comment out the BLACKLIST_RPC line so that all functions are available
    Change
    BLACKLIST_RPC="guest-file-open,guest-file-close,guest-file-read,guest-file-write,guest-file-seek,guest-file-flush"
    to
    #BLACKLIST_RPC="guest-file-open,guest-file-close,guest-file-read,guest-file-write,guest-file-seek,guest-file-flush"
  3. Shut down the virtual machine, add the following configuration under <devices> in the vm's configuration file libvirt.xml, then start the vm again:

    <channel type='unix'>
       <source mode='bind' path='/var/lib/libvirt/qemu/f16x86_64.agent'/>
       <target type='virtio' name='org.qemu.guest_agent.0'/>
    </channel>
  4. Test that the agent responds:

    # Get the domain id of the virtual machine
    [root@node-12 ~]# virsh list
     Id    Name                           State
    ----------------------------------------------------
     90    instance-0000209f              running

    # Send a test command
    [root@node-12 ~]# virsh qemu-agent-command 90 '{"execute":"guest-info"}'
    {"return":{"version":"0.12.1","supported_commands":[{"enabled":true,"name":"guest-set-vcpus"},{"enabled":true,"name":"guest-get-vcpus"},{"enabled":true,"name":"guest-network-get-interfaces"},{"enabled":true,"name":"guest-suspend-hybrid"},{"enabled":true,"name":"guest-suspend-ram"},{"enabled":true,"name":"guest-suspend-disk"},{"enabled":true,"name":"guest-fstrim"},{"enabled":true,"name":"guest-fsfreeze-thaw"},{"enabled":true,"name":"guest-fsfreeze-freeze"},{"enabled":true,"name":"guest-fsfreeze-status"},{"enabled":true,"name":"guest-file-flush"},{"enabled":true,"name":"guest-file-seek"},{"enabled":true,"name":"guest-file-write"},{"enabled":true,"name":"guest-file-read"},{"enabled":true,"name":"guest-file-close"},{"enabled":true,"name":"guest-file-open"},{"enabled":true,"name":"guest-shutdown"},{"enabled":true,"name":"guest-info"},{"enabled":true,"name":"guest-set-time"},{"enabled":true,"name":"guest-get-time"},{"enabled":true,"name":"guest-ping"},{"enabled":true,"name":"guest-sync"},{"enabled":true,"name":"guest-sync-delimited"}]}}
  5. Freeze the filesystem:

    # Freeze the filesystem directly with virsh
    [root@node-12 ~]# virsh qemu-agent-command 90 '{"execute":"guest-fsfreeze-freeze"}'
    {"return":1}

    # After the freeze, query the filesystem state of the vm; it reports frozen
    [root@node-12 ~]# virsh qemu-agent-command 90 '{"execute":"guest-fsfreeze-status"}'
    {"return":"frozen"}

    # Thaw (unfreeze) the filesystem
    [root@node-12 ~]# virsh qemu-agent-command 90 '{"execute":"guest-fsfreeze-thaw"}'
    {"return":1}

    # After the thaw, the filesystem reports thawed
    [root@node-12 ~]# virsh qemu-agent-command 90 '{"execute":"guest-fsfreeze-status"}'
    {"return":"thawed"}
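The manual test and freeze/thaw steps above could be wrapped in a small Python helper so that a thaw always follows a freeze. This is a sketch, not part of qemu-ga or libvirt: the names `virsh_agent_command`, `fsfreeze_supported` and `frozen_filesystems` are invented here, and the only external command assumed is the `virsh qemu-agent-command` invocation already shown above.

```python
import json
import subprocess
from contextlib import contextmanager

FSFREEZE_COMMANDS = ("guest-fsfreeze-freeze", "guest-fsfreeze-thaw",
                     "guest-fsfreeze-status")


def virsh_agent_command(domain, command):
    """Run `virsh qemu-agent-command` and return the value under "return"."""
    payload = json.dumps({"execute": command})
    output = subprocess.check_output(
        ["virsh", "qemu-agent-command", str(domain), payload])
    return json.loads(output.decode("utf-8"))["return"]


def fsfreeze_supported(guest_info):
    """True if guest-info lists every fsfreeze command as enabled."""
    enabled = {entry["name"] for entry in guest_info["supported_commands"]
               if entry["enabled"]}
    return all(name in enabled for name in FSFREEZE_COMMANDS)


@contextmanager
def frozen_filesystems(domain, run=virsh_agent_command):
    """Keep the guest filesystems frozen for the duration of a with-block.

    `run` is injectable so the flow can be exercised without a real vm.
    """
    if not fsfreeze_supported(run(domain, "guest-info")):
        raise RuntimeError("guest agent does not offer the fsfreeze commands")
    frozen_count = run(domain, "guest-fsfreeze-freeze")
    try:
        yield frozen_count
    finally:
        # Always thaw, even if the work inside the with-block fails.
        run(domain, "guest-fsfreeze-thaw")
```

A caller would write `with frozen_filesystems(90): ...` and take the snapshot inside the block. The `finally` clause is the design point: a guest left frozen blocks its own writes, so the thaw should not depend on the snapshot succeeding.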

The integration of qemu-ga and openstack

First, add the metadata property hw_qemu_guest_agent=yes to the openstack image, using the following command:

nova image-meta 6410b84d-a473-4ece-83e5-09848a545645 set hw_qemu_guest_agent=yes

Virtual machines created from this image will then have the qemu-ga channel added.
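One way to confirm that an instance really received the channel is to inspect its libvirt domain XML (for example the output of `virsh dumpxml`). The sketch below looks for a virtio channel whose target name is `org.qemu.guest_agent.0`, the name used in the manual configuration earlier. The function name `has_guest_agent_channel` and the trimmed-down `SAMPLE_DOMAIN_XML` are assumptions made for illustration; a real domain XML contains many more elements.

```python
import xml.etree.ElementTree as ET

AGENT_CHANNEL_NAME = "org.qemu.guest_agent.0"


def has_guest_agent_channel(domain_xml):
    """Return True if the domain XML defines the qemu-ga virtio channel."""
    root = ET.fromstring(domain_xml.strip())
    for channel in root.findall("./devices/channel"):
        target = channel.find("target")
        if (target is not None
                and target.get("type") == "virtio"
                and target.get("name") == AGENT_CHANNEL_NAME):
            return True
    return False


# Minimal, hypothetical domain definition containing only the channel.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/f16x86_64.agent'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
    </channel>
  </devices>
</domain>
"""
```

If the function returns False for an instance booted from the image, the metadata property was probably not set before the instance was created.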

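Currently, in openstack (kilo), qemu-ga is used in only one place: when a snapshot of a volume-backed virtual machine is created, the filesystem is frozen to guarantee that the snapshot data is consistent. The logic is in def snapshot_volume_backed (in nova/compute/api.py), which is executed by nova api; see the code below. We follow only the freeze path here; the thaw path is similar and is left for readers to trace themselves.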

    # NOTE(melwitt): We don't check instance lock for snapshot because lock is
    #                intended to prevent accidental change/delete of instances
    @check_instance_state(vm_state=[vm_states.ACTIVE, vm_states.STOPPED])
    def snapshot_volume_backed(self, context, instance, image_meta, name,
                               extra_properties=None):
        """Snapshot the given volume-backed instance.
 
        :param instance: nova.objects.instance.Instance object
        :param image_meta: metadata for the new image
        :param name: name of the backup or snapshot
        :param extra_properties: dict of extra image properties to include
 
        :returns: the new image metadata
        """
        image_meta['name'] = name
        image_meta['is_public'] = False
        properties = image_meta['properties']
        if instance.root_device_name:
            properties['root_device_name'] = instance.root_device_name
        properties.update(extra_properties or {})
 
        quiesced = False
        if instance.vm_state == vm_states.ACTIVE:
            try:
                # Freeze the filesystem
                self.compute_rpcapi.quiesce_instance(context, instance)
                quiesced = True
            except (exception.InstanceQuiesceNotSupported,
                    exception.NovaException, NotImplementedError) as err:
                if strutils.bool_from_string(properties.get(
                        'os_require_quiesce')):
                    raise
                else:
                    LOG.info(_LI('Skipping quiescing instance: '
                                 '%(reason)s.'), {'reason': err},
                             context=context, instance=instance)
 
        bdms = objects.BlockDeviceMappingList.get_by_instance_uuid(
                context, instance.uuid)
 
        mapping = []
        for bdm in bdms:
            if bdm.no_device:
                continue
 
            if bdm.is_volume:
                # create snapshot based on volume_id
                volume = self.volume_api.get(context, bdm.volume_id)
                # NOTE(yamahata): Should we wait for snapshot creation?
                #                 Linux LVM snapshot creation completes in
                #                 short time, it doesn't matter for now.
                name = _('snapshot for %s') % image_meta['name']
                # Create a snapshot of the volume
                snapshot = self.volume_api.create_snapshot_force(
                    context, volume['id'], name, volume['display_description'])
                mapping_dict = block_device.snapshot_from_bdm(snapshot['id'],
                                                              bdm)
                mapping_dict = mapping_dict.get_image_mapping()
            else:
                mapping_dict = bdm.get_image_mapping()
 
            mapping.append(mapping_dict)
 
        # Thaw the filesystem
        if quiesced:
            self.compute_rpcapi.unquiesce_instance(context, instance, mapping)
 
        # NOTE (ndipanov): Remove swap/ephemerals from mappings as they will be
        # in the block_device_mapping for the new image.
        image_mappings = properties.get('mappings')
        if image_mappings:
            properties['mappings'] = [m for m in image_mappings
                                      if not block_device.is_swap_or_ephemeral(
                                          m['virtual'])]
        if mapping:
            properties['block_device_mapping'] = mapping
            properties['bdm_v2'] = True
 
        for attr in ('status', 'location', 'id', 'owner'):
            image_meta.pop(attr, None)
 
        # the new image is simply a bucket of properties (particularly the
        # block device mapping, kernel and ramdisk IDs) with no image data,
        # hence the zero size
        image_meta['size'] = 0
 
        return self.image_api.create(context, image_meta)
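Stripped of the nova specifics, the control flow above is: try to quiesce, tolerate a failure unless the image demands a quiesced snapshot, take the snapshots, and unquiesce only if the quiesce succeeded. The following standalone sketch reproduces just that flow. Every name in it (`QuiesceNotSupported`, `to_bool`, `snapshot_with_quiesce` and the callables passed in) is a hypothetical stand-in, not nova code.

```python
class QuiesceNotSupported(Exception):
    """Hypothetical stand-in for exception.InstanceQuiesceNotSupported."""


def to_bool(value):
    """Rough stand-in for strutils.bool_from_string."""
    return str(value).strip().lower() in ("1", "t", "true", "on", "y", "yes")


def snapshot_with_quiesce(quiesce, unquiesce, take_snapshots, properties,
                          is_active=True):
    """Quiesce if possible, snapshot, then unquiesce if the quiesce worked.

    Returns a (quiesced, mapping) tuple so callers can see what happened.
    """
    quiesced = False
    if is_active:
        try:
            quiesce()
            quiesced = True
        except (QuiesceNotSupported, NotImplementedError):
            # Fail hard only when the image insists on a quiesced snapshot.
            if to_bool(properties.get("os_require_quiesce")):
                raise
    mapping = take_snapshots()
    if quiesced:
        unquiesce()
    return quiesced, mapping
```

With an empty `properties` dict a failed quiesce is skipped and the snapshot is still taken; with `{"os_require_quiesce": "yes"}` the same failure propagates to the caller.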

The freeze code continues in def quiesce_instance(self, ctxt, instance) in compute/rpcapi.py. This function mainly sends the freeze command (an rpc call) to nova compute:

    def quiesce_instance(self, ctxt, instance):
        version = self._compat_ver('4.0', '3.39')
        cctxt = self.client.prepare(server=_compute_host(None, instance),
                version=version)
        # Send the freeze command to the message queue; it is ultimately executed by def quiesce_instance in compute/manager.py
        return cctxt.call(ctxt, 'quiesce_instance', instance=instance)

The call then arrives in nova compute, where compute/manager.py takes over:

    @wrap_exception()
    def quiesce_instance(self, context, instance):
        """Quiesce an instance on this host."""
        context = context.elevated()
        # Get the image used by the instance
        image_ref = instance.image_ref
        # Get the metadata stored on that image; any image that supports qemu-ga needs hw_qemu_guest_agent=yes in its metadata
        image_meta = compute_utils.get_image_metadata(
            context, self.image_api, image_ref, instance)
        # Delegate to the concrete driver (libvirt, vmware, xen, etc.)
        self.driver.quiesce(context, instance, image_meta)

Moving on: only libvirt supports this feature, so the code proceeds to def quiesce in virt/libvirt/driver.py:

    def quiesce(self, context, instance, image_meta):
        """Freeze the guest filesystems to prepare for snapshot.
 
        The qemu-guest-agent must be setup to execute fsfreeze.
        """
        # Continue into _set_quiesced
        self._set_quiesced(context, instance, image_meta, True)
 
 
    def _set_quiesced(self, context, instance, image_meta, quiesced):
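        # Check whether freezing the filesystem is supported; three conditions must hold:
        # 1. the hypervisor type is qemu/kvm
        # 2. the libvirt version is 1.2.5 or later
        # 3. image_meta contains hw_qemu_guest_agent=yes
        supported, reason = self._can_quiesce(image_meta)
        # If not supported, raise an exception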
        if not supported:
            raise exception.InstanceQuiesceNotSupported(
                instance_id=instance.uuid, reason=reason)
 
        try:
            domain = self._host.get_domain(instance)
            if quiesced:
                # Call the libvirt library function to freeze the filesystem; similar to running virsh qemu-agent-command 90 '{"execute":"guest-fsfreeze-freeze"}'
                domain.fsFreeze()
            else:
                # Call the libvirt library function to thaw the filesystem; similar to running virsh qemu-agent-command 90 '{"execute":"guest-fsfreeze-thaw"}'
                domain.fsThaw()
        except libvirt.libvirtError as ex:
            error_code = ex.get_error_code()
            msg = (_('Error from libvirt while quiescing %(instance_name)s: '
                     '[Error Code %(error_code)s] %(ex)s')
                   % {'instance_name': instance.name,
                      'error_code': error_code, 'ex': ex})
            raise exception.NovaException(msg)
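The body of `_can_quiesce` is not quoted in this article, but the three conditions listed in the comments can be sketched as a standalone function. This is an approximation written for illustration: the function name `can_quiesce`, its parameters and the reason strings are assumptions, and the 1.2.5 minimum is assumed to apply to the libvirt version.

```python
# Assumption: the 1.2.5 minimum refers to libvirt, where fsFreeze/fsThaw live.
MIN_LIBVIRT_FSFREEZE_VERSION = (1, 2, 5)

TRUE_STRINGS = ("1", "t", "true", "on", "y", "yes")


def can_quiesce(virt_type, libvirt_version, image_meta):
    """Return (supported, reason) for the three checks described above."""
    # 1. Only the qemu/kvm hypervisor types are accepted.
    if virt_type not in ("kvm", "qemu"):
        return False, "Only KVM and QEMU are supported"
    # 2. The version tuple must be at least the minimum.
    if tuple(libvirt_version) < MIN_LIBVIRT_FSFREEZE_VERSION:
        return False, "libvirt is older than 1.2.5"
    # 3. The image must carry hw_qemu_guest_agent=yes (or another true value).
    properties = (image_meta or {}).get("properties", {})
    agent_flag = str(properties.get("hw_qemu_guest_agent", "no")).lower()
    if agent_flag not in TRUE_STRINGS:
        return False, "QEMU guest agent is not enabled"
    return True, None
```

Returning a reason alongside the boolean mirrors how `_set_quiesced` uses the result: the reason is passed straight into InstanceQuiesceNotSupported so the caller can log why the quiesce was skipped.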

 

reference

[1] http://www.ovirt.org/Guest_Agent introduction to ovirt-guest-agent

[2] http://wiki.qemu.org/Features/QAPI/GuestAgent introduction to qemu-ga

[3] http://wiki.qemu.org/QMP introduction to the qmp protocol used by qemu-ga

[4] https://blueprints.launchpad.net/nova/+spec/quiesced-image-snapshots-with-qemu-guest-agent bp for freezing the filesystem with qemu-ga when snapshotting a volume-backed virtual machine

[5] https://blueprints.launchpad.net/nova/+spec/qemu-guest-agent-support bp for adding qemu-ga support to nova
