Using lxcfs for Container Resource View Isolation
Background
Containers can limit resource usage (CPU, memory, and so on) via cgroups, but some processes inside a container still see the host's values when they query system parameters. For example, reading the CPU count through Node.js's os module returns the host's data, not the container's. This is because containers do not isolate the resource view exposed by filesystems such as /proc and /sys.
For example, deploy a Node.js container with CPU and memory limits:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-test
spec:
  selector:
    matchLabels:
      app: node
  replicas: 1
  template:
    metadata:
      labels:
        app: node
    spec:
      containers:
      - name: node
        image: node:12.18.4
        command:
        - sleep
        args:
        - infinity
        resources:
          limits:
            cpu: 2000m
            memory: "256Mi"
          requests:
            cpu: 2000m
            memory: "256Mi"
Exec into the container and you will see that the CPU data in /proc/cpuinfo and the memory reported by free are both the host's values; likewise, inside node, the CPU and memory figures returned by the os module are the host's, not the container's.
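A quick way to reproduce this (a sketch; the generated pod name will differ in your cluster, and the node:12.18.4 image is assumed to provide bash and node):

kubectl exec -it deploy/node-test -- bash
# Inside the container, all of these still report host values despite the 2-CPU / 256Mi limits:
nproc
free -m
node -e 'const os = require("os"); console.log(os.cpus().length, os.totalmem())'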
Using lxcfs for resource view isolation
lxcfs project: https://github.com/lxc/lxcfs
lxcfs is a small FUSE filesystem whose goal is to make containers feel more like virtual machines. lxcfs provides containers with the following files:
/proc/cpuinfo
/proc/diskstats
/proc/meminfo
/proc/stat
/proc/swaps
/proc/uptime
/proc/slabinfo
/sys/devices/system/cpu/online
With these files mounted, the values a container reads from them reflect the container's actual resources rather than the host's.
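Under the hood, lxcfs answers each read based on the cgroup of the reading process. A minimal host-side illustration, assuming lxcfs is already mounted at /var/lib/lxcfs, systemd-run is available, and cgroup v1 is in use (lxcfs 3.x does not support cgroup v2):

# Read meminfo through lxcfs from inside a 256M-limited scope:
sudo systemd-run --scope -p MemoryLimit=256M \
    grep MemTotal /var/lib/lxcfs/proc/meminfo
# Expected: MemTotal:  262144 kB  (the 256M limit, not the host total)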
Using lxcfs with Docker
Install lxcfs:
git clone https://github.com/lxc/lxcfs
cd lxcfs
meson setup -Dinit-script=systemd --prefix=/usr build/
meson compile -C build/
sudo meson install -C build/
sudo mkdir -p /var/lib/lxcfs
sudo lxcfs /var/lib/lxcfs
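If the mount succeeded, the FUSE filesystem should now be visible (a quick check; exact entries vary by lxcfs version):

mount | grep lxcfs        # expect: lxcfs on /var/lib/lxcfs type fuse.lxcfs ...
ls /var/lib/lxcfs/proc    # expect entries such as: cpuinfo diskstats meminfo stat swaps uptime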
Create a service file to register lxcfs as a system service:
cat > /usr/lib/systemd/system/lxcfs.service <<EOF
[Unit]
Description=lxcfs
[Service]
ExecStart=/usr/bin/lxcfs -f /var/lib/lxcfs
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
Start the service and enable it at boot:
systemctl enable lxcfs --now
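A quick sanity check that the service came up (if lxcfs is still mounted from the manual run above, stop it first with fusermount -u /var/lib/lxcfs to avoid a conflicting mount):

systemctl status lxcfs --no-pager    # expect: active (running)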
Example of using lxcfs with Docker:
docker run -it -m 256m --memory-swap 256m \
-v /var/lib/lxcfs/proc/cpuinfo:/proc/cpuinfo:rw \
-v /var/lib/lxcfs/proc/diskstats:/proc/diskstats:rw \
-v /var/lib/lxcfs/proc/meminfo:/proc/meminfo:rw \
-v /var/lib/lxcfs/proc/stat:/proc/stat:rw \
-v /var/lib/lxcfs/proc/swaps:/proc/swaps:rw \
-v /var/lib/lxcfs/proc/uptime:/proc/uptime:rw \
-v /var/lib/lxcfs/proc/slabinfo:/proc/slabinfo:rw \
-v /var/lib/lxcfs/sys/devices/system/cpu:/sys/devices/system/cpu:rw \
ubuntu:18.04 /bin/bash
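Inside this container, the isolated views should now reflect the 256m limit rather than the host (sample checks; exact figures vary):

free -m            # the total column should read about 256, not the host total
cat /proc/uptime   # the container's uptime, which resets when the container starts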
Using lxcfs in Kubernetes
Using lxcfs in Kubernetes requires solving two problems:
- lxcfs must be installed on every node of the cluster
- every pod must mount the /var/lib/lxcfs/proc files maintained by lxcfs
For the first problem, we can use a DaemonSet controller to guarantee that lxcfs is installed on every node. The following deployment file, lxcfs.yaml, can be used directly:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: lxcfs
  labels:
    app: lxcfs
spec:
  selector:
    matchLabels:
      app: lxcfs
  template:
    metadata:
      labels:
        app: lxcfs
    spec:
      hostPID: true
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: lxcfs
        image: registry.cn-hangzhou.aliyuncs.com/toodo/lxcfs:3.1.2-build.0
        imagePullPolicy: Always
        securityContext:
          privileged: true
        volumeMounts:
        - name: cgroup
          mountPath: /sys/fs/cgroup
        - name: lxcfs
          mountPath: /var/lib/lxcfs
          mountPropagation: Bidirectional
        - name: usr-local
          mountPath: /usr/local
      volumes:
      - name: cgroup
        hostPath:
          path: /sys/fs/cgroup
      - name: usr-local
        hostPath:
          path: /usr/local
      - name: lxcfs
        hostPath:
          path: /var/lib/lxcfs
          type: DirectoryOrCreate

Deploy lxcfs to every node with the following command:
kubectl apply -f lxcfs.yaml

For the second problem, you can either declare mounts of the host's /var/lib/lxcfs files in the Kubernetes resource file, or inject them with an admission webhook (see: https://github.com/denverdino/lxcfs-admission-webhook). We take the first approach as an example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-test-lxcfs
spec:
  selector:
    matchLabels:
      app: node-test-lxcfs
  replicas: 1
  template:
    metadata:
      labels:
        app: node-test-lxcfs
    spec:
      containers:
      - name: node-test-lxcfs
        image: node:12.18.4
        command:
        - sleep
        args:
        - infinity
        resources:
          limits:
            cpu: 2000m
            memory: "256Mi"
          requests:
            cpu: 2000m
            memory: "256Mi"
        ports:
        - containerPort: 80
        volumeMounts:
        - name: cpuinfo
          mountPath: /proc/cpuinfo
        - name: diskstats
          mountPath: /proc/diskstats
        - name: meminfo
          mountPath: /proc/meminfo
        - name: stat
          mountPath: /proc/stat
        - name: swaps
          mountPath: /proc/swaps
        - name: uptime
          mountPath: /proc/uptime
      volumes:
      - name: cpuinfo
        hostPath:
          path: /var/lib/lxcfs/proc/cpuinfo
      - name: diskstats
        hostPath:
          path: /var/lib/lxcfs/proc/diskstats
      - name: meminfo
        hostPath:
          path: /var/lib/lxcfs/proc/meminfo
      - name: stat
        hostPath:
          path: /var/lib/lxcfs/proc/stat
      - name: swaps
        hostPath:
          path: /var/lib/lxcfs/proc/swaps
      - name: uptime
        hostPath:
          path: /var/lib/lxcfs/proc/uptime

Exec into the container and the usual commands now report the resources actually allocated to the container.
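For example (a sketch; node-test-lxcfs.yaml is a hypothetical filename for the Deployment above):

kubectl apply -f node-test-lxcfs.yaml
kubectl get pods -l app=lxcfs -o wide    # expect one lxcfs pod per node
kubectl exec -it deploy/node-test-lxcfs -- free -m                          # total reads about 256
kubectl exec -it deploy/node-test-lxcfs -- grep -c processor /proc/cpuinfo  # 2, matching the cpu limit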
Appendix
lxcfs image build files
Dockerfile
FROM centos:7 as build
RUN sed -e 's|^mirrorlist=|#mirrorlist=|g' -e 's|^#baseurl=http://mirror.centos.org/altarch/|baseurl=https://mirrors.ustc.edu.cn/centos-altarch/|g' -e 's|^#baseurl=http://mirror.centos.org/centos|baseurl=https://mirrors.ustc.edu.cn/centos|g' -i.bak /etc/yum.repos.d/CentOS-Base.repo && \
yum -y update
RUN yum -y install fuse-devel pam-devel wget gcc automake autoconf libtool make
ENV LXCFS_VERSION 3.1.2
RUN wget https://linuxcontainers.org/downloads/lxcfs/lxcfs-$LXCFS_VERSION.tar.gz && \
mkdir /lxcfs && tar xzvf lxcfs-$LXCFS_VERSION.tar.gz -C /lxcfs --strip-components=1 && \
cd /lxcfs && ./configure && make
FROM centos:7
STOPSIGNAL SIGINT
COPY --from=build /lxcfs/lxcfs /usr/local/bin/lxcfs
COPY --from=build /lxcfs/.libs/liblxcfs.so /usr/local/lib/lxcfs/liblxcfs.so
COPY --from=build /lxcfs/lxcfs /lxcfs/lxcfs
COPY --from=build /lxcfs/.libs/liblxcfs.so /lxcfs/liblxcfs.so
COPY --from=build /usr/lib64/libfuse.so.2.9.2 /lxcfs/libfuse.so.2.9.2
COPY --from=build /usr/lib64/libulockmgr.so.1.0.1 /lxcfs/libulockmgr.so.1.0.1
COPY start.sh /
CMD ["/start.sh"]
start.sh
#!/bin/bash
# Cleanup: unmount any stale lxcfs mount left in the host mount namespace
nsenter -m/proc/1/ns/mnt fusermount -u /var/lib/lxcfs 2> /dev/null || true
# If /etc/mtab is a real file (not a symlink to /proc/self/mounts),
# drop any stale lxcfs entry from it
nsenter -m/proc/1/ns/mnt [ -L /etc/mtab ] || \
sed -i "/^lxcfs \/var\/lib\/lxcfs fuse.lxcfs/d" /etc/mtab
# Remove leftovers from a previous run
rm -rf /var/lib/lxcfs/*
# Prepare target directories (/usr/local and /var/lib/lxcfs are hostPath
# mounts in the DaemonSet, so they are shared with the host)
mkdir -p /usr/local/lib/lxcfs /var/lib/lxcfs
# Update the lxcfs binary and its libraries
cp -f /lxcfs/lxcfs /usr/local/bin/lxcfs
cp -f /lxcfs/liblxcfs.so /usr/local/lib/lxcfs/liblxcfs.so
cp -f /lxcfs/libfuse.so.2.9.2 /usr/lib64/libfuse.so.2.9.2
cp -f /lxcfs/libulockmgr.so.1.0.1 /usr/lib64/libulockmgr.so.1.0.1
# -f so the links are refreshed if the container restarts
ln -sf /usr/lib64/libfuse.so.2.9.2 /usr/lib64/libfuse.so.2
ln -sf /usr/lib64/libulockmgr.so.1.0.1 /usr/lib64/libulockmgr.so.1
# Mount: run lxcfs in the host's mount namespace so the FUSE mount at
# /var/lib/lxcfs is visible to every pod on the node
exec nsenter -m/proc/1/ns/mnt /usr/local/bin/lxcfs /var/lib/lxcfs/