多路复用机制学习笔记

2020-03-14 16:00:16 阅读：235 来源： 互联网

标签：14 多路复用 epoll redis 笔记 fd ctl 机制 root

以redis server 为例

redis server 启动时调用bind() 传入文件描述符fd6 绑定端口6379，调用listen()监听端口，并通过accept() 等待连接

root@pmghong-VirtualBox:/usr/local/redis/bin# strace -ff -o /data/redis_strace/redis ./redis-server 
root@pmghong-VirtualBox:/proc/22330/fd# ls /data/redis_strace/ -l
total 48
-rw-r--r-- 1 root root 34741 3月  14 10:37 redis.25102
-rw-r--r-- 1 root root   134 3月  14 10:37 redis.25105
-rw-r--r-- 1 root root   134 3月  14 10:37 redis.25106
-rw-r--r-- 1 root root   134 3月  14 10:37 redis.25107

root@pmghong-VirtualBox:/proc/22330/fd# vi /data/redis_strace/redis.25102
... ...
epoll_create(1024)                      = 5
socket(PF_INET6, SOCK_STREAM, IPPROTO_TCP) = 6
setsockopt(6, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0
setsockopt(6, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(6, {sa_family=AF_INET6, sin6_port=htons(6379), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
listen(6, 511)                          = 0
fcntl(6, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(6, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
... ...
root@pmghong-VirtualBox:/proc/25102/fd# ll
total 0
dr-x------ 2 root root  0 3月  14 12:05 ./
dr-xr-xr-x 9 root root  0 3月  14 10:37 ../
lrwx------ 1 root root 64 3月  14 12:28 0 -> /dev/pts/0
lrwx------ 1 root root 64 3月  14 12:28 1 -> /dev/pts/0
lrwx------ 1 root root 64 3月  14 12:05 2 -> /dev/pts/0
lr-x------ 1 root root 64 3月  14 12:28 3 -> pipe:[104062]
l-wx------ 1 root root 64 3月  14 12:28 4 -> pipe:[104062]
lrwx------ 1 root root 64 3月  14 12:28 5 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 3月  14 12:28 6 -> socket:[104063]
lrwx------ 1 root root 64 3月  14 12:28 7 -> socket:[104064]
lrwx------ 1 root root 64 3月  14 12:28 8 -> socket:[256344]

第一阶段：BIO（阻塞IO）

多路复用机制学习笔记

Redis Server 启动后通过文件描述符fd6 监听系统内核

Client1 / Client2 分别通过fd7,fd8 请求访问redis

在BIO的场景下，redis server 会调用read()方法并进入阻塞状态，也就是直到fd7 有请求过来，处理完才能处理其他请求

这个模式缺点很明显，就是阻塞IO导致效率低

第二阶段 NIO （非阻塞IO）

多路复用机制学习笔记

跟BIO的区别在于，调用read(fd7) 时，如果没有请求数据，立即给redis server 返回一个错误

redis server 收到该类型的错误即可知道当前连接没有数据过来，可以继续处理下一个请求，提高处理效率

bind(6, {sa_family=AF_INET6, sin6_port=htons(6379), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
listen(6, 511)                          = 0
fcntl(6, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(6, F_SETFL, O_RDWR|O_NONBLOCK)    = 0

该模式的问题在于，定时轮询调用read(fdx)系统调用，当多个client 请求过来时，需要频繁的进行内核态/用户态切换，上下文切换开销大

第三阶段 select 同步非阻塞

多路复用机制学习笔记

int select(int nfds, fd_set *readfds, fd_set *writefds,
fd_set *exceptfds, struct timeval *timeout);

select() and pselect() allow a program to monitor multiple file descriptors, waiting until one or more of the file descriptors become "ready" for some class
of I/O operation (e.g., input possible).  A file descriptor is considered ready if it is possible to perform a corresponding I/O  operation  (e.g.,  read(2)
without blocking, or a sufficiently small write(2)).

目标是同时监听多个fd，直到一个或者多个fd 进入ready 状态，才会调用read()等系统调用处理业务逻辑，而不像上述的NIO场景下，需要轮询调用x个read()

select 只能解决事件通知问题（即哪些进程能读，哪些不能读的问题），但到了内核态，仍需在内核中遍历x个fd，看哪个client 发生了IO，再通知select 把结果集返回给server端，接着由sever端向指定的fd发起read() 系统调用

第四阶段 epoll 多路复用

多路复用机制学习笔记

epoll 机制包括 epoll_create / epoll_ctl / epoll_wait 3个系统调用

// epoll_create
// 说明
epoll_create() creates an epoll(7) instance.

//函数签名
int epoll_create(int size);

//返回值 
On success, these system calls return a nonnegative file descriptor.  
On error, -1 is returned, and errno is set to indicate the error.

//epoll_ctl
//说明
This  system  call  performs control operations on the epoll(7) instance referred to by the file descriptor epfd.  It requests that the operation op be per‐
formed for the target file descriptor, fd.

//函数签名
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);

// op 类型 EPOLL_CTL_ADD / EPOLL_CTL_MOD  /EPOLL_CTL_DEL

// 返回值
When successful, epoll_ctl() returns zero.  
When an error occurs, epoll_ctl() returns -1 and errno is set appropriately.

//epoll_ctl
//说明
The  epoll_wait()  system call waits for events on the epoll(7) instance referred to by the file descriptor epfd.  The memory area pointed to by events will
contain the events that will be available for the caller.  Up to maxevents are returned by epoll_wait().  The maxevents argument must be greater than zero.

//函数签名
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);

//返回值
When successful, epoll_wait() returns the number of file descriptors ready for the requested I/O, or zero if no file  descriptor  became  ready  during  the
requested timeout milliseconds.  When an error occurs, epoll_wait() returns -1 and errno is set appropriately.

epoll_create(1024)                      = 5
... ...
bind(6, {sa_family=AF_INET6, sin6_port=htons(6379), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
listen(6, 511)                          = 0
... ...
bind(7, {sa_family=AF_INET, sin_port=htons(6379), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
listen(7, 511)                          = 0
... ...
epoll_ctl(5, EPOLL_CTL_ADD, 6, {EPOLLIN, {u32=6, u64=6}}) = 0
epoll_ctl(5, EPOLL_CTL_ADD, 7, {EPOLLIN, {u32=7, u64=7}}) = 0
epoll_ctl(5, EPOLL_CTL_ADD, 3, {EPOLLIN, {u32=3, u64=3}}) = 0

write(...)
read(...)
epoll_wait(5, [], 10128, 0)             = 0

1、进程启动时通过epoll_create() 创建epoll instance，成功时返回一个非负数的fdn，失败返回-1还有错误码

2、调用epoll_ctl(上一步epoll_create 返回的fd，op，fd6，事件类型<accpet>)

3、调用epoll_wait() 监听内核事件，调用成功时返回该fd。例如当c1请求redisserver 时，首先需要通过fd6建立连接，此时通过epoll_ctl() 中对fd6 的accept()调用可以监听到该请求，并将fd6传给epoll_wait()

4、redis server端从epoll_wait() 获取需要IO操作的fd，发现c1 通过fd6请求建立连接，为其分配fd7，并在epoll_ctl()注册一个监听，例如epoll_ctl(fdn,op, fd7, <read>)

通过上述的事件通知方式，可以解决select 中，内核态循环遍历所有fd的缺点，仅当接收到IO中断事件，才通知上层程序，提高工作效率。

标签：14,多路复用,epoll,redis,笔记,fd,ctl,机制,root
来源： https://blog.51cto.com/pmghong/2478289

本站声明： 1. iCode9 技术分享网（下文简称本站）提供的所有内容，仅供技术学习、探讨和分享；
2. 关于本站的所有留言、评论、转载及引用，纯属内容发起人的个人观点，与本站观点和立场无关；
3. 关于本站的所有言论和文字，纯属内容发起人的个人观点，与本站观点和立场无关；
4. 本站文章均是网友提供，不完全保证技术分享内容的完整性、准确性、时效性、风险性和版权归属；如您发现该文章侵犯了您的权益，可联系我们第一时间进行删除；
5. 本站为非盈利性的个人网站，所有内容不会用来进行牟利，也不会利用任何形式的广告来间接获益，纯粹是为了广大技术爱好者提供技术内容和技术思想的分享性交流网站。

ICode9

多路复用机制学习笔记