Revisiting the Linux epoll Thundering Herd Problem: Causes and Solutions
Time: 2021-10-22 15:13:29
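Background

I recently investigated an epoll thundering herd problem. At first I did not believe a thundering herd was the cause, because the only visible symptom was unbalanced CPU usage. The service had forked 20 Server processes, and under a moderate request load three or four of them showed fairly high CPU utilization while the rest stayed very low. Hardware interrupts and softirqs were evenly distributed, and binding the NIC's RSS queues to CPUs changed nothing. Since the system level offered no explanation, I had to examine the service itself. For a top-down investigation the first tool that came to mind was strace, and it exposed the culprit right away: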
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
If I trace only accept, that is, run strace with the "-e trace=accept" option, an accept occasionally succeeds:
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, {sa_family=AF_INET, sin_port=htons(39306), sin_addr=inet_addr("172.16.1.202")}, [16]) = 19
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
accept(4, 0x9ecd930, [16])              = -1 EAGAIN (Resource temporarily unavailable)
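A large share of the CPU time was spent spinning on failed accept calls. When I increased the request load further, the spinning dropped noticeably, which shows that requests arriving during what would otherwise have been wasted wakeups lowered the spin rate. The symptoms point clearly toward a thundering herd. This article goes through the relevant details of epoll. Let's begin!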
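To make the strace pattern above concrete, here is a minimal sketch of how a single incoming connection can be reported as readable to more than one waiter while only one accept succeeds. It is not the service under investigation: it assumes each worker watches the shared non-blocking listening socket through its own epoll instance, and for determinism it uses two epoll instances inside one process instead of forked workers. The names `herd_result`, `watch_listener`, and `run_herd_demo` are hypothetical.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

struct herd_result {
    int ready_a;      /* events reported by epoll instance A */
    int ready_b;      /* events reported by epoll instance B */
    int first_ok;     /* 1 if the first accept() returned a connection */
    int second_errno; /* errno of the second accept(), 0 if it succeeded */
};

/* Create an epoll instance watching lfd for readability (level-triggered). */
static int watch_listener(int lfd)
{
    struct epoll_event ev;
    int ep = epoll_create1(0);

    if (ep < 0)
        return -1;
    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN;
    ev.data.fd = lfd;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, lfd, &ev) < 0) {
        close(ep);
        return -1;
    }
    return ep;
}

/* Returns 0 if the demo ran, -1 if socket setup failed. */
int run_herd_demo(struct herd_result *out)
{
    int rc = -1, lfd = -1, ep_a = -1, ep_b = -1, cfd = -1, c1 = -1, c2 = -1;
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    struct epoll_event got;

    memset(out, 0, sizeof(*out));
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */

    /* Non-blocking listener, as in the trace (accept returns EAGAIN). */
    lfd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    if (lfd < 0)
        goto done;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 16) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
        goto done;

    /* Two independent epoll instances stand in for two workers. */
    ep_a = watch_listener(lfd);
    ep_b = watch_listener(lfd);
    if (ep_a < 0 || ep_b < 0)
        goto done;

    /* Exactly one client connection arrives. */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0 || connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;

    /* Both "workers" are told the listener is readable... */
    out->ready_a = epoll_wait(ep_a, &got, 1, 1000);
    out->ready_b = epoll_wait(ep_b, &got, 1, 1000);

    /* ...but only one connection is queued, so the second accept fails. */
    c1 = accept(lfd, NULL, NULL);
    out->first_ok = (c1 >= 0);
    errno = 0;
    c2 = accept(lfd, NULL, NULL);
    out->second_errno = (c2 < 0) ? errno : 0;
    rc = 0;

done:
    if (c1 >= 0) close(c1);
    if (c2 >= 0) close(c2);
    if (cfd >= 0) close(cfd);
    if (ep_a >= 0) close(ep_a);
    if (ep_b >= 0) close(ep_b);
    if (lfd >= 0) close(lfd);
    return rc;
}
```

On Linux, calling `run_herd_demo` should report one ready event on each epoll instance, a successful first accept, and `EAGAIN` from the second. With 20 real workers the same effect multiplies: every worker woken for a connection that another worker has already taken burns a wakeup and a failed accept call.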
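Why "revisiting" in the title? Because many others have already discussed this topic, and I am simply continuing the conversation.

A brief introduction to the thundering herd and event models

I will not explain the concept of a thundering herd here. Anyone who found this article presumably has some idea already, and if any conceptual doubt remains, a quick search on Baidu or Google will clear it up. The thundering herd problem typically shows up in web servers. Linux once had a classic accept thundering herd problem that troubled people for a long time. That problem has since been fixed in the kernel: when a new connection enters the accept queue, the kernel wakes one and only one of the processes waiting in accept to handle it. This is implemented by the following code:
list_for_each_entry_safe(curr, next, 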