Re: [libvirt] [RFC] Faster libvirtd restart with nwfilter rules

Monday, 1 October 2018

ping

On 24.09.2018 10:41, Nikolay Shirokovskiy wrote:
...
 Hi, all.

   On fat hosts capable of running hundreds of VMs, restarting libvirtd makes
 its services unavailable for a long time if the VMs use network filters. In
 my tests each of 100 VMs has the no-promisc [1] and no-mac-spoofing filters,
 and executing virsh list right after a daemon restart takes approximately
 140s if firewalld is not running (that is, ebtables/iptables/ip6tables
 commands are used to configure kernel tables).

   The problem is that the daemon does not even start to read from client
 connections because the state drivers are not initialized. Initialization is
 blocked in state driver autostart, which grabs VM locks, and the VM locks are
 held by the VM reconnection code. Each VM reloads network tables on
 reconnection, and this reloading is serialized on updateMutex in the gentech
 nwfilter driver. Working around autostart won't help much: even if the state
 drivers initialize, listing VMs won't be possible because listing takes each
 VM lock one by one too. However, managing a VM that has passed the
 reconnection phase will be possible, which takes the same 140s in the worst
 case.
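
 The serialization described above can be illustrated with a toy model. This
 is only a sketch, not libvirt code: the lock stands in for updateMutex, the
 sleep stands in for the ebtables/iptables runs of one VM, and the function
 name and timings are made up for illustration.

```python
import threading
import time


def reload_all(vm_count, rule_time, serialize):
    """Return the wall time for vm_count threads that each 'reload' filters."""
    update_mutex = threading.Lock()  # stand-in for the driver-wide mutex

    def reconnect_vm():
        if serialize:
            with update_mutex:
                time.sleep(rule_time)  # stand-in for the firewall commands
        else:
            time.sleep(rule_time)

    threads = [threading.Thread(target=reconnect_vm) for _ in range(vm_count)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start


if __name__ == "__main__":
    print("serialized:", reload_all(10, 0.05, True))
    print("concurrent:", reload_all(10, 0.05, False))
```

 With the lock held, the total time grows as vm_count * rule_time, which is
 consistent with 100 VMs at roughly 1.4s each adding up to about 140s.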

   Note that this issue applies only if we use filter configurations that
 don't need IP learning. With IP learning the situation is different, because
 the reconnection code spawns a new thread that applies network rules only
 after the IP is learned from traffic, and this thread does not grab the VM
 lock. As a result VMs are manageable, but reloading filters in the
 background takes approximately those same 140s. I guess managing network
 filters during this period can have issues too. Anyway, this situation does
 not look good, so fixing the described issue by spawning threads even
 without IP learning does not look nice to me.

   What speedup is possible with a conservative approach? First, for test
 purposes, we can remove the firewall ruleLock, the gentech driver updateMutex
 and the filter object mutex, which do not serve a function in the restart
 scenario. This gives a 36s restart time. The speedup is achieved because the
 heavy fork/preexec steps are now run concurrently.

 Next we can try to reduce fork/preexec time. To estimate its contribution
 alone, let's bring back the above locks. It turns out that most of the time
 is taken by fork itself and by closing 8k (on my system) file descriptors in
 preexec. Using vfork gives a 2x boost, and so does dropping the mass close.
 (I checked the mass close contribution because I don't quite understand the
 purpose of this step: libvirt typically sets the close-on-exec flag on its
 descriptors.) So these two optimizations alone can result in a restart time
 of 30s.
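
 The close-on-exec point can be checked with a small experiment. The sketch
 below is not libvirt code and assumes a POSIX system; the helper names and
 the choice of descriptor numbers >= 100 are only for illustration. It execs
 a child without any mass close and reports whether a given descriptor is
 still open there.

```python
import fcntl
import os
import subprocess
import sys

# Child program: exit 0 if the descriptor number given in argv is open.
PROBE = (
    "import os, sys\n"
    "try:\n"
    "    os.fstat(int(sys.argv[1]))\n"
    "except OSError:\n"
    "    sys.exit(1)\n"
)


def make_fd(cloexec):
    """Open /dev/null on a descriptor >= 100, with or without close-on-exec."""
    low = os.open(os.devnull, os.O_RDONLY)
    fd = fcntl.fcntl(low, fcntl.F_DUPFD, 100)  # number the child won't reuse
    os.close(low)
    os.set_inheritable(fd, not cloexec)  # non-inheritable == FD_CLOEXEC set
    return fd


def open_in_child(fd):
    """Exec a child with no mass close and report whether fd survived exec."""
    result = subprocess.run(
        [sys.executable, "-c", PROBE, str(fd)],
        close_fds=False,  # deliberately skip the "close everything" step
    )
    return result.returncode == 0


if __name__ == "__main__":
    print("close-on-exec fd open in child:", open_in_child(make_fd(True)))
    print("inheritable fd open in child:  ", open_in_child(make_fd(False)))
```

 A descriptor with the flag set is closed by the kernel at exec time, so the
 mass close only matters for descriptors that were opened without it.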

 Unfortunately, combining the above two approaches does not give a boost equal
 to the product of the two taken alone. The reason is that, due to concurrency
 and the high number of VMs (100), the preexec boost does not play a
 significant role, and using vfork diminishes concurrency as it freezes all
 parent threads before execve. So dropping locks and closes gives a 33s
 restart time, and adding vfork to this gives a 25s restart time.
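
 To make the "does not multiply" point concrete, here is the arithmetic on
 the numbers quoted in this mail. The variable names are mine; the only
 inputs are the measured restart times above.

```python
baseline = 140.0       # all locks kept, plain fork, mass close
locks_dropped = 36.0   # ruleLock, updateMutex and filter mutex removed
fork_tuned = 30.0      # vfork and no mass close, locks kept
both_measured = 25.0   # locks and closes dropped, plus vfork


def speedup(after, before=baseline):
    """Speedup factor of a measurement relative to the baseline."""
    return before / after


# Restart time we would see if the two speedups were independent factors.
expected_if_multiplicative = baseline / (
    speedup(locks_dropped) * speedup(fork_tuned)
)

if __name__ == "__main__":
    print("locks dropped: %.2fx" % speedup(locks_dropped))
    print("fork tuned:    %.2fx" % speedup(fork_tuned))
    print("expected if multiplicative: %.1fs" % expected_if_multiplicative)
    print("measured:                   %.1fs" % both_measured)
```

 If the two boosts were independent, the combined restart would take under
 8s; the measured 25s is about three times that.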

 Another approach is to use the --atomic-file option of ebtables
 (iptables/ip6tables unfortunately do not have one). The idea is to save the
 table to a file, edit the file, and commit the table to the kernel. I hoped
 this could give a performance boost because we don't need to load/store the
 kernel network table for every single rule update. In order to isolate the
 approaches I also dropped all ip/ip6 updates, which cannot be done this way.
 In this approach we cannot drop ruleLock in the firewall because no other VM
 thread should change tables between save and commit. This approach gives a
 restart time of 25s. But it is broken anyway, as we cannot be sure another
 application doesn't change the network table between save and commit, in
 which case those changes will be lost.
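
 The save/edit/commit sequence could look roughly like the sketch below. It
 only builds the command lines and does not run them (running them needs
 root). The --atomic-file, --atomic-save and --atomic-commit options are from
 the ebtables man page; the helper name, the file path and the chain name are
 hypothetical and not what libvirt actually generates.

```python
import shlex


def atomic_update_commands(atomic_file, rule_edits, table="nat"):
    """Build the ebtables argv lists for one save/edit/commit cycle."""
    prefix = ["ebtables", "-t", table, "--atomic-file", atomic_file]
    commands = [prefix + ["--atomic-save"]]           # kernel table -> file
    commands += [prefix + list(edit) for edit in rule_edits]  # edit the file
    commands.append(prefix + ["--atomic-commit"])     # file -> kernel table
    return commands


if __name__ == "__main__":
    edits = [
        ["-N", "example-chain"],  # hypothetical chain name
        ["-A", "example-chain", "-d", "ff:ff:ff:ff:ff:ff", "-j", "RETURN"],
    ]
    for argv in atomic_update_commands("/tmp/ebtables.nat", edits):
        print(shlex.join(argv))
```

 Only the first and the last command touch the kernel table; anything another
 application changes in between is overwritten by the commit, which is the
 lost-update problem described above.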

 After all, I think we need to move in a different direction. We can add an
 API to all binaries and firewalld to execute many commands in one run. We can
 pass commands as arguments or write them into a file which is then given to
 the binary. Then libvirt itself can update, for example, the bridge network
 table in a couple of commands. The exact number depends on the new API. For
 example, if we add an option to delete chains recursively and an option not
 to fail on a NOENT error, we can change the table in one command (no listing
 of current rules is required).

 [1] no-promisc filter

 <filter name='no-promisc' chain='root' priority='-750'>
   <uuid>6d055022-1192-4a3d-ae1f-576baa5564b6</uuid>
   <rule action='return' direction='in' priority='500'>
     <mac dstmacaddr='ff:ff:ff:ff:ff:ff'/>
   </rule>
   <rule action='return' direction='in' priority='500'>
     <mac dstmacaddr='$MAC'/>
   </rule>
   <rule action='return' direction='in' priority='500'>
     <mac dstmacaddr='33:33:00:00:00:00' dstmacmask='ff:ff:00:00:00:00'/>
   </rule>
   <rule action='drop' direction='in' priority='500'>
     <mac/>
   </rule>
   <rule action='return' direction='in' priority='500'>
     <mac dstmacaddr='01:00:5e:00:00:00' dstmacmask='ff:ff:ff:80:00:00'/>
   </rule>
 </filter>

 --
 libvir-list mailing list
 libvir-list(a)redhat.com
 https://www.redhat.com/mailman/listinfo/libvir-list
