Hi,

 

I start a virtual machine with the following command line:

    /usr/libexec/qemu-kvm --enable-kvm -smp 8 -m 8192 -device vfio-pci,host=0000:81:00.0

 

Then I use gdb to pause the QEMU process before it executes the main_loop function.

At this point, lspci shows that the regions are disabled:

    81:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] (rev a1)

        Subsystem: NVIDIA Corporation Device 118f

        Physical Slot: 0-6

        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+

        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

        Interrupt: pin A routed to IRQ 35

        NUMA node: 1

        Region 0: Memory at c8000000 (32-bit, non-prefetchable) [disabled] [size=16M]

        Region 1: Memory at 27800000000 (64-bit, prefetchable) [disabled] [size=16G]

        Region 3: Memory at 27c00000000 (64-bit, prefetchable) [disabled] [size=32M]

 

But after I run the command:

    echo 1 > /sys/bus/pci/devices/0000:81:00.0/reset

lspci shows that the regions are *not* disabled:

    81:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] (rev a1)

        Subsystem: Huawei Technologies Co., Ltd. Device 2061

        Physical Slot: 0-6

        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-

        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

        Latency: 0, Cache Line Size: 32 bytes

        Interrupt: pin A routed to IRQ 7

        NUMA node: 1

        Region 0: Memory at c8000000 (32-bit, non-prefetchable) [size=16M]

        Region 1: Memory at 27800000000 (64-bit, prefetchable) [size=16G]

        Region 3: Memory at 27c00000000 (64-bit, prefetchable) [size=32M]
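
To make the difference between the two states explicit, here is a small Python sketch that diffs the two Control lines from the lspci outputs above. The helper names parse_flags and changed_flags are only illustrative and not part of any existing tool; the input strings are copied from the outputs in this mail.

```python
import re


def parse_flags(line):
    """Parse an lspci 'Control:' line into a {flag_name: enabled} dict."""
    _, _, rest = line.partition(":")
    flags = {}
    for token in rest.split():
        # Each flag looks like "Mem+" or "Mem-"; other tokens are skipped.
        m = re.fullmatch(r"(.+?)([+-])", token)
        if m:
            flags[m.group(1)] = (m.group(2) == "+")
    return flags


def changed_flags(before, after):
    """Return {flag_name: (old, new)} for flags whose state differs."""
    b, a = parse_flags(before), parse_flags(after)
    return {n: (b[n], a[n]) for n in b if n in a and b[n] != a[n]}


# Control line while QEMU is paused before main_loop
before = ("Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- "
          "ParErr- Stepping- SERR- FastB2B- DisINTx+")
# Control line after the sysfs reset
after = ("Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- "
         "ParErr+ Stepping- SERR+ FastB2B- DisINTx-")

for name, (old, new) in changed_flags(before, after).items():
    print(f"{name}: {'+' if old else '-'} -> {'+' if new else '-'}")
```

For these two lines it reports that I/O, Mem, BusMaster, ParErr, SERR and DisINTx have all flipped, which matches the regions going from [disabled] to enabled.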

 

AFAIK, QEMU performs the reset in vfio_pci_reset, with the following call stack:

QEMU:

    vfio_pci_reset

        ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)

Kernel:

    vfio_pci_ioctl

        pci_try_reset_function

            __pci_reset_function_locked

                pci_parent_bus_reset

                    pci_reset_bridge_secondary_bus

 

Writing 1 to the sysfs reset interface goes through this path:

Kernel:

    reset_store

        pci_reset_function

            __pci_reset_function_locked

                pci_parent_bus_reset

                    pci_reset_bridge_secondary_bus

 

So it seems that the two methods end up in the same reset path, and I am confused about why the results are inconsistent.

 

Thanks,

Zongyong Wu