AMD perf_ibs_start crash

Severity

Warning/NMI Storm/Lockup

Found by

perf_fuzzer: Vince Weaver

First Seen

???

Most recently Seen

4.4-rc4
4.9-rc6 (no crash though)

Reproducible

yes

Found On

a10

Linux-kernel Mailing List Report

22 October 2015: perf: fuzzer triggered trouble on AMD, maybe ibs related

Analysis

Kernel Splat

  1. No crash?
    
    *** perf_fuzzer 0.32-rc0 *** by Vince Weaver
    
            Linux version 4.9.0-rc5+ x86_64
            Processor: AMD 21/19/1
    
            Stopping after 30000
            Watchdog enabled with timeout 60s
            Will auto-exit if signal storm detected
            Seeding RNG from time 1479467663
    
            To reproduce, try:
                    echo 1 > /proc/sys/kernel/nmi_watchdog
                    echo 0 > /proc/sys/kernel/perf_event_paranoid
                    echo 25750 > /proc/sys/kernel/perf_event_max_sample_rate
                    ./perf_fuzzer -s 30000 -r 1479467663
    
            Fuzzing the following syscalls: mmap perf_event_open close read write ioctl fork prctl poll 
            Also attempting the following: signal-handler-on-overflow busy-instruction-loop accessing-perf-proc-and-sys-files trashing-the-mmap-page 
    
            Pid=24863, sleeping 1s
    ==================================================
    Starting fuzzing at 2016-11-18 06:14:24
    ==================================================
    Cannot open /sys/kernel/tracing/kprobe_events
    Throttling event 17 fd 21, last_refresh=0, period=0, type=8 throttles 0
    Throttling event 17 fd 21, last_refresh=0, period=0, type=8 throttles 1
    Throttling event 17 fd 21, last_refresh=0, period=0, type=8 throttles 2
    Throttling event 17 fd 21, last_refresh=0, period=0, type=8 throttles 3
    Throttling event 17 fd 21, last_refresh=0, period=0, type=8 throttles 4
    Throttling event 17 fd 21, last_refresh=0, period=0, type=8 throttles 5
    Throttling event 17 fd 21, last_refresh=0, period=0, type=8 throttles 6
    Throttling event 17 fd 21, last_refresh=0, period=0, type=8 throttles 7
    Throttling event 17 fd 21, last_refresh=0, period=0, type=8 throttles 8
    Throttling event 17 fd 21, last_refresh=0, period=0, type=8 throttles 9
    Throttling event 17 fd 21, last_refresh=0, period=0, type=8 throttles 10
    Stuck in a signal storm w/o forward progress; Max throttle count hit, giving up
    Signal from invalid fd 21 Bad file descriptor
    Throttling event 24 fd 27, last_refresh=0, period=32, type=1 throttles 0
    Throttling event 34 fd 38, last_refresh=0, period=0, type=7 throttles 0
    Throttling event 34 fd 38, last_refresh=0, period=0, type=7 throttles 1
    Throttling event 34 fd 38, last_refresh=0, period=0, type=7 throttles 2
    SIGIO due to RT queue overflow
    SIGIO due to RT queue overflow
    Signal from invalid fd 38 Bad file descriptor
    Iteration 10000 (303.030303 ops/s)
            Open attempts: 131824  Successful: 903  Currently open: 2
                    EPERM : 7
                    ENOENT : 726
                    E2BIG : 11561
                    EBADF : 11186
                    EINVAL : 106533
                    EOVERFLOW : 1
                    EOPNOTSUPP : 907
                    Trinity Type (Normal 221/32847)(Sampling 24/32898)(Global 637/33139)(Random 21/32940)
                    Type (Hardware 188/18590)(software 380/18723)(tracepoint 59/18525)(Cache 39/16471)(cpu 204/18578)(breakpoint 13/18849)(amd_nb 5/2043)(ibs_fetch 3/1968)(ibs_op 3/2198)(msr 9/2024)(#10 0/15)(#11 0/9)(#12 0/13)(#13 0/11)(#14 0/9)(#15 0/9)(#16 0/130)(#17 0/14)(#18 0/12)(>19 0/13633)
            Close:  901/902 Successful
            Read:   795/878 Successful
            Write:  0/876 Successful
            Ioctl:  318/835 Successful: (ENABLE 71/71)(DISABLE 76/76)(REFRESH 2/72)(RESET 77/78)(PERIOD 6/81)(SET_OUTPUT 13/82)(SET_FILTER 0/78)(ID 68/68)(SET_BPF 0/73)(PAUSE_OUTPUT 5/95)(#10 0/0)(#11 0/0)(#12 0/0)(#13 0/0)(#14 0/0)(>14 0/61)
            Mmap:   438/1079 Successful: (MMAP 438/1079)(TRASH 70/148)(READ 110072/156202)(UNMAP 438/994)(AUX 0/125)(AUX_READ 0/0)
            Prctl:  858/858 Successful
            Fork:   451/451 Successful
            Poll:   865/913 Successful
            Access: 95/906 Successful
            Overflows: 174349  Recursive: 0
            SIGIOs due to RT signal queue full: 2
    Throttling event 9 fd 12, last_refresh=0, period=0, type=7 throttles 0
    Throttling event 9 fd 12, last_refresh=0, period=0, type=7 throttles 1
    Throttling event 9 fd 12, last_refresh=0, period=0, type=7 throttles 2
    Throttling event 9 fd 12, last_refresh=0, period=0, type=7 throttles 3
    Throttling event 9 fd 12, last_refresh=0, period=0, type=7 throttles 4
    Throttling event 9 fd 12, last_refresh=0, period=0, type=7 throttles 5
    Throttling event 9 fd 12, last_refresh=0, period=0, type=7 throttles 6
    Throttling event 9 fd 12, last_refresh=0, period=0, type=7 throttles 7
    Throttling event 9 fd 12, last_refresh=0, period=0, type=7 throttles 8
    Throttling event 9 fd 12, last_refresh=0, period=0, type=7 throttles 9
    Throttling event 9 fd 12, last_refresh=0, period=0, type=7 throttles 10
    Stuck in a signal storm w/o forward progress; Max throttle count hit, giving up
    Signal from invalid fd 12 Bad file descriptor
    SIGIO due to RT queue overflow
    SIGIO due to RT queue overflow
    Iteration 20000 (270.270270 ops/s)
            Open attempts: 132808  Successful: 891  Currently open: 11
                    EPERM : 10
                    ENOENT : 690
                    E2BIG : 11629
                    EBADF : 11373
                    EINVAL : 107292
                    EOVERFLOW : 1
                    EOPNOTSUPP : 922
                    Trinity Type (Normal 203/33282)(Sampling 30/33126)(Global 630/33089)(Random 28/33311)
                    Type (Hardware 194/18701)(software 340/19083)(tracepoint 68/18984)(Cache 44/16522)(cpu 198/18851)(breakpoint 25/18415)(amd_nb 8/2118)(ibs_fetch 5/2026)(ibs_op 0/2240)(msr 9/2116)(#10 0/17)(#11 0/10)(#12 0/11)(#13 0/19)(#14 0/5)(#15 0/6)(#16 0/139)(#17 0/10)(#18 0/9)(>19 0/13526)
            Close:  882/883 Successful
            Read:   754/831 Successful
            Write:  0/822 Successful
            Ioctl:  326/811 Successful: (ENABLE 72/72)(DISABLE 75/75)(REFRESH 7/76)(RESET 78/78)(PERIOD 1/52)(SET_OUTPUT 5/64)(SET_FILTER 0/80)(ID 79/79)(SET_BPF 0/88)(PAUSE_OUTPUT 9/69)(#10 0/0)(#11 0/0)(#12 0/0)(#13 0/0)(#14 0/0)(>14 0/78)
            Mmap:   397/1041 Successful: (MMAP 397/1041)(TRASH 71/137)(READ 110074/110082)(UNMAP 396/952)(AUX 0/109)(AUX_READ 0/0)
            Prctl:  912/912 Successful
            Fork:   461/461 Successful
            Poll:   842/965 Successful
            Access: 96/912 Successful
            Overflows: 122077  Recursive: 0
            SIGIOs due to RT signal queue full: 2
    Throttling event 10 fd 15, last_refresh=0, period=0, type=7 throttles 0
    Throttling event 10 fd 15, last_refresh=0, period=0, type=7 throttles 1
    Throttling event 10 fd 15, last_refresh=0, period=0, type=7 throttles 2
    Throttling event 10 fd 15, last_refresh=0, period=0, type=7 throttles 3
    Throttling event 10 fd 15, last_refresh=0, period=0, type=7 throttles 4
    Throttling event 10 fd 15, last_refresh=0, period=0, type=7 throttles 5
    Throttling event 10 fd 15, last_refresh=0, period=0, type=7 throttles 6
    Throttling event 10 fd 15, last_refresh=0, period=0, type=7 throttles 7
    Throttling event 10 fd 15, last_refresh=0, period=0, type=7 throttles 8
    Throttling event 10 fd 15, last_refresh=0, period=0, type=7 throttles 9
    Throttling event 10 fd 15, last_refresh=0, period=0, type=7 throttles 10
    Stuck in a signal storm w/o forward progress; Max throttle count hit, giving up
    Signal from invalid fd 15 Bad file descriptor
    SIGIO due to RT queue overflow
    SIGIO due to RT queue overflow
    Throttling event 2 fd 6, last_refresh=0, period=0, type=8 throttles 0
    Throttling event 2 fd 6, last_refresh=0, period=0, type=8 throttles 1
    Throttling event 2 fd 6, last_refresh=0, period=0, type=8 throttles 2
    Throttling event 2 fd 6, last_refresh=0, period=0, type=8 throttles 3
    Throttling event 2 fd 6, last_refresh=0, period=0, type=8 throttles 4
    Throttling event 2 fd 6, last_refresh=0, period=0, type=8 throttles 5
    Throttling event 2 fd 6, last_refresh=0, period=0, type=8 throttles 6
    Throttling event 2 fd 6, last_refresh=0, period=0, type=8 throttles 7
    Throttling event 2 fd 6, last_refresh=0, period=0, type=8 throttles 8
    Throttling event 2 fd 6, last_refresh=0, period=0, type=8 throttles 9
    Throttling event 2 fd 6, last_refresh=0, period=0, type=8 throttles 10
    Stuck in a signal storm w/o forward progress; Max throttle count hit, giving up
    Signal from invalid fd 6 Bad file descriptor
    [56285.317827] ------------[ cut here ]------------
    [56285.328331] WARNING: CPU: 0 PID: 0 at arch/x86/events/amd/ibs.c:409 perf_ibs_start+0xda/0x1cf
    [56285.343892] Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc nls_utf8 nls_cp437 vfat fat edac_mce_amd edac_core kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic radeon snd_hda_codec_hdmi irqbypass snd_hda_intel ghash_clmulni_intel snd_hda_codec ppdev hp_wmi sparse_keymap ttm drm_kms_helper aesni_intel evdev snd_hda_core drm aes_x86_64 pl2303 snd_hwdep ablk_helper usbserial snd_pcm cryptd lrw efi_pstore gf128mul glue_helper serio_raw efivars snd_timer parport_pc parport pcspkr k10temp sg tpm_infineon sp5100_tco i2c_algo_bit wmi fb_sys_fops syscopyarea sysfillrect sysimgblt snd i2c_piix4 soundcore i2c_core button acpi_cpufreq tpm_tis tpm_tis_core tpm processor ohci_pci sr_mod cdrom sd_mod ohci_hcd ehci_pci tg3 ahci ptp libahci pps_core libata xhci_pci ehci_hcd xhci_hcd crc32c_intel libphy psmouse usbcore scsi_mod usb_common
    [56285.482634] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc5+ #41
    [56285.497612] Hardware name: Hewlett-Packard HP Compaq Pro 6305 SFF/1850, BIOS K06 v02.57 08/16/2013
    [56285.482634] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc5+ #41
    [56285.497612] Hardware name: Hewlett-Packard HP Compaq Pro 6305 SFF/1850, BIOS K06 v02.57 08/16/2013
    [56285.515357]  ffff8801edc07cb0 ffffffff81429ccc 0000000000000000 0000000000000000
    [56285.531951]  ffff8801edc07cf0 ffffffff81079064 0000019900000009 ffff8801e7606900
    [56285.548309]  ffffe8ffffc0de98 ffffffff81c1a4e0 ffff8801e7606a48 ffffffff81c1a4e0
    [56285.564512] Call Trace:
    [56285.575249]   [56285.577244]  [] dump_stack+0x4d/0x63
    [56285.590488]  [] __warn+0xfa/0x115
    [56285.603490]  [] warn_slowpath_null+0x18/0x1a
    [56285.617414]  [] perf_ibs_start+0xda/0x1cf
    [56285.631040]  [] ? perf_ibs_resume+0x25/0x25
    [56285.644802]  [] ? perf_event_task_tick+0x155/0x2c3
    [56285.659128]  [] perf_event_task_tick+0x17a/0x2c3
    [56285.673223]  [] scheduler_tick+0x9e/0xcd
    [56285.686584]  [] update_process_times+0x42/0x4f
    [56285.700434]  [] tick_sched_handle+0x70/0x81
    [56285.714007]  [] tick_sched_timer+0x34/0x63
    [56285.727411]  [] __hrtimer_run_queues+0x20e/0x3ad
    [56285.741333]  [] ? update_fast_timekeeper+0x40/0x47
    [56285.755409]  [] ? tick_sched_do_timer+0x41/0x41
    [56285.769198]  [] ? hrtimer_cancel+0x1b/0x1b
    [56285.782496]  [] ? ktime_get_update_offsets_now+0x4c/0xde
    [56285.797077]  [] hrtimer_interrupt+0xa7/0x1a4
    [56285.810528]  [] local_apic_timer_interrupt+0x63/0x6a
    [56285.824671]  [] smp_apic_timer_interrupt+0x29/0x39
    [56285.838655]  [] apic_timer_interrupt+0x7f/0x90
    [56285.852226]   [56285.854224]  [] ? sched_idle_set_state+0x25/0x31
    [56285.868030]  [] ? cpuidle_enter_state+0x147/0x1ba
    [56285.881988]  [] cpuidle_enter+0x12/0x14
    [56285.894903]  [] cpu_startup_entry+0x190/0x202
    [56285.908393]  [] rest_init+0x82/0x85
    [56285.921069]  [] start_kernel+0x48d/0x4b1
    [56285.934217]  [] ? thread_stack_cache_init+0x6/0x6
    [56285.948081]  [] ? cmdline_find_option_bool+0x92/0x106
    [56285.962230]  [] ? early_idt_handler_array+0x120/0x120
    [56285.976304]  [] ? early_idt_handler_array+0x120/0x120
    [56285.990301]  [] x86_64_start_reservations+0x2a/0x2c
    [56286.004100]  [] x86_64_start_kernel+0x143/0x152
    [56286.017492] ---[ end trace b597ef9238fc8c40 ]---
    
    
    
  2. echo "0" > /proc/sys/kernel/perf_event_paranoid
    ./fast_repro99.sh 
    Stopping after 30000
    Watchdog enabled with timeout 60s
    Will auto-exit if signal storm detected
    
    *** perf_fuzzer 0.31-pre *** by Vince Weaver
    
    	Linux version 4.4.0-rc4+ x86_64
    	Processor: AMD 21/19/1
    
    	Seeding random number generator with 1449774065
    	/proc/sys/kernel/perf_event_max_sample_rate currently: 100000/s
    	/proc/sys/kernel/perf_event_paranoid currently: 0
    	Logging perf_event_open() failures: no
    	Running fsync after every syscall: no
    	To reproduce, try: ./perf_fuzzer -s 30000 -r 1449774065
    
    Pid=2078, sleeping 1s
    ==================================================
    Fuzzing the following syscalls:
    	mmap perf_event_open close read write ioctl fork prctl poll 
    *NOT* Fuzzing the following syscalls:
    	
    Also attempting the following:
    	signal-handler-on-overflow busy-instruction-loop accessing-perf-proc-and-sys-files trashing-the-mmap-page 
    *NOT* attempting the following:
    	
    ==================================================
    Cannot open /sys/kernel/tracing/kprobe_events
    Throttling event 26 fd 29, last_refresh=0, period=65420, type=1 throttles 0
    Throttling event 26 fd 29, last_refresh=0, period=65420, type=1 throttles 1
    Throttling event 26 fd 29, last_refresh=0, period=65420, type=1 throttles 2
    Throttling event 26 fd 29, last_refresh=0, period=65420, type=1 throttles 3
    Throttling event 26 fd 29, last_refresh=0, period=65420, type=1 throttles 4
    Throttling event 26 fd 29, last_refresh=0, period=65420, type=1 throttles 5
    Throttling event 26 fd 29, last_refresh=0, period=65420, type=1 throttles 6
    
    Message from syslogd@a10 at Dec 10 14:01:08 ...
    kernel:[   57.886694] Uhhuh. NMI received for unknown reason 3d on CPU 3.
    Message from syslogd@a10 at Dec 10 14:01:08 ...
    kernel:[   57.892629] Do you have a strange power saving mode enabled?
    
    [   56.075422] ------------[ cut here ]------------
    [   56.080067] WARNING: CPU: 1 PID: 0 at arch/x86/kernel/cpu/perf_event_amd_ibs.c:372 perf_ibs_start+0x43/0x131()
    [   56.090072] Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc nls_utf8 nls_cp437 vfat fat snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi kvm_amd kvm irqbypass sha256_generic hmac drbg ansi_cprng aesni_intel aes_x86_64 ablk_helper cryptd lrw ppdev gf128mul evdev glue_helper hp_wmi sparse_keymap acpi_cpufreq tpm_infineon snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm psmouse pl2303 radeon ttm drm_kms_helper drm pcspkr efi_pstore serio_raw efivars i2c_algo_bit k10temp snd_timer usbserial shpchp fb_sys_fops snd parport_pc syscopyarea sysfillrect wmi i2c_piix4 parport i2c_core soundcore sysimgblt button tpm_tis tpm processor sg sr_mod cdrom sd_mod ohci_pci ahci tg3 libahci ehci_pci libata xhci_pci ohci_hcd ptp xhci_hcd pps_core ehci_hcd
    [   56.162339]  libph crc32c_intel scsi_mod usbcore usb_common
    [   56.166810] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W       4.4.0-rc4+ #18
    [   56.174385] Hardware name: Hewlett-Packard HP Compaq Pro 6305 SFF/1850, BIOS K06 v02.57 08/16/2013
    [   56.183343]  0000000000000006 ffffffff81242bd9 0000000000000000 ffffffff81045607
    [   56.190820]  ffffffff8102a736 ffff88009ca8a400 ffffe8ffffc835d0 ffffffff81823380
    [   56.198294]  ffff88022ec8f960 ffffffff8102a736 ffff88009ca8a400 ffffe8ffffc865f0
    [   56.205776] Call Trace:
    [   56.208226]    [] ? dump_stack+0x40/0x50
    [   56.214178]  [] ? warn_slowpath_common+0x94/0xa9
    [   56.220365]  [] ? perf_ibs_start+0x43/0x131
    [   56.226114]  [] ? perf_ibs_start+0x43/0x131
    [   56.231868]  [] ? perf_event_task_tick+0xf4/0x19b
    [   56.238143]  [] ? tick_sched_do_timer+0x24/0x24
    [   56.244241]  [] ? scheduler_tick+0x64/0x7d
    [   56.249906]  [] ? update_process_times+0x3b/0x45
    [   56.256094]  [] ? tick_sched_handle+0x3e/0x4a
    [   56.262019]  [] ? tick_sched_timer+0x2f/0x53
    [   56.267858]  [] ? __hrtimer_run_queues+0xb9/0x18b
    [   56.274132]  [] ? hrtimer_interrupt+0x61/0x101
    [   56.280141]  [] ? smp_apic_timer_interrupt+0x20/0x2f
    [   56.286678]  [] ? apic_timer_interrupt+0x7f/0x90
    [   56.292867]    [] ? cpuidle_enter_state+0xf3/0x145
    [   56.299693]  [] ? cpu_startup_entry+0x170/0x1db
    [   56.305795] ---[ end trace c5ec8b03a2f4c316 ]---
    [   57.886694] Uhhuh. NMI received for unknown reason 3d on CPU 3.
    [   57.892629] Do you have a strange power saving mode enabled?
    [   57.898295] Dazed and confused, but trying to continue
    [   59.994455] Uhhuh. NMI received for unknown reason 2d on CPU 2.
    [   60.000388] Do you have a strange power saving mode enabled?
    [   60.006054] Dazed and confused, but trying to continue
    [   60.104287] Uhhuh. NMI received for unknown reason 3d on CPU 2.
    [   60.110239] Do you have a strange power saving mode enabled?
    [   60.115919] Dazed and confused, but trying to continue
    
    

Back to perf_fuzzer bugs found