Skip to content

Conversation

@cdsteinkuehler
Copy link

Add cape-universal-hdmi pin info to am335x-bone-common-pinmux.dtsi
Edit hdmi dtsi include files to use new mode= setting to set HDMI
mode at startup, leaving pinmux configurable at runtime.

Signed-off-by: Charles Steinkuehler [email protected]

No-audio version works as expected, but HDMI with audio seems to not bring up the display properly. I think this might have to do with being built as modules instead of being compiled directly into the kernel.

It's easy enough to switch back and forth between the two pinmux methods, just turn all the mode = "hdmi" into status = "disabled" in the nxp-hdmi include files, the pinmux additions can be left in place.

RobertCNelson and others added 30 commits September 24, 2014 14:17
#git://git.ti.com/ti-cm3-pm-firmware/amx3-cm3.git
#4e219d5053ee41b8fa8f85b48b1529ae4c6feb48

Signed-off-by: Robert Nelson <[email protected]>
This patch was derived from 2 commits, but I removed the non-pinmux-helper portions
and added the helper to the bb.org_defconfig:

capemgr: Capemgr makefiles and Kconfig fragments.

Introduce a cape loader using DT overlays and dynamic
DT objects.

Makefile and Kconfig fragments.

Signed-off-by: Pantelis Antoniou <[email protected]>

Conflicts:
	arch/arm/mach-omap2/Kconfig
	drivers/misc/Kconfig
	drivers/misc/Makefile

And:

Pinmux helper driver.

    That's just a hack to get a pinmux helper driver working.

    Define in the DT

    	helper {
    		compatible = "bone-pinmux-helper";
    		pinctrl-names = "default";
    		pinctrl-0 = <&helper_pins>;
    		status = "okay";
    	};
Pinctrl already supports multiple states. Just make them visible.
devm_kfree warned out... why? no idea.
A gpio OF helper driver that allows configuration to be done via
DT.
Signed-off-by: Jason Kridner <[email protected]>
This adds gpio and pinmux helpers to the majority of available expansion header pins
based on the cape-universal work from Charles Steinkuehler making them userspace
configurable. This is not a substitute for Capemgr as it doesn't perform the
configuration based on cape detection, nor does it enable dynamic configuration of
all types of peripherals that could be on a cape. It does, however, enable many
developers to rapidly experiment with a lesser degree of complexity.

Derived from:
https://github.com/cdsteinkuehler/beaglebone-universal-io/blob/52461b52ef3203e648399c16c7e160c848a04b5c$

Signed-off-by: Jason Kridner <[email protected]>
Cc: Charles Steinkuehler <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
…new mode parameter is used to set the initial pinmux mode to something other than "default" or NULL, which is what happens currently. This allows enabling SoC hardware via device-tree which requires specific pinmux settings to function on boot, but still leaves the pinmux register under control of the bone-pinmux- helper driver meaning the pinmux setting can be changed at run time via user-mode access to sysfs.

Signed-off-by: Charles Steinkuehler <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
(split into pinmux and cape patches rcn-ee)

Signed-off-by: John Syn <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
Signed-off-by: Robert Nelson <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Nov 4, 2019
[ Upstream commit c268e7a ]

KASAN: global-out-of-bounds Read in dvb_pll_attach

Syzbot reported global-out-of-bounds Read in dvb_pll_attach, while
accessing id[dvb_pll_devcount], because dvb_pll_devcount was 65,
that is more than size of 'id' which is DVB_PLL_MAX(64).

Rather than increasing dvb_pll_devcount every time, use ida so that
numbers are allocated correctly. This does mean that no more than
64 devices can be attached at the same time, but this is more than
sufficient.

usb 1-1: dvb_usb_v2: will pass the complete MPEG2 transport stream to the
software demuxer
dvbdev: DVB: registering new adapter (774 Friio White ISDB-T USB2.0)
usb 1-1: media controller created
dvbdev: dvb_create_media_entity: media entity 'dvb-demux' registered.
tc90522 0-0018: Toshiba TC90522 attached.
usb 1-1: DVB: registering adapter 0 frontend 0 (Toshiba TC90522 ISDB-T
module)...
dvbdev: dvb_create_media_entity: media entity 'Toshiba TC90522 ISDB-T
module' registered.
==================================================================
BUG: KASAN: global-out-of-bounds in dvb_pll_attach+0x6c5/0x830
drivers/media/dvb-frontends/dvb-pll.c:798
Read of size 4 at addr ffffffff89c9e5e0 by task kworker/0:1/12

CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.2.0-rc6+ #13
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: usb_hub_wq hub_event
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0xca/0x13e lib/dump_stack.c:113
  print_address_description+0x67/0x231 mm/kasan/report.c:188
  __kasan_report.cold+0x1a/0x32 mm/kasan/report.c:317
  kasan_report+0xe/0x20 mm/kasan/common.c:614
  dvb_pll_attach+0x6c5/0x830 drivers/media/dvb-frontends/dvb-pll.c:798
  dvb_pll_probe+0xfe/0x174 drivers/media/dvb-frontends/dvb-pll.c:877
  i2c_device_probe+0x790/0xaa0 drivers/i2c/i2c-core-base.c:389
  really_probe+0x281/0x660 drivers/base/dd.c:509
  driver_probe_device+0x104/0x210 drivers/base/dd.c:670
  __device_attach_driver+0x1c2/0x220 drivers/base/dd.c:777
  bus_for_each_drv+0x15c/0x1e0 drivers/base/bus.c:454
  __device_attach+0x217/0x360 drivers/base/dd.c:843
  bus_probe_device+0x1e4/0x290 drivers/base/bus.c:514
  device_add+0xae6/0x16f0 drivers/base/core.c:2111
  i2c_new_client_device+0x5b3/0xc40 drivers/i2c/i2c-core-base.c:778
  i2c_new_device+0x19/0x50 drivers/i2c/i2c-core-base.c:821
  dvb_module_probe+0xf9/0x220 drivers/media/dvb-core/dvbdev.c:985
  friio_tuner_attach+0x125/0x1d0 drivers/media/usb/dvb-usb-v2/gl861.c:536
  dvb_usbv2_adapter_frontend_init
drivers/media/usb/dvb-usb-v2/dvb_usb_core.c:675 [inline]
  dvb_usbv2_adapter_init drivers/media/usb/dvb-usb-v2/dvb_usb_core.c:804
[inline]
  dvb_usbv2_init drivers/media/usb/dvb-usb-v2/dvb_usb_core.c:865 [inline]
  dvb_usbv2_probe.cold+0x24dc/0x255d
drivers/media/usb/dvb-usb-v2/dvb_usb_core.c:980
  usb_probe_interface+0x305/0x7a0 drivers/usb/core/driver.c:361
  really_probe+0x281/0x660 drivers/base/dd.c:509
  driver_probe_device+0x104/0x210 drivers/base/dd.c:670
  __device_attach_driver+0x1c2/0x220 drivers/base/dd.c:777
  bus_for_each_drv+0x15c/0x1e0 drivers/base/bus.c:454
  __device_attach+0x217/0x360 drivers/base/dd.c:843
  bus_probe_device+0x1e4/0x290 drivers/base/bus.c:514
  device_add+0xae6/0x16f0 drivers/base/core.c:2111
  usb_set_configuration+0xdf6/0x1670 drivers/usb/core/message.c:2023
  generic_probe+0x9d/0xd5 drivers/usb/core/generic.c:210
  usb_probe_device+0x99/0x100 drivers/usb/core/driver.c:266
  really_probe+0x281/0x660 drivers/base/dd.c:509
  driver_probe_device+0x104/0x210 drivers/base/dd.c:670
  __device_attach_driver+0x1c2/0x220 drivers/base/dd.c:777
  bus_for_each_drv+0x15c/0x1e0 drivers/base/bus.c:454
  __device_attach+0x217/0x360 drivers/base/dd.c:843
  bus_probe_device+0x1e4/0x290 drivers/base/bus.c:514
  device_add+0xae6/0x16f0 drivers/base/core.c:2111
  usb_new_device.cold+0x8c1/0x1016 drivers/usb/core/hub.c:2534
  hub_port_connect drivers/usb/core/hub.c:5089 [inline]
  hub_port_connect_change drivers/usb/core/hub.c:5204 [inline]
  port_event drivers/usb/core/hub.c:5350 [inline]
  hub_event+0x1ada/0x3590 drivers/usb/core/hub.c:5432
  process_one_work+0x905/0x1570 kernel/workqueue.c:2269
  process_scheduled_works kernel/workqueue.c:2331 [inline]
  worker_thread+0x7ab/0xe20 kernel/workqueue.c:2417
  kthread+0x30b/0x410 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

The buggy address belongs to the variable:
  id+0x100/0x120

Memory state around the buggy address:
  ffffffff89c9e480: fa fa fa fa 00 00 fa fa fa fa fa fa 00 00 00 00
  ffffffff89c9e500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ffffffff89c9e580: 00 00 00 00 00 00 00 00 00 00 00 00 fa fa fa fa
                                                        ^
  ffffffff89c9e600: 04 fa fa fa fa fa fa fa 04 fa fa fa fa fa fa fa
  ffffffff89c9e680: 04 fa fa fa fa fa fa fa 04 fa fa fa fa fa fa fa
==================================================================

Reported-by: [email protected]
Signed-off-by: Sean Young <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request May 9, 2020
commit 7cf64b1 upstream.

vt_in_use() dereferences console_driver->ttys[i] without proper locking.
This is broken because the tty can be closed and freed concurrently.

We could fix this by using 'READ_ONCE(console_driver->ttys[i]) != NULL'
and skipping the check of tty_struct::count.  But, looking at
console_driver->ttys[i] isn't really appropriate anyway because even if
it is NULL the tty can still be in the process of being closed.

Instead, fix it by making vt_in_use() require console_lock() and check
whether the vt is allocated and has port refcount > 1.  This works since
following the patch "vt: vt_ioctl: fix VT_DISALLOCATE freeing in-use
virtual console" the port refcount is incremented while the vt is open.

Reproducer (very unreliable, but it worked for me after a few minutes):

	#include <fcntl.h>
	#include <linux/vt.h>

	int main()
	{
		int fd, nproc;
		struct vt_stat state;
		char ttyname[16];

		fd = open("/dev/tty10", O_RDONLY);
		for (nproc = 1; nproc < 8; nproc *= 2)
			fork();
		for (;;) {
			sprintf(ttyname, "/dev/tty%d", rand() % 8);
			close(open(ttyname, O_RDONLY));
			ioctl(fd, VT_GETSTATE, &state);
		}
	}

KASAN report:

	BUG: KASAN: use-after-free in vt_in_use drivers/tty/vt/vt_ioctl.c:48 [inline]
	BUG: KASAN: use-after-free in vt_ioctl+0x1ad3/0x1d70 drivers/tty/vt/vt_ioctl.c:657
	Read of size 4 at addr ffff888065722468 by task syz-vt2/132

	CPU: 0 PID: 132 Comm: syz-vt2 Not tainted 5.6.0-rc5-00130-g089b6d3654916 #13
	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20191223_100556-anatol 04/01/2014
	Call Trace:
	 [...]
	 vt_in_use drivers/tty/vt/vt_ioctl.c:48 [inline]
	 vt_ioctl+0x1ad3/0x1d70 drivers/tty/vt/vt_ioctl.c:657
	 tty_ioctl+0x9db/0x11b0 drivers/tty/tty_io.c:2660
	 [...]

	Allocated by task 136:
	 [...]
	 kzalloc include/linux/slab.h:669 [inline]
	 alloc_tty_struct+0x96/0x8a0 drivers/tty/tty_io.c:2982
	 tty_init_dev+0x23/0x350 drivers/tty/tty_io.c:1334
	 tty_open_by_driver drivers/tty/tty_io.c:1987 [inline]
	 tty_open+0x3ca/0xb30 drivers/tty/tty_io.c:2035
	 [...]

	Freed by task 41:
	 [...]
	 kfree+0xbf/0x200 mm/slab.c:3757
	 free_tty_struct+0x8d/0xb0 drivers/tty/tty_io.c:177
	 release_one_tty+0x22d/0x2f0 drivers/tty/tty_io.c:1468
	 process_one_work+0x7f1/0x14b0 kernel/workqueue.c:2264
	 worker_thread+0x8b/0xc80 kernel/workqueue.c:2410
	 [...]

Fixes: 4001d7b ("vt: push down the tty lock so we can see what is left to tackle")
Cc: <[email protected]> # v3.4+
Acked-by: Jiri Slaby <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson pushed a commit that referenced this pull request May 9, 2020
[ Upstream commit 1bc7896 ]

When experimenting with bpf_send_signal() helper in our production
environment (5.2 based), we experienced a deadlock in NMI mode:
   #5 [ffffc9002219f770] queued_spin_lock_slowpath at ffffffff8110be24
   #6 [ffffc9002219f770] _raw_spin_lock_irqsave at ffffffff81a43012
   #7 [ffffc9002219f780] try_to_wake_up at ffffffff810e7ecd
   #8 [ffffc9002219f7e0] signal_wake_up_state at ffffffff810c7b55
   #9 [ffffc9002219f7f0] __send_signal at ffffffff810c8602
  #10 [ffffc9002219f830] do_send_sig_info at ffffffff810ca31a
  #11 [ffffc9002219f868] bpf_send_signal at ffffffff8119d227
  #12 [ffffc9002219f988] bpf_overflow_handler at ffffffff811d4140
  #13 [ffffc9002219f9e0] __perf_event_overflow at ffffffff811d68cf
  #14 [ffffc9002219fa10] perf_swevent_overflow at ffffffff811d6a09
  #15 [ffffc9002219fa38] ___perf_sw_event at ffffffff811e0f47
  #16 [ffffc9002219fc30] __schedule at ffffffff81a3e04d
  #17 [ffffc9002219fc90] schedule at ffffffff81a3e219
  #18 [ffffc9002219fca0] futex_wait_queue_me at ffffffff8113d1b9
  #19 [ffffc9002219fcd8] futex_wait at ffffffff8113e529
  #20 [ffffc9002219fdf0] do_futex at ffffffff8113ffbc
  #21 [ffffc9002219fec0] __x64_sys_futex at ffffffff81140d1c
  #22 [ffffc9002219ff38] do_syscall_64 at ffffffff81002602
  #23 [ffffc9002219ff50] entry_SYSCALL_64_after_hwframe at ffffffff81c00068

The above call stack is actually very similar to an issue
reported by Commit eac9153 ("bpf/stackmap: Fix deadlock with
rq_lock in bpf_get_stack()") by Song Liu. The only difference is
bpf_send_signal() helper instead of bpf_get_stack() helper.

The above deadlock is triggered with a perf_sw_event.
Similar to Commit eac9153, the below almost identical reproducer
used tracepoint point sched/sched_switch so the issue can be easily caught.
  /* stress_test.c */
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/mman.h>
  #include <pthread.h>
  #include <sys/types.h>
  #include <sys/stat.h>
  #include <fcntl.h>

  #define THREAD_COUNT 1000
  char *filename;
  void *worker(void *p)
  {
        void *ptr;
        int fd;
        char *pptr;

        fd = open(filename, O_RDONLY);
        if (fd < 0)
                return NULL;
        while (1) {
                struct timespec ts = {0, 1000 + rand() % 2000};

                ptr = mmap(NULL, 4096 * 64, PROT_READ, MAP_PRIVATE, fd, 0);
                usleep(1);
                if (ptr == MAP_FAILED) {
                        printf("failed to mmap\n");
                        break;
                }
                munmap(ptr, 4096 * 64);
                usleep(1);
                pptr = malloc(1);
                usleep(1);
                pptr[0] = 1;
                usleep(1);
                free(pptr);
                usleep(1);
                nanosleep(&ts, NULL);
        }
        close(fd);
        return NULL;
  }

  int main(int argc, char *argv[])
  {
        void *ptr;
        int i;
        pthread_t threads[THREAD_COUNT];

        if (argc < 2)
                return 0;

        filename = argv[1];

        for (i = 0; i < THREAD_COUNT; i++) {
                if (pthread_create(threads + i, NULL, worker, NULL)) {
                        fprintf(stderr, "Error creating thread\n");
                        return 0;
                }
        }

        for (i = 0; i < THREAD_COUNT; i++)
                pthread_join(threads[i], NULL);
        return 0;
  }
and the following command:
  1. run `stress_test /bin/ls` in one windown
  2. hack bcc trace.py with the following change:
#     --- a/tools/trace.py
#     +++ b/tools/trace.py
     @@ -513,6 +513,7 @@ BPF_PERF_OUTPUT(%s);
              __data.tgid = __tgid;
              __data.pid = __pid;
              bpf_get_current_comm(&__data.comm, sizeof(__data.comm));
     +        bpf_send_signal(10);
      %s
      %s
              %s.perf_submit(%s, &__data, sizeof(__data));
  3. in a different window run
     ./trace.py -p $(pidof stress_test) t:sched:sched_switch

The deadlock can be reproduced in our production system.

Similar to Song's fix, the fix is to delay sending signal if
irqs is disabled to avoid deadlocks involving with rq_lock.
With this change, my above stress-test in our production system
won't cause deadlock any more.

I also implemented a scale-down version of reproducer in the
selftest (a subsequent commit). With latest bpf-next,
it complains for the following potential deadlock.
  [   32.832450] -> #1 (&p->pi_lock){-.-.}:
  [   32.833100]        _raw_spin_lock_irqsave+0x44/0x80
  [   32.833696]        task_rq_lock+0x2c/0xa0
  [   32.834182]        task_sched_runtime+0x59/0xd0
  [   32.834721]        thread_group_cputime+0x250/0x270
  [   32.835304]        thread_group_cputime_adjusted+0x2e/0x70
  [   32.835959]        do_task_stat+0x8a7/0xb80
  [   32.836461]        proc_single_show+0x51/0xb0
  ...
  [   32.839512] -> #0 (&(&sighand->siglock)->rlock){....}:
  [   32.840275]        __lock_acquire+0x1358/0x1a20
  [   32.840826]        lock_acquire+0xc7/0x1d0
  [   32.841309]        _raw_spin_lock_irqsave+0x44/0x80
  [   32.841916]        __lock_task_sighand+0x79/0x160
  [   32.842465]        do_send_sig_info+0x35/0x90
  [   32.842977]        bpf_send_signal+0xa/0x10
  [   32.843464]        bpf_prog_bc13ed9e4d3163e3_send_signal_tp_sched+0x465/0x1000
  [   32.844301]        trace_call_bpf+0x115/0x270
  [   32.844809]        perf_trace_run_bpf_submit+0x4a/0xc0
  [   32.845411]        perf_trace_sched_switch+0x10f/0x180
  [   32.846014]        __schedule+0x45d/0x880
  [   32.846483]        schedule+0x5f/0xd0
  ...

  [   32.853148] Chain exists of:
  [   32.853148]   &(&sighand->siglock)->rlock --> &p->pi_lock --> &rq->lock
  [   32.853148]
  [   32.854451]  Possible unsafe locking scenario:
  [   32.854451]
  [   32.855173]        CPU0                    CPU1
  [   32.855745]        ----                    ----
  [   32.856278]   lock(&rq->lock);
  [   32.856671]                                lock(&p->pi_lock);
  [   32.857332]                                lock(&rq->lock);
  [   32.857999]   lock(&(&sighand->siglock)->rlock);

  Deadlock happens on CPU0 when it tries to acquire &sighand->siglock
  but it has been held by CPU1 and CPU1 tries to grab &rq->lock
  and cannot get it.

  This is not exactly the callstack in our production environment,
  but sympotom is similar and both locks are using spin_lock_irqsave()
  to acquire the lock, and both involves rq_lock. The fix to delay
  sending signal when irq is disabled also fixed this issue.

Signed-off-by: Yonghong Song <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Cc: Song Liu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Jun 15, 2020
[ Upstream commit 7b01b72 ]

When the nvmem framework is enabled, a nvmem device is created per mtd
device/partition.

It is not uncommon that a device can have multiple mtd devices with
partitions that have the same name. Eg, when there DT overlay is allowed
and the same device with mtd is attached twice.

Under that circumstances, the mtd fails to register due to a name
duplication on the nvmem framework.

With this patch we use the mtdX name instead of the partition name,
which is unique.

[    8.948991] sysfs: cannot create duplicate filename '/bus/nvmem/devices/Production Data'
[    8.948992] CPU: 7 PID: 246 Comm: systemd-udevd Not tainted 5.5.0-qtec-standard #13
[    8.948993] Hardware name: AMD Dibbler/Dibbler, BIOS 05.22.04.0019 10/26/2019
[    8.948994] Call Trace:
[    8.948996]  dump_stack+0x50/0x70
[    8.948998]  sysfs_warn_dup.cold+0x17/0x2d
[    8.949000]  sysfs_do_create_link_sd.isra.0+0xc2/0xd0
[    8.949002]  bus_add_device+0x74/0x140
[    8.949004]  device_add+0x34b/0x850
[    8.949006]  nvmem_register.part.0+0x1bf/0x640
...
[    8.948926] mtd mtd8: Failed to register NVMEM device

Fixes: c4dfa25 ("mtd: add support for reading MTD devices via the nvmem API")
Signed-off-by: Ricardo Ribalda Delgado <[email protected]>
Acked-by: Miquel Raynal <[email protected]>
Signed-off-by: Richard Weinberger <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
fhgwright pushed a commit to fhgwright/bb-linux that referenced this pull request Jul 26, 2020
[ Upstream commit c2e929b ]

Booting a power9 server with hash MMU could trigger an undefined
behaviour because pud_offset(p4d, 0) will do,

0 >> (PAGE_SHIFT:16 + PTE_INDEX_SIZE:8 + H_PMD_INDEX_SIZE:10)

Fix it by converting pud_index() and friends to static inline
functions.

UBSAN: shift-out-of-bounds in arch/powerpc/mm/ptdump/ptdump.c:282:15
shift exponent 34 is too large for 32-bit type 'int'
CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc4-next-20200303+ beagleboard#13
Call Trace:
dump_stack+0xf4/0x164 (unreliable)
ubsan_epilogue+0x18/0x78
__ubsan_handle_shift_out_of_bounds+0x160/0x21c
walk_pagetables+0x2cc/0x700
walk_pud at arch/powerpc/mm/ptdump/ptdump.c:282
(inlined by) walk_pagetables at arch/powerpc/mm/ptdump/ptdump.c:311
ptdump_check_wx+0x8c/0xf0
mark_rodata_ro+0x48/0x80
kernel_init+0x74/0x194
ret_from_kernel_thread+0x5c/0x74

Suggested-by: Christophe Leroy <[email protected]>
Signed-off-by: Qian Cai <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Sep 25, 2020
[ Upstream commit 8dbe4c5 ]

of_find_node_by_name() will do an of_node_put() on the "from" argument.
With CONFIG_OF_DYNAMIC enabled which checks for device_node reference
counts, we would be getting a warning like this:

[    6.347230] refcount_t: increment on 0; use-after-free.
[    6.352498] WARNING: CPU: 3 PID: 77 at lib/refcount.c:156
refcount_inc_checked+0x38/0x44
[    6.360601] Modules linked in:
[    6.363661] CPU: 3 PID: 77 Comm: kworker/3:1 Tainted: G        W
5.4.46-gb78b3e9956e6 #13
[    6.372546] Hardware name: BCM97278SV (DT)
[    6.376649] Workqueue: events deferred_probe_work_func
[    6.381796] pstate: 60000005 (nZCv daif -PAN -UAO)
[    6.386595] pc : refcount_inc_checked+0x38/0x44
[    6.391133] lr : refcount_inc_checked+0x38/0x44
...
[    6.478791] Call trace:
[    6.481243]  refcount_inc_checked+0x38/0x44
[    6.485433]  kobject_get+0x3c/0x4c
[    6.488840]  of_node_get+0x24/0x34
[    6.492247]  of_irq_find_parent+0x3c/0xe0
[    6.496263]  of_irq_parse_one+0xe4/0x1d0
[    6.500191]  irq_of_parse_and_map+0x44/0x84
[    6.504381]  bcm_sf2_sw_probe+0x22c/0x844
[    6.508397]  platform_drv_probe+0x58/0xa8
[    6.512413]  really_probe+0x238/0x3fc
[    6.516081]  driver_probe_device+0x11c/0x12c
[    6.520358]  __device_attach_driver+0xa8/0x100
[    6.524808]  bus_for_each_drv+0xb4/0xd0
[    6.528650]  __device_attach+0xd0/0x164
[    6.532493]  device_initial_probe+0x24/0x30
[    6.536682]  bus_probe_device+0x38/0x98
[    6.540524]  deferred_probe_work_func+0xa8/0xd4
[    6.545061]  process_one_work+0x178/0x288
[    6.549078]  process_scheduled_works+0x44/0x48
[    6.553529]  worker_thread+0x218/0x270
[    6.557285]  kthread+0xdc/0xe4
[    6.560344]  ret_from_fork+0x10/0x18
[    6.563925] ---[ end trace 68f65caf69bb152a ]---

Fix this by adding a of_node_get() to increment the reference count
prior to the call.

Fixes: afa3b59 ("net: dsa: bcm_sf2: Ensure correct sub-node is parsed")
Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Sep 25, 2020
[ Upstream commit e24c644 ]

I compiled with AddressSanitizer and I had these memory leaks while I
was using the tep_parse_format function:

    Direct leak of 28 byte(s) in 4 object(s) allocated from:
        #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe)
        #1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985
        #2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140
        #3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206
        #4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291
        #5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299
        #6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849
        #7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161
        #8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207
        #9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786
        #10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285
        #11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369
        #12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335
        #13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389
        #14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431
        #15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251
        #16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284
        #17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593
        #18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727
        #19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048
        #20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127
        #21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152
        #22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252
        #23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347
        #24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461
        #25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673
        #26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

The token variable in the process_dynamic_array_len function is
allocated in the read_expect_type function, but is not freed before
calling the read_token function.

Free the token variable before calling read_token in order to plug the
leak.

Signed-off-by: Philippe Duplessis-Guindon <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Link: https://lore.kernel.org/linux-trace-devel/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Sep 25, 2020
[ Upstream commit 7fd5b25 ]

After commit 92cc68e ("drm/vblank: Use
spin_(un)lock_irq() in drm_crtc_vblank_on()") omapdrm locking is broken:

WARNING: inconsistent lock state
5.8.0-rc2-00483-g92cc68e35863 #13 Tainted: G        W
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/0/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
ea98222c (&dev->event_lock#2){?.+.}-{2:2}, at: drm_handle_vblank+0x4c/0x520 [drm]
{HARDIRQ-ON-W} state was registered at:
  trace_hardirqs_on+0x9c/0x1ec
  _raw_spin_unlock_irq+0x20/0x58
  omap_crtc_atomic_enable+0x54/0xa0 [omapdrm]
  drm_atomic_helper_commit_modeset_enables+0x218/0x270 [drm_kms_helper]
  omap_atomic_commit_tail+0x48/0xc4 [omapdrm]
  commit_tail+0x9c/0x190 [drm_kms_helper]
  drm_atomic_helper_commit+0x154/0x188 [drm_kms_helper]
  drm_client_modeset_commit_atomic+0x228/0x268 [drm]
  drm_client_modeset_commit_locked+0x60/0x1d0 [drm]
  drm_client_modeset_commit+0x24/0x40 [drm]
  drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xa8 [drm_kms_helper]
  drm_fb_helper_set_par+0x2c/0x5c [drm_kms_helper]
  drm_fb_helper_hotplug_event.part.0+0xa0/0xbc [drm_kms_helper]
  drm_kms_helper_hotplug_event+0x24/0x30 [drm_kms_helper]
  output_poll_execute+0x1a8/0x1c0 [drm_kms_helper]
  process_one_work+0x268/0x800
  worker_thread+0x30/0x4e0
  kthread+0x164/0x190
  ret_from_fork+0x14/0x20

The reason for this is that omapdrm calls drm_crtc_vblank_on() while
holding event_lock taken with spin_lock_irq().

It is not clear why drm_crtc_vblank_on() and drm_crtc_vblank_get() are
called while holding event_lock. I don't see any problem with moving
those calls outside the lock, which is what this patch does.

Signed-off-by: Tomi Valkeinen <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Laurent Pinchart <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Oct 27, 2020
[ Upstream commit b12eea5 ]

The evsel->unit borrows a pointer of pmu event or alias instead of
owns a string.  But tool event (duration_time) passes a result of
strdup() caused a leak.

It was found by ASAN during metric test:

  Direct leak of 210 byte(s) in 70 object(s) allocated from:
    #0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
    #1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414
    #2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414
    #3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
    #4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
    #5 0x559fbbcc95da in __parse_events util/parse-events.c:2141
    #6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406
    #7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393
    #8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415
    #9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498
    #10 0x559fbbc0109b in run_test tests/builtin-test.c:410
    #11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
    #12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
    #13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
    #14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: f0fbb11 ("perf stat: Implement duration_time as a proper event")
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Oct 27, 2020
[ Upstream commit d26383d ]

The following leaks were detected by ASAN:

  Indirect leak of 360 byte(s) in 9 object(s) allocated from:
    #0 0x7fecc305180e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
    #1 0x560578f6dce5 in perf_pmu__new_format util/pmu.c:1333
    #2 0x560578f752fc in perf_pmu_parse util/pmu.y:59
    #3 0x560578f6a8b7 in perf_pmu__format_parse util/pmu.c:73
    #4 0x560578e07045 in test__pmu tests/pmu.c:155
    #5 0x560578de109b in run_test tests/builtin-test.c:410
    #6 0x560578de109b in test_and_print tests/builtin-test.c:440
    #7 0x560578de401a in __cmd_test tests/builtin-test.c:661
    #8 0x560578de401a in cmd_test tests/builtin-test.c:807
    #9 0x560578e49354 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #10 0x560578ce71a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #11 0x560578ce71a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #12 0x560578ce71a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #13 0x7fecc2b7acc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: cff7f95 ("perf tests: Move pmu tests into separate object")
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Oct 27, 2020
[ Upstream commit ca2ef4f ]

A kernel built with KASAN && FTRACE_WITH_REGS && !MODULES, produces a
boot-time splat in the bowels of ftrace:

| [    0.000000] ftrace: allocating 32281 entries in 127 pages
| [    0.000000] ------------[ cut here ]------------
| [    0.000000] WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2019 ftrace_bug+0x27c/0x328
| [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.4.0-rc3-00008-g7f08ae53a7e3 #13
| [    0.000000] Hardware name: linux,dummy-virt (DT)
| [    0.000000] pstate: 60000085 (nZCv daIf -PAN -UAO)
| [    0.000000] pc : ftrace_bug+0x27c/0x328
| [    0.000000] lr : ftrace_init+0x640/0x6cc
| [    0.000000] sp : ffffa000120e7e00
| [    0.000000] x29: ffffa000120e7e00 x28: ffff00006ac01b10
| [    0.000000] x27: ffff00006ac898c0 x26: dfffa00000000000
| [    0.000000] x25: ffffa000120ef290 x24: ffffa0001216df40
| [    0.000000] x23: 000000000000018d x22: ffffa0001244c700
| [    0.000000] x21: ffffa00011bf393c x20: ffff00006ac898c0
| [    0.000000] x19: 00000000ffffffff x18: 0000000000001584
| [    0.000000] x17: 0000000000001540 x16: 0000000000000007
| [    0.000000] x15: 0000000000000000 x14: ffffa00010432770
| [    0.000000] x13: ffff940002483519 x12: 1ffff40002483518
| [    0.000000] x11: 1ffff40002483518 x10: ffff940002483518
| [    0.000000] x9 : dfffa00000000000 x8 : 0000000000000001
| [    0.000000] x7 : ffff940002483519 x6 : ffffa0001241a8c0
| [    0.000000] x5 : ffff940002483519 x4 : ffff940002483519
| [    0.000000] x3 : ffffa00011780870 x2 : 0000000000000001
| [    0.000000] x1 : 1fffe0000d591318 x0 : 0000000000000000
| [    0.000000] Call trace:
| [    0.000000]  ftrace_bug+0x27c/0x328
| [    0.000000]  ftrace_init+0x640/0x6cc
| [    0.000000]  start_kernel+0x27c/0x654
| [    0.000000] random: get_random_bytes called from print_oops_end_marker+0x30/0x60 with crng_init=0
| [    0.000000] ---[ end trace 0000000000000000 ]---
| [    0.000000] ftrace faulted on writing
| [    0.000000] [<ffffa00011bf393c>] _GLOBAL__sub_D_65535_0___tracepoint_initcall_level+0x4/0x28
| [    0.000000] Initializing ftrace call sites
| [    0.000000] ftrace record flags: 0
| [    0.000000]  (0)
| [    0.000000]  expected tramp: ffffa000100b3344

This is due to an unfortunate combination of several factors.

Building with KASAN results in the compiler generating anonymous
functions to register/unregister global variables against the shadow
memory. These functions are placed in .text.startup/.text.exit, and
given mangled names like _GLOBAL__sub_{I,D}_65535_0_$OTHER_SYMBOL. The
kernel linker script places these in .init.text and .exit.text
respectively, which are both discarded at runtime as part of initmem.

Building with FTRACE_WITH_REGS uses -fpatchable-function-entry=2, which
also instruments KASAN's anonymous functions. When these are discarded
with the rest of initmem, ftrace removes dangling references to these
call sites.

Building without MODULES implicitly disables STRICT_MODULE_RWX, and
causes arm64's patch_map() function to treat any !core_kernel_text()
symbol as something that can be modified in-place. As core_kernel_text()
is only true for .text and .init.text, with the latter depending on
system_state < SYSTEM_RUNNING, we'll treat .exit.text as something that
can be patched in-place. However, .exit.text is mapped read-only.

Hence in this configuration the ftrace init code blows up while trying
to patch one of the functions generated by KASAN.

We could try to filter out the call sites in .exit.text rather than
initializing them, but this would be inconsistent with how we handle
.init.text, and requires hooking into core bits of ftrace. The behaviour
of patch_map() is also inconsistent today, so instead let's clean that
up and have it consistently handle .exit.text.

This patch teaches patch_map() to handle .exit.text at init time,
preventing the boot-time splat above. The flow of patch_map() is
reworked to make the logic clearer and minimize redundant
conditionality.

Fixes: 3b23e49 ("arm64: implement ftrace with regs")
Signed-off-by: Mark Rutland <[email protected]>
Cc: Amit Daniel Kachhap <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Torsten Duwe <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request May 14, 2021
commit 4d14c5c upstream

Calling btrfs_qgroup_reserve_meta_prealloc from
btrfs_delayed_inode_reserve_metadata can result in flushing delalloc
while holding a transaction and delayed node locks. This is deadlock
prone. In the past multiple commits:

 * ae5e070 ("btrfs: qgroup: don't try to wait flushing if we're
already holding a transaction")

 * 6f23277 ("btrfs: qgroup: don't commit transaction when we already
 hold the handle")

Tried to solve various aspects of this but this was always a
whack-a-mole game. Unfortunately those 2 fixes don't solve a deadlock
scenario involving btrfs_delayed_node::mutex. Namely, one thread
can call btrfs_dirty_inode as a result of reading a file and modifying
its atime:

  PID: 6963   TASK: ffff8c7f3f94c000  CPU: 2   COMMAND: "test"
  #0  __schedule at ffffffffa529e07d
  #1  schedule at ffffffffa529e4ff
  #2  schedule_timeout at ffffffffa52a1bdd
  #3  wait_for_completion at ffffffffa529eeea             <-- sleeps with delayed node mutex held
  #4  start_delalloc_inodes at ffffffffc0380db5
  #5  btrfs_start_delalloc_snapshot at ffffffffc0393836
  #6  try_flush_qgroup at ffffffffc03f04b2
  #7  __btrfs_qgroup_reserve_meta at ffffffffc03f5bb6     <-- tries to reserve space and starts delalloc inodes.
  #8  btrfs_delayed_update_inode at ffffffffc03e31aa      <-- acquires delayed node mutex
  #9  btrfs_update_inode at ffffffffc0385ba8
 #10  btrfs_dirty_inode at ffffffffc038627b               <-- TRANSACTIION OPENED
 #11  touch_atime at ffffffffa4cf0000
 #12  generic_file_read_iter at ffffffffa4c1f123
 #13  new_sync_read at ffffffffa4ccdc8a
 #14  vfs_read at ffffffffa4cd0849
 #15  ksys_read at ffffffffa4cd0bd1
 #16  do_syscall_64 at ffffffffa4a052eb
 #17  entry_SYSCALL_64_after_hwframe at ffffffffa540008c

This will cause an asynchronous work to flush the delalloc inodes to
happen which can try to acquire the same delayed_node mutex:

  PID: 455    TASK: ffff8c8085fa4000  CPU: 5   COMMAND: "kworker/u16:30"
  #0  __schedule at ffffffffa529e07d
  #1  schedule at ffffffffa529e4ff
  #2  schedule_preempt_disabled at ffffffffa529e80a
  #3  __mutex_lock at ffffffffa529fdcb                    <-- goes to sleep, never wakes up.
  #4  btrfs_delayed_update_inode at ffffffffc03e3143      <-- tries to acquire the mutex
  #5  btrfs_update_inode at ffffffffc0385ba8              <-- this is the same inode that pid 6963 is holding
  #6  cow_file_range_inline.constprop.78 at ffffffffc0386be7
  #7  cow_file_range at ffffffffc03879c1
  #8  btrfs_run_delalloc_range at ffffffffc038894c
  #9  writepage_delalloc at ffffffffc03a3c8f
 #10  __extent_writepage at ffffffffc03a4c01
 #11  extent_write_cache_pages at ffffffffc03a500b
 #12  extent_writepages at ffffffffc03a6de2
 #13  do_writepages at ffffffffa4c277eb
 #14  __filemap_fdatawrite_range at ffffffffa4c1e5bb
 #15  btrfs_run_delalloc_work at ffffffffc0380987         <-- starts running delayed nodes
 #16  normal_work_helper at ffffffffc03b706c
 #17  process_one_work at ffffffffa4aba4e4
 #18  worker_thread at ffffffffa4aba6fd
 #19  kthread at ffffffffa4ac0a3d
 #20  ret_from_fork at ffffffffa54001ff

To fully address those cases the complete fix is to never issue any
flushing while holding the transaction or the delayed node lock. This
patch achieves it by calling qgroup_reserve_meta directly which will
either succeed without flushing or will fail and return -EDQUOT. In the
latter case that return value is going to be propagated to
btrfs_dirty_inode which will fallback to start a new transaction. That's
fine as the majority of time we expect the inode will have
BTRFS_DELAYED_NODE_INODE_DIRTY flag set which will result in directly
copying the in-memory state.

Fixes: c53e965 ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT")
CC: [email protected] # 5.10+
Reviewed-by: Qu Wenruo <[email protected]>
Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: David Sterba <[email protected]>
[sudip: adjust context]
Signed-off-by: Sudip Mukherjee <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Oct 1, 2021
[ Upstream commit c4eb1f4 ]

In __htab_map_lookup_and_delete_batch(), hash buckets are iterated
over to count the number of elements in each bucket (bucket_size).
If bucket_size is large enough, the multiplication to calculate
kvmalloc() size could overflow, resulting in out-of-bounds write
as reported by KASAN:

  [...]
  [  104.986052] BUG: KASAN: vmalloc-out-of-bounds in __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.986489] Write of size 4194224 at addr ffffc9010503be70 by task crash/112
  [  104.986889]
  [  104.987193] CPU: 0 PID: 112 Comm: crash Not tainted 5.14.0-rc4 #13
  [  104.987552] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
  [  104.988104] Call Trace:
  [  104.988410]  dump_stack_lvl+0x34/0x44
  [  104.988706]  print_address_description.constprop.0+0x21/0x140
  [  104.988991]  ? __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.989327]  ? __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.989622]  kasan_report.cold+0x7f/0x11b
  [  104.989881]  ? __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.990239]  kasan_check_range+0x17c/0x1e0
  [  104.990467]  memcpy+0x39/0x60
  [  104.990670]  __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.990982]  ? __wake_up_common+0x4d/0x230
  [  104.991256]  ? htab_of_map_free+0x130/0x130
  [  104.991541]  bpf_map_do_batch+0x1fb/0x220
  [...]

In hashtable, if the elements' keys have the same jhash() value, the
elements will be put into the same bucket. By putting a lot of elements
into a single bucket, the value of bucket_size can be increased to
trigger the integer overflow.

Triggering the overflow is possible for both callers with CAP_SYS_ADMIN
and callers without CAP_SYS_ADMIN.

It will be trivial for a caller with CAP_SYS_ADMIN to intentionally
reach this overflow by enabling BPF_F_ZERO_SEED. As this flag will set
the random seed passed to jhash() to 0, it will be easy for the caller
to prepare keys which will be hashed into the same value, and thus put
all the elements into the same bucket.

If the caller does not have CAP_SYS_ADMIN, BPF_F_ZERO_SEED cannot be
used. However, it will be still technically possible to trigger the
overflow, by guessing the random seed value passed to jhash() (32bit)
and repeating the attempt to trigger the overflow. In this case,
the probability to trigger the overflow will be low and will take
a very long time.

Fix the integer overflow by calling kvmalloc_array() instead of
kvmalloc() to allocate memory.

Fixes: 0579963 ("bpf: Add batch ops to all htab bpf map")
Signed-off-by: Tatsuhiko Yasumatsu <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Oct 1, 2021
commit 41d5854 upstream.

I got several memory leak reports from Asan with a simple command.  It
was because VDSO is not released due to the refcount.  Like in
__dsos_addnew_id(), it should put the refcount after adding to the list.

  $ perf record true
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ]

  =================================================================
  ==692599==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 439 byte(s) in 1 object(s) allocated from:
    #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559bce4aa8ee in dso__new_id util/dso.c:1256
    #2 0x559bce59245a in __machine__addnew_vdso util/vdso.c:132
    #3 0x559bce59245a in machine__findnew_vdso util/vdso.c:347
    #4 0x559bce50826c in map__new util/map.c:175
    #5 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787
    #6 0x559bce512f6b in machines__deliver_event util/session.c:1481
    #7 0x559bce515107 in perf_session__deliver_event util/session.c:1551
    #8 0x559bce51d4d2 in do_flush util/ordered-events.c:244
    #9 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323
    #10 0x559bce519bea in __perf_session__process_events util/session.c:2268
    #11 0x559bce519bea in perf_session__process_events util/session.c:2297
    #12 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017
    #13 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234
    #14 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026
    #15 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858
    #16 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    #17 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    #18 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    #19 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    #20 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308

  Indirect leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559bce520907 in nsinfo__copy util/namespaces.c:169
    #2 0x559bce50821b in map__new util/map.c:168
    #3 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787
    #4 0x559bce512f6b in machines__deliver_event util/session.c:1481
    #5 0x559bce515107 in perf_session__deliver_event util/session.c:1551
    #6 0x559bce51d4d2 in do_flush util/ordered-events.c:244
    #7 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323
    #8 0x559bce519bea in __perf_session__process_events util/session.c:2268
    #9 0x559bce519bea in perf_session__process_events util/session.c:2297
    #10 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017
    #11 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234
    #12 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026
    #13 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858
    #14 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    #15 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    #16 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    #17 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    #18 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308

  SUMMARY: AddressSanitizer: 471 byte(s) leaked in 2 allocation(s).

Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Hanjun Guo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Dec 17, 2021
commit 57f0ff0 upstream.

It's later supposed to be either a correct address or NULL. Without the
initialization, it may contain an undefined value which results in the
following segmentation fault:

  # perf top --sort comm -g --ignore-callees=do_idle

terminates with:

  #0  0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
  #1  0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
  #2  0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
  #3  hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
  #4  0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
      sample_self=sample_self@entry=true) at util/hist.c:657
  #5  0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
      sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
  #6  0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
      at util/hist.c:1056
  #7  iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
  #8  0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
      at util/hist.c:1231
  #9  0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
      at builtin-top.c:842
  #10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
  #11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
  #12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
  #13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
  #14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
  #15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
  #16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
  #17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
  #18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6

If you look at the frame #2, the code is:

488	 if (he->srcline) {
489          he->srcline = strdup(he->srcline);
490          if (he->srcline == NULL)
491              goto err_rawdata;
492	 }

If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
it gets strdupped and strdupping a rubbish random string causes the problem.

Also, if you look at the commit 1fb7d06, it adds the srcline property
into the struct, but not initializing it everywhere needed.

Committer notes:

Now I see, when using --ignore-callees=do_idle we end up here at line
2189 in add_callchain_ip():

2181         if (al.sym != NULL) {
2182                 if (perf_hpp_list.parent && !*parent &&
2183                     symbol__match_regex(al.sym, &parent_regex))
2184                         *parent = al.sym;
2185                 else if (have_ignore_callees && root_al &&
2186                   symbol__match_regex(al.sym, &ignore_callees_regex)) {
2187                         /* Treat this symbol as the root,
2188                            forgetting its callees. */
2189                         *root_al = al;
2190                         callchain_cursor_reset(cursor);
2191                 }
2192         }

And the al that doesn't have the ->srcline field initialized will be
copied to the root_al, so then, back to:

1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
1212                          int max_stack_depth, void *arg)
1213 {
1214         int err, err2;
1215         struct map *alm = NULL;
1216
1217         if (al)
1218                 alm = map__get(al->map);
1219
1220         err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
1221                                         iter->evsel, al, max_stack_depth);
1222         if (err) {
1223                 map__put(alm);
1224                 return err;
1225         }
1226
1227         err = iter->ops->prepare_entry(iter, al);
1228         if (err)
1229                 goto out;
1230
1231         err = iter->ops->add_single_entry(iter, al);
1232         if (err)
1233                 goto out;
1234

That al at line 1221 is what hist_entry_iter__add() (called from
sample__resolve_callchain()) saw as 'root_al', and then:

        iter->ops->add_single_entry(iter, al);

will go on with al->srcline with a bogus value, I'll add the above
sequence to the cset and apply, thanks!

Signed-off-by: Michael Petlan <[email protected]>
CC: Milian Wolff <[email protected]>
Cc: Jiri Olsa <[email protected]>
Fixes: 1fb7d06 ("perf report Use srcline from callchain for hist entries")
Link: https //lore.kernel.org/r/[email protected]
Reported-by: Juri Lelli <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Dec 17, 2021
commit df909df upstream.

ARM: kasan: Fix __get_user_check failure with kasan

In macro __get_user_check defined in arch/arm/include/asm/uaccess.h,
error code is store in register int __e(r0). When kasan is
enabled, assigning value to kernel address might trigger kasan check,
which unexpectedly overwrites r0 and causes undefined behavior on arm
kasan images.

One example is failure in do_futex and results in process soft lockup.
Log:
watchdog: BUG: soft lockup - CPU#0 stuck for 62946ms! [rs:main
Q:Reg:1151]
...
(__asan_store4) from (futex_wait_setup+0xf8/0x2b4)
(futex_wait_setup) from (futex_wait+0x138/0x394)
(futex_wait) from (do_futex+0x164/0xe40)
(do_futex) from (sys_futex_time32+0x178/0x230)
(sys_futex_time32) from (ret_fast_syscall+0x0/0x50)

The soft lockup happens in function futex_wait_setup. The reason is
function get_futex_value_locked always return EINVAL, thus pc jump
back to retry label and causes looping.

This line in function get_futex_value_locked
	ret = __get_user(*dest, from);
is expanded to
	*dest = (typeof(*(p))) __r2; ,
in macro __get_user_check. Writing to pointer dest triggers kasan check
and overwrites the return value of __get_user_x function.
The assembly code of get_futex_value_locked in kernel/futex.c:
...
c01f6dc8:       eb0b020e        bl      c04b7608 <__get_user_4>
// "x = (typeof(*(p))) __r2;" triggers kasan check and r0 is overwritten
c01f6dCc:       e1a00007        mov     r0, r7
c01f6dd0:       e1a05002        mov     r5, r2
c01f6dd4:       eb04f1e6        bl      c0333574 <__asan_store4>
c01f6dd8:       e5875000        str     r5, [r7]
// save ret value of __get_user(*dest, from), which is dest address now
c01f6ddc:       e1a05000        mov     r5, r0
...
// checking return value of __get_user failed
c01f6e00:       e3550000        cmp     r5, #0
...
c01f6e0c:       01a00005        moveq   r0, r5
// assign return value to EINVAL
c01f6e10:       13e0000d        mvnne   r0, #13

Return value is the destination address of get_user thus certainly
non-zero, so get_futex_value_locked always return EINVAL.

Fix it by using a tmp vairable to store the error code before the
assignment. This fix has no effects to non-kasan images thanks to compiler
optimization. It only affects cases that overwrite r0 due to kasan check.

This should fix bug discussed in Link:
[1] https://lore.kernel.org/linux-arm-kernel/[email protected]/

Fixes: 4210157 ("ARM: 9017/2: Enable KASan for ARM")
Signed-off-by: Lexi Shao <[email protected]>
Signed-off-by: Russell King (Oracle) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Dec 17, 2021
[ Upstream commit b93c6a9 ]

When I do fuzz test for bonding device interface, I got the following
use-after-free Calltrace:

==================================================================
BUG: KASAN: use-after-free in bond_enslave+0x1521/0x24f0
Read of size 8 at addr ffff88825bc11c00 by task ifenslave/7365

CPU: 5 PID: 7365 Comm: ifenslave Tainted: G            E     5.15.0-rc1+ #13
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
Call Trace:
 dump_stack_lvl+0x6c/0x8b
 print_address_description.constprop.0+0x48/0x70
 kasan_report.cold+0x82/0xdb
 __asan_load8+0x69/0x90
 bond_enslave+0x1521/0x24f0
 bond_do_ioctl+0x3e0/0x450
 dev_ifsioc+0x2ba/0x970
 dev_ioctl+0x112/0x710
 sock_do_ioctl+0x118/0x1b0
 sock_ioctl+0x2e0/0x490
 __x64_sys_ioctl+0x118/0x150
 do_syscall_64+0x35/0xb0
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f19159cf577
Code: b3 66 90 48 8b 05 11 89 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 78
RSP: 002b:00007ffeb3083c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffeb3084bca RCX: 00007f19159cf577
RDX: 00007ffeb3083ce0 RSI: 0000000000008990 RDI: 0000000000000003
RBP: 00007ffeb3084bc4 R08: 0000000000000040 R09: 0000000000000000
R10: 00007ffeb3084bc0 R11: 0000000000000246 R12: 00007ffeb3083ce0
R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffeb3083cb0

Allocated by task 7365:
 kasan_save_stack+0x23/0x50
 __kasan_kmalloc+0x83/0xa0
 kmem_cache_alloc_trace+0x22e/0x470
 bond_enslave+0x2e1/0x24f0
 bond_do_ioctl+0x3e0/0x450
 dev_ifsioc+0x2ba/0x970
 dev_ioctl+0x112/0x710
 sock_do_ioctl+0x118/0x1b0
 sock_ioctl+0x2e0/0x490
 __x64_sys_ioctl+0x118/0x150
 do_syscall_64+0x35/0xb0
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Freed by task 7365:
 kasan_save_stack+0x23/0x50
 kasan_set_track+0x20/0x30
 kasan_set_free_info+0x24/0x40
 __kasan_slab_free+0xf2/0x130
 kfree+0xd1/0x5c0
 slave_kobj_release+0x61/0x90
 kobject_put+0x102/0x180
 bond_sysfs_slave_add+0x7a/0xa0
 bond_enslave+0x11b6/0x24f0
 bond_do_ioctl+0x3e0/0x450
 dev_ifsioc+0x2ba/0x970
 dev_ioctl+0x112/0x710
 sock_do_ioctl+0x118/0x1b0
 sock_ioctl+0x2e0/0x490
 __x64_sys_ioctl+0x118/0x150
 do_syscall_64+0x35/0xb0
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Last potentially related work creation:
 kasan_save_stack+0x23/0x50
 kasan_record_aux_stack+0xb7/0xd0
 insert_work+0x43/0x190
 __queue_work+0x2e3/0x970
 delayed_work_timer_fn+0x3e/0x50
 call_timer_fn+0x148/0x470
 run_timer_softirq+0x8a8/0xc50
 __do_softirq+0x107/0x55f

Second to last potentially related work creation:
 kasan_save_stack+0x23/0x50
 kasan_record_aux_stack+0xb7/0xd0
 insert_work+0x43/0x190
 __queue_work+0x2e3/0x970
 __queue_delayed_work+0x130/0x180
 queue_delayed_work_on+0xa7/0xb0
 bond_enslave+0xe25/0x24f0
 bond_do_ioctl+0x3e0/0x450
 dev_ifsioc+0x2ba/0x970
 dev_ioctl+0x112/0x710
 sock_do_ioctl+0x118/0x1b0
 sock_ioctl+0x2e0/0x490
 __x64_sys_ioctl+0x118/0x150
 do_syscall_64+0x35/0xb0
 entry_SYSCALL_64_after_hwframe+0x44/0xae

The buggy address belongs to the object at ffff88825bc11c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 0 bytes inside of
 1024-byte region [ffff88825bc11c00, ffff88825bc12000)
The buggy address belongs to the page:
page:ffffea00096f0400 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x25bc10
head:ffffea00096f0400 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x57ff00000010200(slab|head|node=1|zone=2|lastcpupid=0x7ff)
raw: 057ff00000010200 ffffea0009a71c08 ffff888240001968 ffff88810004dbc0
raw: 0000000000000000 00000000000a000a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88825bc11b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88825bc11b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88825bc11c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff88825bc11c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88825bc11d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Put new_slave in bond_sysfs_slave_add() will cause use-after-free problems
when new_slave is accessed in the subsequent error handling process. Since
new_slave will be put in the subsequent error handling process, remove the
unnecessary put to fix it.
In addition, when sysfs_create_file() fails, if some files have been crea-
ted successfully, we need to call sysfs_remove_file() to remove them.
Since there are sysfs_create_files() & sysfs_remove_files() can be used,
use these two functions instead.

Fixes: 7afcaec (bonding: use kobject_put instead of _del after kobject_add)
Signed-off-by: Huang Guobin <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Mar 31, 2022
[ Upstream commit 4224cfd ]

When bringing down the netdevice or system shutdown, a panic can be
triggered while accessing the sysfs path because the device is already
removed.

    [  755.549084] mlx5_core 0000:12:00.1: Shutdown was called
    [  756.404455] mlx5_core 0000:12:00.0: Shutdown was called
    ...
    [  757.937260] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [  758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280

    crash> bt
    ...
    PID: 12649  TASK: ffff8924108f2100  CPU: 1   COMMAND: "amsd"
    ...
     #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778
        [exception RIP: dma_pool_alloc+0x1ab]
        RIP: ffffffff8ee11acb  RSP: ffff89240e1a3968  RFLAGS: 00010046
        RAX: 0000000000000246  RBX: ffff89243d874100  RCX: 0000000000001000
        RDX: 0000000000000000  RSI: 0000000000000246  RDI: ffff89243d874090
        RBP: ffff89240e1a39c0   R8: 000000000001f080   R9: ffff8905ffc03c00
        R10: ffffffffc04680d4  R11: ffffffff8edde9fd  R12: 00000000000080d0
        R13: ffff89243d874090  R14: ffff89243d874080  R15: 0000000000000000
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core]
    #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core]
    #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core]
    #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core]
    #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core]
    #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core]
    #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core]
    #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46
    #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208
    #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3
    #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf
    #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596
    #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10
    #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5
    #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff
    #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f
    #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92

    crash> net_device.state ffff89443b0c0000
      state = 0x5  (__LINK_STATE_START| __LINK_STATE_NOCARRIER)

To prevent this scenario, we also make sure that the netdevice is present.

Signed-off-by: suresh kumar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Aug 26, 2022
[ Upstream commit f75fd4b ]

While booting secondary CPUs, cpus_read_[lock/unlock] is not keeping
online cpumask stable. The transient online mask results in below
calltrace.

[    0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083]
[    0.346652] Detected PIPT I-cache on CPU2
[    0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]
[    0.377255] Detected PIPT I-cache on CPU3
[    0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
[    0.379040] ------------[ cut here ]------------
[    0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138
[    0.384850] Modules linked in:
[    0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13
[    0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[    0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.388308] pc : __flush_work+0x12c/0x138
[    0.388970] lr : __flush_work+0x80/0x138
[    0.389620] sp : ffffffc00aaf3c60
[    0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48
[    0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100
[    0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28
[    0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9
[    0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550
[    0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000
[    0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000
[    0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000
[    0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c
[    0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000
[    0.401886] Call trace:
[    0.402309]  __flush_work+0x12c/0x138
[    0.402941]  schedule_on_each_cpu+0x228/0x278
[    0.403693]  rcu_tasks_rude_wait_gp+0x130/0x144
[    0.404502]  rcu_tasks_kthread+0x220/0x254
[    0.405264]  kthread+0x174/0x1ac
[    0.405837]  ret_from_fork+0x10/0x20
[    0.406456] irq event stamp: 102
[    0.406966] hardirqs last  enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4
[    0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c
[    0.409410] softirqs last  enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c
[    0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c
[    0.411890] ---[ end trace 0000000000000000 ]---
[    0.413000] smp: Brought up 1 node, 4 CPUs
[    0.413762] SMP: Total of 4 processors activated.
[    0.414566] CPU features: detected: 32-bit EL0 Support
[    0.415414] CPU features: detected: 32-bit EL1 Support
[    0.416278] CPU features: detected: CRC32 instructions
[    0.447021] Callback from call_rcu_tasks_rude() invoked.
[    0.506693] Callback from call_rcu_tasks() invoked.

This commit therefore fixes this issue by applying a single-CPU
optimization to the RCU Tasks Rude grace-period process.  The key point
here is that the purpose of this RCU flavor is to force a schedule on
each online CPU since some past event.  But the rcu_tasks_rude_wait_gp()
function runs in the context of the RCU Tasks Rude's grace-period kthread,
so there must already have been a context switch on the current CPU since
the call to either synchronize_rcu_tasks_rude() or call_rcu_tasks_rude().
So if there is only a single CPU online, RCU Tasks Rude's grace-period
kthread does not need to anything at all.

It turns out that the rcu_tasks_rude_wait_gp() function's call to
schedule_on_each_cpu() causes problems during early boot.  During that
time, there is only one online CPU, namely the boot CPU.  Therefore,
applying this single-CPU optimization fixes early-boot instances of
this problem.

Link: https://lore.kernel.org/lkml/[email protected]/T/
Suggested-by: Paul E. McKenney <[email protected]>
Signed-off-by: Padmanabha Srinivasaiah <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Aug 26, 2022
commit f4534c9 upstream.

We got issue as follows:
EXT4-fs error (device loop0) in ext4_reserve_inode_write:5741: Out of memory
EXT4-fs error (device loop0): ext4_setattr:5462: inode #13: comm syz-executor.0: mark_inode_dirty error
EXT4-fs error (device loop0) in ext4_setattr:5519: Out of memory
EXT4-fs error (device loop0): ext4_ind_map_blocks:595: inode #13: comm syz-executor.0: Can't allocate blocks for non-extent mapped inodes with bigalloc
------------[ cut here ]------------
WARNING: CPU: 1 PID: 4361 at fs/ext4/file.c:301 ext4_file_write_iter+0x11c9/0x1220
Modules linked in:
CPU: 1 PID: 4361 Comm: syz-executor.0 Not tainted 5.10.0+ #1
RIP: 0010:ext4_file_write_iter+0x11c9/0x1220
RSP: 0018:ffff924d80b27c00 EFLAGS: 00010282
RAX: ffffffff815a3379 RBX: 0000000000000000 RCX: 000000003b000000
RDX: ffff924d81601000 RSI: 00000000000009cc RDI: 00000000000009cd
RBP: 000000000000000d R08: ffffffffbc5a2c6b R09: 0000902e0e52a96f
R10: ffff902e2b7c1b40 R11: ffff902e2b7c1b40 R12: 000000000000000a
R13: 0000000000000001 R14: ffff902e0e52aa10 R15: ffffffffffffff8b
FS:  00007f81a7f65700(0000) GS:ffff902e3bc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffff600400 CR3: 000000012db88001 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 do_iter_readv_writev+0x2e5/0x360
 do_iter_write+0x112/0x4c0
 do_pwritev+0x1e5/0x390
 __x64_sys_pwritev2+0x7e/0xa0
 do_syscall_64+0x37/0x50
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Above issue may happen as follows:
Assume
inode.i_size=4096
EXT4_I(inode)->i_disksize=4096

step 1: set inode->i_isize = 8192
ext4_setattr
  if (attr->ia_size != inode->i_size)
    EXT4_I(inode)->i_disksize = attr->ia_size;
    rc = ext4_mark_inode_dirty
       ext4_reserve_inode_write
          ext4_get_inode_loc
            __ext4_get_inode_loc
              sb_getblk --> return -ENOMEM
   ...
   if (!error)  ->will not update i_size
     i_size_write(inode, attr->ia_size);
Now:
inode.i_size=4096
EXT4_I(inode)->i_disksize=8192

step 2: Direct write 4096 bytes
ext4_file_write_iter
 ext4_dio_write_iter
   iomap_dio_rw ->return error
 if (extend)
   ext4_handle_inode_extension
     WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize);
->Then trigger warning.

To solve above issue, if mark inode dirty failed in ext4_setattr just
set 'EXT4_I(inode)->i_disksize' with old value.

Signed-off-by: Ye Bin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Aug 26, 2022
[ Upstream commit a625357 ]

Sometimes it is necessary to use a PLT entry to call an ftrace
trampoline. This is handled by ftrace_make_call() and ftrace_make_nop(),
with each having *almost* identical logic, but this is not handled by
ftrace_modify_call() since its introduction in commit:

  3b23e49 ("arm64: implement ftrace with regs")

Due to this, if we ever were to call ftrace_modify_call() for a callsite
which requires a PLT entry for a trampoline, then either:

a) If the old addr requires a trampoline, ftrace_modify_call() will use
   an out-of-range address to generate the 'old' branch instruction.
   This will result in warnings from aarch64_insn_gen_branch_imm() and
   ftrace_modify_code(), and no instructions will be modified. As
   ftrace_modify_call() will return an error, this will result in
   subsequent internal ftrace errors.

b) If the old addr does not require a trampoline, but the new addr does,
   ftrace_modify_call() will use an out-of-range address to generate the
   'new' branch instruction. This will result in warnings from
   aarch64_insn_gen_branch_imm(), and ftrace_modify_code() will replace
   the 'old' branch with a BRK. This will result in a kernel panic when
   this BRK is later executed.

Practically speaking, case (a) is vastly more likely than case (b), and
typically this will result in internal ftrace errors that don't
necessarily affect the rest of the system. This can be demonstrated with
an out-of-tree test module which triggers ftrace_modify_call(), e.g.

| # insmod test_ftrace.ko
| test_ftrace: Function test_function raw=0xffffb3749399201c, callsite=0xffffb37493992024
| branch_imm_common: offset out of range
| branch_imm_common: offset out of range
| ------------[ ftrace bug ]------------
| ftrace failed to modify
| [<ffffb37493992024>] test_function+0x8/0x38 [test_ftrace]
|  actual:   1d:00:00:94
| Updating ftrace call site to call a different ftrace function
| ftrace record flags: e0000002
|  (2) R
|  expected tramp: ffffb374ae42ed54
| ------------[ cut here ]------------
| WARNING: CPU: 0 PID: 165 at kernel/trace/ftrace.c:2085 ftrace_bug+0x280/0x2b0
| Modules linked in: test_ftrace(+)
| CPU: 0 PID: 165 Comm: insmod Not tainted 5.19.0-rc2-00002-g4d9ead8b45ce #13
| Hardware name: linux,dummy-virt (DT)
| pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : ftrace_bug+0x280/0x2b0
| lr : ftrace_bug+0x280/0x2b0
| sp : ffff80000839ba00
| x29: ffff80000839ba00 x28: 0000000000000000 x27: ffff80000839bcf0
| x26: ffffb37493994180 x25: ffffb374b0991c28 x24: ffffb374b0d70000
| x23: 00000000ffffffea x22: ffffb374afcc33b0 x21: ffffb374b08f9cc8
| x20: ffff572b8462c000 x19: ffffb374b08f9000 x18: ffffffffffffffff
| x17: 6c6c6163202c6331 x16: ffffb374ae5ad110 x15: ffffb374b0d51ee4
| x14: 0000000000000000 x13: 3435646532346561 x12: 3437336266666666
| x11: 203a706d61727420 x10: 6465746365707865 x9 : ffffb374ae5149e8
| x8 : 336266666666203a x7 : 706d617274206465 x6 : 00000000fffff167
| x5 : ffff572bffbc4a08 x4 : 00000000fffff167 x3 : 0000000000000000
| x2 : 0000000000000000 x1 : ffff572b84461e00 x0 : 0000000000000022
| Call trace:
|  ftrace_bug+0x280/0x2b0
|  ftrace_replace_code+0x98/0xa0
|  ftrace_modify_all_code+0xe0/0x144
|  arch_ftrace_update_code+0x14/0x20
|  ftrace_startup+0xf8/0x1b0
|  register_ftrace_function+0x38/0x90
|  test_ftrace_init+0xd0/0x1000 [test_ftrace]
|  do_one_initcall+0x50/0x2b0
|  do_init_module+0x50/0x1f0
|  load_module+0x17c8/0x1d64
|  __do_sys_finit_module+0xa8/0x100
|  __arm64_sys_finit_module+0x2c/0x3c
|  invoke_syscall+0x50/0x120
|  el0_svc_common.constprop.0+0xdc/0x100
|  do_el0_svc+0x3c/0xd0
|  el0_svc+0x34/0xb0
|  el0t_64_sync_handler+0xbc/0x140
|  el0t_64_sync+0x18c/0x190
| ---[ end trace 0000000000000000 ]---

We can solve this by consistently determining whether to use a PLT entry
for an address.

Note that since (the earlier) commit:

  f1a54ae ("arm64: module/ftrace: intialize PLT at load time")

... we can consistently determine the PLT address that a given callsite
will use, and therefore ftrace_make_nop() does not need to skip
validation when a PLT is in use.

This patch factors the existing logic out of ftrace_make_call() and
ftrace_make_nop() into a common ftrace_find_callable_addr() helper
function, which is used by ftrace_make_call(), ftrace_make_nop(), and
ftrace_modify_call(). In ftrace_make_nop() the patching is consistently
validated by ftrace_modify_code() as we can always determine what the
old instruction should have been.

Fixes: 3b23e49 ("arm64: implement ftrace with regs")
Signed-off-by: Mark Rutland <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Will Deacon <[email protected]>
Tested-by: "Ivan T. Ivanov" <[email protected]>
Reviewed-by: Chengming Zhou <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
nmenon pushed a commit to nmenon/nm-beagle-linux that referenced this pull request May 22, 2023
[ Upstream commit 93c660c ]

ASAN reports an use-after-free in btf_dump_name_dups:

ERROR: AddressSanitizer: heap-use-after-free on address 0xffff927006db at pc 0xaaaab5dfb618 bp 0xffffdd89b890 sp 0xffffdd89b928
READ of size 2 at 0xffff927006db thread T0
    #0 0xaaaab5dfb614 in __interceptor_strcmp.part.0 (test_progs+0x21b614)
    beagleboard#1 0xaaaab635f144 in str_equal_fn tools/lib/bpf/btf_dump.c:127
    beagleboard#2 0xaaaab635e3e0 in hashmap_find_entry tools/lib/bpf/hashmap.c:143
    beagleboard#3 0xaaaab635e72c in hashmap__find tools/lib/bpf/hashmap.c:212
    beagleboard#4 0xaaaab6362258 in btf_dump_name_dups tools/lib/bpf/btf_dump.c:1525
    beagleboard#5 0xaaaab636240c in btf_dump_resolve_name tools/lib/bpf/btf_dump.c:1552
    beagleboard#6 0xaaaab6362598 in btf_dump_type_name tools/lib/bpf/btf_dump.c:1567
    beagleboard#7 0xaaaab6360b48 in btf_dump_emit_struct_def tools/lib/bpf/btf_dump.c:912
    beagleboard#8 0xaaaab6360630 in btf_dump_emit_type tools/lib/bpf/btf_dump.c:798
    beagleboard#9 0xaaaab635f720 in btf_dump__dump_type tools/lib/bpf/btf_dump.c:282
    beagleboard#10 0xaaaab608523c in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:236
    beagleboard#11 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
    beagleboard#12 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
    beagleboard#13 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
    beagleboard#14 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
    beagleboard#15 0xaaaab5d65990  (test_progs+0x185990)

0xffff927006db is located 11 bytes inside of 16-byte region [0xffff927006d0,0xffff927006e0)
freed by thread T0 here:
    #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4)
    beagleboard#1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191
    beagleboard#2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163
    beagleboard#3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106
    beagleboard#4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157
    beagleboard#5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519
    beagleboard#6 0xaaaab6353e10 in btf__add_field tools/lib/bpf/btf.c:2032
    beagleboard#7 0xaaaab6084fcc in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:232
    beagleboard#8 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
    beagleboard#9 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
    beagleboard#10 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
    beagleboard#11 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
    beagleboard#12 0xaaaab5d65990  (test_progs+0x185990)

previously allocated by thread T0 here:
    #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4)
    beagleboard#1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191
    beagleboard#2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163
    beagleboard#3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106
    beagleboard#4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157
    beagleboard#5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519
    beagleboard#6 0xaaaab6353ff0 in btf_add_enum_common tools/lib/bpf/btf.c:2070
    beagleboard#7 0xaaaab6354080 in btf__add_enum tools/lib/bpf/btf.c:2102
    beagleboard#8 0xaaaab6082f50 in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:162
    beagleboard#9 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
    beagleboard#10 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
    beagleboard#11 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
    beagleboard#12 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
    beagleboard#13 0xaaaab5d65990  (test_progs+0x185990)

The reason is that the key stored in hash table name_map is a string
address, and the string memory is allocated by realloc() function, when
the memory is resized by realloc() later, the old memory may be freed,
so the address stored in name_map references to a freed memory, causing
use-after-free.

Fix it by storing duplicated string address in name_map.

Fixes: 919d2b1 ("libbpf: Allow modification of BTF and add btf__add_str API")
Signed-off-by: Xu Kuohai <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: Martin KaFai Lau <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
nmenon pushed a commit to nmenon/nm-beagle-linux that referenced this pull request May 22, 2023
…g the sock

[ Upstream commit 3cf7203 ]

There is a race condition in vxlan that when deleting a vxlan device
during receiving packets, there is a possibility that the sock is
released after getting vxlan_sock vs from sk_user_data. Then in
later vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will got
NULL pointer dereference. e.g.

   #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757
   beagleboard#1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d
   beagleboard#2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48
   beagleboard#3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b
   beagleboard#4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb
   beagleboard#5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542
   beagleboard#6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62
      [exception RIP: vxlan_ecn_decapsulate+0x3b]
      RIP: ffffffffc1014e7b  RSP: ffffa25ec6978cb0  RFLAGS: 00010246
      RAX: 0000000000000008  RBX: ffff8aa000888000  RCX: 0000000000000000
      RDX: 000000000000000e  RSI: ffff8a9fc7ab803e  RDI: ffff8a9fd1168700
      RBP: ffff8a9fc7ab803e   R8: 0000000000700000   R9: 00000000000010ae
      R10: ffff8a9fcb748980  R11: 0000000000000000  R12: ffff8a9fd1168700
      R13: ffff8aa000888000  R14: 00000000002a0000  R15: 00000000000010ae
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   beagleboard#7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan]
   beagleboard#8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507
   beagleboard#9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45
  beagleboard#10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807
  beagleboard#11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951
  beagleboard#12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde
  beagleboard#13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b
  beagleboard#14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139
  beagleboard#15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a
  beagleboard#16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3
  beagleboard#17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca
  beagleboard#18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3

Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh

Fix this by waiting for all sk_user_data reader to finish before
releasing the sock.

Reported-by: Jianlin Shi <[email protected]>
Suggested-by: Jakub Sitnicki <[email protected]>
Fixes: 6a93cc9 ("udp-tunnel: Add a few more UDP tunnel APIs")
Signed-off-by: Hangbin Liu <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Mar 12, 2024
[ Upstream commit b16904f ]

With latest upstream llvm18, the following test cases failed:

  $ ./test_progs -j
  #13/2    bpf_cookie/multi_kprobe_link_api:FAIL
  #13/3    bpf_cookie/multi_kprobe_attach_api:FAIL
  #13      bpf_cookie:FAIL
  #77      fentry_fexit:FAIL
  #78/1    fentry_test/fentry:FAIL
  #78      fentry_test:FAIL
  #82/1    fexit_test/fexit:FAIL
  #82      fexit_test:FAIL
  #112/1   kprobe_multi_test/skel_api:FAIL
  #112/2   kprobe_multi_test/link_api_addrs:FAIL
  [...]
  #112     kprobe_multi_test:FAIL
  #356/17  test_global_funcs/global_func17:FAIL
  #356     test_global_funcs:FAIL

Further analysis shows llvm upstream patch [1] is responsible for the above
failures. For example, for function bpf_fentry_test7() in net/bpf/test_run.c,
without [1], the asm code is:

  0000000000000400 <bpf_fentry_test7>:
     400: f3 0f 1e fa                   endbr64
     404: e8 00 00 00 00                callq   0x409 <bpf_fentry_test7+0x9>
     409: 48 89 f8                      movq    %rdi, %rax
     40c: c3                            retq
     40d: 0f 1f 00                      nopl    (%rax)

... and with [1], the asm code is:

  0000000000005d20 <bpf_fentry_test7.specialized.1>:
    5d20: e8 00 00 00 00                callq   0x5d25 <bpf_fentry_test7.specialized.1+0x5>
    5d25: c3                            retq

... and <bpf_fentry_test7.specialized.1> is called instead of <bpf_fentry_test7>
and this caused test failures for #13/#77 etc. except #356.

For test case #356/17, with [1] (progs/test_global_func17.c)), the main prog
looks like:

  0000000000000000 <global_func17>:
       0:       b4 00 00 00 2a 00 00 00 w0 = 0x2a
       1:       95 00 00 00 00 00 00 00 exit

... which passed verification while the test itself expects a verification
failure.

Let us add 'barrier_var' style asm code in both places to prevent function
specialization which caused selftests failure.

  [1] llvm/llvm-project#72903

Signed-off-by: Yonghong Song <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Jul 11, 2025
[ Upstream commit 9779d45 ]

The function mbox_chan_received_data() calls the Rx callback of the
mailbox client driver. The callback might set chan_in_use flag from
pcc_send_data(). This flag's status determines whether the PCC channel
is in use.

However, there is a potential race condition where chan_in_use is
updated incorrectly due to concurrency between the interrupt handler
(pcc_mbox_irq()) and the command sender(pcc_send_data()).

The 'chan_in_use' flag of a channel is set to true after sending a
command. And the flag of the new command may be cleared erroneous by
the interrupt handler afer mbox_chan_received_data() returns,

As a result, the interrupt being level triggered can't be cleared in
pcc_mbox_irq() and it will be disabled after the number of handled times
exceeds the specified value. The error log is as follows:

  |  kunpeng_hccs HISI04B2:00: PCC command executed timeout!
  |  kunpeng_hccs HISI04B2:00: get port link status info failed, ret = -110
  |  irq 13: nobody cared (try booting with the "irqpoll" option)
  |  Call trace:
  |   dump_backtrace+0x0/0x210
  |   show_stack+0x1c/0x2c
  |   dump_stack+0xec/0x130
  |   __report_bad_irq+0x50/0x190
  |   note_interrupt+0x1e4/0x260
  |   handle_irq_event+0x144/0x17c
  |   handle_fasteoi_irq+0xd0/0x240
  |   __handle_domain_irq+0x80/0xf0
  |   gic_handle_irq+0x74/0x2d0
  |   el1_irq+0xbc/0x140
  |   mnt_clone_write+0x0/0x70
  |   file_update_time+0xcc/0x160
  |   fault_dirty_shared_page+0xe8/0x150
  |   do_shared_fault+0x80/0x1d0
  |   do_fault+0x118/0x1a4
  |   handle_pte_fault+0x154/0x230
  |   __handle_mm_fault+0x1ac/0x390
  |   handle_mm_fault+0xf0/0x250
  |   do_page_fault+0x184/0x454
  |   do_translation_fault+0xac/0xd4
  |   do_mem_abort+0x44/0xb4
  |   el0_da+0x40/0x74
  |   el0_sync_handler+0x60/0xb4
  |   el0_sync+0x168/0x180
  |  handlers:
  |   pcc_mbox_irq
  |  Disabling IRQ #13

To solve this issue, pcc_mbox_irq() must clear 'chan_in_use' flag before
the call to mbox_chan_received_data().

Tested-by: Adam Young <[email protected]>
Tested-by: Robbie King <[email protected]>
Signed-off-by: Huisong Li <[email protected]>
(sudeep.holla: Minor updates to the subject, commit message and comment)
Signed-off-by: Sudeep Holla <[email protected]>
Signed-off-by: Jassi Brar <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Jul 11, 2025
[ Upstream commit 88f7f56 ]

When a bio with REQ_PREFLUSH is submitted to dm, __send_empty_flush()
generates a flush_bio with REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC,
which causes the flush_bio to be throttled by wbt_wait().

An example from v5.4, similar problem also exists in upstream:

    crash> bt 2091206
    PID: 2091206  TASK: ffff2050df92a300  CPU: 109  COMMAND: "kworker/u260:0"
     #0 [ffff800084a2f7f0] __switch_to at ffff80004008aeb8
     #1 [ffff800084a2f820] __schedule at ffff800040bfa0c4
     #2 [ffff800084a2f880] schedule at ffff800040bfa4b4
     #3 [ffff800084a2f8a0] io_schedule at ffff800040bfa9c4
     #4 [ffff800084a2f8c0] rq_qos_wait at ffff8000405925bc
     #5 [ffff800084a2f940] wbt_wait at ffff8000405bb3a0
     #6 [ffff800084a2f9a0] __rq_qos_throttle at ffff800040592254
     #7 [ffff800084a2f9c0] blk_mq_make_request at ffff80004057cf38
     #8 [ffff800084a2fa60] generic_make_request at ffff800040570138
     #9 [ffff800084a2fae0] submit_bio at ffff8000405703b4
    #10 [ffff800084a2fb50] xlog_write_iclog at ffff800001280834 [xfs]
    #11 [ffff800084a2fbb0] xlog_sync at ffff800001280c3c [xfs]
    #12 [ffff800084a2fbf0] xlog_state_release_iclog at ffff800001280df4 [xfs]
    #13 [ffff800084a2fc10] xlog_write at ffff80000128203c [xfs]
    #14 [ffff800084a2fcd0] xlog_cil_push at ffff8000012846dc [xfs]
    #15 [ffff800084a2fda0] xlog_cil_push_work at ffff800001284a2c [xfs]
    #16 [ffff800084a2fdb0] process_one_work at ffff800040111d08
    #17 [ffff800084a2fe00] worker_thread at ffff8000401121cc
    #18 [ffff800084a2fe70] kthread at ffff800040118de4

After commit 2def284 ("xfs: don't allow log IO to be throttled"),
the metadata submitted by xlog_write_iclog() should not be throttled.
But due to the existence of the dm layer, throttling flush_bio indirectly
causes the metadata bio to be throttled.

Fix this by conditionally adding REQ_IDLE to flush_bio.bi_opf, which makes
wbt_should_throttle() return false to avoid wbt_wait().

Signed-off-by: Jinliang Zheng <[email protected]>
Reviewed-by: Tianxiang Peng <[email protected]>
Reviewed-by: Hao Peng <[email protected]>
Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Jul 15, 2025
[ Upstream commit 4dde20b ]

When the binary path is excessively long, the generated probe_name in libbpf
exceeds the kernel's MAX_EVENT_NAME_LEN limit (64 bytes).
This causes legacy uprobe event attachment to fail with error code -22.

The fix reorders the fields to place the unique ID before the name.
This ensures that even if truncation occurs via snprintf, the unique ID
remains intact, preserving event name uniqueness. Additionally, explicit
checks with MAX_EVENT_NAME_LEN are added to enforce length constraints.

Before Fix:
	./test_progs -t attach_probe/kprobe-long_name
	......
	libbpf: failed to add legacy kprobe event for 'bpf_testmod_looooooooooooooooooooooooooooooong_name+0x0': -EINVAL
	libbpf: prog 'handle_kprobe': failed to create kprobe 'bpf_testmod_looooooooooooooooooooooooooooooong_name+0x0' perf event: -EINVAL
	test_attach_kprobe_long_event_name:FAIL:attach_kprobe_long_event_name unexpected error: -22
	test_attach_probe:PASS:uprobe_ref_ctr_cleanup 0 nsec
	#13/11   attach_probe/kprobe-long_name:FAIL
	#13      attach_probe:FAIL

	./test_progs -t attach_probe/uprobe-long_name
	......
	libbpf: failed to add legacy uprobe event for /root/linux-bpf/bpf-next/tools/testing/selftests/bpf/test_progs:0x13efd9: -EINVAL
	libbpf: prog 'handle_uprobe': failed to create uprobe '/root/linux-bpf/bpf-next/tools/testing/selftests/bpf/test_progs:0x13efd9' perf event: -EINVAL
	test_attach_uprobe_long_event_name:FAIL:attach_uprobe_long_event_name unexpected error: -22
	#13/10   attach_probe/uprobe-long_name:FAIL
	#13      attach_probe:FAIL
After Fix:
	./test_progs -t attach_probe/uprobe-long_name
	#13/10   attach_probe/uprobe-long_name:OK
	#13      attach_probe:OK
	Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

	./test_progs -t attach_probe/kprobe-long_name
	#13/11   attach_probe/kprobe-long_name:OK
	#13      attach_probe:OK
	Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

Fixes: 46ed5fc ("libbpf: Refactor and simplify legacy kprobe code")
Fixes: cc10623 ("libbpf: Add legacy uprobe attaching support")
Signed-off-by: Hengqi Chen <[email protected]>
Signed-off-by: Feng Yang <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Jul 31, 2025
[ Upstream commit eedf3e3 ]

ACPICA commit 1c28da2242783579d59767617121035dafba18c3

This was originally done in NetBSD:
NetBSD/src@b69d1ac
and is the correct alternative to the smattering of `memcpy`s I
previously contributed to this repository.

This also sidesteps the newly strict checks added in UBSAN:
llvm/llvm-project@7926744

Before this change we see the following UBSAN stack trace in Fuchsia:

  #0    0x000021afcfdeca5e in acpi_rs_get_address_common(struct acpi_resource*, union aml_resource*) ../../third_party/acpica/source/components/resources/rsaddr.c:329 <platform-bus-x86.so>+0x6aca5e
  #1.2  0x000021982bc4af3c in ubsan_get_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:41 <libclang_rt.asan.so>+0x41f3c
  #1.1  0x000021982bc4af3c in maybe_print_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:51 <libclang_rt.asan.so>+0x41f3c
  #1    0x000021982bc4af3c in ~scoped_report() compiler-rt/lib/ubsan/ubsan_diag.cpp:395 <libclang_rt.asan.so>+0x41f3c
  #2    0x000021982bc4bb6f in handletype_mismatch_impl() compiler-rt/lib/ubsan/ubsan_handlers.cpp:137 <libclang_rt.asan.so>+0x42b6f
  #3    0x000021982bc4b723 in __ubsan_handle_type_mismatch_v1 compiler-rt/lib/ubsan/ubsan_handlers.cpp:142 <libclang_rt.asan.so>+0x42723
  #4    0x000021afcfdeca5e in acpi_rs_get_address_common(struct acpi_resource*, union aml_resource*) ../../third_party/acpica/source/components/resources/rsaddr.c:329 <platform-bus-x86.so>+0x6aca5e
  #5    0x000021afcfdf2089 in acpi_rs_convert_aml_to_resource(struct acpi_resource*, union aml_resource*, struct acpi_rsconvert_info*) ../../third_party/acpica/source/components/resources/rsmisc.c:355 <platform-bus-x86.so>+0x6b2089
  #6    0x000021afcfded169 in acpi_rs_convert_aml_to_resources(u8*, u32, u32, u8, void**) ../../third_party/acpica/source/components/resources/rslist.c:137 <platform-bus-x86.so>+0x6ad169
  #7    0x000021afcfe2d24a in acpi_ut_walk_aml_resources(struct acpi_walk_state*, u8*, acpi_size, acpi_walk_aml_callback, void**) ../../third_party/acpica/source/components/utilities/utresrc.c:237 <platform-bus-x86.so>+0x6ed24a
  #8    0x000021afcfde66b7 in acpi_rs_create_resource_list(union acpi_operand_object*, struct acpi_buffer*) ../../third_party/acpica/source/components/resources/rscreate.c:199 <platform-bus-x86.so>+0x6a66b7
  #9    0x000021afcfdf6979 in acpi_rs_get_method_data(acpi_handle, const char*, struct acpi_buffer*) ../../third_party/acpica/source/components/resources/rsutils.c:770 <platform-bus-x86.so>+0x6b6979
  #10   0x000021afcfdf708f in acpi_walk_resources(acpi_handle, char*, acpi_walk_resource_callback, void*) ../../third_party/acpica/source/components/resources/rsxface.c:731 <platform-bus-x86.so>+0x6b708f
  #11   0x000021afcfa95dcf in acpi::acpi_impl::walk_resources(acpi::acpi_impl*, acpi_handle, const char*, acpi::Acpi::resources_callable) ../../src/devices/board/lib/acpi/acpi-impl.cc:41 <platform-bus-x86.so>+0x355dcf
  #12   0x000021afcfaa8278 in acpi::device_builder::gather_resources(acpi::device_builder*, acpi::Acpi*, fidl::any_arena&, acpi::Manager*, acpi::device_builder::gather_resources_callback) ../../src/devices/board/lib/acpi/device-builder.cc:84 <platform-bus-x86.so>+0x368278
  #13   0x000021afcfbddb87 in acpi::Manager::configure_discovered_devices(acpi::Manager*) ../../src/devices/board/lib/acpi/manager.cc:75 <platform-bus-x86.so>+0x49db87
  #14   0x000021afcf99091d in publish_acpi_devices(acpi::Manager*, zx_device_t*, zx_device_t*) ../../src/devices/board/drivers/x86/acpi-nswalk.cc:95 <platform-bus-x86.so>+0x25091d
  #15   0x000021afcf9c1d4e in x86::X86::do_init(x86::X86*) ../../src/devices/board/drivers/x86/x86.cc:60 <platform-bus-x86.so>+0x281d4e
  #16   0x000021afcf9e33ad in λ(x86::X86::ddk_init::(anon class)*) ../../src/devices/board/drivers/x86/x86.cc:77 <platform-bus-x86.so>+0x2a33ad
  #17   0x000021afcf9e313e in fit::internal::target<(lambda at../../src/devices/board/drivers/x86/x86.cc:76:19), false, false, std::__2::allocator<std::byte>, void>::invoke(void*) ../../sdk/lib/fit/include/lib/fit/internal/function.h:183 <platform-bus-x86.so>+0x2a313e
  #18   0x000021afcfbab4c7 in fit::internal::function_base<16UL, false, void(), std::__2::allocator<std::byte>>::invoke(const fit::internal::function_base<16UL, false, void (), std::__2::allocator<std::byte> >*) ../../sdk/lib/fit/include/lib/fit/internal/function.h:522 <platform-bus-x86.so>+0x46b4c7
  #19   0x000021afcfbab342 in fit::function_impl<16UL, false, void(), std::__2::allocator<std::byte>>::operator()(const fit::function_impl<16UL, false, void (), std::__2::allocator<std::byte> >*) ../../sdk/lib/fit/include/lib/fit/function.h:315 <platform-bus-x86.so>+0x46b342
  #20   0x000021afcfcd98c3 in async::internal::retained_task::Handler(async_dispatcher_t*, async_task_t*, zx_status_t) ../../sdk/lib/async/task.cc:24 <platform-bus-x86.so>+0x5998c3
  #21   0x00002290f9924616 in λ(const driver_runtime::Dispatcher::post_task::(anon class)*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, zx_status_t) ../../src/devices/bin/driver_runtime/dispatcher.cc:789 <libdriver_runtime.so>+0x10a616
  #22   0x00002290f9924323 in fit::internal::target<(lambda at../../src/devices/bin/driver_runtime/dispatcher.cc:788:7), true, false, std::__2::allocator<std::byte>, void, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request>>, int>::invoke(void*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int) ../../sdk/lib/fit/include/lib/fit/internal/function.h:128 <libdriver_runtime.so>+0x10a323
  #23   0x00002290f9904b76 in fit::internal::function_base<24UL, true, void(std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request>>, int), std::__2::allocator<std::byte>>::invoke(const fit::internal::function_base<24UL, true, void (std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int) ../../sdk/lib/fit/include/lib/fit/internal/function.h:522 <libdriver_runtime.so>+0xeab76
  #24   0x00002290f9904831 in fit::callback_impl<24UL, true, void(std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request>>, int), std::__2::allocator<std::byte>>::operator()(fit::callback_impl<24UL, true, void (std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, int) ../../sdk/lib/fit/include/lib/fit/function.h:471 <libdriver_runtime.so>+0xea831
  #25   0x00002290f98d5adc in driver_runtime::callback_request::Call(driver_runtime::callback_request*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >, zx_status_t) ../../src/devices/bin/driver_runtime/callback_request.h:74 <libdriver_runtime.so>+0xbbadc
  #26   0x00002290f98e1e58 in driver_runtime::Dispatcher::dispatch_callback(driver_runtime::Dispatcher*, std::__2::unique_ptr<driver_runtime::callback_request, std::__2::default_delete<driver_runtime::callback_request> >) ../../src/devices/bin/driver_runtime/dispatcher.cc:1248 <libdriver_runtime.so>+0xc7e58
  #27   0x00002290f98e4159 in driver_runtime::Dispatcher::dispatch_callbacks(driver_runtime::Dispatcher*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../src/devices/bin/driver_runtime/dispatcher.cc:1308 <libdriver_runtime.so>+0xca159
  #28   0x00002290f9918414 in λ(const driver_runtime::Dispatcher::create_with_adder::(anon class)*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../src/devices/bin/driver_runtime/dispatcher.cc:353 <libdriver_runtime.so>+0xfe414
  #29   0x00002290f991812d in fit::internal::target<(lambda at../../src/devices/bin/driver_runtime/dispatcher.cc:351:7), true, false, std::__2::allocator<std::byte>, void, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter>>, fbl::ref_ptr<driver_runtime::Dispatcher>>::invoke(void*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../sdk/lib/fit/include/lib/fit/internal/function.h:128 <libdriver_runtime.so>+0xfe12d
  #30   0x00002290f9906fc7 in fit::internal::function_base<8UL, true, void(std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter>>, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte>>::invoke(const fit::internal::function_base<8UL, true, void (std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../sdk/lib/fit/include/lib/fit/internal/function.h:522 <libdriver_runtime.so>+0xecfc7
  #31   0x00002290f9906c66 in fit::function_impl<8UL, true, void(std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter>>, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte>>::operator()(const fit::function_impl<8UL, true, void (std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>), std::__2::allocator<std::byte> >*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../sdk/lib/fit/include/lib/fit/function.h:315 <libdriver_runtime.so>+0xecc66
  #32   0x00002290f98e73d9 in driver_runtime::Dispatcher::event_waiter::invoke_callback(driver_runtime::Dispatcher::event_waiter*, std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, fbl::ref_ptr<driver_runtime::Dispatcher>) ../../src/devices/bin/driver_runtime/dispatcher.h:543 <libdriver_runtime.so>+0xcd3d9
  #33   0x00002290f98e700d in driver_runtime::Dispatcher::event_waiter::handle_event(std::__2::unique_ptr<driver_runtime::Dispatcher::event_waiter, std::__2::default_delete<driver_runtime::Dispatcher::event_waiter> >, async_dispatcher_t*, async::wait_base*, zx_status_t, zx_packet_signal_t const*) ../../src/devices/bin/driver_runtime/dispatcher.cc:1442 <libdriver_runtime.so>+0xcd00d
  #34   0x00002290f9918983 in async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>::handle_event(async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>*, async_dispatcher_t*, async::wait_base*, zx_status_t, zx_packet_signal_t const*) ../../src/devices/bin/driver_runtime/async_loop_owned_event_handler.h:59 <libdriver_runtime.so>+0xfe983
  #35   0x00002290f9918b9e in async::wait_method<async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>, &async_loop_owned_event_handler<driver_runtime::Dispatcher::event_waiter>::handle_event>::call_handler(async_dispatcher_t*, async_wait_t*, zx_status_t, zx_packet_signal_t const*) ../../sdk/lib/async/include/lib/async/cpp/wait.h:201 <libdriver_runtime.so>+0xfeb9e
  #36   0x00002290f99bf509 in async_loop_dispatch_wait(async_loop_t*, async_wait_t*, zx_status_t, zx_packet_signal_t const*) ../../sdk/lib/async-loop/loop.c:394 <libdriver_runtime.so>+0x1a5509
  #37   0x00002290f99b9958 in async_loop_run_once(async_loop_t*, zx_time_t) ../../sdk/lib/async-loop/loop.c:343 <libdriver_runtime.so>+0x19f958
  #38   0x00002290f99b9247 in async_loop_run(async_loop_t*, zx_time_t, _Bool) ../../sdk/lib/async-loop/loop.c:301 <libdriver_runtime.so>+0x19f247
  #39   0x00002290f99ba962 in async_loop_run_thread(void*) ../../sdk/lib/async-loop/loop.c:860 <libdriver_runtime.so>+0x1a0962
  #40   0x000041afd176ef30 in start_c11(void*) ../../zircon/third_party/ulib/musl/pthread/pthread_create.c:63 <libc.so>+0x84f30
  #41   0x000041afd18a448d in thread_trampoline(uintptr_t, uintptr_t) ../../zircon/system/ulib/runtime/thread.cc:100 <libc.so>+0x1ba48d

Link: acpica/acpica@1c28da22
Signed-off-by: Rafael J. Wysocki <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Tamir Duberstein <[email protected]>
[ rjw: Pick up the tag from Tamir ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Aug 27, 2025
commit 312d02a upstream.

The OP-TEE driver registers the function notif_callback() for FF-A
notifications. However, this function is called in an atomic context
leading to errors like this when processing asynchronous notifications:

 | BUG: sleeping function called from invalid context at kernel/locking/mutex.c:258
 | in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 9, name: kworker/0:0
 | preempt_count: 1, expected: 0
 | RCU nest depth: 0, expected: 0
 | CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.14.0-00019-g657536ebe0aa #13
 | Hardware name: linux,dummy-virt (DT)
 | Workqueue: ffa_pcpu_irq_notification notif_pcpu_irq_work_fn
 | Call trace:
 |  show_stack+0x18/0x24 (C)
 |  dump_stack_lvl+0x78/0x90
 |  dump_stack+0x18/0x24
 |  __might_resched+0x114/0x170
 |  __might_sleep+0x48/0x98
 |  mutex_lock+0x24/0x80
 |  optee_get_msg_arg+0x7c/0x21c
 |  simple_call_with_arg+0x50/0xc0
 |  optee_do_bottom_half+0x14/0x20
 |  notif_callback+0x3c/0x48
 |  handle_notif_callbacks+0x9c/0xe0
 |  notif_get_and_handle+0x40/0x88
 |  generic_exec_single+0x80/0xc0
 |  smp_call_function_single+0xfc/0x1a0
 |  notif_pcpu_irq_work_fn+0x2c/0x38
 |  process_one_work+0x14c/0x2b4
 |  worker_thread+0x2e4/0x3e0
 |  kthread+0x13c/0x210
 |  ret_from_fork+0x10/0x20

Fix this by adding work queue to process the notification in a
non-atomic context.

Fixes: d0476a5 ("optee: ffa_abi: add asynchronous notifications")
Cc: [email protected]
Reviewed-by: Sumit Garg <[email protected]>
Tested-by: Sudeep Holla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Wiklander <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
RobertCNelson pushed a commit that referenced this pull request Aug 27, 2025
commit 228af5a upstream.

The commit 86bc882 ("staging: vchiq_arm: Create keep-alive thread
during probe") introduced a regression for certain configurations,
which doesn't have a VCHIQ user. This results in a unused and hanging
keep-alive thread:

  INFO: task vchiq-keep/0:85 blocked for more than 120 seconds.
        Not tainted 6.12.34-v8-+ #13
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  task:vchiq-keep/0    state:D stack:0 pid:85    tgid:85    ppid:2
  Call trace:
   __switch_to+0x188/0x230
   __schedule+0xa54/0xb28
   schedule+0x80/0x120
   schedule_preempt_disabled+0x30/0x50
   kthread+0xd4/0x1a0
   ret_from_fork+0x10/0x20

Fixes: 86bc882 ("staging: vchiq_arm: Create keep-alive thread during probe")
Reported-by: Maíra Canal <[email protected]>
Closes: https://lore.kernel.org/linux-staging/[email protected]/
Cc: [email protected]
Signed-off-by: Stefan Wahren <[email protected]>
Reviewed-by: Maíra Canal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants