ksmbd rdma reads hitting kernel panic 6.5.0 on arm server #499

@varadakari

Description

Testing ksmbd on a 64-bit ARM server running Ubuntu with kernel 6.5.0 hits the following panic.

Jan  2 11:58:29 ss193 kernel: [10638.510907] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
Jan  2 11:58:29 ss193 kernel: [10638.519728] Mem abort info:
Jan  2 11:58:29 ss193 kernel: [10638.522526]   ESR = 0x0000000096000004
Jan  2 11:58:29 ss193 kernel: [10638.526268]   EC = 0x25: DABT (current EL), IL = 32 bits
Jan  2 11:58:29 ss193 kernel: [10638.531573]   SET = 0, FnV = 0
Jan  2 11:58:29 ss193 kernel: [10638.534622]   EA = 0, S1PTW = 0
Jan  2 11:58:29 ss193 kernel: [10638.537758]   FSC = 0x04: level 0 translation fault
Jan  2 11:58:29 ss193 kernel: [10638.542630] Data abort info:
Jan  2 11:58:29 ss193 kernel: [10638.545508]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
Jan  2 11:58:29 ss193 kernel: [10638.550987]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
Jan  2 11:58:29 ss193 kernel: [10638.556032]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
Jan  2 11:58:29 ss193 kernel: [10638.561339] user pgtable: 4k pages, 48-bit VAs, pgdp=00000006ee4f3000
Jan  2 11:58:29 ss193 kernel: [10638.567774] [0000000000000038] pgd=0000000000000000, p4d=0000000000000000
Jan  2 11:58:29 ss193 kernel: [10638.574562] Internal error: Oops: 0000000096000004 [#1] SMP
Jan  2 11:58:31 ss193 kernel: [10638.580123] Modules linked in: ksmbd(OE) nls_utf8 libdes rpcrdma rdma_cm iw_cm ib_cm sbsa_gwdt ipmi_ssif ipmi_devintf ipmi_msghandler nvme_fabrics target_core_mod 8021q garp mrp stp llc overlay binfmt_misc nls_iso8859_1 sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua zfs(POE) spl(OE) efi_pstore drm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear dw_mmc_bluefield dw_mmc_pltfm dw_mmc mlx5_ib ib_uverbs ib_core mlx5_core mlxfw crct10dif_ce nvme psample nvme_core tls sdhci_of_dwcmshc nvme_common sdhci_pltfm vitesse pci_hyperv_intf sdhci  aes_neon_bs aes_neon_blk [last unloaded: crc32_generic]
Jan  2 11:58:31 ss193 kernel: [10638.656521] CPU: 8 PID: 575244 Comm: ksmbd:r445 Tainted: P           OE      6.5.0-45-generic #45~22.04.1-Ubuntu
Jan  2 11:58:31 ss193 kernel: [10638.679529] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
Jan  2 11:58:31 ss193 kernel: [10638.686478] pc : smb_direct_read+0x1cc/0x3f8 [ksmbd]
Jan  2 11:58:31 ss193 kernel: [10638.691446] lr : ksmbd_conn_handler_loop+0x18c/0x440 [ksmbd]
Jan  2 11:58:31 ss193 kernel: [10638.697102] sp : ffff8000f9313d00
Jan  2 11:58:31 ss193 kernel: [10638.700403] x29: ffff8000f9313d00 x28: 0000000000000000 x27: 0000000000000000
Jan  2 11:58:31 ss193 kernel: [10638.707527] x26: 0000000000000000 x25: ffffc8e19edd9188 x24: ffff00027e4aec70
Jan  2 11:58:31 ss193 kernel: [10638.714650] x23: ffffc8e19ede9f68 x22: 0000000000000004 x21: ffff8000f9313e64
Jan  2 11:58:31 ss193 kernel: [10638.721773] x20: 0000000000000004 x19: ffff00027e4aec00 x18: ffff80008d8fd018
Jan  2 11:58:31 ss193 kernel: [10638.728896] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000005
Jan  2 11:58:31 ss193 kernel: [10638.736019] x14: 0000000000000000 x13: 0000000000000000 x12: 0000003d63a78000
Jan  2 11:58:31 ss193 kernel: [10638.743142] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffc8e19edab11c
Jan  2 11:58:31 ss193 kernel: [10638.750264] x8 : ffff0008140231c0 x7 : 0000000000000000 x6 : 0000000000000000
Jan  2 11:58:31 ss193 kernel: [10638.757386] x5 : 0000000000000000 x4 : ffffc8e19edcf208 x3 : ffff00028298c300
Jan  2 11:58:31 ss193 kernel: [10638.764509] x2 : 0000000000000039 x1 : 0000000000000001 x0 : ffff00027e4aec70
Jan  2 11:58:31 ss193 kernel: [10638.771632] Call trace:
Jan  2 11:58:31 ss193 kernel: [10638.774068]  smb_direct_read+0x1cc/0x3f8 [ksmbd]
Jan  2 11:58:31 ss193 kernel: [10638.778682]  ksmbd_conn_handler_loop+0x18c/0x440 [ksmbd]
Jan  2 11:58:31 ss193 kernel: [10638.783989]  kthread+0x100/0x118
Jan  2 11:58:31 ss193 kernel: [10638.787212]  ret_from_fork+0x10/0x20
Jan  2 11:58:31 ss193 kernel: [10638.790778] Code: 54000920 f9403a63 d100207b 9100e762 (3940e377)
Jan  2 11:58:31 ss193 kernel: [10638.796860] ---[ end trace 0000000000000000 ]---

The module was compiled on the same machine from commit 7390347.
We hit this issue consistently. The client machine is running Rocky Linux 8.10. Mounting and writes are successful; the panic occurs on reads only. The workload is fio running 8 threads on the client against 8 different files.

Please let me know if you need any more information.
