* [GIT PULL] slab updates for 6.11
@ 2024-09-16 9:45 Vlastimil Babka
2024-09-18 7:06 ` Linus Torvalds
2024-09-18 8:10 ` pr-tracker-bot
0 siblings, 2 replies; 10+ messages in thread
From: Vlastimil Babka @ 2024-09-16 9:45 UTC (permalink / raw)
To: Linus Torvalds
Cc: David Rientjes, Christoph Lameter, Andrew Morton, linux-mm, LKML,
Roman Gushchin, Hyeonggon Yoo, Christian Brauner, RCU,
Shakeel Butt, Uladzislau Rezki (Sony)
Hi Linus,
please pull the latest slab updates from:
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git tags/slab-for-6.12
There's a small conflict with the rcu tree:
https://lore.kernel.org/lkml/20240812124748.3725011b@canb.auug.org.au/
For Christian's kmem_cache_create() refactoring series, a commit from the
vfs.file tree was merged as a prerequisite, and you have already merged
vfs.file. Because vfs.file is based on v6.11-rc4 and slab on v6.11-rc5, the
diffstat generated by git request-pull below unfortunately also seems to
contain changes from rc5 commits. The shortlog is accurate.
Thanks,
Vlastimil
======================================
This time it's mostly refactoring and improving APIs for slab users in the
kernel, along with some debugging improvements.
* kmem_cache_create() refactoring (Christian Brauner)
Over the years, kmem_cache_create() has been growing new parameters, most of
which are needed only for a small number of caches - most recently the
rcu_freeptr_offset parameter. To avoid adding new parameters to
kmem_cache_create() and adjusting all its callers, or creating new wrappers
such as kmem_cache_create_rcu(), we can now pass extra parameters using the
new struct kmem_cache_args. Fields that are not explicitly initialized default
to values interpreted as unused. kmem_cache_create() is for now a wrapper that
works with both the new form: kmem_cache_create(name, object_size, args, flags)
and the legacy form: kmem_cache_create(name, object_size, align, flags, ctor)
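One way a single name can accept both call forms is C11 _Generic dispatch on
the type of the third argument. The following is a minimal userspace sketch of
that idea, not the kernel's implementation: every demo_* name, the struct
layout and the single static slot are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
struct demo_cache_args {
	unsigned int align;	/* 0 is interpreted as "use the default" */
	void (*ctor)(void *);	/* NULL is interpreted as "no constructor" */
};

struct demo_cache {
	const char *name;
	unsigned int object_size;
	unsigned int align;
	unsigned int flags;
	void (*ctor)(void *);
	int used_args_form;	/* records which entry point was taken */
};

static struct demo_cache demo_slot;	/* one static slot keeps the mock simple */

/* New form: optional parameters travel in a struct. */
static struct demo_cache *demo_create_args(const char *name, unsigned int size,
					   struct demo_cache_args *args,
					   unsigned int flags)
{
	demo_slot = (struct demo_cache){
		.name = name,
		.object_size = size,
		.align = args ? args->align : 0,
		.flags = flags,
		.ctor = args ? args->ctor : NULL,
		.used_args_form = 1,
	};
	return &demo_slot;
}

/* Legacy form: positional align and ctor, forwarded to the args helper. */
static struct demo_cache *demo_create_legacy(const char *name, unsigned int size,
					     unsigned int align, unsigned int flags,
					     void (*ctor)(void *))
{
	struct demo_cache_args args = { .align = align, .ctor = ctor };
	struct demo_cache *c = demo_create_args(name, size, &args, flags);

	c->used_args_form = 0;
	return c;
}

/* Select the helper from the static type of the third argument. */
#define demo_cache_create(name, size, third, ...)			\
	_Generic((third),						\
		struct demo_cache_args *: demo_create_args,		\
		default: demo_create_legacy)(name, size, third, __VA_ARGS__)

static void demo_usage(void)
{
	/* Only .align is set; .ctor is zero-initialized and so "unused". */
	struct demo_cache_args args = { .align = 64 };
	struct demo_cache *c;

	c = demo_cache_create("new-form", 128, &args, 0);
	assert(c->used_args_form == 1 && c->align == 64 && c->ctor == NULL);

	c = demo_cache_create("legacy-form", 128, 8, 0, NULL);
	assert(c->used_args_form == 0 && c->align == 8);
}
```

Calling demo_usage() from a main() exercises both forms. Designated
initializers are what make "not explicitly initialized means unused" work:
any field left out of the initializer is zeroed.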
* kmem_cache_destroy() waits for kfree_rcu()'s in flight (Vlastimil Babka,
Uladzislau Rezki)
Since SLOB removal, kfree() is allowed for freeing objects allocated from caches
created by kmem_cache_create(). By extension, kfree_rcu() is allowed as well,
which makes it possible to convert simple call_rcu() callbacks that only do
kmem_cache_free(),
as there was never a kmem_cache_free_rcu() variant. However, for caches that
can be destroyed e.g. on module removal, the cache owners knew to issue
rcu_barrier() first to wait for the pending call_rcu()'s, and this is not
sufficient for pending kfree_rcu()'s due to its internal batching
optimizations. Ulad has provided a new kvfree_rcu_barrier() and to make the
usage less error-prone, kmem_cache_destroy() calls it. Additionally, destroying
SLAB_TYPESAFE_BY_RCU caches now again issues rcu_barrier() synchronously
instead of using an async work, because the past motivation for async work no
longer applies. Users of custom call_rcu() callbacks should however keep
calling rcu_barrier() before cache destruction.
* Debugging use-after-free in SLAB_TYPESAFE_BY_RCU caches (Jann Horn)
Currently, KASAN cannot catch UAFs in such caches as it is legal to access
them within a grace period, and we only track the grace period when trying to
free the underlying slab page. The new CONFIG_SLUB_RCU_DEBUG option changes the
freeing of individual objects to be RCU-delayed, after which KASAN can poison
them.
* Delayed memcg charging (Shakeel Butt)
In some cases, the memcg is unknown at allocation time, such as when receiving
network packets in softirq context. With kmem_cache_charge() such objects may
now be charged later, once the user and its memcg are known.
* Misc fixes and improvements (Pedro Falcato, Axel Rasmussen, Christoph Lameter,
Yan Zhen, Peng Fan, Xavier).
----------------------------------------------------------------
Axel Rasmussen (1):
mm, slub: print CPU id (and its node) on slab OOM
Christian Brauner (17):
slab: s/__kmem_cache_create/do_kmem_cache_create/g
slab: add struct kmem_cache_args
slab: port kmem_cache_create() to struct kmem_cache_args
slab: port kmem_cache_create_rcu() to struct kmem_cache_args
slab: port kmem_cache_create_usercopy() to struct kmem_cache_args
slab: pass struct kmem_cache_args to create_cache()
slab: pull kmem_cache_open() into do_kmem_cache_create()
slab: pass struct kmem_cache_args to do_kmem_cache_create()
slab: remove rcu_freeptr_offset from struct kmem_cache
slab: port KMEM_CACHE() to struct kmem_cache_args
slab: port KMEM_CACHE_USERCOPY() to struct kmem_cache_args
slab: create kmem_cache_create() compatibility layer
file: port to struct kmem_cache_args
slab: remove kmem_cache_create_rcu()
slab: make kmem_cache_create_usercopy() static inline
slab: make __kmem_cache_create() static inline
io_uring: port to struct kmem_cache_args
Christoph Lameter (1):
Reenable NUMA policy support in the slab allocator
Jann Horn (2):
kasan: catch invalid free before SLUB reinitializes the object
slub: Introduce CONFIG_SLUB_RCU_DEBUG
Pedro Falcato (1):
slab: Warn on duplicate cache names when DEBUG_VM=y
Peng Fan (1):
mm, slub: avoid zeroing kmalloc redzone
Shakeel Butt (1):
memcg: add charging of already allocated slab objects
Uladzislau Rezki (Sony) (1):
rcu/kvfree: Add kvfree_rcu_barrier() API
Vlastimil Babka (10):
mm, slab: dissolve shutdown_cache() into its caller
mm, slab: unlink slabinfo, sysfs and debugfs immediately
mm, slab: move kfence_shutdown_cache() outside slab_mutex
mm, slab: reintroduce rcu_barrier() into kmem_cache_destroy()
mm, slab: call kvfree_rcu_barrier() from kmem_cache_destroy()
kunit, slub: add test_kfree_rcu() and test_leak_destroy()
Merge branch 'vfs.file' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs into slab/for-6.12/kmem_cache_args
mm, slab: restore kerneldoc for kmem_cache_create()
Merge branch 'slab/for-6.12/rcu_barriers' into slab/for-next
Merge branch 'slab/for-6.12/kmem_cache_args' into slab/for-next
Xavier (1):
mm/slab: Optimize the code logic in find_mergeable()
Yan Zhen (1):
mm, slab: use kmem_cache_free() to free from kmem_buckets_cache
Documentation/core-api/workqueue.rst | 2 +-
.../bindings/input/touchscreen/edt-ft5x06.yaml | 1 +
Documentation/filesystems/erofs.rst | 2 +-
Documentation/filesystems/smb/ksmbd.rst | 26 +-
Documentation/kbuild/llvm.rst | 2 +-
MAINTAINERS | 36 +-
Makefile | 4 +-
arch/arm64/kvm/mmu.c | 9 +-
arch/arm64/kvm/sys_regs.c | 6 +
arch/arm64/kvm/vgic/vgic-debug.c | 2 +-
arch/arm64/kvm/vgic/vgic-init.c | 9 +-
arch/arm64/kvm/vgic/vgic.c | 5 +
arch/arm64/kvm/vgic/vgic.h | 7 +
arch/mips/kernel/cevt-r4k.c | 15 +-
arch/mips/kernel/cpu-probe.c | 4 +
arch/s390/Kconfig | 13 +
arch/s390/boot/startup.c | 58 +--
arch/s390/boot/vmem.c | 14 +-
arch/s390/boot/vmlinux.lds.S | 7 +-
arch/s390/include/asm/page.h | 3 +-
arch/s390/kernel/setup.c | 19 +-
arch/s390/kernel/vmlinux.lds.S | 2 +-
arch/s390/tools/relocs.c | 2 +-
block/blk-lib.c | 25 +-
drivers/accessibility/speakup/genmap.c | 1 -
drivers/accessibility/speakup/makemapdata.c | 1 -
drivers/acpi/video_detect.c | 22 ++
drivers/ata/pata_macio.c | 30 +-
drivers/bluetooth/btintel.c | 10 -
drivers/bluetooth/btintel_pcie.c | 3 -
drivers/bluetooth/btmtksdio.c | 3 -
drivers/bluetooth/btrtl.c | 1 -
drivers/bluetooth/btusb.c | 4 +-
drivers/bluetooth/hci_qca.c | 4 +-
drivers/bluetooth/hci_vhci.c | 2 -
drivers/cxl/core/pci.c | 10 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_psp_ta.c | 3 +
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 5 +-
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 18 +-
drivers/gpu/drm/i915/display/intel_dp_hdcp.c | 4 +-
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 2 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 4 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 4 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h | 14 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 20 +-
drivers/gpu/drm/msm/dp/dp_ctrl.c | 2 +
drivers/gpu/drm/msm/dp/dp_panel.c | 19 +-
drivers/gpu/drm/msm/msm_mdss.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/core/firmware.c | 9 +-
drivers/gpu/drm/nouveau/nvkm/falcon/fw.c | 6 +
drivers/gpu/drm/xe/Makefile | 2 +
drivers/gpu/drm/xe/display/xe_display.c | 28 +-
drivers/gpu/drm/xe/display/xe_dsb_buffer.c | 8 +
drivers/gpu/drm/xe/display/xe_fb_pin.c | 3 +
drivers/gpu/drm/xe/regs/xe_gt_regs.h | 9 +
drivers/gpu/drm/xe/xe_bo.c | 6 +-
drivers/gpu/drm/xe/xe_device.c | 32 ++
drivers/gpu/drm/xe/xe_device.h | 1 +
drivers/gpu/drm/xe/xe_exec_queue.c | 24 +-
drivers/gpu/drm/xe/xe_exec_queue_types.h | 2 -
drivers/gpu/drm/xe/xe_gsc.c | 8 +-
drivers/gpu/drm/xe/xe_gt.c | 55 +++
drivers/gpu/drm/xe/xe_gt_pagefault.c | 18 +-
drivers/gpu/drm/xe/xe_gt_types.h | 6 +
drivers/gpu/drm/xe/xe_guc_submit.c | 4 +-
drivers/gpu/drm/xe/xe_hw_fence.c | 9 +-
drivers/gpu/drm/xe/xe_hw_fence_types.h | 7 +-
drivers/gpu/drm/xe/xe_mmio.c | 28 +-
drivers/gpu/drm/xe/xe_observation.c | 1 -
drivers/gpu/drm/xe/xe_pat.c | 11 +-
drivers/gpu/drm/xe/xe_pm.c | 11 +-
drivers/gpu/drm/xe/xe_preempt_fence.c | 3 +-
drivers/gpu/drm/xe/xe_preempt_fence_types.h | 2 +
drivers/gpu/drm/xe/xe_sched_job.c | 3 +-
drivers/gpu/drm/xe/xe_trace.h | 2 +-
drivers/gpu/drm/xe/xe_wa.c | 18 +
drivers/gpu/drm/xe/xe_wa_oob.rules | 1 +
drivers/hid/amd-sfh-hid/amd_sfh_hid.c | 4 +-
drivers/hid/hid-asus.c | 3 +
drivers/hid/hid-cougar.c | 2 +-
drivers/hid/hid-ids.h | 3 +
drivers/hid/hid-multitouch.c | 33 ++
drivers/hid/wacom_wac.c | 4 +-
drivers/input/joystick/adc-joystick.c | 7 +-
drivers/input/misc/uinput.c | 14 +
drivers/input/mouse/synaptics.c | 1 +
drivers/input/serio/i8042-acpipnpio.h | 29 +-
drivers/input/serio/i8042.c | 10 +-
drivers/input/touchscreen/ads7846.c | 2 +-
drivers/input/touchscreen/edt-ft5x06.c | 6 +
drivers/input/touchscreen/himax_hx83112b.c | 14 +-
drivers/iommu/iommufd/device.c | 2 +-
drivers/iommu/iommufd/selftest.c | 2 +-
drivers/mmc/core/mmc_test.c | 9 +-
drivers/mmc/host/dw_mmc.c | 8 +
drivers/mmc/host/mtk-sd.c | 8 +-
drivers/net/bonding/bond_main.c | 21 +-
drivers/net/bonding/bond_options.c | 2 +-
drivers/net/dsa/microchip/ksz_ptp.c | 5 +-
drivers/net/dsa/mv88e6xxx/global1_atu.c | 3 +-
drivers/net/dsa/ocelot/felix.c | 126 +++++-
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 +-
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 -
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 4 -
drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 5 -
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c | 3 +-
.../net/ethernet/freescale/dpaa2/dpaa2-switch.c | 7 +-
.../net/ethernet/intel/ice/devlink/devlink_port.c | 4 +-
drivers/net/ethernet/intel/ice/ice_base.c | 21 +-
drivers/net/ethernet/intel/ice/ice_txrx.c | 47 +--
drivers/net/ethernet/intel/igb/igb_main.c | 1 +
.../net/ethernet/marvell/octeontx2/af/rvu_cpt.c | 23 +-
drivers/net/ethernet/mellanox/mlx5/core/en.h | 1 +
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 21 +-
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 26 +-
.../mellanox/mlx5/core/lib/ipsec_fs_roce.c | 6 +-
drivers/net/ethernet/mscc/ocelot.c | 279 ++++++++++++-
drivers/net/ethernet/mscc/ocelot_fdma.c | 3 +-
drivers/net/ethernet/mscc/ocelot_vcap.c | 1 +
drivers/net/ethernet/mscc/ocelot_vsc7514.c | 4 +
drivers/net/ethernet/wangxun/ngbe/ngbe_mdio.c | 8 +-
drivers/net/ethernet/xilinx/xilinx_axienet.h | 1 +
drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 25 +-
drivers/net/phy/realtek.c | 3 +-
drivers/net/virtio_net.c | 2 +-
drivers/nvme/host/core.c | 2 +-
drivers/nvme/host/nvme.h | 1 -
drivers/platform/surface/aggregator/controller.c | 3 +-
.../platform/surface/surface_aggregator_registry.c | 58 ++-
drivers/platform/x86/asus-wmi.c | 16 +-
drivers/platform/x86/dell/Kconfig | 1 +
drivers/platform/x86/dell/dell-uart-backlight.c | 8 +
.../x86/intel/speed_select_if/isst_tpmi_core.c | 3 +-
drivers/pmdomain/imx/imx93-pd.c | 5 +-
drivers/pmdomain/imx/scu-pd.c | 5 -
drivers/power/sequencing/pwrseq-qcom-wcn.c | 2 +-
drivers/s390/crypto/ap_bus.c | 7 +-
drivers/scsi/sd.c | 3 +
drivers/spi/spi-cadence-quadspi.c | 14 +-
drivers/spi/spi-fsl-lpspi.c | 31 +-
drivers/spi/spi-pxa2xx-pci.c | 15 +-
drivers/spi/spi-pxa2xx-platform.c | 26 +-
drivers/spi/spi-pxa2xx.c | 20 +-
drivers/spi/spi-pxa2xx.h | 3 +-
drivers/spi/spi-zynqmp-gqspi.c | 30 +-
.../staging/media/atomisp/include/linux/atomisp.h | 1 -
drivers/thermal/thermal_debugfs.c | 6 +-
drivers/thermal/thermal_of.c | 23 +-
drivers/ufs/core/ufshcd.c | 6 +-
drivers/ufs/host/ufs-qcom.c | 6 +-
fs/bcachefs/alloc_background.c | 66 ++--
fs/bcachefs/alloc_background_format.h | 1 +
fs/bcachefs/bcachefs_format.h | 3 +-
fs/bcachefs/btree_cache.c | 25 ++
fs/bcachefs/btree_cache.h | 2 +
fs/bcachefs/btree_iter.h | 9 +
fs/bcachefs/btree_key_cache.c | 31 +-
fs/bcachefs/btree_update_interior.c | 46 ++-
fs/bcachefs/buckets.c | 74 ++--
fs/bcachefs/buckets_waiting_for_journal.c | 4 +-
fs/bcachefs/data_update.c | 209 +++++-----
fs/bcachefs/extents.c | 41 ++
fs/bcachefs/extents.h | 1 +
fs/bcachefs/fs-io-buffered.c | 2 +-
fs/bcachefs/fs-ioctl.c | 3 +-
fs/bcachefs/fsck.c | 6 +-
fs/bcachefs/journal.c | 2 +-
fs/bcachefs/journal_sb.c | 15 +
fs/bcachefs/movinggc.c | 2 +-
fs/bcachefs/recovery.c | 9 +-
fs/bcachefs/replicas.c | 3 +-
fs/bcachefs/sb-downgrade.c | 8 +-
fs/bcachefs/util.c | 1 -
fs/bcachefs/xattr.c | 12 +-
fs/erofs/dir.c | 35 +-
fs/erofs/inode.c | 18 +-
fs/erofs/internal.h | 2 +-
fs/erofs/super.c | 26 +-
fs/erofs/zutil.c | 3 +-
fs/file_table.c | 11 +-
fs/nfs/callback_xdr.c | 6 +-
fs/nfs/delegation.c | 15 +-
fs/nfs/nfs4proc.c | 12 +-
fs/nfs/pnfs.c | 5 +-
fs/nfs/super.c | 2 +
fs/smb/client/cifsfs.c | 6 +-
fs/smb/client/cifsglob.h | 6 +-
fs/smb/client/connect.c | 3 +
fs/smb/client/file.c | 4 +-
fs/smb/client/ioctl.c | 2 +
fs/smb/client/link.c | 1 +
fs/smb/client/reparse.c | 11 +-
fs/smb/server/connection.c | 34 +-
fs/smb/server/connection.h | 3 +-
fs/smb/server/mgmt/user_session.c | 9 +
fs/smb/server/oplock.c | 2 +-
fs/smb/server/smb2pdu.c | 31 +-
fs/smb/server/smb_common.h | 4 +-
fs/super.c | 4 +-
include/acpi/video.h | 1 +
include/linux/blkdev.h | 7 +-
include/linux/dsa/ocelot.h | 47 +++
include/linux/kasan.h | 63 ++-
include/linux/panic.h | 1 +
include/linux/rcutiny.h | 5 +
include/linux/rcutree.h | 1 +
include/linux/slab.h | 228 ++++++++++-
include/net/bluetooth/hci.h | 17 +-
include/net/bluetooth/hci_core.h | 2 +-
include/net/dsa.h | 16 +-
include/net/kcm.h | 1 +
include/scsi/scsi_cmnd.h | 2 +-
include/soc/mscc/ocelot.h | 12 +-
include/soc/mscc/ocelot_vcap.h | 2 +
include/trace/events/rpcrdma.h | 36 ++
include/uapi/drm/xe_drm.h | 8 +-
include/ufs/ufshcd.h | 8 +
io_uring/io_uring.c | 14 +-
io_uring/kbuf.c | 9 +-
kernel/cgroup/cpuset.c | 38 +-
kernel/panic.c | 8 +-
kernel/printk/printk.c | 2 +-
kernel/rcu/tree.c | 109 +++++-
kernel/workqueue.c | 50 +--
lib/slub_kunit.c | 31 ++
mm/Kconfig.debug | 32 ++
mm/kasan/common.c | 62 +--
mm/kasan/kasan_test.c | 46 +++
mm/slab.h | 13 +-
mm/slab_common.c | 354 ++++++-----------
mm/slub.c | 412 +++++++++++++-------
net/bluetooth/hci_core.c | 19 +-
net/bluetooth/hci_event.c | 2 +-
net/bluetooth/mgmt.c | 4 +
net/bluetooth/smp.c | 144 +++----
net/core/netpoll.c | 2 -
net/dsa/tag.c | 5 +-
net/dsa/tag.h | 135 +++++--
net/dsa/tag_ocelot.c | 37 +-
net/ipv4/inet_connection_sock.c | 5 +-
net/ipv4/tcp_ipv4.c | 14 +
net/ipv4/udp_offload.c | 3 +-
net/ipv6/ip6_output.c | 10 +
net/ipv6/ip6_tunnel.c | 12 +-
net/iucv/iucv.c | 4 +-
net/kcm/kcmsock.c | 4 +
net/mctp/test/route-test.c | 2 +-
net/mptcp/pm.c | 13 -
net/mptcp/pm_netlink.c | 142 ++++---
net/mptcp/protocol.h | 3 -
net/netfilter/nf_flow_table_inet.c | 3 +
net/netfilter/nf_flow_table_ip.c | 3 +
net/netfilter/nft_counter.c | 9 +-
net/openvswitch/datapath.c | 2 +-
net/sched/sch_netem.c | 47 ++-
net/sunrpc/xprtrdma/ib_client.c | 6 +-
samples/trace_events/trace_custom_sched.c | 1 -
scripts/Makefile.build | 2 +-
scripts/Makefile.lib | 28 +-
scripts/Makefile.modfinal | 2 +-
scripts/Makefile.vmlinux | 2 +-
scripts/Makefile.vmlinux_o | 2 +-
scripts/kconfig/merge_config.sh | 2 +
scripts/link-vmlinux.sh | 3 +-
sound/soc/codecs/cs42l42.c | 1 -
tools/testing/cxl/Kbuild | 1 +
tools/testing/cxl/test/mock.c | 12 +
.../selftests/drivers/net/mlxsw/ethtool_lanes.sh | 3 +-
.../selftests/net/forwarding/bridge_vlan_aware.sh | 54 ++-
tools/testing/selftests/net/forwarding/lib.sh | 57 +++
.../selftests/net/forwarding/local_termination.sh | 431 +++++++++++++++++----
tools/testing/selftests/net/mptcp/mptcp_join.sh | 76 +++-
tools/testing/selftests/net/udpgro.sh | 53 +--
tools/testing/selftests/tc-testing/tdc.py | 1 -
275 files changed, 4049 insertions(+), 1673 deletions(-)
* Re: [GIT PULL] slab updates for 6.11
2024-09-16 9:45 [GIT PULL] slab updates for 6.11 Vlastimil Babka
@ 2024-09-18 7:06 ` Linus Torvalds
2024-09-18 14:40 ` Uladzislau Rezki
2024-09-18 8:10 ` pr-tracker-bot
1 sibling, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2024-09-18 7:06 UTC (permalink / raw)
To: Vlastimil Babka
Cc: David Rientjes, Christoph Lameter, Andrew Morton, linux-mm, LKML,
Roman Gushchin, Hyeonggon Yoo, Christian Brauner, RCU,
Shakeel Butt, Uladzislau Rezki (Sony)
On Mon, 16 Sept 2024 at 11:45, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> There's a small conflict with the rcu tree:
> https://lore.kernel.org/lkml/20240812124748.3725011b@canb.auug.org.au/
Hmm. The conflict resolution is trivial, but the code itself looks buggy.
Look here, commit 2b55d6a42d14 ("rcu/kvfree: Add kvfree_rcu_barrier()
API") makes kvfree_rcu_queue_batch() do this:
    bool queued = false;
    ...
    for (i = 0; i < KFREE_N_BATCHES; i++) {
        ...
        queued = queue_rcu_work(system_wq, &krwp->rcu_work);
        ...
    return queued;
and note how that return value is completely nonsensical. It doesn't
imply anything got queued. It's returning whether the *last* call to
queue_rcu_work() resulted in queued work.
There is no way the return value is meaningful that I can see, and
honestly, that means that the code in kvfree_rcu_barrier() looks
actively buggy, and at worst might be an endless loop.
Now, maybe there's some reason why the code works fine, but it looks
really really wrong. Please fix.
The fix might be either a big comment about why it's ok, or making the
"queued" assignment be a '|=' instead, or perhaps breaking out of the
loop on the first successful queueing, or whatever.
But not this "randomly return _one_ value of many of the queuing success".
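To make the difference between these options concrete, here is a minimal
userspace sketch, not kernel code. The results[] array is a hypothetical
stand-in for the outcome of each queueing attempt in the loop, and the three
function names are invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Plain assignment: the result reflects only the last attempt. */
static bool last_result(const bool *results, size_t n)
{
	bool queued = false;

	for (size_t i = 0; i < n; i++)
		queued = results[i];	/* overwritten on every iteration */
	return queued;
}

/* '|=' accumulation: the result is true if any attempt succeeded. */
static bool any_result(const bool *results, size_t n)
{
	bool queued = false;

	for (size_t i = 0; i < n; i++)
		queued |= results[i];	/* an earlier success is remembered */
	return queued;
}

/* Early exit: stop at the first successful attempt. */
static bool first_success(const bool *results, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (results[i])
			return true;
	return false;
}

static void demo_return_values(void)
{
	/* First attempt queues work, second one does not. */
	const bool results[] = { true, false };

	assert(last_result(results, 2) == false);	/* success is lost */
	assert(any_result(results, 2) == true);
	assert(first_success(results, 2) == true);
}
```

With a success followed by a failure, only the plain assignment reports that
nothing was queued, which is the case the caller cannot interpret.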
I've merged this, but I expect this to be fixed.
Linus
* Re: [GIT PULL] slab updates for 6.11
2024-09-16 9:45 [GIT PULL] slab updates for 6.11 Vlastimil Babka
2024-09-18 7:06 ` Linus Torvalds
@ 2024-09-18 8:10 ` pr-tracker-bot
1 sibling, 0 replies; 10+ messages in thread
From: pr-tracker-bot @ 2024-09-18 8:10 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Linus Torvalds, David Rientjes, Christoph Lameter, Andrew Morton,
linux-mm, LKML, Roman Gushchin, Hyeonggon Yoo, Christian Brauner,
RCU, Shakeel Butt, Uladzislau Rezki (Sony)
The pull request you sent on Mon, 16 Sep 2024 11:45:42 +0200:
> git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git tags/slab-for-6.12
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/bdf56c7580d267a123cc71ca0f2459c797b76fde
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* Re: [GIT PULL] slab updates for 6.11
2024-09-18 7:06 ` Linus Torvalds
@ 2024-09-18 14:40 ` Uladzislau Rezki
2024-09-26 16:35 ` Vlastimil Babka
0 siblings, 1 reply; 10+ messages in thread
From: Uladzislau Rezki @ 2024-09-18 14:40 UTC (permalink / raw)
To: Linus Torvalds
Cc: Vlastimil Babka, David Rientjes, Christoph Lameter,
Andrew Morton, linux-mm, LKML, Roman Gushchin, Hyeonggon Yoo,
Christian Brauner, RCU, Shakeel Butt
Hello, Linus!
On Wed, Sep 18, 2024 at 9:06 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, 16 Sept 2024 at 11:45, Vlastimil Babka <vbabka@suse.cz> wrote:
> >
> > There's a small conflict with the rcu tree:
> > https://lore.kernel.org/lkml/20240812124748.3725011b@canb.auug.org.au/
>
> Hmm. The conflict resolution is trivial, but the code itself looks buggy.
>
> Look here, commit 2b55d6a42d14 ("rcu/kvfree: Add kvfree_rcu_barrier()
> API") makes kvfree_rcu_queue_batch() do this:
>
> bool queued = false;
> ...
> for (i = 0; i < KFREE_N_BATCHES; i++) {
> ...
> queued = queue_rcu_work(system_wq, &krwp->rcu_work);
> ...
> return queued;
>
> and note how that return value is completely nonsensical. It doesn't
> imply anything got queued. It's returning whether the *last* call to
> queue_rcu_work() resulted in queued work.
>
> There is no way the return value is meaningful that I can see, and
> honestly, that means that the code in kvfree_rcu_barrier() looks
> actively buggy, and at worst might be an endless loop
>
> Now, maybe there's some reason why the code works fine, but it looks
> really really wrong. Please fix.
>
> The fix might be either a big comment about why it's ok, or making the
> "queued" assignment be a '|=' instead, or perhaps breaking out of the
> loop on the first successful queueing, or whatever.
>
> But not this "randomly return _one_ value of many of the queuing success".
>
Thank you for the valuable feedback! Indeed it is hard to follow, even
though it works correctly.
I will add the comment and also break the loop on the first queuing, as you
suggested!
It does not make sense to loop further, because the following iterations
are never successful and thus never overwrite the "queued" variable (they
never reach the queue_rcu_work() call).
<snip>
    bool queued = false;
    ...
    for (i = 0; i < KFREE_N_BATCHES; i++) {
        if (need_offload_krc(krcp)) {
            queued = queue_rcu_work(system_wq, &krwp->rcu_work);
    ...
    return queued;
<snip>
Once we have queued, the "if (need_offload_krc())" condition is never true anymore.
The refactoring below makes it clear. I will send the patch to address it.
<snip>
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index a60616e69b66..b1f883fcd918 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3607,11 +3607,12 @@ kvfree_rcu_queue_batch(struct kfree_rcu_cpu *krcp)
 			}
 			// One work is per one batch, so there are three
-			// "free channels", the batch can handle. It can
-			// be that the work is in the pending state when
-			// channels have been detached following by each
-			// other.
+			// "free channels", the batch can handle. Break
+			// the loop since it is done with this CPU thus
+			// queuing an RCU work is _always_ success here.
 			queued = queue_rcu_work(system_unbound_wq,
 				&krwp->rcu_work);
+			WARN_ON_ONCE(!queued);
+			break;
 		}
 	}
<snip>
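The argument that at most one iteration can queue work can be checked with a
small userspace model. This is a simplified sketch, not the code in
kernel/rcu/tree.c: the demo_* names, the struct fields and DEMO_N_BATCHES are
invented here, and the model assumes that queueing detaches everything that
was pending, so that the need-offload check turns false afterwards.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_N_BATCHES 2	/* hypothetical stand-in for KFREE_N_BATCHES */

/* Simplified per-CPU state; field names are invented for this model. */
struct demo_krc {
	int pending;				/* objects not yet handed to a batch */
	bool batch_busy[DEMO_N_BATCHES];	/* batch still has work in flight */
	int batch_objs[DEMO_N_BATCHES];		/* objects owned by each batch */
};

/* Models need_offload_krc(): true only while something is still pending. */
static bool demo_need_offload(const struct demo_krc *krc)
{
	return krc->pending > 0;
}

/* Walks every batch without breaking and counts how often work is queued. */
static int demo_queue_batch(struct demo_krc *krc)
{
	int queue_calls = 0;

	for (int i = 0; i < DEMO_N_BATCHES; i++) {
		if (krc->batch_busy[i])
			continue;
		if (demo_need_offload(krc)) {
			/* Detach everything pending and give it to this batch. */
			krc->batch_objs[i] = krc->pending;
			krc->pending = 0;
			krc->batch_busy[i] = true;
			queue_calls++;
		}
	}
	return queue_calls;
}

static void demo_single_queue(void)
{
	struct demo_krc krc = { .pending = 5 };

	assert(demo_queue_batch(&krc) == 1);	/* only the first free batch queues */
	assert(krc.batch_objs[0] == 5 && krc.batch_objs[1] == 0);
	assert(demo_queue_batch(&krc) == 0);	/* nothing pending, nothing queued */
}
```

Even without a break, the loop in this model queues at most once per call, so
breaking after the first queueing changes readability, not behavior.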
Thanks!
--
Uladzislau Rezki
* Re: [GIT PULL] slab updates for 6.11
2024-09-18 14:40 ` Uladzislau Rezki
@ 2024-09-26 16:35 ` Vlastimil Babka
2024-09-26 16:40 ` Uladzislau Rezki
0 siblings, 1 reply; 10+ messages in thread
From: Vlastimil Babka @ 2024-09-26 16:35 UTC (permalink / raw)
To: Uladzislau Rezki, Linus Torvalds
Cc: David Rientjes, Christoph Lameter, Andrew Morton, linux-mm, LKML,
Roman Gushchin, Hyeonggon Yoo, Christian Brauner, RCU,
Shakeel Butt
On 9/18/24 16:40, Uladzislau Rezki wrote:
>>
> Thank you for valuable feedback! Indeed it is hard to follow, even
> though it works correctly.
> I will add the comment and also break the loop on first queuing as you
> suggested!
>
> It does not make sense to loop further because following iterations
> are never successful
> thus never overwrite "queued" variable(it never reaches the
> queue_rcu_work() call).
>
> <snip>
> bool queued = false;
> ...
> for (i = 0; i < KFREE_N_BATCHES; i++) {
> if (need_offload_krc(krcp)) {
> queued = queue_rcu_work(system_wq, &krwp->rcu_work);
> ...
> return queued;
> <snip>
>
> if we queued, "if(need_offload_krc())" condition is never true anymore.
>
> Below refactoring makes it clear. I will send the patch to address it.
Looks good, AFAICT. Can you send the full patch then? Thanks.
> <snip>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index a60616e69b66..b1f883fcd918 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3607,11 +3607,12 @@ kvfree_rcu_queue_batch(struct kfree_rcu_cpu *krcp)
> }
>
> // One work is per one batch, so there are three
> - // "free channels", the batch can handle. It can
> - // be that the work is in the pending state when
> - // channels have been detached following by each
> - // other.
> + // "free channels", the batch can handle. Break
> + // the loop since it is done with this CPU thus
> + // queuing an RCU work is _always_ success here.
> queued = queue_rcu_work(system_unbound_wq,
> &krwp->rcu_work);
> + WARN_ON_ONCE(!queued);
> + break;
> }
> }
> <snip>
>
> Thanks!
>
* Re: [GIT PULL] slab updates for 6.11
2024-09-26 16:35 ` Vlastimil Babka
@ 2024-09-26 16:40 ` Uladzislau Rezki
2024-09-26 16:46 ` Vlastimil Babka
0 siblings, 1 reply; 10+ messages in thread
From: Uladzislau Rezki @ 2024-09-26 16:40 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Uladzislau Rezki, Linus Torvalds, David Rientjes,
Christoph Lameter, Andrew Morton, linux-mm, LKML, Roman Gushchin,
Hyeonggon Yoo, Christian Brauner, RCU, Shakeel Butt
On Thu, Sep 26, 2024 at 06:35:27PM +0200, Vlastimil Babka wrote:
> On 9/18/24 16:40, Uladzislau Rezki wrote:
> >>
> > Thank you for valuable feedback! Indeed it is hard to follow, even
> > though it works correctly.
> > I will add the comment and also break the loop on first queuing as you
> > suggested!
> >
> > It does not make sense to loop further because following iterations
> > are never successful
> > thus never overwrite "queued" variable(it never reaches the
> > queue_rcu_work() call).
> >
> > <snip>
> > bool queued = false;
> > ...
> > for (i = 0; i < KFREE_N_BATCHES; i++) {
> > if (need_offload_krc(krcp)) {
> > queued = queue_rcu_work(system_wq, &krwp->rcu_work);
> > ...
> > return queued;
> > <snip>
> >
> > if we queued, "if(need_offload_krc())" condition is never true anymore.
> >
> > Below refactoring makes it clear. I will send the patch to address it.
>
> Looks good, AFAICT. Can you send the full patch then? Thanks.
>
I will do so. We can send it from the RCU side for rcX or this merge window,
or you can do it.
What is best for you?
--
Uladzislau Rezki
* Re: [GIT PULL] slab updates for 6.11
2024-09-26 16:40 ` Uladzislau Rezki
@ 2024-09-26 16:46 ` Vlastimil Babka
2024-09-26 17:07 ` Uladzislau Rezki
0 siblings, 1 reply; 10+ messages in thread
From: Vlastimil Babka @ 2024-09-26 16:46 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Linus Torvalds, David Rientjes, Christoph Lameter, Andrew Morton,
linux-mm, LKML, Roman Gushchin, Hyeonggon Yoo, Christian Brauner,
RCU, Shakeel Butt
On 9/26/24 18:40, Uladzislau Rezki wrote:
> On Thu, Sep 26, 2024 at 06:35:27PM +0200, Vlastimil Babka wrote:
>> On 9/18/24 16:40, Uladzislau Rezki wrote:
>> >>
>> > Thank you for valuable feedback! Indeed it is hard to follow, even
>> > though it works correctly.
>> > I will add the comment and also break the loop on first queuing as you
>> > suggested!
>> >
>> > It does not make sense to loop further because following iterations
>> > are never successful
>> > thus never overwrite "queued" variable(it never reaches the
>> > queue_rcu_work() call).
>> >
>> > <snip>
>> > bool queued = false;
>> > ...
>> > for (i = 0; i < KFREE_N_BATCHES; i++) {
>> > if (need_offload_krc(krcp)) {
>> > queued = queue_rcu_work(system_wq, &krwp->rcu_work);
>> > ...
>> > return queued;
>> > <snip>
>> >
>> > if we queued, "if(need_offload_krc())" condition is never true anymore.
>> >
>> > Below refactoring makes it clear. I will send the patch to address it.
>>
>> Looks good, AFAICT. Can you send the full patch then? Thanks.
>>
> I will do so. We can send it from RCU-side for rcX, this merge window or
> you can do it.
>
> What is the best for you?
I guess I could do it via the slab tree, since the original commit went there too.
Thanks
> --
> Uladzislau Rezki
* Re: [GIT PULL] slab updates for 6.11
2024-09-26 16:46 ` Vlastimil Babka
@ 2024-09-26 17:07 ` Uladzislau Rezki
0 siblings, 0 replies; 10+ messages in thread
From: Uladzislau Rezki @ 2024-09-26 17:07 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Uladzislau Rezki, Linus Torvalds, David Rientjes,
Christoph Lameter, Andrew Morton, linux-mm, LKML, Roman Gushchin,
Hyeonggon Yoo, Christian Brauner, RCU, Shakeel Butt
On Thu, Sep 26, 2024 at 06:46:29PM +0200, Vlastimil Babka wrote:
> On 9/26/24 18:40, Uladzislau Rezki wrote:
> > On Thu, Sep 26, 2024 at 06:35:27PM +0200, Vlastimil Babka wrote:
> >> On 9/18/24 16:40, Uladzislau Rezki wrote:
> >> >>
> >> > Thank you for valuable feedback! Indeed it is hard to follow, even
> >> > though it works correctly.
> >> > I will add the comment and also break the loop on first queuing as you
> >> > suggested!
> >> >
> >> > It does not make sense to loop further because following iterations
> >> > are never successful
> >> > thus never overwrite "queued" variable(it never reaches the
> >> > queue_rcu_work() call).
> >> >
> >> > <snip>
> >> > bool queued = false;
> >> > ...
> >> > for (i = 0; i < KFREE_N_BATCHES; i++) {
> >> > if (need_offload_krc(krcp)) {
> >> > queued = queue_rcu_work(system_wq, &krwp->rcu_work);
> >> > ...
> >> > return queued;
> >> > <snip>
> >> >
> >> > if we queued, "if(need_offload_krc())" condition is never true anymore.
> >> >
> >> > Below refactoring makes it clear. I will send the patch to address it.
> >>
> >> Looks good, AFAICT. Can you send the full patch then? Thanks.
> >>
> > I will do so. We can send it from RCU-side for rcX, this merge window or
> > you can do it.
> >
> > What is the best for you?
>
> Guess I could do via slab tree since the original commit went there too.
>
Makes sense. I will send it to you then!
--
Uladzislau Rezki
* Re: [GIT PULL] slab updates for 6.11
2024-07-17 10:49 Vlastimil Babka
@ 2024-07-18 23:06 ` pr-tracker-bot
0 siblings, 0 replies; 10+ messages in thread
From: pr-tracker-bot @ 2024-07-18 23:06 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Linus Torvalds, David Rientjes, Christoph Lameter, Joonsoo Kim,
Pekka Enberg, Andrew Morton, linux-mm, LKML, Roman Gushchin,
Hyeonggon Yoo, Kees Cook, Chengming Zhou
The pull request you sent on Wed, 17 Jul 2024 12:49:23 +0200:
> git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git tags/slab-for-6.11
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/76d9b92e68f2bb55890f935c5143f4fef97a935d
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
* [GIT PULL] slab updates for 6.11
@ 2024-07-17 10:49 Vlastimil Babka
2024-07-18 23:06 ` pr-tracker-bot
0 siblings, 1 reply; 10+ messages in thread
From: Vlastimil Babka @ 2024-07-17 10:49 UTC (permalink / raw)
To: Linus Torvalds
Cc: David Rientjes, Christoph Lameter, Joonsoo Kim, Pekka Enberg,
Andrew Morton, linux-mm, LKML, Roman Gushchin, Hyeonggon Yoo,
Kees Cook, Chengming Zhou
Hi Linus,
please pull the latest slab updates from:
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git tags/slab-for-6.11
no merge conflicts with other trees expected.
Thanks,
Vlastimil
======================================
The most prominent change this time is the kmem_buckets based hardening of
kmalloc() allocations from Kees Cook. We have also extended the kmalloc()
alignment guarantees for non-power-of-two sizes in a way that benefits rust.
The rest are various cleanups and non-critical fixups.
- Dedicated bucket allocator (Kees Cook)
This series [1] enhances the probabilistic defense against heap
spraying/grooming of CONFIG_RANDOM_KMALLOC_CACHES from last year. kmalloc()
users that are known to be useful for exploits can get a completely separate
set of kmalloc caches that can't be shared with other users. The first
converted users are alloc_msg() and memdup_user(). The hardening is enabled by
CONFIG_SLAB_BUCKETS.
- Extended kmalloc() alignment guarantees (Vlastimil Babka)
For years now we have guaranteed natural alignment for power-of-two
allocations, but nothing was defined for other sizes (in practice, we have
two such buckets, kmalloc-96 and kmalloc-192). To avoid unnecessary padding
in the rust layer due to its alignment rules, extend the guarantee so that
the alignment is at least the largest power-of-two divisor of the requested
size. This fits what rust needs, is a superset of the existing power-of-two
guarantee, and does not in practice change the layout (and thus does not add
overhead due to padding) of the kmalloc-96 and kmalloc-192 caches, unless slab
debugging is enabled for them.
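As a worked illustration of "the largest power-of-two divisor of the requested
size", the value can be computed by isolating the lowest set bit of the size.
This is a minimal userspace sketch with a hypothetical helper name; it only
computes the divisor and does not model any architecture-specific minimum
alignment that may apply on top.

```c
#include <assert.h>
#include <stddef.h>

/* Largest power of two that divides size; size must be non-zero. */
static size_t largest_pow2_divisor(size_t size)
{
	return size & (~size + 1);	/* isolates the lowest set bit */
}

static void demo_alignment(void)
{
	assert(largest_pow2_divisor(96) == 32);		/* 96  = 3 * 32 */
	assert(largest_pow2_divisor(192) == 64);	/* 192 = 3 * 64 */
	assert(largest_pow2_divisor(24) == 8);		/* 24  = 3 * 8  */
	assert(largest_pow2_divisor(128) == 128);	/* power of two: natural alignment */
}
```

For a power-of-two size the divisor is the size itself, which is why the
extended rule is a superset of the existing natural-alignment guarantee.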
- Cleanups and non-critical fixups (Chengming Zhou, Suren Baghdasaryan, Matthew
Wilcox, Alex Shi, Vlastimil Babka)
Various tweaks related to the new alloc profiling code, folio conversion,
debugging and more leftovers after SLAB.
[1] https://lore.kernel.org/all/20240701190152.it.631-kees@kernel.org/
----------------------------------------------------------------
Alex Shi (Tencent) (1):
mm/memcg: alignment memcg_data define condition
Chengming Zhou (3):
slab: make check_object() more consistent
slab: don't put freepointer outside of object if only orig_size
slab: delete useless RED_INACTIVE and RED_ACTIVE
Kees Cook (6):
mm/slab: Introduce kmem_buckets typedef
mm/slab: Plumb kmem_buckets into __do_kmalloc_node()
mm/slab: Introduce kvmalloc_buckets_node() that can take kmem_buckets argument
mm/slab: Introduce kmem_buckets_create() and family
ipc, msg: Use dedicated slab buckets for alloc_msg()
mm/util: Use dedicated slab buckets for memdup_user()
Matthew Wilcox (Oracle) (1):
mm: Reduce the number of slab->folio casts
Suren Baghdasaryan (2):
mm, slab: move allocation tagging code in the alloc path into a hook
mm, slab: move prepare_slab_obj_exts_hook under CONFIG_MEM_ALLOC_PROFILING
Vlastimil Babka (3):
mm, slab: don't wrap internal functions with alloc_hooks()
slab, rust: extend kmalloc() alignment guarantees to remove Rust padding
Merge branch 'slab/for-6.11/buckets' into slab/for-next
Documentation/core-api/memory-allocation.rst | 6 +-
include/linux/mm.h | 6 +-
include/linux/mm_types.h | 9 +-
include/linux/poison.h | 7 +-
include/linux/slab.h | 97 +++++++++----
ipc/msgutil.c | 13 +-
kernel/configs/hardening.config | 1 +
lib/fortify_kunit.c | 2 -
lib/slub_kunit.c | 2 +-
mm/Kconfig | 17 +++
mm/slab.h | 14 +-
mm/slab_common.c | 111 +++++++++++++-
mm/slub.c | 209 +++++++++++++++------------
mm/util.c | 23 ++-
rust/kernel/alloc/allocator.rs | 19 +--
scripts/kernel-doc | 1 +
tools/include/linux/poison.h | 7 +-
17 files changed, 369 insertions(+), 175 deletions(-)