From: Vlastimil Babka <vbabka@suse.cz>
To: Matthew Wilcox, linux-mm@kvack.org, Christoph Lameter,
	David Rientjes, Joonsoo Kim, Pekka Enberg
Cc: Vlastimil Babka
Subject: [RFC PATCH 03/32] mm: Split slab into its own type
Date: Tue, 16 Nov 2021 01:15:59 +0100
Message-Id: <20211116001628.24216-4-vbabka@suse.cz>
In-Reply-To: <20211116001628.24216-1-vbabka@suse.cz>
References: <20211116001628.24216-1-vbabka@suse.cz>

From: "Matthew Wilcox (Oracle)"

Make struct slab independent of struct page. It still uses the
underlying memory in struct page for storing slab-specific data, but
slab and slub can now be weaned off using struct page directly. Some of
the wrapper functions (slab_address() and slab_order()) still need to
cast to struct folio, but this is a significant disentanglement.

[ vbabka@suse.cz: Rebase on folios, use folio instead of page where possible.

  Do not duplicate flags field in struct slab, instead make the related
  accessors go through slab_folio(). For testing pfmemalloc use the
  folio_*_active flag accessors directly so the PageSlabPfmemalloc
  wrappers can be removed later.

  Make folio_slab() expect only folio_test_slab() == true folios and
  virt_to_slab() return NULL for !folio_test_slab().

  Move struct slab to mm/slab.h.

  Don't represent with struct slab pages that are not true slab pages,
  but just a compound page obtained directly from the page allocator
  (with large kmalloc() for SLUB and SLOB). ]

Signed-off-by: Matthew Wilcox (Oracle)
Signed-off-by: Vlastimil Babka
---
 include/linux/mm_types.h |  10 +--
 mm/slab.h                | 167 +++++++++++++++++++++++++++++++++++++++
 mm/slub.c                |   8 +-
 3 files changed, 176 insertions(+), 9 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index bb8c6f5f19bc..a70c292474f0 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -56,11 +56,11 @@ struct mem_cgroup;
  * in each subpage, but you may need to restore some of their values
  * afterwards.
  *
- * SLUB uses cmpxchg_double() to atomically update its freelist and
- * counters. That requires that freelist & counters be adjacent and
- * double-word aligned. We align all struct pages to double-word
- * boundaries, and ensure that 'freelist' is aligned within the
- * struct.
+ * SLUB uses cmpxchg_double() to atomically update its freelist and counters.
+ * That requires that freelist & counters in struct slab be adjacent and
+ * double-word aligned. Because struct slab currently just reinterprets the
+ * bits of struct page, we align all struct pages to double-word boundaries,
+ * and ensure that 'freelist' is aligned within struct slab.
  */
 #ifdef CONFIG_HAVE_ALIGNED_STRUCT_PAGE
 #define _struct_page_alignment	__aligned(2 * sizeof(unsigned long))
diff --git a/mm/slab.h b/mm/slab.h
index 58c01a34e5b8..9b2c7072a290 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -5,6 +5,173 @@
  * Internal slab definitions
  */
 
+/* Reuses the bits in struct page */
+struct slab {
+	unsigned long __page_flags;
+	union {
+		struct list_head slab_list;
+		struct {	/* Partial pages */
+			struct slab *next;
+#ifdef CONFIG_64BIT
+			int slabs;	/* Nr of slabs left */
+#else
+			short int slabs;
+#endif
+		};
+		struct rcu_head rcu_head;
+	};
+	struct kmem_cache *slab_cache; /* not slob */
+	/* Double-word boundary */
+	void *freelist;		/* first free object */
+	union {
+		void *s_mem;	/* slab: first object */
+		unsigned long counters;		/* SLUB */
+		struct {			/* SLUB */
+			unsigned inuse:16;
+			unsigned objects:15;
+			unsigned frozen:1;
+		};
+	};
+
+	union {
+		unsigned int active;		/* SLAB */
+		int units;			/* SLOB */
+	};
+	atomic_t __page_refcount;
+#ifdef CONFIG_MEMCG
+	unsigned long memcg_data;
+#endif
+};
+
+#define SLAB_MATCH(pg, sl)						\
+	static_assert(offsetof(struct page, pg) == offsetof(struct slab, sl))
+SLAB_MATCH(flags, __page_flags);
+SLAB_MATCH(compound_head, slab_list);	/* Ensure bit 0 is clear */
+SLAB_MATCH(slab_list, slab_list);
+SLAB_MATCH(rcu_head, rcu_head);
+SLAB_MATCH(slab_cache, slab_cache);
+SLAB_MATCH(s_mem, s_mem);
+SLAB_MATCH(active, active);
+SLAB_MATCH(_refcount, __page_refcount);
+#ifdef CONFIG_MEMCG
+SLAB_MATCH(memcg_data, memcg_data);
+#endif
+#undef SLAB_MATCH
+static_assert(sizeof(struct slab) <= sizeof(struct page));
+
+/**
+ * folio_slab - Converts from folio to slab.
+ * @folio: The folio.
+ *
+ * Currently struct slab is a different representation of a folio where
+ * folio_test_slab() is true.
+ *
+ * Return: The slab which contains this folio.
+ */
+#define folio_slab(folio)	(_Generic((folio),			\
+	const struct folio *:	(const struct slab *)(folio),		\
+	struct folio *:		(struct slab *)(folio)))
+
+/**
+ * slab_folio - The folio allocated for a slab
+ * @slab: The slab.
+ *
+ * Slabs are allocated as folios that contain the individual objects and are
+ * using some fields in the first struct page of the folio - those fields are
+ * now accessed by struct slab. It is occasionally necessary to convert back to
+ * a folio in order to communicate with the rest of the mm. Please use this
+ * helper function instead of casting yourself, as the implementation may change
+ * in the future.
+ */
+#define slab_folio(s)		(_Generic((s),				\
+	const struct slab *:	(const struct folio *)s,		\
+	struct slab *:		(struct folio *)s))
+
+/**
+ * page_slab - Converts from first struct page to slab.
+ * @p: The first (either head of compound or single) page of slab.
+ *
+ * A temporary wrapper to convert struct page to struct slab in situations where
+ * we know the page is the compound head, or single order-0 page.
+ *
+ * Long-term ideally everything would work with struct slab directly or go
+ * through folio to struct slab.
+ *
+ * Return: The slab which contains this page
+ */
+#define page_slab(p)		(_Generic((p),				\
+	const struct page *:	(const struct slab *)(p),		\
+	struct page *:		(struct slab *)(p)))
+
+/**
+ * slab_page - The first struct page allocated for a slab
+ * @slab: The slab.
+ *
+ * A convenience wrapper for converting slab to the first struct page of the
+ * underlying folio, to communicate with code not yet converted to folio or
+ * struct slab.
+ */
+#define slab_page(s) folio_page(slab_folio(s), 0)
+
+/*
+ * If network-based swap is enabled, sl*b must keep track of whether pages
+ * were allocated from pfmemalloc reserves.
+ */
+static inline bool slab_test_pfmemalloc(const struct slab *slab)
+{
+	return folio_test_active((struct folio *)slab_folio(slab));
+}
+
+static inline void slab_set_pfmemalloc(struct slab *slab)
+{
+	folio_set_active(slab_folio(slab));
+}
+
+static inline void slab_clear_pfmemalloc(struct slab *slab)
+{
+	folio_clear_active(slab_folio(slab));
+}
+
+static inline void __slab_clear_pfmemalloc(struct slab *slab)
+{
+	__folio_clear_active(slab_folio(slab));
+}
+
+static inline void *slab_address(const struct slab *slab)
+{
+	return page_address(slab_page(slab));
+}
+
+static inline int slab_nid(const struct slab *slab)
+{
+	return folio_nid(slab_folio(slab));
+}
+
+static inline pg_data_t *slab_pgdat(const struct slab *slab)
+{
+	return folio_pgdat(slab_folio(slab));
+}
+
+static inline struct slab *virt_to_slab(const void *addr)
+{
+	struct folio *folio = page_folio(virt_to_page(addr));
+
+	if (!folio_test_slab(folio))
+		return NULL;
+
+	return folio_slab(folio);
+}
+
+static inline int slab_order(const struct slab *slab)
+{
+	return folio_order((struct folio *)slab_folio(slab));
+}
+
+static inline size_t slab_size(const struct slab *slab)
+{
+	return PAGE_SIZE << slab_order(slab);
+}
+
 #ifdef CONFIG_SLOB
 /*
  * Common fields provided in kmem_cache by all slab allocators
diff --git a/mm/slub.c b/mm/slub.c
index 17cbbe385797..a0124ee8f865 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3787,7 +3787,7 @@ static unsigned int slub_min_objects;
  * requested a higher minimum order then we start with that one instead of
  * the smallest order which will fit the object.
  */
-static inline unsigned int slab_order(unsigned int size,
+static inline unsigned int calc_slab_order(unsigned int size,
 		unsigned int min_objects, unsigned int max_order,
 		unsigned int fract_leftover)
 {
@@ -3851,7 +3851,7 @@ static inline int calculate_order(unsigned int size)
 
 	fraction = 16;
 	while (fraction >= 4) {
-		order = slab_order(size, min_objects,
+		order = calc_slab_order(size, min_objects,
 				slub_max_order, fraction);
 		if (order <= slub_max_order)
 			return order;
@@ -3864,14 +3864,14 @@ static inline int calculate_order(unsigned int size)
 	 * We were unable to place multiple objects in a slab. Now
 	 * lets see if we can place a single object there.
 	 */
-	order = slab_order(size, 1, slub_max_order, 1);
+	order = calc_slab_order(size, 1, slub_max_order, 1);
 	if (order <= slub_max_order)
 		return order;
 
 	/*
 	 * Doh this slab cannot be placed using slub_max_order.
 	 */
-	order = slab_order(size, 1, MAX_ORDER, 1);
+	order = calc_slab_order(size, 1, MAX_ORDER, 1);
 	if (order < MAX_ORDER)
 		return order;
 	return -ENOSYS;
-- 
2.33.1
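
As a standalone illustration of the conversion pattern this patch introduces (a second struct type overlaid on the memory of struct page, guarded by offsetof() static asserts and converted through const-preserving _Generic() cast macros), here is a minimal user-space C11 sketch. The struct layouts, the PAGE_MATCH name and the toy page_slab() here are stand-ins invented for the example, not the kernel's definitions; compile with -fno-strict-aliasing, as the kernel does, so the type punning stays well-defined.

/* punning.c: build with  gcc -std=c11 -fno-strict-aliasing punning.c */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for struct page: the type that owns the memory layout. */
struct page {
	unsigned long flags;
	void *freelist;
	unsigned long counters;
};

/* Stand-in for struct slab: reuses the bits of struct page. */
struct slab {
	unsigned long __page_flags;
	void *freelist;
	unsigned long counters;
};

/* Every overlaid field must sit at the same offset in both types. */
#define PAGE_MATCH(pg, sl)						\
	static_assert(offsetof(struct page, pg) ==			\
		      offsetof(struct slab, sl), "layout mismatch")
PAGE_MATCH(flags, __page_flags);
PAGE_MATCH(freelist, freelist);
PAGE_MATCH(counters, counters);
#undef PAGE_MATCH
static_assert(sizeof(struct slab) <= sizeof(struct page), "slab too big");

/* Const-preserving conversion, mirroring the shape of the patch's macros. */
#define page_slab(p) (_Generic((p),					\
	const struct page *:	(const struct slab *)(p),		\
	struct page *:		(struct slab *)(p)))

int main(void)
{
	struct page pg = { .flags = 1, .freelist = NULL, .counters = 42 };
	const struct page *p = &pg;

	/* Same memory, read through the overlay type; constness is kept. */
	printf("counters seen via struct slab: %lu\n",
	       page_slab(p)->counters);
	return 0;
}

Note that the real SLAB_MATCH() list also pins compound_head against slab_list, which is what the "Ensure bit 0 is clear" comment is about: a pointer stored in the overlaid word must never look like a tail-page marker.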
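
The rewritten comment in mm_types.h is also a layout contract, not just prose: cmpxchg_double() needs 'freelist' and 'counters' to be adjacent, with the pair starting on a double-word boundary. Below is a compile-time-only sketch of those two checks against an invented struct slab_like; the pad[] field stands in for the real leading fields and the alignas() plays the role of _struct_page_alignment, so none of this is the kernel's actual layout.

/* layout_check.c: build with  gcc -std=c11 layout_check.c */
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

struct slab_like {
	/*
	 * Stand-in for the fields that precede the freelist; the alignas
	 * here plays the role of _struct_page_alignment on struct page.
	 */
	alignas(2 * sizeof(unsigned long)) unsigned long pad[2];
	/* Double-word boundary */
	void *freelist;
	unsigned long counters;
};

/* 'freelist' must start on a double-word boundary within the struct... */
static_assert(offsetof(struct slab_like, freelist) %
	      (2 * sizeof(unsigned long)) == 0,
	      "freelist not double-word aligned");
/* ...and 'counters' must immediately follow it, with no padding. */
static_assert(offsetof(struct slab_like, counters) ==
	      offsetof(struct slab_like, freelist) + sizeof(void *),
	      "freelist and counters not adjacent");

int main(void)
{
	/* Nothing to run: the asserts above fail the build if violated. */
	return 0;
}

On x86-64, for example, that alignment is what allows cmpxchg16b to cover both fields in a single operation.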