Date: Mon, 11 Oct 2021 13:13:18 -0400
From: Johannes Weiner
To: "Matthew Wilcox (Oracle)"
Cc: Roman Gushchin, linux-mm@kvack.org
Subject: Re: [PATCH 57/62] memcg: Convert object cgroups from struct page to struct slab
References: <20211004134650.4031813-1-willy@infradead.org> <20211004134650.4031813-58-willy@infradead.org>
In-Reply-To: <20211004134650.4031813-58-willy@infradead.org>

CC Roman for the slab tracking bits

On Mon, Oct 04, 2021 at 02:46:45PM +0100, Matthew Wilcox (Oracle) wrote:
> @@ -537,41 +537,41 @@ static inline bool PageMemcgKmem(struct page *page)
>  }
> 
>  /*
> - * page_objcgs - get the object cgroups vector associated with a page
> - * @page: a pointer to the page struct
> + * slab_objcgs - get the object cgroups vector associated with a page
> + * @slab: a pointer to the slab struct
>   *
> - * Returns a pointer to the object cgroups vector associated with the page,
> - * or NULL. This function assumes that the page is known to have an
> + * Returns a pointer to the object cgroups vector associated with the slab,
> + * or NULL. This function assumes that the slab is known to have an
>   * associated object cgroups vector. It's not safe to call this function
>   * against pages, which might have an associated memory cgroup: e.g.
>   * kernel stack pages.
>   */
> -static inline struct obj_cgroup **page_objcgs(struct page *page)
> +static inline struct obj_cgroup **slab_objcgs(struct slab *slab)
>  {
> -	unsigned long memcg_data = READ_ONCE(page->memcg_data);
> +	unsigned long memcg_data = READ_ONCE(slab->memcg_data);
> 
> -	VM_BUG_ON_PAGE(memcg_data && !(memcg_data & MEMCG_DATA_OBJCGS), page);
> -	VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, page);
> +	VM_BUG_ON_PAGE(memcg_data && !(memcg_data & MEMCG_DATA_OBJCGS), slab_page(slab));
> +	VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, slab_page(slab));
> 
>  	return (struct obj_cgroup **)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
>  }

I like this whole patch series, but I think for memcg this is a
particularly nice cleanup. Because right now we can have user pages
pointing to a memcg, random alloc_page(GFP_ACCOUNT) pages pointing to
an objcg, and slab pages pointing to an array of objcgs - all in the
same memcg_data member.

After your patch, slab->memcg_data points to an array of objcgs,
period. The only time it doesn't is when there is a bug.

Once the memcg_data member is no longer physically shared between
page and slab, we can do:

	struct slab {
		struct obj_cgroup **objcgs;
	};

and ditch the accessor function altogether.
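Just to sketch where that ends up (illustrative only, not against your
actual tree - the typed objcgs member is the hypothetical one from
above, the rest is taken from your hunk below): with no pointer tagging
left to strip, the per-object lookup can index the vector directly.

	/* sketch: direct access once struct slab has a typed member */
	struct slab *slab = virt_to_slab(p);
	unsigned int off = obj_to_index(slab->slab_cache, slab, p);
	struct obj_cgroup *objcg = slab->objcgs[off];
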
> - * page_objcgs_check - get the object cgroups vector associated with a page
> - * @page: a pointer to the page struct
> + * slab_objcgs_check - get the object cgroups vector associated with a page
> + * @slab: a pointer to the slab struct
>   *
> - * Returns a pointer to the object cgroups vector associated with the page,
> - * or NULL. This function is safe to use if the page can be directly associated
> + * Returns a pointer to the object cgroups vector associated with the slab,
> + * or NULL. This function is safe to use if the slab can be directly associated
>   * with a memory cgroup.
>   */
> -static inline struct obj_cgroup **page_objcgs_check(struct page *page)
> +static inline struct obj_cgroup **slab_objcgs_check(struct slab *slab)
>  {
> -	unsigned long memcg_data = READ_ONCE(page->memcg_data);
> +	unsigned long memcg_data = READ_ONCE(slab->memcg_data);
> 
>  	if (!memcg_data || !(memcg_data & MEMCG_DATA_OBJCGS))
>  		return NULL;
> 
> -	VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, page);
> +	VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, slab_page(slab));
> 
>  	return (struct obj_cgroup **)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);

This is a bit weird. The function is used in one place, to check
whether a random page is a slab page. It's essentially a generic type
check on the page!

After your changes, you pass a struct slab that might well be invalid
if this isn't a slab page, and you rely on the PAGE's memcg_data to
tell you whether this is the case. It works because page->memcg_data
is overlaid with slab->memcg_data, but that won't be the case if we
allocate struct slab separately.

To avoid that trap down the road, I think it would be better to keep
the *page* the ambiguous object for now, and only resolve to struct
slab after the type check. So that every time you see struct slab,
you know it's valid.

In fact, I think it would be best to just inline page_objcgs_check()
into its sole caller. It would clarify the resolution from wildcard
page to valid struct slab quite a bit:

> @@ -2819,38 +2819,39 @@ int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s,
>   */
>  struct mem_cgroup *mem_cgroup_from_obj(void *p)
>  {
> -	struct page *page;
> +	struct slab *slab;
> 
>  	if (mem_cgroup_disabled())
>  		return NULL;
> 
> -	page = virt_to_head_page(p);
> +	slab = virt_to_slab(p);
> 
>  	/*
>  	 * Slab objects are accounted individually, not per-page.
>  	 * Memcg membership data for each individual object is saved in
> -	 * the page->obj_cgroups.
> +	 * the slab->obj_cgroups.
>  	 */
> -	if (page_objcgs_check(page)) {
> +	if (slab_objcgs_check(slab)) {

I.e. do this instead:

	page = virt_to_head_page(p);

	/* object is backed by slab */
	if (page->memcg_data & MEMCG_DATA_OBJCGS) {
		struct slab *slab = (struct slab *)page;

		objcg = slab_objcgs(...)[]
		return objcg ? obj_cgroup_memcg(objcg) : NULL;
	}

	/* object is backed by a regular kernel page */
	return page_memcg_check(page);

>  		struct obj_cgroup *objcg;
>  		unsigned int off;
> 
> -		off = obj_to_index(page->slab_cache, page, p);
> -		objcg = page_objcgs(page)[off];
> +		off = obj_to_index(slab->slab_cache, slab, p);
> +		objcg = slab_objcgs(slab)[off];
>  		if (objcg)
>  			return obj_cgroup_memcg(objcg);
> 
>  		return NULL;
>  	}
> 
> +	/* I am pretty sure this could just be 'return NULL' */

No, we could still be looking at a regular page that is being tracked
by memcg. People do (void *)__get_free_pages(GFP_ACCOUNT). So this
needs to stay 'return page_memcg_check()'.
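
To spell out the case I mean (illustrative only, using
GFP_KERNEL_ACCOUNT as the accounted allocation):

	/*
	 * An accounted order-0 page, not a slab object: its
	 * page->memcg_data carries a kmem objcg (MEMCG_DATA_KMEM),
	 * not an objcgs vector, so mem_cgroup_from_obj() has to fall
	 * back to page_memcg_check() to resolve the memcg.
	 */
	void *buf = (void *)__get_free_pages(GFP_KERNEL_ACCOUNT, 0);
	struct mem_cgroup *memcg = mem_cgroup_from_obj(buf);

	free_pages((unsigned long)buf, 0);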