From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <owner-linux-mm@kvack.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 5D6DAC4321E
	for <linux-mm@archiver.kernel.org>; Tue, 29 Nov 2022 13:27:24 +0000 (UTC)
Received: by kanga.kvack.org (Postfix)
	id 9F4706B0074; Tue, 29 Nov 2022 08:27:23 -0500 (EST)
Received: by kanga.kvack.org (Postfix, from userid 40)
	id 9A4EF6B0075; Tue, 29 Nov 2022 08:27:23 -0500 (EST)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042)
	id 86D2C8E0001; Tue, 29 Nov 2022 08:27:23 -0500 (EST)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11])
	by kanga.kvack.org (Postfix) with ESMTP id 781916B0074
	for <linux-mm@kvack.org>; Tue, 29 Nov 2022 08:27:23 -0500 (EST)
Received: from smtpin12.hostedemail.com (a10.router.float.18 [10.200.18.1])
	by unirelay08.hostedemail.com (Postfix) with ESMTP id B6FA2140F70
	for <linux-mm@kvack.org>; Tue, 29 Nov 2022 07:27:43 +0000 (UTC)
X-FDA: 80185650006.12.CDA0456
Received: from mail-lj1-f176.google.com (mail-lj1-f176.google.com [209.85.208.176])
	by imf17.hostedemail.com (Postfix) with ESMTP id 4B54F40002
	for <linux-mm@kvack.org>; Tue, 29 Nov 2022 07:27:43 +0000 (UTC)
Received: by mail-lj1-f176.google.com with SMTP id r8so15821676ljn.8
        for <linux-mm@kvack.org>; Mon, 28 Nov 2022 23:27:42 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20210112;
        h=user-agent:in-reply-to:content-disposition:mime-version:references
         :message-id:subject:cc:to:from:date:from:to:cc:subject:date
         :message-id:reply-to;
        bh=7Dml/9oHSBXGehrKnQ+3841cy9PRAO3te2e2VG2uxHo=;
        b=WZQgZtIUnwuDmYfEvqbeiYn1nnpt8Jglx/AnzAGEOmD+sBPAfVOx7BH15zZnCkaFMl
         HsfeiHm9/eQ6O60b208LOFn9nADQLqwkGMqtaHo+Qp34bs5vHEy1Wn+8iNT1CY/vxQnT
         yRyAUf1IRnaOXFfS2bmok/asWkH/qvsbOFM62HeHcLb/y6tNhp9iceyJTB8kl2oxDy4V
         7YU8ps9ymdw8MpkkoE/FnLG2G7RYJkOSx7w7qKP9qH5kazV/v7GC6QsjsqxTx8MTx/i9
         2OxhkE1helxVW1ZHPr1EjG+a2TKET/48S+nTqO4ihjn1TrUPCYvQjL15Hymo3ALiJLMt
         LDPg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=user-agent:in-reply-to:content-disposition:mime-version:references
         :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=7Dml/9oHSBXGehrKnQ+3841cy9PRAO3te2e2VG2uxHo=;
        b=pPUYI4B404RtwmxkUctLZRODfEYOOrtGkJaCGKdVyLnHhWvX2Kg6xl9gaegHdpHhlC
         Axg0WgmmOR8jAZdqwvn9vSWrejXQN6sJ2AbAUw9iWS7tLf9fweiCPtrnQ1pnEWHBqQRC
         6XLpMELhkXP+lSk7TYxkQPxQ6FUWbBvwsLu8nwpbktDRRZ6MmHSwINJ8DLFIFx6U3TNe
         SGrKeH48PXr5ucS2VDR+URFfwZhy5B3ASg9hJgfgjDFC/hinllxgNtlvjRMk/Pd8V7qF
         EAiPm8JAAr1iFIFoufqUeRR/BzxdJeQLPsXKxBSlbRzrRh204l7mG6e8CCB8rC7kDaWg
         ntLg==
X-Gm-Message-State: ANoB5pmjXghJIvuxX9QWjz4/yXr8wtpmliv8P7p5yRNXxtF2CSC3/P4A
	5x62PoQqSdgATwS+BJCdcaM=
X-Google-Smtp-Source: AA0mqf71GEWjA8mSSRLO0geez5HHzxBweLm/s/a+C2ApS7gQ3ym3YutcxieyZEjK73o+1tKF8LLKyw==
X-Received: by 2002:a2e:9789:0:b0:277:41d:6c1e with SMTP id y9-20020a2e9789000000b00277041d6c1emr17031603lji.330.1669706861528;
        Mon, 28 Nov 2022 23:27:41 -0800 (PST)
Received: from grain.localdomain ([5.18.253.97])
        by smtp.gmail.com with ESMTPSA id s30-20020a05651c201e00b0026df5232c7fsm1453905ljo.42.2022.11.28.23.27.40
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Mon, 28 Nov 2022 23:27:40 -0800 (PST)
Received: by grain.localdomain (Postfix, from userid 1000)
	id D07395A0020; Tue, 29 Nov 2022 10:27:39 +0300 (MSK)
Date: Tue, 29 Nov 2022 10:27:39 +0300
From: Cyrill Gorcunov <gorcunov@gmail.com>
To: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Andrew Morton <akpm@linux-foundation.org>, Mel Gorman <mgorman@suse.de>,
	Peter Xu <peterx@redhat.com>, David Hildenbrand <david@redhat.com>,
	Andrei Vagin <avagin@gmail.com>, kernel@collabora.com,
	stable@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm: set the vma flags dirty before testing if it is
 mergeable
Message-ID: <Y4W0axw0ZgORtfkt@grain>
References: <20221122115007.2787017-1-usama.anjum@collabora.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20221122115007.2787017-1-usama.anjum@collabora.com>
User-Agent: Mutt/2.2.7 (2022-08-07)
ARC-Authentication-Results: i=1;
	imf17.hostedemail.com;
	dkim=pass header.d=gmail.com header.s=20210112 header.b=WZQgZtIU;
	spf=pass (imf17.hostedemail.com: domain of gorcunov@gmail.com designates 209.85.208.176 as permitted sender) smtp.mailfrom=gorcunov@gmail.com;
	dmarc=pass (policy=none) header.from=gmail.com
ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1669706863; a=rsa-sha256;
	cv=none;
	b=K5SL/6SsmYvQsm8Az4zDBwPInyAEbd88ogQ2f8HGZHscJMQn8jADbUrCoF5Ov6wMomekoL
	Nhdd6CFbqU1AKwDHNAcU43eujT/4iH1ZkU0yHuQqs05EsjakCD24+rSx5aiz2rc1LwQqYM
	4TSCVl/MdOBoXvv4e7kZdgOlAKqTNds=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com;
	s=arc-20220608; t=1669706863;
	h=from:from:sender:reply-to:subject:subject:date:date:
	 message-id:message-id:to:to:cc:cc:mime-version:mime-version:
	 content-type:content-type:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references:dkim-signature;
	bh=7Dml/9oHSBXGehrKnQ+3841cy9PRAO3te2e2VG2uxHo=;
	b=Rqm3imJWVPNVWzPFXFdRVbn+ZDWVT2snwqK4cmtLre7SZEd9hbpjN+G7wMtaYMnzB8safo
	Uv9jH4tisaESFMTlV2/1ImJWf8V4iJYszk0x9yUqqpZ/tGdB2SYjzNjfeJ3CCcTp9zHJBv
	08MOIAdb8JkK22x2E7IqR6HrDnOPjHM=
X-Rspam-User: 
X-Rspamd-Server: rspam07
X-Rspamd-Queue-Id: 4B54F40002
Authentication-Results: imf17.hostedemail.com;
	dkim=pass header.d=gmail.com header.s=20210112 header.b=WZQgZtIU;
	spf=pass (imf17.hostedemail.com: domain of gorcunov@gmail.com designates 209.85.208.176 as permitted sender) smtp.mailfrom=gorcunov@gmail.com;
	dmarc=pass (policy=none) header.from=gmail.com
X-Stat-Signature: r647g5uerjj6to1j3sc1jkobonjdzote
X-HE-Tag: 1669706863-449263
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>

On Tue, Nov 22, 2022 at 04:50:07PM +0500, Muhammad Usama Anjum wrote:
> The VM_SOFTDIRTY should be set in the vma flags to be tested if new
> allocation should be merged in previous vma or not. With this patch,
> the new allocations are merged in the previous VMAs.

Hi Muhammad! Thanks for the patch and sorry for the late reply. Here is a point
I don't understand -- when we test for a vma merge we use the is_mergeable_vma()
helper, which excludes the VM_SOFTDIRTY flag from the comparison, so setting this
flag earlier should not change the behaviour. Or am I missing something obvious?
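
To make the comparison concrete, here is a minimal user-space sketch of the
flag check only. flags_mergeable() is a hypothetical name and the flag values
are illustrative assumptions; the real helper is is_mergeable_vma() in
mm/mmap.c, and it checks more than the flags.

```c
#include <stdbool.h>

/* Illustrative values, assumed here; the real ones are in include/linux/mm.h. */
#define VM_READ      0x00000001UL
#define VM_WRITE     0x00000002UL
#define VM_SOFTDIRTY 0x08000000UL

/*
 * Hypothetical helper modelling only the flag comparison: two flag words
 * are compatible for merging if they differ in nothing but VM_SOFTDIRTY.
 */
static bool flags_mergeable(unsigned long vma_flags, unsigned long new_flags)
{
	/* XOR leaves the differing bits; mask VM_SOFTDIRTY out of the result. */
	return ((vma_flags ^ new_flags) & ~VM_SOFTDIRTY) == 0;
}
```

With such a mask in place, whether VM_SOFTDIRTY is already set in the new
flags does not affect the result of the check.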

> I've tested it by reverting the commit 34228d473efe ("mm: ignore
> VM_SOFTDIRTY on VMA merging") and after adding this following patch,
> I'm seeing that all the new allocations done through mmap() are merged
> in the previous VMAs. The number of VMAs doesn't increase drastically
> which had contributed to the crash of gimp. If I run the same test after
> reverting and not including this patch, the number of VMAs keep on
> increasing with every mmap() syscall which proves this patch.

is_mergeable_vma() is the key function here: either we set VM_SOFTDIRTY
explicitly as your patch does and stop excluding VM_SOFTDIRTY in
is_mergeable_vma(), or we continue excluding this flag in that low-level
helper as is.

> The commit 34228d473efe ("mm: ignore VM_SOFTDIRTY on VMA merging")
> seems like a workaround. But it lets the soft-dirty and non-soft-dirty
> VMA to get merged. It helps in avoiding the creation of too many VMAs.
> But it creates the problem while adding the feature of clearing the
> soft-dirty status of only a part of the memory region.

So you need extended functionality; could you please put this
changelog snippet somewhere at the top? Otherwise, when I started reading
this patch, I simply didn't get what we're trying to achieve.

> 
> Cc: <stable@vger.kernel.org>
> Fixes: d9104d1ca966 ("mm: track vma changes with VM_SOFTDIRTY bit")

Wait, is there some critical bug or error that needs stable@ to be
patched? Softdirty was implemented in the first place to cover the
minimum needs of dirty page tracking. More precise tracking
(such as partial cleanup of a memory region) will require at least
additional structures to remember which part of a vma is cleared and which
one is dirty after a merge. And I don't think this is possible to implement
without extending the vma structure itself (which is big enough already).

Or maybe I'm blind and don't see an obvious problem here, sorry then :)

> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
> ---
> We need more testing of this patch.
> 
> While implementing clear soft-dirty bit for a range of address space, I'm
> facing an issue. The non-soft dirty VMA gets merged sometimes with the soft
> dirty VMA. Thus the non-soft dirty VMA become dirty which is undesirable.
>
> When discussed with the some other developers they consider it the
> regression. Why the non-soft dirty page should appear as soft dirty when it
> isn't soft dirty in reality? I agree with them. Should we revert
> 34228d473efe or find a workaround in the IOCTL?

Well, this is not a regression; it has been designed this way because
there is no place to keep subflags on regions covered by one VMA, and not
merging them causes vma fragmentation (I've seen massive vma fragmentation,
especially in db engines). So no, reverting it is not an option; it would
rather cause problems in real applications, I fear.
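
A toy user-space model of why one VMA cannot carry per-range state. This is
not kernel code: struct toy_vma, toy_merge() and toy_addr_soft_dirty() are
made up for illustration, and OR-ing VM_SOFTDIRTY on merge is an assumption
chosen to mirror the "treat it as dirty" outcome described in the quoted text.

```c
#include <stdbool.h>

#define VM_SOFTDIRTY 0x08000000UL /* illustrative value, assumed here */

/* Hypothetical stand-in for a VMA: one range, one flag word. */
struct toy_vma {
	unsigned long start, end; /* covers [start, end) */
	unsigned long flags;
};

/* Merge b into a when they are adjacent and differ at most in VM_SOFTDIRTY. */
static bool toy_merge(struct toy_vma *a, const struct toy_vma *b)
{
	if (a->end != b->start)
		return false;
	if ((a->flags ^ b->flags) & ~VM_SOFTDIRTY)
		return false;
	a->end = b->end;
	/* Assumption: the merged range is reported dirty if either half was. */
	a->flags |= b->flags & VM_SOFTDIRTY;
	return true;
}

/* The single flag word answers for every address inside the range. */
static bool toy_addr_soft_dirty(const struct toy_vma *v, unsigned long addr)
{
	return addr >= v->start && addr < v->end && (v->flags & VM_SOFTDIRTY);
}
```

After a merge there is only one flag word left, so an address from the
formerly clean half is reported the same way as one from the dirty half.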

> 
> * Revert may cause the VMAs to expand in uncontrollable situation where the
> soft dirty bit of a lot of memory regions or the whole address space is
> being cleared again and again. AFAIK normal process must either be only
> clearing a few memory regions. So the applications should be okay. There is
> still chance of regressions if some applications are already using the
> soft-dirty bit. I'm not sure how to test it.

The main purpose of this dirty tracking functionality came from the containers
c/r procedure. As far as I remember we've been clearing vmas for the whole
container, though it's been a while and I'm not involved in c/r development
right now, so I may be misremembering something.

> * Add a flag in the IOCTL to ignore the dirtiness of VMA. The user will
> surely lose the functionality to detect reused memory regions. But the
> extraneous soft-dirty pages would not appear. I'm trying to do this in the
> patch series [1]. Some discussion is going on that this fails with some
> mprotect use case [2]. I still need to have a look at the mprotect selftest
> to see how and why this fails. I think this can be implemented after some
> more work probably in mprotect side.

An ioctl might be an option indeed.

> 
> [1] https://lore.kernel.org/all/20221109102303.851281-1-usama.anjum@collabora.com/
> [2] https://lore.kernel.org/all/bfcae708-db21-04b4-0bbe-712badd03071@redhat.com/