Date: Fri, 20 Aug 2021 15:12:01 -0400
From: Peter Xu
To: Tiberiu Georgescu
Cc: David Hildenbrand, "linux-kernel@vger.kernel.org",
    "linux-mm@kvack.org", Alistair Popple, Ivan Teterevkov, Mike Rapoport,
    Hugh Dickins, Matthew Wilcox, Andrea Arcangeli, "Kirill A . Shutemov",
    Andrew Morton, Mike Kravetz, "Carl Waldspurger [C]", Florian Schmidt,
    Jonathan Davies
Subject: Re: [PATCH RFC 0/4] mm: Enable PM_SWAP for shmem with PTE_MARKER

Hello, Tiberiu,

On Fri, Aug 20, 2021 at 04:49:58PM +0000, Tiberiu Georgescu wrote:
> Firstly, I am worried lseek with the SEEK_HOLE flag would page in pages from
> swap, so using it would be a direct factor on its own output. If people are
> working on Live Migration, this would not be ideal.
> I am not 100% sure this is how lseek
> works, so please feel free to contradict me, but I think it would swap in some
> of the pages that it seeks through, if not all, to figure out when to stop.
> Unless it leverages the page cache somehow, or an internal bitmap.

It shouldn't.  The man page is clear on that:

       SEEK_DATA
              Adjust the file offset to the next location in the file
              greater than or equal to offset containing data.  If offset
              points to data, then the file offset is set to offset.

Again, I think your requirement is different from CRIU's, so I think
mincore() is the right thing for you.

>
> Secondly, mincore() could return some "false positives" for this particular use
> case. That is because it returns flag=1 for pages which are still in the swap
> cache, so the output becomes ambiguous.

I don't think so; mincore() should return flag=0 if the page is either in
the swap cache or has even been dropped from it.  I think its name and
documentation also suggest that: "as long as it's not in RAM, the flag is
cleared".  That's why I think it should indeed be what you're looking for,
if the swp entry can be ignored.  More below on that.

Note that my series is, as you mentioned, missing the changes to support
mincore() (otherwise I'd have known of its existence!).  It'll be trivial
to add that, but let's see whether mincore() will satisfy your need.

[...]

> It is possible for the swap device to be network attached and shared, so multiple
> hosts would need to understand its content. Then it is no longer internal to one
> kernel only.
>
> By being swap-aware, we can skip swapped-out pages during migration (to
> prevent IO and potential thrashing), and transfer those pages in another way
> that is zero-copy.

That sounds reasonable, but I'm not aware of any user API that exposes swap
entries to userspace, or is there one?  I.e., how do you know which swap
device is which?  How do you guarantee the kernel swp entry information
won't change over time?

Thanks,

-- 
Peter Xu
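
For reference, here is a minimal userspace sketch of the SEEK_DATA/SEEK_HOLE
interface quoted from the man page above.  It is not code from this thread
or from the patch series; the helper names data_extents() and demo() are
hypothetical, and it uses an ordinary temporary file rather than shmem.  It
only shows that the data ranges can be enumerated by moving the file offset,
without reading any contents.  Whether the kernel swaps anything in while
answering is a kernel-side matter that this sketch does not demonstrate.

```python
import errno
import os
import tempfile


def data_extents(fd):
    """Return a list of (start, end) byte ranges that lseek() reports as data."""
    size = os.fstat(fd).st_size
    extents = []
    offset = 0
    while offset < size:
        try:
            # Next offset >= `offset` that contains data.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:  # no data at or after `offset`
                break
            raise
        # Every file has an implicit hole at EOF, so this finds an end.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        extents.append((start, end))
        offset = end
    return extents


def demo():
    """Build a sparse file with two small data regions and list its extents."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        os.ftruncate(fd, 8 << 20)            # 8 MiB, initially all hole
        os.pwrite(fd, b"x" * 4096, 1 << 20)  # data at 1 MiB
        os.pwrite(fd, b"y" * 4096, 6 << 20)  # data at 6 MiB
        return data_extents(fd)
```

The exact extent boundaries depend on the filesystem: one that does not
track holes may report the whole file as a single data range, so callers
should treat the result as a superset of the written regions.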