From: David Hildenbrand
Subject: Re: [PATCH v3 5/6] mm/gup: migrate pinned pages out of movable zone
Date: Fri, 11 Dec 2020 22:29:08 +0100
Message-Id: <10F682D5-0654-4C42-9989-F999D4434295@redhat.com>
To: Pavel Tatashin
Cc: Jason Gunthorpe, LKML, linux-mm, Andrew Morton, Vlastimil Babka, Michal Hocko, David Hildenbrand, Oscar Salvador, Dan Williams, Sasha Levin, Tyler Hicks, Joonsoo Kim, mike.kravetz@oracle.com, Steven Rostedt, Ingo Molnar, Peter Zijlstra, Mel Gorman, Matthew Wilcox, David Rientjes, John Hubbard, Linux Doc Mailing List

> On 11.12.2020 at 22:09, Pavel Tatashin wrote:
>
> On Fri, Dec 11, 2020 at 3:46 PM Jason Gunthorpe wrote:
>>
>>> On Fri, Dec 11, 2020 at 03:40:57PM -0500, Pavel Tatashin wrote:
>>> On Fri, Dec 11, 2020 at 3:23 PM Jason Gunthorpe wrote:
>>>>
>>>> On Fri, Dec 11, 2020 at 03:21:39PM -0500, Pavel Tatashin wrote:
>>>>> @@ -1593,7 +1592,7 @@ static long check_and_migrate_cma_pages(struct mm_struct *mm,
>>>>>                 }
>>>>>
>>>>>                 if (!isolate_lru_page(head)) {
>>>>> -                       list_add_tail(&head->lru, &cma_page_list);
>>>>> +                       list_add_tail(&head->lru, &movable_page_list);
>>>>>                         mod_node_page_state(page_pgdat(head),
>>>>>                                             NR_ISOLATED_ANON +
>>>>>                                             page_is_file_lru(head),
>>>>> @@ -1605,7 +1604,7 @@ static long check_and_migrate_cma_pages(struct mm_struct *mm,
>>>>>                 i += step;
>>>>>         }
>>>>>
>>>>> -       if (!list_empty(&cma_page_list)) {
>>>>> +       if (!list_empty(&movable_page_list)) {
>>>>
>>>> You didn't answer my earlier question, is it OK that ZONE_MOVABLE
>>>> pages leak out here if isolate_lru_page() fails but the
>>>> movable_page_list is empty?
>>>>
>>>> I think the answer is no, right?
>>> In my opinion it is OK. We are doing our best to not pin movable
>>> pages, but if isolate_lru_page() fails because pages are currently
>>> locked by someone else, we will end up long-term pinning them.
>>> See comment in this patch:
>>> +        * 1. Pinned pages: (long-term) pinning of movable pages is avoided
>>> +        *    when pages are pinned and faulted, but it is still possible that
>>> +        *    address space already has pages in ZONE_MOVABLE at the time when
>>> +        *    pages are pinned (i.e. user has touched that memory before
>>> +        *    pinning). In such case we try to migrate them to a different zone,
>>> +        *    but if migration fails the pages can still end up pinned in
>>> +        *    ZONE_MOVABLE. In such case, memory offlining might retry a long
>>> +        *    time and will only succeed once user application unpins pages.
>>
>> It is not "retry a long time" it is "might never complete" because
>> userspace will hold the DMA pin indefinitely.
>>
>> Confused what the point of all this is then ??
>>
>> I thought the goal here is to make memory unplug reliable, if you leave
>> a hole like this then any hostile userspace can block it forever.
>
> You are right, I used the wording from the previous comment, and it
> should be made clear that the pin may last forever. Without these patches it
> is guaranteed that hot-remove will fail if there are pinned pages as
> ZONE_MOVABLE is actually the first to be searched. Now, it will fail
> only due to exceptions listed in ZONE_MOVABLE comment:
>
> 1. pin + migration/isolation failure

Not sure what that really means. We have short-term pinnings (although we might have a better term for "pinning" here), for example, when a process dies (IIRC). There is a period where pages cannot get migrated and offlining code has to retry (which might take a while). This still applies after your change - are you referring to that?

> 2. memblock allocation due to limited amount of space for kernelcore
> 3. memory holes
> 4. hwpoison
> 5. Unmovable PG_offline pages (? need to study why this is a scenario).

Virtio-mem is the primary user in this context.

> Do you think we should unconditionally unpin pages, and return error
> when isolation/migration fails?

I'm not sure what you mean here. Who's supposed to unpin which pages?

>
> Pasha
>
>>
>> Jason
>
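
To make the two options in this thread concrete, here is a minimal userspace sketch in C. It is only a model under stated assumptions, not the kernel code: `struct page_model`, `try_migrate()`, `pin_best_effort()` and `pin_strict()` are hypothetical names invented for illustration, and the choice of `-EBUSY` as the error code is likewise an assumption. The first function mirrors the "do our best" behaviour described in the patch comment, where a page that cannot be isolated stays pinned in ZONE_MOVABLE. The second mirrors the alternative raised in the question about unpinning and returning an error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page; NOT the kernel's struct page. */
struct page_model {
    bool in_movable_zone; /* page currently sits in ZONE_MOVABLE */
    bool isolatable;      /* isolation would succeed right now */
    bool pinned;          /* a long-term pin is held on the page */
};

/* Stand-in for isolate + migrate: succeeds only if the page can be isolated. */
static bool try_migrate(struct page_model *p)
{
    if (!p->isolatable)
        return false;
    p->in_movable_zone = false; /* page now lives outside ZONE_MOVABLE */
    return true;
}

/*
 * Best-effort policy: migrate what can be migrated, then pin everything.
 * Returns how many pages ended up pinned while still in ZONE_MOVABLE.
 */
static size_t pin_best_effort(struct page_model *pages, size_t n)
{
    size_t leaked = 0;

    assert(pages != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pages[i].in_movable_zone && !try_migrate(&pages[i]))
            leaked++;
        pages[i].pinned = true;
    }
    return leaked;
}

/*
 * Strict policy: if any movable page cannot be migrated, drop the pins
 * taken so far and report an error (error code is an assumption).
 */
static int pin_strict(struct page_model *pages, size_t n)
{
    assert(pages != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pages[i].in_movable_zone && !try_migrate(&pages[i])) {
            for (size_t j = 0; j < i; j++)
                pages[j].pinned = false;
            return -EBUSY;
        }
        pages[i].pinned = true;
    }
    return 0;
}
```

In this model the best-effort variant never fails the caller but can leave a pinned page behind in the movable zone, which is the hole pointed out above. The strict variant never leaves such a page pinned, at the cost of failing the pin request whenever a single page is temporarily not isolatable.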