From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-it0-f71.google.com (mail-it0-f71.google.com [209.85.214.71])
	by kanga.kvack.org (Postfix) with ESMTP id C35554405B1
	for ; Wed, 15 Feb 2017 11:57:58 -0500 (EST)
Received: by mail-it0-f71.google.com with SMTP id e137so74814077itc.0
	for ; Wed, 15 Feb 2017 08:57:58 -0800 (PST)
Received: from mail-it0-x22b.google.com (mail-it0-x22b.google.com. [2607:f8b0:4001:c0b::22b])
	by mx.google.com with ESMTPS id w19si4576633ioi.6.2017.02.15.08.57.57
	for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Wed, 15 Feb 2017 08:57:57 -0800 (PST)
Received: by mail-it0-x22b.google.com with SMTP id 203so69023602ith.0
	for ; Wed, 15 Feb 2017 08:57:57 -0800 (PST)
MIME-Version: 1.0
In-Reply-To:
References: <20170213195858.5215-1-edumazet@google.com>
	<20170213195858.5215-9-edumazet@google.com>
	<20170214131206.44b644f6@redhat.com>
	<1487087488.8227.53.camel@edumazet-glaptop3.roam.corp.google.com>
From: Eric Dumazet
Date: Wed, 15 Feb 2017 08:57:56 -0800
Message-ID:
Subject: Re: [PATCH v3 net-next 08/14] mlx4: use order-0 pages for RX
Content-Type: text/plain; charset=UTF-8
Sender: owner-linux-mm@kvack.org
List-ID:
To: Tariq Toukan
Cc: Tom Herbert , Eric Dumazet , Jesper Dangaard Brouer ,
	Alexander Duyck , "David S . Miller" , netdev , Martin KaFai Lau ,
	Saeed Mahameed , Willem de Bruijn , Brenden Blanco ,
	Alexei Starovoitov , linux-mm

On Wed, Feb 15, 2017 at 8:42 AM, Tariq Toukan wrote:
>
>
> Isn't it the same principle in page_frag_alloc() ?
> It is called from __netdev_alloc_skb()/__napi_alloc_skb().
>
> Why is it ok to have order-3 pages (PAGE_FRAG_CACHE_MAX_ORDER) there?

This is not ok.

This is a very well known problem, we already mentioned that here in the past,
but at least core networking stack uses order-0 pages on PowerPC.

mlx4 driver suffers from this problem 100% more than other drivers ;)

One problem at a time Tariq. Right now, only mlx4 has this big problem
compared to other NICs.
Then, if we _still_ hit major issues, we might also need to force
napi_get_frags() to allocate skb->head using kmalloc() instead of a page frag.

That is a very simple fix.

Remember that we have skb->truesize that is an approximation, it will
never be completely accurate,
but we need to make it better.

mlx4 driver pretends to have a frag truesize of 1536 bytes, but this
is obviously wrong when host is under memory pressure
(2 frags per page -> truesize should be 2048)

> By using netdev/napi_alloc_skb, you'll get that the SKB's linear data is a
> frag of a huge page,
> and it is not going to be freed before the other non-linear frags.
> Cannot this cause the same threats (memory pinning and so...)?
>
> Currently, mlx4 doesn't use this generic API, while most other drivers do.
>
> Similar claims are true for TX:
> https://github.com/torvalds/linux/commit/5640f7685831e088fe6c2e1f863a6805962f8e81

We do not have such problem on TX. GFP_KERNEL allocations do not have
the same issues.

Tasks are usually not malicious in our DC, and most serious
applications use memcg or such memory control.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
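
The truesize arithmetic mentioned above (1536 claimed vs 2048 actual when
only 2 frags fit in a page) can be sketched in plain userspace C. This is
only an illustration, not kernel or mlx4 code: the helper names are
hypothetical, and it assumes fixed-stride fragments carved from a block that
cannot be freed until every fragment in it is released. The 4096-byte page
size used in the note below is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only; not kernel or mlx4 code. */

/* Number of fixed-stride fragments that fit in one allocation block. */
static size_t frags_per_block(size_t block_bytes, size_t frag_stride)
{
    assert(frag_stride > 0 && frag_stride <= block_bytes);
    return block_bytes / frag_stride;
}

/*
 * Memory each fragment really costs: the whole block stays allocated
 * until all of its fragments are released, so the block size is split
 * evenly among the fragments that fit (integer division).
 */
static size_t effective_truesize(size_t block_bytes, size_t frag_stride)
{
    return block_bytes / frags_per_block(block_bytes, frag_stride);
}

/* Bytes per fragment that go uncharged if truesize is set to the stride. */
static size_t undercharge_per_frag(size_t block_bytes, size_t frag_stride)
{
    return effective_truesize(block_bytes, frag_stride) - frag_stride;
}
```

With a 4096-byte page and a 1536-byte stride, frags_per_block() gives 2 and
effective_truesize() gives 2048, so 512 bytes per fragment would go
uncharged if truesize were reported as 1536.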