* Re: How to read-protect a vm_area? [not found] <199802192321.XAA06580@dax.dcs.ed.ac.uk> @ 1998-02-20 5:41 ` Benjamin C.R. LaHaise 1998-02-23 23:17 ` PATCH: Swap shared pages (was: How to read-protect a vm_area?) Stephen C. Tweedie 0 siblings, 1 reply; 12+ messages in thread From: Benjamin C.R. LaHaise @ 1998-02-20 5:41 UTC (permalink / raw) To: Stephen C. Tweedie Cc: Rik van Riel, Linus Torvalds, Itai Nahshon, Alan Cox, paubert, linux-kernel, Ingo Molnar, linux-mm On Thu, 19 Feb 1998, Stephen C. Tweedie wrote: ... > Please do let me know if you want to start hacking around with this > code --- we probably want to coordinate with some of the other VM > things happening at the moment (in particular, things like Ingo's swap > prediction and the dirty page caching suggestions). ... Just to let people know, as a successor to my pte-list/swapping patch from the 2.1.48/66 days (which made running X on my nfs-root'd [34]86 possible/reliable), I'm currently mostly done a patch that does Mach-style page replacement (active/inactive/free) as an alternative to kswapd. I'm also hoping to work on adding a per-cpu free page cache to get_free_page later this month when my PPros arrive (grumble). As Rik mentioned, please feel free to make use of linux-mm@kvack.org for discussion purposes. (echo subscribe | mail majordomo@kvack.org) It's been quite, but then it's Febuary. About the dirty page caching suggestions: Eric W. Biederman <ebiederm+eric@npwt.net> wrote patches to support that against 2.1.78, but last time the issue was brought up, things became messy as NFS needs dentries now, yet we'll only ever have inodes in struct page. Now that the dentry list is back in the inode, perhaps a patch to revert read/writepage to non-dentry arguments could be accepted? (NFS could get its dentry from the i_dentry list.) Linus: how far off are you hoping for 2.2? It seems like there are icebreakers out on the first code freeze... Or maybe that just how things work ;-) -ben ^ permalink raw reply [flat|nested] 12+ messages in thread
* PATCH: Swap shared pages (was: How to read-protect a vm_area?) 1998-02-20 5:41 ` How to read-protect a vm_area? Benjamin C.R. LaHaise @ 1998-02-23 23:17 ` Stephen C. Tweedie 1998-02-23 23:27 ` Linus Torvalds ` (4 more replies) 0 siblings, 5 replies; 12+ messages in thread From: Stephen C. Tweedie @ 1998-02-23 23:17 UTC (permalink / raw) To: Benjamin C.R. LaHaise Cc: Stephen C. Tweedie, Rik van Riel, Linus Torvalds, Itai Nahshon, Alan Cox, paubert, linux-kernel, Ingo Molnar, linux-mm Hi, On Fri, 20 Feb 1998 00:41:19 -0500 (EST), "Benjamin C.R. LaHaise" <blah@kvack.org> said: > As Rik mentioned, please feel free to make use of linux-mm@kvack.org for > discussion purposes. (echo subscribe | mail majordomo@kvack.org) It's > been quite, but then it's Febuary. OK. I'm CC:ing there. The patch below, against 2.1.88, adds a bunch of new functionality to the swapper. The main changes are: * All swapping goes through the swap cache (aka. page cache) now. * There is no longer a swap lock map. Because we need to atomically test and create a new swap-cache page in order to do swap IO, it is sufficient just to lock the struct page itself. Having only one layer of locking to deal with removes a number of races concerning swapping shared pages. * We can swap shared pages, and still keep them shared when they are swapped back in!!! Currently, only private shared pages (as in pages shared after a fork()) benefit from this, but the basic mechanism will be appropriate for MAP_ANONYMOUS | MAP_SHARED pages too (implementation to follow). Pages will remain shared after a swapoff. * The page cache is now quite happy dealing with swap-cache pages too. In particular, write-ahead and read-ahead of swap through the page cache will work fine (and in fact, write-ahead does get done already under certain circumstances with this patch --- that's essentially how the swapping of shared pages gets done). Support code to perform asynchronous readahead of swap is included, but is not actually used anywhere yet. I've tested with a number of forked processes running with a shared working set larger than physical memory, and with SysV shared memory. I haven't found any problems with it so far. Enjoy. Cheers, Stephen. ---------------------------------------------------------------- begin 644 swapdiff-2.1.88.gz M'XL(".8`\C0"`W-W87!D:69F+3(N,2XX.`"U/6MSVT:2GZE?,=(F-FF"%$F] M+'GMK!++CF[C1TEV4GM[*19$@A0B$N`"H&CMQ?_]^C%/8$!2WKK4KB4!,X.9 MGIY^=\\XGDQ$)UN*_6*^V+_+5^'B8'\6)\LO^Y-\_V8YF419=R0J3W8ZG<[Z M/HUW:2+>1#=B<"`&_;-^[^SH1/1/3Y_OM-MML;[YX`QZ'/2X^=_^)CK]@T$_ MZ!^*-O]R(/[VMQVQ_TQ<+1-1W$;B-DWO<O@M+,1M>!^)(A4WD1BG2216MU$B M0K$(IY&XW/\`[W,Q2N>+651$XZYXMK\C\B(LXI&($YA5).[3>"S"21%EPV4R M2T=W0^K;S(ML.2IXH&?TH[4C_G>GTX@GHEE$>3$,D_%P-(O";'@3%\V/;X?C M:)0-::A`/,$>G5>363C-6ZV=]C=U@^^)1B,LTGD\PF;-)PFTS!^2$4TR;[U8 M.Q_<*K6HNO$[C<;^,_P7%OE;]#1#,,;)%$"(O0.1IV)T&XWN&-K%;9PS2.)< M=L)FG5$(C0",R1BZTQ8ET2C*\S![$#BE9+GH"NZPCS]H9A6@-WERZ6221P6L MK?V7>#*.)N+UQ8^?WPZO?SO_"'!L++(X*>Y$<^]U=+.<_OKN3+R)DSC'[U]^ M$(!6-+_O8>XNL,3WX_])]@(<HM$<W8:9>-:BML-P/,Y@LO3Y5B"HA01Z%H7C M"M1I:E$RCB>P/5_A__5;,,FBJ`8E8&>'_%ZO'08>UY_0.!G-EN-(_C6?=V]% MW8N:\UIMJ,]A_QB.W]G@^=G!8?G8KNEE3F_?/KT'P0F>7?R!)_<OL(EXU``> MRT61CL,B`DPX<%X80,&K0^>5.1_PZFBG8[VJ8'A#'.^T@5)\3I9Y-(9]%,=T MYJT^K]^=X_,3Y^'U++S!I\^=IS0\X3:\.N7%'1[3XN#'L;LXV+_/<G6\F0W& M"8D):N6`!_S:P@1[C#<`B'-<BF\0!YW6#O,:@%8[C$-QRL-T[&&N`0*?";[6 M8,(9S$=EUD_MW;F<E#LK>+ZI*VZ3MR^^V-@9)OH3;J8/)&:K/<-L?2CQ3<VQ MY%=;'4QNZN&HI^N/IJ=?Y7">`MJV3X/GA+L`'Z:PP^'?+Z[>7_PR',+#]E^6 M29GL8DO^EOAKF,_WF3QV;U_1(!+(U[]]''Z^OGC=Z-.G@'7#21D<R(/2N!M' M]\."&`9@X'T\BE[`4\EGQU%2`+-@?C*<Q#-ZN4SR>)H`96=ZS2_GX0+YGO<= M8B&]M_H"NQ"S="4WN_+J-I[>UKV#%>?(I)+H"[S$11T=!/V^:!\=2MI&<HFB MC/O$3N(41"@D.M`IRA*6,+(5HQC1>A@Z$/I#LS29!G(5`7X6V;J"*3(@JV>2 M!2!"M<3_``"<(:\NSE\'\+H5-+%!T+<.\BJ+BVB;07Z[NOQT41[%7H6?;XNF MLQ9!6XE,L@X`PR2EH[8&$-#=!]MP!E_U@)>&CA,EM15A?C>4OP-,Y6_W\V$( M\#0O:$>/G^.1.#Z29\(9]39=690!/C!)F_@&9R<;(IJ``#$L4JNE*SV6EFAU M-G,?+Q>S>(2\H]Q60=$52`U>\-1(/FF60,FXU*[B$D^2]ZEF*'H9B'YU([`I M'E!W&YQS`P*3:I7SN>GW3H+^`!AG[SCH/]='9T>@W)EF=R)=%@+D*)`=008- M\?_)@TCQ3['(4A0F083+`3=0.M5R**QQFJ3X;*<#(V$?`E`6@9H1):,(A7_L M,<G2.0FF]#K-".J"%MP50/.VZVPZ!3@&/8>)@T8!TB#,H2WID)1$]33A$\U/ MZN/47PS':90/X0M+1*`<OLZ+O0]G\9@&,M,H4CWU`)2?570/.E0+85=19Q#Z M<<ZG#*$5C5U<U(H,2*U(X.CS+X4C[K*<2J^TAL'/XB0=1TJ;09YZ%>51=@\? MH7'Q32.+BB7@0_^%;&;U%"]?BB<(GP40$#U6@[[4Z;RPQBU)020MZV9"?:/) MTW_%6/IU'9\&O$5L!-Z@]5O[40U?MIMX1.6C,C_VMO<+R<>#H-^#\X`_!\P@ MX;]Z34)CP]BG2<!_-[!Y=WA8._@7'%C`O;A0NC`C'?[Z`V)-N]3&:'&(US1\ MJ8>>GHT)K$!67_@VFALV%F$"V-K<N[Z%4WMG5$UKAGM(M/2P'M$-O\L-&N,( M-?LAGD2;M$EM2C:R41+_^RI!ED7S])[U+Q[A-LQOA_]:1DL]0DU#6I.OY>-4 M.L"7.0R=/=A(J9[4XZ1J4181^V>]O@<EO<TK&#DX/@$6V,8?@)@*(;^:C1^E M*X5KBP*V6,"_PU4&U+F(1D43_I(ZN^@A:KU/B0L!24NB:`R,8147M](VL+*W M?(0(8V-8S9X^@76@\/=/%B&RWUO5Z<SOQG%6/.C)*"T=_LNC8@B/F^.<?@:R M?3H;4VNU@ZI9GHUT,_5.4DF0A*N3Z2IZ23)%_QAA>=Q_CM(BP=(5D>"K0R;G M``SZS9*%+9O34`JS<'I40UCI<#B%61I$>_OFHQ3@::8,#B+UPR*\05&ZO77W M-N[=:Q1+TGF$%K5HEN,>+1XLZ\\$B,0R%ZM;('9B!7QQ%BT*25B(%`!H@9$U MK3FTQ"YO$3XG@.-AG*;`VV"3AJN%A@(.L(LM%QEPEZ20K45]:T;$&!5O:DI; M,`!!XPCVX*"O"2RV?7?^<?C^JJDVH"5>O13S\`L"&N1>_96;T/E*D=]U7H$L M,)S,BG8;F"*"6>X/0%0B@VB+\O"*?U:I&'9H$5E<0\.H$2&`0#N=0(O;ZQ1! MCH<*!0/<F1_HQ;[\E,/+%9J6YZ7P%7>ES_04^Y9$!S-/-D5B$X5(\E%5%*@' MACJP[?;]/.R\`GE\/N^\RG(I(QZ#5D6B^+&4Q1N--)TW`?9\`)F,OR"C6WL] M5-O;074"^MVM%'OI*,"\`B'M@=1"D0.#R$&9T,#O%O)9XQ:S&^^HZQF"H^=X MG]8S!KO5EO**MXM?9"';`?PKK7K:(&"L#Z-B1C:!\CL0-:,L6RZ*FK<QO6B7 M7^#<4):29@;'`C&>AZ7!\&G^`/KZ'%Z@QH)4:C2+FZW]O(!_\83@.@X&SP-8 M5_M@<!H<,*)MU"$%(&&M'BEP?YGN#@M%N"6N>#1CTLND4AZ.4+.1G@6W[7;$ MNMH'GU9$?N8C[0;\SR);?H60%+ZU9%PJCB3\5<2=MI$&Z9R1S*5;D6!;/HZV M`9X/<.6\=R1;D(2G;?ZD#DR+?(?UQ_/7PX_G;R]X2+8NP63,6M<0F4ZC9'Z1 M^^>X#XR*]"W0ZI@O=I"LR0U"<<8+&9Q8B7@2/Y(<*OQ#<RB&D(UFXLD3CZFB MS"*8/01RVMI').ZB:,%B&TV02$98`.;A,4.Q;@X"`R!VTXAV+98(RLRS[=F& MMMI29\)__BDV<"3?EL_O]*-`*&!1`Y15/<C59JE':^@@YL"YH@\&Y-0"?KL* MU:$5/#G0Z2]9Q8?>0L"*<S1'H/HM\G#"6CLH<&*<KI*2]8`&G<;HNKR-N#O; M)4"Q1\#'631[8+4?12ZV?W2EX@;_WX:WU7,NS:TT%]L$,8191PC?\1'"@/+K M-JR-S*/51QN8&C9Y#$=SVGO9V3')&I*9`=46*$@`24RS*:BH.2#VX+3;'W1/ MCP+!#<[1,':;I4D*LB]IMZ2^CE&Y.>AQVZZX+J(%.J$_K4!"BR/N>D4*Y%AD MX0AM-()L;'J,KN@?=@^[,+7CKO@Q6R:I^#F,8<?(#B3.Q\R<1#H1C)6"/9H% M3&8YO35V+6G/&O2Z`QSMN7<V^RXS-6Z]5ZP'G@0#@`PRR>,2HT<.>Q,7Z2+W ML-[%E)",F75'6J4D(UJ%<<':,JR'[,;\QTOQ_O,OOY`X+6V!5T!P<[1!$)+F MRCA`1C1J@$.][)^AV2LKR,5/!Q2>,K]G7W^<)EUNVE--0WOWH%^7@'L^FZ&Q M#HX:?8<LEB!"PQ06BS1G$5O.("MB'#=OX0G'8`,V]YE19P\T1;)$7LX7:58@ MS0`MYAZ/-.`B[!\AP"@%M12?G!'-F<193KYUP*85.N")A$1HET%8T7C*FJBA M$:#V-;H5LS`OD%P5\8R@H8(=@%Q\P@ZK\$&$-$39#2#Y4CR?`V8`)0>*,TEG MZ"D9BQOH)'S,KV5F`P0MG$?,)T"Y!T#R0M0,J$GYHZVN>`.`62;TP8!&2](" MF$>&'4(`DL=C@49JP!@<'%K<`+9'9'"%GQE/`#`]91,PS`.!G-/G1_`G-)DO MJ6L!\(+C<P9_SXH8)EE>HOY.[E\FX\RV>VM-X!G.X)DTL\*C!S8YC^#[R%!@ M_V+XKLTA&*S,6^U]A]5/$%8XBC(YV[X59EBT8$0>CMN0-N@?'V#8?"E-Z!3/ M@=9V&#YD7(87,V4BGPBV&XE,*G::T=-<I"A$]IPTFX<S\>L[ZDDCT:<)?/+K ML"]Y"JB]O,$#AO9NW48/%R/2`N6[Q_FER>R!AD)\X)5`_PB^0W8DW'@S#]', M0K*>$P9A:QR"WA,T:"$T6IR,(T0>Z#8#Y*=9S<.[B.R@49C'.`@_$J,(=CE. M#*E/X:LAT0!V&-!NJH-(MA&@$:/E+,Q@@V.-)123M&*O!DR6H)_R3D2,!P0F MA[0#S7I03]C,<A.I-2VB<9<)>7O'[UV$!S6JAW26WBPG4@T!&BG]`6[[XF&! M3@Z*Q;%L4U))FJ1&^UE43%=2@JT8`M"G^((\S)[`GFI<S_>Y697<Y^][SV=? M`A7BHQP`WX];`;16X3W-;(4&:'2)ML0/8@\/^9XX$WM$5O8XPD?"@P!1Z_V@ MX8BWP##XDX9)4OK5,C"VE<)DF<!!V+;_W*U:Q(6QAG\"Y#>\)H0CE71D:VT2 M%PW<$X`K^MD__>/CA1;"V.1(2NW)(!@HXXDMXLKY=5XI%SK.;]?Z^Y^\U;]+ M@9]WH[GW,P@&P$DTUB[S"!#+"/\TWR9M2POA;YR_]6-\GZ\?`OL*_F^KK91M M+8VNNO)=7#LY3,03':B@E`@U43-%FA,NE^*(V#W",0LP0Y^B2$%T<!B`N<'F MK0*FVO`_9&ZS:!K.=JG!?JT]D&V!!B&<J3@>&FR]2[.@#XMW2*R(F"@B@VX< MH)]*<Y`Q?2@>2".\S5&Z/"NF,L;[A/H#^IX8+P*%*C+"0BF&(*P.R5+"OI`G MQ;^&XYAT:0SSFX'..$R3YA,C\[64FV\70?`+/-?>0_;JJ,U`"F"X'JED,VK- M.]`N[8"$JX4O;.XH/>1O.)XU*S++]JO1)M^A'-M=Y"N@_ZQ5HSD<AOA*=G&W M2;ID]1:>VIZ"LCN5A.$T<<P`:-T5;-XUNTE<`;:48EP=+UF8Y^DH)E:D103N M+3G];\I`C#L-2#2)D['D4P:FA-^2"X\B#@\Z#/H8WM8;!/U3)B/(Z')TS<<X M&92>))8A9H=27B26F<0+X'U2\",:Q]W1JT9N=)@I;R$SN:YCMJ[Q\W5LYR+C MHC&JD%-1'1E+')C'^1REO3T"..^5X.-E#2-'<1#2[VFLQ4F$@900>&LJR-G6 MQFK_$NP/U"^B-*`HT7,F39:1?I?XNK30JY.\+A[U1;EE;50RG6R[Y:8P8Q[: M/A&5*&8A'8VSV3!C[P*<V<!=72`)$0L19)`^.D(#;KM_?!0<GKHN2VLX'5&& M8U)8"O!D'#00_P;A,@=!2`[:)D/7YV06PQ$T<PE$:2"Q2I.GA3H\;,+1-ASV MC2ECC4-I&/DKX"$9XRX"0J2(!>T>$39-9A0*VJ+>&>"?T5M1:W9XE*%#%#ON M"^N0V(BR@"?LH([TVP*".Q\"2#A#-LV1WYFBUQVS1I<9?(,L.%$!WR;:VQ(% M'R4&KI?_;!\RGCK@,F0G9-8.O$_F&]BFA36I!YU'!]%A,#-I"-=142`-79)T MJ!SHM.U(Z,E0$N.W0'=FL?$/5`&!+N.JB"*;Z"3\3@C4!#G%]4/^JU(TI,)' M2I%_5%*``;'7AO1MTCXX"<3K]'"UCLYZK0-Y?IU(+/,B\.VKETX\6DG`]>_& M&3J`JV*?Q[-B>55\*D\IG$_(M6,OGQS0M@EK+;EPHH3*E%@%,9<S/6PN@R?* M-7V<*2V=CZ[#TW9K9%YT(3W16_-/!/;O"'G-7G%7/KQY<WWQJ;0OL@7LS`*= M%E^VW)-5%&=CH-BYV@U//-)CEVGLIG7+)'[ZC=2Q;B7?0"<;3GQ529DT[ZNR MC6,<P*-I5-\),NN^9_0>/JHF$_EPZP63QC;*O=*(>^T:3*6TN&)2=!^'XD9R M5J`_ALV2'2JA`1R".EDF(Y)T<0CT$H#\*0UZLYFP3Q'G<:W";)S+"'09@CZ0 M9GXE+3;WS&?YJ-_0CL!9%Z,Y'#D:_R825_N_,4Y\E0$?-68-/M.=VI/I/<J. MPF=/R,%1[KJW1F\BG*\;AJ7T\E"B<6,)6M*@3"+1$X)$(-!G.KR^_.^+0/2V MB:XD_$!5*++=._;3>@^/W<KCY#GV.'F\7;Q^'A`2!Z>BC;+B@8P[WB)FFR'B M9<E$J]8P)I<JECC3[^3LD#(/2CO#G\Y_^OEB>/G^S8<=Z6"38=@TP2*<H4XI MI!1B5"47`SH:L<_';!@W09]%:FM=>QI/RQ81BO[_\/<6"DW.WUH)P^]^BBC: M&>01G_E"S<*L`RGP8A:.<%)HU8KFB^*!0<E3J887;QS'Q*>.G9'6:=T;R&>M M;HC#CCE^"?&E%"$:E`8R]J]2/SM@M-Q%A0+48(7*ES1HD2_)`TW&AGKQV9*> MJ\A.YC:A9.;`-J\J`5J:U7PIDO4&4V.%TQA;B0A?CT:NQKT5(O&\Q9YE#-S3 MJ@&^8H7`6I#:Y\"S/%O=1D:H(D'4W^O%CJUQ=NUTMY@6&Q;J4?[;!89OPO?V M8]&]78/N[;78+NWM%DZ1](&2!YD$^CT*4NL?VD',4N#Q6;M)WO.\`'CT!R<R M/&2K,U9*XW'\%7S.4'XQ^JEMM:Z;FU8^TV5QMF-%6O!B![U@<`2K15^]%'`< M)-2S.9,JE8B^C$BO`YE;2;L4[@KCP^\8],H6[S,4[VN&DC9QW$8EA_N;`ID0 MWS]?E'N0XBU7/QS>+.,9:+9#7II&^5ZK/#G:9N7OXGC\<OR+/]/%"AVS68R2 MF"@T0[L+C86=3>7_CK*4;9EDQMSS)<3X?#M;?V"5H6CA^T(]EZ\9E-*12JR^ M75V\1TEZ%`R8<#4>2;*JEM`ZV#UZ?F40/G*"7ZW8:@Z9P%B&&S2YD^.&(KO( M1&X9XRE>84Y&%#N5K#X0VR;.*&'U_=MIR6UIHJ/02MNY8>!U`%PSOH8306^K MW<3`].VH8\V)=<60_T3R<-(]UB3V".-R69?58S7#<_@3JL):4M$OB0J?##@N M>J!R/6JE>YNSC:.9S=F4N+2NYH4%SIH`P+)4YX@8'A'O/Y3QK/P:__Y:4#2Q M@Z)9\=0B'^<0\Q[E(/6/@N<R:US[Q2XG%?<F<)6,0E`200,O%Z4XRZ[8Y&(B MT[<GQK35JDTQ6-.''2Z;<M*^<CZ'B05&R+..C5849?+])4WOELKU*V7&.*FL MT/CZJ`K."*8S6<Y$>`-L4\?B:)>;$.=B`MLX5O2321UO@TK2BHL<LT]INV74 MS`@V&*:`D2=L`=YINR%^,FEI1I.VE^TW:_]OV99*DY+&5.F*5I2,Y_N2G)@, ML'HAE-D=]>#(<UM@II?TKI;[Z(""/_\4)8<@#ZIR#^U$/W[#V84.V7U#,W?# M.;3_<%?["MU)?M7+L"0`^7&[M898W5R&0\?$;-[`%[XRH@F):2RNZ<@OFB,: M1V\?<HQ7DYZ!0,:"E3AQO@A'D4RA'I.1CP/,(H&A`-)U+)VEVN9)FA#&4$\X M3O-E+Y#G6T9]P?GFL`7L$-["/U2-:$RNMU6:X6F8F-!7-!XZ9BX;41^9.>_D M:5"`5`V^2O?@,RNYKE?-RE!OB<*J)EOS3<]L;:U7ASVY^H0*6N+?R.WE!"R! M8AJ%H%.T&V8A,+7J\35'2Q^>H<G"<#/"*@=#O[%U5S.&DB)T]+TD<HPJ,OT/ M<.07LLUB&#W[OF2<XWB,B%6DN/AGO@A[LC<C,1NE0+J20GEEZ7#9>6PPN?(F M;<B[<<>@3B5Z@VOZL"RL0$KV"%NX4K$DN^.]T!F2O%LOR#T-Y'X>3V]EZ3,@ MPK,9(!J-_=6$O.Q633YJ<#NUHVVGI'B^7C8W^)3<%L_K?,1RAQ,ERGM!D_-9 MP]47;>=6M:R)**?=.!-E#&^]^(\.%`?BCDE@\LA+I,DNO`?-7TI,(U?+8[0P M*;]$AC?;U66U#<^S]39U;O,(BWJI@S]/O(_B+A7W\:4!SCF7T)?G=S/#$D!6 M<AX\&.;QOSG[N]SZWHSTJ,3`;1(4R"_01U'S:*#*$S6,WS*/0=:U2O^0@"Y3 M7?%=Q5+#9B:D,C`]MB8;5Y#E*Q2ZM^,MM+M2&([]_8Q+7DA*`/_HI&0XD3+R M$W/->VHA"-FFLQ:K^I$R!/V5WUC%CX1ZUVY;@43URW56^YC%.FMUIP?SU1'' M:J5GJIG/1*<"@0Y(!3OJ&:L?$<%.9U$W?1D2P"%(O,&/LO29+*C_9R/?R4%P MB(5S3@Y5MJKTLGYR0O$IW.0F`E6(N1[PRO%]F(PBG<&F5"U4%$"2)@I.`ZE4 M`)9L]#ADLP-<(G;LUG(@V4KFSZS"!\Y6P1&`^&%J!1H:"E67$R!":7--G4L! M7_CIPV^LD;#<B'Y;BO17$:LXB^@+=!NW6,5AN>\.P4N1+;?PRYQ*]F":$.HQ M'-]RRQJB"G"5D,^E;K0L8G0I(T1`1X-!59(_2I14A(A&61%$4DK%F8#0P0O` MD!].#T')#J"7R3#\3K4&#YD\N9#$FJ3ADH-1<H]`ZU9VC=+_;$2=G3R.LYK@ MG&HJL4P,X)ZRJ@3T=Q.$0;=1=1<Z)4W&Q%&HQDY)!W8LND;;*9&3ZE2`#L+7 M58IJH9)HN7!'M<>4#[?RX=EN1AS)K?G0<30PM,I-'>%,]M*$;K?IC`]MX\11 M]*>MEF=4[*J]P<X(E"9<<A1[!JBS*TQUZ.+EA%%?R]%6=59I*"$X4WB#5)<` MX:O_W:8S#AK+2-24:$1,&\_<*%W$F#)C-$3O*([<JU%`;1P5!-%J@[$K2B'7 MF":DZ;(D5"*%DSG'>+Z[W:XL(J,+O2"FEXO"6"YG9P_=D&('[/4YV[*BD"?/ MPJISX@YMC^R9Z#>FYOHJ:CBVVEJ8;!S^Q8XO[;?C0,!7%\?))N\:IZ1GIAQ: M64M#Y^/U%`\:*,*VCG`^9IAR,I0BHR4:B;)K^9D4?BI#K">R+%@?4VW)0?^@ MIFI0A/8>37MAOD1[<<J;B2^TO@G'LK&;]J(!Q*%0\(N3CH-=J:X!=66'Z5P& MPB%".2>BYSD1ICP3`X>QT*ZR(H72]DM-:9Z(C^]>#]^=7_^=1;S!X#3HH]Z! M97T.3#D?`(EX16TQ5HE$NXB,A>H1#@]L7,>P	*NRY%7OG9CL)-2AFFDZ*U M3Q;4K5`7AP9\\[#&0*Z6#E#0P5<,[2*2`JI,!6RJIG]%E+"/8V_C:9IN.@;3 MK4[38X9YS&GZEG-S*,_-8:^NVI942-SS,WW4^9EN/#]3=7ZFI?,SK9Z?Z:/. MSQP1&L_<AO/C')^WKR^O[`-T=!H<]`!*H"L-CBL'B%J7CY!^6'.(@*0JTAF8 M7<7.'?.G`3YN))\BL>$8?<O`*$0(SW%Z:8Y+VQ`*A\`P1?OV,Z8.%\QYJU,! M/^SC99VK1XZ`BNSC9/G2D:/*""]%B3[Q]JN'^M1(P'`G!DL%):9JYVB"PAJP M8_XPA*\>!1XY$!L3Y'+D%-LV6I>.!!_".%.:_Q:[*U53M3_SN=F:^;Q&I_+M M\3>-4]UI:U.]&$,((\_^*=VBT1X\-[7PX&MO4UU!1,[I*:L'`)=H5&#-2.D^ ME3FT\SFZQ>!?+.^)5;M`AEM'.,D2!;-`;0J$O3D6H:)9"7ZH,`QM;(JF,GX3 M$7U)V"YI'H+&05,CR9NCHS!F2R2K[^*+^K/*W)B`LX-^#S,EVX/34[RQA$!; M'QN<Q]Z4"=W%J3GF5C&W:X+UUM3X\O+/%[+<;^P<Y+YU?'5EK8X4<C:Y0)2) MDSLR69?0ZER\__#NXAVU^4JE$F5Z=.,9WM4QMHNZH!J'.>66NP8>=;DUZ8UB M^T[MAL16QCP$=E^HV%8$FXB!;J$M$;1_`7]I4V?5UAG_#FRQAZ[^\F/0YWI? MGO=4SX8:'A7XB_>?KO[1Y"RF6`:('!STL91.^Z!_A"GZ;)[\JFV4*JK<E/"E M`[5-W3$AM>6WD:[SJ[POB@_D2G..OL1YH1RW7FW="JO2A;CC'*UR72$^X-^K M&$M<3>EK=*V+=QS.7E!;H[5VP+\4-DHE1FY1AJ[=*%5YLQUL&LG:C<T5TQB> M:(9N/L$S-@-8T)^,R`"T802?IDSVYD(=`XNM29*-V4KS$G%1-9#-V:0_R5K/ M.3_\Y*L\#@U^>,:1#>Z':8$;/VE6Q$-YUZ3,76I^?,N/":0!C7PV1I,JI8:' MB570119@S[',>#A3W19IGF--*MRN/$WR,[="+)().HA4!"A0O5;J#>(!UBIR MRYICO1_XRH2SR"T1$4,`I!^12CJ207B68DIC%F+&.KO^EU2;=HFN7=EM7V^= M[]SJW:IIT!^<J$K52KJ'*:$CE7;EK!1OSBV5PU`F@T,[IQ(&-2I_RR!*Q1'! MR,(G^WVZHO.6@?0@EPR\,@NKQ>$!1HLT"[.'^C.YBI[>1\JV3V7'C&5N7275 M+<K-M>L+0=9M0WO#+K2WW83V%GO0KMN"MM\51/395)70/*ZR2%&6&XGBGU#> MV<%13^6=269-^;KPST(SY$"FO&(E*?4PRM#6W;GX>'&%3!2=^J)!R8!W499$ MLV9+ETG.,02N:55'IEAIG,/A(=YHT3X<'"J_H)M.S\L7C7NRZUF>*2XA;16* MT47:K-4W.9NUG/N(C4K/='<:E/.-Y*=YG?@K>;[(*=NCLGR'S]7=40H@J0T2 M/SPM@1UX$U;YSN(TBXN'2E0.L]'P/HQG<H4]%HPV`)EF.*#;K8YZ_3)<:3+. M:CW@]@-V+<P<]Z@:Q';<FJ?&:<P/:<+'Q\'@$*^L.<"?1OR093%HB=?#R^NK MB[=-.6,\-9U78PZ)PVH)<PS,=JMC8U/_W)LE0+=$58[\?'UQI2/==\MX1.3! MWI]'#*GB7U3ODK'F<X)!`%18AY1$5=SKC&X@,<$Z9YV6S@<V^"=ECBH<O.5B M2Q)AKV6+<)ZC@Q.'SX_FB^8>.I\[UQ_/?[K8*Z<8M[5YL(/:1T]&\GC3\3F` M9MU$#*`4['Q3T*V\'_?"EP0+<D!1@*"I.N"!Z^7[7\]_J8<KS`AUP+5PH,G@ M$F3CFAG+9J+QAW5NMCUBEEK!NL3S9WIPHT^4XR%B;_&D=J65V0E;*]EUYB=# M3<ISCCE$I#+K6!8'PWH^!T`#3D[-;2,>2'M6N"@I2ULL;OW2U$0KK!@10A8[ M\;P'C4O;0ZV7/?.R#DU@R_4FM4I\R,ISE6^X_-Y+\0?^[4@&:)C_0\+SX(@O M%^CW\:<TIC:1$W4GPY2*J^D_.J^R"!D25WRK/E;=2F0W>(+/6S(MBZ8Q.-N. M"==Q]?%B::"#[*JUF5'5<+5'\"^;YQ.CETDL&B5<Z5&NRB9+0NI,#FN6<A?1 MD"WRT^_G^2A,["@Z]:0^ADZUV#*"KM+<&S]W@N+#B55WF+XZYCK"9^*DV^^> M'@N1CPJW=#`W&XZ*&7QQ'A<Y!_9P8>`EW2$U5U=8485-#M7A0?"D9]J)SAK5 M3<A%-37H\R$2CS,QZ!YV3T\"<17?B7O0T*[B:-:E@JCBURC+J;;I=Y<P5[7@ MX%[TN[!86./)?N]P'TA,[Q`6?=8[$N/P'KYU\64AON-JQFN&X&K.^[W!/D.N M?PBP1DC(_IN+%S\'B1OH'/SH'\DRB#*8OY1-QXBGWS:;TPDB=7XGGHCA$,6) MU^_..=/#OK.225C+REPAPH-7IIC2I&3_F)*"19^%W9C*[#PNXLGYD"]P,QZX M;&`6_;&\C[`H+J9E<2LN,TY!\A\P&IYW>J8JA#;AWY<O>]P?[32S,=#;5E?% M4W-4=,VT/I6U?)$OHE$LD]UR'50]`D4_'D?9&?_)_WYTPL34-+&@2*%B3ISX M?BM&GP>8RF)&^7(RB4<QU6(5>W1)1$1&JCVJ\X&1VS*T7_;@[HBEU(`"6F*= M%^0,E]^F&<ESE!G$1B]5("^7L3"LUWHS:D94B9AR$&0IQTDIS!P'>IH[E\4M MZ?(UM)B=B1@&,K:41,*3[WC3?:@V7XR51A[29%R.<,$RU#P$QNOP```>#HM9 M%/CU11HGA2P4&=,%>N9*M*Z]9_8VX-:]__")MZ](EQP=A,1$03Q#HQ3,#C,_ M`[UN7<16%\2%3\J@12'K7U(")3)^CM\Q@8,<7X=PLO!#HC(7=>8!.)]J5PBG M7":%Q)@H'CR7FXP2+1J!CS?V?@#`3&7@&8E?2(PD7#(G?Z1R(9DIP\QQ,@@+ M-8"ZM\\.7-0#N35.20RF6,0_]'X*6(=ER\$/JIP;;4>ERY?LS"\2%%1_S^6$ M=*.AV?OW*<8;4`0H(#30%ZR;PY!7L:(T"(\=TW)NJ0JS-[,"^;D@S^N'*TFJ MYF&&*688O432OYPLQCGP$$@:WE_\>G$E?OSPZ6>\#O&R$)?7,L%6QHG*3TO4 M2+"<'>ZX',+<>XA_\C.@`:4"]\1!9-Z/[VXH%KR9//*=YV*(F#KD4I;\136% M.:-YSL&>`+F`*V5R5XQPNO[Y_.KB]9_XZ_G[#^__\>[#YVM$)<3V&*_@Q(5/ MN":L[,9ER`F;J`SVS7*J/1=2<%YWYT5MCIDJ%JWV@%>R5:)9G?%OW>?T=^P+ MX#=]S4I0*1_)VML\?->T$=$R=CACA'0V%EC`$L,8D1K1&4>E/XHDQZ`+/2BW M9D37;H#L)+N63S_V54'(BAD8:@HC\DG!KE%"[M1P5,3W,:"N.'__VLN>$-=D M[7/N:J:MDCG9T'V3JEOSEO,;1,()B\8:]K*C%/LD%Z'"0:J+#+I5E>-7S-W& MJ>R)*\+H26'YS]SJ)U31=AI5`"Z<FWXZK]@C[>R[%1!I!6&Z)87J[D73V?$_ MA>BHT*S/8B"[N[O2EH"#*?]PNJ"[;JP_63\!7J4TZT6,?F;X%Q03NI\&?F4= MNA2>J+(O2N/K`9W+O43']5&+MIF$76^&+P>2NOP=DA]D'TV80R"N+]_^^/E: MWKS;4%8ZXY':;2J'(YK`K+L+6JX75AJW]Y]Q9(PI6QE*LLILG/!(UF)[0_<W M((;-9H'JQNZ^?&D==^MR@KB0%PZHJL6JFRJ54&(>&*:TI-+:JH:QN@-'8[&J M41T6UL>ZLL6^]AML(%K:+6O04Z66X&L-5J=M":+ROE#;3=MN."`6?-&E8>_` MT2:%FB81V$8=5N$[0KT$NZL<H\U7TOG)Y)">R-AQ>ZBZ6^@ZE<^3D\2;X$?< M5#ND8/,X(8(UR-O(JCK->[Y*U8:2.G%6EZF+-0*E7TMUL*]89OF^L'L`S<2D M8OC8?[$,19U(QM<?8-\SWQ#,4@^:630FFJG)*AC:-<FJ#,IZ:=I2&*>:L]N4 MUC</QY%=AM6X3E%$UBLA"J`O02$E'7-*>/H5O:AAI%]]R3-);4SW%%>@:[@L M>"GA4&5HYW0[I,Z-]:^Z^YB#M/;N-G5&%+PK52/XZ)GR)#W9@\]&-4M5]7%] M=YP2`.^IE`%R;^3GJB(\77*#I1)N@).K?"<9$,@[1JJ.*HV`/D*V_P-<^>8< M$@\D3.K*-O)*K*+-E.'[]P"E@M"([;S9LG!_5ZB;;9VT5C(Y!MX44OBS8HKX M[?SR4TN5R>;;V/"'[[H!O'D,I,YH%"XERI/X0Y#ABTYD3U-AWESK"C]'RRQC M"06A9,EV72&_N?\HUOV1KZP98=J7%!(YV>OR@RS:Y_,;2T(C)15]NN7D7!6) MZ0[6+);RC+\4"),@!I(^\`@C0WY,^7S4F,HWY/#U/K=4`5^?W'FHA48MG;`< MC@EN>(N2JQZR0H`CJP]*M1$&HGPW<Z<NBHR(2W3QS-(DZ,F.H7OQ%)ITPN(I MI>K!F,`2YD`+Z([M(<;=-5L(2J6"I!D&^,VX]&RB@)6.@%JC;2Z<N96!9#>^ M/P!X9,Y<[B:Z11>/IR8)GZ1-A,,N_X);KBX2EO"U((.X2`%,E'U$M9?H6/'- M&VPJ$!=X%=,2+U<*N%2(DY5EJCI)]Y3\=)$MI>4!KW(BF4;>P"/U*11&G#VD M*BK*.J$"+,A(H<2(ZM=MXZ$PTN]Z:JFNYNTT&MO<0RMSF[0G[CR.(A#K-3,? MJP-H725",U:%ATMY3]5[$5>@L`YA+X9*@D1D9L0GLP=7"N,C!/J]L5HI[GV# M!Y-OV[`X^:4J$%+!I)+M"8D!)7\N,2'!B!Y(`KO*4EX:0HZ0IG4$(&,:W?;N MY-I-%'7*PA8RW'8BW$8);F-N6IT\MQ5;?Q13QX`"V_)\+HF+NFKDALK!``\> M1^,?A'@G*RPS<<+;Q;)T+FV<Z3Q:2:7[L[73A>3M100(5]#%>3,L`\`WR"N3 MM?P(YRDRPJ.<JHN?U:(!==_?<+.TI[;:S\"I$#,-'QW2F1ARRC'.X(<]7SD- MX4>?S=B#!Y_31RQM<O.]T\+5B<@+T=SZEG+%T;==KE@3+*6"DKF``<8H44B- MNEJ:@N6EOPL.RG+1Q"<<[*Y.*4<S*_K+$E06W<?(OU[N?7<E?SUCS]1W>X%X M!MSW&1;=V*;/D=-%[#C4`;V9>9$!^VVJ[H%X>O:TA8;I'3ONKAF9IC#6T^^> ,HIK^?\EED1Z'DP`` ` end ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: PATCH: Swap shared pages (was: How to read-protect a vm_area?) 1998-02-23 23:17 ` PATCH: Swap shared pages (was: How to read-protect a vm_area?) Stephen C. Tweedie @ 1998-02-23 23:27 ` Linus Torvalds 1998-02-24 0:08 ` Benjamin C.R. LaHaise ` (3 subsequent siblings) 4 siblings, 0 replies; 12+ messages in thread From: Linus Torvalds @ 1998-02-23 23:27 UTC (permalink / raw) To: Stephen C. Tweedie Cc: Benjamin C.R. LaHaise, Rik van Riel, Itai Nahshon, Alan Cox, paubert, linux-kernel, Ingo Molnar, linux-mm On Mon, 23 Feb 1998, Stephen C. Tweedie wrote: > > The patch below, against 2.1.88, adds a bunch of new functionality to > the swapper. The main changes are: Ok, this looks clean, I've applied it to my current sources and pending no surprises it will be in 89. [ I've also changed the way we consider us to need more memory in kswapd, but that was entirely orthogonal and did not impact these patches. ] Knock wood, Linus ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: PATCH: Swap shared pages (was: How to read-protect a vm_area?) 1998-02-23 23:17 ` PATCH: Swap shared pages (was: How to read-protect a vm_area?) Stephen C. Tweedie 1998-02-23 23:27 ` Linus Torvalds @ 1998-02-24 0:08 ` Benjamin C.R. LaHaise 1998-02-24 9:45 ` Stephen C. Tweedie 1998-02-24 9:42 ` Rik van Riel ` (2 subsequent siblings) 4 siblings, 1 reply; 12+ messages in thread From: Benjamin C.R. LaHaise @ 1998-02-24 0:08 UTC (permalink / raw) To: Stephen C. Tweedie; +Cc: linux-mm Hello, On Mon, 23 Feb 1998, Stephen C. Tweedie wrote: ... > The patch below, against 2.1.88, adds a bunch of new functionality to > the swapper. The main changes are: > > * All swapping goes through the swap cache (aka. page cache) now. ... I noticed you're using just one inode for the swapper/page cache... What I've been working on is a slightly different approach: Create inodes for each anonymous mapping. The actual implementation uses one inode per mm_struct, with the virtual address within the process providing the offset. This has the advantage of giving us an easy way to find all ptes that use an anonymous page. Anonymous mappings end up looking more like shared mappings, which gives us some interesting possibilities - it becomes almost trivial to implement a MAP_SHARED on another process' address space. What do you think of this approach? My main goal is to reimplement the page-oriented swapping my pte-list patch performed, which makes the running time try_to_free_page drastically shorter, even predictable... (at most 1 pass over mem_map to find a page using the old style aging, or just one list operation using the inactive list approach) -ben ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: PATCH: Swap shared pages (was: How to read-protect a vm_area?) 1998-02-24 0:08 ` Benjamin C.R. LaHaise @ 1998-02-24 9:45 ` Stephen C. Tweedie 0 siblings, 0 replies; 12+ messages in thread From: Stephen C. Tweedie @ 1998-02-24 9:45 UTC (permalink / raw) To: Benjamin C.R. LaHaise; +Cc: Stephen C. Tweedie, linux-mm Hi Ben, On Mon, 23 Feb 1998 19:08:59 -0500 (U\x01), "Benjamin C.R. LaHaise" <blah@kvack.org> said: > Hello, > On Mon, 23 Feb 1998, Stephen C. Tweedie wrote: > ... >> The patch below, against 2.1.88, adds a bunch of new functionality to >> the swapper. The main changes are: >> >> * All swapping goes through the swap cache (aka. page cache) now. > ... > I noticed you're using just one inode for the swapper/page cache... What > I've been working on is a slightly different approach: Create inodes for > each anonymous mapping. It's not a different approach to the same problem --- it's a different problem entirely! The swapper_inode is *only* used as a root for the page cache. Its job is to identify pages by their swap entry, rather than by their vma. Its purpose is really more to do with the management of swap pages on disk than in memory. > The actual implementation uses one inode per mm_struct, with the > virtual address within the process providing the offset. This has the > advantage of giving us an easy way to find all ptes that use an > anonymous page. Anonymous mappings end up looking more like shared > mappings, which gives us some interesting possibilities - it becomes > almost trivial to implement a MAP_SHARED on another process' address > space. What do you think of this approach? I'm not sure --- one inode per mm might have problems if we ever change the virtual address of a physical page (and mremap() does exactly that). However, that's not an insurmountable problem, and the remap-vma code will probably get it right. In fact, the more I think of it the more I am convinced that this is a good way to go. I am actually planning a different but very similar approach for the final MAP_SHARED | MAP_ANONYMOUS code, which is to have one inode per new vma for anonymous shared regions. The primary reason for that is for lookup, so that when we initialise a demand-zero page, we can rapidly locate any other processes sharing this vma and update their pte's too. > My main goal is to reimplement the page-oriented swapping my pte-list > patch performed, which makes the running time try_to_free_page > drastically shorter, even predictable... (at most 1 pass over mem_map > to find a page using the old style aging, or just one list operation > using the inactive list approach) Yep. I was thinking along similar lines a while back. Doing this will also make it easier to unify the handling of shrink_mmap() and try_to_free_page(), which is something we desparately need (we've already unified the page and buffer shrinking, and I think we can unify shm swapout too with the new swap cache code). The changes you are proposing overlap a lot of my current patches, but that's not a problem --- the two sets of changes doing fundamentally orthogonal things; there's just an overlap in the middle. The code I'm working on right now is targetted at getting MAP_SHARED | MAP_ANONYMOUS in place, and I reckon it's now pretty close. However, the new swap cache mechanism is a lot more generic than that, and its real flexibility lies in the way its underlying mechanism works --- the ability to do swap read-ahead and to proactively write-ahead swap pages will allow us to do some major performance enhancements. Your changes to the vmscan code are really concerned with policy --- rapidly locating what to swap, where and when --- than the mechanics of getting pages to and from disk, synchronously or asynchronously. In other words, I'm keen to integrate the two diffs, since I see a lot more complimentary than overlapping progress here. Cheers, Stephen. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: PATCH: Swap shared pages (was: How to read-protect a vm_area?) 1998-02-23 23:17 ` PATCH: Swap shared pages (was: How to read-protect a vm_area?) Stephen C. Tweedie 1998-02-23 23:27 ` Linus Torvalds 1998-02-24 0:08 ` Benjamin C.R. LaHaise @ 1998-02-24 9:42 ` Rik van Riel 1998-02-24 23:38 ` Stephen C. Tweedie 1998-02-24 11:16 ` Thomas Sailer [not found] ` <Pine.LNX.3.96.980224152231.7112A-100000@renass3.u-strasbg.fr> 4 siblings, 1 reply; 12+ messages in thread From: Rik van Riel @ 1998-02-24 9:42 UTC (permalink / raw) To: Stephen C. Tweedie Cc: Benjamin C.R. LaHaise, Linus Torvalds, Itai Nahshon, Alan Cox, paubert, Ingo Molnar, linux-mm [linux-kernel trimmed from f-ups] On Mon, 23 Feb 1998, Stephen C. Tweedie wrote: > The patch below, against 2.1.88, adds a bunch of new functionality to > the swapper. The main changes are: > > * All swapping goes through the swap cache (aka. page cache) now. Does this mean that _after_ the pages are properly aged as user-pages, they'll be aged again as page-cache pages? (when proper aging is added to the page cache, by eg. my patch) I think it might be far better to: - put user-pages in the swap cache after they haven't been used for two aging rounds - free swap-cache pages and page-cache pages after they haven't been used for eight aging rounds (so the real aging and waiting takes place here) - use right-shift aging here {age << 1; if(touched) age |= 0x80} - adapt the get_free_pages so it can allocate clean page-cache and swap-cache pages when: - a bigorder area can't be found - there are no free pages left (and kswapd hasn't found new ones) - keep the ratio user-page:swap-cache-page at about 2:1 so that swap-cache pages get a proper chance for aging, instead of being discarded immediately (hmm, why not put untouched user-pages in the swap cache immediately?) For more improvements, we could use Ben's pte_list <name?> patch so we could force-free bigorder areas and run somewhat more efficiently. Rik. +-----------------------------+------------------------------+ | For Linux mm-patches, go to | "I'm busy managing memory.." | | my homepage (via LinuxHQ). | H.H.vanRiel@fys.ruu.nl | | ...submissions welcome... | http://www.fys.ruu.nl/~riel/ | +-----------------------------+------------------------------+ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: PATCH: Swap shared pages (was: How to read-protect a vm_area?) 1998-02-24 9:42 ` Rik van Riel @ 1998-02-24 23:38 ` Stephen C. Tweedie 1998-02-25 10:41 ` Rik van Riel 0 siblings, 1 reply; 12+ messages in thread From: Stephen C. Tweedie @ 1998-02-24 23:38 UTC (permalink / raw) To: Rik van Riel Cc: Stephen C. Tweedie, Benjamin C.R. LaHaise, Linus Torvalds, Itai Nahshon, Alan Cox, paubert, Ingo Molnar, linux-mm Hi, On Tue, 24 Feb 1998 10:42:48 +0100 (MET), Rik van Riel <H.H.vanRiel@fys.ruu.nl> said: > [linux-kernel trimmed from f-ups] > On Mon, 23 Feb 1998, Stephen C. Tweedie wrote: >> The patch below, against 2.1.88, adds a bunch of new functionality to >> the swapper. The main changes are: >> >> * All swapping goes through the swap cache (aka. page cache) now. > Does this mean that _after_ the pages are properly aged > as user-pages, they'll be aged again as page-cache pages? > (when proper aging is added to the page cache, by eg. my patch) No --- the swap cache is using the same data structures as the page cache, but mainly to get lookup of swap entries still in physical memory. The swapout code does not leave swapped pages around in memory unnecessarily (although it does leave the door open to performing readahead of swap, which _would_ look very much like the current page cache readahead and would be reclaimed by shrink_mmap()). The page cache swapout creates a page cache association for a page when swapping begins, and clears the link when the swapping is finished. The swap cache does not linger. > I think it might be far better to: > - put user-pages in the swap cache after they haven't been used > for two aging rounds > - free swap-cache pages and page-cache pages after they haven't > been used for eight aging rounds (so the real aging and waiting > takes place here) > - use right-shift aging here {age << 1; if(touched) age |= 0x80} > - adapt the get_free_pages so it can allocate clean page-cache and > swap-cache pages when: > - a bigorder area can't be found > - there are no free pages left (and kswapd hasn't found new ones) That is already scheduled as part of phase 4 of this work. The patch I have just posted is phase 2, modifying the swapper for shared pages. Phase three is to implement MAP_SHARED | MAP_ANONYMOUS, and part four is to do much what you describe, proactively soft-swapping data out into the swap cache up to a predefined limit, and allowing get_free_page to reclaim these pages atomically even from within an interrupt. I have already begun the work of spin-irq-locking the relevant page cache structures. > For more improvements, we could use Ben's pte_list <name?> > patch so we could force-free bigorder areas and run somewhat > more efficiently. Ben has already been talking about some similar ideas, and I think that yes, we do want to upgrade the swapout policy layer to work on a physical page basis, using the pte_list walking to do its work. The swap cache mechanism will still be needed to perform readahead and writeahead, but the only reason I have had to integrate it so tightly into the policy code right now is that without pte-walking there is no other way to properly keep pages shared over swapping. Ideally, we really want to have just one function walking over physical pages for reclamation, and that function should be able to deal with swap/filemap pages, page/swap cache, buffer cache and SysV shm pages as they come up. I think that is achievable without too much more work now, and it ought to give us much better performance from the swapper. Cheers, Stephen. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: PATCH: Swap shared pages (was: How to read-protect a vm_area?) 1998-02-24 23:38 ` Stephen C. Tweedie @ 1998-02-25 10:41 ` Rik van Riel 1998-02-25 19:00 ` Stephen C. Tweedie 0 siblings, 1 reply; 12+ messages in thread From: Rik van Riel @ 1998-02-25 10:41 UTC (permalink / raw) To: Stephen C. Tweedie Cc: Benjamin C.R. LaHaise, Linus Torvalds, Itai Nahshon, Alan Cox, paubert, Ingo Molnar, linux-mm On Tue, 24 Feb 1998, Stephen C. Tweedie wrote: > That is already scheduled as part of phase 4 of this work. The patch I > have just posted is phase 2, modifying the swapper for shared pages. > Phase three is to implement MAP_SHARED | MAP_ANONYMOUS, and part four is > to do much what you describe, proactively soft-swapping data out Hmm, is there anything I can do to help with this, or will that just confuse things ? :-( If not, I'll be working on buffer/cache memory limits so one file/process can't clog up all of memory (a'la badblocks -w), of course with DU like tunability... Rik. +-----------------------------+------------------------------+ | For Linux mm-patches, go to | "I'm busy managing memory.." | | my homepage (via LinuxHQ). | H.H.vanRiel@fys.ruu.nl | | ...submissions welcome... | http://www.fys.ruu.nl/~riel/ | +-----------------------------+------------------------------+ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: PATCH: Swap shared pages (was: How to read-protect a vm_area?) 1998-02-25 10:41 ` Rik van Riel @ 1998-02-25 19:00 ` Stephen C. Tweedie 1998-02-25 22:05 ` Rik van Riel 0 siblings, 1 reply; 12+ messages in thread From: Stephen C. Tweedie @ 1998-02-25 19:00 UTC (permalink / raw) To: Rik van Riel Cc: Stephen C. Tweedie, Benjamin C.R. LaHaise, Linus Torvalds, Itai Nahshon, Alan Cox, paubert, Ingo Molnar, linux-mm Hi, On Wed, 25 Feb 1998 11:41:20 +0100 (MET), Rik van Riel <H.H.vanRiel@fys.ruu.nl> said: > On Tue, 24 Feb 1998, Stephen C. Tweedie wrote: >> That is already scheduled as part of phase 4 of this work. The patch I >> have just posted is phase 2, modifying the swapper for shared pages. >> Phase three is to implement MAP_SHARED | MAP_ANONYMOUS, and part four is >> to do much what you describe, proactively soft-swapping data out > Hmm, is there anything I can do to help with this Probably not right now. I'm probably going to swap round bits 3 and 4 of this work and defer the shared mapping, because Ben's work with using inodes to label mm structs for anonymous maps is probably going to be a lot better than my own plan of using a label inode per new anonymous vma. Ben, any thoughts about integrating this stuff or sharing patches? I'd like to see what you've done with this before storming off on my own. > If not, I'll be working on buffer/cache memory limits > so one file/process can't clog up all of memory (a'la > badblocks -w), of course with DU like tunability... Interestingly enough, the thing I wanted to do next with the VM was similar --- implementing proper control of process RSS. There's no reason why we can't give big processes a smaller RSS if we start getting memory contention, and that will let the swapper efficiently deal with overly large processes without massively impacting the performance of smaller processes. Similarly, we ought to be able to give small processes a guaranteed RSS to allow them to proceed (even if at a reduced pace) during a swap storm. Doing something similar with the buffer cache will help some things a _lot_. Another thing to think about is to do similar tuning on the device request lists, as I suspect that a lot of our performance / fairness problems under high load come largely from request starvation. One thing I was thinking about was the possibility of forcing processes to wait for a certain number of requests to complete if they fill up the request queue, effectively giving them request "credits" similar to the scheduler's credits. This would be a simple but probably quite effective way of making sure that processes doing small amounts of single-block IO don't get overly starved by processes performing large writes (such as bdflush). Feel free to comment; I won't be working on this any time in the immediate future... --Stephen. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: PATCH: Swap shared pages (was: How to read-protect a vm_area?) 1998-02-25 19:00 ` Stephen C. Tweedie @ 1998-02-25 22:05 ` Rik van Riel 0 siblings, 0 replies; 12+ messages in thread From: Rik van Riel @ 1998-02-25 22:05 UTC (permalink / raw) To: Stephen C. Tweedie Cc: Benjamin C.R. LaHaise, Linus Torvalds, Itai Nahshon, Alan Cox, paubert, Ingo Molnar, linux-mm On Wed, 25 Feb 1998, Stephen C. Tweedie wrote: > Feel free to comment; I won't be working on this any time in the > immediate future... OK, then I'll focus on memory balancing, starting with the following simple rules: - buffer memory isn't allowed to grow larger than twice the size of the pagecache when nr_free_pages < free_pages_high - if a cached inode uses more than half of the pagecache, and the pagecache is larger than 1/4th of memory, and nr_free_pages < 2 * free_pages_high (pfew!), then we won't allocate new pagecache memory to satisfy _that_ inode's demand, but steal memory from the pagecache or buffer instead. - do some form of RSS balancing (later on, after we get the stats right again). - document the files in /proc/sys/vm and /proc/sys/kernel (I've started, but really should finish the files tonight :-) Rik. +-----------------------------+------------------------------+ | For Linux mm-patches, go to | "I'm busy managing memory.." | | my homepage (via LinuxHQ). | H.H.vanRiel@fys.ruu.nl | | ...submissions welcome... | http://www.fys.ruu.nl/~riel/ | +-----------------------------+------------------------------+ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: PATCH: Swap shared pages (was: How to read-protect a vm_area?) 1998-02-23 23:17 ` PATCH: Swap shared pages (was: How to read-protect a vm_area?) Stephen C. Tweedie ` (2 preceding siblings ...) 1998-02-24 9:42 ` Rik van Riel @ 1998-02-24 11:16 ` Thomas Sailer [not found] ` <Pine.LNX.3.96.980224152231.7112A-100000@renass3.u-strasbg.fr> 4 siblings, 0 replies; 12+ messages in thread From: Thomas Sailer @ 1998-02-24 11:16 UTC (permalink / raw) To: Stephen C. Tweedie; +Cc: Benjamin C.R. LaHaise, Rik van Riel, linux-mm Stephen For the sound driver we need some way to postpone driver shutdown until the last mmap to driver memory is unmapped (or alternatively to force unmapping on driver close). Could you or anyone else of the linux-mm community provide the necessary hook in the linux mm layer? This is one of the nasty problems with the current sound driver that should IMHO be fixed before 2.2... Thanks Tom ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <Pine.LNX.3.96.980224152231.7112A-100000@renass3.u-strasbg.fr>]
* Re: PATCH: Swap shared pages (was: How to read-protect a vm_area?) [not found] ` <Pine.LNX.3.96.980224152231.7112A-100000@renass3.u-strasbg.fr> @ 1998-02-24 23:38 ` Stephen C. Tweedie 0 siblings, 0 replies; 12+ messages in thread From: Stephen C. Tweedie @ 1998-02-24 23:38 UTC (permalink / raw) To: Stephane Casset; +Cc: Stephen C. Tweedie, linux-kernel, linux-mm On Tue, 24 Feb 1998 15:31:37 +0000 (GMT), Stephane Casset <sept@renass3.u-strasbg.fr> said: >> The patch below, against 2.1.88, adds a bunch of new functionality to >> the swapper. The main changes are: > I tried it but got the following message : > ipc/ipc.o: In function `shm_swap_in': > ipc/ipc.o(.text+0x37e4): undefined reference to `read_swap_page' > ipc/ipc.o: In function `shm_swap': > ipc/ipc.o(.text+0x3b57): undefined reference to `write_swap_page' > make: *** [vmlinux] Error 1 The diff below includes a patch against ipc/shm.c was missing from my first post, and another fix for spurious warnings about shared dirty pages. Cheers, Stephen. ---------------------------------------------------------------- Index: ipc/shm.c =================================================================== RCS file: /home/rcs/CVS/kswap3/linux/ipc/shm.c,v retrieving revision 1.1 retrieving revision 1.2 diff -u -r1.1 -r1.2 --- shm.c 1998/02/24 08:50:30 1.1 +++ shm.c 1998/02/24 08:51:37 1.2 @@ -689,7 +689,7 @@ goto done; } if (!pte_none(pte)) { - read_swap_page(pte_val(pte), (char *) page); + rw_swap_page_nocache(READ, pte_val(pte), (char *)page); pte = __pte(shp->shm_pages[idx]); if (pte_present(pte)) { free_page (page); /* doesn't sleep */ @@ -820,7 +820,7 @@ if (atomic_read(&mem_map[MAP_NR(pte_page(page))].count) != 1) goto check_table; shp->shm_pages[idx] = swap_nr; - write_swap_page (swap_nr, (char *) pte_page(page)); + rw_swap_page_nocache (WRITE, swap_nr, (char *) pte_page(page)); free_page(pte_page(page)); swap_successes++; shm_swp++; Index: mm/page_io.c =================================================================== RCS file: /home/rcs/CVS/kswap3/linux/mm/page_io.c,v retrieving revision 1.4 diff -u -r1.4 page_io.c --- page_io.c 1998/02/23 22:14:27 1.4 +++ page_io.c 1998/02/24 09:28:08 @@ -201,7 +201,9 @@ } page->inode = &swapper_inode; page->offset = entry; + atomic_inc(&page->count); /* Protect from shrink_mmap() */ rw_swap_page(rw, entry, buffer, 1); + atomic_dec(&page->count); page->inode = 0; clear_bit(PG_swap_cache, &page->flags); } Index: mm/vmscan.c =================================================================== RCS file: /home/rcs/CVS/kswap3/linux/mm/vmscan.c,v retrieving revision 1.5 diff -u -r1.5 vmscan.c --- vmscan.c 1998/02/23 22:14:28 1.5 +++ vmscan.c 1998/02/24 09:22:47 @@ -108,18 +108,16 @@ * * -- Stephen Tweedie 1998 */ - if (pte_write(pte)) { - /* - * We _will_ allow dirty cached mappings later on, once - * MAP_SHARED|MAP_ANONYMOUS is working, but for now - * catch this as a bug. - */ - if (is_page_shared(page_map)) { - printk ("VM: Found a shared writable dirty page!\n"); + if (PageSwapCache(page_map)) { + if (pte_write(pte)) { + printk ("VM: Found a writable swap-cached page!\n"); return 0; } - if (PageSwapCache(page_map)) { - printk ("VM: Found a writable swap-cached page!\n"); + /* We _will_ allow dirty cached mappings later + * on, once MAP_SHARED|MAP_ANONYMOUS is working, + * but for now catch this as a bug. */ + if (is_page_shared(page_map)) { + printk ("VM: Found a shared writable dirty page!\n"); return 0; } } ---------------------------------------------------------------- ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~1998-02-25 22:05 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <199802192321.XAA06580@dax.dcs.ed.ac.uk>
1998-02-20 5:41 ` How to read-protect a vm_area? Benjamin C.R. LaHaise
1998-02-23 23:17 ` PATCH: Swap shared pages (was: How to read-protect a vm_area?) Stephen C. Tweedie
1998-02-23 23:27 ` Linus Torvalds
1998-02-24 0:08 ` Benjamin C.R. LaHaise
1998-02-24 9:45 ` Stephen C. Tweedie
1998-02-24 9:42 ` Rik van Riel
1998-02-24 23:38 ` Stephen C. Tweedie
1998-02-25 10:41 ` Rik van Riel
1998-02-25 19:00 ` Stephen C. Tweedie
1998-02-25 22:05 ` Rik van Riel
1998-02-24 11:16 ` Thomas Sailer
[not found] ` <Pine.LNX.3.96.980224152231.7112A-100000@renass3.u-strasbg.fr>
1998-02-24 23:38 ` Stephen C. Tweedie
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox