On 1/26/2016 9:43 PM, Dan Williams wrote:
On Mon, Jan 25, 2016 at 12:35 PM, Julian Margetson runaway@candw.ms wrote:
On 1/25/2016 3:20 PM, Dan Williams wrote:
[..]
Hmm, this commit could only cause a behavior change if it modifies the value of the pfn as seen by insert_pfn(). Can you try the attached debug patch to see if that assumption is being violated?
Had to manually delete the lines in the second part of the patch.
Sorry about that, I had based it directly on the failing commit rather than 4.5-rc1. A reflowed version is attached.
[ 42.557813] Oops: Machine check, sig: 7 [#1]
[ 42.562350] PREEMPT Canyonlands
[ 42.565692] Modules linked in:
[ 42.568933] CPU: 0 PID: 495 Comm: Xorg Tainted: G W 4.5.0-rc1-Sam460ex #1
[ 42.577291] task: ee3adcc0 ti: ee260000 task.ti: ee260000
[ 42.582984] NIP: 1ff72480 LR: 1ff72404 CTR: 1ff724d0
[ 42.588220] REGS: ee261f10 TRAP: 0214 Tainted: G W (4.5.0-rc1-Sam460ex)
[ 42.596663] MSR: 0002d000 <CE,EE,PR,ME> CR: 24004242 XER: 00000000
[ 42.603512] GPR00: 1f436134 bfc4dac0 b79cb6f0 b718dffc b69a4008 00000780 00000004 00000000
GPR08: 00000000 b718dffc 00000000 bfc4da70 1ff72404 2080dff4 00000000 00000780
GPR16: 00000000 00000020 00000000 00000000 00001e00 20aaa620 00000438 b69a4008
GPR24: 00000780 bfc4db18 20a94760 b718e000 b718e000 b69a4008 2007aff4 00001e00
[ 42.635363] NIP [1ff72480] 0x1ff72480
[ 42.639225] LR [1ff72404] 0x1ff72404
[ 42.642991] Call Trace:
[ 42.798393] ---[ end trace 8fcfa5f0e9942055 ]---
I'm not familiar with powerpc crash dumps, so there's not much information I can glean from this. Can any folks on the cc translate a powerpc "Machine check"?
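As a rough aid for reading the MSR line in the dump above, here is a minimal userspace sketch that turns an MSR value into the same flag list the kernel prints. The bit masks are assumptions chosen to match the dump's own `<CE,EE,PR,ME>` annotation for 0002d000. The names `msr_decode` and `msr_is_user_mode` are hypothetical helpers, not kernel functions. It only covers the four bits seen in this thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical table: masks assumed from the dump's <CE,EE,PR,ME> annotation. */
struct msr_bit {
    unsigned int mask;
    const char *name;
};

static const struct msr_bit msr_bits[] = {
    { 0x00020000u, "CE" },  /* critical interrupt enable */
    { 0x00008000u, "EE" },  /* external interrupt enable */
    { 0x00004000u, "PR" },  /* problem state, i.e. user mode */
    { 0x00001000u, "ME" },  /* machine check enable */
};

/* Write a comma-separated list of the set bits into out and return out. */
static char *msr_decode(unsigned int msr, char *out, size_t len)
{
    size_t i;

    assert(out != NULL && len > 0);
    out[0] = '\0';
    for (i = 0; i < sizeof(msr_bits) / sizeof(msr_bits[0]); i++) {
        if (!(msr & msr_bits[i].mask))
            continue;
        if (out[0] != '\0')
            strncat(out, ",", len - strlen(out) - 1);
        strncat(out, msr_bits[i].name, len - strlen(out) - 1);
    }
    return out;
}

/* Non-zero when the PR bit is set, meaning the trap was taken from user mode. */
static int msr_is_user_mode(unsigned int msr)
{
    return (msr & 0x00004000u) != 0;
}
```

Calling `msr_decode(0x0002d000u, buf, sizeof(buf))` with a 32-byte buffer reproduces the flag list from the dump, and `msr_is_user_mode()` reports that PR is set there, which fits the user-space NIP address in that oops.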
I'm down to looking at differences between the passing and failing cases. Can you print out the value of the pte entry and the pfn in insert_pfn(), like the following:
diff --git a/mm/memory.c b/mm/memory.c
index 30991f83d0bf..c44e387130b2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1521,6 +1521,8 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 		entry = pte_mkdevmap(pfn_t_pte(pfn, prot));
 	else
 		entry = pte_mkspecial(pfn_t_pte(pfn, prot));
+	pr_info("%s: entry: %#llx pfn: %#lx\n", __func__,
+			(unsigned long long) entry, pfn_t_to_pfn(pfn));
 	set_pte_at(mm, addr, pte, entry);
 	update_mmu_cache(vma, addr, pte); /* XXX: why not for insert_page? */
...of course for the passing case you'll need to drop the call to pfn_t_to_pfn() and just print the pfn directly.
Thank you for the help tracking this down, it's much appreciated.
Happy to help out. Just need some guidance sometimes as I am relatively new at this.
-----------------------------------------------------------------------------------------------------------------------------
[ 15.802615] systemd[1]: Started Journal Service.
[ 44.263074] Oops: Machine check, sig: 7 [#1]
[ 44.267603] PREEMPT Canyonlands
[ 44.270938] Modules linked in:
[ 44.274182] CPU: 0 PID: 586 Comm: Xorg Tainted: G W 4.5.0-rc1-Sam460ex #2
[ 44.282538] task: ecd505c0 ti: efff2000 task.ti: ecd76000
[ 44.288239] NIP: c0000cec LR: 1fb81404 CTR: 1fb814d0
[ 44.293483] REGS: efff3f10 TRAP: 0214 Tainted: G W (4.5.0-rc1-Sam460ex)
[ 44.301926] MSR: 00021000 <CE,ME> CR: 84004242 XER: 00000000
[ 44.308185] GPR00: 1f045134 bfd0ce80 b7e7b6f0 b763dffc b6e54008 00000780 00000004 00000000
GPR08: 00000000 b763dffc b6e54010 ecf50000 ecf50000 00000009 00000000 00000780
GPR16: 00000000 00000020 00000000 00000000 00001e00 2079b638 00000438 b6e54008
GPR24: 00000780 bfd0ced8 20785770 b763e000 b763e000 b6e54008 1fc89ff4 00001e00
[ 44.340039] NIP [c0000cec] DataTLBError44x+0x6c/0x90
[ 44.345279] LR [1fb81404] 0x1fb81404
[ 44.349053] Call Trace:
[ 44.351631] Instruction dump:
[ 44.354776] 7d7342a6 816b0040 7d92eaa6 7db00aa6 51ac063e 7d92eba6 7d9e0aa6 39a00009
[ 44.363081] 518d57bc 554c6cfa 7d6c582e 556c0029 <4182003c> 514cbd38 816c0000 818c0004
[ 44.524699] ---[ end trace 439fa29153308785 ]---
[ 44.529322]
[ 47.216536] insert_pfn: entry: 0x80ed246b pfn: 0x80ed2
[ 47.221777] insert_pfn: entry: 0x80ed346b pfn: 0x80ed3
[ 47.228485] insert_pfn: entry: 0x80ed446b pfn: 0x80ed4
[ 47.237798] insert_pfn: entry: 0x80ed546b pfn: 0x80ed5
[ 47.249809] insert_pfn: entry: 0x80ed646b pfn: 0x80ed6
[ 47.257588] insert_pfn: entry: 0x80ed746b pfn: 0x80ed7
[ 47.265879] insert_pfn: entry: 0x80ed846b pfn: 0x80ed8
[ 47.275825] insert_pfn: entry: 0x80ed946b pfn: 0x80ed9
[ 47.281437] insert_pfn: entry: 0x80eda46b pfn: 0x80eda
[ 47.288113] insert_pfn: entry: 0x80edb46b pfn: 0x80edb
[ 47.293660] insert_pfn: entry: 0x80edc46b pfn: 0x80edc
[ 47.299834] insert_pfn: entry: 0x80edd46b pfn: 0x80edd
[ 47.305223] insert_pfn: entry: 0x80ede46b pfn: 0x80ede
[ 47.314891] insert_pfn: entry: 0x80edf46b pfn: 0x80edf
[ 47.329777] insert_pfn: entry: 0x80ee046b pfn: 0x80ee0
[ 47.339769] insert_pfn: entry: 0x80ee146b pfn: 0x80ee1
[ 47.349777] Machine check in kernel mode.
[ 47.353814] Data Write PLB Error
[ 47.357049] Vector: 214 at [efff3f10]
[ 47.360799] pc: c0000cec: DataTLBError44x+0x6c/0x90
[ 47.366085] lr: 2008f404
[ 47.369002] sp: bfe76110
[ 47.371885] msr: 21000
[ 47.374506] current = 0xeced85c0
[ 47.377910] pid = 668, comm = Xorg
[ 47.381835] Linux version 4.5.0-rc1-Sam460ex (root@julian-VirtualBox) (gcc version 4.8.2 (Ubuntu 4.8.2-16ubuntu3) ) #2 PREEMPT Wed Jan 27 06:07:01 AST 2016
[ 47.395758] enter ? for help
[ 47.398638] mon> <no input ...>
[ 49.401927] Oops: Machine check, sig: 7 [#2]
[ 49.406450] PREEMPT Canyonlands
[ 49.409783] Modules linked in:
[ 49.413026] CPU: 0 PID: 668 Comm: Xorg Tainted: G D W 4.5.0-rc1-Sam460ex #2
[ 49.421383] task: eced85c0 ti: efff2000 task.ti: ecf8c000
[ 49.427075] NIP: c0000cec LR: 2008f404 CTR: 2008f4d0
[ 49.432311] REGS: efff3f10 TRAP: 0214 Tainted: G D W (4.5.0-rc1-Sam460ex)
[ 49.440755] MSR: 00021000 <CE,ME> CR: 88004262 XER: 00000000
[ 49.447013] GPR00: 1f553134 bfe76110 b7d6d6f0 b752fffc b6d46008 00000780 00000004 00000000
GPR08: 00000000 b752fffc b6d46010 ecef9000 ecef9000 00000009 00000000 00000780
GPR16: 00000000 00000020 00000000 00000000 00001e00 20eb5650 00000438 b6d46008
GPR24: 00000780 bfe76168 20e9f728 b7530000 b7530000 b6d46008 20197ff4 00001e00
[ 49.478867] NIP [c0000cec] DataTLBError44x+0x6c/0x90
[ 49.484108] LR [2008f404] 0x2008f404
[ 49.487881] Call Trace:
[ 49.490460] Instruction dump:
[ 49.493603] 7d7342a6 816b0040 7d92eaa6 7db00aa6 51ac063e 7d92eba6 7d9e0aa6 39a00009
[ 49.501909] 518d57bc 554c6cfa 7d6c582e 556c0029 <4182003c> 514cbd38 816c0000 818c0004
[ 49.510404] ---[ end trace 439fa29153308786 ]---
[ 49.515026]
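
To make the insert_pfn lines in the log above easier to compare between a passing and a failing kernel, here is a small userspace sketch that splits each printed entry into a frame number and its low bits. It assumes 4 KiB pages (a shift of 12), which is an assumption that happens to fit the printed values, not something the log states. `split_entry` and `entry_matches_pfn` are hypothetical helper names, and no attempt is made to interpret the individual flag bits.

```c
#include <assert.h>
#include <stdint.h>

#define ASSUMED_PAGE_SHIFT 12  /* assumption: 4 KiB pages */

struct pte_parts {
    uint64_t pfn;       /* entry with the low ASSUMED_PAGE_SHIFT bits removed */
    uint64_t low_bits;  /* whatever sits below the frame number */
};

/* Split a printed "entry" value into frame number and low bits. */
static struct pte_parts split_entry(uint64_t entry)
{
    struct pte_parts p;

    p.pfn = entry >> ASSUMED_PAGE_SHIFT;
    p.low_bits = entry & ((UINT64_C(1) << ASSUMED_PAGE_SHIFT) - 1);
    return p;
}

/* Non-zero when the frame number embedded in entry equals the printed pfn. */
static int entry_matches_pfn(uint64_t entry, uint64_t pfn)
{
    return split_entry(entry).pfn == pfn;
}

/* Abort if a log line's entry and pfn disagree under the assumed shift. */
static void assert_entry_consistent(uint64_t entry, uint64_t pfn)
{
    assert(entry_matches_pfn(entry, pfn));
}
```

For the first logged pair, `split_entry(0x80ed246b)` gives a frame number of 0x80ed2 and low bits of 0x46b, so the entry and the printed pfn agree with each other under this assumption. That only shows the two printed values are mutually consistent; whether the pfn itself is the value the driver intended cannot be judged from this log alone and needs the same output from the passing kernel.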