https://bugs.freedesktop.org/show_bug.cgi?id=98915
Bug ID: 98915
Summary: NULL pointer dereference on boot - amdgpu_debugfs_add_files
Product: DRI
Version: DRI git
Hardware: Other
OS: All
Status: NEW
Severity: major
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: rafael.ristovski@gmail.com
Created attachment 128289 --> https://bugs.freedesktop.org/attachment.cgi?id=128289&action=edit Kernel log
When booting linux-next 20161129 or later (IIRC 20161128 worked fine), the kernel hits a NULL pointer dereference that can be traced to amdgpu_debugfs_add_files.
Kernel log attached.
Rafael Ristovski rafael.ristovski@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|All                         |Linux (All)
           Hardware|Other                       |x86-64 (AMD64)
--- Comment #1 from Rafael Ristovski rafael.ristovski@gmail.com ---
HW Details:
AMD Radeon HD 8850M (Mobile chipset)
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Venus PRO [Radeon HD 8850M / R9 M265X] [1002:6823] (prog-if 00 [VGA controller])
    Subsystem: Dell Venus PRO [Radeon HD 8850M / R9 M265X] [1028:05eb]
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 48
    Region 0: Memory at a0000000 (64-bit, prefetchable) [size=256M]
    Region 2: Memory at c0500000 (64-bit, non-prefetchable) [size=256K]
    Region 4: I/O ports at 3000 [size=256]
    Expansion ROM at c0540000 [disabled] [size=128K]
    Capabilities: [48] Vendor Specific Information: Len=08 <?>
    Capabilities: [50] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
        LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee0f00c  Data: 4172
    Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
    Capabilities: [150 v2] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [270 v1] #19
    Kernel driver in use: amdgpu
    Kernel modules: radeon, amdgpu
Rafael Ristovski rafael.ristovski@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|major                       |critical
Rafael Ristovski rafael.ristovski@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|medium                      |high
--- Comment #2 from Alex Deucher alexdeucher@gmail.com ---
Can you bisect?
--- Comment #3 from Nicolai Stange nicstange@gmail.com ---
(In reply to Alex Deucher from comment #2)
> Can you bisect?
No need: most likely, the offending commit is 8a357d10043c ("drm: Nerf DRM_CONTROL nodes").
Cf. the discussion at http://lkml.kernel.org/r/20161203144700.2307-1-nicstange@gmail.com
A patch for amdgpu is in the works as well.
Rafael Ristovski rafael.ristovski@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED
--- Comment #4 from Rafael Ristovski rafael.ristovski@gmail.com ---
Fixed as of https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=...
drm/amdgpu: don't add files at control minor debugfs directory
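
For readers skimming the thread, the failure mode and the shape of the fix named in the commit title can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: struct minor, struct device and add_files are hypothetical stand-ins, and the sketch assumes that after "drm: Nerf DRM_CONTROL nodes" the control minor pointer is NULL, so any access through it faults.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the DRM structures; not the kernel definitions. */
struct minor {
    const char *debugfs_root;
};

struct device {
    struct minor *primary;
    struct minor *control; /* assumed NULL once control nodes are no longer created */
};

/*
 * Register a debugfs entry for each usable minor and return how many
 * directories received it. The control minor is deliberately skipped:
 * reading dev->control->debugfs_root while control is NULL is the kind
 * of dereference reported in this bug.
 */
static int add_files(const struct device *dev, const char *name)
{
    int registered = 0;

    assert(dev != NULL);

    if (dev->primary != NULL) {
        printf("adding %s under %s\n", name, dev->primary->debugfs_root);
        registered++;
    }

    return registered;
}
```

Calling add_files on a device whose control pointer is NULL then only touches the primary minor, which mirrors the idea in the commit title of no longer adding files to the control minor's debugfs directory.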