Hello there,
I *think* I found a regression (card/system freeze in AGP mode) that must have been in the DRM code for quite some time (since the switch to KMS drivers), and possibly also a solution (re-apply an old patch from pre-KMS days). Older cards seem to be affected (actually, very old cards :-) before R600). I mailed this to the ati driver mailing list, but was told that this is a kernel/DRM subject now, so I am forwarding the mail exchange to this list. Details below; one has to start reading from the end upwards to get the chronological order, of course.
Could somebody give me a hint on how to re-apply the old patch, or tell me whether the info I found is valid? The next step I would take is to insert some diagnostic messages in radeon_vram_location (see below) and build a new kernel.
Cheers
Jochen
-------- Original Message --------
Subject: Fwd: Fwd: Fwd: Re: regression on RV280 card freeze, patch not applicable any more
Date: Fri, 25 Oct 2013 15:04:33 +0200
From: Jochen Rollwagen joro-2013@t-online.de
To: xorg-driver-ati@lists.x.org
more info (and possible solution):
void radeon_vram_location in radeon_device.c says
 * Note: GTT start, end, size should be initialized before calling this
 * function on AGP platform.
 *
 * Note: We don't explicitly enforce VRAM start to be aligned on VRAM size,
 * this shouldn't be a problem as we are using the PCI aperture as a reference.
 * Otherwise this would be needed for rv280, all r3xx, and all r4xx, but
 * not IGP.
 *
So does this mean I just have to re-apply the old patch I found? struct radeon_mc in radeon.h contains aper_base as a member, which could be set/aligned to VRAM size using the code snippet below.
Cheers
Jochen
-------- Original Message --------
Subject: Fwd: Fwd: Re: regression on RV280 card freeze, patch not applicable any more
Date: Fri, 25 Oct 2013 11:31:32 +0200
From: Jochen Rollwagen joro-2013@t-online.de
To: xorg-driver-ati@lists.x.org
I've done some more research and found the following:
- There's another follow-on patch ("Extend the alignment workaround to post-rv280 chips as well") to the one indicated below (http://cgit.freedesktop.org/~agd5f/xf86-video-ati/commit/?id=b2145aea36bb035...) that applies not only to RV280 but to "rv280, all r3xx, and all r4xx, but not IGP".
- The piece of code affected seems to be (IMHO) in drivers/gpu/drm/radeon/: the (Radeon?) register RADEON_CONFIG_APER_0_BASE is defined in radeon_reg.h but never used in the driver:
radeon_reg.h:#define RADEON_CONFIG_APER_0_BASE 0x0100
in r100.c there's
static u32 r100_get_accessible_vram(struct radeon_device *rdev)
{
	u32 aper_size;
	u8 byte;

	aper_size = RREG32(RADEON_CONFIG_APER_SIZE);

	/* Set HDP_APER_CNTL only on cards that are known not to be broken,
	 * that is has the 2nd generation multifunction PCI interface
	 */
	if (rdev->family == CHIP_RV280 ||
	    rdev->family >= CHIP_RV350) {
		WREG32_P(RADEON_HOST_PATH_CNTL, RADEON_HDP_APER_CNTL,
			 ~RADEON_HDP_APER_CNTL);
		DRM_INFO("Generation 2 PCI interface, using max accessible memory\n");
		return aper_size * 2;
	}
That's the code executed on my machine according to dmesg. Missing (from the original patch, not applicable any more because of driver reorganization) seems to be
	CARD32 aper0_base = INREG(RADEON_CONFIG_APER_0_BASE);
	aper0_base &= ~(mem_size - 1);
	info->mc_fb_location = (aper0_base >> 16);
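For illustration, the masking step in the three lines above can be tried in isolation. The following is a minimal user-space sketch, not kernel or X driver code: the helper names `align_down_pow2` and `fb_start_field` are hypothetical, and it assumes the size is a power of two, which is what makes the `~(size - 1)` mask round the base down to a multiple of the size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: round `base` down to a multiple of `size`.
 * The mask trick only works when `size` is a power of two, so that
 * precondition is checked explicitly.
 */
static uint32_t align_down_pow2(uint32_t base, uint32_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return base & ~(size - 1);
}

/*
 * Hypothetical helper mirroring the `aper0_base >> 16` step of the
 * snippet: keep only the upper 16 bits of the aligned base.
 */
static uint32_t fb_start_field(uint32_t aligned_base)
{
    return aligned_base >> 16;
}
```

With a 128 MB size (0x08000000), a base of 0x9C000000 would be rounded down to 0x98000000, while a base that is already a multiple of the size is returned unchanged.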
The patch that seems to have removed/overridden this code is:
http://www.mail-archive.com/dri-devel@lists.sourceforge.net/msg41307.html
According to that patch, it was "booted on PCI r100, PCIE rv370, IGP rs400". So IMHO this could be a classic regression for an AGP RV280 card (like mine) and might explain why PCI mode works. This is additionally corroborated by this post (http://comments.gmane.org/gmane.comp.freedesktop.xorg/5429):
"* The above doesn't necessarily work. For example, I've seen machines
 * with 128Mb configured as 2x64Mb apertures. I'm now _always_ setting
 * RADEON_HOST_PATH_CNTL.
OUTREGP (RADEON_HOST_PATH_CNTL, RADEON_HDP_APER_CNTL, ~RADEON_HDP_APER_CNTL);
(which was previously done only on some chip families).
I _think_ this is not correct on all cards as the apertures may not be configured correctly (and X doesn't set them up neither, if those correspond to the RADEON_CONFIG_APER registers)"
Could a Radeon guru confirm this, or am I totally lost?
Cheers
Jochen
-------- Original Message --------
Subject: Fwd: Re: regression on RV280 card freeze, patch not applicable any more
Date: Fri, 18 Oct 2013 15:32:18 +0200
From: Jochen Rollwagen joro-2013@t-online.de
To: xorg-driver-ati@lists.x.org
Sorry about that.
Anyway, I checked drivers/gpu/drm/radeon and drivers/char/agp/uninorth-agp.c and can't seem to find the patch indicated below. Might it have gone missing :-) ?
On 08.10.2013 18:41, Michel Dänzer wrote:
> [ Please always follow up to the mailing list ]
> On Tue, 2013-10-08 at 14:53 +0200, Jochen Rollwagen wrote:
>> On 08.10.2013 10:03, Michel Dänzer wrote:
>>> On Sat, 2013-10-05 at 15:13 +0200, Jochen Rollwagen wrote:
>>>> I’m running an RV280-based Radeon 9200 card (I know, an ancient card) in a Mac Mini G4 (powerpc architecture) with Ubuntu Precise and the latest 3.4.64 kernel/ati driver, and I get lockups when trying to run the card in AGP mode (KMS enabled). The lockups happen when resetting the card (that’s what I can infer from the oops screen).
>>> It's the other way around: The kernel radeon driver resets the card to try and get it running again after a lockup.
>>>> PCI mode works. After researching I found an old bug that was fixed back in 2006 (https://bugs.freedesktop.org/show_bug.cgi?id=6011) that looks like the freeze I experience (since PCI mode – which allocates 64 MB of memory – works, and AGP mode, which by default allocates 256 MB, doesn’t). The card has 64 MB of memory.
>>>> So the first question is, could this be the problem that causes the lockups?
>>> Not really. The GART and VRAM memory apertures aren't directly related, and the fix for the bug above should still be incorporated in the current radeon KMS code.
>>> Does radeon.agpmode=1 or radeon.agpmode=4 work?
>> Thank you for your reply. First, none of the agpmodes work, they just take more or less time to lock up the card (1 - slowest, 4 - fastest). Secondly, if you write that the fix "should be incorporated in the current code", I'm somewhat lost because it definitely isn't there.
> It's in the kernel now.
Well........no. I checked the 3.4.64 kernel sources after my last mail and the code isn't in the drivers/gpu/drm/radeon sources. But of course I might have overlooked something.
On Fri, Nov 8, 2013 at 2:35 AM, Jochen Rollwagen joro-2013@t-online.de wrote:
> Hello there,
> I *think* I found a regression (card/system freeze in AGP mode) that must have been in the DRM code for quite some time (since the switch to KMS drivers), and possibly also a solution (re-apply an old patch from pre-KMS days). Older cards seem to be affected (actually, very old cards :-) before R600). I mailed this to the ati driver mailing list, but was told that this is a kernel/DRM subject now, so I am forwarding the mail exchange to this list. Details below; one has to start reading from the end upwards to get the chronological order, of course.
> Could somebody give me a hint on how to re-apply the old patch, or tell me whether the info I found is valid? The next step I would take is to insert some diagnostic messages in radeon_vram_location (see below) and build a new kernel.
I'm not entirely sure what you are proposing to change, but none of the stuff you mentioned below has to do with AGP. AGP is unstable because AGP sucks. KMS unfortunately exacerbates that since it makes much greater use of AGP memory than UMS ever did. To really fix AGP with KMS, we'd probably need to manage two pools of memory, one non-cache-coherent pool for AGP and one cache coherent pool that used the on-chip gart, then fix up the kernel and userspace accel drivers to use the appropriate pools.
Alex
dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
I started investigating the problem because AGP mode used to work with the UMS drivers (although I now understand they didn’t really use AGP memory), and in the second patch I mentioned below Benjamin Herrenschmidt stated that “there's a chip errata (for the pre-R600 chips). On those chips, the aperture must be aligned to the aperture size (that is FB_START in MC_FB_LOCATION must be aligned to the aperture size).” Since that workaround/patch definitely isn’t in the current DRM code any more (it probably got lost in the transition to KMS), my idea was to re-apply the three-line workaround/patch (which should be quite trivial given the comment in radeon_vram_location in the DRM kernel code). But since you indicated that AGP has nothing to do with that, and PCI mode (agpmode = -1) works, I guess there’s no point in doing that.
Additionally, I found out that the UniNorth chipset used in my (PowerMac) machine has some very “special” features: basically, it doesn’t do address mapping via the GART at all but directly uses a 256 MB AGP aperture (other sizes seem not to be supported), which has to be allocated coherently non-cacheable (OpenBSD seems to have a working AGP/DRM radeon PowerMac driver; the info comes from their source code). Although it would be rather trivial to port the OpenBSD code, I’ll stick with PCI mode for now.
Thanks for the info
Jochen
On Wed, 2013-11-13 at 19:07 +0100, Jochen Rollwagen wrote:
> I started investigating the problem because AGP mode used to work with the UMS drivers (although I now understand they didn’t really use AGP memory)
They used AGP memory but didn't dynamically bind memory to the AGP aperture or unbind it from there.
> and in the second patch I mentioned below Benjamin Herrenschmidt stated that “there's a chip errata (for the pre-R600 chips). On those chips, the aperture must be aligned to the aperture size (that is FB_START in MC_FB_LOCATION must be aligned to the aperture size).” Since that workaround/patch definitely isn’t in the current DRM code any more (probably got lost in the transition to KMS) my idea was to re-apply the three-line workaround/patch (which should be quite trivial given the comment in radeon_vram_location in the drm kernel code).
So, have you verified that the aperture is not aligned to its size for you? If it's not, does aligning it help stability with AGP?
I think there are two issues here. The first is the missing alignment workaround: since I'll be upgrading to 3.4.69 anyway, I'll insert some diagnostic messages in radeon_device.c and see what happens. But I'm pretty certain now that this isn't the cause of the lockups. They are probably (quite certainly) caused by the dynamic binding/unbinding of AGP memory, which the UniNorth chipset used in 32-bit PowerMacs obviously doesn't support. All it seems to support is statically allocating a 256 MB contiguous non-cacheable AGP aperture and using that (since the chipset doesn't do any address mapping via the GART, as indicated in the OpenBSD code). So to get AGP mode working again on those machines, one would have to disable the dynamic memory stuff. Question: Would that require changes in the driver only, or also in the DRM?
Cheers
Jochen
On Fri, 2013-11-15 at 08:49 +0100, Jochen Rollwagen wrote:
> I think there are two issues here: the first is the missing alignment workaround, since I'll be upgrading to 3.4.69 anyway I'll insert some diagnostic messages in radeon_device.c and see what happens.
Yes, please do that before speculating more about the problem.
> But I'm pretty certain now that this isn't the cause for the lockups. They are probably (quite certainly) caused by the dynamic binding/unbinding of AGP memory which the UniNorth chipset used in 32-bit PowerMacs obviously doesn't support.
"doesn't support" is too strong; it's working fine on this PowerBook5,8. But the older the revision of UniNorth, the more quirks.
> All it supports seems to be statically allocating a 256 MB contiguous non-cacheable AGP aperture and using that (since the chipset doesn't do any address mapping via the GART as indicated in the OpenBSD code).
It does address mapping for the GPU, that's the whole point of the GART. What UniNorth doesn't do in contrast to most AGP bridges is provide a linear aperture to the CPU as well. But that shouldn't be an issue per se.
> So to get AGP mode working again on those machines one would have to disable the dynamic memory stuff. Question: Would that require changes in the driver only or also in the DRM?
It's not really possible with radeon KMS.
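The remark above that the GART does address mapping for the GPU can be pictured with a toy model. This is only a sketch under stated assumptions, not how UniNorth or the radeon driver implement it: the `toy_gart` struct, the function name, the 4 KiB page size and the "0 means unbound" convention are all hypothetical and chosen for illustration. A GPU address inside the aperture selects a page-table entry, and that entry supplies the system page backing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u /* assumed page size for this toy model */

/* Hypothetical, simplified picture of a GART: one table entry per aperture page. */
struct toy_gart {
    uint64_t aperture_base;     /* GPU-visible address where the aperture starts */
    size_t num_pages;           /* number of entries in page_table */
    const uint64_t *page_table; /* entry i: address of the backing page, 0 = unbound */
};

/*
 * Translate a GPU address inside the aperture to the address of the
 * backing memory. Returns 0 and stores the result in *phys on success,
 * or -1 if the address is outside the aperture or the page is unbound.
 */
static int toy_gart_translate(const struct toy_gart *g, uint64_t gpu_addr,
                              uint64_t *phys)
{
    uint64_t offset, index, page;

    assert(g != NULL && phys != NULL);

    if (gpu_addr < g->aperture_base)
        return -1;

    offset = gpu_addr - g->aperture_base;
    index = offset / TOY_PAGE_SIZE;
    if (index >= g->num_pages)
        return -1;

    page = g->page_table[index];
    if (page == 0)
        return -1; /* nothing bound at this aperture page */

    *phys = page + offset % TOY_PAGE_SIZE;
    return 0;
}
```

In this picture, dynamically binding or unbinding memory corresponds to rewriting table entries while the aperture is in use, whereas a static setup fills the table once and leaves it alone.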
Here is the dmesg output for PCI and AGP mode with kernel 3.4.69:
PCI mode: [ 0.852172] Linux agpgart interface v0.103 [ 0.852198] agpgart-uninorth 0000:00:0b.0: Apple UniNorth 2 chipset [ 0.853260] agpgart-uninorth 0000:00:0b.0: configuring for size idx: 64 [ 0.853339] agpgart-uninorth 0000:00:0b.0: AGP aperture is 256M @ 0x0 ... [ 2.542722] [drm] Initialized drm 1.1.0 20060810 ... [ 2.747123] [drm] radeon kernel modesetting enabled. [ 2.747234] radeon 0000:00:10.0: enabling device (0006 -> 0007) [ 2.748887] [drm] initializing kernel modesetting (RV280 0x1002:0x5962 0x1002:0x5962). [ 2.748898] [drm] Forcing AGP to PCI mode [ 2.748913] [drm] register mmio base: 0x90000000 [ 2.748916] [drm] register mmio size: 65536 [ 2.748966] radeon 0000:00:10.0: Invalid ROM contents [ 2.748991] radeon 0000:00:10.0: Invalid ROM contents [ 2.749002] [drm:radeon_get_bios] *ERROR* Unable to locate a BIOS ROM [ 2.749022] [drm] Using device-tree clock info [ 2.749028] [drm] Generation 2 PCI interface, using max accessible memory [ 2.749035] radeon 0000:00:10.0: RADEON_CONFIG_APER_0_BASE: 0x9800000098000000 (my message) [ 2.749041] radeon 0000:00:10.0: VRAM: 128M 0x0000000098000000 - 0x000000009FFFFFFF (64M used) [ 2.749046] radeon 0000:00:10.0: GTT in radeon_vram_location: 512M 0x0000000000000000 - 0x0000000000000000 (my message) [ 2.749053] radeon 0000:00:10.0: GTT: 512M 0x0000000078000000 - 0x0000000097FFFFFF [ 2.749066] [drm] Detected VRAM RAM=128M, BAR=128M [ 2.749070] [drm] RAM width 64bits DDR [ 2.752299] [TTM] Zone kernel: Available graphics memory: 381972 kiB [ 2.752305] [TTM] Zone highmem: Available graphics memory: 513044 kiB [ 2.752309] [TTM] Initializing pool allocator [ 2.752318] [TTM] Initializing DMA pool allocator [ 2.752391] [drm] radeon: 64M of VRAM memory ready [ 2.752396] [drm] radeon: 512M of GTT memory ready. [ 2.752438] [drm] GART: num cpu pages 131072, num gpu pages 131072 [ 2.762558] [drm] radeon: ib pool ready. [ 2.855566] [drm] PCIE GART of 512M enabled (table at 0x0000000002880000). ... 
[ 2.874349] radeon 0000:00:10.0: WB disabled [ 2.874367] [drm] fence driver on ring 0 use gpu addr 0x78000000 and cpu addr 0xc256e000 [ 2.875151] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010). [ 2.875158] [drm] Driver supports precise vblank timestamp query. [ 2.875194] [drm] radeon: irq initialized. [ 2.876030] [drm] Loading R200 Microcode [ 2.895607] [drm] radeon: ring at 0x0000000078001000 [ 2.895634] [drm] ring test succeeded in 0 usecs
AGP mode 1: [ 0.852235] Linux agpgart interface v0.103 [ 0.852262] agpgart-uninorth 0000:00:0b.0: Apple UniNorth 2 chipset [ 0.853324] agpgart-uninorth 0000:00:0b.0: configuring for size idx: 64 [ 0.853404] agpgart-uninorth 0000:00:0b.0: AGP aperture is 256M @ 0x0 ... [ 2.548750] [drm] Initialized drm 1.1.0 20060810 ... [ 2.751298] [drm] radeon kernel modesetting enabled. [ 2.751414] radeon 0000:00:10.0: enabling device (0006 -> 0007) [ 2.760316] [drm] initializing kernel modesetting (RV280 0x1002:0x5962 0x1002:0x5962). [ 2.760667] [drm] register mmio base: 0x90000000 [ 2.760671] [drm] register mmio size: 65536 [ 2.761003] radeon 0000:00:10.0: Invalid ROM contents [ 2.761331] radeon 0000:00:10.0: Invalid ROM contents [ 2.761347] [drm:radeon_get_bios] *ERROR* Unable to locate a BIOS ROM [ 2.761368] [drm] Using device-tree clock info [ 2.761398] [drm] AGP mode requested: 1 [ 2.761502] agpgart-uninorth 0000:00:0b.0: putting AGP V2 device into 1x mode [ 2.761510] radeon 0000:00:10.0: putting AGP V2 device into 1x mode [ 2.763545] radeon 0000:00:10.0: GTT: 256M 0x00000000 - 0x0FFFFFFF [ 2.763576] [drm] Generation 2 PCI interface, using max accessible memory [ 2.763583] radeon 0000:00:10.0: RADEON_CONFIG_APER_0_BASE: 0x9800000098000000 (my message) [ 2.763590] radeon 0000:00:10.0: VRAM: 128M 0x0000000098000000 - 0x000000009FFFFFFF (64M used) [ 2.763595] radeon 0000:00:10.0: GTT in radeon_vram_location: 256M 0x0000000000000000 - 0x000000000FFFFFFF (my message) [ 2.763697] [drm] Detected VRAM RAM=128M, BAR=128M [ 2.763702] [drm] RAM width 64bits DDR [ 2.766293] [TTM] Zone kernel: Available graphics memory: 381972 kiB [ 2.766299] [TTM] Zone highmem: Available graphics memory: 513044 kiB [ 2.766303] [TTM] Initializing pool allocator [ 2.766313] [TTM] Initializing DMA pool allocator [ 2.766405] [drm] radeon: 64M of VRAM memory ready [ 2.766499] [drm] radeon: 256M of GTT memory ready. [ 2.766655] [drm] radeon: ib pool ready. ... 
[ 2.876427] [drm] fence driver on ring 0 use gpu addr 0x00000000 and cpu addr 0xf137c000
[ 2.876435] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
[ 2.876439] [drm] Driver supports precise vblank timestamp query.
[ 2.876472] [drm] radeon: irq initialized.
[ 2.877451] [drm] Loading R200 Microcode
[ 2.884498] [drm] radeon: ring at 0x0000000000001000
[ 2.884572] [drm] ring test succeeded in 1 usecs
[ 2.885156] [drm] ib test succeeded in 0 usecs
i'm not certain whether gpu addr 0 is okay for the fence driver, or whether the GTT location is okay (according to the comment in radeon_gtt_location it should be placed before or after VRAM). In PCI mode the GTT ends at 0x0000000097FFFFFF and VRAM starts directly after it at 0x0000000098000000. In AGP mode the GTT (0x00000000 - 0x0FFFFFFF) sits far below VRAM (0x98000000 - 0x9FFFFFFF), so the two ranges are not adjacent.
Other than that, it looks like the alignment thing isn't the problem: VRAM starts at 0x98000000, which is already a multiple of the 128M VRAM size.
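The two checks discussed here (is VRAM start aligned on VRAM size, and does the GTT sit directly before VRAM) can be sketched in a few lines of user-space C using the addresses from the logs. This is only an illustration, not driver code: the helper names are made up for this example, and the alignment test assumes the size is a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if 'start' is a multiple of 'size'.
 * Assumes 'size' is a non-zero power of two (e.g. 128M of VRAM). */
static inline bool is_aligned_on_size(uint64_t start, uint64_t size)
{
	return (start & (size - 1)) == 0;
}

/* Hypothetical helper: true if a range ending at 'a_end' (inclusive)
 * is immediately followed by a range starting at 'b_start'. */
static inline bool range_directly_before(uint64_t a_end, uint64_t b_start)
{
	return a_end + 1 == b_start;
}

/* Hypothetical helper: true if the inclusive ranges [a_start, a_end]
 * and [b_start, b_end] share at least one address. */
static inline bool ranges_overlap(uint64_t a_start, uint64_t a_end,
				  uint64_t b_start, uint64_t b_end)
{
	return a_start <= b_end && b_start <= a_end;
}
```

With the logged values, is_aligned_on_size(0x98000000, 128M) holds, range_directly_before(0x97FFFFFF, 0x98000000) holds for PCI mode, and in AGP mode the GTT (0x00000000 - 0x0FFFFFFF) neither touches nor overlaps VRAM (0x98000000 - 0x9FFFFFFF).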
Cheers
Jochen
On 15.11.2013 09:27, Michel Dänzer wrote:
On Fri, 2013-11-15 at 08:49 +0100, Jochen Rollwagen wrote:
I think there are two issues here. The first is the missing alignment workaround; since i'll be upgrading to 3.4.69 anyway, i'll insert some diagnostic messages in radeon_device.c and see what happens.
Yes, please do that before speculating more about the problem.
But i'm pretty certain now that this isn't the cause of the lockups. They are most likely caused by the dynamic binding/unbinding of AGP memory, which the UniNorth chipset used in 32-bit PowerMacs obviously doesn't support.
"doesn't support" is too strong; it's working fine on this PowerBook5,8. But the older the revision of UniNorth, the more quirks.
All it supports seems to be statically allocating a 256 MB contiguous non-cacheable AGP aperture and using that (since the chipset doesn't do any address mapping via the GART, as indicated in the OpenBSD code).
It does address mapping for the GPU, that's the whole point of the GART. What UniNorth doesn't do in contrast to most AGP bridges is provide a linear aperture to the CPU as well. But that shouldn't be an issue per se.
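As a rough illustration of what address mapping for the GPU means, here is a toy user-space model of a GART: a table that maps each page of the aperture, as seen by the GPU, to an arbitrary physical page. All names (toy_gart, toy_gart_translate) and the 4 KiB page size are assumptions made for this sketch; it does not reflect the actual UniNorth table format.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12                      /* assumed 4 KiB pages */
#define TOY_PAGE_SIZE  (1ULL << TOY_PAGE_SHIFT)
#define TOY_INVALID    UINT64_MAX              /* marks an unmapped address */

/* Hypothetical GART model: one physical page address per aperture page. */
struct toy_gart {
	uint64_t aper_base;         /* first GPU address covered by the aperture */
	size_t num_pages;           /* number of entries in phys_pages */
	const uint64_t *phys_pages; /* page-aligned physical address per slot */
};

/* Translate a GPU address inside the aperture to a physical address.
 * Returns TOY_INVALID if the address lies outside the aperture. */
static inline uint64_t toy_gart_translate(const struct toy_gart *g,
					  uint64_t gpu_addr)
{
	uint64_t off, idx;

	if (gpu_addr < g->aper_base)
		return TOY_INVALID;
	off = gpu_addr - g->aper_base;
	idx = off >> TOY_PAGE_SHIFT;
	if (idx >= g->num_pages)
		return TOY_INVALID;
	return g->phys_pages[idx] + (off & (TOY_PAGE_SIZE - 1));
}
```

In this picture, binding and unbinding AGP memory would correspond to changing table entries at run time. The CPU-side view of the aperture is a separate matter that this model does not cover.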
So to get AGP mode working again on those machines one would have to disable the dynamic memory stuff. Question: would that require changes in the driver only, or also in the DRM?
It's not really possible with radeon KMS.
the relevant syslog part is:
/var/log/syslog:Nov 22 11:32:08 mac-mini kernel: [ 3.363099] [drm] Initialized radeon 2.16.0 20080528 for 0000:00:10.0 on minor 0
/var/log/syslog:Nov 22 11:41:03 mac-mini kernel: [ 554.476580] radeon 0000:00:10.0: GPU lockup CP stall for more than 10000msec
/var/log/syslog:Nov 22 11:41:03 mac-mini kernel: [ 554.477629] radeon 0000:00:10.0: GPU reset succeed
/var/log/syslog:Nov 22 11:41:03 mac-mini kernel: [ 554.655218] kernel BUG at drivers/gpu/drm/radeon/radeon_object.c:410!
/var/log/syslog:Nov 22 11:41:03 mac-mini kernel: [ 554.660331] Modules linked in: dm_crypt arc4 btusb parport_pc ppdev b43 bnep joydev bluetooth lp mac_hid parport rtc_generic mac80211 snd_aoa_codec_onyx snd_aoa_codec_tas snd_aoa_codec_toonie cfg80211 snd_aoa_fabric_layout snd_aoa snd_aoa_i2sbus snd_aoa_soundbus bcma snd_powermac snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc usbhid hid radeon firewire_ohci sungem firewire_core crc_itu_t sungem_phy ttm drm_kms_helper ssb drm
/var/log/syslog:Nov 22 11:41:03 mac-mini kernel: [ 554.664059] NIP [f15b5158] radeon_bo_get_surface_reg+0x30/0x144 [radeon]
/var/log/syslog:Nov 22 11:41:03 mac-mini kernel: [ 554.664313] LR [f159b7b8] radeon_surface_init+0x3c/0xac [radeon]
/var/log/syslog:Nov 22 11:41:03 mac-mini kernel: [ 554.664657] [eba63c10] [f15d10bc] r100_pll_rreg+0x58/0x70 [radeon] (unreliable)
/var/log/syslog:Nov 22 11:41:03 mac-mini kernel: [ 554.664931] [eba63c30] [f159b7b8] radeon_surface_init+0x3c/0xac [radeon]
/var/log/syslog:Nov 22 11:41:03 mac-mini kernel: [ 554.665184] [eba63c50] [f15d2edc] r100_resume+0x68/0x104 [radeon]
/var/log/syslog:Nov 22 11:41:03 mac-mini kernel: [ 554.665414] [eba63c70] [f159d54c] radeon_gpu_reset+0x120/0x164 [radeon]
/var/log/syslog:Nov 22 11:41:03 mac-mini kernel: [ 554.665661] [eba63c90] [f15b2544] radeon_fence_wait+0x3d8/0x404 [radeon]
/var/log/syslog:Nov 22 11:41:03 mac-mini kernel: [ 554.665914] [eba63d00] [f15c7618] radeon_ib_get+0x250/0x2d0 [radeon]
/var/log/syslog:Nov 22 11:41:03 mac-mini kernel: [ 554.666155] [eba63d50] [f15c9c80] radeon_cs_ioctl+0x3c4/0x6e0 [radeon]
the code where the kernel bug seems to hit is
int radeon_bo_get_surface_reg(struct radeon_bo *bo)
{
	struct radeon_device *rdev = bo->rdev;
	struct radeon_surface_reg *reg;
	struct radeon_bo *old_object;
	int steal;
	int i;

	BUG_ON(!atomic_read(&bo->tbo.reserved));
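The assertion that fires can be mimicked in user space. In the sketch below, toy_bo and toy_get_surface_reg are hypothetical stand-ins, and the function returns an error instead of halting the way BUG_ON does. The point is only that the call is rejected whenever the reservation counter of the buffer object is zero, which appears to be the state in which the reset path in the trace (radeon_gpu_reset -> r100_resume -> radeon_surface_init) reaches the function.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a buffer object with a reservation counter. */
struct toy_bo {
	atomic_int reserved; /* non-zero while a caller holds the reservation */
};

/* Hypothetical stand-in for radeon_bo_get_surface_reg(): refuses to run
 * unless the caller has reserved the buffer object first. */
static inline int toy_get_surface_reg(struct toy_bo *bo)
{
	/* Kernel original: BUG_ON(!atomic_read(&bo->tbo.reserved)); */
	if (!atomic_load(&bo->reserved))
		return -EINVAL;

	/* ... the surface register lookup would go here ... */
	return 0;
}
```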
dri-devel@lists.freedesktop.org