After all the comments in the last patch series, I did a refactoring of what I was proposing and came up with this. Basically, now: 1. max_bus_speed is used to set the device to gen2 speeds 2. on power there's no longer a conflict between the pseries call and other architectures, because the overwrite is done via a ppc_md hook 3. radeon is using bus->max_bus_speed instead of drm_pcie_get_speed_cap_mask for gen2 capability detection
The first patch consists of some architecture changes, such as adding a hook on powerpc for pci_root_bridge_prepare, so that pseries will initialize it to a function, while all other architectures get a NULL pointer. So that whenever whenever pci_create_root_bus is called, we'll get max_bus_speed properly setup from OpenFirmware.
The second patch consists of simple radeon changes not to call drm_get_pcie_speed_cap_mask anymore. I assume that on x86 machines, the max_bus_speed property will be properly set already.
Lucas Kannebley Tavares (2): ppc64: perform proper max_bus_speed detection radeon: use max_bus_speed to activate gen2 speeds
arch/powerpc/include/asm/machdep.h | 2 + arch/powerpc/kernel/pci-common.c | 8 +++++ arch/powerpc/platforms/pseries/pci.c | 51 ++++++++++++++++++++++++++++++++ arch/powerpc/platforms/pseries/setup.c | 4 ++ drivers/gpu/drm/radeon/evergreen.c | 9 +---- drivers/gpu/drm/radeon/r600.c | 8 +---- drivers/gpu/drm/radeon/rv770.c | 8 +---- 7 files changed, 69 insertions(+), 21 deletions(-)
On pseries machines the detection for max_bus_speed should be done through an OpenFirmware property. This patch adds a function to perform this detection and a hook to perform dynamic adding of the function only for pseries.
Signed-off-by: Lucas Kannebley Tavares lucaskt@linux.vnet.ibm.com --- arch/powerpc/include/asm/machdep.h | 2 + arch/powerpc/kernel/pci-common.c | 8 +++++ arch/powerpc/platforms/pseries/pci.c | 51 ++++++++++++++++++++++++++++++++ arch/powerpc/platforms/pseries/setup.c | 4 ++ 4 files changed, 65 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index 3d6b410..8f558bf 100644 --- a/arch/powerpc/include/asm/machdep.h +++ b/arch/powerpc/include/asm/machdep.h @@ -107,6 +107,8 @@ struct machdep_calls { void (*pcibios_fixup)(void); int (*pci_probe_mode)(struct pci_bus *); void (*pci_irq_fixup)(struct pci_dev *dev); + int (*pcibios_root_bridge_prepare)(struct pci_host_bridge + *bridge);
/* To setup PHBs when using automatic OF platform driver for PCI */ int (*pci_setup_phb)(struct pci_controller *host); diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index fa12ae4..80986cf 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -844,6 +844,14 @@ int pci_proc_domain(struct pci_bus *bus) return 1; }
+int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge) +{ + if (ppc_md.pcibios_root_bridge_prepare) + return ppc_md.pcibios_root_bridge_prepare(bridge); + + return 0; +} + /* This header fixup will do the resource fixup for all devices as they are * probed, but not for bridge ranges */ diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c index 0b580f4..7f9c956 100644 --- a/arch/powerpc/platforms/pseries/pci.c +++ b/arch/powerpc/platforms/pseries/pci.c @@ -108,3 +108,54 @@ static void fixup_winbond_82c105(struct pci_dev* dev) } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_WINBOND, PCI_DEVICE_ID_WINBOND_82C105, fixup_winbond_82c105); + +int pseries_root_bridge_prepare(struct pci_host_bridge *bridge) +{ + struct device_node *dn, *pdn; + struct pci_bus *bus; + const uint32_t *pcie_link_speed_stats; + + bus = bridge->bus; + + dn = pcibios_get_phb_of_node(bus); + if (!dn) + return 0; + + for (pdn = dn; pdn != NULL; pdn = pdn->parent) { + pcie_link_speed_stats = (const uint32_t *) of_get_property(dn, + "ibm,pcie-link-speed-stats", NULL); + if (pcie_link_speed_stats) + break; + } + + if (!pcie_link_speed_stats) { + pr_err("no ibm,pcie-link-speed-stats property\n"); + return 0; + } + + switch (pcie_link_speed_stats[0]) { + case 0x01: + bus->max_bus_speed = PCIE_SPEED_2_5GT; + break; + case 0x02: + bus->max_bus_speed = PCIE_SPEED_5_0GT; + break; + default: + bus->max_bus_speed = PCI_SPEED_UNKNOWN; + break; + } + + switch (pcie_link_speed_stats[1]) { + case 0x01: + bus->cur_bus_speed = PCIE_SPEED_2_5GT; + break; + case 0x02: + bus->cur_bus_speed = PCIE_SPEED_5_0GT; + break; + default: + bus->cur_bus_speed = PCI_SPEED_UNKNOWN; + break; + } + + return 0; +} diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 8bcc9ca..15796b5 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -430,6 +430,8 @@ static void pSeries_machine_kexec(struct kimage *image) } #endif
+int pseries_root_bridge_prepare(struct pci_host_bridge *bridge); + static void __init pSeries_setup_arch(void) { panic_timeout = 10; @@ -466,6 +468,8 @@ static void __init pSeries_setup_arch(void) else ppc_md.enable_pmcs = power4_enable_pmcs;
+ ppc_md.pcibios_root_bridge_prepare = pseries_root_bridge_prepare; + if (firmware_has_feature(FW_FEATURE_SET_MODE)) { long rc; if ((rc = pSeries_enable_reloc_on_exc()) != H_SUCCESS) {
On Thu, Apr 11, 2013 at 10:13:13AM -0300, Lucas Kannebley Tavares wrote:
On pseries machines the detection for max_bus_speed should be done through an OpenFirmware property. This patch adds a function to perform this detection and a hook to perform dynamic adding of the function only for pseries.
The crucial detail you didn't mention is that pcibios_root_bridge_prepare() already exists as a weak function in the PCI code and is called from pci_create_root_bus().
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 8bcc9ca..15796b5 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -430,6 +430,8 @@ static void pSeries_machine_kexec(struct kimage *image) } #endif
+int pseries_root_bridge_prepare(struct pci_host_bridge *bridge);
Don't do that, put it in a header where it belongs.
cheers
On 04/15/2013 02:00 AM, Michael Ellerman wrote:
On Thu, Apr 11, 2013 at 10:13:13AM -0300, Lucas Kannebley Tavares wrote:
On pseries machines the detection for max_bus_speed should be done through an OpenFirmware property. This patch adds a function to perform this detection and a hook to perform dynamic adding of the function only for pseries.
The crucial detail you didn't mention is that pcibios_root_bridge_prepare() already exists as a weak function in the PCI code and is called from pci_create_root_bus().
Adding this comment.
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 8bcc9ca..15796b5 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -430,6 +430,8 @@ static void pSeries_machine_kexec(struct kimage *image) } #endif
+int pseries_root_bridge_prepare(struct pci_host_bridge *bridge);
Don't do that, put it in a header where it belongs.
And done as well.
cheers
On Thu, Apr 11, 2013 at 10:13:13AM -0300, Lucas Kannebley Tavares wrote:
On pseries machines the detection for max_bus_speed should be done through an OpenFirmware property. This patch adds a function to perform this detection and a hook to perform dynamic adding of the function only for pseries.
This fails to build for me on ppc64_defconfig, with:
arch/powerpc/include/asm/machdep.h:111:5: error: 'struct pci_host_bridge' declared inside parameter list [-Werror]
Presumably you tested it using some other defconfig?
cheers
On 04/15/2013 08:42 AM, Michael Ellerman wrote:
On Thu, Apr 11, 2013 at 10:13:13AM -0300, Lucas Kannebley Tavares wrote:
On pseries machines the detection for max_bus_speed should be done through an OpenFirmware property. This patch adds a function to perform this detection and a hook to perform dynamic adding of the function only for pseries.
This fails to build for me on ppc64_defconfig, with:
arch/powerpc/include/asm/machdep.h:111:5: error: 'struct pci_host_bridge' declared inside parameter list [-Werror]
Presumably you tested it using some other defconfig?
cheers
Yes, I tested with another config, I did get warnings, though, so I should've fixed that earlier. Adding a forward declaration to prevent this from even throwing out warnings.
radeon currently uses a drm function to get the speed capabilities for the bus. However, this is a non-standard method of performing this detection and this patch changes it to use the max_bus_speed attribute. --- drivers/gpu/drm/radeon/evergreen.c | 9 ++------- drivers/gpu/drm/radeon/r600.c | 8 +------- drivers/gpu/drm/radeon/rv770.c | 8 +------- 3 files changed, 4 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index 305a657..3291f62 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -3855,8 +3855,7 @@ void evergreen_fini(struct radeon_device *rdev)
void evergreen_pcie_gen2_enable(struct radeon_device *rdev) { - u32 link_width_cntl, speed_cntl, mask; - int ret; + u32 link_width_cntl, speed_cntl;
if (radeon_pcie_gen2 == 0) return; @@ -3871,11 +3870,7 @@ void evergreen_pcie_gen2_enable(struct radeon_device *rdev) if (ASIC_IS_X2(rdev)) return;
- ret = drm_pcie_get_speed_cap_mask(rdev->ddev, &mask); - if (ret != 0) - return; - - if (!(mask & DRM_PCIE_SPEED_50)) + if (rdev->pdev->bus->max_bus_speed < PCIE_SPEED_5_0GT) return;
speed_cntl = RREG32_PCIE_P(PCIE_LC_SPEED_CNTL); diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 0740db3..64b90c0 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -4351,8 +4351,6 @@ static void r600_pcie_gen2_enable(struct radeon_device *rdev) { u32 link_width_cntl, lanes, speed_cntl, training_cntl, tmp; u16 link_cntl2; - u32 mask; - int ret;
if (radeon_pcie_gen2 == 0) return; @@ -4371,11 +4369,7 @@ static void r600_pcie_gen2_enable(struct radeon_device *rdev) if (rdev->family <= CHIP_R600) return;
- ret = drm_pcie_get_speed_cap_mask(rdev->ddev, &mask); - if (ret != 0) - return; - - if (!(mask & DRM_PCIE_SPEED_50)) + if (rdev->pdev->bus->max_bus_speed < PCIE_SPEED_5_0GT) return;
speed_cntl = RREG32_PCIE_P(PCIE_LC_SPEED_CNTL); diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c index d63fe1d..c683c36 100644 --- a/drivers/gpu/drm/radeon/rv770.c +++ b/drivers/gpu/drm/radeon/rv770.c @@ -1238,8 +1238,6 @@ static void rv770_pcie_gen2_enable(struct radeon_device *rdev) { u32 link_width_cntl, lanes, speed_cntl, tmp; u16 link_cntl2; - u32 mask; - int ret;
if (radeon_pcie_gen2 == 0) return; @@ -1254,11 +1252,7 @@ static void rv770_pcie_gen2_enable(struct radeon_device *rdev) if (ASIC_IS_X2(rdev)) return;
- ret = drm_pcie_get_speed_cap_mask(rdev->ddev, &mask); - if (ret != 0) - return; - - if (!(mask & DRM_PCIE_SPEED_50)) + if (rdev->pdev->bus->max_bus_speed < PCIE_SPEED_5_0GT) return;
DRM_INFO("enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0\n");
On Thu, Apr 11, 2013 at 7:13 AM, Lucas Kannebley Tavares lucaskt@linux.vnet.ibm.com wrote:
radeon currently uses a drm function to get the speed capabilities for the bus. However, this is a non-standard method of performing this detection and this patch changes it to use the max_bus_speed attribute.
drivers/gpu/drm/radeon/evergreen.c | 9 ++------- drivers/gpu/drm/radeon/r600.c | 8 +------- drivers/gpu/drm/radeon/rv770.c | 8 +------- 3 files changed, 4 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index 305a657..3291f62 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -3855,8 +3855,7 @@ void evergreen_fini(struct radeon_device *rdev)
void evergreen_pcie_gen2_enable(struct radeon_device *rdev) {
u32 link_width_cntl, speed_cntl, mask;
int ret;
u32 link_width_cntl, speed_cntl; if (radeon_pcie_gen2 == 0) return;
@@ -3871,11 +3870,7 @@ void evergreen_pcie_gen2_enable(struct radeon_device *rdev) if (ASIC_IS_X2(rdev)) return;
ret = drm_pcie_get_speed_cap_mask(rdev->ddev, &mask);
if (ret != 0)
return;
if (!(mask & DRM_PCIE_SPEED_50))
if (rdev->pdev->bus->max_bus_speed < PCIE_SPEED_5_0GT)
For devices on a root bus, we previously dereferenced a NULL pointer in drm_pcie_get_speed_cap_mask() because pdev->bus->self is NULL on a root bus. (I think this is the original problem you tripped over, Lucas.)
These patches fix that problem. On pseries, where the device *is* on a root bus, your patches set max_bus_speed so this will work as expected. On most other systems, max_bus_speed for root buses will be PCI_SPEED_UNKNOWN (set in pci_alloc_bus() and never updated because most arches don't have code like the pseries code you're adding).
PCI_SPEED_UNKNOWN = 0xff, so if we see another machine with a GPU on the root bus, we'll attempt to enable Gen2 on the device even though we have no idea what the bus will support.
That's why I originally suggested skipping the Gen2 stuff if "max_bus_speed == PCI_SPEED_UNKNOWN". I was just being conservative, thinking that it's better to have a functional but slow GPU rather than the unknown (to me) effects of enabling Gen2 on a link that might not support it. But I'm fine with this being either way.
It would be nice if we could get rid of drm_pcie_get_speed_cap_mask() altogether. It is exported, but I have no idea of anybody else uses it. Maybe it could at least be marked __deprecated now?
I don't know who should take these patches. They don't touch drivers/pci, but I'd be happy to push them, given the appropriate ACKs from DRM and powerpc folks.
Bjorn
return; speed_cntl = RREG32_PCIE_P(PCIE_LC_SPEED_CNTL);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 0740db3..64b90c0 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -4351,8 +4351,6 @@ static void r600_pcie_gen2_enable(struct radeon_device *rdev) { u32 link_width_cntl, lanes, speed_cntl, training_cntl, tmp; u16 link_cntl2;
u32 mask;
int ret; if (radeon_pcie_gen2 == 0) return;
@@ -4371,11 +4369,7 @@ static void r600_pcie_gen2_enable(struct radeon_device *rdev) if (rdev->family <= CHIP_R600) return;
ret = drm_pcie_get_speed_cap_mask(rdev->ddev, &mask);
if (ret != 0)
return;
if (!(mask & DRM_PCIE_SPEED_50))
if (rdev->pdev->bus->max_bus_speed < PCIE_SPEED_5_0GT) return; speed_cntl = RREG32_PCIE_P(PCIE_LC_SPEED_CNTL);
diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c index d63fe1d..c683c36 100644 --- a/drivers/gpu/drm/radeon/rv770.c +++ b/drivers/gpu/drm/radeon/rv770.c @@ -1238,8 +1238,6 @@ static void rv770_pcie_gen2_enable(struct radeon_device *rdev) { u32 link_width_cntl, lanes, speed_cntl, tmp; u16 link_cntl2;
u32 mask;
int ret; if (radeon_pcie_gen2 == 0) return;
@@ -1254,11 +1252,7 @@ static void rv770_pcie_gen2_enable(struct radeon_device *rdev) if (ASIC_IS_X2(rdev)) return;
ret = drm_pcie_get_speed_cap_mask(rdev->ddev, &mask);
if (ret != 0)
return;
if (!(mask & DRM_PCIE_SPEED_50))
if (rdev->pdev->bus->max_bus_speed < PCIE_SPEED_5_0GT) return; DRM_INFO("enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0\n");
-- 1.7.4.4
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index 305a657..3291f62 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -3855,8 +3855,7 @@ void evergreen_fini(struct radeon_device *rdev)
void evergreen_pcie_gen2_enable(struct radeon_device *rdev) {
u32 link_width_cntl, speed_cntl, mask;
int ret;
u32 link_width_cntl, speed_cntl; if (radeon_pcie_gen2 == 0) return;
@@ -3871,11 +3870,7 @@ void evergreen_pcie_gen2_enable(struct radeon_device *rdev) if (ASIC_IS_X2(rdev)) return;
ret = drm_pcie_get_speed_cap_mask(rdev->ddev, &mask);
if (ret != 0)
return;
if (!(mask & DRM_PCIE_SPEED_50))
if (rdev->pdev->bus->max_bus_speed < PCIE_SPEED_5_0GT)
For devices on a root bus, we previously dereferenced a NULL pointer in drm_pcie_get_speed_cap_mask() because pdev->bus->self is NULL on a root bus. (I think this is the original problem you tripped over, Lucas.)
These patches fix that problem. On pseries, where the device *is* on a root bus, your patches set max_bus_speed so this will work as expected. On most other systems, max_bus_speed for root buses will be PCI_SPEED_UNKNOWN (set in pci_alloc_bus() and never updated because most arches don't have code like the pseries code you're adding).
PCI_SPEED_UNKNOWN = 0xff, so if we see another machine with a GPU on the root bus, we'll attempt to enable Gen2 on the device even though we have no idea what the bus will support.
That's why I originally suggested skipping the Gen2 stuff if "max_bus_speed == PCI_SPEED_UNKNOWN". I was just being conservative, thinking that it's better to have a functional but slow GPU rather than the unknown (to me) effects of enabling Gen2 on a link that might not support it. But I'm fine with this being either way.
It would be nice if we could get rid of drm_pcie_get_speed_cap_mask() altogether. It is exported, but I have no idea of anybody else uses it. Maybe it could at least be marked __deprecated now?
I don't know who should take these patches. They don't touch drivers/pci, but I'd be happy to push them, given the appropriate ACKs from DRM and powerpc folks.
Acked-by: Dave Airlie airlied@redhat.com
I'm happy to see these go via pci tree to avoid interdependent trees.
Dave.
On 04/12/2013 01:38 PM, Bjorn Helgaas wrote:
On Thu, Apr 11, 2013 at 7:13 AM, Lucas Kannebley Tavares lucaskt@linux.vnet.ibm.com wrote:
radeon currently uses a drm function to get the speed capabilities for the bus. However, this is a non-standard method of performing this detection and this patch changes it to use the max_bus_speed attribute.
drivers/gpu/drm/radeon/evergreen.c | 9 ++------- drivers/gpu/drm/radeon/r600.c | 8 +------- drivers/gpu/drm/radeon/rv770.c | 8 +------- 3 files changed, 4 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index 305a657..3291f62 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -3855,8 +3855,7 @@ void evergreen_fini(struct radeon_device *rdev)
void evergreen_pcie_gen2_enable(struct radeon_device *rdev) {
u32 link_width_cntl, speed_cntl, mask;
int ret;
u32 link_width_cntl, speed_cntl; if (radeon_pcie_gen2 == 0) return;
@@ -3871,11 +3870,7 @@ void evergreen_pcie_gen2_enable(struct radeon_device *rdev) if (ASIC_IS_X2(rdev)) return;
ret = drm_pcie_get_speed_cap_mask(rdev->ddev,&mask);
if (ret != 0)
return;
if (!(mask& DRM_PCIE_SPEED_50))
if (rdev->pdev->bus->max_bus_speed< PCIE_SPEED_5_0GT)
For devices on a root bus, we previously dereferenced a NULL pointer in drm_pcie_get_speed_cap_mask() because pdev->bus->self is NULL on a root bus. (I think this is the original problem you tripped over, Lucas.)
These patches fix that problem. On pseries, where the device *is* on a root bus, your patches set max_bus_speed so this will work as expected. On most other systems, max_bus_speed for root buses will be PCI_SPEED_UNKNOWN (set in pci_alloc_bus() and never updated because most arches don't have code like the pseries code you're adding).
PCI_SPEED_UNKNOWN = 0xff, so if we see another machine with a GPU on the root bus, we'll attempt to enable Gen2 on the device even though we have no idea what the bus will support.
That's why I originally suggested skipping the Gen2 stuff if "max_bus_speed == PCI_SPEED_UNKNOWN". I was just being conservative, thinking that it's better to have a functional but slow GPU rather than the unknown (to me) effects of enabling Gen2 on a link that might not support it. But I'm fine with this being either way.
You're right here, of course. I'll test for it being different from 5_0GT and 8_0GT instead. Though at some point I suppose someone will want to tackle Gen3 speeds.
It would be nice if we could get rid of drm_pcie_get_speed_cap_mask() altogether. It is exported, but I have no idea of anybody else uses it. Maybe it could at least be marked __deprecated now?
I'll mark it.
I don't know who should take these patches. They don't touch drivers/pci, but I'd be happy to push them, given the appropriate ACKs from DRM and powerpc folks.
Bjorn
return; speed_cntl = RREG32_PCIE_P(PCIE_LC_SPEED_CNTL);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 0740db3..64b90c0 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -4351,8 +4351,6 @@ static void r600_pcie_gen2_enable(struct radeon_device *rdev) { u32 link_width_cntl, lanes, speed_cntl, training_cntl, tmp; u16 link_cntl2;
u32 mask;
int ret; if (radeon_pcie_gen2 == 0) return;
@@ -4371,11 +4369,7 @@ static void r600_pcie_gen2_enable(struct radeon_device *rdev) if (rdev->family<= CHIP_R600) return;
ret = drm_pcie_get_speed_cap_mask(rdev->ddev,&mask);
if (ret != 0)
return;
if (!(mask& DRM_PCIE_SPEED_50))
if (rdev->pdev->bus->max_bus_speed< PCIE_SPEED_5_0GT) return; speed_cntl = RREG32_PCIE_P(PCIE_LC_SPEED_CNTL);
diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c index d63fe1d..c683c36 100644 --- a/drivers/gpu/drm/radeon/rv770.c +++ b/drivers/gpu/drm/radeon/rv770.c @@ -1238,8 +1238,6 @@ static void rv770_pcie_gen2_enable(struct radeon_device *rdev) { u32 link_width_cntl, lanes, speed_cntl, tmp; u16 link_cntl2;
u32 mask;
int ret; if (radeon_pcie_gen2 == 0) return;
@@ -1254,11 +1252,7 @@ static void rv770_pcie_gen2_enable(struct radeon_device *rdev) if (ASIC_IS_X2(rdev)) return;
ret = drm_pcie_get_speed_cap_mask(rdev->ddev,&mask);
if (ret != 0)
return;
if (!(mask& DRM_PCIE_SPEED_50))
if (rdev->pdev->bus->max_bus_speed< PCIE_SPEED_5_0GT) return; DRM_INFO("enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0\n");
-- 1.7.4.4
On Wed, Apr 17, 2013 at 8:38 AM, Lucas Kannebley Tavares lucaskt@linux.vnet.ibm.com wrote:
On 04/12/2013 01:38 PM, Bjorn Helgaas wrote:
On Thu, Apr 11, 2013 at 7:13 AM, Lucas Kannebley Tavares lucaskt@linux.vnet.ibm.com wrote:
radeon currently uses a drm function to get the speed capabilities for the bus. However, this is a non-standard method of performing this detection and this patch changes it to use the max_bus_speed attribute.
drivers/gpu/drm/radeon/evergreen.c | 9 ++------- drivers/gpu/drm/radeon/r600.c | 8 +------- drivers/gpu/drm/radeon/rv770.c | 8 +------- 3 files changed, 4 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index 305a657..3291f62 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -3855,8 +3855,7 @@ void evergreen_fini(struct radeon_device *rdev)
void evergreen_pcie_gen2_enable(struct radeon_device *rdev) {
u32 link_width_cntl, speed_cntl, mask;
int ret;
u32 link_width_cntl, speed_cntl; if (radeon_pcie_gen2 == 0) return;
@@ -3871,11 +3870,7 @@ void evergreen_pcie_gen2_enable(struct radeon_device *rdev) if (ASIC_IS_X2(rdev)) return;
ret = drm_pcie_get_speed_cap_mask(rdev->ddev,&mask);
if (ret != 0)
return;
if (!(mask& DRM_PCIE_SPEED_50))
if (rdev->pdev->bus->max_bus_speed< PCIE_SPEED_5_0GT)
For devices on a root bus, we previously dereferenced a NULL pointer in drm_pcie_get_speed_cap_mask() because pdev->bus->self is NULL on a root bus. (I think this is the original problem you tripped over, Lucas.)
These patches fix that problem. On pseries, where the device *is* on a root bus, your patches set max_bus_speed so this will work as expected. On most other systems, max_bus_speed for root buses will be PCI_SPEED_UNKNOWN (set in pci_alloc_bus() and never updated because most arches don't have code like the pseries code you're adding).
PCI_SPEED_UNKNOWN = 0xff, so if we see another machine with a GPU on the root bus, we'll attempt to enable Gen2 on the device even though we have no idea what the bus will support.
That's why I originally suggested skipping the Gen2 stuff if "max_bus_speed == PCI_SPEED_UNKNOWN". I was just being conservative, thinking that it's better to have a functional but slow GPU rather than the unknown (to me) effects of enabling Gen2 on a link that might not support it. But I'm fine with this being either way.
You're right here, of course. I'll test for it being different from 5_0GT and 8_0GT instead. Though at some point I suppose someone will want to tackle Gen3 speeds.
drm_pcie_get_speed_cap_mask() actually checked the pci bridge to see what speeds it supported. If the new code doesn't actually check the bridge caps then the new code does not seem like a valid replacement unless I'm missing something.
Alex
It would be nice if we could get rid of drm_pcie_get_speed_cap_mask() altogether. It is exported, but I have no idea of anybody else uses it. Maybe it could at least be marked __deprecated now?
I'll mark it.
I don't know who should take these patches. They don't touch drivers/pci, but I'd be happy to push them, given the appropriate ACKs from DRM and powerpc folks.
Bjorn
return; speed_cntl = RREG32_PCIE_P(PCIE_LC_SPEED_CNTL);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 0740db3..64b90c0 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -4351,8 +4351,6 @@ static void r600_pcie_gen2_enable(struct radeon_device *rdev) { u32 link_width_cntl, lanes, speed_cntl, training_cntl, tmp; u16 link_cntl2;
u32 mask;
int ret; if (radeon_pcie_gen2 == 0) return;
@@ -4371,11 +4369,7 @@ static void r600_pcie_gen2_enable(struct radeon_device *rdev) if (rdev->family<= CHIP_R600) return;
ret = drm_pcie_get_speed_cap_mask(rdev->ddev,&mask);
if (ret != 0)
return;
if (!(mask& DRM_PCIE_SPEED_50))
if (rdev->pdev->bus->max_bus_speed< PCIE_SPEED_5_0GT) return; speed_cntl = RREG32_PCIE_P(PCIE_LC_SPEED_CNTL);
diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c index d63fe1d..c683c36 100644 --- a/drivers/gpu/drm/radeon/rv770.c +++ b/drivers/gpu/drm/radeon/rv770.c @@ -1238,8 +1238,6 @@ static void rv770_pcie_gen2_enable(struct radeon_device *rdev) { u32 link_width_cntl, lanes, speed_cntl, tmp; u16 link_cntl2;
u32 mask;
int ret; if (radeon_pcie_gen2 == 0) return;
@@ -1254,11 +1252,7 @@ static void rv770_pcie_gen2_enable(struct radeon_device *rdev) if (ASIC_IS_X2(rdev)) return;
ret = drm_pcie_get_speed_cap_mask(rdev->ddev,&mask);
if (ret != 0)
return;
if (!(mask& DRM_PCIE_SPEED_50))
if (rdev->pdev->bus->max_bus_speed< PCIE_SPEED_5_0GT) return; DRM_INFO("enabling PCIE gen 2 link speeds, disable with
radeon.pcie_gen2=0\n");
1.7.4.4
-- Lucas Kannebley Tavares Software Engineer IBM Linux Technology Center
dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
On Wed, Apr 17, 2013 at 2:04 PM, Alex Deucher alexdeucher@gmail.com wrote:
On Wed, Apr 17, 2013 at 8:38 AM, Lucas Kannebley Tavares lucaskt@linux.vnet.ibm.com wrote:
On 04/12/2013 01:38 PM, Bjorn Helgaas wrote:
On Thu, Apr 11, 2013 at 7:13 AM, Lucas Kannebley Tavares lucaskt@linux.vnet.ibm.com wrote:
radeon currently uses a drm function to get the speed capabilities for the bus. However, this is a non-standard method of performing this detection and this patch changes it to use the max_bus_speed attribute.
drivers/gpu/drm/radeon/evergreen.c | 9 ++------- drivers/gpu/drm/radeon/r600.c | 8 +------- drivers/gpu/drm/radeon/rv770.c | 8 +------- 3 files changed, 4 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index 305a657..3291f62 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -3855,8 +3855,7 @@ void evergreen_fini(struct radeon_device *rdev)
void evergreen_pcie_gen2_enable(struct radeon_device *rdev) {
u32 link_width_cntl, speed_cntl, mask;
int ret;
u32 link_width_cntl, speed_cntl; if (radeon_pcie_gen2 == 0) return;
@@ -3871,11 +3870,7 @@ void evergreen_pcie_gen2_enable(struct radeon_device *rdev) if (ASIC_IS_X2(rdev)) return;
ret = drm_pcie_get_speed_cap_mask(rdev->ddev,&mask);
if (ret != 0)
return;
if (!(mask& DRM_PCIE_SPEED_50))
if (rdev->pdev->bus->max_bus_speed< PCIE_SPEED_5_0GT)
For devices on a root bus, we previously dereferenced a NULL pointer in drm_pcie_get_speed_cap_mask() because pdev->bus->self is NULL on a root bus. (I think this is the original problem you tripped over, Lucas.)
These patches fix that problem. On pseries, where the device *is* on a root bus, your patches set max_bus_speed so this will work as expected. On most other systems, max_bus_speed for root buses will be PCI_SPEED_UNKNOWN (set in pci_alloc_bus() and never updated because most arches don't have code like the pseries code you're adding).
PCI_SPEED_UNKNOWN = 0xff, so if we see another machine with a GPU on the root bus, we'll attempt to enable Gen2 on the device even though we have no idea what the bus will support.
That's why I originally suggested skipping the Gen2 stuff if "max_bus_speed == PCI_SPEED_UNKNOWN". I was just being conservative, thinking that it's better to have a functional but slow GPU rather than the unknown (to me) effects of enabling Gen2 on a link that might not support it. But I'm fine with this being either way.
You're right here, of course. I'll test for it being different from 5_0GT and 8_0GT instead. Though at some point I suppose someone will want to tackle Gen3 speeds.
drm_pcie_get_speed_cap_mask() actually checked the pci bridge to see what speeds it supported. If the new code doesn't actually check the bridge caps then the new code does not seem like a valid replacement unless I'm missing something.
The max_bus_speed in struct pci_bus is set in pci_set_bus_speed() based on the PCIe, PCI-X, or AGP capabilities. This happens when we enumerate the bridge device. Or, in this case, Lucas added powerpc-specific code to set max_bus_speed for the root bus based on platform-specific host bridge knowledge.
So it still does check the PCI bridge capabilities, just not as directly as it did before.
Bjorn
On Wed, Apr 17, 2013 at 4:11 PM, Bjorn Helgaas bhelgaas@google.com wrote:
On Wed, Apr 17, 2013 at 2:04 PM, Alex Deucher alexdeucher@gmail.com wrote:
On Wed, Apr 17, 2013 at 8:38 AM, Lucas Kannebley Tavares lucaskt@linux.vnet.ibm.com wrote:
On 04/12/2013 01:38 PM, Bjorn Helgaas wrote:
On Thu, Apr 11, 2013 at 7:13 AM, Lucas Kannebley Tavares lucaskt@linux.vnet.ibm.com wrote:
radeon currently uses a drm function to get the speed capabilities for the bus. However, this is a non-standard method of performing this detection and this patch changes it to use the max_bus_speed attribute.
drivers/gpu/drm/radeon/evergreen.c | 9 ++------- drivers/gpu/drm/radeon/r600.c | 8 +------- drivers/gpu/drm/radeon/rv770.c | 8 +------- 3 files changed, 4 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index 305a657..3291f62 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -3855,8 +3855,7 @@ void evergreen_fini(struct radeon_device *rdev)
void evergreen_pcie_gen2_enable(struct radeon_device *rdev) {
u32 link_width_cntl, speed_cntl, mask;
int ret;
u32 link_width_cntl, speed_cntl; if (radeon_pcie_gen2 == 0) return;
@@ -3871,11 +3870,7 @@ void evergreen_pcie_gen2_enable(struct radeon_device *rdev) if (ASIC_IS_X2(rdev)) return;
ret = drm_pcie_get_speed_cap_mask(rdev->ddev,&mask);
if (ret != 0)
return;
if (!(mask& DRM_PCIE_SPEED_50))
if (rdev->pdev->bus->max_bus_speed< PCIE_SPEED_5_0GT)
For devices on a root bus, we previously dereferenced a NULL pointer in drm_pcie_get_speed_cap_mask() because pdev->bus->self is NULL on a root bus. (I think this is the original problem you tripped over, Lucas.)
These patches fix that problem. On pseries, where the device *is* on a root bus, your patches set max_bus_speed so this will work as expected. On most other systems, max_bus_speed for root buses will be PCI_SPEED_UNKNOWN (set in pci_alloc_bus() and never updated because most arches don't have code like the pseries code you're adding).
PCI_SPEED_UNKNOWN = 0xff, so if we see another machine with a GPU on the root bus, we'll attempt to enable Gen2 on the device even though we have no idea what the bus will support.
That's why I originally suggested skipping the Gen2 stuff if "max_bus_speed == PCI_SPEED_UNKNOWN". I was just being conservative, thinking that it's better to have a functional but slow GPU rather than the unknown (to me) effects of enabling Gen2 on a link that might not support it. But I'm fine with this being either way.
You're right here, of course. I'll test for it being different from 5_0GT and 8_0GT instead. Though at some point I suppose someone will want to tackle Gen3 speeds.
drm_pcie_get_speed_cap_mask() actually checked the pci bridge to see what speeds it supported. If the new code doesn't actually check the bridge caps then the new code does not seem like a valid replacement unless I'm missing something.
The max_bus_speed in struct pci_bus is set in pci_set_bus_speed() based on the PCIe, PCI-X, or AGP capabilities. This happens when we enumerate the bridge device. Or, in this case, Lucas added powerpc-specific code to set max_bus_speed for the root bus based on platform-specific host bridge knowledge.
So it still does check the PCI bridge capabilities, just not as directly as it did before.
Ah, ok. Thanks. The previous comments about PCI_SPEED_UNKNOWN being set in pci_alloc_bus() and never updated confused me.
Alex
On Wed, Apr 17, 2013 at 2:17 PM, Alex Deucher alexdeucher@gmail.com wrote:
On Wed, Apr 17, 2013 at 4:11 PM, Bjorn Helgaas bhelgaas@google.com wrote:
On Wed, Apr 17, 2013 at 2:04 PM, Alex Deucher alexdeucher@gmail.com wrote:
On Wed, Apr 17, 2013 at 8:38 AM, Lucas Kannebley Tavares lucaskt@linux.vnet.ibm.com wrote:
On 04/12/2013 01:38 PM, Bjorn Helgaas wrote:
On Thu, Apr 11, 2013 at 7:13 AM, Lucas Kannebley Tavares lucaskt@linux.vnet.ibm.com wrote:
radeon currently uses a drm function to get the speed capabilities for the bus. However, this is a non-standard method of performing this detection and this patch changes it to use the max_bus_speed attribute.
drivers/gpu/drm/radeon/evergreen.c | 9 ++------- drivers/gpu/drm/radeon/r600.c | 8 +------- drivers/gpu/drm/radeon/rv770.c | 8 +------- 3 files changed, 4 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index 305a657..3291f62 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -3855,8 +3855,7 @@ void evergreen_fini(struct radeon_device *rdev)
void evergreen_pcie_gen2_enable(struct radeon_device *rdev) {
u32 link_width_cntl, speed_cntl, mask;
int ret;
u32 link_width_cntl, speed_cntl; if (radeon_pcie_gen2 == 0) return;
@@ -3871,11 +3870,7 @@ void evergreen_pcie_gen2_enable(struct radeon_device *rdev) if (ASIC_IS_X2(rdev)) return;
ret = drm_pcie_get_speed_cap_mask(rdev->ddev,&mask);
if (ret != 0)
return;
if (!(mask& DRM_PCIE_SPEED_50))
if (rdev->pdev->bus->max_bus_speed< PCIE_SPEED_5_0GT)
For devices on a root bus, we previously dereferenced a NULL pointer in drm_pcie_get_speed_cap_mask() because pdev->bus->self is NULL on a root bus. (I think this is the original problem you tripped over, Lucas.)
These patches fix that problem. On pseries, where the device *is* on a root bus, your patches set max_bus_speed so this will work as expected. On most other systems, max_bus_speed for root buses will be PCI_SPEED_UNKNOWN (set in pci_alloc_bus() and never updated because most arches don't have code like the pseries code you're adding).
PCI_SPEED_UNKNOWN = 0xff, so if we see another machine with a GPU on the root bus, we'll attempt to enable Gen2 on the device even though we have no idea what the bus will support.
That's why I originally suggested skipping the Gen2 stuff if "max_bus_speed == PCI_SPEED_UNKNOWN". I was just being conservative, thinking that it's better to have a functional but slow GPU rather than the unknown (to me) effects of enabling Gen2 on a link that might not support it. But I'm fine with this being either way.
You're right here, of course. I'll test for it being different from 5_0GT and 8_0GT instead. Though at some point I suppose someone will want to tackle Gen3 speeds.
drm_pcie_get_speed_cap_mask() actually checked the pci bridge to see what speeds it supported. If the new code doesn't actually check the bridge caps then the new code does not seem like a valid replacement unless I'm missing something.
The max_bus_speed in struct pci_bus is set in pci_set_bus_speed() based on the PCIe, PCI-X, or AGP capabilities. This happens when we enumerate the bridge device. Or, in this case, Lucas added powerpc-specific code to set max_bus_speed for the root bus based on platform-specific host bridge knowledge.
So it still does check the PCI bridge capabilities, just not as directly as it did before.
Ah, ok. Thanks. The previous comments about PCI_SPEED_UNKNOWN being set in pci_alloc_bus() and never updated confused me.
Yeah, that's just for root buses where we don't have the host bridge knowledge to figure it out. The root bus case was broken in drm_pcie_get_speed_cap_mask() anyway, because there is no upstream P2P bridge to look at. (That's why Lucas tripped over the null pointer dereference in the first place.)
On Thu, Apr 11, 2013 at 9:13 AM, Lucas Kannebley Tavares < lucaskt@linux.vnet.ibm.com> wrote:
After all the comments in the last patch series, I did a refactoring of what I was proposing and came up with this. Basically, now:
- max_bus_speed is used to set the device to gen2 speeds
- on power there's no longer a conflict between the pseries call and
other architectures, because the overwrite is done via a ppc_md hook 3. radeon is using bus->max_bus_speed instead of drm_pcie_get_speed_cap_mask for gen2 capability detection
The first patch consists of some architecture changes, such as adding a hook on powerpc for pci_root_bridge_prepare, so that pseries will initialize it to a function, while all other architectures get a NULL pointer. So that whenever whenever pci_create_root_bus is called, we'll get max_bus_speed properly setup from OpenFirmware.
The second patch consists of simple radeon changes not to call drm_get_pcie_speed_cap_mask anymore. I assume that on x86 machines, the max_bus_speed property will be properly set already.
The radeon changes are :
Reviewed-by: Jerome Glisse jglisse@redhat.com
Lucas Kannebley Tavares (2): ppc64: perform proper max_bus_speed detection radeon: use max_bus_speed to activate gen2 speeds
arch/powerpc/include/asm/machdep.h | 2 + arch/powerpc/kernel/pci-common.c | 8 +++++ arch/powerpc/platforms/pseries/pci.c | 51 ++++++++++++++++++++++++++++++++ arch/powerpc/platforms/pseries/setup.c | 4 ++ drivers/gpu/drm/radeon/evergreen.c | 9 +---- drivers/gpu/drm/radeon/r600.c | 8 +---- drivers/gpu/drm/radeon/rv770.c | 8 +---- 7 files changed, 69 insertions(+), 21 deletions(-)
-- 1.7.4.4
dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
dri-devel@lists.freedesktop.org