Bug#619827: linux-source-2.6.38: [linux-dvb] cx88-blackbird broken (since 2.6.37)
Huber Andreas
2011-03-27 15:10:01 UTC
Package: linux-source-2.6.38
Version: 2.6.38-1
Severity: important
Tags: upstream


[Symptom]
Processes that try to open a cx88-blackbird driven MPEG device will hang up.

[Cause]
Nested mutex_lock calls (which are not allowed) result in a deadlock.

[Details]
There has been recent work on removing BKL (Big Kernel Lock) calls from kernel code (see http://kernelnewbies.org/BigKernelLock). This was not done properly for the cx88-blackbird driver:

Source-File: drivers/media/video/cx88/cx88-blackbird.c
Function: int mpeg_open(struct file *file)
Problem: the calls to drv->request_acquire(drv) and drv->request_release(drv) hang because they try to lock a mutex that has already been locked by a previous call to mutex_lock(&dev->core->lock) ...

static int mpeg_open(struct file *file)
{
[...]
	mutex_lock(&dev->core->lock); // MUTEX LOCKED !!!!!!!!!!!!!!!!

	/* Make sure we can acquire the hardware */
	drv = cx8802_get_driver(dev, CX88_MPEG_BLACKBIRD);
	if (drv) {
		err = drv->request_acquire(drv); // HANGS !!!!!!!!!!!!!!!!!!!
		if (err != 0) {
			dprintk(1,"%s: Unable to acquire hardware, %d\n", __func__, err);
			mutex_unlock(&dev->core->lock);
			return err;
		}
	}
[...]
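
The nested locking described above can be reproduced in user space. The following is only a minimal sketch, not kernel code: it uses POSIX threads rather than the kernel mutex API, and the names demo_core, demo_request_acquire and demo_open are hypothetical stand-ins for the driver structures. An error-checking mutex is assumed so that the second lock attempt reports EDEADLK instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L  /* for PTHREAD_MUTEX_ERRORCHECK */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the cx88 core; only the lock matters here. */
struct demo_core {
    pthread_mutex_t lock;
};

/* Initialise the lock as an error-checking mutex, so that a second lock
 * attempt by the owning thread returns EDEADLK instead of blocking. */
static int demo_core_init(struct demo_core *core)
{
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);

    if (err)
        return err;
    err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (!err)
        err = pthread_mutex_init(&core->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return err;
}

/* Same shape as request_acquire in the report: takes the core lock itself. */
static int demo_request_acquire(struct demo_core *core)
{
    int err = pthread_mutex_lock(&core->lock);

    if (err)
        return err;  /* EDEADLK when the caller already holds the lock */
    /* ... the hardware would be acquired here ... */
    pthread_mutex_unlock(&core->lock);
    return 0;
}

/* Same shape as mpeg_open in the report: locks, then calls the helper. */
static int demo_open(struct demo_core *core)
{
    int err = pthread_mutex_lock(&core->lock);
    int unlock_err;

    if (err)
        return err;
    err = demo_request_acquire(core);  /* nested lock attempt */
    unlock_err = pthread_mutex_unlock(&core->lock);
    assert(unlock_err == 0);  /* we still hold the outer lock here */
    (void)unlock_err;
    return err;
}
```

Calling demo_request_acquire() on its own succeeds, while demo_open() returns EDEADLK. With a default mutex type the nested call would simply block, which corresponds to the hang in the report. Build with -pthread.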

Here's the relevant kernel log extract (Linux version 2.6.38-1-amd64 (Debian 2.6.38-1)) ...

Mar 24 21:25:10 xen kernel: [ 241.472067] INFO: task v4l_id:1000 blocked for more than 120 seconds.
Mar 24 21:25:10 xen kernel: [ 241.478845] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 24 21:25:10 xen kernel: [ 241.482412] v4l_id D ffff88006bcb6540 0 1000 1 0x00000000
Mar 24 21:25:10 xen kernel: [ 241.486031] ffff88006bcb6540 0000000000000086 ffff880000000001 ffff88006981c380
Mar 24 21:25:10 xen kernel: [ 241.489694] 0000000000013700 ffff88006be5bfd8 ffff88006be5bfd8 0000000000013700
Mar 24 21:25:10 xen kernel: [ 241.493301] ffff88006bcb6540 ffff88006be5a010 ffff88006bcb6540 000000016be5a000
Mar 24 21:25:10 xen kernel: [ 241.496766] Call Trace:
Mar 24 21:25:10 xen kernel: [ 241.500145] [<ffffffff81321c4a>] ? __mutex_lock_common+0x127/0x193
Mar 24 21:25:10 xen kernel: [ 241.503630] [<ffffffff81321d82>] ? mutex_lock+0x1a/0x33
Mar 24 21:25:10 xen kernel: [ 241.507145] [<ffffffffa09dd155>] ? cx8802_request_acquire+0x66/0xc6 [cx8802]
Mar 24 21:25:10 xen kernel: [ 241.510699] [<ffffffffa0aab7f2>] ? mpeg_open+0x7a/0x1fc [cx88_blackbird]
Mar 24 21:25:10 xen kernel: [ 241.514279] [<ffffffff8123bfb6>] ? kobj_lookup+0x139/0x173
Mar 24 21:25:10 xen kernel: [ 241.517856] [<ffffffffa062d5fd>] ? v4l2_open+0xb3/0xdf [videodev]


regards
Andi Huber

-- System Information:
Debian Release: wheezy/sid
APT prefers unstable
APT policy: (500, 'unstable'), (500, 'testing'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.36-trunk-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.ISO-8859-15, LC_CTYPE=en_US.ISO-8859-15 (charmap=ISO-8859-15)
Shell: /bin/sh linked to /bin/dash

Versions of packages linux-source-2.6.38 depends on:
ii binutils 2.20.1-16 The GNU assembler, linker and bina
ii bzip2 1.0.5-6 high-quality block-sorting file co

Versions of packages linux-source-2.6.38 recommends:
ii gcc 4:4.4.5-1 The GNU C compiler
ii libc6-dev [libc-dev] 2.11.2-10 Embedded GNU C Library: Developmen
ii make 3.81-8 An utility for Directing compilati

Versions of packages linux-source-2.6.38 suggests:
pn kernel-package <none> (no description available)
ii libncurses5-dev [ncurses- 5.8+20110307-1 developer's libraries for ncurses
pn libqt3-mt-dev <none> (no description available)

-- no debconf information
--
To UNSUBSCRIBE, email to debian-bugs-dist-***@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact ***@lists.debian.org
Ben Hutchings
2011-03-29 03:40:01 UTC
Post by Huber Andreas
Package: linux-source-2.6.38
Version: 2.6.38-1
Severity: important
Tags: upstream
[Symptom]
Processes that try to open a cx88-blackbird driven MPEG device will hang up.
[Cause]
Nested mutex_lock calls (which are not allowed) result in a deadlock.
Could you test whether this patch fixes the problem? Instructions for
rebuilding the kernel package are at
<http://kernel-handbook.alioth.debian.org/ch-common-tasks.html#s-common-official>.

Ben.
--
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
Andreas Huber
2011-03-30 18:50:02 UTC
Post by Ben Hutchings
Post by Huber Andreas
Package: linux-source-2.6.38
Version: 2.6.38-1
Severity: important
Tags: upstream
[Symptom]
Processes that try to open a cx88-blackbird driven MPEG device will hang up.
[Cause]
Nested mutex_lock calls (which are not allowed) result in a deadlock.
Could you test whether this patch fixes the problem? Instructions for
rebuilding the kernel package are at
<http://kernel-handbook.alioth.debian.org/ch-common-tasks.html#s-common-official>.
Ben.
Hi Ben, this patch fixes the deadlock during opening of the MPEG device, thanks.
But I did some testing and ran into another deadlock while unloading the
(patched) driver ...

rmmod cx88_blackbird

...
cx88/2: unregistering cx8802 driver, type: blackbird access: shared
cx88[0]/2: subsystem: 0070:9601, board: Hauppauge WinTV-HVR1300 DVB-T/Hybrid MPEG Encoder [card=56]
cx88[1]/2: subsystem: 0070:9601, board: Hauppauge WinTV-HVR1300 DVB-T/Hybrid MPEG Encoder [card=56]
INFO: task rmmod:11233 blocked for more than 120 seconds.
...
rmmod D ffff88005e9086c0 0 11233 5297 0x00000000
...
Call Trace:
[<ffffffff8131fae5>] ? __mutex_lock_common.clone.5+0x12a/0x195
[<ffffffff810eb3d9>] ? kfree+0xc1/0xda
[<ffffffff8131f9a2>] ? mutex_lock+0x1a/0x33
[<ffffffffa0b3a809>] ? cx8802_blackbird_remove+0x27/0x3d [cx88_blackbird]
[<ffffffffa08671f2>] ? cx8802_unregister_driver+0xf1/0x1bd [cx8802]
[<ffffffff810730a9>] ? sys_delete_module+0x1df/0x251
[<ffffffff81009912>] ? system_call_fastpath+0x16/0x1b
...


And there seems to be a new problem:

I have 2 identical WinTV-HVR1300 Cards ...

[ 6.876614] cx88[0]/0: registered device video0 [v4l2]
[ 6.889815] cx88[1]/0: registered device video1 [v4l2]
[ 10.161998] cx88[0]/2: registered device video2 [mpeg]
[ 13.286062] cx88[1]/2: registered device video3 [mpeg]

Here's what I experienced so far:

1) booting kernel 2.6.36-trunk-amd64 from debian
both cards are able to stream their mpeg encoded tv streams through
the mpeg devices (in my case /dev/video2 and /dev/video3)
2) after reboot into kernel 2.6.38
/dev/video2 still works fine; tuning to different channels works!
/dev/video3 is inaccessible (after doing exactly the same initialization
as before) ...

dd if=/dev/video3 of=/tmp/test.mpg
dd: reading `/dev/video3': Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 2.74267 s, 0.0 kB/s

Do you think this behavior could be BKL conversion related?
Andreas Huber
2011-04-01 06:40:02 UTC
Continuing Ben's work, I fixed all remaining issues.

Patch was uploaded to
https://bugzilla.kernel.org/show_bug.cgi?id=31962

Everything works fine for me now.

regards
Andi
Andreas Huber
2011-03-30 22:20:03 UTC
Ok, thanks!
All mail regarding Debian bugs should be cc'd to the bug address (in
I am not going to spend more time trying to fix this, as I don't know
the media/DVB system well and do not have the hardware in question. I
will forward your information to the upstream developers.
Ben.
Ben Hutchings
2011-04-02 15:30:02 UTC
Jonathan has provided a new patch set. (RFC in progress.)
And while testing it today an unrelated issue occurred which needs to
be resolved ...
[...]

I'm not going to spend more time looking at this in detail. Please let
us know when a complete fix has been accepted upstream.

Ben.
Andreas Huber
2011-04-02 18:20:02 UTC
On 02.04.2011 17:18, Ben Hutchings wrote:
[...]
I'm not going to spend more time looking at this in detail. Please let
us know when a complete fix has been accepted upstream.
Ben.
Just for clarification, here is what I think is going wrong in cx88-mpeg.c.
It's very simple; see the comments ...

static int cx8802_request_release(struct cx8802_driver *drv)
{
	struct cx88_core *core = drv->core;

	if (drv->advise_release && --core->active_ref == 0) // REF COUNT MAY BECOME NEGATIVE !!!!!!
	{
		drv->advise_release(drv);
		core->active_type_id = CX88_BOARD_NONE;
		mpeg_dbg(1,"%s() Post release GPIO=%x\n", __func__, cx_read(MO_GP0_IO));
	}

	if (core->active_ref < 0) core->active_ref = 0; // THIS IS A POSSIBLE FIX !!!!!

	return 0;
}

please review this simple fix!
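
The underflow flagged in the comments above can be illustrated outside the kernel. This is a simplified sketch under stated assumptions: ref_core, ref_acquire, ref_release_unguarded and ref_release_guarded are hypothetical names, the advise_release callback is assumed to be set and is modelled by a plain counter, and the guarded variant is just one possible alternative to clamping the count afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the counter in struct cx88_core. */
struct ref_core {
    int active_ref;  /* number of current users */
    int releases;    /* how often the "hardware" was actually released */
};

static void ref_acquire(struct ref_core *core)
{
    core->active_ref++;
}

/* Unguarded: decrements unconditionally, like the quoted condition does
 * when the callback is set. An unbalanced call drives the count below 0. */
static void ref_release_unguarded(struct ref_core *core)
{
    if (--core->active_ref == 0)
        core->releases++;
}

/* Guarded: refuses to decrement when nothing is held, so the count can
 * never go negative. Returns false for an unbalanced release. */
static bool ref_release_guarded(struct ref_core *core)
{
    if (core->active_ref <= 0)
        return false;
    if (--core->active_ref == 0)
        core->releases++;
    assert(core->active_ref >= 0);  /* invariant of the guarded variant */
    return true;
}
```

After one unbalanced ref_release_unguarded() the count is -1, and a following ref_acquire() brings it back to 0 although the device is in use. The guarded variant leaves the count untouched in that case and lets the caller notice the imbalance.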
regards
Andi
Geert Stappers
2011-04-02 19:00:01 UTC
Post by Andreas Huber
please review this simple fix!
Please let us know when a complete fix has been accepted upstream.


Groeten
Geert Stappers
--
Post by Andreas Huber
And is there a policy on top-posting vs. bottom-posting?
Yes.
Jonathan Nieder
2011-05-25 02:10:02 UTC
tags 619827 = upstream fixed-upstream
quit

Hi,
Post by Ben Hutchings
Post by Huber Andreas
Processes that try to open a cx88-blackbird driven MPEG device will hang up.
[...]
Post by Ben Hutchings
Could you test whether this patch fixes the problem?
Thanks again. This and related problems should be fixed by

- 8a317a87 ([media] cx88: protect per-device driver list with device lock)
- 1fe70e96 ([media] cx88: fix locking of sub-driver operations)
- 1d6213ab ([media] cx88: hold device lock during sub-driver initialization)
- 344d6c6b ([media] cx88: protect cx8802_devlist with a mutex)
- 579b2b45 ([media] cx88: gracefully reject attempts to use unregistered
cx88-blackbird driver)
- f4bd4be8 ([media] cx88: don't use atomic_t for core->mpeg_users)

which are as of yesterday part of Linus's "master" branch fwiw.
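
For readers who want the general idea behind this kind of locking fix: one common way to avoid nested locking is to let the caller take the lock once and have the helpers rely on it. The sketch below is a user-space illustration with POSIX threads and is not taken from the commits listed above; sub_core, sub_acquire_locked and sub_open are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L  /* for PTHREAD_MUTEX_ERRORCHECK */
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a device core with a lock and a use count. */
struct sub_core {
    pthread_mutex_t lock;
    int active_ref;
};

/* Error-checking mutex: a nested lock attempt would return an error
 * instead of hanging, which makes mistakes visible in this sketch. */
static int sub_core_init(struct sub_core *core)
{
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);

    if (err)
        return err;
    err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (!err)
        err = pthread_mutex_init(&core->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    core->active_ref = 0;
    return err;
}

/* Caller must hold core->lock. This helper takes no lock of its own. */
static int sub_acquire_locked(struct sub_core *core)
{
    core->active_ref++;
    return 0;
}

/* Open path: takes the lock exactly once, then calls the helper. */
static int sub_open(struct sub_core *core)
{
    int err = pthread_mutex_lock(&core->lock);
    int unlock_err;

    if (err)
        return err;
    err = sub_acquire_locked(core);
    unlock_err = pthread_mutex_unlock(&core->lock);
    assert(unlock_err == 0);  /* the lock is held by this thread */
    (void)unlock_err;
    return err;
}
```

Here sub_open() succeeds every time because the lock is taken only once per call. The price of this convention is that every caller of a _locked helper has to hold the lock, which is best stated in a comment next to the helper.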
Ben Hutchings
2011-06-06 01:10:01 UTC
On Tue, 2011-05-24 at 21:03 -0500, Jonathan Nieder wrote:
[...]
Post by Jonathan Nieder
Thanks again. This and related problems should be fixed by
- 8a317a87 ([media] cx88: protect per-device driver list with device lock)
- 1fe70e96 ([media] cx88: fix locking of sub-driver operations)
- 1d6213ab ([media] cx88: hold device lock during sub-driver initialization)
- 344d6c6b ([media] cx88: protect cx8802_devlist with a mutex)
- 579b2b45 ([media] cx88: gracefully reject attempts to use unregistered
cx88-blackbird driver)
- f4bd4be8 ([media] cx88: don't use atomic_t for core->mpeg_users)
which are as of yesterday part of Linus's "master" branch fwiw.
Now also in 2.6.39.1 and queued for 2.6.39-2, thanks.

Ben.
Debian Bug Tracking System
2011-06-08 10:10:01 UTC
Your message dated Wed, 08 Jun 2011 10:05:58 +0000
with message-id <E1QUFeA-0006Bs-***@franck.debian.org>
and subject line Bug#619827: fixed in linux-2.6 2.6.39-2
has caused the Debian Bug report #619827,
regarding cx88-blackbird driver hangs when used
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ***@bugs.debian.org
immediately.)
--
619827: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=619827
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems