Commit Graph

15351 Commits

Author SHA1 Message Date
Frank Schaefer
cbe7f8a030 [media] em28xx: remove obsolete field 'frame' from struct em28xx_buffer
Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 20:51:19 -02:00
Frank Schaefer
948a49aa69 [media] em28xx: use common function for video and vbi buffer completion
Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 20:50:59 -02:00
Frank Schaefer
24a6d8497f [media] em28xx: refactor get_next_buf() and use it for vbi data, too
get_next_buf() and vbi_get_next_buf() do exactly the same just with a
different dma queue and buffer. Saving the new buffer pointer back to the
device struct in em28xx_urb_data_copy() instead of doing this from inside
these functions makes it possible to get rid of one of them.
Also refactor the function parameters and return type:
- pass a pointer to struct em28xx as parameter (instead of obtaining the
  pointer from the dma queue pointer with the container_of macro) like we do
  it in all other functions
- instead of using a pointer-pointer, return the pointer to the new buffer
  as return value of the function

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 20:48:57 -02:00
Frank Schaefer
960da93ba5 [media] em28xx: use common urb data copying function for vbi and non-vbi data streams
em28xx_urb_data_copy_vbi() is actually an extended version of
em28xx_urb_data_copy(). With the preceding fixes and improvements, it works
fine with both, vbi and non-vbi data streams without performance impacts.
So rename em28xx_urb_data_copy_vbi() to em28xx_urb_data_copy(), delete the
the old implementation of em28xx_urb_data_copy() and change the code to use
this function for both data stream types.
Tested with "SilverCrest 1.3 MPix webcam" (progressive, non-vbi) and
"Hauppauge HVR-900 (65008/A1C0)" (interlaced, vbi enabled and disabled).

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 19:17:37 -02:00
Frank Schaefer
79ff8697e9 [media] em28xx: em28xx_urb_data_copy_vbi(): calculate vbi_size only if needed
Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 19:17:18 -02:00
Frank Schaefer
0455eebfbd [media] em28xx: fix capture type setting in em28xx_urb_data_copy_vbi()
Set capture type to 1 (video start) when the video frame start header is
detected. This bug didn't cause any trouble, because this type of header is
never received in vbi mode.
Fix it, because we want to use this function with disabled vbi in the future.
Also start with capture type -1 to avoid processing of corrupted/incomplete
frame data which is usually received at streaming start (especially when
USB bulk transfers are used).

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 19:16:57 -02:00
Frank Schaefer
3610f58bb1 [media] em28xx: make sure the packet size is >= 4 before checking for headers in em28xx_urb_data_copy_vbi()
Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 19:16:10 -02:00
Frank Schaefer
b77e0c088f [media] em28xx: fix video data start position calculation in em28xx_urb_data_copy_vbi()
The header check/removal code at the end of function em28xx_urb_data_copy_vbi()
is obsolete, because this is already done earlier in this function.
In fact it is incomplete (doesn't check for vbi header) and causes trouble
when the first data bytes are the same as header bytes (which is fortunately
very unlikely).

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 19:02:32 -02:00
Frank Schaefer
454fe92f01 [media] em28xx: add module parameter for selection of the preferred USB transfer type
By default, isoc transfers are used if possible.
With the new module parameter, bulk can be selected as the
preferred USB transfer type.

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 18:20:03 -02:00
Frank Schaefer
c647a91a25 [media] em28xx: improve USB endpoint logic, also use bulk transfers
The current enpoint logic ignores all bulk endpoints and uses
a fixed mapping between endpint addresses and the supported
data stream types (analog/audio/DVB):
  Ep 0x82, isoc	=> analog
  Ep 0x83, isoc	=> audio
  Ep 0x84, isoc	=> DVB
Now that the code can also do bulk transfers, the endpoint
logic has to be extended to also consider bulk endpoints.
The new logic preserves backwards compatibility and reflects
the endpoint configurations we have seen so far:
  Ep 0x82, isoc		=> analog
  Ep 0x82, bulk		=> analog
  Ep 0x83, isoc*	=> audio
  Ep 0x84, isoc		=> digital
  Ep 0x84, bulk		=> analog or digital**
 (*: audio should always be isoc)
 (**: analog, if ep 0x82 is isoc, otherwise digital)

[mchehab@redhat.com: Fix a CodingStyle issue: don't break strings
 into separate lines]

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 18:17:52 -02:00
Frank Schaefer
c8e9d95b41 [media] em28xx: set USB alternate settings for analog video bulk transfers properly
Extend function em28xx_set_alternate:
- use alternate setting 0 for bulk transfers as default
- respect module parameter 'alt'=0 for bulk transfers
- set max_packet_size to 512 bytes for bulk transfers
[mchehab@redhat.com: Fix a CodingStyle issue: don't break strings
 into separate lines]
Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 18:13:11 -02:00
Frank Schaefer
7312f2c9fa [media] em28xx: add fields for analog and DVB USB transfer type selection to struct em28xx
Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 18:11:42 -02:00
Frank Schaefer
0cf544a6cc [media] em28xx: rename some USB parameter fields in struct em28xx to clarify their role
Also improve the comments.

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 18:11:26 -02:00
Frank Schaefer
a950e4a75e [media] em28xx: rename function em28xx_dvb_isoc_copy and extend for USB bulk transfers
The URB data processing for DVB bulk transfers is very similar to
what is done with isoc transfers, so create a common function that
works with both transfer types based on the existing isoc function.
Tested with device Hauppauge HVR-930c.

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 18:04:50 -02:00
Frank Schaefer
4601cc3977 [media] em28xx: rename function em28xx_isoc_copy_vbi and extend for USB bulk transfers
The URB data processing for bulk transfers is very similar to what
is done with isoc transfers, so create a common function that works
with both transfer types based on the existing isoc function.

[mchehab@redhat.com: Fix a CodingStyle issue: don't break strings
 into separate lines]

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 18:04:28 -02:00
Frank Schaefer
0fa4a40290 [media] em28xx: rename function em28xx_isoc_copy and extend for USB bulk transfers
The URB data processing for bulk transfers is very similar to what
is done with isoc transfers, so create a common function that works
with both transfer types based on the existing isoc function.

[mchehab@redhat.com: Fix a CodingStyle issue: don't break strings
 into separate lines]

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 18:02:48 -02:00
Frank Schaefer
1653cb0cb2 [media] em28xx: remove double checks for urb->status == -ENOENT in urb_data_copy functions
This check is already done in the URB handler
em28xx_irq_callback before calling these functions.

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 18:00:59 -02:00
Frank Schaefer
337fe8dad5 [media] em28xx: clear USB halt/stall condition in em28xx_init_usb_xfer when using bulk transfers
[mchehab@redhat.com: Fix a CodingStyle issue: don't break strings
 into separate lines]

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 18:00:30 -02:00
Frank Schaefer
057ca0da06 [media] em28xx: create a common function for isoc and bulk USB transfer initialization
- rename em28xx_init_isoc to em28xx_init_usb_xfer
- add parameter for isoc/bulk transfer selection which is passed to em28xx_alloc_urbs
- rename local variable isoc_buf to usb_bufs

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 17:58:14 -02:00
Frank Schaefer
6ddd89d0c9 [media] em28xx: create a common function for isoc and bulk URB allocation and setup
Rename the existing function for isoc transfers em28xx_init_isoc
to em28xx_init_usb_xfer and extend it.
URB allocation and setup is now done depending on the USB
transfer type, which is selected with a new function parameter.

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 17:58:00 -02:00
Frank Schaefer
afb177e065 [media] em28xx: rename function em28xx_uninit_isoc to em28xx_uninit_usb_xfer
This function will be used to uninitialize USB bulk transfers, too.
Also rename the local variable isoc_bufs to usb_bufs.

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 17:57:41 -02:00
Frank Schaefer
836e93bf6a [media] em28xx: update description of em28xx_irq_callback
em28xx_irq_callback can be used for isoc and bulk transfers.

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 17:57:27 -02:00
Frank Schaefer
89f84b9c20 [media] em28xx: remove obsolete #define EM28XX_URB_TIMEOUT
It isn't used anymore and uses constants which no longer exist.

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 17:57:00 -02:00
Frank Schaefer
74209dc06a [media] em28xx: rename struct em28xx_usb_isoc_ctl to em28xx_usb_ctl
Also rename the corresponding field isoc_ctl in struct em28xx
to usb_ctl.
We will use this struct for USB bulk transfers, too.

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 17:56:49 -02:00
Frank Schaefer
f0fa9936f5 [media] em28xx: rename struct em28xx_usb_isoc_bufs to em28xx_usb_bufs
It will be used for USB bulk transfers, too.

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 17:56:34 -02:00
Frank Schaefer
515688a898 [media] em28xx: rename isoc packet number constants and parameters
Rename EM28XX_NUM_PACKETS to EM28XX_NUM_ISOC_PACKETS and
EM28XX_DVB_MAX_PACKETS to EM28XX_DVB_NUM_ISOC_PACKETS to
clarify that these values are used only for isoc usb transfers.
Also use the term num_packets instead of max_packets, as this
is how these values are used and called in struct urb.

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 17:56:10 -02:00
Frank Schaefer
8c3015676f [media] em28xx: clarify meaning of field 'progressive' in struct em28xx
Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 17:55:58 -02:00
Frank Schaefer
c02ec71b01 [media] em28xx: fix wrong data offset for non-interlaced mode in em28xx_copy_video
em28xx_copy_video uses a wrong offset for the target buffer
when copying the data from an USB isoc packet. This happens
only for the second and all following lines in the packet.
The reason why this bug doesn't cause image corruption with
my test device (SilverCrest Webcam 1.3 MPix) is, that this
device never sends any packets that cross the end of a line.
I don't know if all devices behave like this, so this patch
should be considered for stable.
With the upcoming patches to add support for USB bulk transfers,
em28xx_copy_video will be called once per URB, which will
always trigger this bug.

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 17:55:34 -02:00
Frank Schaefer
2f5741aa6a [media] em28xx: input: fix oops on device removal
When em28xx_ir_init() fails due to an configuration error, it frees the memory
of struct em28xx_IR *ir, but doesn't set the corresponding pointer in the
device struct to NULL.
On device removal, em28xx_ir_fini() gets called, which then calls
rc_unregister_device() with a pointer to freed memory.
Fixes bug 26572 (http://bugzilla.kernel.org/show_bug.cgi?id=26572)

Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-22 16:14:37 -02:00
Mauro Carvalho Chehab
0dae883923 [media] em28xx: add support for RC6 mode 0 on devices that support it
Newer em28xx chipsets (em2874 and upper) are capable of supporting
RC6 codes, on both mode 0 (command mode, 16 bits payload size, similar
to RC5, also called "Philips mode") and mode 6a (OEM command mode,
with offers a few alternatives with regards to the payload size).
I don't have any mode 6a control ATM to test it, so, I opted to add
support only to mode 0.
After this patch, adding support to mode 6a should not be hard.
Tested with a Philips television remote controller.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 21:16:19 -02:00
Mauro Carvalho Chehab
105e3687ad [media] em28xx: add support for NEC proto variants on em2874 and upper
By disabling the NEC parity check, it is possible to handle all 3 NEC
protocol variants (32, 24 or 16 bits).
Change the driver in order to handle all of them.
Unfortunately, em2860/em2863 provide only 16 bits for the IR scancode,
even when NEC parity is disabled. So, this change should affect only
em2874 and newer devices, with provides up to 32 bits for the scancode.
Tested with one NEC-16, one NEC-24 and one RC5 IR.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 21:15:52 -02:00
Mauro Carvalho Chehab
37285bf2a5 em28xx: add two missing tuners at the Kconfig file
Those two tuners may also be needed.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 21:14:41 -02:00
Jonathan McDowell
fc09931e10 [media] Autoselect more relevant frontends for EM28XX DVB stick
I noticed that the EM28XX DVB driver doesn't auto select all of the
appropriate DVB tuner modules required. In particular I needed
DVB_LGDT3305 for my a340, but it looks like DVB_MT352 + DVB_S5H1409 were
missing as well.

Signed-Off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 21:14:01 -02:00
Julian Scheel
afca99a2bf [media] tm6000-dvb: Fix module unload
dvb_unregister_frontend has to be called before detach. Otherwise the
unregister call will segfault. This made tm6000-dvb module unload unusable.

Signed-off-by: Julian Scheel <julian@jusst.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 18:45:31 -02:00
Matthijs Kooijman
9fa35204dd [media] rc: Call rc_register_device before irq setup
This should fix a potential race condition, when the irq handler
triggers while rc_register_device is still setting up the rdev->raw
device.
This crash has not been observed in practice, but there should be a very
small window where it could occur. Since ir_raw_event_store_with_filter
checks if rdev->raw is not NULL before using it, this bug is not
triggered if the request_irq triggers a pending irq directly (since
rdev->raw will still be NULL then).
This commit was tested on nuvoton-cir only.

Cc: Jarod Wilson <jarod@redhat.com>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: David Härdeman <david@hardeman.nu>
Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 18:26:11 -02:00
Matthijs Kooijman
d62b681847 [media] rc: Set rdev before irq setup
This fixes a problem in fintek-cir and nuvoton-cir where the
irq handler would trigger during module load before the rdev member was
set, causing a NULL pointer crash.
It seems this crash is very reproducible (just bombard the receiver with
IR signals during module load), probably because when request_irq is
called, any pending intterupt is handled immediately, before
request_irq returns and rdev can be set.
This same crash was supposed to be fixed by commit
9ef449c6b3 ("[media] rc: Postpone ISR
registration"), but the crash was still observed on the nuvoton-cir
driver.
This commit was tested on nuvoton-cir only.

Cc: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 18:26:10 -02:00
Matthijs Kooijman
70ef69915b [media] rc: Make probe cleanup goto labels more verbose
Before, labels were simply numbered. Now, the labels are named after the
cleanup action they'll perform (first), based on how the winbond-cir
driver does it. This makes the code a bit more clear and makes changes
in the ordering of labels easier to review.
This change is applied only to the rc drivers that do significant
cleanup in their probe functions: ati-remote, ene-ir, fintek-cir,
gpio-ir-recv, ite-cir, nuvoton-cir.
This commit should not change any code, it just renames goto labels.

[mchehab@redhat.com: removed changes at gpio-ir-recv.c, due to
 merge conflicts]

Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 18:26:08 -02:00
Kirill Smelkov
d40fbf8d52 [media] vivi: Optimize precalculate_line()
precalculate_line() is not very high on profile, but it calls expensive
gen_twopix(), so let's polish it too:
    call gen_twopix() only once for every color bar and then distribute
    the result.
before:
    # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
    #
    # Samples: 46K of event 'cycles'
    # Event count (approx.): 15574200568
    #
    # Overhead          Command         Shared Object
    # ........  ...............  ....................
    #
        27.99%             rawv  libc-2.13.so          [.] __memcpy_ssse3
        23.29%           vivi-*  [kernel.kallsyms]     [k] memcpy
        10.30%             Xorg  [unknown]             [.] 0xa75c98f8
         5.34%           vivi-*  [vivi]                [k] gen_text.constprop.6
         4.61%             rawv  [vivi]                [k] gen_twopix
         2.64%             rawv  [vivi]                [k] precalculate_line
         1.37%          swapper  [kernel.kallsyms]     [k] read_hpet
after:
    # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
    #
    # Samples: 45K of event 'cycles'
    # Event count (approx.): 15561769214
    #
    # Overhead          Command         Shared Object
    # ........  ...............  ....................
    #
        30.73%             rawv  libc-2.13.so          [.] __memcpy_ssse3
        26.78%           vivi-*  [kernel.kallsyms]     [k] memcpy
        10.68%             Xorg  [unknown]             [.] 0xa73015e9
         5.55%           vivi-*  [vivi]                [k] gen_text.constprop.6
         1.36%          swapper  [kernel.kallsyms]     [k] read_hpet
         0.96%             Xorg  [kernel.kallsyms]     [k] read_hpet
         ...
         0.16%             rawv  [vivi]                [k] precalculate_line
         ...
         0.14%             rawv  [vivi]                [k] gen_twopix
(i.e. gen_twopix and precalculate_line overheads are almost gone)

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 18:25:57 -02:00
Kirill Smelkov
13908f330f [media] vivi: Move computations out of vivi_fillbuf linecopy loop
The "dev->mvcount % wmax" thing was showing high in profiles (we do it
for each line which ~ 500 per frame)
           ?     000010c0 <vivi_fillbuff>:
                 ...
      0,39 ? 70:???mov    0x3ff4(%edi),%esi
      0,22 ? 76:?  mov    0x2a0(%edi),%eax
      0,30 ?    ?  mov    -0x84(%ebp),%ebx
      0,35 ?    ?  mov    %eax,%edx
      0,04 ?    ?  mov    -0x7c(%ebp),%ecx
      0,35 ?    ?  sar    $0x1f,%edx
      0,44 ?    ?  idivl  -0x7c(%ebp)
     21,68 ?    ?  imul   %esi,%ecx
      0,70 ?    ?  imul   %esi,%ebx
      0,52 ?    ?  add    -0x88(%ebp),%ebx
      1,65 ?    ?  mov    %ebx,%eax
      0,22 ?    ?  imul   %edx,%esi
      0,04 ?    ?  lea    0x3f4(%edi,%esi,1),%edx
      2,18 ?    ?? call   vivi_fillbuff+0xa6
      0,74 ?    ?  addl   $0x1,-0x80(%ebp)
     62,69 ?    ?  mov    -0x7c(%ebp),%edx
      1,18 ?    ?  mov    -0x80(%ebp),%ecx
      0,35 ?    ?  add    %edx,-0x84(%ebp)
      0,61 ?    ?  cmp    %ecx,-0x8c(%ebp)
      0,22 ?    ???jne    70
so since all variables stay the same for all iterations let's move
computations out of the loop: the abovementioned division and
"width*pixelsize" too
before:
    # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
    #
    # Samples: 49K of event 'cycles'
    # Event count (approx.): 16475832370
    #
    # Overhead          Command           Shared Object
    # ........  ...............  ......................
    #
        29.07%             rawv  libc-2.13.so            [.] __memcpy_ssse3
        20.57%           vivi-*  [kernel.kallsyms]       [k] memcpy
        10.20%             Xorg  [unknown]               [.] 0xa7301494
         5.16%           vivi-*  [vivi]                  [k] gen_text.constprop.6
         4.43%             rawv  [vivi]                  [k] gen_twopix
         4.36%           vivi-*  [vivi]                  [k] vivi_fillbuff
         2.42%             rawv  [vivi]                  [k] precalculate_line
         1.33%          swapper  [kernel.kallsyms]       [k] read_hpet
after:
    # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
    #
    # Samples: 46K of event 'cycles'
    # Event count (approx.): 15574200568
    #
    # Overhead          Command         Shared Object
    # ........  ...............  ....................
    #
        27.99%             rawv  libc-2.13.so          [.] __memcpy_ssse3
        23.29%           vivi-*  [kernel.kallsyms]     [k] memcpy
        10.30%             Xorg  [unknown]             [.] 0xa75c98f8
         5.34%           vivi-*  [vivi]                [k] gen_text.constprop.6
         4.61%             rawv  [vivi]                [k] gen_twopix
         2.64%             rawv  [vivi]                [k] precalculate_line
         1.37%          swapper  [kernel.kallsyms]     [k] read_hpet
         0.79%             Xorg  [kernel.kallsyms]     [k] read_hpet
         0.64%             Xorg  [kernel.kallsyms]     [k] unix_poll
         0.45%             Xorg  [kernel.kallsyms]     [k] fget_light
         0.43%             rawv  libxcb.so.1.1.0       [.] 0x0000aae9
         0.40%            runsv  [kernel.kallsyms]     [k] ext2_try_to_allocate
         0.36%             Xorg  [kernel.kallsyms]     [k] _raw_spin_lock_irqsave
         0.31%           vivi-*  [vivi]                [k] vivi_fillbuff
(i.e. vivi_fillbuff own overhead is almost gone)

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 18:25:43 -02:00
Kirill Smelkov
10ce844158 [media] vivi: vivi_dev->line[] was not aligned
Though dev->line[] is u8 array we work with it as with u16, u24 or u32
pixels, and also pass it to memcpy() and it's better to align it to at
least 4.
Before the patch, on x86 offsetof(vivi_dev, line) was 1003 and after
patch it is 1004.
There is slight performance increase, but I think is is slight, only
because we start copying not from line[0]:
    ---- 8< ---- drivers/media/platform/vivi.c
    static void vivi_fillbuff(struct vivi_dev *dev, struct vivi_buffer *buf)
    {
            ...
            for (h = 0; h < hmax; h++)
                    memcpy(vbuf + h * wmax * dev->pixelsize,
                           dev->line + (dev->mv_count % wmax) * dev->pixelsize,
                           wmax * dev->pixelsize);
before:
    # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
    #
    # Samples: 49K of event 'cycles'
    # Event count (approx.): 16799780016
    #
    # Overhead          Command         Shared Object
    # ........  ...............  ....................
    #
        27.51%             rawv  libc-2.13.so          [.] __memcpy_ssse3
        23.77%           vivi-*  [kernel.kallsyms]     [k] memcpy
         9.96%             Xorg  [unknown]             [.] 0xa76f5e12
         4.94%           vivi-*  [vivi]                [k] gen_text.constprop.6
         4.44%             rawv  [vivi]                [k] gen_twopix
         3.17%           vivi-*  [vivi]                [k] vivi_fillbuff
         2.45%             rawv  [vivi]                [k] precalculate_line
         1.20%          swapper  [kernel.kallsyms]     [k] read_hpet
    23.77%           vivi-*  [kernel.kallsyms]     [k] memcpy
                     |
                     --- memcpy
                        |
                        |--99.28%-- vivi_fillbuff
                        |          vivi_thread
                        |          kthread
                        |          ret_from_kernel_thread
                         --0.72%-- [...]
after:
    # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
    #
    # Samples: 49K of event 'cycles'
    # Event count (approx.): 16475832370
    #
    # Overhead          Command           Shared Object
    # ........  ...............  ......................
    #
        29.07%             rawv  libc-2.13.so            [.] __memcpy_ssse3
        20.57%           vivi-*  [kernel.kallsyms]       [k] memcpy
        10.20%             Xorg  [unknown]               [.] 0xa7301494
         5.16%           vivi-*  [vivi]                  [k] gen_text.constprop.6
         4.43%             rawv  [vivi]                  [k] gen_twopix
         4.36%           vivi-*  [vivi]                  [k] vivi_fillbuff
         2.42%             rawv  [vivi]                  [k] precalculate_line
         1.33%          swapper  [kernel.kallsyms]       [k] read_hpet

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 18:25:34 -02:00
Kirill Smelkov
e3a8b4d22b [media] vivi: Optimize gen_text()
I've noticed that vivi takes a lot of CPU to produce its frames.
For example for 8 devices and 8 simple programs running, where each
captures YUY2 640x480 and displays it to X via SDL, profile timing is as
follows:
    # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
    # Samples: 82K of event 'cycles'
    # Event count (approx.): 31551930117
    #
    # Overhead          Command         Shared Object                                                           Symbol
    # ........  ...............  ....................
    #
        49.48%           vivi-*  [vivi]                [k] gen_twopix
        10.79%           vivi-*  [kernel.kallsyms]     [k] memcpy
        10.02%             rawv  libc-2.13.so          [.] __memcpy_ssse3
         8.35%           vivi-*  [vivi]                [k] gen_text.constprop.6
         5.06%             Xorg  [unknown]             [.] 0xa73015f8
         2.32%             rawv  [vivi]                [k] gen_twopix
         1.22%             rawv  [vivi]                [k] precalculate_line
         1.20%           vivi-*  [vivi]                [k] vivi_fillbuff
    (rawv is display program, vivi-* is a combination of vivi-000 through vivi-007)
so a lot of time is spent in gen_twopix() which as the follwing
call-graph profile shows ...
    49.48%           vivi-*  [vivi]                [k] gen_twopix
                     |
                     --- gen_twopix
                        |
                        |--96.30%-- gen_text.constprop.6
                        |          vivi_fillbuff
                        |          vivi_thread
                        |          kthread
                        |          ret_from_kernel_thread
                        |
                         --3.70%-- vivi_fillbuff
                                   vivi_thread
                                   kthread
                                   ret_from_kernel_thread
... is called mostly from gen_text().
If we'll look at gen_text(), in the inner loop, we'll see
    if (chr & (1 << (7 - i)))
            gen_twopix(dev, pos + j * dev->pixelsize, WHITE, (x+y) & 1);
    else
            gen_twopix(dev, pos + j * dev->pixelsize, TEXT_BLACK, (x+y) & 1);
which calls gen_twopix() for every character pixel, and that is very
expensive, because gen_twopix() branches several times.
Now, let's note, that we operate on only two colors - WHITE and
TEXT_BLACK, and that pixel for that colors could be precomputed and
gen_twopix() moved out of the inner loop. Also note, that for black
and white colors even/odd does not make a difference for all supported
pixel formats, so we could stop doing that `odd` gen_twopix() parameter
game.
So the first thing we are doing here is
    1) moving gen_twopix() calls out of gen_text() into vivi_fillbuff(),
       to pregenerate black and white colors, just before printing
       starts.
what we have next is that gen_text's font rendering loop, even with
gen_twopix() calls moved out, was inefficient and branchy, so let's
    2) rewrite gen_text() loop so it uses less variables + unroll char
       horizontal-rendering loop + instantiate 3 code paths for pixelsizes 2,3
       and 4 so that in all inner loops we don't have to branch or make
       indirections (*).
Done all above reworks, for gen_text() we get nice, non-branchy
streamlined code (showing loop for pixelsize=2):
           ?       cmp    $0x2,%eax
           ?     ? jne    26
           ?       mov    -0x18(%ebp),%eax
           ?       mov    -0x20(%ebp),%edi
           ?       imul   -0x20(%ebp),%eax
           ?       movzwl 0x3ffc(%ebx),%esi
      0,08 ?       movzwl 0x4000(%ebx),%ecx
      0,04 ?       add    %edi,%edi
           ?       mov    0x0,%ebx
      0,51 ?       mov    %edi,-0x1c(%ebp)
           ?       mov    %ebx,-0x14(%ebp)
           ?       movl   $0x0,-0x10(%ebp)
           ?       lea    0x20(%edx,%eax,2),%eax
           ?       mov    %eax,-0x18(%ebp)
           ?       xchg   %ax,%ax
      0,04 ? a0:   mov    0x8(%ebp),%ebx
           ?       mov    -0x18(%ebp),%eax
      0,04 ?       movzbl (%ebx),%edx
      0,16 ?       test   %dl,%dl
      0,04 ?     ? je     128
      0,08 ?       lea    0x0(%esi),%esi
      1,61 ? b0:???shl    $0x4,%edx
      1,02 ?    ?  mov    -0x14(%ebp),%edi
      2,04 ?    ?  add    -0x10(%ebp),%edx
      2,24 ?    ?  lea    0x1(%ebx),%ebx
      0,27 ?    ?  movzbl (%edi,%edx,1),%edx
      9,92 ?    ?  mov    %esi,%edi
      0,39 ?    ?  test   %dl,%dl
      2,04 ?    ?  cmovns %ecx,%edi
      4,63 ?    ?  test   $0x40,%dl
      0,55 ?    ?  mov    %di,(%eax)
      3,76 ?    ?  mov    %esi,%edi
      0,71 ?    ?  cmove  %ecx,%edi
      3,41 ?    ?  test   $0x20,%dl
      0,75 ?    ?  mov    %di,0x2(%eax)
      2,43 ?    ?  mov    %esi,%edi
      0,59 ?    ?  cmove  %ecx,%edi
      4,59 ?    ?  test   $0x10,%dl
      0,67 ?    ?  mov    %di,0x4(%eax)
      2,55 ?    ?  mov    %esi,%edi
      0,78 ?    ?  cmove  %ecx,%edi
      4,31 ?    ?  test   $0x8,%dl
      0,67 ?    ?  mov    %di,0x6(%eax)
      5,76 ?    ?  mov    %esi,%edi
      1,80 ?    ?  cmove  %ecx,%edi
      4,20 ?    ?  test   $0x4,%dl
      0,86 ?    ?  mov    %di,0x8(%eax)
      2,98 ?    ?  mov    %esi,%edi
      1,37 ?    ?  cmove  %ecx,%edi
      4,67 ?    ?  test   $0x2,%dl
      0,20 ?    ?  mov    %di,0xa(%eax)
      2,78 ?    ?  mov    %esi,%edi
      0,75 ?    ?  cmove  %ecx,%edi
      3,92 ?    ?  and    $0x1,%edx
      0,75 ?    ?  mov    %esi,%edx
      2,59 ?    ?  mov    %di,0xc(%eax)
      0,59 ?    ?  cmove  %ecx,%edx
      3,10 ?    ?  mov    %dx,0xe(%eax)
      2,39 ?    ?  add    $0x10,%eax
      0,51 ?    ?  movzbl (%ebx),%edx
      2,86 ?    ?  test   %dl,%dl
      2,31 ?    ???jne    b0
      0,04 ?128:   addl   $0x1,-0x10(%ebp)
      4,00 ?       mov    -0x1c(%ebp),%eax
      0,04 ?       add    %eax,-0x18(%ebp)
      0,08 ?       cmpl   $0x10,-0x10(%ebp)
           ?     ? jne    a0
which almost goes away from the profile:
    # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
    # Samples: 49K of event 'cycles'
    # Event count (approx.): 16799780016
    #
    # Overhead          Command         Shared Object                                                           Symbol
    # ........  ...............  ....................
    #
        27.51%             rawv  libc-2.13.so          [.] __memcpy_ssse3
        23.77%           vivi-*  [kernel.kallsyms]     [k] memcpy
         9.96%             Xorg  [unknown]             [.] 0xa76f5e12
         4.94%           vivi-*  [vivi]                [k] gen_text.constprop.6
         4.44%             rawv  [vivi]                [k] gen_twopix
         3.17%           vivi-*  [vivi]                [k] vivi_fillbuff
         2.45%             rawv  [vivi]                [k] precalculate_line
         1.20%          swapper  [kernel.kallsyms]     [k] read_hpet
i.e. gen_twopix() overhead dropped from 49% to 4% and gen_text() loops
from ~8% to ~4%, and overal cycles count dropped from 31551930117 to
16799780016 which is ~1.9x whole workload speedup.
(*) for RGB24 rendering I've introduced x24, which could be thought as
    synthetic u24 for simplifying the code. That's done because for
    memcpy used for conditional assignment, gcc generates suboptimal code
    with more indirections.
    Fortunately, in C struct assignment is builtin and that's all we
    need from pixeltype for font rendering.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 18:25:21 -02:00
Paul Bolle
ed2e330114 [media] tda18212: tda18218: use 'val' if initialized
Commits e666a44fa3 ("[media] tda18212:
silence compiler warning") and e0e52d4e9f
("[media] tda18218: silence compiler warning") silenced warnings
equivalent to these:
    drivers/media/tuners/tda18212.c: In function ‘tda18212_attach’:
    drivers/media/tuners/tda18212.c:299:2: warning: ‘val’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    drivers/media/tuners/tda18218.c: In function ‘tda18218_attach’:
    drivers/media/tuners/tda18218.c:305:2: warning: ‘val’ may be used uninitialized in this function [-Wmaybe-uninitialized]
But in both cases 'val' will still be used uninitialized if the calls
of tda18212_rd_reg() or tda18218_rd_reg() fail. Fix this by only
printing the "chip id" if the calls of those functions were successful.
This allows to drop the uninitialized_var() stopgap measure.
Also stop printing the return values of tda18212_rd_reg() or
tda18218_rd_reg(), as these are not interesting.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Antti Palosaari <crope@iki.fi>
Reviewed-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 18:25:13 -02:00
Paul Bolle
cb31c74875 [media] budget-av: only use t_state if initialized
Building budget-av.o triggers this GCC warning:
    In file included from drivers/media/pci/ttpci/budget-av.c:44:0:
    drivers/media/dvb-frontends/tda8261_cfg.h: In function ‘tda8261_get_bandwidth’:
    drivers/media/dvb-frontends/tda8261_cfg.h:68:21: warning: ‘t_state.bandwidth’ may be used uninitialized in this function [-Wuninitialized]
Move the printk() that uses t_state.bandwith to the location where it
should be initialized to fix this.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 18:25:11 -02:00
Javier Martin
c01b7429e2 [media] media: ov7670: Allow 32x maximum gain for yuv422
4x gain ceiling is not enough to capture a decent image in conditions
of total darkness and only a LED light source. Allow a maximum gain
of 32x instead.
This doesn't have any drawback since the image quality in 'normal'
light conditions is the same.

Signed-off-by: Javier Martin <javier.martin@vista-silicon.com>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 18:25:09 -02:00
Javier Martin
2027240968 [media] media: m2m-deinterlace: Do not set debugging flag to true
Default value should be 'debugging disabled'.

Signed-off-by: Javier Martin <javier.martin@vista-silicon.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 18:25:07 -02:00
Peter Senna Tschudin
37e310edf3 [media] drivers/media/pci/saa7134/saa7134-dvb.c: Test if videobuf_dvb_get_frontend return NULL
Based on commit: e66131cee5
Not testing videobuf_dvb_get_frontend output may cause OOPS if it return
NULL. This patch fixes this issue.
The semantic patch that found this issue is(http://coccinelle.lip6.fr/):
// <smpl>
@@
identifier i,a,b;
statement S, S2;
@@
i = videobuf_dvb_get_frontend(...);
... when != if (!i) S
* if (i->a.b)
S2
// </smpl>

Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 17:43:22 -02:00
Javier Martin
832fbb5aec [media] media: coda: Fix H.264 header alignment - v2
Length of H.264 headers is variable and thus it might not be
aligned for the coda to append the encoded frame. This causes
the first frame to overwrite part of the H.264 PPS.
In order to solve that, a filler NAL must be added between
the headers and the first frame to preserve alignment.

[mchehab@redhat.com: applied only v2 diff here, as v1 ended by mistakenly
 being applied]
Signed-off-by: Javier Martin <javier.martin@vista-silicon.com>

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 16:56:31 -02:00
Javier Martin
3f3f5c7f63 [media] media: coda: Fix H.264 header alignment
Length of H.264 headers is variable and thus it might not be
aligned for the coda to append the encoded frame. This causes
the first frame to overwrite part of the H.264 PPS.
In order to solve that, a filler NAL must be added between
the headers and the first frame to preserve alignment.

[mchehab@redhat.com: Fix a few CodingStyle issues]
Signed-off-by: Javier Martin <javier.martin@vista-silicon.com>

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 16:39:03 -02:00
Jesper Juhl
bbe2a1d32f [media] rc: Fix double free in gpio_ir_recv_remove()
Since rc_unregister_device() frees its argument there's no need to
subsequently call rc_free_device() on the same variable - in fact it's
a double free bug.
Easily fixed by just removing the rc_free_device() call.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 16:28:58 -02:00
Jesper Juhl
e5d85b9ac3 [media] rc: Fix double free in gpio_ir_recv_probe()
At the 'err_request_irq' label, rc_unregister_device(rcdev) frees its
argument. So when we fall through to the 'err_gpio_request' label
further down and call rc_free_device(rcdev) then that's a double free.
Fix that by moving 'rcdev = NULL' from after the call to
rc_free_device() to after rc_unregister_device(). That fixes the
problem since rc_free_device() just does nothing if passed NULL and
there's no further use of 'rcdev' after the call to rc_free_device()
so it's not needed there.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2012-12-21 16:27:01 -02:00