Intel VAAPI howto with Leia v18 nightly based on Ubuntu 18.04 server
(2015-07-18, 13:03)puithove Wrote:
(2015-07-17, 23:16)wizziwig Wrote: I looked at your sample. It uses repeat-first-field flags on the picture headers in order to extend 24fps content to 29.97fps video as specified by the sequence header. Because the repetition pattern is not regular, you end up with stutter because frames are presented at a variable frame rate. Intel is not going to fix this for you because it's not a bug. Their Windows driver will do the same thing. I don't think the field repeat flags are even visible to the VPP/Deinterlacer. This needs to be handled in the player (Kodi in this case) by taking decoded frames and buffering some of the fields to repeat on the next frame. You're basically converting a partially soft-telecined video into a fully hard-telecined one (wiki). The final frames with repeated fields applied can then be sent to VPP.

I covered this topic before in an older thread.

Now that you mention it, I do remember seeing your post on this before. Would you mind outlining your findings on the bug report (and therefore also indicating that I'm not the only one having this issue)?

In my mind, it seems like it would still be something Intel would need to fix. I'm not going to pretend I really know a whole lot in this area, but wouldn't the field repeat be applied after the frames are decoded but before being sent to the deinterlacer? This would all be in the VAAPI chain it seems.

@fritsch - any thoughts here? Anything that could be done on the Kodi side?

I wish I could help out but I'm not a Linux/Kodi expert. I spend most of my time coding in Windows (DirectX and Cuda). The basic logic of what needs to happen is inside the "softpulldown" filter. Link

The simpler solution would be to copy the decoded frames from video memory back into system memory to run that filter on the CPU. I think that's what happens with yadif? I would not recommend that solution as it would be very slow. Ideally the field copying of that filter needs to happen on the GPU so frames stay in video memory. Can OpenGL access the decoded NV12 frames in order to copy some of the fields to a new frame?

If the VAAPI architecture does not allow any intermediate steps between decoding and VPP deinterlacing, then you're stuck and can't fix the bug. If someone who knows this code can explain more about how VAAPI works, maybe I can offer some suggestions.
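To illustrate the field-repeat buffering wizziwig describes (purely a sketch, not Kodi or ffmpeg code; the frame representation and function name are made up for the example), here is roughly how a player could expand repeat-first-field flags into an explicit field sequence, i.e. turn soft pulldown into hard telecine, before handing frames to VPP:

```python
# Rough sketch: expand repeat-first-field (RFF) flags into an explicit
# field sequence, turning soft pulldown into hard telecine.
# Each source frame is (top_field, bottom_field, rff, tff);
# tff = top field first.

def hard_telecine(frames):
    fields = []
    for top, bot, rff, tff in frames:
        first, second = (top, bot) if tff else (bot, top)
        fields.append(first)
        fields.append(second)
        if rff:  # present the first field a second time
            fields.append(first)
    # Pair consecutive fields back into frames for the deinterlacer/VPP.
    # Note some output frames now mix fields from different source frames,
    # which is exactly what hard telecine looks like.
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]

# Classic 2:3 cadence: four film frames with RFF on every other frame
# yield ten fields = five output frames (24000/1001 -> 30000/1001 fps).
film = [("t0", "b0", True, True), ("t1", "b1", False, True),
        ("t2", "b2", True, True), ("t3", "b3", False, True)]
print(len(hard_telecine(film)))  # -> 5
```

Parity handling and memory management are obviously the hard parts in a real player; the point is only that the field copying is simple bookkeeping once the RFF flags are visible.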
Reply
This looks to be quite a mixed sample - unless ffprobe is misleading me (or I'm misusing it), there are a lot of places where the telecine/pulldown pattern reverts to normal interlaced content.
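For what it's worth, one way to see how irregular the cadence is: dump the per-frame repeat flags (ffprobe's `-show_frames` output includes a `repeat_pict` field) and check them against a steady 2:3 pattern. A hypothetical helper along those lines, just for illustration:

```python
# Given the sequence of repeat-first-field flags (0/1 per frame, e.g.
# collected from ffprobe's per-frame repeat_pict output), check whether
# they follow a steady 2:3 cadence: RFF set on every other frame,
# starting at either phase.

def is_clean_pulldown(rff_flags):
    if len(rff_flags) < 4:
        return False  # too short to judge the cadence
    for phase in (0, 1):
        if all(flag == (i % 2 == phase) for i, flag in enumerate(rff_flags)):
            return True
    return False

print(is_clean_pulldown([1, 0, 1, 0, 1, 0]))  # steady cadence -> True
print(is_clean_pulldown([1, 0, 0, 0, 1, 1]))  # mixed/broken -> False
```

A mixed sample like this one would fail the check over most windows, which matches the stutter: the player cannot pick one fixed output cadence.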

Thinking as I type - I don't know how you would ever (currently, on a computer) get this perfectly smooth.

Of course telecine is not smooth by design - it was invented so 60Hz (/1.001) displays could play 24Hz content using repeated fields. If the sample were pure soft pulldown, the decoder/player could easily output 24Hz - but it isn't, so that's not an option; it sometimes needs to output 60Hz.
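For reference, the field arithmetic behind 2:3 pulldown (plain numbers, nothing player-specific):

```python
# 2:3 pulldown: film frames alternately contribute 2 and 3 fields,
# so 24 film frames/s become 60 fields/s = 30 interlaced frames/s
# (or 24000/1001 -> 60000/1001 fields/s with NTSC timing).
film_fps = 24
cadence = [2, 3]  # repeating 2:3 field pattern
avg_fields = sum(cadence) / len(cadence)  # 2.5 fields per film frame
field_rate = film_fps * avg_fields
print(field_rate)      # -> 60.0 fields/s
print(field_rate / 2)  # -> 30.0 interlaced frames/s
```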

I live in PAL land, so I don't really know how disconcerting 3:2 looked on an analogue TV - or looks on a modern one.

Given that modern TVs don't really work at 50/60Hz anymore it's quite possible that they handle this OK (or better) with frame interpolation/smoothing. Modern TVs often boast of internal processing at really high Hz rates.

Edit: forgot to add that I'm not trying to say there is no issue for you, or that it couldn't be handled better by the player and/or deinterlacer.
Reply
Hello fritsch (or anyone else who may know) -

Are these enhancements present in OpenELEC 6.0 beta 3?

Thanks!
Reply
Nope.

Won't be in final 6.0 either.
Reply
Try the openelec build in second post, works great! And based on rc2.
Reply
Thanks Hufvudet. I'll try that.
Reply
Using the OE build on the new 5CPYH NUC. Image quality is the best I've seen so far (I've used IVB before, and a CuboxTV for some weeks, which was disappointing).
1080i live TV is outstanding with this hard- and software. Thanks to all the devs!
Reply
I installed OE6 Beta 3 from clean on a NUC5CPYH (Braswell N3050). Then I used the update .tar in post #2 in this thread to upgrade to the development version. HEVC encoded videos take the CPU immediately close to 100% - just as they did on OE6 Beta 3, where HW decoding support did not exist. Then my fps starts to fall below the video fps.

Debug log: http://sprunge.us/BYQD
Vainfo: http://sprunge.us/YYBg
Dmesg: http://sprunge.us/gYNX

I do seem to get a lot of "[drm:valleyview_update_wm] *ERROR* timed out waiting for Punit DDR DVFS request" errors, whatever they mean. Seem to be DRM related, so maybe not directly related to this problem.

I checked that the configuration matches what is documented in post #1, with the exception that "Adjust Refreshrate to match video: On" does not exist under Acceleration. I did set "Adjust display refresh rate" under Playback to "On start/stop" though.

These test videos run well in Windows 8.1 on the same hardware. Any pointers here?

EDIT: Now that I read more carefully, nowhere does it actually say that HEVC HW decoding is supported in this build yet! Am I right?
Reply
This is somewhat off-topic but it's related to this new development: I'm using HSW hardware (a Celeron-based Chromebox) to run OpenELEC/Kodi. I encoded a concert with resolution [email protected] with Hi10p profile using x264 (avg bitrate around 5Mbps). That means I cannot use hardware decoding for the file.

The pop up menu shows that the CPU usage is around 60-70% for both cores. Still, I get a lot of skips per second unless I choose Bilinear as the scaling method.

Is it that the GPU cannot keep up or is the high CPU usage related? In either case, will the new development help me later on when it gets to the official code or should I forget the Hi10p profile for such material? I tested one similar concert which was encoded with an 8-bit x264 and it didn't have problems being scaled with Lanczos3 optimized to 1080p.
Reply
(2015-07-21, 19:46)trsqr Wrote: EDIT: Now that I read more carefully, nowhere does it actually say that HEVC HW decoding is supported in this build yet! Am I right?

not yet, no

(2015-07-21, 20:59)Boulder Wrote: This is somewhat off-topic but it's related to this new development: I'm using HSW hardware (a Celeron-based Chromebox) to run OpenELEC/Kodi. I encoded a concert with resolution [email protected] with Hi10p profile using x264 (avg bitrate around 5Mbps). That means I cannot use hardware decoding for the file.

The pop up menu shows that the CPU usage is around 60-70% for both cores. Still, I get a lot of skips per second unless I choose Bilinear as the scaling method.

Is it that the GPU cannot keep up or is the high CPU usage related? In either case, will the new development help me later on when it gets to the official code or should I forget the Hi10p profile for such material? I tested one similar concert which was encoded with an 8-bit x264 and it didn't have problems being scaled with Lanczos3 optimized to 1080p.

Hi10P is s/w decoded, so asking the CPU to do that as well as Lanczos3 upscaling is simply too much. Either use bilinear scaling, or re-encode to 8-bit so hw/GPU decoding can be used.
Reply
(2015-07-22, 00:10)Matt Devo Wrote: Hi10P is s/w decoded, so asking the CPU to do that as well as Lanczos3 upscaling is simply too much. Either use bilinear scaling, or re-encode to 8-bit so hw/GPU decoding can be used.

Upscaling is done by the GPU, not the CPU. In theory Lanczos3 (fast) should work with sw decoding too, but there are issues with mesa and EGL that I haven't tracked down yet. The timing is strange and OpenGL calls block where they are not supposed to block.
Reply
(2015-07-22, 07:41)FernetMenta Wrote: Upscaling is done by the GPU, not the CPU. In theory Lanczos3 (fast) should work with sw decoding too, but there are issues with mesa and EGL that I haven't tracked down yet. The timing is strange and OpenGL calls block where they are not supposed to block.

ah, my mistake - was under the impression that software decoding implied CPU scaling as well.
Reply
(2015-07-22, 07:41)FernetMenta Wrote:
(2015-07-22, 00:10)Matt Devo Wrote: Hi10P is s/w decoded, so asking the CPU to do that as well as Lanczos3 upscaling is simply too much. Either use bilinear scaling, or re-encode to 8-bit so hw/GPU decoding can be used.

Upscaling is done by the GPU, not the CPU. In theory Lanczos3 (fast) should work with sw decoding too, but there are issues with mesa and EGL that I haven't tracked down yet. The timing is strange and OpenGL calls block where they are not supposed to block.
My other 10-bit encodes work just fine (ranging from 1280x720 to full HD @ 23.976 or 25 fps), so it seems that the high framerate is what causes the "issue" in the end. So is bilinear the only one that doesn't have any timing problems? It's the first one that works; all the higher ones show the same number of dropped frames.
Reply
(2015-07-22, 10:21)Boulder Wrote: My other 10-bit encodes work just fine (ranging from 1280x720 to full HD @ 23.976 or 25 fps), so it seems that the high framerate is what causes the "issue" in the end. So is bilinear the only one that doesn't have any timing problems? It's the first one that works; all the higher ones show the same number of dropped frames.

I didn't notice the framerate the first time, but yeah you're asking the CPU to decode 150% more data per second than with 24p, which isn't trivial. I've never understood the reasoning behind 10-bit encodes vs higher bitrate 8-bit ones. The difference visually is minimal at best, and the latter is decoded in hardware which makes it easier to deal with.
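The 150% figure is just the frame-rate ratio (quick arithmetic, nothing from the thread itself):

```python
# 59.94p vs 23.976p: the 1/1.001 NTSC factor cancels, leaving exactly 2.5x,
# i.e. 150% more frames to decode per second.
ratio = (60000 / 1001) / (24000 / 1001)
print(f"{(ratio - 1) * 100:.0f}% more frames per second")  # -> 150%
```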
Reply
It's mostly for bitrate savings in my case, with x264 the gain can be up to 10% or so. However, I'll probably switch back to 8-bit encodes for compatibility reasons.
Reply
