HDMI audio stops working, vg heap issue?

Hi all

I have a number of Pi 3’s running Ubuntu MATE 16.04.1 they’re used as status monitors and just have firefox running in full screen, displaying jenkins, zabbix etc…

Everything works as expected. But after a certain amount of time (I’m still trying to find a pattern here, but more than a day), the audio alerts stop working.

If I reboot the Pi, it all works again.

So things I’ve found.
When it works, the tvservice cmd returns this :-

$ tvservice -a PCM supported: Max channels: 2, Max samplerate: 48kHz, Max samplesize 24 bits.

When it stops working, all tvservice cmds hang on /dev/vchiq

open("/dev/vchiq", O_RDWR|O_LARGEFILE)  = 3
ioctl(3, _IOC(_IOC_READ|_IOC_WRITE, 0xc4, 0x08, 0x10), 0x7eee7360) = -1 EAGAIN (Resource temporarily unavailable)

The other odd thing, is when it stops working, omxplayer continues to work and will play sounds.

Also, and this is where I give up. If I clear the cache and have a look

vcgencmd cache_flush && vcdbg reloc

Relocatable heap version 4 found at 0x3b000000
total space allocated is 44M, with 44M relocatable, 0 legacy and 0 offline
0 legacy blocks of size 2359296

free list at 0x3d401920
36M free memory in 2 free block(s)
largest free block is 36M bytes

0x3b000000: free 36M
[   7] 0x3d401480: used  576 (refcount 1 lock count 0, size      512, align    4, data 0x3d4014a0, d0rual) 'ILCS VC buffer pool'
0x3d4016c0: free 640
[  67] 0x3d401940: used 1.1K (refcount 1 lock count 0, size     1024, align    4, data 0x3d401960, d0rual) 'resample coeff table data'
[  63] 0x3d401d80: used   96 (refcount 1 lock count 0, size       12, align    4, data 0x3d401da0, d0rual) 'resample coeff table'
[   4] 0x3d401de0: used  416 (refcount 1 lock count 0, size      340, align    4, data 0x3d401e00, d0rual) 'resample context'
[   3] 0x3d401f80: used 8.0M (refcount 1 lock count 8, size  8355840, align 4096, data 0x3d402000, d1rual) 'ARM FB'
[   2] 0x3dbfafa0: used  16K (refcount 1 lock count 0, size    16384, align   32, data 0x3dbfafc0, d0ruAl) 'audioplus_tmp_buf'
[   1] 0x3dbfefe0: used 4.0K (refcount 1 lock count 0, size        0, align 4096, data 0x3dbff000, d1rual) 'camera fast alloc arena'
small allocs not requested

It all looks good, but then I try and play a test sound

aplay /usr/share/sounds/alsa/Front_Center.wav

The cache looks broken?

vcdbg reloc

Relocatable heap version 4 found at 0x3b000000
total space allocated is 44M, with 44M relocatable, 0 legacy and 0 offline
0 legacy blocks of size 2359296

free list at 0x3d401780
invalid space 0x0 in free list at 0x3d3fc720
20K free memory in 2 free block(s)
largest free block is 19K bytes

0x3b000000: free 36M
0x3d3e30e0: corrupt entry (space 0x0)
resynced at 0x3d3ec160 (skipped 36992 bytes)
[  20] 0x3d3ec160: used 1.2K (refcount 1 lock count 0, size     1168, align   32, data 0x3d3ec180, d0rual) 'audio_server'
[  43] 0x3d3ec640: used  23K (refcount 1 lock count 0, size    23796, align   32, data 0x3d3ec660, d0rual) 'audio_server'
0x3d3f2380: free 27K
0x3d3f8e40: corrupt entry (space 0x2f202f2)
resynced at 0x3d3fc740 (skipped 14592 bytes)
[  30] 0x3d3fc740: used  19K (refcount 1 lock count 0, size    19708, align   32, data 0x3d3fc760, d0rual) 'audio_server'
[   7] 0x3d401480: used  576 (refcount 1 lock count 0, size      512, align    4, data 0x3d4014a0, d0rual) 'ILCS VC buffer pool'
0x3d4016c0: free 640
[  67] 0x3d401940: used 1.1K (refcount 2 lock count 0, size     1024, align    4, data 0x3d401960, d0rual) 'resample coeff table data'
[  63] 0x3d401d80: used   96 (refcount 1 lock count 0, size       12, align    4, data 0x3d401da0, d0rual) 'resample coeff table'
[   4] 0x3d401de0: used  416 (refcount 1 lock count 0, size      340, align    4, data 0x3d401e00, d0rual) 'resample context'
[   3] 0x3d401f80: used 8.0M (refcount 1 lock count 8, size  8355840, align 4096, data 0x3d402000, d1rual) 'ARM FB'
[   2] 0x3dbfafa0: used  16K (refcount 1 lock count 0, size    16384, align   32, data 0x3dbfafc0, d0ruAl) 'audioplus_tmp_buf'
[   1] 0x3dbfefe0: used 4.0K (refcount 1 lock count 0, size        0, align 4096, data 0x3dbff000, d1rual) 'camera fast alloc arena'
heap corruption detected
small allocs not requested

On a working box, it doesn’t do that, but I dont know if its red herring?

One other thing, on a working Pi, when you re-plug the HDMI cable, you get this :-

1261711.810: hdmi: HDMI:>>>>>>>>>>>>>Rx sensed, reading EDID<<<<<<<<<<<<<
1261723.882: hdmi: HDMI:EDID version 1.3, 1 extensions, screen size 71x40 cm
1261723.953: hdmi: HDMI:EDID features - videodef 0x80 !standby !suspend active off; colour encoding:RGB444|YCbCr422; sRGB is not default colourspace; preferred format is native; does not support GTF
1261724.073: hdmi: HDMI:EDID failed to find a matching detail format for 2560x1440p hfp:48 hs:32 hbp:80 vfp:3 vs:5 vbp:33 pixel clock:241 MHz
1261724.099: hdmi: HDMI:EDID calculated refresh rate is 60 Hz
1261724.129: hdmi: HDMI:EDID guessing the format to be 2560x1440p @60 Hz
1261724.211: hdmi: HDMI:EDID found unknown detail timing format: 2560x1440p hfp:48 hs:32 hbp:80 vfp:3 vs:5 vbp:33 pixel clock:241 MHz
1261724.262: hdmi: HDMI:EDID found DMT format: code 4, 640x480p @ 60 Hz in established timing I/II
1261724.308: hdmi: HDMI:EDID found DMT format: code 5, 640x480p @ 72 Hz in established timing I/II
1261724.352: hdmi: HDMI:EDID found DMT format: code 6, 640x480p @ 75 Hz in established timing I/II
1261724.397: hdmi: HDMI:EDID found DMT format: code 8, 800x600p @ 56 Hz in established timing I/II
1261724.442: hdmi: HDMI:EDID found DMT format: code 9, 800x600p @ 60 Hz in established timing I/II
1261724.488: hdmi: HDMI:EDID found DMT format: code 10, 800x600p @ 72 Hz in established timing I/II
1261724.533: hdmi: HDMI:EDID found DMT format: code 11, 800x600p @ 75 Hz in established timing I/II
1261724.578: hdmi: HDMI:EDID found DMT format: code 16, 1024x768p @ 60 Hz in established timing I/II
1261724.622: hdmi: HDMI:EDID found DMT format: code 17, 1024x768p @ 70 Hz in established timing I/II
1261724.669: hdmi: HDMI:EDID found DMT format: code 18, 1024x768p @ 75 Hz in established timing I/II
1261724.717: hdmi: HDMI:EDID found DMT format: code 36, 1280x1024p @ 75 Hz in established timing I/II
1261724.810: hdmi: HDMI:EDID standard timings block x 8: 0xD1C0 B300 9500 8180 8140 81C0 0101 0101 
1261724.872: hdmi: HDMI:EDID found DMT format: code 82, 1920x1080p @ 60 Hz (16:9) in standard timing 0
1261724.932: hdmi: HDMI:EDID found DMT format: code 58, 1680x1050p @ 60 Hz (16:10) in standard timing 1
1261724.989: hdmi: HDMI:EDID found DMT format: code 47, 1440x900p @ 60 Hz (16:10) in standard timing 2
1261725.044: hdmi: HDMI:EDID found DMT format: code 35, 1280x1024p @ 60 Hz (5:4) in standard timing 3
1261725.102: hdmi: HDMI:EDID found DMT format: code 32, 1280x960p @ 60 Hz (4:3) in standard timing 4
1261725.161: hdmi: HDMI:EDID found DMT format: code 85, 1280x720p @ 60 Hz (16:9) in standard timing 5
1261738.177: hdmi: HDMI:EDID parsing v3 CEA extension 0
1261738.221: hdmi: HDMI:EDID monitor support - underscan IT formats:yes, basic audio:yes, yuv444:yes, yuv422:yes, #native DTD:1
1261738.275: hdmi: HDMI:EDID found preferred CEA detail timing format: 1920x1080p @ 60 Hz (16)
1261738.322: hdmi: HDMI:EDID found CEA detail timing format: 1280x720p @ 60 Hz (4)
1261738.371: hdmi: HDMI:EDID found CEA detail timing format: 720x480p @ 60 Hz (2)
1261738.420: hdmi: HDMI:EDID found CEA detail timing format: 720x576p @ 50 Hz (17)
1261738.513: hdmi: HDMI:EDID failed to find a matching detail format for 1280x1440p hfp:96 hs:136 hbp:232 vfp:3 vs:10 vbp:40 pixel clock:156 MHz
1261738.541: hdmi: HDMI:EDID calculated refresh rate is 60 Hz
1261738.573: hdmi: HDMI:EDID guessing the format to be 1280x1440p @60 Hz
1261738.658: hdmi: HDMI:EDID found unknown detail timing format: 1280x1440p hfp:96 hs:136 hbp:232 vfp:3 vs:10 vbp:40 pixel clock:156 MHz
1261738.702: hdmi: HDMI:EDID found CEA format: code 16, 1920x1080p @ 60Hz 
1261738.739: hdmi: HDMI:EDID found CEA format: code 31, 1920x1080p @ 50Hz 
1261738.779: hdmi: HDMI:EDID found CEA format: code 5, 1920x1080i @ 60Hz 
1261738.816: hdmi: HDMI:EDID found CEA format: code 20, 1920x1080i @ 50Hz 
1261738.854: hdmi: HDMI:EDID found CEA format: code 4, 1280x720p @ 60Hz 
1261738.892: hdmi: HDMI:EDID found CEA format: code 19, 1280x720p @ 50Hz 
1261738.929: hdmi: HDMI:EDID found CEA format: code 3, 720x480p @ 60Hz 
1261738.965: hdmi: HDMI:EDID found CEA format: code 18, 720x576p @ 50Hz 
1261739.003: hdmi: HDMI:EDID found CEA format: code 2, 720x480p @ 60Hz 
1261739.045: hdmi: HDMI:EDID found CEA format: code 17, 720x576p @ 50Hz 
1261739.082: hdmi: HDMI:EDID found CEA format: code 1, 640x480p @ 60Hz 
1261739.140: hdmi: HDMI:EDID found audio format 2 channels PCM, sample rate: 32|44|48 kHz, sample size: 16|20|24 bits
1261739.164: hdmi: HDMI:EDID found HDMI VSDB length 7
1261739.193: hdmi: HDMI:EDID HDMI VSDB has physical address 1.0.0.0
1261739.216: hdmi: HDMI:EDID HDMI VSDB supports AI:no, dual link DVI:no
1261739.253: hdmi: HDMI:EDID HDMI VSDB deep colour support - 48-bit:no 36-bit:yes 30-bit:yes DC_yuv444:yes
1261739.276: hdmi: HDMI:EDID HDMI VSDB max TMDS clock 300 MHz
1261739.299: hdmi: HDMI:EDID HDMI VSDB has no latency information
1261739.345: hdmi: HDMI:EDID filtering formats with pixel clock > 162 MHz or h. blanking > 1023
1261739.572: hdmi: HDMI:EDID preferred mode remained as CEA (16) 1920x1080p @ 60 Hz with pixel clock 148 MHz
1261739.600: hdmi: HDMI: hotplug attached with HDMI support
1261739.650: TV service:host side not connected, dropping notification 0x00000002, 0x00000001, 0x00000010
1262260.407: clock: Set PLLB_VCO to 2400000000
1262260.484: clock: clock_set_clock:  dst: CLK_ARM, source: PLLB_ARM, div: 1.000000
1262260.570: clock: clock_set_clock:  dst: CLK_EMMC, source: PLLC_PER, div: 4.800000
1262260.616: clock: Set PLLC_VCO to 2400000000
1262260.689: clock: clock_set_clock:  dst: CLK_VPU, source: PLLC_CORE0, div: 3.000000
--------- SNIP ------------
--- continuous clock messages 
---- SNIP ------------

On the Pi with the non-working audio only this happens :-

3736656.709: hdmi: HDMI:>>>>>>>>>>>>>Rx sensed, reading EDID<<<<<<<<<<<<<
3736666.359: clock: Set PLLB_VCO to 2400000000
3736666.435: clock: clock_set_clock:  dst: CLK_ARM, source: PLLB_ARM, div: 1.000000
3736666.522: clock: clock_set_clock:  dst: CLK_EMMC, source: PLLC_PER, div: 4.800000
2821951.730: clock: clock_set_clock:  dst: CLK_PWM, source: XOSC, div: 0.000000
2821951.767: clock: clock_set_clock:  dst: CLK_PWM, source: PLLD_PER, div: 5.000000

Once again, a reboot fixes all of these issues.

Anything else I can try? Have I done something daft?

Many thanks for any help

Lee