Cache disk shown with question mark after a while

Emo

Cadet
Joined
Sep 26, 2013
Messages
5
Hi,

I have a test setup of TrueNAS SCALE 23.10 running as a Proxmox VM (4 CPU threads / 8 GB RAM). I have one pool with a single 8gb WD Red disk, to which I plan to add a second one (the dashboard warns that the pool is a stripe with only 1 disk). I use the pool as an SMB share for photos/movies and as an iSCSI dataset shared to Proxmox, with the idea of using it for the disks of some test Linux/Windows VMs.

I had a 256 GB SSD lying around and wanted to add it as a cache disk to see whether it would speed up the VM disks stored on the iSCSI dataset. The disk was added successfully and operates OK for 2-3 days. After that it appears with a question mark. I am not sure which logs to check, or whether this is a limitation/issue with the SSD or whether the cache requires specific disk parameters. I can successfully remove the cache SSD and re-add it to the pool, but after a while the same thing happens.

Hope someone can explain why I am seeing the error. I suspect I am not using the system in the recommended way, but it is still strange that the issue appears only after a few days of operation.

Thanks.
Merry Christmas to all!
 

Attachments

  • truenas1.png (290.4 KB)
  • truenas2.png (26.4 KB)
  • truenas3.png (387 KB)

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
There are several issues:
  • Did you pass through the disk controller to the VM?
    Are both disks on this passed through disk controller?
    If not, ZFS won't be stable... though that does not explain the lack of L2ARC / Cache disk info
  • With only 8GBytes of RAM, it is HIGHLY recommended NOT to use a L2ARC / Cache disk. This is because the entries in the L2ARC / Cache disk take up RAM for its directory / index entries, thus reducing the limited RAM available for the ARC, which is the primary cache.
If you can give us the output of zpool status in code tags from the Unix shell, that would be helpful. If possible, both when the "?" is happening and immediately after removing and re-adding it, just so we can compare the two.
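To put a rough number on the RAM cost of the L2ARC index described above, here is a minimal Python sketch. It is only a back-of-envelope estimate under stated assumptions: the function name is made up for illustration, the 96 bytes per cached block is an assumed header size (the real figure depends on the OpenZFS version), and the block sizes are examples rather than values read from this pool.

```python
def l2arc_header_ram_bytes(cache_bytes, avg_block_bytes, header_bytes=96):
    """Estimate RAM used by in-memory headers indexing a completely full L2ARC.

    header_bytes is an ASSUMED per-block header size, not a measured value.
    """
    if avg_block_bytes <= 0:
        raise ValueError("avg_block_bytes must be positive")
    blocks = cache_bytes // avg_block_bytes  # number of blocks that fit on the cache disk
    return blocks * header_bytes


GIB = 1024 ** 3
cache = 256 * 10**9  # the 256 GB SSD from the first post

# Smaller blocks (typical for zvol / iSCSI use) mean more headers for the same SSD.
for block_kib in (16, 128):
    ram = l2arc_header_ram_bytes(cache, block_kib * 1024)
    print(f"{block_kib:>3} KiB blocks: ~{ram / GIB:.2f} GiB of RAM for headers")
```

The point of the sketch is the scaling: the smaller the average block, the more of the 8 GB is spent indexing the cache disk instead of serving as ARC.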
 

Emo

Cadet
Joined
Sep 26, 2013
Messages
5
Hi Arwen,

Yes, I have the SATA controller working with passthrough. Only the boot pool is a Proxmox virtual disk.

Here is the output when the SSD cache seems to work:
Code:
  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:00:06 with 0 errors on Tue Dec 19 03:45:08 2023
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          sde3      ONLINE       0     0     0

errors: No known data errors

  pool: nas_pool
 state: ONLINE
  scan: scrub repaired 0B in 04:01:56 with 0 errors on Sun Nov  5 04:02:05 2023
config:

        NAME                                    STATE     READ WRITE CKSUM
        nas_pool                                ONLINE       0     0     0
          3e11c8dc-2653-4f36-8b2e-f401fbfea2ea  ONLINE       0     0     0
        cache
          sdc1                                  ONLINE       0     0     0

errors: No known data errors
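To make it easier to compare this output with the one taken once the question mark shows up, the cache device states could be pulled out with a small script. This is only a sketch: the function name cache_device_states is made up for illustration, and it assumes the layout shown above (a "cache" line followed by more deeply indented device lines). The sample text is a shortened copy of the nas_pool output.

```python
def cache_device_states(status_text):
    """Return {(pool, device): state} for devices listed under 'cache' in zpool status text."""
    result = {}
    pool = None
    cache_indent = None  # indentation of the current 'cache' line, if inside one
    for line in status_text.splitlines():
        stripped = line.strip()
        indent = len(line) - len(line.lstrip())
        if stripped.startswith("pool:"):
            pool = stripped.split(":", 1)[1].strip()
            cache_indent = None
        elif stripped == "cache":
            cache_indent = indent
        elif cache_indent is not None:
            if not stripped or indent <= cache_indent:
                cache_indent = None  # left the cache section
            else:
                fields = stripped.split()
                if len(fields) >= 2:
                    result[(pool, fields[0])] = fields[1]
    return result


SAMPLE = """\
  pool: nas_pool
 state: ONLINE
config:

        NAME                                    STATE     READ WRITE CKSUM
        nas_pool                                ONLINE       0     0     0
          3e11c8dc-2653-4f36-8b2e-f401fbfea2ea  ONLINE       0     0     0
        cache
          sdc1                                  ONLINE       0     0     0

errors: No known data errors
"""

print(cache_device_states(SAMPLE))
```

Saving the zpool status text now and again when the problem appears, then running both through the same function, should show whether the state of sdc1 changes.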


Code:
 top -o RES
top - 17:59:11 up 12 min,  1 user,  load average: 1.15, 1.28, 0.97
Tasks: 387 total,   1 running, 386 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.3 sy,  0.0 ni, 97.3 id,  2.3 wa,  0.0 hi,  0.1 si,  0.0 st
MiB Mem :   7940.9 total,   5647.3 free,   2362.2 used,    191.5 buff/cache
MiB Swap:   2045.0 total,   2045.0 free,      0.0 used.   5578.7 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    944 root      20   0 2657704 304420  31356 S   0.0   3.7   0:10.72 asyncio_loop
   7468 root      20   0  676468 213568  33652 S   0.0   2.6   0:02.51 middlewared (wo
   7937 root      20   0  676340 211976  33704 S   0.0   2.6   0:02.43 middlewared (wo
   7922 root      20   0  676456 211740  33476 S   0.0   2.6   0:02.35 middlewared (wo
   7478 root      20   0  676452 209848  33604 S   0.0   2.6   0:02.35 middlewared (wo
   9160 root      20   0  601760 208316  33084 S   0.0   2.6   0:02.40 middlewared (wo
   3186 root      20   0  364616  64488  10052 S   0.0   0.8   0:00.96 cli
    995 root      20   0   73328  50296  12140 S   0.0   0.6   0:00.27 python3
   2854 netdata   39  19  326868  39316   7172 S   0.0   0.5   0:01.12 netdata
   4038 netdata   39  19   56572  31584  10188 S   0.0   0.4   0:00.34 python.d.plugin
...


I will have to wait a few days to check the status again when the question mark appears.

Thanks.