Context:
Linux hostname 6.8.0-45-generic #45~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Sep 11 15:25:05
Description: Ubuntu 22.04.5 LTS (from 'lsb_release -a')
Observations and Comments:
I decided to check whether using the "--bwlimit=95232" option for my "rsync" backups was actually giving me any benefit.
Note that I had no other user processes running except the two MATE terminals I had open: one running the backup script, the other used for probing and analyzing.
My script runs rsync on a differential transfer involving roughly 120 GB of data.
root:~# ps -ef
root 354188 1 0 18:33 pts/0 00:00:00 /bin/sh /site/DB005_F7/Z_backup.DB001_F7.DateSize.batch
root 354192 354188 3 18:33 pts/0 00:03:51 rsync --bwlimit=95232 --one-file-system --recursive --outbuf=Line --
root 354193 354192 0 18:33 pts/0 00:00:08 rsync --bwlimit=95232 --one-file-system --recursive --outbuf=Line --
root 354194 354193 19 18:33 pts/0 00:23:30 rsync --bwlimit=95232 --one-file-system --recursive --outbuf=Line --
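To cross-check whether the limit is actually being honored, the effective I/O rate of those processes can be sampled from the second terminal. A minimal sketch, assuming the sysstat package is installed and using the PIDs from the ps listing above:
# Per-process read/write rates, sampled every 10 seconds (sysstat package).
pidstat -d -p 354192,354193,354194 10
# Whole-system view of device utilization at the same interval:
iostat -x 10
pidstat reports kB_rd/s and kB_wr/s per PID, which can be compared directly against the 95,232 KiB/s ceiling.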
I expected that with "--bwlimit" and 4 GB of RAM, I would see only minimal swap usage. Instead, I got this (about 720 MB of swap in use):
root:~# swapon
NAME TYPE SIZE USED PRIO
/dev/sda11 partition 996.2M 240.4M 99
/dev/sda10 partition 996.2M 239.4M 99
/dev/sdb2 partition 2G 241.2M 99
root:~# sync
root:~# swapon
NAME TYPE SIZE USED PRIO
/dev/sda11 partition 996.2M 240.1M 99
/dev/sda10 partition 996.2M 239.1M 99
/dev/sdb2 partition 2G 240.9M 99
root:~#
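To see whether it is rsync itself that was pushed out to swap (as opposed to other resident processes), the kernel's per-process VmSwap figure can be checked. A minimal sketch using the PIDs from the ps listing above:
# How much of each rsync process currently resides in swap:
for pid in 354192 354193 354194; do
    printf '%s: ' "$pid"
    grep VmSwap /proc/$pid/status
done
The top output below already hints at the result: the rsync RES figures are tiny (2-3 MiB).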
When I checked with top to see whether anything else would pop up, I didn't see anything out of place:
root@OasisMega1:~# top -d 10
top - 20:36:24 up 9:17, 1 user, load average: 3.75, 4.53, 4.39
Tasks: 282 total, 1 running, 281 sleeping, 0 stopped, 0 zombie
%Cpu0 : 1.2 us, 7.7 sy, 0.0 ni, 61.9 id, 28.8 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu1 : 1.0 us, 10.5 sy, 0.0 ni, 65.1 id, 23.4 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 1.6 us, 3.9 sy, 0.0 ni, 61.9 id, 32.6 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 1.6 us, 8.0 sy, 0.0 ni, 23.2 id, 67.1 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 3663.5 total, 603.1 free, 609.8 used, 2450.6 buff/cache
MiB Swap: 4040.4 total, 3321.1 free, 719.3 used. 1721.5 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
354194 root 18 -2 74.1m 2.3m 1.0m D 17.1 0.1 23:59.42 rsync --bwlimit=95232 --outbuf=Line
47 root 20 0 0.0m 0.0m 0.0m S 6.3 0.0 2:49.60 [kcompactd0]
354192 root 18 -2 18.2m 3.0m 2.5m S 6.1 0.1 3:58.06 rsync --bwlimit=95232 --outbuf=Line
7195 ericthe+ 20 0 338.2m 10.8m 8.0m S 2.0 0.3 11:09.33 /usr/lib/mate-applets/mate-multiload-applet
6296 root 20 0 573.1m 23.3m 14.1m S 1.7 0.6 11:02.42 /usr/lib/xorg/Xorg -core :0 -seat seat0 -auth /var/run/lightdm/root/+
8537 root 20 0 0.0m 0.0m 0.0m D 1.5 0.0 2:29.62 [usb-storage]
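Most of the 2450.6 MiB shown as buff/cache above is presumably page cache created by the transfer itself, and the high wa (I/O-wait) percentages fit slow USB writeback. The amount of that cache still dirty (not yet written to the target) can be watched directly; a minimal sketch:
# Watch dirty and in-flight writeback totals while the backup runs:
watch -n 10 'grep -E "^(Dirty|Writeback):" /proc/meminfo'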
My backup process lasted about 2 hours, given that I was writing to an external 4 TB USB hard drive on a USB 2.0 port:
Using previously determined bandwidth limit for rsync buffer setting ...
Will apply parameter to limit flooding of I/O, memory and swap ==>> --bwlimit=95232
Thu 26 Sep 2024 06:33:21 PM EDT |rsync| Start DB001_F7 ...
Background 'rsync' working ...
Expected Log files:
/site/DB005_F7/Z_backup.DB001_F7.DateSize.out
/site/DB005_F7/Z_backup.DB001_F7.DateSize.err
Use 'OS_Admin__partitionMirror_Monitor.sh' to monitor rsync process.
Imported LIBRARY: INCLUDES__TerminalEscape_SGR.bh ...
Thu 26 Sep 2024 06:33:32 PM EDT
PID 354194 is RSYNC child process ...
PID 354193 is RSYNC child process ...
PID 354192 is RSYNC MASTER process ...
RSYNC backup process under way ...
root 354192 354188 10 18:33 pts/0 00:00:01 rsync
--bwlimit=95232
--one-file-system
--recursive
--outbuf=Line
--links
--perms
--times
--group
--owner
--devices
--specials
--verbose
--out-format=%t|%i|%M|%b|%f|
--delete-during
--whole-file
--human-readable
--protect-args
--ignore-errors
--msgs2stderr ./ /site/DB005_F7/DB001_F7/
Scanning at 10 second intervals ...
.............................. 5 min
.............................. 10 min
.............................. 15 min
.............................. 125 min
.............................
RSYNC process (# 354192) has completed.
Thu 26 Sep 2024 08:45:37 PM EDT
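A rough sanity check on those timestamps (a back-of-the-envelope computation, not taken from the log itself): 18:33:21 to 20:45:37 is about 7,936 seconds, so 120 GB / 7,936 s ≈ 15 MB/s. That is well under both the --bwlimit ceiling of 95,232 KiB/s (≈ 93 MiB/s) and the roughly 40 MB/s practical throughput of USB 2.0, which suggests the USB link rather than --bwlimit was pacing the transfer.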
Question #1:
Can anyone explain why rsync used so much memory that it overflowed into swap, when I had constrained its I/O with "--bwlimit"?
Question #2:
I expected the manual "sync" to flush from RAM and swap anything already duplicated on disk. The total RAM and swap usage was much larger than I expected given the "--bwlimit" setting, so why was there no reduction in RAM/swap usage?
Question #3:
Is there a command I could issue periodically to force the flushing of retained "dirty" pages from RAM?
I saw a suggestion somewhere that the following would flush them, but it had no apparent effect when I entered it:
sync; echo 1 > /proc/sys/vm/drop_caches
and likewise for:
sync; echo 3 > /proc/sys/vm/drop_caches
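One thing worth noting from the kernel documentation: drop_caches only discards clean (already written-back) pages, never dirty ones, which is why sync comes first, and it does not touch swap at all. A periodic variant, as a rough sketch (the 5-minute interval is an arbitrary choice):
# Periodically force writeback, then drop the now-clean page cache.
# drop_caches ignores dirty pages, hence the preceding sync.
while true; do
    sync
    echo 1 > /proc/sys/vm/drop_caches
    sleep 300   # arbitrary 5-minute interval
done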
My current kernel vm.* parameters:
vm.admin_reserve_kbytes = 8192
vm.compact_unevictable_allowed = 1
vm.compaction_proactiveness = 20
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 40
vm.dirty_writeback_centisecs = 500
vm.dirtytime_expire_seconds = 10000
vm.extfrag_threshold = 500
vm.hugetlb_optimize_vmemmap = 0
vm.hugetlb_shm_group = 0
vm.laptop_mode = 0
vm.legacy_va_layout = 0
vm.lowmem_reserve_ratio = 256 256 32 0 0
vm.max_map_count = 65530
vm.memfd_noexec = 0
vm.memory_failure_early_kill = 0
vm.memory_failure_recovery = 1
vm.min_free_kbytes = 67584
vm.min_slab_ratio = 5
vm.min_unmapped_ratio = 1
vm.mmap_min_addr = 65536
vm.mmap_rnd_bits = 32
vm.mmap_rnd_compat_bits = 16
vm.nr_hugepages = 0
vm.nr_hugepages_mempolicy = 0
vm.nr_overcommit_hugepages = 0
vm.numa_stat = 1
vm.numa_zonelist_order = Node
vm.oom_dump_tasks = 1
vm.oom_kill_allocating_task = 0
vm.overcommit_kbytes = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 0
vm.page-cluster = 4
vm.page_lock_unfairness = 5
vm.panic_on_oom = 0
vm.percpu_pagelist_high_fraction = 0
vm.stat_interval = 1
vm.swappiness = 20
vm.unprivileged_userfaultfd = 0
vm.user_reserve_kbytes = 90551
vm.vfs_cache_pressure = 50
vm.watermark_boost_factor = 15000
vm.watermark_scale_factor = 1000
vm.zone_reclaim_mode = 0
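For reference, with vm.dirty_ratio = 40 on this 3663.5 MiB machine, roughly 1.4 GiB of dirty data can accumulate before writers are throttled, and background writeback only starts near 366 MiB (vm.dirty_background_ratio = 10). To keep a slow USB target from letting dirty cache pile up, one commonly suggested adjustment is to cap these in absolute bytes instead; a sketch, with values that are illustrative assumptions rather than tested recommendations:
# Setting the *_bytes variants makes the kernel ignore the *_ratio ones.
sysctl -w vm.dirty_background_bytes=$((16 * 1024 * 1024))  # start writeback at 16 MiB
sysctl -w vm.dirty_bytes=$((64 * 1024 * 1024))             # throttle writers at 64 MiB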