linux-snapdragon/include/trace/events
Josef Bacik 5fe46590d0 btrfs: use delalloc_bytes to determine flush amount for shrink_delalloc
commit 03fe78cc2942c55cc13be5ca42578750f17204a1 upstream.

We have been hitting some early ENOSPC issues in production with more
recent kernels, and I tracked it down to us simply not flushing delalloc
as aggressively as we should be.  With tracing I was seeing us failing
all tickets with all of the block rsvs at or around 0, with very little
pinned space, but still around 120MiB of outstanding bytes_may_use.
Upon further investigation I saw that we were flushing around 14 pages
per shrink call for delalloc, despite having around 2GiB of delalloc
outstanding.

Consider the example of an 8-way machine, all CPUs trying to create a
file in parallel, which at the time of this commit requires 5 items to
do.  Assuming a 16k leaf size, we have 10MiB of total metadata reclaim
size waiting on reservations.  Now assume we have 128MiB of delalloc
outstanding.  With our current math we would set items to 20, and then
set to_reclaim to 20 * 256k, or 5MiB.

Assuming that we went through this loop all 3 times, for both
FLUSH_DELALLOC and FLUSH_DELALLOC_WAIT, and then did the full loop
twice, we'd only flush 60MiB of the 128MiB delalloc space.  This could
leave a fair bit of delalloc reservations still hanging around by the
time we go to ENOSPC out all the remaining tickets.
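
The arithmetic in the example above can be checked with a short sketch.
This is illustrative only and is not kernel code.  The function names
are hypothetical, and the per-item reservation of leaf size * 8 levels
* 2 is an assumption chosen because it reproduces the 10MiB figure; the
20 items, 256k per item, 3 loops, 2 flush states and 2 full passes are
taken from the description above.

```python
KIB = 1024
MIB = 1024 * KIB


def metadata_waiting(cpus=8, items_per_create=5, leaf_size=16 * KIB,
                     max_level=8):
    """Metadata bytes waiting on reservations in the example.

    Assumption: each item reserves leaf_size * max_level * 2 bytes.
    """
    return cpus * items_per_create * leaf_size * max_level * 2


def old_flush_total(items=20, bytes_per_item=256 * KIB, loops=3,
                    states=2, full_passes=2):
    """Delalloc flushed under the old math.

    states counts FLUSH_DELALLOC and FLUSH_DELALLOC_WAIT.
    Returns (bytes per shrink call, number of calls, total bytes).
    """
    per_call = items * bytes_per_item
    calls = loops * states * full_passes
    return per_call, calls, per_call * calls


if __name__ == "__main__":
    delalloc = 128 * MIB
    per_call, calls, total = old_flush_total()
    print("metadata waiting:", metadata_waiting() // MIB, "MiB")
    print("flushed per call:", per_call // MIB, "MiB")
    print("total flushed:", total // MIB, "MiB over", calls, "calls")
    print("delalloc left:", (delalloc - total) // MIB, "MiB")
```

Under these assumptions more than half of the 128MiB of delalloc is
still outstanding when the remaining tickets are failed.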

Fix this in two ways.  First, change the calculations to be a fraction
of the total delalloc bytes on the system.  Prior to this change we were
calculating based on dirty inodes so our math made more sense; now it's
just completely unrelated to what we're actually doing.

Second, add a FLUSH_DELALLOC_FULL state that we hold off on until we've
gone through the flush states at least once.  This will empty the system
of all delalloc so we're sure to be truly out of space when we start
failing tickets.

I'm tagging stable 5.10 and forward, because this is where we started
using the page stuff heavily again.  This affects earlier kernel
versions as well, but would be a pain to backport to them as the
flushing mechanisms aren't the same.

CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-18 13:43:18 +02:00
9p.h
afs.h afs: Fix tracepoint string placement with built-in AFS 2021-07-21 15:08:35 +01:00
alarmtimer.h
asoc.h ASoC: soc-core: tidyup jack.h 2020-11-30 12:54:01 +00:00
avc.h selinux: add basic filtering for audit trace events 2020-08-21 17:07:29 -04:00
bcache.h block: remove superfluous param in blk_fill_rwbs() 2021-02-22 06:37:41 -07:00
block.h blktrace: fix blk_rq_merge documentation 2021-02-22 06:37:41 -07:00
bpf_test_run.h
bridge.h
btrfs.h btrfs: use delalloc_bytes to determine flush amount for shrink_delalloc 2021-09-18 13:43:18 +02:00
cachefiles.h
cgroup.h
clk.h clk: Trace clk_set_rate() "range" functions 2020-12-17 01:54:31 -08:00
cma.h mm, tracing: unify PFN format strings 2021-06-29 10:53:52 -07:00
compaction.h
context_tracking.h
cpuhp.h
devfreq.h PM / devfreq: Add tracepoint for frequency changes 2020-10-26 10:52:37 +09:00
devlink.h devlink: Add a tracepoint for trap reports 2020-09-30 18:01:26 -07:00
dma_fence.h treewide: Add missing semicolons to __assign_str uses 2021-06-30 09:19:14 -04:00
erofs.h
error_report.h tracing: add error_report_end trace point 2021-02-26 09:41:02 -08:00
ext4.h ext4: delete some unused tracepoint definitions 2021-04-02 11:38:06 -04:00
f2fs.h f2fs: move ioctl interface definitions to separated file 2020-11-02 08:33:02 -08:00
fib6.h
fib.h
filelock.h locks: Remove extra "0x" in tracepoint format specifier 2020-09-01 18:09:34 -04:00
filemap.h mm, tracing: unify PFN format strings 2021-06-29 10:53:52 -07:00
fs_dax.h
fscache.h
fsi_master_aspeed.h
fsi_master_ast_cf.h
fsi_master_gpio.h
fsi.h
gpio.h
gpu_mem.h
host1x.h
huge_memory.h
hwmon.h
i2c.h
ib_mad.h
ib_umad.h
initcall.h
intel_iommu.h iommu/vt-d: Add prq_report trace event 2021-06-10 09:06:13 +02:00
intel_ish.h
intel-sst.h
io_uring.h io_uring: io_uring_complete() trace should take an integer 2021-09-15 10:02:33 +02:00
iocost.h blk-iocost: Add iocg idle state tracepoint 2020-12-17 07:55:44 -07:00
iommu.h
ipi.h
irq_matrix.h
irq.h
iscsi.h
jbd2.h jbd2,ext4: add a shrinker to release checkpointed buffers 2021-06-24 10:54:49 -04:00
kmem.h mm, tracing: unify PFN format strings 2021-06-29 10:53:52 -07:00
kvm.h KVM: x86/mmu: Drop trace_kvm_age_page() tracepoint 2021-04-17 08:30:56 -04:00
kyber.h block: add queue_to_disk() to get gendisk from request_queue 2021-04-12 06:51:57 -06:00
libata.h
lock.h
mce.h
mdio.h
migrate.h mm/gup: migrate pinned pages out of movable zone 2021-05-05 11:27:26 -07:00
mlxsw.h
mmap_lock.h mm: mmap_lock: add tracepoints around lock acquisition 2020-12-15 12:13:41 -08:00
mmap.h
mmc.h
mmflags.h mmflags.h: add missing __GFP_ZEROTAGS and __GFP_SKIP_KASAN_POISON names 2021-08-20 11:31:42 -07:00
module.h
mptcp.h mptcp: dump csum fields in mptcp_dump_mpext 2021-06-18 11:40:11 -07:00
napi.h
nbd.h
neigh.h
net_probe_common.h
net.h net: use %px to print skb address in trace_netif_receive_skb 2021-07-15 10:28:48 -07:00
netfs.h netfs: Add a tracepoint to log failures that would be otherwise unseen 2021-04-23 10:14:32 +01:00
netlink.h netlink: add tracepoint at NL_SET_ERR_MSG 2021-02-04 18:05:59 -08:00
nilfs2.h
nmi.h
objagg.h
oom.h
osnoise.h tracing: Fix spelling in osnoise tracer "interferences" -> "interference" 2021-06-28 14:12:27 -04:00
page_isolation.h
page_pool.h mm, tracing: unify PFN format strings 2021-06-29 10:53:52 -07:00
page_ref.h
pagemap.h mm, tracing: unify PFN format strings 2021-06-29 10:53:52 -07:00
percpu.h
power_cpu_migrate.h
power.h
preemptirq.h
printk.h
pwc.h
pwm.h
qdisc.h net_sched: introduce tracepoint trace_qdisc_enqueue() 2021-07-15 10:32:38 -07:00
qla.h
qrtr.h
random.h random: remove dead code left over from blocking pool 2021-04-02 18:28:12 +11:00
rcu.h rcu/nocb: Unify timers 2021-05-12 12:10:23 -07:00
rdma_core.h
rdma.h RDMA/core: Move the rdma_show_ib_cm_event() macro 2020-08-24 16:01:47 -03:00
regulator.h
rpcgss.h treewide: Add missing semicolons to __assign_str uses 2021-06-30 09:19:14 -04:00
rpcrdma.h xprtrdma: Move fr_mr field to struct rpcrdma_mr 2021-04-26 09:27:23 -04:00
rpm.h
rseq.h
rtc.h
rxrpc.h rxrpc: Fix a missing NULL-pointer check in a trace 2020-09-14 16:18:59 +01:00
sched.h sched/tracing: Remove the redundant 'success' in the sched tracepoint 2021-06-10 11:16:20 -04:00
scmi.h
scsi.h scsi: core: Kill message byte 2021-05-31 22:48:24 -04:00
sctp.h
signal.h
siox.h
skb.h
smbus.h
sock.h net: sock: add trace for socket errors 2021-06-29 11:28:21 -07:00
spi.h spi: Enable tracing of the SPI setup CS selection 2021-05-26 21:22:13 +01:00
spmi.h
sunrpc.h SUNRPC: Fix a NULL pointer deref in trace_svc_stats_latency() 2021-09-15 10:02:24 +02:00
sunvnet.h
swiotlb.h
syscalls.h
target.h scsi: target: core: Add CONTROL field for trace events 2020-10-02 18:36:19 -04:00
task.h
tcp.h tcp: add tracepoint for checksum errors 2021-05-14 15:26:03 -07:00
tegra_apb_dma.h
thermal_power_allocator.h
thermal.h thermal: devfreq_cooling: change tracing function and arguments 2020-12-11 14:10:44 +01:00
thp.h
timer.h tracing: Fix various typos in comments 2021-03-23 14:08:18 -04:00
tlb.h
udp.h
ufs.h scsi: ufs: core: Enable power management for wlun 2021-05-10 22:28:20 -04:00
v4l2.h
vb2.h
vmscan.h include/trace/events/vmscan.h: remove mm_vmscan_inactive_list_is_low 2021-06-30 20:47:28 -07:00
vsock_virtio_transport_common.h virtio/vsock: update trace event for SEQPACKET 2021-06-11 13:32:47 -07:00
wbt.h
workqueue.h workqueue/tracing: Copy workqueue name to buffer in trace event 2021-03-18 12:57:37 -04:00
writeback.h trace: replace WB_REASON_FOREIGN_FLUSH with a string 2021-06-10 11:16:20 -04:00
xdp.h xdp: Extend xdp_redirect_map with broadcast support 2021-05-26 09:46:16 +02:00
xen.h x86/mm/tlb: Flush remote and local TLBs concurrently 2021-03-06 12:59:10 +01:00