mirror of
https://github.com/edk2-porting/linux-next.git
synced 2024-12-22 20:23:57 +08:00
241994ed86
Introduce the basic control files to account, partition, and limit
memory using cgroups in default hierarchy mode.

This interface versioning allows us to address fundamental design
issues in the existing memory cgroup interface, further explained
below.  The old interface will be maintained indefinitely, but a
clearer model and improved workload performance should encourage
existing users to switch over to the new one eventually.

The control files are thus:

  - memory.current shows the current consumption of the cgroup and its
    descendants, in bytes.

  - memory.low configures the lower end of the cgroup's expected
    memory consumption range.  The kernel considers memory below that
    boundary to be a reserve - the minimum that the workload needs in
    order to make forward progress - and generally avoids reclaiming
    it, unless there is an imminent risk of entering an OOM situation.

  - memory.high configures the upper end of the cgroup's expected
    memory consumption range.  A cgroup whose consumption grows beyond
    this threshold is forced into direct reclaim, to work off the
    excess and to throttle new allocations heavily, but is generally
    allowed to continue and the OOM killer is not invoked.

  - memory.max configures the hard maximum amount of memory that the
    cgroup is allowed to consume before the OOM killer is invoked.

  - memory.events shows event counters that indicate how often the
    cgroup was reclaimed while below memory.low, how often it was
    forced to reclaim excess beyond memory.high, how often it hit
    memory.max, and how often it entered OOM due to memory.max.  This
    allows users to identify configuration problems when observing a
    degradation in workload performance.  An overcommitted system will
    have an increased rate of low boundary breaches, whereas increased
    rates of high limit breaches, maximum hits, or even OOM situations
    will indicate internally overcommitted cgroups.
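As an illustration of how the memory.events counters might be consumed from userspace, here is a minimal C sketch. It assumes the file content is a flat list of "<name> <count>" lines using the names low, high, max, and oom. That layout, the struct memcg_event_counts type, and the parse_memory_events() helper are assumptions made for this example; the text above does not specify the file format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the four counters described above. */
struct memcg_event_counts {
	unsigned long low;
	unsigned long high;
	unsigned long max;
	unsigned long oom;
};

/*
 * Parse text in the assumed "<name> <count>\n" layout.
 * Unknown names are skipped.  Returns the number of known names found.
 */
static int parse_memory_events(const char *text, struct memcg_event_counts *out)
{
	char key[32];
	unsigned long val;
	int consumed, found = 0;

	assert(text && out);
	memset(out, 0, sizeof(*out));

	/* %n records how many characters one "<name> <count>" pair used. */
	while (sscanf(text, " %31s %lu%n", key, &val, &consumed) == 2) {
		if (!strcmp(key, "low")) {
			out->low = val;
			found++;
		} else if (!strcmp(key, "high")) {
			out->high = val;
			found++;
		} else if (!strcmp(key, "max")) {
			out->max = val;
			found++;
		} else if (!strcmp(key, "oom")) {
			out->oom = val;
			found++;
		}
		text += consumed;
	}
	return found;
}
```

Sampling the counters twice and comparing the deltas would show which kind of event is rising: low breaches point at an overcommitted system, while high, max, or oom events point at an internally overcommitted cgroup.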
For existing users of memory cgroups, the following deviations from
the current interface are worth pointing out and explaining:

  - The original lower boundary, the soft limit, is defined as a limit
    that is per default unset.  As a result, the set of cgroups that
    global reclaim prefers is opt-in, rather than opt-out.  The costs
    for optimizing these mostly negative lookups are so high that the
    implementation, despite its enormous size, does not even provide
    the basic desirable behavior.  First off, the soft limit has no
    hierarchical meaning.  All configured groups are organized in a
    global rbtree and treated like equal peers, regardless of where
    they are located in the hierarchy.  This makes subtree delegation
    impossible.  Second, the soft limit reclaim pass is so aggressive
    that it not only introduces high allocation latencies into the
    system, but also impacts system performance due to overreclaim, to
    the point where the feature becomes self-defeating.

    The memory.low boundary on the other hand is a top-down allocated
    reserve.  A cgroup enjoys reclaim protection when it and all its
    ancestors are below their low boundaries, which makes delegation
    of subtrees possible.  Secondly, new cgroups have no reserve per
    default and in the common case most cgroups are eligible for the
    preferred reclaim pass.  This allows the new low boundary to be
    efficiently implemented with just a minor addition to the generic
    reclaim code, without the need for out-of-band data structures and
    reclaim passes.  Because the generic reclaim code considers all
    cgroups except for the ones running low in the preferred first
    reclaim pass, overreclaim of individual groups is eliminated as
    well, resulting in much better overall workload performance.

  - The original high boundary, the hard limit, is defined as a strict
    limit that can not budge, even if the OOM killer has to be called.
    But this generally goes against the goal of making the most out of
    the available memory.  The memory consumption of workloads varies
    during runtime, and that requires users to overcommit.  But doing
    that with a strict upper limit requires either a fairly accurate
    prediction of the working set size or adding slack to the limit.
    Since working set size estimation is hard and error prone, and
    getting it wrong results in OOM kills, most users tend to err on
    the side of a looser limit and end up wasting precious resources.

    The memory.high boundary on the other hand can be set much more
    conservatively.  When hit, it throttles allocations by forcing
    them into direct reclaim to work off the excess, but it never
    invokes the OOM killer.  As a result, a high boundary that is
    chosen too aggressively will not terminate the processes, but
    instead it will lead to gradual performance degradation.  The user
    can monitor this and make corrections until the minimal memory
    footprint that still gives acceptable performance is found.

    In extreme cases, with many concurrent allocations and a complete
    breakdown of reclaim progress within the group, the high boundary
    can be exceeded.  But even then it's mostly better to satisfy the
    allocation from the slack available in other groups or the rest of
    the system than killing the group.  Otherwise, memory.max is there
    to limit this type of spillover and ultimately contain buggy or
    even malicious applications.

  - The original control file names are unwieldy and inconsistent in
    many different ways.  For example, the upper boundary hit count is
    exported in the memory.failcnt file, but an OOM event count has to
    be manually counted by listening to memory.oom_control events, and
    lower boundary / soft limit events have to be counted by first
    setting a threshold for that value and then counting those events.
    Also, usage and limit files encode their units in the filename.
    That makes the filenames very long, even though this is not
    information that a user needs to be reminded of every time they
    type out those names.

    To address these naming issues, as well as to signal clearly that
    the new interface carries a new configuration model, the naming
    conventions in it necessarily differ from the old interface.

  - The original limit files indicate the state of an unset limit with
    a very high number, and a configured limit can be unset by echoing
    -1 into those files.  But that very high number is implementation
    and architecture dependent and not very descriptive.  And while -1
    can be understood as an underflow into the highest possible value,
    -2 or -10M etc. do not work, so it's not consistent.

    memory.low, memory.high, and memory.max will use the string
    "infinity" to indicate and set the highest possible value.

[akpm@linux-foundation.org: use seq_puts() for basic strings]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
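The "infinity" convention described in the last bullet can be sketched in userspace C as follows. This is only an illustration under stated assumptions: parse_limit() and LIMIT_UNLIMITED are hypothetical names, the input is assumed to be already stripped of whitespace, and size suffixes such as K or M are deliberately not handled. It is not the kernel's actual parsing code.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinel for "no limit"; the kernel's internal value may differ. */
#define LIMIT_UNLIMITED ULONG_MAX

/*
 * Parse a limit value: either the literal string "infinity" or a plain
 * non-negative decimal byte count.  Returns 0 on success and -EINVAL on
 * malformed input, so "-1", "-2" and "-10M" are all rejected uniformly.
 */
static int parse_limit(const char *buf, unsigned long *limit)
{
	char *end;
	unsigned long val;

	assert(buf && limit);

	if (!strcmp(buf, "infinity")) {
		*limit = LIMIT_UNLIMITED;
		return 0;
	}

	/* Require a leading digit: strtoul() alone would accept "-1". */
	if (*buf < '0' || *buf > '9')
		return -EINVAL;

	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno || *end != '\0')
		return -EINVAL;

	*limit = val;
	return 0;
}
```

With this shape, "unset" has one descriptive spelling, and every negative number fails the same way instead of only -1 having a special meaning.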
577 lines
15 KiB
C
/* memcontrol.h - Memory Controller
 *
 * Copyright IBM Corporation, 2007
 * Author Balbir Singh <balbir@linux.vnet.ibm.com>
 *
 * Copyright 2007 OpenVZ SWsoft Inc
 * Author: Pavel Emelianov <xemul@openvz.org>
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 */

#ifndef _LINUX_MEMCONTROL_H
#define _LINUX_MEMCONTROL_H
#include <linux/cgroup.h>
#include <linux/vm_event_item.h>
#include <linux/hardirq.h>
#include <linux/jump_label.h>

struct mem_cgroup;
struct page;
struct mm_struct;
struct kmem_cache;

/*
 * The corresponding mem_cgroup_stat_names is defined in mm/memcontrol.c.
 * These two lists should be kept in sync with each other.
 */
enum mem_cgroup_stat_index {
	/*
	 * For MEM_CONTAINER_TYPE_ALL, usage = pagecache + rss.
	 */
	MEM_CGROUP_STAT_CACHE,		/* # of pages charged as cache */
	MEM_CGROUP_STAT_RSS,		/* # of pages charged as anon rss */
	MEM_CGROUP_STAT_RSS_HUGE,	/* # of pages charged as anon huge */
	MEM_CGROUP_STAT_FILE_MAPPED,	/* # of pages charged as file rss */
	MEM_CGROUP_STAT_WRITEBACK,	/* # of pages under writeback */
	MEM_CGROUP_STAT_SWAP,		/* # of pages, swapped out */
	MEM_CGROUP_STAT_NSTATS,
};

struct mem_cgroup_reclaim_cookie {
	struct zone *zone;
	int priority;
	unsigned int generation;
};

enum mem_cgroup_events_index {
	MEM_CGROUP_EVENTS_PGPGIN,	/* # of pages paged in */
	MEM_CGROUP_EVENTS_PGPGOUT,	/* # of pages paged out */
	MEM_CGROUP_EVENTS_PGFAULT,	/* # of page-faults */
	MEM_CGROUP_EVENTS_PGMAJFAULT,	/* # of major page-faults */
	MEM_CGROUP_EVENTS_NSTATS,
	/* default hierarchy events */
	MEMCG_LOW = MEM_CGROUP_EVENTS_NSTATS,
	MEMCG_HIGH,
	MEMCG_MAX,
	MEMCG_OOM,
	MEMCG_NR_EVENTS,
};

#ifdef CONFIG_MEMCG
void mem_cgroup_events(struct mem_cgroup *memcg,
		       enum mem_cgroup_events_index idx,
		       unsigned int nr);

bool mem_cgroup_low(struct mem_cgroup *root, struct mem_cgroup *memcg);

int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
			  gfp_t gfp_mask, struct mem_cgroup **memcgp);
void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg,
			      bool lrucare);
void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg);
void mem_cgroup_uncharge(struct page *page);
void mem_cgroup_uncharge_list(struct list_head *page_list);

void mem_cgroup_migrate(struct page *oldpage, struct page *newpage,
			bool lrucare);

struct lruvec *mem_cgroup_zone_lruvec(struct zone *, struct mem_cgroup *);
struct lruvec *mem_cgroup_page_lruvec(struct page *, struct zone *);

bool mem_cgroup_is_descendant(struct mem_cgroup *memcg,
			      struct mem_cgroup *root);
bool task_in_mem_cgroup(struct task_struct *task, struct mem_cgroup *memcg);

extern struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page);
extern struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);

extern struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *memcg);
extern struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *css);

static inline bool mm_match_cgroup(struct mm_struct *mm,
				   struct mem_cgroup *memcg)
{
	struct mem_cgroup *task_memcg;
	bool match = false;

	rcu_read_lock();
	task_memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
	if (task_memcg)
		match = mem_cgroup_is_descendant(task_memcg, memcg);
	rcu_read_unlock();
	return match;
}

extern struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg);

struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *,
				   struct mem_cgroup *,
				   struct mem_cgroup_reclaim_cookie *);
void mem_cgroup_iter_break(struct mem_cgroup *, struct mem_cgroup *);

/*
 * For memory reclaim.
 */
int mem_cgroup_inactive_anon_is_low(struct lruvec *lruvec);
bool mem_cgroup_lruvec_online(struct lruvec *lruvec);
int mem_cgroup_select_victim_node(struct mem_cgroup *memcg);
unsigned long mem_cgroup_get_lru_size(struct lruvec *lruvec, enum lru_list);
void mem_cgroup_update_lru_size(struct lruvec *, enum lru_list, int);
extern void mem_cgroup_print_oom_info(struct mem_cgroup *memcg,
					struct task_struct *p);

static inline void mem_cgroup_oom_enable(void)
{
	WARN_ON(current->memcg_oom.may_oom);
	current->memcg_oom.may_oom = 1;
}

static inline void mem_cgroup_oom_disable(void)
{
	WARN_ON(!current->memcg_oom.may_oom);
	current->memcg_oom.may_oom = 0;
}

static inline bool task_in_memcg_oom(struct task_struct *p)
{
	return p->memcg_oom.memcg;
}

bool mem_cgroup_oom_synchronize(bool wait);

#ifdef CONFIG_MEMCG_SWAP
extern int do_swap_account;
#endif

static inline bool mem_cgroup_disabled(void)
{
	if (memory_cgrp_subsys.disabled)
		return true;
	return false;
}

struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page);
void mem_cgroup_update_page_stat(struct mem_cgroup *memcg,
				 enum mem_cgroup_stat_index idx, int val);
void mem_cgroup_end_page_stat(struct mem_cgroup *memcg);

static inline void mem_cgroup_inc_page_stat(struct mem_cgroup *memcg,
					    enum mem_cgroup_stat_index idx)
{
	mem_cgroup_update_page_stat(memcg, idx, 1);
}

static inline void mem_cgroup_dec_page_stat(struct mem_cgroup *memcg,
					    enum mem_cgroup_stat_index idx)
{
	mem_cgroup_update_page_stat(memcg, idx, -1);
}

unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
						gfp_t gfp_mask,
						unsigned long *total_scanned);

void __mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
static inline void mem_cgroup_count_vm_event(struct mm_struct *mm,
					     enum vm_event_item idx)
{
	if (mem_cgroup_disabled())
		return;
	__mem_cgroup_count_vm_event(mm, idx);
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
void mem_cgroup_split_huge_fixup(struct page *head);
#endif

#else /* CONFIG_MEMCG */
struct mem_cgroup;

static inline void mem_cgroup_events(struct mem_cgroup *memcg,
				     enum mem_cgroup_events_index idx,
				     unsigned int nr)
{
}

static inline bool mem_cgroup_low(struct mem_cgroup *root,
				  struct mem_cgroup *memcg)
{
	return false;
}

static inline int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
					gfp_t gfp_mask,
					struct mem_cgroup **memcgp)
{
	*memcgp = NULL;
	return 0;
}

static inline void mem_cgroup_commit_charge(struct page *page,
					    struct mem_cgroup *memcg,
					    bool lrucare)
{
}

static inline void mem_cgroup_cancel_charge(struct page *page,
					    struct mem_cgroup *memcg)
{
}

static inline void mem_cgroup_uncharge(struct page *page)
{
}

static inline void mem_cgroup_uncharge_list(struct list_head *page_list)
{
}

static inline void mem_cgroup_migrate(struct page *oldpage,
				      struct page *newpage,
				      bool lrucare)
{
}

static inline struct lruvec *mem_cgroup_zone_lruvec(struct zone *zone,
						    struct mem_cgroup *memcg)
{
	return &zone->lruvec;
}

static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page,
						    struct zone *zone)
{
	return &zone->lruvec;
}

static inline struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page)
{
	return NULL;
}

static inline bool mm_match_cgroup(struct mm_struct *mm,
		struct mem_cgroup *memcg)
{
	return true;
}

static inline bool task_in_mem_cgroup(struct task_struct *task,
				      const struct mem_cgroup *memcg)
{
	return true;
}

static inline struct cgroup_subsys_state
		*mem_cgroup_css(struct mem_cgroup *memcg)
{
	return NULL;
}

static inline struct mem_cgroup *
mem_cgroup_iter(struct mem_cgroup *root,
		struct mem_cgroup *prev,
		struct mem_cgroup_reclaim_cookie *reclaim)
{
	return NULL;
}

static inline void mem_cgroup_iter_break(struct mem_cgroup *root,
					 struct mem_cgroup *prev)
{
}

static inline bool mem_cgroup_disabled(void)
{
	return true;
}

static inline int
mem_cgroup_inactive_anon_is_low(struct lruvec *lruvec)
{
	return 1;
}

static inline bool mem_cgroup_lruvec_online(struct lruvec *lruvec)
{
	return true;
}

static inline unsigned long
mem_cgroup_get_lru_size(struct lruvec *lruvec, enum lru_list lru)
{
	return 0;
}

static inline void
mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
			      int increment)
{
}

static inline void
mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
{
}

static inline struct mem_cgroup *mem_cgroup_begin_page_stat(struct page *page)
{
	return NULL;
}

static inline void mem_cgroup_end_page_stat(struct mem_cgroup *memcg)
{
}

static inline void mem_cgroup_oom_enable(void)
{
}

static inline void mem_cgroup_oom_disable(void)
{
}

static inline bool task_in_memcg_oom(struct task_struct *p)
{
	return false;
}

static inline bool mem_cgroup_oom_synchronize(bool wait)
{
	return false;
}

static inline void mem_cgroup_inc_page_stat(struct mem_cgroup *memcg,
					    enum mem_cgroup_stat_index idx)
{
}

static inline void mem_cgroup_dec_page_stat(struct mem_cgroup *memcg,
					    enum mem_cgroup_stat_index idx)
{
}

static inline
unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
					    gfp_t gfp_mask,
					    unsigned long *total_scanned)
{
	return 0;
}

static inline void mem_cgroup_split_huge_fixup(struct page *head)
{
}

static inline
void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
{
}
#endif /* CONFIG_MEMCG */

enum {
	UNDER_LIMIT,
	SOFT_LIMIT,
	OVER_LIMIT,
};

struct sock;
#if defined(CONFIG_INET) && defined(CONFIG_MEMCG_KMEM)
void sock_update_memcg(struct sock *sk);
void sock_release_memcg(struct sock *sk);
#else
static inline void sock_update_memcg(struct sock *sk)
{
}
static inline void sock_release_memcg(struct sock *sk)
{
}
#endif /* CONFIG_INET && CONFIG_MEMCG_KMEM */

#ifdef CONFIG_MEMCG_KMEM
extern struct static_key memcg_kmem_enabled_key;

extern int memcg_limited_groups_array_size;

/*
 * Helper macro to loop through all memcg-specific caches. Callers must still
 * check if the cache is valid (it is either valid or NULL).
 * The slab_mutex must be held when looping through those caches.
 */
#define for_each_memcg_cache_index(_idx)	\
	for ((_idx) = 0; (_idx) < memcg_limited_groups_array_size; (_idx)++)

static inline bool memcg_kmem_enabled(void)
{
	return static_key_false(&memcg_kmem_enabled_key);
}

/*
 * In general, we'll do everything in our power to not incur any overhead
 * for non-memcg users for the kmem functions. Not even a function call, if we
 * can avoid it.
 *
 * Therefore, we'll inline all those functions so that in the best case, we'll
 * see that kmemcg is off for everybody and proceed quickly.  If it is on,
 * we'll still do most of the flag checking inline. We check a lot of
 * conditions, but because they are pretty simple, they are expected to be
 * fast.
 */
bool __memcg_kmem_newpage_charge(gfp_t gfp, struct mem_cgroup **memcg,
					int order);
void __memcg_kmem_commit_charge(struct page *page,
				       struct mem_cgroup *memcg, int order);
void __memcg_kmem_uncharge_pages(struct page *page, int order);

int memcg_cache_id(struct mem_cgroup *memcg);

void memcg_update_array_size(int num_groups);

struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep);
void __memcg_kmem_put_cache(struct kmem_cache *cachep);

int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp,
		      unsigned long nr_pages);
void memcg_uncharge_kmem(struct mem_cgroup *memcg, unsigned long nr_pages);

/**
 * memcg_kmem_newpage_charge: verify if a new kmem allocation is allowed.
 * @gfp: the gfp allocation flags.
 * @memcg: a pointer to the memcg this was charged against.
 * @order: allocation order.
 *
 * Returns true if the memcg to which the current task belongs can hold this
 * allocation.
 *
 * We return true automatically if this allocation is not to be accounted to
 * any memcg.
 */
static inline bool
memcg_kmem_newpage_charge(gfp_t gfp, struct mem_cgroup **memcg, int order)
{
	if (!memcg_kmem_enabled())
		return true;

	/*
	 * __GFP_NOFAIL allocations will move on even if charging is not
	 * possible. Therefore we don't even try, and have this allocation
	 * unaccounted. We could in theory charge it forcibly, but we hope
	 * those allocations are rare, and won't be worth the trouble.
	 */
	if (gfp & __GFP_NOFAIL)
		return true;
	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
		return true;

	/* If the task is dying, just let it go. */
	if (unlikely(fatal_signal_pending(current)))
		return true;

	return __memcg_kmem_newpage_charge(gfp, memcg, order);
}

/**
 * memcg_kmem_uncharge_pages: uncharge pages from memcg
 * @page: pointer to struct page being freed
 * @order: allocation order.
 */
static inline void
memcg_kmem_uncharge_pages(struct page *page, int order)
{
	if (memcg_kmem_enabled())
		__memcg_kmem_uncharge_pages(page, order);
}

/**
 * memcg_kmem_commit_charge: embeds correct memcg in a page
 * @page: pointer to struct page recently allocated
 * @memcg: the memcg structure we charged against
 * @order: allocation order.
 *
 * Needs to be called after memcg_kmem_newpage_charge, regardless of success or
 * failure of the allocation. If @page is NULL, this function will revert the
 * charges. Otherwise, it will commit @page to @memcg.
 */
static inline void
memcg_kmem_commit_charge(struct page *page, struct mem_cgroup *memcg, int order)
{
	if (memcg_kmem_enabled() && memcg)
		__memcg_kmem_commit_charge(page, memcg, order);
}

/**
 * memcg_kmem_get_cache: selects the correct per-memcg cache for allocation
 * @cachep: the original global kmem cache
 * @gfp: allocation flags.
 *
 * All memory allocated from a per-memcg cache is charged to the owner memcg.
 */
static __always_inline struct kmem_cache *
memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
{
	if (!memcg_kmem_enabled())
		return cachep;
	if (gfp & __GFP_NOFAIL)
		return cachep;
	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
		return cachep;
	if (unlikely(fatal_signal_pending(current)))
		return cachep;

	return __memcg_kmem_get_cache(cachep);
}

static __always_inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
{
	if (memcg_kmem_enabled())
		__memcg_kmem_put_cache(cachep);
}
#else
#define for_each_memcg_cache_index(_idx)	\
	for (; NULL; )

static inline bool memcg_kmem_enabled(void)
{
	return false;
}

static inline bool
memcg_kmem_newpage_charge(gfp_t gfp, struct mem_cgroup **memcg, int order)
{
	return true;
}

static inline void memcg_kmem_uncharge_pages(struct page *page, int order)
{
}

static inline void
memcg_kmem_commit_charge(struct page *page, struct mem_cgroup *memcg, int order)
{
}

static inline int memcg_cache_id(struct mem_cgroup *memcg)
{
	return -1;
}

static inline struct kmem_cache *
memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
{
	return cachep;
}

static inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
{
}
#endif /* CONFIG_MEMCG_KMEM */
#endif /* _LINUX_MEMCONTROL_H */
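The header only declares mem_cgroup_low(root, memcg); its body lives elsewhere. The commit message states the rule as: a cgroup enjoys reclaim protection when it and all its ancestors are below their low boundaries. The following userspace sketch models that walk under loud assumptions: struct toy_memcg and toy_memcg_low() are hypothetical stand-ins, "below" is read as "usage not greater than low", the top-level group is treated as having no configurable range, and @root is assumed to be @memcg itself or one of its ancestors. It is an illustration of the described rule, not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; not the kernel layout. */
struct toy_memcg {
	struct toy_memcg *parent;	/* NULL for the top-level group */
	unsigned long usage;		/* current consumption, in pages */
	unsigned long low;		/* configured low boundary, in pages */
};

/*
 * Return true if @memcg is protected while reclaiming the subtree rooted
 * at @root: the group and each ancestor up to and including @root must
 * not exceed its low boundary.  The top-level group is never "low" and
 * is not counted as an ancestor (an assumption of this sketch).
 */
static bool toy_memcg_low(const struct toy_memcg *root,
			  const struct toy_memcg *memcg)
{
	assert(root && memcg);

	if (!memcg->parent)
		return false;

	for (;;) {
		if (memcg->usage > memcg->low)
			return false;
		if (memcg == root)
			return true;
		memcg = memcg->parent;
		if (!memcg->parent)
			return true;
	}
}
```

Because the check walks upward, a parent that exceeds its own low boundary removes the protection of every group beneath it. That is what lets a delegated subtree hand out protection only from the reserve it was itself granted.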