Should I use Hugepages when running Oracle or should I not?

http://muctable.org/?p=6

The first question should be, why should I use Hugepages at all?
 
The short answer: performance. With HugePages you can create a memory area that is divided into huge page chunks (2M, 4M or 8M per huge page, depending on the architecture), and this area cannot be paged or swapped out. The default page size on most Linux systems is 4k.
 
With HugePages the same TLB (Translation Lookaside Buffer) can cover a larger address space, which results in fewer TLB misses and faster access.
HugePages are not swappable (they are always physically allocated in RAM and never paged or swapped to disk).
By the way … starting with Oracle 11.2.0.2.0 there is a parameter called use_large_pages.
If this parameter is set to ONLY and there are not enough huge pages configured,
the instance does not start up. This is an additional guarantee that the database instance is using HugePages.
 
Decreased page table overhead: each page table entry can be as large as 64 bytes, so if we are trying to handle 50GB of RAM with 4k pages, the page table will be approximately 800MB in size, which in practice will not fit into the 880MB of lowmem on a 32-bit kernel.
When 95% of memory is accessed via 256MB hugepages, this can work with a page table of approximately 40MB in total.
 
Calculation: SGA / page size * page table entry size = size of the page table
Default pages = 1024*1024*1024*50/4096*64 = 838860800 bytes (~800MB)
Huge pages    = 1024*1024*1024*50/2097152*64 = 1638400 bytes (~1.6MB)
 
So, generally speaking, the bigger the SGA, the more useful are HugePages.
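
The calculation above can be replayed for other memory sizes with a few lines of shell. This is only a sketch: the function name page_table_bytes is made up for this example, and the 64-byte entry size is the worst-case figure used above, not a measured value.

```shell
#!/bin/sh
# Estimate the page table size in bytes for a given memory area.
# Usage: page_table_bytes <memory_gib> <page_size_bytes> <entry_size_bytes>
# (hypothetical helper, not part of any Oracle or Red Hat tooling)
page_table_bytes() {
    mem_bytes=$(( $1 * 1024 * 1024 * 1024 ))
    # number of pages times the size of one page table entry
    echo $(( mem_bytes / $2 * $3 ))
}

page_table_bytes 50 4096 64      # 4k pages:       838860800
page_table_bytes 50 2097152 64   # 2M huge pages:  1638400
```

Changing the first argument shows how the gap between the two page sizes grows with the size of the SGA.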
 
Before we start, we need to clarify a few important points.
 
This document applies to Red Hat Enterprise Linux 4.x and 5.x on x86 (32-bit and 64-bit) only.
Red Hat Enterprise Linux 6.x is out of scope, because it uses a new feature called “Transparent Huge Pages (THP)”.
Starting with Red Hat Enterprise Linux 6.0, the kernel attempts to allocate huge pages whenever possible, and any Linux process will receive 2M pages if the mmap region is naturally aligned to 2M.
Solaris is also out of scope, because Solaris has used Intimate Shared Memory (ISM) segments with large pages by default since Solaris 2.6 or earlier.
 
Are there good reasons not to use HugePages? Yes, e.g.
 
·         When I work with AMM (Automatic Memory Management) aka MEMORY_TARGET & MEMORY_MAX_TARGET
·         When I have no idea how big my future SGA will be
·         When I need, for any reason, a NUMA-optimized SGA (“_enable_NUMA_optimization”=true)
·         When I need indirect data buffers and memory allocation in pseudo-files under ramfs/tmpfs (USE_INDIRECT_DATA_BUFFERS=TRUE)
·         When I work with Oracle VM 2.1 or 2.1.1 and want to use HugePages in the guest OS. The HugePages feature is supported starting with Oracle VM 2.1.2
·         When I work with the Xen kernel: the Xen kernel (e.g. 2.6.18-128.el5xen) does not support HugePages
 
How do I implement/setup HugePages?
 
·         Make sure that the SGA of all instances including the ASM instance fit into the HugePage area
·         Make sure that HugePages are configured on every system that runs an Oracle RAC instance
·         Configure the Kernel with CONFIG_HUGETLB_PAGE (default with Redhat Enterprise Linux)
·         Configure the number of HugePages needed (vm.nr_hugepages = XX in /etc/sysctl.conf)
·         Configure the OS group that can take advantage of using HugePages e.g. the dba group (vm.hugetlb_shm_group = => /etc/sysctl.conf)
·         Configure the Security Limits /etc/security/limits.conf e.g. (@dba soft memlock 60397977 @dba hard memlock 60397977) or unlimited
·         Disable AMM (Automatic Memory Management) MEMORY_TARGET & MEMORY_MAX_TARGET
·         Disable NUMA optimization “_enable_NUMA_optimization”=false
·         Disable indirect data buffers USE_INDIRECT_DATA_BUFFERS=FALSE (default), memory allocation in pseudo-files under ramfs/tmpfs
·         Configure the parameter use_large_pages=true if you are using Oracle 11.2.0.2.0 or later
·         Do not configure the OS Kernel setting vm.swappiness anymore.
·         Optionally install the kernel-doc rpm Package
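
To get a feeling for the numbers that go into vm.nr_hugepages and memlock, here is a rough sketch. It assumes you already know the combined SGA size of all instances in MB and the huge page size in kB; the function names and the 20480 MB example value are invented for illustration. It also assumes that memlock (in kB) should cover at least the whole HugePages area. For real systems the Oracle script from note 401749.1 listed under Documentation is the better choice.

```shell
#!/bin/sh
# Number of huge pages needed to hold an SGA, rounded up.
# Usage: hugepages_needed <total_sga_mb> <hugepage_size_kb>
# (hypothetical helper for illustration only)
hugepages_needed() {
    sga_kb=$(( $1 * 1024 ))
    # ceiling division: a partially filled huge page still needs a whole page
    echo $(( (sga_kb + $2 - 1) / $2 ))
}

# Minimum memlock value in kB that covers the HugePages area.
# Usage: memlock_kb <nr_hugepages> <hugepage_size_kb>
memlock_kb() {
    echo $(( $1 * $2 ))
}

# Example with an assumed combined SGA of 20480 MB and 2048 kB huge pages
pages=$(hugepages_needed 20480 2048)
lock=$(memlock_kb "$pages" 2048)
echo "vm.nr_hugepages = $pages"
echo "@dba soft memlock $lock"
echo "@dba hard memlock $lock"
```

The printed lines only show the shape of the entries for /etc/sysctl.conf and /etc/security/limits.conf; check the values against your own SGA sizes before using them.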
 
 
Some important notes regarding HugePages
 
If the nr_hugepages setting does not take effect (for example because memory is already too fragmented to provide enough contiguous pages), you will need to reboot the server so that the HugePages are allocated during system startup.
The HugePages are allocated in a lazy fashion, so the “HugePages_Free” count drops as the pages get touched and are backed by physical memory. The idea is that this is more efficient, in the sense that you don’t use memory you don’t touch.
If you set the instance initialization parameter PRE_PAGE_SGA=TRUE (for suitable settings see Document 30793.1), all of the pages are allocated from HugePages up front. This approach has the advantage that all SGA pages are allocated (touched) right after connection creation (for each connection!). The disadvantage is that with a big SGA, touching the pages can take a long time.
A userspace application that uses HugePages should be aware of the permission implications.
Permissions on HugePages segments in memory can impose strict requirements. For example, per Bug 6620371, the Linux x86-64 port of Oracle RDBMS before 11g set the shared memory flags to hugetlb, read and write by default.
These flags should depend on the configuration environment, and with Patch 6620371 on 10.2 and with 11g, the read and write permissions are set based on the internal context.
 
Monitoring
 
Display the default pagesize
 
$ getconf PAGESIZE
4096

Display the Hugepage size
 
$ cat /proc/meminfo | grep Hugepagesize
Hugepagesize:     2048 kB

The /proc/meminfo explained
 
The output of “cat /proc/meminfo” will have lines like:
HugePages_Total: xxx
HugePages_Free:  yyy
HugePages_Rsvd:  www
Hugepagesize:    zzz kB
where:
HugePages_Total: Is the size of the pool of hugepages.
HugePages_Free:  Is the number of hugepages in the pool that are not yet allocated.
HugePages_Rsvd:  Is short for “reserved,” and is the number of hugepages
for which a commitment to allocate from the pool has been made, but no
allocation has yet been made. It’s vaguely analogous to overcommit.
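
Because of the lazy allocation, the counters are easier to read when you combine them. The following sketch reads text in /proc/meminfo format from standard input and prints the pages already touched (Total minus Free), the reserved pages, and the sum of both. The function name hugepages_summary and the sample numbers are made up for this example.

```shell
#!/bin/sh
# Summarize HugePages usage from /proc/meminfo-style input on stdin.
# (hypothetical helper for illustration only)
hugepages_summary() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free  = $2 }
        /^HugePages_Rsvd:/  { rsvd  = $2 }
        END {
            # touched:   pages already backed by physical memory
            # committed: touched pages plus pages reserved but not yet touched
            printf "touched=%d reserved=%d committed=%d\n",
                   total - free, rsvd, total - free + rsvd
        }
    '
}

# Sample input with invented numbers
hugepages_summary <<'EOF'
HugePages_Total: 10240
HugePages_Free:   6000
HugePages_Rsvd:   5000
Hugepagesize:     2048 kB
EOF
# prints: touched=4240 reserved=5000 committed=9240
```

On a live system the same function can be fed with "hugepages_summary < /proc/meminfo". A committed value close to HugePages_Total right after instance startup suggests that the SGA really sits in HugePages, even while HugePages_Free still looks high.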
 
Check if NUMA is enabled on the system
 
$ numactl --show
physcpubind: 0 1
No NUMA support available on this system.
 
Documentation

Red Hat Notes

Red Hat Article ID: 2593   What are Huge Pages and what are the advantages of using them?
Red Hat Article ID: 50740  What is an appropriate memlock value in limits.conf when using hugepages for an Oracle DB on RHEL
Red Hat Article ID: 46335  How do I enable large page support on Linux?
Red Hat Article ID: 6651   What is required for an application or program to be able to take advantage of hugepages?
Red Hat Article ID: 49562  How to use and monitor transparent hugepages in Red Hat Enterprise Linux 6?

Oracle Metalink Notes
 
HugePages on Linux: What It Is… and What It Is Not… [ID 361323.1]
HugePages on Oracle Linux 64-bit [ID 361468.1]
HugePages and Oracle Database 11g Automatic Memory Management (AMM) on Linux [ID 749851.1]
Hugepages Are Not Used by Database Buffer Cache [ID 829850.1]
Oracle Not Utilizing Hugepages [ID 803238.1]
Setup HugePages in an Guest Does Not Work with Oracle VM 2.1 or 2.1.1 [ID 728063.1]
HugePages Not Released On Oracle RDBMS Instance Shutdown with RHEL / EL 5 Update 1 (5.1) [ID 550443.1]
/proc/meminfo Does Not Provide HugePages Information on Oracle Enterprise Linux (OEL5) [ID 860350.1]
Shell Script to Calculate Values Recommended Linux HugePages / HugeTLB Configuration [ID 401749.1]

Kernel Docs
 
/usr/share/doc/kernel-doc-2.6.18/Documentation  (if the kernel-doc package is installed)
 
Conclusion
 
So what is the conclusion of this blog entry? If you are not using AMM, then the bigger the SGA of your Oracle database is, the more it makes sense to work with HugePages.
 
Cheers, William (kernel@0x01.net)