Tmpfs vs Ramfs vs (Transparent) Huge Pages

 

Preamble

What are the possible configurations for Automatic Shared Memory Management (ASMM) and Automatic Memory Management (AMM) with ramfs, tmpfs and (Transparent) Huge Pages? Some configurations are simply not supported, while for others you have to choose between flexibility and performance…

Testing has been done using Oracle Linux Server release 6.3 (Kernel 2.6.39-300.26.1.el6uek.x86_64) and Oracle 11.2.0.3.

Tmpfs

Previously known as shmfs, tmpfs is mounted by default on most Linux distributions:

[root@server1 ~]# grep shm /etc/fstab
tmpfs                           /dev/shm                                tmpfs   defaults        0 0

By default its size limit is 50% of the server's physical memory. That memory is not really *used* up front: it is consumed only as files are written, and the pages can be swapped out. This is the pseudo-filesystem to configure if you want to use the Automatic Memory Management (AMM) feature, i.e. the memory_target and memory_max_target initialization parameters.
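
As a rough cross-check of that default, the sketch below derives half of MemTotal from /proc/meminfo and tests whether a given memory target would fit under it. The helper names default_tmpfs_kb and fits_in_shm are hypothetical (they are not Oracle or kernel tools), and MemTotal is slightly lower than the installed RAM, so treat the result as an approximation.

```shell
#!/bin/bash
# Hypothetical helpers, for illustration only.

# Print half of MemTotal in kB, i.e. roughly the default tmpfs size limit.
# $1: meminfo-style file (defaults to /proc/meminfo).
default_tmpfs_kb() {
    awk '/^MemTotal:/ { print int($2 / 2) }' "${1:-/proc/meminfo}"
}

# Succeed if a memory target given in MB fits in a tmpfs limit given in kB.
# $1: tmpfs limit in kB, $2: memory target in MB.
fits_in_shm() {
    local shm_kb=$1 target_mb=$2
    [ $(( target_mb * 1024 )) -le "$shm_kb" ]
}

if [ -r /proc/meminfo ]; then
    limit_kb=$(default_tmpfs_kb)
    echo "default tmpfs limit: ${limit_kb} kB"
    # 200 MB is just an example value for memory_target.
    if fits_in_shm "$limit_kb" 200; then
        echo "memory_target=200m fits"
    else
        echo "memory_target=200m does not fit"
    fi
fi
```

If the default is too small, the limit can be changed with the size mount option, for example by remounting /dev/shm with a larger size.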

This implementation is quite common but not recommended at all, even by Oracle, on a server with plenty of memory. In such a scenario managing the memory page tables can be quite time-consuming:

[root@server1 ~]# grep -i pagetables /proc/meminfo
PageTables:        12500 kB
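
To get a feel for why page tables grow with memory size, here is a back-of-the-envelope sketch. It assumes 8-byte page table entries (x86_64), counts only the lowest level of the page tables, and assumes the worst case where every process touches every page of the shared segment. The pte_bytes helper, the 8 GB SGA and the 500 processes are all hypothetical figures.

```shell
#!/bin/bash
# Hypothetical helper: bytes of leaf page-table entries needed for ONE
# process to map a shared segment (8 bytes per entry, upper levels ignored).
# $1: segment size in MB, $2: page size in kB.
pte_bytes() {
    local seg_mb=$1 page_kb=$2
    echo $(( seg_mb * 1024 / page_kb * 8 ))
}

# Hypothetical example: 8 GB SGA attached by 500 server processes.
per_proc_4k=$(pte_bytes 8192 4)
per_proc_2m=$(pte_bytes 8192 2048)
echo "4 kB pages: $(( per_proc_4k * 500 / 1024 / 1024 )) MB of page tables"
echo "2 MB pages: $(( per_proc_2m * 500 / 1024 )) kB of page tables"
```

The point is only the ratio: with pages 512 times bigger there are 512 times fewer entries to maintain per process.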

Ramfs

The main differences, well summarized by The Geek Stuff (see references), are:

Experimentation                          Tmpfs               Ramfs
Fill maximum space and continue writing  Will display error  Will continue writing
Fixed Size                               Yes                 No
Uses Swap                                Yes                 No
Volatile Storage                         Yes                 Yes

So on paper it looks great:

  • Does not use swap!
  • Can fill beyond its limit, but this is not an issue if you are careful; in any case we are talking about Oracle memory allocation, so you control it!

To configure it use the following:

[root@server1 ~]# grep shm /etc/fstab
ramfs                   /dev/shm                ramfs   defaults        0 0
[root@server1 ~]# df -P /dev/shm
Filesystem         1024-blocks      Used Available Capacity Mounted on
ramfs                   510352       112    510240       1% /dev/shm

The database starts fine and uses the /dev/shm pseudo-filesystem correctly:

[root@server1 orcl]# df -P /dev/shm
Filesystem         1024-blocks      Used Available Capacity Mounted on
ramfs                   510352    192540    317812      38% /dev/shm

Unfortunately, as explained in MOS note 749851.1, this is not supported:

Please also note that ramfs (instead of tmpfs mount over /dev/shm) is not supported for AMM at all. With AMM the Oracle database needs to grow and reduce the size of SGA dynamically. This is not possible with ramfs where it possible and supported with tmpfs (which is the default for the OS installation).
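
A quick way to check which filesystem type is actually mounted on /dev/shm before enabling AMM could look like the sketch below. The fstype_of helper is hypothetical; it simply reads /proc/mounts and keeps the last matching entry, since a later mount shadows an earlier one.

```shell
#!/bin/bash
# Hypothetical helper: print the filesystem type mounted on a mount point.
# $1: mount point, $2: mounts-style file (defaults to /proc/mounts).
fstype_of() {
    awk -v mp="$1" '$2 == mp { t = $3 } END { if (t != "") print t }' \
        "${2:-/proc/mounts}"
}

if [ -r /proc/mounts ]; then
    shm_type=$(fstype_of /dev/shm)
    case "$shm_type" in
        tmpfs) echo "/dev/shm is tmpfs: the supported choice for AMM" ;;
        ramfs) echo "/dev/shm is ramfs: not supported for AMM" ;;
        "")    echo "/dev/shm is not a separate mount" ;;
        *)     echo "/dev/shm is $shm_type" ;;
    esac
fi
```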

Huge Pages

As we have seen in this blog post, Huge Pages are an answer to swapping. Unfortunately there is a restriction, clearly stated in the kernel documentation (install the kernel-doc package):

[root@server1 ~]# ll /usr/share/doc/kernel-doc-2.6.39/Documentation/vm/hugetlbpage.txt
-r--r--r-- 1 root root 15156 Jan  4 03:06 /usr/share/doc/kernel-doc-2.6.39/Documentation/vm/hugetlbpage.txt

Huge Pages can only be used for shared memory (so for the SGA and not the PGA):

Users can use the huge page support in Linux kernel by either using the mmap system call or standard SYSV shared memory system calls (shmget, shmat).

The documentation also mentions using Huge Pages through mmap, i.e. allocating them from a filesystem. Why not use this with AMM? To test it I mounted a filesystem of type hugetlbfs, giving privileges to my oracle Unix account.

Kernel settings:

# Huge Pages
vm.nr_hugepages = 101
vm.hugetlb_shm_group = 500

Oracle settings:

*.memory_target=200m
*.pre_page_sga=TRUE
*.use_large_pages='ONLY'
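
The relation between the memory size and vm.nr_hugepages is simple arithmetic, sketched below with a hypothetical hugepages_needed helper that rounds up to a whole number of pages. For memory_target=200m and 2 MB pages it gives a lower bound of 100 pages; the kernel setting above uses 101, i.e. one page more than this lower bound.

```shell
#!/bin/bash
# Hypothetical helper: minimum number of huge pages needed to hold a
# memory area, rounded up to a whole page.
# $1: area size in MB, $2: huge page size in kB.
hugepages_needed() {
    local size_mb=$1 page_kb=$2
    echo $(( (size_mb * 1024 + page_kb - 1) / page_kb ))
}

# memory_target=200m with 2 MB (2048 kB) huge pages.
echo "lower bound for vm.nr_hugepages: $(hugepages_needed 200 2048)"
```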

To mount the filesystem:

[oracle@server1 ~]$ id
uid=501(oracle) gid=500(dba) groups=500(dba)
[root@server1 ~]# mount -t hugetlbfs -o uid=501,gid=500,size=202M none /dev/shm
[root@server1 ~]# df -P /dev/shm
Filesystem         1024-blocks      Used Available Capacity Mounted on
none                    206848         0    206848       0% /dev/shm

But unfortunately Oracle does not start:

SQL> startup
ORA-27102: out of memory
Linux-x86_64 Error: 12: Cannot allocate memory
Additional information: 1
Additional information: 1081350
Additional information: 14

I tried to manually create and fill a dummy file, but it also failed:

[oracle@server1 ~]$ cd /dev/shm
[oracle@server1 shm]$ touch test1
[oracle@server1 shm]$ ll
total 0
-rw-r--r-- 1 oracle dba 0 Feb  5 12:09 test1
[oracle@server1 shm]$ echo test1 > test1
-bash: echo: WRITE error: Invalid argument

And I finally found this in the kernel documentation (!!):

While read system calls are supported on files that reside on hugetlb file systems, write system calls are not.

So I’m wondering what the real use of this mmap method with Huge Pages is…

There are a few posts about using the _realfree_heap_pagesize_hint initialization parameter to make the PGA use Huge Pages, but it does not work on Linux…

Remark:
Huge Pages size can also be obtained using:

[root@server1 orcl]# hugeadm --page-sizes-all
2097152
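
If hugeadm is not installed, the same figure can be derived from the Hugepagesize line of /proc/meminfo, which is reported in kB. This is a minimal sketch; the hugepage_bytes helper name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the default huge page size in bytes.
# $1: meminfo-style file (defaults to /proc/meminfo).
hugepage_bytes() {
    awk '/^Hugepagesize:/ { print $2 * 1024 }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    echo "huge page size in bytes: $(hugepage_bytes)"
fi
```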

Transparent Huge Pages (THP)

Official documentation is available at:

[root@server1 ~]# ll /usr/share/doc/kernel-doc-2.6.39/Documentation/vm/transhuge.txt
-r--r--r-- 1 root root 14310 Jan  4 03:06 /usr/share/doc/kernel-doc-2.6.39/Documentation/vm/transhuge.txt

As its name suggests, it brings the benefits of Huge Pages:

  • Bigger pages (2MB on Linux), so 512 times fewer round trips to the kernel for page faults
  • Faster and fewer Translation Lookaside Buffer (TLB) misses and smaller kernel page tables

And this happens transparently for every application, without the need to allocate the pages in advance (no server reboot). The only drawback I see versus standard Huge Pages is that they can be swapped out (as 4KB pages); Red Hat states this is an added value, but it is not one for Oracle databases.

You can also see in the documentation some interesting insights for the future:

Currently it only works for anonymous memory mappings but in the future it can expand over the pagecache layer starting with tmpfs.

To activate THP, set the file below to always or madvise:

[root@server1 ~]# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
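
The active mode is the value shown between square brackets. For scripting, it can be extracted as in the sketch below; the thp_mode helper is hypothetical and defaults to the sysfs path shown above.

```shell
#!/bin/bash
# Hypothetical helper: print the active THP mode, i.e. the bracketed word.
# $1: file to read (defaults to the sysfs "enabled" file).
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' \
        "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

if [ -r /sys/kernel/mm/transparent_hugepage/enabled ]; then
    echo "THP mode: $(thp_mode)"
fi

# To change the mode, write the wanted value to the same file as root:
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

A value written this way does not survive a reboot, so it has to be reapplied at boot time if you want it to be permanent.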

This will start the khugepaged daemon:

[root@server1 ~]# ps -ef | grep khugepaged | grep -v grep
root        26     2  0 Feb04 ?        00:00:02 [khugepaged]

Automatic memory defragmentation to allocate huge pages is controlled by:

[root@server1 ~]# cat /sys/kernel/mm/transparent_hugepage/defrag
[always] madvise never

You can also find many khugepaged parameters in the /sys/kernel/mm/transparent_hugepage/khugepaged directory.

Red Hat claims a 10% improvement using THP, but how do you monitor THP usage?

For testing I have used the settings below, and measured memory allocation before and after starting an Oracle database:

*.sga_target=150m
*.pga_aggregate_target=100m
*.pre_page_sga=TRUE
*.use_large_pages='ONLY'

For Linux:

[root@server1 ~]# sysctl -w vm.nr_hugepages=77
vm.nr_hugepages = 77

Before:

[root@server1 ~]# grep Ano /proc/meminfo
AnonPages:         24140 kB
AnonHugePages:      6144 kB
[root@server1 ~]# egrep 'trans|thp' /proc/vmstat
nr_anon_transparent_hugepages 3
thp_fault_alloc 3765
thp_fault_fallback 201
thp_collapse_alloc 805
thp_collapse_alloc_failed 14
thp_split 167

After:

[root@server1 ~]# grep Ano /proc/meminfo
AnonPages:        426380 kB
AnonHugePages:    350208 kB
[root@server1 ~]# egrep 'trans|thp' /proc/vmstat
nr_anon_transparent_hugepages 173
thp_fault_alloc 3996
thp_fault_fallback 233
thp_collapse_alloc 805
thp_collapse_alloc_failed 14
thp_split 167
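
Comparing the two snapshots by eye is error-prone, so here is a small sketch that prints only the counters that changed. The vmstat_delta helper is hypothetical; the two snapshots are copied from the measurements above, while on a live system you would capture them by redirecting the egrep output to files.

```shell
#!/bin/bash
# Hypothetical helper: print "counter delta" for every counter whose value
# differs between two /proc/vmstat-style snapshots.
# $1: snapshot taken before, $2: snapshot taken after.
vmstat_delta() {
    awk 'NR == FNR { before[$1] = $2; next }
         ($1 in before) && $2 != before[$1] { print $1, $2 - before[$1] }' \
        "$1" "$2"
}

# Snapshots copied from the before/after output of this test.
workdir=$(mktemp -d)
cat > "$workdir/before.txt" <<'EOF'
nr_anon_transparent_hugepages 3
thp_fault_alloc 3765
thp_fault_fallback 201
thp_collapse_alloc 805
thp_collapse_alloc_failed 14
thp_split 167
EOF
cat > "$workdir/after.txt" <<'EOF'
nr_anon_transparent_hugepages 173
thp_fault_alloc 3996
thp_fault_fallback 233
thp_collapse_alloc 805
thp_collapse_alloc_failed 14
thp_split 167
EOF

vmstat_delta "$workdir/before.txt" "$workdir/after.txt"
rm -rf "$workdir"
```

With these figures it reports 170 more transparent huge pages (340 MB at 2 MB each), 231 additional thp_fault_alloc events and 32 additional thp_fault_fallback events.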

So yes, it is used, but it is difficult to know what is really using it (transparent, you said?). Here I know because the Oracle database is the only thing that has been started… On top of this, mapping what has been set at Oracle level (PGA = 100MB) to what you see at OS level is not possible…
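
One way to narrow down which processes hold transparent huge pages is to sum the AnonHugePages lines of each /proc/PID/smaps file. This is only a sketch: it assumes the running kernel exposes that field in smaps, it needs root to read the files of other users, and the anon_huge_kb helper is hypothetical. It tells you which process uses THP, not which Oracle memory area the pages belong to.

```shell
#!/bin/bash
# Hypothetical helper: sum the AnonHugePages values (kB) of an smaps file.
# $1: smaps-style file.
anon_huge_kb() {
    awk '/^AnonHugePages:/ { sum += $2 } END { print sum + 0 }' "$1"
}

# List every readable process that currently has THP-backed memory.
for f in /proc/[0-9]*/smaps; do
    [ -r "$f" ] || continue
    # A process may exit while we read it; skip it in that case.
    kb=$(anon_huge_kb "$f" 2>/dev/null) || continue
    [ "${kb:-0}" -gt 0 ] || continue
    pid=${f#/proc/}
    pid=${pid%/smaps}
    echo "$pid $(cat "/proc/$pid/comm" 2>/dev/null) ${kb} kB"
done
```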

But again Oracle's position is quite clear on the subject:

Because Transparent Huge Pages are known to cause unexpected node reboots and performance problems with RAC, Oracle strongly advises to disable the use of Transparent Huge Pages. In addition, Transparent Huge Pages may cause problems even in a single-instance database environment with unexpected performance problems or delays. As such, Oracle recommends disabling Transparent Huge Pages on all Database servers running Oracle.

References
