In addition to Veritas Volume Manager (VxVM), Symantec also offers Veritas File System (VxFS), which is most often used in combination with VxVM. Symantec claims the greatest benefit comes from using both together. This document was written using Red Hat Enterprise Linux Server release 5.5 (Tikanga) and the VxFS/VxVM releases below:
[root@server1 ~]# rpm -qa | grep -i vrts
VRTSvlic-3.02.51.010-0
VRTSfssdk-5.1.100.000-SP1_GA_RHEL5
VRTSob-3.4.289-0
VRTSsfmh-3.1.429.0-0
VRTSspt-5.5.000.005-GA
VRTSlvmconv-5.1.100.000-SP1_RHEL5
VRTSodm-5.1.100.000-SP1_GA_RHEL5
VRTSvxvm-18.104.22.168-SP1RP1P1_RHEL5
VRTSvxfs-5.1.100.000-SP1_GA_RHEL5
VRTSatClient-22.214.171.124-0
VRTSaslapm-5.1.100.000-SP1_RHEL5
VRTSatServer-126.96.36.199-0
VRTSperl-188.8.131.52-RHEL5.3
VRTSdbed-5.1.100.000-SP1_RHEL5
To avoid putting real server names in your document, use something like:
export PS1="[\u@server1 \W]# "
File system physical parameters
When creating a file system there are two important characteristics to choose:
- Block size (cannot be changed once the file system has been created)
- Intent log size (can be changed after file system creation with fsadm, VxFS usually performs better with larger log sizes)
When using VxVM with VxFS, Symantec recommends using vxresize (instead of vxassist and fsadm separately) to combine the volume and file system shrink or grow into one operation.
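As an illustration (disk group, volume name, and size below are hypothetical, reusing names from examples later in this document), growing a volume and its file system in one shot with vxresize could look like this:

```shell
# Grow volume lvol9 in disk group vgp1417, and the VxFS file system
# on it, by 1GB in a single operation (names and size are assumptions):
/etc/vx/bin/vxresize -g vgp1417 -F vxfs lvol9 +1g

# The two-step alternative it replaces: grow the volume with vxassist,
# then resize the file system with fsadm.
# vxassist -g vgp1417 growby lvol9 1g
# /opt/VRTS/bin/fsadm -t vxfs -b <new_size> /ora_prisma/rbs
```

With vxresize there is no window where the volume and the file system sizes disagree.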
From man mkfs_vxfs command (-o bsize=bsize):
File system size Default block size
0 TB to 1 TB 1k
>1 TB 8k
Similarly, the block size determines the maximum possible file system size, as given on the following table:
Block size Maximum file system size
1k 32 TB
2k 64 TB
4k 128 TB
8k 256 TB
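The table follows from the fact that a VxFS file system can address a fixed number of blocks, so the maximum size scales linearly with the block size; a quick sanity check of the table values:

```shell
# Each doubling of the block size doubles the maximum file system size:
# the table values are block_size_in_kb * 32 TB.
for bsize_kb in 1 2 4 8; do
  echo "${bsize_kb}k -> $(( bsize_kb * 32 )) TB"
done
# Prints:
# 1k -> 32 TB
# 2k -> 64 TB
# 4k -> 128 TB
# 8k -> 256 TB
```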
Recommended Oracle file system block sizes (assuming your Oracle database block size is 8KB or larger, which is the usual case):
| File System | Block Size |
| --- | --- |
| Oracle software and dump/diagnostic directories | 1KB |
| Redo log directory | 512 bytes for Solaris, AIX, Windows, Linux and 1KB for HP-UX |
| Archived log directory | 1KB |
| Control files directory | 8KB (control file block size is 16KB starting with Oracle 10g) |
| Data, index, undo, system/sysaux and temporary directories | 8KB |
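The block size is passed at file system creation time with -o bsize (and, as noted above, cannot be changed afterwards). A hedged sketch for a datafile file system, assuming a VxVM volume path reused from a later example:

```shell
# 8KB block size for a file system holding data/index/undo datafiles
# (device path is an assumption for illustration):
mkfs -t vxfs -o bsize=8192 /dev/vx/rdsk/vgp1417/lvol8
```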
You can check control file block size with (Linux RedHat 5.5 and Oracle 184.108.40.206):
SQL> select cfbsz from x$kcccf;

     CFBSZ
----------
     16384
     16384
For Oracle releases lower than 10g, the control file block size was equal to the Oracle initialization parameter db_block_size; starting with 10g, their block size is 16KB whatever the value of db_block_size.
You can check redo log block size with (Linux RedHat 5.5 and Oracle 220.127.116.11):
SQL> select lebsz from x$kccle;

     LEBSZ
----------
       512
       512
       512
As disks with 4KB sector size are slowly coming onto the market, Oracle 11gR2 now offers the capability to create redo log files with the block size you like…
Intent log size
From man mkfs_vxfs command (-o logsize=n):
Block size Minimum log size Maximum log size
———- —————- —————-
1k 256 blocks 262,144 blocks
2k 128 blocks 131,072 blocks
4k 64 blocks 65,536 blocks
8k 32 blocks 32,768 blocks
The default log size increases with the file system size, as shown on the following table:
File system size Default log size
0 MB to 8 MB 256k
8 MB to 512 MB 1 MB
512 MB to 16 GB 16 MB
16 GB to 512 GB 64 MB
512+ GB 256 MB
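The intent log size can also be set explicitly at creation time with -o logsize, expressed in file system blocks. A hedged sketch, assuming a 1KB block size (so 65,536 blocks make a 64MB log) and a hypothetical device path:

```shell
# 1KB block size with a 64MB intent log (65536 blocks of 1KB);
# device path is an assumption for illustration:
mkfs -t vxfs -o bsize=1024,logsize=65536 /dev/vx/rdsk/vgp1417/lvol9
```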
To display the current intent log size:
[root@server1 ~]# /opt/VRTS/bin/fsadm -t vxfs -L /ora_prisma/rbs
UX:vxfs fsadm: INFO: V-3-25669: logsize=16384 blocks, logvol=""
If the fsadm command complains with something like:
fsadm: Wrong argument "-t". (see: fsadm --help)
then look for the VxFS binaries in the /opt/VRTS/bin directory. Be careful if changing the PATH, because simple tools like df will not behave the same, as Symantec has rewritten them.
So in my 6GB file system example:
[root@server1 ~]# df -P /ora_prisma/rbs
Filesystem                1024-blocks    Used Available Capacity Mounted on
/dev/vx/dsk/vgp1417/lvol9     6291456 5264042    963208      85% /ora_prisma/rbs
The default block size (1KB) was chosen, and so the default intent log size of 16384 blocks, i.e. 16MB.
So which intent log size should you choose? Symantec says recovery time increases with a larger intent log, while VxFS performs better with a larger intent log size. As you obviously want to tune for the 99.99% of the time when your system is up and running, you should consider creating a large intent log, keeping in mind that behavior must be monitored while the application is running (there is no clear Oracle recommendation)…
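As mentioned earlier, the intent log can be resized after creation with fsadm on a mounted file system; a sketch (the target size is an arbitrary example, option per the fsadm_vxfs man page):

```shell
# Grow the intent log of the mounted file system to 65536 blocks
# (64MB with a 1KB block size):
/opt/VRTS/bin/fsadm -t vxfs -o logsize=65536 /ora_prisma/rbs
```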
As with Oracle table extents, you can change the default extent allocation policy and/or preallocate space for a file:
[root@server1 prisma]# getext undotbs01.dbf
undotbs01.dbf: Bsize 1024 Reserve 0 Extent Size 0
An extent size of 0 uses the default extent allocation. See vxtunefs for the policy description (the relevant parameters are initial_extent_size and max_seqio_extent_size).
A small example with an empty file:
[root@server1 data]# df -P .
Filesystem                1024-blocks     Used Available Capacity Mounted on
/dev/vx/dsk/vgp4118/lvol4    83886080 28377085  52039685      36% /ora_iedbre/data
[root@server1 data]# touch yannick
[root@server1 data]# getext yannick
yannick: Bsize 1024 Reserve 0 Extent Size 0
[root@server1 data]# df -P .
Filesystem                1024-blocks     Used Available Capacity Mounted on
/dev/vx/dsk/vgp4118/lvol4    83886080 28377085  52039684      36% /ora_iedbre/data
Now changing its extent size and initial allocation:
[root@server1 data]# setext -t vxfs -r 30g -f chgsize yannick
[root@server1 data]# df -P .
Filesystem                1024-blocks     Used Available Capacity Mounted on
/dev/vx/dsk/vgp4118/lvol4    83886080 59834365  22548484      73% /ora_iedbre/data
[root@server1 data]# ll yannick
-rw-r----- 1 root root 32212254720 Jun 22 14:45 yannick
[root@server1 data]# getext yannick
yannick: Bsize 1024 Reserve 31457280 Extent Size 0
[root@server1 data]# setext -t vxfs -e 1g -r 30g yannick
[root@server1 data]# ll yannick
-rw-r----- 1 root root 32212254720 Jul 4 12:55 yannick
[root@server1 data]# getext yannick
yannick: Bsize 1024 Reserve 31457280 Extent Size 1048576
Please note it takes a bit of time to recover free space when deleting this test file.
Fixed extent sizes and Oracle? I would say they are beneficial for Oracle datafiles as they avoid fragmentation, but if, like me, you use the autoextend feature, then do not set too small a next extent and you will achieve the same behavior.
File system tuning
Tunable filesystem parameters
[root@server1 ~]# vxtunefs -p /ora_prisma/rbs
Filesystem i/o parameters for /ora_prisma/rbs
read_pref_io = 65536
read_nstream = 1
read_unit_io = 65536
write_pref_io = 65536
write_nstream = 1
write_unit_io = 65536
pref_strength = 10
buf_breakup_size = 1048576
discovered_direct_iosz = 262144
max_direct_iosz = 1048576
default_indir_size = 8192
odm_cache_enable = 0
write_throttle = 0
max_diskq = 1048576
initial_extent_size = 8
max_seqio_extent_size = 2048
max_buf_data_size = 8192
hsm_write_prealloc = 0
read_ahead = 1
inode_aging_size = 0
inode_aging_count = 0
fcl_maxalloc = 195225600
fcl_keeptime = 0
fcl_winterval = 3600
fcl_ointerval = 600
oltp_load = 0
delicache_enable = 1
thin_friendly_alloc = 0
If file systems are used with VxVM, it is suggested to leave the default values, so do test carefully when changing them…
When using VxFS with VxVM, VxVM by default breaks up I/O requests larger than 256K.
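If you do decide to change vxtunefs values, /etc/vx/tunefstab makes them persistent across reboots; the device and values below are purely illustrative, not recommendations:

```shell
# /etc/vx/tunefstab -- entries are applied automatically at mount time.
# Device path and parameter values are examples only:
/dev/vx/dsk/vgp1417/lvol9 read_pref_io=131072,read_nstream=4
```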
File system fragmentation
To display the fragmentation report, issue:
[root@server1 ~]# /opt/VRTS/bin/fsadm -t vxfs -D /ora_prisma/data

  Directory Fragmentation Report
          Dirs      Total     Immed     Immeds    Dirs to   Blocks to
          Searched  Blocks    Dirs      to Add    Reduce    Reduce
  total   3         1         2         0         0         0
Symantec does recommend performing regular file system defragmentation (!!):
In general, VxFS works best if the percentage of free space in the file system does not get below 10 percent. This is because file systems with 10 percent or more free space have less fragmentation and better extent allocation. Regular use of Veritas df command (not the default OS df) to monitor free space is desirable (man df_vxfs).
[root@server1 ~]# /opt/VRTS/bin/df -o s /ora_prisma/data/
/ora_prisma/data (/dev/vx/dsk/vgp1417/lvol8): 14065752 blocks 1875431 files
Free Extents by Size
          1:          2            2:          2            4:          1
          8:          1           16:          1           32:          0
         64:          0          128:          1          256:          1
        512:          1         1024:          1         2048:          0
       4096:          1         8192:          1        16384:          1
      32768:          0        65536:          0       131072:          1
     262144:          0       524288:          0      1048576:          1
    2097152:          1      4194304:          1      8388608:          0
   16777216:          0     33554432:          0     67108864:          0
  134217728:          0    268435456:          0    536870912:          0
 1073741824:          0   2147483648:          0
An unfragmented file system has the following characteristics:
- Less than 1 percent of free space in extents of less than 8 blocks in length
- Less than 5 percent of free space in extents of less than 64 blocks in length
- More than 5 percent of the total file system size available as free extents in lengths of 64 or more blocks
A badly fragmented file system has one or more of the following characteristics:
- Greater than 5 percent of free space in extents of less than 8 blocks in length
- More than 50 percent of free space in extents of less than 64 blocks in length
- Less than 5 percent of the total file system size available as free extents in lengths of 64 or more blocks
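Defragmentation itself is done online with fsadm; a sketch of a typical invocation (options per the fsadm_vxfs man page; best run during a quiet period):

```shell
# -d: reorganize directories, -e: reorganize (defragment) extents,
# -D/-E: print the directory and extent fragmentation reports.
/opt/VRTS/bin/fsadm -t vxfs -d -D -e -E /ora_prisma/data
```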
Suggested mount options for Oracle databases:
| File System | Normal Mount Options (VxFS) | Advanced Mount Options (VxFS) |
| --- | --- | --- |
| Oracle software and dump/diagnostic directories | delaylog,datainlog,nolargefiles | delaylog,nodatainlog,nolargefiles |
| Redo log directory | delaylog,datainlog,largefiles | delaylog,nodatainlog,convosync=direct,mincache=direct,largefiles |
| Archived log directory | delaylog,datainlog,nolargefiles | delaylog,nodatainlog,convosync=direct,mincache=direct,nolargefiles |
| Control files directory | delaylog,datainlog,nolargefiles | delaylog,datainlog,nolargefiles |
| Data, index, undo, system/sysaux and temporary directories | delaylog,datainlog,largefiles | delaylog,nodatainlog,convosync=direct,mincache=direct,largefiles |
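In /etc/fstab such mount options would look like the following sketch (devices, mount points, and the chosen option sets are assumptions based on the table above):

```shell
# /etc/fstab entries (example devices; "advanced" options for data, "normal" for redo):
/dev/vx/dsk/vgp1417/lvol8  /ora_prisma/data  vxfs  delaylog,nodatainlog,convosync=direct,mincache=direct,largefiles  0 0
/dev/vx/dsk/vgp1417/lvol9  /ora_prisma/rbs   vxfs  delaylog,datainlog,largefiles                                     0 0
```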
The licensed Concurrent I/O (CIO) feature should also be considered when looking for I/O performance with an Oracle database.
Veritas Extension for Oracle Disk Manager
I found this in the Veritas File System Administrator’s Guide, and in more depth in Veritas Storage Foundation: Storage and Availability Management for Oracle Databases, and it’s like rediscovering hot water. What is this Oracle Disk Manager (ODM)?
From Symantec documentation:
The benefits of using Oracle Disk Manager are as follows:
- True kernel asynchronous I/O for files and raw devices
- Reduced system call overhead
- Improved file system layout by preallocating contiguous files on a VxFS file system
- Performance on file system files that is equivalent to raw devices
- Transparent to users
Oracle Disk Manager improves database I/O performance to VxFS file systems by:
- Supporting kernel asynchronous I/O
- Supporting direct I/O and avoiding double buffering
- Avoiding kernel write locks on database files
- Supporting many concurrent I/Os in one system call
- Avoiding duplicate opening of files per Oracle instance
- Allocating contiguous datafiles
From Oracle documentation:
Oracle has developed a new disk and file management API called odmlib, which is marketed under the feature Oracle Disk Manager (ODM). ODM is fundamentally a file management and I/O interface that allows DBAs to manage larger and more complex databases, whilst maintaining the total cost of ownership.
Oracle Disk Manager (ODM) is packaged as part of Oracle9i and above; however, you’ll need a third party vendor’s ODM driver to fully implement Oracle’s interface. For example, Veritas’ VRTSodm package (in Database Edition V3.5) provides an ODM library. Other vendors such as HP and Network Appliance (DAFS) have also announced support and integration of ODM.
A bit of history, from a Veritas slide: ODM is an integrated solution and is considered the replacement for Quick I/O.
Let’s confirm the option is available and usable:
[root@server1 ~]# rpm -qa | grep VRTSodm
VRTSodm-5.1.100.000-SP1_GA_RHEL5
[root@server1 ~]# /sbin/vxlictest -n "VERITAS Database Edition for Oracle" -f "ODM"
ODM feature is licensed
[root@server1 ~]# /opt/VRTS/bin/vxlicrep | grep ODM
ODM = Enabled
[root@server1 ~]# lsmod | grep odm
vxodm                 164224  1
fdd                    83552  2 vxodm
[root@server1 ~]# ll /dev/odm
total 0
-rw-rw-rw- 1 root root 0 Jul  3 17:27 cluster
-rw-rw-rw- 1 root root 0 Jul  3 17:27 ctl
-rw-rw-rw- 1 root root 0 Jul  3 17:27 fid
-rw-rw-rw- 1 root root 0 Jul  3 17:27 ktrace
-rw-rw-rw- 1 root root 0 Jul  3 17:27 stats
Looking at the documentation on how to configure it, I was surprised to see that it’s already there:
[root@server1 ~]# /etc/init.d/vxodm status
vxodm is running...
[orapris@server1 ~]$ ll $ORACLE_HOME/lib/libodm*
-rw-r--r-- 1 orapris dba  7442 Aug 14  2009 /ora_prisma/software/lib/libodm11.a
lrwxrwxrwx 1 orapris dba    12 Nov 12  2011 /ora_prisma/software/lib/libodm11.so -> libodmd11.so
-rw-r--r-- 1 orapris dba 12331 Aug 14  2009 /ora_prisma/software/lib/libodmd11.so
So what’s the difference between the library from the Veritas package and this one? I have the feeling that the Oracle one is a stub library provided for link consistency; in any case you must use the one coming from the Veritas package.
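The Veritas-documented way of plugging in their library is to replace the Oracle-provided libodm11.so with a link to the VRTSodm one; a sketch (library path is the usual one for 64-bit Linux in the Veritas documentation, adapt to your release, and shut down the database first):

```shell
# As the Oracle software owner, with the database down:
cd $ORACLE_HOME/lib
mv libodm11.so libodm11.so.oracle-backup          # keep the original stub library
ln -s /opt/VRTSodm/lib64/libodm.so libodm11.so    # point Oracle at the Veritas ODM library
```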
Once the database is restarted, you should see this appearing in the alert log file located in the ADR (Automatic Diagnostic Repository):
Oracle instance running with ODM: Veritas 5.1.100.00 ODM Library, Version 2.0
Once activated (the /dev/odm/fid file, File Identification Descriptor, is no longer empty) you can find usage statistics in:
[root@server1 data]# cat /dev/odm/stats
abort: 0        cancel: 0       commit: 0
create: 0       delete: 0       identify: 0
io: 0           reidentify: 0   resize: 0
unidentify: 0   mname: 0        vxctl: 0
vxvers: 0       mname2: 0       protvers: 0
sethint: 0      gethint: 660    resethint: 0
io req: 0       io calls: 0     comp req: 0
comp calls: 0   io mor cmp: 0   io zro cmp: 0
io nop cmp: 0   cl receive: 0   cl ident: 0
cl reserve: 0   cl delete: 0    cl resize: 0
cl join: 0      cl same op: 0   cl opt idn: 0
cl opt rsv: 0
And using odmstat:
[root@server1 ~]# odmstat -i 10 -c 5 /ora_prisma/log/prisma/redo01.log
                                      OPERATIONS        FILE BLOCKS       AVG TIME(ms)
FILE NAME                             NREADS  NWRITES   RBLOCKS  WBLOCKS  RTIME  WTIME
Wed 04 Jul 2012 06:06:49 PM CEST
/ora_prisma/log/prisma/redo01.log     601     4842      614401   106126   0.0    111.2
Wed 04 Jul 2012 06:06:59 PM CEST
/ora_prisma/log/prisma/redo01.log     0       6         0        30       0.0    53.3
Wed 04 Jul 2012 06:07:09 PM CEST
/ora_prisma/log/prisma/redo01.log     0       5         0        11       0.0    22.0
Wed 04 Jul 2012 06:07:19 PM CEST
/ora_prisma/log/prisma/redo01.log     0       6         0        121      0.0    23.3
Wed 04 Jul 2012 06:07:29 PM CEST
/ora_prisma/log/prisma/redo01.log     0       6         0        65       0.0    88.3
Once ODM is activated, you no longer have to bother with mount options and file system properties, as ODM performs direct I/O (raw-like) and works in kernelized asynchronous I/O (KAIO) mode.
It is strongly suggested to back up your database files before deactivating ODM.
Veritas Cached Oracle Disk Manager
As we have seen, ODM bypasses the file system cache, so it does direct I/O with no read-ahead (and as we know, read-intensive databases can suffer from this). Cached ODM (CODM) implements selective cached I/O. Rather than a long explanation, here is the Symantec documentation:
ODM I/O bypasses the file system cache and directly reads from and writes to disk. Cached ODM enables selected I/O to use caching (file system buffering) and read ahead, which can improve overall Oracle DB I/O performance. Cached ODM performs a conditional form of caching that is based on per-I/O hints from Oracle. The hints indicate what Oracle will do with the data. ODM uses these hints to perform caching and read ahead for some reads, but ODM avoids caching other reads, possibly even for the same file.
CODM is an ODM extension (ODM must be installed as a prerequisite); check that the CODM package is installed with:
[root@server1 ~]# rpm -qa | grep VRTSdbed
VRTSdbed-5.1.100.000-SP1_RHEL5
Activate it on a file system using the following (use /etc/vx/tunefstab to make it persistent across reboots):
[root@server1 ~]# vxtunefs -o odm_cache_enable=1 /ora_prisma/log
Then use the odmadm setcachefile and getcachefile subcommands to change the setting for individual files:
[root@server1 ~]# odmadm getcachefile /ora_prisma/data/prisma/mndata01.dbf
/ora_prisma/data/prisma/mndata01.dbf,DEF
The cachemap maps file type and I/O type combinations to caching advisories. You can tune it using the odmadm setcachemap and getcachemap subcommands. List of available parameters:
[root@eul2032 ~]# odmadm getcachemap
ctl/redolog_write           none
ctl/redolog_read            none
ctl/archlog_read            none
ctl/media_recovery_write    none
ctl/mirror_read             none
ctl/resilvering_write       none
ctl/ctl_file_read           none
ctl/ctl_file_write          none
ctl/flash_read              none
ctl/flash_write             none
.
.
On top of the complexity of understanding which files can benefit from caching, the cachemap has so many values to tune that it becomes nearly impossible to tune CODM manually without guidance. Please note that cachemap settings are not persistent across reboots; use the /etc/vx/odmadm file to achieve that. So how to proceed?
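As a sketch of the persistence mechanism (the cachemap key and advisory below are purely illustrative, with the entry format assumed from the getcachemap output above):

```shell
# Change a cachemap entry on the running system (key/advisory are examples)...
odmadm setcachemap data/data_read_seq=cache,readahead

# ...and record the same setcachemap line in /etc/vx/odmadm so it is
# reapplied at boot (assumed file format: one odmadm subcommand per line):
echo "setcachemap data/data_read_seq=cache,readahead" >> /etc/vx/odmadm
```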
It is advised not to change the default cachemap, to avoid drawbacks like double caching between the file system cache and the Oracle SGA. To understand which files can benefit from CODM you have two options:
- Use a Veritas tool called Cached ODM Manager (dbed_codm_adm) that can be used by DBAs.
- Generate AWR reports (Oracle 10g and above) and order tablespaces/files by reads; the datafiles with the highest physical reads would benefit from CODM.
Putting it all together, it starts to get a bit complex; there are now three cache layers:
- Oracle SGA
- File system cache
- CODM (ODM)
So where should you put the available memory? The added value of CODM is dynamic allocation, whereas the SGA is not dynamic (SGA_MAX_SIZE / MEMORY_MAX_TARGET). And CODM versus the file system cache? CODM has much better granularity as a per-file cache, so you can activate it only for the files where it’s really needed (using AWR and/or dbed_codm_adm).
References
- Storage Foundation DocCentral
- Pros and Cons of Using Direct I/O for Databases [ID 1005087.1]
- Async IO does not appear to be in use by Oracle on VxFS [ID 756275.1]
- Master Note for Oracle Disk Manager [ID 1226653.1]
- Cached Oracle Disk Manager: Usage Guidelines and Best Practices
- Controlfile Structure
- Oracle Internals – 2 (Oracle Controlfile Structures )
- Log Block Size
- Oracle redo logs use a different blocksize
- Understanding 4KB Sector Support for Oracle Files