Pacemaker configuration for an Oracle database and its listener
IT World (https://blog.yannickjaquier.com), Fri, 08 Jun 2018
https://blog.yannickjaquier.com/linux/pacemaker-configuration-oracle-database.html

Preamble

To test a real-life high availability scenario you might want to create an operating system cluster that simulates what you could have in production. Where I work the standard tool to manage OS clusters is Veritas Cluster Server (VCS). It's a nice tool, but its installation requires a license key that is not easy to obtain just to test the product.

A free alternative exists, however: Pacemaker. In this blog post I will set up a complete cluster with a virtual IP address (192.168.56.99), an LVM volume group (vg01), a file system (/u01) and finally an Oracle database and its associated listener. The listener will obviously listen on the virtual IP address of the cluster.

For testing I have used two virtual machines running Oracle Linux Server release 7.2 (64-bit) and Oracle Enterprise Edition 12cR2 (12.2.0.1.0), but any Oracle release can be used. The virtual servers are:

  • server2.domain.com using non-routable IP address 192.168.56.102
  • server3.domain.com using non-routable IP address 192.168.56.103

The command to control and manage Pacemaker is pcs.

Pacemaker installation

Install pcs, the tool that controls and configures Pacemaker and Corosync, with:

[root@server2 ~]# yum -y install pcs

Pacemaker and Corosync are installed as dependencies:

Dependencies Resolved

===========================================================================================================================================================================================================
 Package                                                           Arch                                 Version                                             Repository                                Size
===========================================================================================================================================================================================================
Installing:
 pcs                                                               x86_64                               0.9.152-10.0.1.el7                                  ol7_latest                               5.0 M
Installing for dependencies:
 corosync                                                          x86_64                               2.4.0-4.el7                                         ol7_latest                               212 k
 corosynclib                                                       x86_64                               2.4.0-4.el7                                         ol7_latest                               125 k
 libqb                                                             x86_64                               1.0-1.el7                                           ol7_latest                                91 k
 libtool-ltdl                                                      x86_64                               2.4.2-22.el7_3                                      ol7_latest                                48 k
 libxslt                                                           x86_64                               1.1.28-5.0.1.el7                                    ol7_latest                               241 k
 libyaml                                                           x86_64                               0.1.4-11.el7_0                                      ol7_latest                                54 k
 nano                                                              x86_64                               2.3.1-10.el7                                        ol7_latest                               438 k
 net-snmp-libs                                                     x86_64                               1:5.7.2-24.el7_3.2                                  ol7_latest                               747 k
 pacemaker                                                         x86_64                               1.1.15-11.el7                                       ol7_latest                               441 k
 pacemaker-cli                                                     x86_64                               1.1.15-11.el7                                       ol7_latest                               319 k
 pacemaker-cluster-libs                                            x86_64                               1.1.15-11.el7                                       ol7_latest                                95 k
 pacemaker-libs                                                    x86_64                               1.1.15-11.el7                                       ol7_latest                               521 k
 perl-TimeDate                                                     noarch                               1:2.30-2.el7                                        ol7_latest                                51 k
 psmisc                                                            x86_64                               22.20-11.el7                                        ol7_latest                               140 k
 python-backports                                                  x86_64                               1.0-8.el7                                           ol7_latest                               5.2 k
 python-backports-ssl_match_hostname                               noarch                               3.4.0.2-4.el7                                       ol7_latest                                11 k
 python-clufter                                                    x86_64                               0.59.5-2.0.1.el7                                    ol7_latest                               349 k
 python-lxml                                                       x86_64                               3.2.1-4.el7                                         ol7_latest                               758 k
 python-setuptools                                                 noarch                               0.9.8-4.el7                                         ol7_latest                               396 k
 resource-agents                                                   x86_64                               3.9.5-82.el7                                        ol7_latest                               359 k
 ruby                                                              x86_64                               2.0.0.648-29.el7                                    ol7_latest                                68 k
 ruby-irb                                                          noarch                               2.0.0.648-29.el7                                    ol7_latest                                89 k
 ruby-libs                                                         x86_64                               2.0.0.648-29.el7                                    ol7_latest                               2.8 M
 rubygem-bigdecimal                                                x86_64                               1.2.0-29.el7                                        ol7_latest                                80 k
 rubygem-io-console                                                x86_64                               0.4.2-29.el7                                        ol7_latest                                51 k
 rubygem-json                                                      x86_64                               1.7.7-29.el7                                        ol7_latest                                76 k
 rubygem-psych                                                     x86_64                               2.0.0-29.el7                                        ol7_latest                                78 k
 rubygem-rdoc                                                      noarch                               4.0.0-29.el7                                        ol7_latest                               319 k
 rubygems                                                          noarch                               2.0.14.1-29.el7                                     ol7_latest                               215 k

Transaction Summary
===========================================================================================================================================================================================================

Start and enable the pcsd service on all nodes:

[root@server2 ~]# systemctl start pcsd.service
[root@server2 ~]# systemctl enable pcsd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/pcsd.service to /usr/lib/systemd/system/pcsd.service.

Change the hacluster password on all nodes:

[root@server3 ~]# echo secure_password | passwd --stdin hacluster
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.

Set authentication for pcs:

[root@server3 ~]# pcs cluster auth server2.domain.com server3.domain.com
Username: hacluster
Password:
server3.domain.com: Authorized
server2.domain.com: Authorized

Create your cluster (cluster01) on your two nodes with:

[root@server2 ~]# pcs cluster setup --start --name cluster01 server2.domain.com server3.domain.com
Destroying cluster on nodes: server2.domain.com, server3.domain.com...
server2.domain.com: Stopping Cluster (pacemaker)...
server3.domain.com: Stopping Cluster (pacemaker)...
server2.domain.com: Successfully destroyed cluster
server3.domain.com: Successfully destroyed cluster

Sending cluster config files to the nodes...
server2.domain.com: Succeeded
server3.domain.com: Succeeded

Starting cluster on nodes: server2.domain.com, server3.domain.com...
server2.domain.com: Starting Cluster...
server3.domain.com: Starting Cluster...

Synchronizing pcsd certificates on nodes server2.domain.com, server3.domain.com...
server3.domain.com: Success
server2.domain.com: Success

Restarting pcsd on the nodes in order to reload the certificates...
server3.domain.com: Success
server2.domain.com: Success

Check it with:

[root@server2 ~]# pcs status
Cluster name: cluster01
WARNING: no stonith devices and stonith-enabled is not false
Stack: unknown
Current DC: NONE
Last updated: Wed Apr 19 10:01:02 2017          Last change: Wed Apr 19 10:00:47 2017 by hacluster via crmd on server2.domain.com

2 nodes and 0 resources configured

Node server2.domain.com: UNCLEAN (offline)
Node server3.domain.com: UNCLEAN (offline)

No resources


Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

Notice the WARNING above about the missing STONITH device.
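To spot that warning from a script rather than by eye, a minimal sketch could look like this. The function name has_stonith_warning is mine and not part of pcs, and I assume the warning is worded as in the pcs 0.9 output shown above; other versions may phrase it differently.

```shell
# Reads `pcs status` text on stdin.
# Exit status 0 if the "no stonith devices" warning is present, 1 otherwise.
# Assumption: the wording matches the pcs 0.9 output shown in this post.
has_stonith_warning() {
    grep -q 'WARNING: no stonith devices'
}

# On a cluster node (requires pcs):
#   pcs status 2>&1 | has_stonith_warning && echo "fencing is not configured"
```

The 2>&1 in the usage line is only there in case a given pcs version prints the warning on stderr.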

Enable the cluster on all nodes, so that Corosync and Pacemaker start at boot, with:

[root@server2 ~]# pcs cluster enable --all
server2.domain.com: Cluster Enabled
server3.domain.com: Cluster Enabled
[root@server2 ~]# pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: server2.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
 Last updated: Wed Apr 19 10:02:02 2017         Last change: Wed Apr 19 10:01:08 2017 by hacluster via crmd on server2.domain.com
 2 nodes and 0 resources configured

PCSD Status:
  server2.domain.com: Online
  server3.domain.com: Online

As the documentation says:

STONITH is an acronym for “Shoot The Other Node In The Head” and it protects your data from being corrupted by rogue nodes or concurrent access.

The situation fencing guards against is known as split brain: several nodes access the same resource (for example, writing to the same filesystem) at the same time, and the sole goal of STONITH is to avoid the resulting corruption. As the aim here is to build something simple, I will disable fencing with:

[root@server2 ~]# pcs property set stonith-enabled=false
[root@server2 ~]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server2.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Apr 19 10:02:53 2017          Last change: Wed Apr 19 10:02:49 2017 by root via cibadmin on server2.domain.com

2 nodes and 0 resources configured

Online: [ server2.domain.com server3.domain.com ]

No resources


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
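Comparing the Daemon Status sections above, corosync and pacemaker went from active/disabled to active/enabled once the cluster was enabled. A small helper can list the daemons that would not start at boot. This is only a sketch: the name daemons_not_enabled is hypothetical and it assumes the output layout shown in this post.

```shell
# Reads `pcs status` text on stdin and prints the name of each daemon
# listed under "Daemon Status:" whose state is not active/enabled.
# Assumption: the section layout is the one printed by pcs 0.9.
daemons_not_enabled() {
    awk '/^Daemon Status:/ { in_section = 1; next }
         in_section && NF == 2 && $2 != "active/enabled" {
             sub(/:$/, "", $1)   # drop the trailing colon from the daemon name
             print $1
         }'
}

# On a cluster node (requires pcs):
#   pcs status | daemons_not_enabled
```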

Pacemaker resources creation

To create a new resource you might want to know which resource agents are available in Pacemaker:

[root@server2 ~]# pcs resource list ocf:heartbeat
ocf:heartbeat:CTDB - CTDB Resource Agent
ocf:heartbeat:Delay - Waits for a defined timespan
ocf:heartbeat:Dummy - Example stateless resource agent
ocf:heartbeat:Filesystem - Manages filesystem mounts
ocf:heartbeat:IPaddr - Manages virtual IPv4 and IPv6 addresses (Linux specific version)
ocf:heartbeat:IPaddr2 - Manages virtual IPv4 and IPv6 addresses (Linux specific version)
ocf:heartbeat:IPsrcaddr - Manages the preferred source address for outgoing IP packets
ocf:heartbeat:LVM - Controls the availability of an LVM Volume Group
ocf:heartbeat:MailTo - Notifies recipients by email in the event of resource takeover
ocf:heartbeat:Route - Manages network routes
ocf:heartbeat:SendArp - Broadcasts unsolicited ARP announcements
ocf:heartbeat:Squid - Manages a Squid proxy server instance
ocf:heartbeat:VirtualDomain - Manages virtual domains through the libvirt virtualization framework
ocf:heartbeat:Xinetd - Manages a service of Xinetd
ocf:heartbeat:apache - Manages an Apache Web server instance
ocf:heartbeat:clvm - clvmd
ocf:heartbeat:conntrackd - This resource agent manages conntrackd
ocf:heartbeat:db2 - Resource Agent that manages an IBM DB2 LUW databases in Standard role as primitive or in HADR roles as master/slave configuration. Multiple partitions are supported.
ocf:heartbeat:dhcpd - Chrooted ISC DHCP server resource agent.
ocf:heartbeat:docker - Docker container resource agent.
ocf:heartbeat:ethmonitor - Monitors network interfaces
ocf:heartbeat:exportfs - Manages NFS exports
ocf:heartbeat:galera - Manages a galara instance
ocf:heartbeat:garbd - Manages a galera arbitrator instance
ocf:heartbeat:iSCSILogicalUnit - Manages iSCSI Logical Units (LUs)
ocf:heartbeat:iSCSITarget - iSCSI target export agent
ocf:heartbeat:iface-vlan - Manages VLAN network interfaces.
ocf:heartbeat:mysql - Manages a MySQL database instance
ocf:heartbeat:nagios - Nagios resource agent
ocf:heartbeat:named - Manages a named server
ocf:heartbeat:nfsnotify - sm-notify reboot notifications
ocf:heartbeat:nfsserver - Manages an NFS server
ocf:heartbeat:nginx - Manages an Nginx web/proxy server instance
ocf:heartbeat:oracle - Manages an Oracle Database instance
ocf:heartbeat:oralsnr - Manages an Oracle TNS listener
ocf:heartbeat:pgsql - Manages a PostgreSQL database instance
ocf:heartbeat:portblock - Block and unblocks access to TCP and UDP ports
ocf:heartbeat:postfix - Manages a highly available Postfix mail server instance
ocf:heartbeat:rabbitmq-cluster - rabbitmq clustered
ocf:heartbeat:redis - Redis server
ocf:heartbeat:rsyncd - Manages an rsync daemon
ocf:heartbeat:slapd - Manages a Stand-alone LDAP Daemon (slapd) instance
ocf:heartbeat:symlink - Manages a symbolic link
ocf:heartbeat:tomcat - Manages a Tomcat servlet environment instance
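The list is long, so filtering it helps to find the agents of interest (here the two Oracle ones, oracle and oralsnr). A minimal sketch, where agents_matching is a hypothetical helper name and the input format is assumed to be the "name - description" layout above:

```shell
# Reads `pcs resource list` text on stdin and prints the agent name
# (first column) of every line matching the keyword, case-insensitively.
agents_matching() {
    grep -i -- "$1" | awk '{ print $1 }'
}

# On a cluster node (requires pcs):
#   pcs resource list ocf:heartbeat | agents_matching oracle
```

The details of one agent can then be displayed with pcs resource describe followed by the agent name.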

Virtual IP address

Add a first resource, a virtual IP address, to test your cluster. I have chosen the VirtualBox host-only adapter, the one used for communication between cluster nodes, so eth0 on all my nodes. How to configure this adapter with Oracle Enterprise Linux or Red Hat has been covered in a previous post. Create the resource with:

[root@server2 ~]# pcs resource create virtualip IPaddr2 ip=192.168.56.99 cidr_netmask=24 nic=eth0 op monitor interval=10s
[root@server2 ~]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server2.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Apr 19 10:03:50 2017          Last change: Wed Apr 19 10:03:36 2017 by root via cibadmin on server2.domain.com

2 nodes and 1 resource configured

Online: [ server2.domain.com server3.domain.com ]

Full list of resources:

 virtualip      (ocf::heartbeat:IPaddr2):       Started server2.domain.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

You can check at OS level that the address is up with:

[root@server2 ~]# ping -c 1 192.168.56.99
PING 192.168.56.99 (192.168.56.99) 56(84) bytes of data.
64 bytes from 192.168.56.99: icmp_seq=1 ttl=64 time=0.025 ms

--- 192.168.56.99 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.025/0.025/0.025/0.000 ms
[root@server2 ~]# ip addr show dev eth0
2: eth0:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:47:54:07 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.102/24 brd 192.168.56.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.56.99/24 brd 192.168.56.255 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe47:5407/64 scope link
       valid_lft forever preferred_lft forever
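The same check can be scripted. The sketch below assumes the ip addr output format shown above; has_ipv4 is a hypothetical helper name. It compares the address exactly rather than using a plain grep, so that 192.168.56.9 does not match 192.168.56.99.

```shell
# Reads `ip addr show` text on stdin.
# Exit status 0 if the given IPv4 address is configured, 1 otherwise.
has_ipv4() {
    awk -v ip="$1" '$1 == "inet" {
                        split($2, a, "/")          # strip the /prefix length
                        if (a[1] == ip) found = 1
                    }
                    END { exit (found ? 0 : 1) }'
}

# On a cluster node:
#   ip addr show dev eth0 | has_ipv4 192.168.56.99 && echo "virtual IP is here"
```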

Move the virtual IP to server3.domain.com:

[root@server3 ~]# pcs resource move virtualip server3.domain.com
[root@server3 ~]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server2.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Apr 19 10:24:59 2017          Last change: Wed Apr 19 10:06:21 2017 by root via crm_resource on server3.domain.com

2 nodes and 1 resource configured

Online: [ server2.domain.com server3.domain.com ]

Full list of resources:

 virtualip      (ocf::heartbeat:IPaddr2):       Started server3.domain.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

We see the IP address has been transferred to server3.domain.com:

[root@server3 ~]# ip addr show dev eth0
2: eth0:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:b4:9d:bf brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.103/24 brd 192.168.56.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.56.99/24 brd 192.168.56.255 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:feb4:9dbf/64 scope link
       valid_lft forever preferred_lft forever
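One thing to keep in mind: with the pcs 0.9 series used here, pcs resource move works by adding a location constraint that prefers the target node, and this constraint stays until it is cleared. The sketch below only prints the commands by default (the PCS variable and the function names are my own convention, not part of pcs).

```shell
# Dry run by default: commands are printed, not executed.
# Set PCS=pcs on a cluster node to really run them.
PCS=${PCS:-echo pcs}

# List all constraints with their ids, including the one left by a move.
show_constraints() {
    $PCS constraint show --full
}

# Remove the location constraint created by a previous move of a resource.
clear_move_constraint() {
    $PCS resource clear "$1"
}

# Example:
#   show_constraints
#   clear_move_constraint virtualip
```

Once the constraint is cleared the cluster is free to place the resource on any node again.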

Volume group

I create a volume group (vg01) on a shared disk. I also create a file system and mount the logical volume, although this part is not yet required:

[root@server2 ~]# vgcreate vg01 /dev/sdb
  Physical volume "/dev/sdb" successfully created.
  Volume group "vg01" successfully created
[root@server2 ~]# lvcreate -n lvol01 -L 5G vg01
  Logical volume "lvol01" created.
[root@server2 ~]# mkfs -t xfs /dev/vg01/lvol01
meta-data=/dev/vg01/lvol01       isize=256    agcount=4, agsize=327680 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0, sparse=0
data     =                       bsize=4096   blocks=1310720, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@server2 ~]# mkdir /u01
[root@server2 ~]# systemctl daemon-reload
[root@server2 /]# mount -t xfs /dev/vg01/lvol01 /u01
[root@server2 /]# df /u01
Filesystem              1K-blocks  Used Available Use% Mounted on
/dev/mapper/vg01-lvol01   5232640 32928   5199712   1% /u01

I add the LVM resource to Pacemaker; I deliberately create it on server2.domain.com:

[root@server2 /]# pcs resource create vg01 LVM volgrpname=vg01
[root@server2 /]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server3.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Apr 19 15:27:28 2017          Last change: Wed Apr 19 15:27:24 2017 by root via cibadmin on server2.domain.com

2 nodes and 2 resources configured

Online: [ server2.domain.com server3.domain.com ]

Full list of resources:

 virtualip      (ocf::heartbeat:IPaddr2):       Started server3.domain.com
 vg01   (ocf::heartbeat:LVM):   Started server2.domain.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

I try to move the vg01 volume group to server3.domain.com:

[root@server2 ~]# pcs resource move vg01 server3.domain.com
[root@server2 ~]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server3.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Apr 19 15:31:17 2017          Last change: Wed Apr 19 15:31:04 2017 by root via crm_resource on server2.domain.com

2 nodes and 2 resources configured

Online: [ server2.domain.com server3.domain.com ]

Full list of resources:

 virtualip      (ocf::heartbeat:IPaddr2):       Started server3.domain.com
 vg01   (ocf::heartbeat:LVM):   FAILED server2.domain.com (blocked)

Failed Actions:
* vg01_stop_0 on server2.domain.com 'unknown error' (1): call=12, status=complete, exitreason='LVM: vg01 did not stop correctly',
    last-rc-change='Wed Apr 19 15:31:04 2017', queued=1ms, exec=10526ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

All this to show that it is not so easy and requires a bit more configuration. I start by removing the volume group resource:

[root@server2 ~]# pcs resource delete vg01
Deleting Resource - vg01

On all nodes, enable HA-LVM (as the output shows, this also stops and disables lvmetad):

[root@server2 ~]# lvmconf --enable-halvm --services --startstopservices
Warning: Stopping lvm2-lvmetad.service, but it can still be activated by:
  lvm2-lvmetad.socket
Removed symlink /etc/systemd/system/sysinit.target.wants/lvm2-lvmetad.socket.
[root@server2 ~]# ps -ef | grep lvm
root     31974  9198  0 15:58 pts/1    00:00:00 grep --color=auto lvm

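In the /etc/lvm/lvm.conf file of all nodes I add the line below. It lists the volume groups LVM may activate on its own; vg01, which the cluster must control, is deliberately absent: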

volume_list = [ "vg00" ]
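To double check the setting on each node, a rough sketch follows. The helper names are hypothetical, and the match is deliberately simple: it only handles plain volume group names in double quotes, not tags or vg/lv entries.

```shell
# Prints the active (uncommented) volume_list line(s) of an lvm.conf-style file.
active_volume_list() {
    grep -E '^[[:space:]]*volume_list[[:space:]]*=' "$1"
}

# Exit status 0 if the named volume group appears in the active volume_list.
vg_in_volume_list() {
    active_volume_list "$1" | grep -q "\"$2\""
}

# On each node:
#   vg_in_volume_list /etc/lvm/lvm.conf vg01 && echo "vg01 must not be listed"
```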

Execute the command below on each node to rebuild the initramfs so that it picks up the modified lvm.conf. This does not survive a kernel upgrade: the annoying thing is that each time you install a new kernel you have to issue the command for the new kernel BEFORE rebooting, or you need to reboot twice:

[root@server3 ~]# dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)
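After installing a new kernel but before rebooting, uname -r still reports the old version, so the new version has to be given explicitly. A minimal sketch that builds the command for any kernel version (initramfs_rebuild_cmd is a hypothetical helper, and the version in the example is only an illustration, use the directory name found under /lib/modules):

```shell
# Prints the dracut command that rebuilds the initramfs of a given kernel
# version, using the same options as the command above.
initramfs_rebuild_cmd() {
    printf 'dracut -H -f /boot/initramfs-%s.img %s\n' "$1" "$1"
}

# Example with a hypothetical freshly installed kernel (run the result as root):
#   initramfs_rebuild_cmd 3.10.0-514.el7.x86_64
```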

Recreate the volume group resource with the exclusive option (this parameter ensures that only the cluster is capable of activating the LVM logical volumes):

[root@server2 ~]# pcs resource create vg01 LVM volgrpname=vg01 exclusive=true
[root@server2 ~]# pcs resource show
 virtualip      (ocf::heartbeat:IPaddr2):       Started server3.domain.com
 vg01   (ocf::heartbeat:LVM):   Started server2.domain.com
[root@server2 ~]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server3.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Apr 19 17:10:53 2017          Last change: Wed Apr 19 17:10:43 2017 by root via cibadmin on server2.domain.com

2 nodes and 2 resources configured

Online: [ server2.domain.com server3.domain.com ]

Full list of resources:

 virtualip      (ocf::heartbeat:IPaddr2):       Started server3.domain.com
 vg01   (ocf::heartbeat:LVM):   Started server2.domain.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

The volume group move now works fine:

[root@server2 ~]# pcs resource move vg01 server3.domain.com
[root@server2 ~]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server3.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Apr 19 17:12:20 2017          Last change: Wed Apr 19 17:11:06 2017 by root via crm_resource on server2.domain.com

2 nodes and 2 resources configured

Online: [ server2.domain.com server3.domain.com ]

Full list of resources:

 virtualip      (ocf::heartbeat:IPaddr2):       Started server3.domain.com
 vg01   (ocf::heartbeat:LVM):   Started server3.domain.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Filesystem

Create the file system resource on top of the logical volume:

[root@server2 ~]# pcs resource create u01 Filesystem device="/dev/vg01/lvol01" directory="/u01" fstype="xfs"
[root@server2 ~]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server3.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Apr 19 17:13:51 2017          Last change: Wed Apr 19 17:13:47 2017 by root via cibadmin on server2.domain.com

2 nodes and 3 resources configured

Online: [ server2.domain.com server3.domain.com ]

Full list of resources:

 virtualip      (ocf::heartbeat:IPaddr2):       Started server3.domain.com
 vg01   (ocf::heartbeat:LVM):   Started server3.domain.com
 u01    (ocf::heartbeat:Filesystem):    Started server3.domain.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

To colocate resources I create a group. This can also be done with constraints, but a group is more logical in our case (the order you choose is the starting order):

[root@server3 u01]# pcs resource group add oracle virtualip vg01 u01
[root@server3 u01]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server3.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Apr 19 17:36:10 2017          Last change: Wed Apr 19 17:36:07 2017 by root via cibadmin on server3.domain.com

2 nodes and 3 resources configured

Online: [ server2.domain.com server3.domain.com ]

Full list of resources:

 Resource Group: oracle
     virtualip  (ocf::heartbeat:IPaddr2):       Started server3.domain.com
     vg01       (ocf::heartbeat:LVM):   Started server3.domain.com
     u01        (ocf::heartbeat:Filesystem):    Started server3.domain.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
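For comparison, the constraint-based alternative would need one colocation and one ordering constraint per pair of consecutive resources. The sketch below only prints the commands, it does not run them; group_as_constraints is a hypothetical helper and the INFINITY score is my choice to make the colocation mandatory.

```shell
# Prints the pcs constraint commands that mimic a group for an ordered list
# of resources: each resource is colocated with, and started after, the
# resource that precedes it.
group_as_constraints() {
    prev=""
    for res in "$@"; do
        if [ -n "$prev" ]; then
            echo "pcs constraint colocation add $res with $prev INFINITY"
            echo "pcs constraint order $prev then $res"
        fi
        prev=$res
    done
}

# Example:
#   group_as_constraints virtualip vg01 u01
```

Four commands for three resources, against a single one for the group, which is why the group is the more natural choice here.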

At this stage you can test that the resources created so far move from one cluster node to the other with something like:

[root@server3 u01]# pcs cluster standby server3.domain.com
[root@server2 ~]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server3.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Apr 19 17:45:45 2017          Last change: Wed Apr 19 17:45:37 2017 by root via crm_resource on server2.domain.com

2 nodes and 3 resources configured

Node server3.domain.com: standby
Online: [ server2.domain.com ]

Full list of resources:

 Resource Group: oracle
     virtualip  (ocf::heartbeat:IPaddr2):       Started server2.domain.com
     vg01       (ocf::heartbeat:LVM):   Started server2.domain.com
     u01        (ocf::heartbeat:Filesystem):    Started server2.domain.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

And get back to the initial situation with (the resources return to server3.domain.com because I have also played with a preferred node, but this is not mandatory):

[root@server3 ~]# pcs node unstandby server3.domain.com
[root@server3 ~]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server3.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Wed Apr 19 17:48:05 2017          Last change: Wed Apr 19 17:48:03 2017 by root via crm_attribute on server3.domain.com

2 nodes and 3 resources configured

Online: [ server2.domain.com server3.domain.com ]

Full list of resources:

 Resource Group: oracle
     virtualip  (ocf::heartbeat:IPaddr2):       Started server3.domain.com
     vg01       (ocf::heartbeat:LVM):   Started server3.domain.com
     u01        (ocf::heartbeat:Filesystem):    Started server3.domain.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
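To verify after such a standby test that the whole group ended up on a single node, the status output can be reduced to the list of distinct nodes running something. A sketch, assuming the resource lines end with "Started" followed by the node name as above (started_nodes is a hypothetical helper):

```shell
# Reads `pcs status` text on stdin and prints, one per line, the distinct
# nodes on which at least one resource is reported as Started.
started_nodes() {
    awk 'NF >= 2 && $(NF-1) == "Started" { print $NF }' | sort -u
}

# On a cluster node (requires pcs):
#   pcs status | started_nodes
```

A single line of output means every started resource runs on the same node.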

Oracle database

[root@server2 ~]# pcs resource describe oracle
ocf:heartbeat:oracle - Manages an Oracle Database instance

Resource script for oracle. Manages an Oracle Database instance
as an HA resource.

Resource options:
  sid (required): The Oracle SID (aka ORACLE_SID).
  home: The Oracle home directory (aka ORACLE_HOME). If not specified, then the SID along with its home should be listed in /etc/oratab.
  user: The Oracle owner (aka ORACLE_OWNER). If not specified, then it is set to the owner of file $ORACLE_HOME/dbs/*${ORACLE_SID}.ora. If this does not work for you, just set it explicitely.
  monuser: Monitoring user name. Every connection as sysdba is logged in an audit log. This can result in a large number of new files created. A new user is created (if it doesn't exist) in the start action and subsequently used in monitor. It should have very limited rights. Make sure that the password for this user does not expire.
  monpassword: Password for the monitoring user. Make sure that the password for this user does not expire.
  monprofile: Profile used by the monitoring user. If the profile does not exist, it will be created with a non-expiring password.
  ipcrm: Sometimes IPC objects (shared memory segments and semaphores) belonging to an Oracle instance might be left behind which prevents the instance from starting. It is not easy to figure out which shared segments belong to which instance, in particular when more instances are running as same user. What we use here is the "oradebug" feature and its "ipc" trace utility. It is not optimal to parse the debugging information, but I am not aware of any other way to find out about the IPC information. In case the format or wording of the trace report changes, parsing might fail. There are some precautions, however, to prevent stepping on other peoples toes. There is also a dumpinstipc option which will make us print the IPC objects which belong to the instance. Use it to see if we parse the trace file correctly. Three settings are possible: - none: don't mess with IPC and hope for the best (beware: you'll probably be out of luck, sooner or later) - instance: try to figure out the IPC stuff which belongs to the instance and remove only those (default; should be safe) - orauser: remove all IPC belonging to the user which runs the instance (don't use this if you run more than one instance as same user or if other apps running as this user use IPC) The default setting "instance" should be safe to use, but in that case we cannot guarantee that the instance will start. In case IPC objects were already left around, because, for instance, someone mercilessly killing Oracle processes, there is no way any more to find out which IPC objects should be removed. In that case, human intervention is necessary, and probably _all_ instances running as same user will have to be stopped. The third setting, "orauser", guarantees IPC objects removal, but it does that based only on IPC objects ownership, so you should use that only if every instance runs as separate user. Please report any problems. Suggestions/fixes welcome.
  clear_backupmode: The clear of the backup mode of ORACLE.
  shutdown_method: How to stop Oracle is a matter of taste it seems. The default method ("checkpoint/abort") is: alter system checkpoint; shutdown abort; This should be the fastest safe way bring the instance down. If you find "shutdown abort" distasteful, set this attribute to "immediate" in which case we will shutdown immediate; If you still think that there's even better way to shutdown an Oracle instance we are willing to listen.

I then installed Oracle on server3.domain.com, where the /u01 file system is mounted. I also copied /etc/oratab, /usr/local/bin/coraenv, /usr/local/bin/dbhome and /usr/local/bin/oraenv to server2.domain.com. This step is not mandatory but it makes Oracle easier to use on both nodes.

I have also, obviously, changed the listener configuration to make it listen on my virtual IP, i.e. 192.168.56.99.

Create the oracle resource:

[root@server3 ~]# pcs resource create orcl oracle sid="orcl" --group=oracle
[root@server3 ~]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server3.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Thu Apr 20 18:18:41 2017          Last change: Thu Apr 20 18:18:38 2017 by root via cibadmin on server3.domain.com

2 nodes and 4 resources configured

Online: [ server2.domain.com server3.domain.com ]

Full list of resources:

 Resource Group: oracle
     virtualip  (ocf::heartbeat:IPaddr2):       Started server3.domain.com
     vg01       (ocf::heartbeat:LVM):   Started server3.domain.com
     u01        (ocf::heartbeat:Filesystem):    Started server3.domain.com
     orcl       (ocf::heartbeat:oracle):        Started server3.domain.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

You must create the monitoring user (default is OCFMON) and its profile (default is OCFMONPROFILE) or you will get the error message below:

* orcl_start_0 on server2.domain.com 'unknown error' (1): call=268, status=complete, exitreason='monprofile must start with C## for container databases',
    last-rc-change='Fri Apr 21 15:17:06 2017', queued=0ms, exec=17138ms

Please note that container databases are also taken into account: the account must be created in the root container, with the C## prefix, as a common user. I have chosen not to create the required profile, so I must take this into account when configuring the resource:

SQL> create user c##ocfmon identified by "secure_password";

User created.

SQL> grant connect to c##ocfmon;

Grant succeeded.

I then update the Oracle database Pacemaker resource accordingly:

[root@server2 ~]# pcs resource update orcl monpassword="secure_password" monuser="c##ocfmon" monprofile="default"
[root@server2 ~]# pcs resource show orcl
 Resource: orcl (class=ocf provider=heartbeat type=oracle)
  Attributes: sid=orcl monpassword=secure_password monuser=c##ocfmon monprofile=default
  Operations: start interval=0s timeout=120 (orcl-start-interval-0s)
              stop interval=0s timeout=120 (orcl-stop-interval-0s)
              monitor interval=120 timeout=30 (orcl-monitor-interval-120)
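The update above can also be scripted. The sketch below is only an illustration: `build_monitor_update` is a hypothetical helper, not part of pcs, and it assumes the default Oracle rule that common users in a container database start with C##. It only builds the argument list shown above and does not execute anything.

```python
def build_monitor_update(resource, monuser, monpassword,
                         monprofile="default", container_db=True):
    """Return the argv list for `pcs resource update` with the monitoring attributes."""
    # Assumption: in a container database the monitoring account is a common user,
    # and common user names start with C## (default COMMON_USER_PREFIX).
    if container_db and not monuser.lower().startswith("c##"):
        raise ValueError("monuser must start with C## for a container database")
    return [
        "pcs", "resource", "update", resource,
        f"monpassword={monpassword}",
        f"monuser={monuser}",
        f"monprofile={monprofile}",
    ]


cmd = build_monitor_update("orcl", "c##ocfmon", "secure_password")
print(" ".join(cmd))
```

The returned list can be handed to `subprocess.run` on a cluster node if you really want to apply it.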

To have my pluggable database automatically opened at instance startup I have used a nice feature, available since 12.1.0.2, that saves the pluggable database state:

SQL> SELECT * FROM dba_pdb_saved_states;

no rows selected

SQL> set lines 150
SQL> col name for a20
SQL> SELECT name, open_mode FROM v$pdbs;

NAME                 OPEN_MODE
-------------------- ----------
PDB$SEED             READ ONLY
PDB1                 MOUNTED

SQL> alter pluggable database pdb1 open;

Pluggable database altered.

SQL> alter pluggable database pdb1 save state;

Pluggable database altered.

SQL> col con_name for a20
SQL> SELECT con_name, state FROM dba_pdb_saved_states;

CON_NAME             STATE
-------------------- --------------
PDB1                 OPEN

Oracle listener

Create the Oracle listener resource with:

[root@server3 ~]# pcs resource create listener_orcl oralsnr sid="orcl" listener="listener_orcl" --group=oracle
[root@server3 ~]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server3.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Thu Apr 20 18:21:05 2017          Last change: Thu Apr 20 18:21:02 2017 by root via cibadmin on server3.domain.com

2 nodes and 5 resources configured

Online: [ server2.domain.com server3.domain.com ]

Full list of resources:

 Resource Group: oracle
     virtualip  (ocf::heartbeat:IPaddr2):       Started server3.domain.com
     vg01       (ocf::heartbeat:LVM):   Started server3.domain.com
     u01        (ocf::heartbeat:Filesystem):    Started server3.domain.com
     orcl       (ocf::heartbeat:oracle):        Started server3.domain.com
     listener_orcl      (ocf::heartbeat:oralsnr):       Started server3.domain.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
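To check such a status from a script, the resource lines can be parsed. This is a minimal sketch, assuming the `pcs status` layout shown above (it may differ in other Pacemaker versions); `parse_resources` and `group_is_healthy` are hypothetical helper names.

```python
import re

# Sample copied from the pcs status output above.
STATUS = """\
 Resource Group: oracle
     virtualip  (ocf::heartbeat:IPaddr2):       Started server3.domain.com
     vg01       (ocf::heartbeat:LVM):   Started server3.domain.com
     u01        (ocf::heartbeat:Filesystem):    Started server3.domain.com
     orcl       (ocf::heartbeat:oracle):        Started server3.domain.com
     listener_orcl      (ocf::heartbeat:oralsnr):       Started server3.domain.com
"""

# name, (agent):, state and an optional node name
RESOURCE_LINE = re.compile(r"^\s+(\S+)\s+\((\S+)\):\s+(\S+)(?:\s+(\S+))?\s*$")


def parse_resources(text):
    """Map each resource name to a (state, node) tuple."""
    resources = {}
    for line in text.splitlines():
        match = RESOURCE_LINE.match(line)
        if match:
            name, _agent, state, node = match.groups()
            resources[name] = (state, node)
    return resources


def group_is_healthy(resources):
    """True when every resource is Started and all run on the same node."""
    states = {state for state, _ in resources.values()}
    nodes = {node for _, node in resources.values()}
    return states == {"Started"} and len(nodes) == 1


resources = parse_resources(STATUS)
print(resources)
print(group_is_healthy(resources))
```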

Pacemaker graphical interface

Connect to any node of your cluster over HTTPS on port 2224 and you get a very nice graphical interface where you can apparently make all the required modifications to your cluster, including stopping and starting resources. Overall this graphical interface is of great help when you want to know which options are available for resources:

pcs01
pcs02

Issues encountered

LVM volume group creation

If for any reason you must create or re-create the LVM volume group after you have configured the kernel not to activate any volume group other than the root one (vg00 in my case), you must use the trick below to avoid the LVM error messages.

The error messages you will get are:

[root@server2 ~]# vgcreate vg01 /dev/sdb
  Physical volume "/dev/sdb" successfully created.
  Volume group "vg01" successfully created
[root@server2 ~]# lvcreate -L 500m -n lvol01 vg01
  Volume "vg01/lvol01" is not active locally.
  Aborting. Failed to wipe start of new LV.

Trying to activate the volume group does not change anything:

[root@server2 ~]# vgchange -a y vg01
  0 logical volume(s) in volume group "vg01" now active

To overcome the problem use the sequence below:

[root@server2 ~]# lvscan
  ACTIVE            '/dev/vg00/lvol00' [10.00 GiB] inherit
  ACTIVE            '/dev/vg00/lvol03' [500.00 MiB] inherit
  ACTIVE            '/dev/vg00/lvol01' [4.00 GiB] inherit
  ACTIVE            '/dev/vg00/lvol02' [4.00 GiB] inherit
  ACTIVE            '/dev/vg00/lvol20' [5.00 GiB] inherit
[root@server2 ~]# vgcreate vg01 /dev/sdb --addtag pacemaker --config 'activation { volume_list = [ "@pacemaker" ] }'
  Volume group "vg01" successfully created
[root@server2 ~]# lvcreate --addtag pacemaker -L 15g -n lvol01 vg01 --config 'activation { volume_list = [ "@pacemaker" ] }'
  Logical volume "lvol01" created.
[root@server2 ~]# lvs
  LV     VG   Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lvol00 vg00 -wi-ao----  10.00g
  lvol01 vg00 -wi-ao----   4.00g
  lvol02 vg00 -wi-ao----   4.00g
  lvol03 vg00 -wi-ao---- 500.00m
  lvol20 vg00 -wi-ao----   5.00g
  lvol01 vg01 -wi-a-----  15.00g
[root@server2 ~]# lvscan
  ACTIVE            '/dev/vg01/lvol01' [15.00 GiB] inherit
  ACTIVE            '/dev/vg00/lvol00' [10.00 GiB] inherit
  ACTIVE            '/dev/vg00/lvol03' [500.00 MiB] inherit
  ACTIVE            '/dev/vg00/lvol01' [4.00 GiB] inherit
  ACTIVE            '/dev/vg00/lvol02' [4.00 GiB] inherit
  ACTIVE            '/dev/vg00/lvol20' [5.00 GiB] inherit
[root@server2 ~]# lvchange -an vg01/lvol01 --deltag pacemaker
  Logical volume vg01/lvol01 changed.
[root@server2 ~]# vgchange -an vg01 --deltag pacemaker
  Volume group "vg01" successfully changed
  0 logical volume(s) in volume group "vg01" now active
[root@server2 ~]# pcs resource create vg01 LVM volgrpname=vg01 exclusive=true --group oracle
[root@server2 ~]# pcs status
Cluster name: cluster01
Stack: corosync
Current DC: server3.domain.com (version 1.1.15-11.el7-e174ec8) - partition with quorum
Last updated: Thu Apr 20 17:16:37 2017          Last change: Thu Apr 20 17:16:34 2017 by root via cibadmin on server2.domain.com

2 nodes and 2 resources configured

Online: [ server2.domain.com server3.domain.com ]

Full list of resources:

 Resource Group: oracle
     virtualip  (ocf::heartbeat:IPaddr2):       Started server3.domain.com
     vg01       (ocf::heartbeat:LVM):   Started server3.domain.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
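The LVM part of the sequence above can be summarized as a small command generator. This is a sketch under assumptions: `lvm_create_commands` is a hypothetical helper, it only reuses the options shown in the transcript above, and it prints the commands rather than running them.

```python
import shlex


def lvm_create_commands(vg, device, lv, size, tag="pacemaker"):
    """Build the tagged create, then deactivate/untag, commands used above."""
    # Temporary override of volume_list so the tagged volumes can be activated.
    config = f'activation {{ volume_list = [ "@{tag}" ] }}'
    return [
        ["vgcreate", vg, device, "--addtag", tag, "--config", config],
        ["lvcreate", "--addtag", tag, "-L", size, "-n", lv, vg, "--config", config],
        # Deactivate and remove the tag so that only Pacemaker activates the volume group.
        ["lvchange", "-an", f"{vg}/{lv}", "--deltag", tag],
        ["vgchange", "-an", vg, "--deltag", tag],
    ]


commands = lvm_create_commands("vg01", "/dev/sdb", "lvol01", "15g")
for argv in commands:
    print(shlex.join(argv))
```

Once the tag is removed, the volume group is handed over to the cluster with the `pcs resource create vg01 LVM ...` command shown above.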

Resources constraints

Display resource constraints with:

[root@server2 ~]# pcs constraint show --full
Location Constraints:
  Resource: oracle
    Enabled on: server3.domain.com (score:INFINITY) (role: Started) (id:cli-prefer-oracle)
  Resource: virtualip
    Enabled on: server3.domain.com (score:INFINITY) (role: Started) (id:cli-prefer-virtualip)
Ordering Constraints:
Colocation Constraints:
Ticket Constraints:

If you want to remove a location constraint (currently set to server3.domain.com):

[root@server2 ~]# pcs constraint location remove cli-prefer-oracle
[root@server2 ~]# pcs constraint show --full
Location Constraints:
  Resource: virtualip
    Enabled on: server3.domain.com (score:INFINITY) (role: Started) (id:cli-prefer-virtualip)
Ordering Constraints:
Colocation Constraints:
Ticket Constraints:
[root@server2 ~]# pcs constraint location remove cli-prefer-virtualip
[root@server2 ~]# pcs constraint show --full
Location Constraints:
Ordering Constraints:
Colocation Constraints:
Ticket Constraints:
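The constraint ids needed for the removal can be extracted from the `pcs constraint show --full` output. This is a minimal sketch, assuming the output layout shown above; `location_constraint_ids` is a hypothetical helper and the removal commands are only printed.

```python
import re

# Sample copied from the first pcs constraint show --full output above.
OUTPUT = """\
Location Constraints:
  Resource: oracle
    Enabled on: server3.domain.com (score:INFINITY) (role: Started) (id:cli-prefer-oracle)
  Resource: virtualip
    Enabled on: server3.domain.com (score:INFINITY) (role: Started) (id:cli-prefer-virtualip)
Ordering Constraints:
Colocation Constraints:
Ticket Constraints:
"""


def location_constraint_ids(text):
    """Return the ids listed under the 'Location Constraints:' section."""
    ids = []
    in_location = False
    for line in text.splitlines():
        # Section headers are not indented and end with 'Constraints:'.
        if not line.startswith(" ") and line.endswith("Constraints:"):
            in_location = line.startswith("Location")
            continue
        if in_location:
            ids.extend(re.findall(r"\(id:([^)]+)\)", line))
    return ids


ids = location_constraint_ids(OUTPUT)
for constraint_id in ids:
    print("pcs constraint location remove", constraint_id)
```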

If, for example, you want to colocate two resources without creating a group, use something like:

[root@server2 ~]# pcs constraint colocation set virtualip vg01
[root@server2 ~]# pcs constraint show
Location Constraints:
Ordering Constraints:
Colocation Constraints:
  Resource Sets:
    set virtualip vg01 setoptions score=INFINITY
Ticket Constraints:
[root@server2 ~]# pcs constraint colocation show --full
Colocation Constraints:
  Resource Sets:
    set virtualip vg01 (id:pcs_rsc_set_virtualip_vg01) setoptions score=INFINITY (id:pcs_rsc_colocation_set_virtualip_vg01)
[root@server2 ~]# pcs constraint remove pcs_rsc_colocation_set_virtualip_vg01

Grub configuration to disable consistent network device naming in OEL 7 https://blog.yannickjaquier.com/linux/disable-consistent-network-device-naming.html Sat, 12 May 2018 08:55:21 +0000

The post Grub configuration to disable consistent network device naming in OEL 7 appeared first on IT World.

Preamble

Starting with Red Hat Enterprise Linux 7, and therefore Oracle Linux 7 (and probably many other Linux distributions, CentOS 7 for sure), the network interface names have moved to something a little bit different from the traditional eth[0,1,2,..]:

[root@server3 ~]# ip addr
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:47:54:07 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.102/24 brd 192.168.56.255 scope global enp0s3
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe47:5407/64 scope link
       valid_lft forever preferred_lft forever
3: enp0s8:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:fc:21:55 brd ff:ff:ff:ff:ff:ff
    inet 10.70.101.94/24 brd 10.70.101.255 scope global dynamic enp0s8
       valid_lft 3572sec preferred_lft 3572sec
    inet6 fe80::a00:27ff:fefc:2155/64 scope link
       valid_lft forever preferred_lft forever

The reason for this is explained in the official Red Hat documentation:

In Red Hat Enterprise Linux 7, udev supports a number of different naming schemes. The default is to assign fixed names based on firmware, topology, and location information. This has the advantage that the names are fully automatic, fully predictable, that they stay fixed even if hardware is added or removed (no re-enumeration takes place), and that broken hardware can be replaced seamlessly. The disadvantage is that they are sometimes harder to read than the eth0 or wlan0 names traditionally used. For example: enp5s0.

How do you come back to the legacy situation? You might want to do this not only because bad habits die hard but simply because you are configuring a cluster of servers (RAC, NoSQL, …) and want to be sure that the interconnect interface is called eth0 on all your nodes…

Grub configuration

This blog post has been written with a virtual machine running Oracle Linux Server release 7.3 and having two network interfaces: one for interconnect and one for internet access.

grub01

Edit the /etc/default/grub file and add the following at the end of the GRUB_CMDLINE_LINUX variable value:

net.ifnames=0 biosdevname=0

Examples:

  • GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=vg00/lvol00 rd.lvm.lv=vg00/lvol01 rhgb quiet numa=off transparent_hugepage=never net.ifnames=0 biosdevname=0"
  • GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=vg00/lvol00 rd.lvm.lv=vg00/lvol01 rhgb quiet net.ifnames=0 biosdevname=0"
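If you have many servers to change, the edit can be automated. The sketch below works on the text of /etc/default/grub and appends the two parameters only when they are missing; `add_kernel_args` is a hypothetical helper and it assumes the value is enclosed in plain double quotes on a single line.

```python
import re

EXTRA_ARGS = ("net.ifnames=0", "biosdevname=0")


def add_kernel_args(grub_text, extra=EXTRA_ARGS):
    """Append the missing arguments to GRUB_CMDLINE_LINUX and return the new text."""
    def patch(match):
        args = match.group(2).split()
        for arg in extra:
            if arg not in args:
                args.append(arg)
        return match.group(1) + '"' + " ".join(args) + '"'

    # The '=' right after LINUX prevents matching GRUB_CMDLINE_LINUX_DEFAULT.
    return re.sub(r'^(GRUB_CMDLINE_LINUX=)"([^"]*)"', patch, grub_text,
                  flags=re.MULTILINE)


sample = 'GRUB_TIMEOUT=5\nGRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet"\n'
patched = add_kernel_args(sample)
print(patched)
```

Running the function twice gives the same result, so it is safe to apply again. Do not forget to rebuild the Grub configuration afterwards.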

Rebuild Grub configuration:

[root@server3 ~]# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.1.12-61.1.25.el7uek.x86_64
Found initrd image: /boot/initramfs-4.1.12-61.1.25.el7uek.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-514.6.1.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-514.6.1.el7.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-7e6fb04dc02343d0a54dccc3940ad366
Found initrd image: /boot/initramfs-0-rescue-7e6fb04dc02343d0a54dccc3940ad366.img
done

Copy the network interface configuration files to their new names:

[root@server3 grub2]# cd /etc/sysconfig/network-scripts/
[root@server3 network-scripts]# cp ifcfg-enp0s3 ifcfg-eth0
[root@server3 network-scripts]# cp ifcfg-enp0s8 ifcfg-eth1

Change values of NAME and DEVICE in both files:

[root@server3 network-scripts]# cat ifcfg-eth0
HWADDR=08:00:27:DC:FB:92
TYPE=Ethernet
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV4_DNS_PRIORITY=100
IPV6INIT=yes
IPV6_AUTOCONF=no
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
IPV6_DNS_PRIORITY=100
NAME=eth0
UUID=eefd48d5-7810-4848-a1ce-9040938fb455
DEVICE=eth0
ONBOOT=yes
IPADDR=192.168.56.103
PREFIX=24
[root@server3 network-scripts]# cat ifcfg-eth1
TYPE=Ethernet
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=no
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
NAME=eth1
UUID=6b145311-798e-4927-8876-18d02570f386
DEVICE=eth1
ONBOOT=yes
PEERDNS=yes
PEERROUTES=yes
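The change of NAME and DEVICE can be scripted as well. This is a minimal sketch working on the file content as a string; `rename_interface` is a hypothetical helper and the sample content is a shortened version of the files above.

```python
def rename_interface(ifcfg_text, new_name):
    """Return the ifcfg content with NAME and DEVICE set to new_name."""
    lines = []
    for line in ifcfg_text.splitlines():
        key = line.split("=", 1)[0]
        if key in ("NAME", "DEVICE"):
            line = f"{key}={new_name}"
        lines.append(line)
    return "\n".join(lines) + "\n"


sample = "TYPE=Ethernet\nBOOTPROTO=none\nNAME=enp0s3\nDEVICE=enp0s3\nONBOOT=yes\n"
print(rename_interface(sample, "eth0"))
```

All other keys (UUID, IPADDR, ...) are left untouched.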

Disable NetworkManager:

[root@server3 ~]# systemctl disable NetworkManager
Removed symlink /etc/systemd/system/multi-user.target.wants/NetworkManager.service.
Removed symlink /etc/systemd/system/dbus-org.freedesktop.NetworkManager.service.
Removed symlink /etc/systemd/system/dbus-org.freedesktop.nm-dispatcher.service.

Reboot server:

[root@server3 ~]# reboot

You should see something like:

[root@server3 ~]# ip addr
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:47:54:07 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.102/24 brd 192.168.56.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe47:5407/64 scope link
       valid_lft forever preferred_lft forever
3: eth1:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:fc:21:55 brd ff:ff:ff:ff:ff:ff
    inet 10.70.101.94/24 brd 10.70.101.255 scope global dynamic eth1
       valid_lft 604794sec preferred_lft 604794sec
    inet6 fe80::a00:27ff:fefc:2155/64 scope link
       valid_lft forever preferred_lft forever

As we have modified the default grub configuration, the change survives a kernel upgrade! Welcome back to legacy network naming!

The drawback is that the cool NetworkManager tools no longer work:

[root@server3 ~]# nmcli
Error: NetworkManager is not running.
[root@server3 ~]# nmtui
NetworkManager is not running.

Restricting and securing your database network – part 2 https://blog.yannickjaquier.com/oracle/restricting-securing-database-network-part-2.html Fri, 13 Apr 2018 07:50:33 +0000

The post Restricting and securing your database network – part 2 appeared first on IT World.


Table of contents

Preamble

Encrypting network traffic and controlling who can access what is not yet a standard implemented everywhere, but it is becoming a more and more common request from security teams and audit companies. Starting with Oracle 12cR1 the Oracle Advanced Security Enterprise Edition option is no longer required to encrypt and authenticate network traffic:

Network encryption (native network encryption and SSL/TLS) and strong authentication services (Kerberos, PKI, and RADIUS) are no longer part of Oracle Advanced Security and are available in all licensed editions of all supported releases of the Oracle database.

Below, my database server is server1.domain.com (192.168.56.101) running Oracle Database Enterprise Edition 12cR1 (12.1.0.2). server2.domain.com (192.168.56.102) will be my Oracle client (12cR1, 12.1.0.2) and server3.domain.com (192.168.56.103) will be the node I use to monitor network traffic. All virtual machines are running Oracle Linux Server release 7.3 64 bits.

You can refer to part 1, which introduced how to isolate your database servers from unwanted connections.

Network encryption

The first thing I asked myself is how to see that information travels in clear text between my client and my server. I started with the world-famous Wireshark, but as I wanted to monitor traffic between virtual machines I was not able to configure it as I wanted. So I finally decided to use tcpdump, which is available in the standard Oracle Linux repository. Installation is as simple as:

[root@server3 ~]# yum install tcpdump

To be able to monitor traffic between my database server and my client I had to change the promiscuous mode of the Host-only Adapter network card:

secure_database_network15

In my database I create the test table below, containing confidential sales figures:

SQL> drop table sales;

Table dropped.

SQL> create table sales
  2  (region varchar2(20),
  3  val number)
  4  tablespace users;

Table created.

SQL> insert into sales values('Europe',15587496);

1 row created.

SQL> insert into sales values('Asia/Pacific',25587425);

1 row created.

SQL> insert into sales values('US',12584789);

1 row created.

SQL> commit;

Commit complete.

SQL> select * from sales;

REGION                      VAL
-------------------- ----------
Europe                 15587496
Asia/Pacific           25587425
US                     12584789

If you use tcpdump to monitor traffic you will see something like the output below (I have also edited /etc/services to comment out port 1531). I monitor traffic between my server (server1.domain.com) and my client (server2.domain.com) using a third node called server3.domain.com:

[root@server3 ~]# tcpdump -A host server2.domain.com and server1.domain.com and port 1531
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp0s3, link-type EN10MB (Ethernet), capture size 65535 bytes
17:35:41.181023 IP server2.32988 > server1.1531: Flags [P.], seq 2387244669:2387244970, ack 3577557774, win 460, options [nop,nop,TS val 27739139 ecr 27735720], length 301
E..ac.@.@.....8f..8e.....Jv}.=3.....@......
................................................................................................W.......................................................................................select * from sales....................................................
17:35:41.181749 IP server1.1531 > server2.32988: Flags [P.], seq 1:460, ack 301, win 427, options [nop,nop,TS val 27752932 ecr 27739139], length 459
E...*L@.@.....8e..8f.....=3..Jw.....Y......
..y...D.....................n?.S.K....uxu....%*.......Q.............................i..................REGION..............................................................VAL...................xu...$).......................B............=...................................,...Europe...;Ka....aB..................................................... ...............................6................3......................................................................
17:35:41.182025 IP server2.32988 > server1.1531: Flags [.], ack 460, win 483, options [nop,nop,TS val 27739140 ecr 27752932], length 0
E..4c.@.@.....8f..8e.....Jw..=4.....*......
..D...y.
17:35:41.182346 IP server2.32988 > server1.1531: Flags [P.], seq 301:322, ack 460, win 483, options [nop,nop,TS val 27739141 ecr 27752932], length 21
E..Ic.@.@.....8f..8e.....Jw..=4......\.....
..D...y......................
17:35:41.182798 IP server1.1531 > server2.32988: Flags [P.], seq 460:728, ack 322, win 427, options [nop,nop,TS val 27752933 ecr 27739141], length 268
E..@*M@.@..O..8e..8f.....=4..Jw......M.....
;0Z............{........... ...............................6................3..........................................................{............ORA-01403: no data found

17:35:41.222776 IP server2.32988 > server1.1531: Flags [.], ack 728, win 505, options [nop,nop,TS val 27739181 ecr 27752933], length 0
E..4c.@.@.....8f..8e.....Jw..=5.....)C.....
..D-..y.

You might also install the world-famous Wireshark tool to see this graphically. On my Linux virtual machine installation is as simple as:

[root@server3 ~]# yum install wireshark.x86_64
[root@server3 ~]# yum install wireshark-gnome.x86_64

If you hit a display issue, install the required fonts (I wonder why they are not installed by default):

[root@server3 ~]# yum install dejavu-sans-fonts.noarch dejavu-serif-fonts.noarch

Using the same capture filter as with tcpdump:

secure_database_network16

As you can see, for a dummy hacker like myself the decoding is not so obvious and, so far, I have not investigated in depth how to decode the number from the ASCII characters. Rather than waste too much time on this, and to keep my focus on the encryption feature, I have changed the query to:

SQL> select region, to_char(val) as val from sales;

REGION               VAL
-------------------- ----------------------------------------
Europe               15587496
Asia/Pacific         25587425
US                   12584789

Running the same tcpdump I now get:

[root@server3 ~]# tcpdump -A host server2.domain.com and server1.domain.com and port 1531
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp0s3, link-type EN10MB (Ethernet), capture size 65535 bytes
17:36:08.466986 IP server2.32988 > server1.1531: Flags [P.], seq 2387244991:2387245341, ack 3577558501, win 505, options [nop,nop,TS val 27766425 ecr 27752933], length 350
E...c.@.@..e..8f..8e.....Jw..=5......T.....
................................................................................................W......................................................................................-select region, to_char(val) as val from sales....................................................
17:36:08.467847 IP server1.1531 > server2.32988: Flags [P.], seq 1:459, ack 350, win 447, options [nop,nop,TS val 27780218 ecr 27766425], length 458
E...*N@.@.....8e..8f.....=5..Jy............
.'<.......Q.............................i..................REGION...................(.......................i...(..............VAL...................xu...%     ......................"*............=............................X......Europe.15587496....aB.....................................................................................6................3......................................................................
17:36:08.467960 IP server2.32988 > server1.1531: Flags [.], ack 459, win 528, options [nop,nop,TS val 27766426 ecr 27780218], length 0
E..4c.@.@.....8f..8e.....Jy..=7.....Q......
.......z
17:36:08.468222 IP server2.32988 > server1.1531: Flags [P.], seq 350:371, ack 459, win 528, options [nop,nop,TS val 27766426 ecr 27780218], length 21
E..Ic.@.@.....8f..8e.....Jy..=7............
.......z.....................
17:36:08.468622 IP server1.1531 > server2.32988: Flags [P.], seq 459:729, ack 371, win 447, options [nop,nop,TS val 27780219 ecr 27766426], length 270
E..B*O@.@..K..8e..8f.....=7..Jy2...........
...{..................................................................Asia/Pacific.25587425......US.12584789............{........... ...............................6................3..........................................................{............ORA-01403: no data found

17:36:08.508216 IP server2.32988 > server1.1531: Flags [.], ack 729, win 550, options [nop,nop,TS val 27766467 ecr 27780219], length 0
E..4c.@.@.....8f..8e.....Jy2.=8....&O......
.......{

Now the display is much clearer, so clear that you can read the figures in black and white in the TCP/IP frames. Needless to say, someone sniffing your network can grab highly confidential information…
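To illustrate how little effort this requires, the sketch below extracts the printable ASCII runs from a payload, much like the Unix `strings` command does. The payload bytes are made up for the example (only the readable values come from the capture above) and `printable_runs` is a hypothetical helper.

```python
import re


def printable_runs(payload, min_len=4):
    """Return the runs of at least min_len printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [match.group().decode("ascii") for match in pattern.finditer(payload)]


# Made-up frame content: binary noise around the values seen in the capture.
payload = b"\x00\x07Europe\x0815587496\x00\x00aB\x01\x0cAsia/Pacific\x0825587425\x02"
print(printable_runs(payload))
```

Anything an attacker can capture in clear text can be filtered this way, which is exactly what encryption prevents.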

You can define the desired encryption at client and server level with below parameters (default value is ACCEPTED):

  • SQLNET.ENCRYPTION_CLIENT = accepted | rejected | requested | required
  • SQLNET.ENCRYPTION_SERVER = accepted | rejected | requested | required

You define the encryption algorithm using below parameters at client and server level (by default all algorithms are selected):

  • SQLNET.ENCRYPTION_TYPES_SERVER = (3des112, 3des168, aes128, aes192, aes256, des, des40, rc4_40, rc4_56, rc4_128, rc4_256)
  • SQLNET.ENCRYPTION_TYPES_CLIENT = (3des112, 3des168, aes128, aes192, aes256, des, des40, rc4_40, rc4_56, rc4_128, rc4_256)

Client and server must have a common encryption algorithm to be able to initiate the connection, or the error message below is thrown:

ERROR:
ORA-12650: No common encryption or data integrity algorithm
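The algorithm selection can be pictured with the sketch below. It assumes that the server keeps the first algorithm of its own list that the client also supports, which is how I understand the Oracle documentation; `pick_algorithm` is a hypothetical function, not an Oracle API.

```python
def pick_algorithm(server_types, client_types):
    """Return the first server algorithm also offered by the client."""
    client = {name.lower() for name in client_types}
    for name in server_types:
        if name.lower() in client:
            return name
    # Same message as the error shown above.
    raise ConnectionError("ORA-12650: No common encryption or data integrity algorithm")


print(pick_algorithm(["AES256", "AES128"], ["3des168", "aes128", "aes256"]))
try:
    pick_algorithm(["AES256"], ["rc4_128"])
except ConnectionError as error:
    print(error)
```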

This means that in the case of multiple clients connecting to a single server, you can configure only the server and all clients will benefit from encryption by default. The table below, extracted from the official Oracle documentation, explains how the negotiation is performed:

Client Setting   Server Setting   Encryption and Data Negotiation
REJECTED         REJECTED         OFF
ACCEPTED         REJECTED         OFF
REQUESTED        REJECTED         OFF
REQUIRED         REJECTED         Connection fails
REJECTED         ACCEPTED         OFF
ACCEPTED         ACCEPTED         OFF (This value defaults to OFF. Cryptography and data integrity are not enabled until the user changes this parameter by using Oracle Net Manager or by modifying the sqlnet.ora file.)
REQUESTED        ACCEPTED         ON
REQUIRED         ACCEPTED         ON
REJECTED         REQUESTED        OFF
ACCEPTED         REQUESTED        ON
REQUESTED        REQUESTED        ON
REQUIRED         REQUESTED        ON
REJECTED         REQUIRED         Connection fails
ACCEPTED         REQUIRED         ON
REQUESTED        REQUIRED         ON
REQUIRED         REQUIRED         ON
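The sixteen rows of the table boil down to three rules, sketched below. `negotiate` is a hypothetical function written only to reproduce the table, it is not how Oracle Net is implemented.

```python
LEVELS = ("REJECTED", "ACCEPTED", "REQUESTED", "REQUIRED")


def negotiate(client, server):
    """Return "ON" or "OFF" for a client/server pair, or raise if the connection fails."""
    client, server = client.upper(), server.upper()
    for level in (client, server):
        if level not in LEVELS:
            raise ValueError(f"unknown setting: {level}")
    # Rule 1: one side refuses. It fails only if the other side insists.
    if "REJECTED" in (client, server):
        if "REQUIRED" in (client, server):
            raise ConnectionError(
                "ORA-12660: Encryption or crypto-checksumming parameters incompatible")
        return "OFF"
    # Rule 2: nobody asks for it.
    if client == server == "ACCEPTED":
        return "OFF"
    # Rule 3: at least one side asks for it and the other side does not refuse.
    return "ON"


for client in LEVELS:
    for server in LEVELS:
        try:
            outcome = negotiate(client, server)
        except ConnectionError:
            outcome = "Connection fails"
        print(f"{client:<10} {server:<10} {outcome}")
```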

So in the sqlnet.ora file of your database server you only have to set the parameters below. I have chosen the REQUIRED value but REQUESTED would also do the job. With REQUIRED, clients that have explicitly set REJECTED will not be able to connect:

SQLNET.ENCRYPTION_SERVER = required

SQLNET.ENCRYPTION_TYPES_SERVER= (AES256)

You can also use netmgr but the tool will force you to set the deprecated SQLNET.CRYPTO_SEED parameter:

secure_database_network17

SQLNET.CRYPTO_SEED
This parameter was used to seed a random number generator for Oracle Advanced Security. Starting with Oracle Database 10g, Oracle Advanced Security uses a random number generator that does not require a user-supplied seed value. Last Supported Release 9.2 (9iR2).

If you set SQLNET.ENCRYPTION_CLIENT = REJECTED in the client sqlnet.ora then your client will not be able to connect, and the error message below is thrown:

ERROR:
ORA-12660: Encryption or crypto-checksumming parameters incompatible

If you sniff the network again between your client and your newly configured encrypted database server you can see that traffic is much less readable:

[root@server3 ~]# tcpdump -A host server2.domain.com and server1.domain.com and port 1531
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp0s3, link-type EN10MB (Ethernet), capture size 65535 bytes
18:01:19.041724 IP server2.32991 > server1.1531: Flags [P.], seq 1900443499:1900443831, ack 372174552, win 488, options [nop,nop,TS val 29277000 ecr 29283750], length 332
E.....@.@.G...8f..8e....qFwk........r1.....
...H.......L. ..........:.D......S..)`0/~..cc...........;.>.x,...,..6.!.&.u.d...X...b....n.()y..F.k..%o..Yc..........0.H        ReA~...^.+......p.W
E....Z...N..
18:01:19.042807 IP server1.1531 > server2.32991: Flags [P.], seq 1:461, ack 332, win 436, options [nop,nop,TS val 29290793 ecr 29277000], length 460
E....l@.@.?p..8e..8f........qFx.....h......
8.K..2.Q..+..|..v3...g.!{G]...i....4.cx....f..$..09k.=.........cWagr.S..'.....'y.@.;.....J..{..[.~.1.D....<.],...N.....C...qC.O.......I,.!...Y.ds..]...>.....Z.{HE..+.5hy.Mr.>........L...?k.W..-9.O.N.A....9S.L.C..c...]..Tn...   ....6..K...CA......%. .:F&..8&...f:o"Uc9.s...Z.... .Kc...].     .p.N...VLh..3V.wDaP\....G}3..V.c...>....)..n..O./.E...He]..pM5J.........$..S...J...@C....2.\.W...A/+..Z.z]+....
18:01:19.043039 IP server2.32991 > server1.1531: Flags [.], ack 461, win 510, options [nop,nop,TS val 29277001 ecr 29290793], length 0
E..4..@.@.I...8f..8e....qFx.........[......
...I...)
18:01:19.043361 IP server2.32991 > server1.1531: Flags [P.], seq 332:360, ack 461, win 510, options [nop,nop,TS val 29277002 ecr 29290793], length 28
E..P..@.@.H...8f..8e....qFx................
...J...)..... ....h...-..4t.[..b#...
18:01:19.043849 IP server1.1531 > server2.32991: Flags [P.], seq 461:745, ack 360, win 436, options [nop,nop,TS val 29290794 ecr 29277002], length 284
E..P.m@.@.@...8e..8f........qFx.....$......
...*...J..... .....d..0NM.2......3...$..e)ss"..,"B_t.9.I:.&..,.....v.m.Q............V.g.j..K0....@.^.&&L....:,Lz.,.......F..NF..vp.7....?=$D...@.       .m......Z.su9K.v......".........p..........y]^O.[}E.../...\v..X...-...V/.........q.(......'....c..9<...8.X-....f=/......H............('.P....4..
18:01:19.083335 IP server2.32991 > server1.1531: Flags [.], ack 745, win 533, options [nop,nop,TS val 29277042 ecr 29290794], length 0
E..4..@.@.I...8f..8e....qFx.........Y......
...r...*

Network integrity

Network integrity aims at ensuring that data has not been modified between the client and the server (data modification attack). It also aims at ensuring that the same transaction cannot be sent multiple times to compromise your system (data replay attack). Testing it is unfortunately not as easy as with network encryption…
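The two attacks can be illustrated with a keyed checksum and a sequence number. This is a generic sketch built on the Python standard library, not the Oracle Net protocol: the shared key, the `sign` function and the `Receiver` class are all assumptions made for the example.

```python
import hashlib
import hmac


def sign(key, seq, payload):
    """Keyed checksum over the sequence number and the payload."""
    return hmac.new(key, seq.to_bytes(8, "big") + payload, hashlib.sha256).digest()


class Receiver:
    """Accepts each correctly signed packet exactly once, in order."""

    def __init__(self, key):
        self.key = key
        self.expected_seq = 0

    def accept(self, seq, payload, digest):
        if not hmac.compare_digest(sign(self.key, seq, payload), digest):
            return "modified"   # data modification attack detected
        if seq != self.expected_seq:
            return "replayed"   # valid checksum but unexpected sequence number
        self.expected_seq += 1
        return "ok"


key = b"shared-secret"
receiver = Receiver(key)
payload = b"insert into sales values('US',12584789)"
digest = sign(key, 0, payload)

print(receiver.accept(0, payload, digest))
print(receiver.accept(0, payload, digest))
print(receiver.accept(1, payload.replace(b"12584789", b"1"), sign(key, 1, payload)))
```

The first packet is accepted, the identical second one is flagged as a replay, and the third one is flagged as modified because its checksum no longer matches its content.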

To activate it, set the parameters below at client or server level. The possible values at client and server level work the same way as with network encryption (default value is ACCEPTED):

  • SQLNET.CRYPTO_CHECKSUM_SERVER = accepted | rejected | requested | required
  • SQLNET.CRYPTO_CHECKSUM_CLIENT = accepted | rejected | requested | required
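
As far as I understand the Oracle Net documentation, the four values combine as in the small function below. It is a sketch to reason about a setup, not something Oracle ships: checksum_outcome is a name I made up, and it assumes client and server share at least one checksum algorithm.

```shell
# Outcome of the negotiation between the client and server settings.
# Prints "on", "off" or "fail"; returns 2 for a value that is not one of the four.
checksum_outcome() {
  for side in "$1" "$2"; do
    case "$side" in
      accepted|rejected|requested|required) ;;
      *) echo "unknown value: $side" >&2; return 2 ;;
    esac
  done
  case "$1:$2" in
    rejected:required|required:rejected) echo "fail" ;;  # one side refuses, the other insists
    rejected:*|*:rejected)               echo "off"  ;;  # one side refuses, the other tolerates it
    accepted:accepted)                   echo "off"  ;;  # nobody asks for it
    *)                                   echo "on"   ;;  # at least one side asks, nobody refuses
  esac
}

checksum_outcome accepted required   # client left at default, server set to required
checksum_outcome accepted accepted   # both sides left at default
checksum_outcome rejected required   # incompatible settings
```

The three calls print on, off and fail. The fail case is the one where the connection itself is refused, so required on the server is the setting to use if you never want an unprotected session.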

You then choose the checksum algorithm. Client and server must have at least one in common for this to work (by default all algorithms are selected):

  • SQLNET.CRYPTO_CHECKSUM_TYPES_SERVER= (MD5, SHA1, SHA256, SHA384, SHA512)
  • SQLNET.CRYPTO_CHECKSUM_TYPES_CLIENT= (MD5, SHA1, SHA256, SHA384, SHA512)

As before, I have chosen to activate it only at database server level. By default everything is ready on the client side:

SQLNET.CRYPTO_CHECKSUM_TYPES_SERVER= (SHA1)

SQLNET.CRYPTO_CHECKSUM_SERVER = required

This can also be done with netmgr:

secure_database_network18

Below is the tcpdump result with the checksum activated. The payload is still readable, and we simply see that all TCP/IP frames are a bit longer than in the initial output, which makes me think that a checksum (network integrity) has been added:

[root@server3 ~]# tcpdump -A host server2.domain.com and server1.domain.com and port 1531
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp0s3, link-type EN10MB (Ethernet), capture size 65535 bytes
10:36:50.366041 IP server2.32995 > server1.1531: Flags [P.], seq 2461874204:2461874552, ack 3658538857, win 488, options [nop,nop,TS val 89008324 ecr 89012257], length 348
E...".@.@.$r..8f..8e......8....i...........
.................................................................................................L.....................................................................................-select region, to_char(val) as val from sales....................................................."...yW.T...e....{...
10:36:50.367147 IP server1.1531 > server2.32995: Flags [P.], seq 1:480, ack 348, win 438, options [nop,nop,TS val 89022117 ecr 89008324], length 479
E....<@.@.T...8e..8f.......i..9x....a^.....
........."*.........................................X......Europe.15587496.....C.....................................................................................6.........................................................................................F.#..]...yA.F..x@c.
10:36:50.367404 IP server2.32995 > server1.1531: Flags [.], ack 480, win 510, options [nop,nop,TS val 89008326 ecr 89022117], length 0
E..4".@.@.%...8f..8e......9x...H.....-.....
.N(..N^.
10:36:50.367733 IP server2.32995 > server1.1531: Flags [P.], seq 348:390, ack 480, win 510, options [nop,nop,TS val 89008326 ecr 89022117], length 42
E..^".@.@.%...8f..8e......9x...H...........
.N(..N^....*. ....................&..=.!t.)M.%.my.
10:36:50.368078 IP server1.1531 > server2.32995: Flags [P.], seq 480:771, ack 390, win 438, options [nop,nop,TS val 89022118 ecr 89008326], length 291
E..W.=@.@.UG..8e..8f.......H..9......;.....
.N^..N(....#. ........................................................Asia/Pacific.25587425......US.12584789............{........... ...............................6...........................................................................{............ORA-01403: no data found
J......'l...O..x!".+.
10:36:50.408468 IP server2.32995 > server1.1531: Flags [.], ack 771, win 533, options [nop,nop,TS val 89008367 ecr 89022118], length 0
E..4".@.@.%...8f..8e......9....k...........
.N(..N^.
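
Frame length is only indirect evidence. A more direct check could be to ask the database which network services are active for the current session, through the NETWORK_SERVICE_BANNER column of V$SESSION_CONNECT_INFO. The sketch below only prints the query; banner_sql is a name I made up, and it assumes the connected user is allowed to read that view.

```shell
# Hypothetical helper: print a query listing the network service banners of the current session.
banner_sql() {
  cat <<'SQL'
set pagesize 100 linesize 200
select network_service_banner
from   v$session_connect_info
where  sid = sys_context('userenv', 'sid');
SQL
}

# Save the query so it can be run with @banner.sql from an open SQL*Plus session.
banner_sql > banner.sql
```

When the checksum has been negotiated, the result should contain a line naming a crypto-checksumming service adapter for the chosen algorithm (SHA1 in my configuration).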

Network strong authentication

By default database users are authenticated with their passwords (whether through the network or by the operating system for OS-authenticated users). If you want to go a step further in security you might wish to implement what is called strong authentication, using third-party authentication services (Kerberos and RADIUS) or SSL with digital certificates.

As just written there are multiple methods to implement strong authentication:

  • Kerberos
  • Remote Authentication Dial-In User Service (RADIUS)
  • Secure Sockets Layer

The authentication method I plan to test is Secure Sockets Layer. It works with certificates stored in an Oracle wallet to authenticate clients and server. I will obviously not work with certificates signed by a recognized certificate authority (Comodo, Verizon,…) but with self-signed certificates, which obviously must not be used in production.

My Oracle Support (MOS) has plenty of documentation on how to do it, maybe even too much, as I had to mix multiple notes to reach a working environment.

With orapki

I wanted to use the graphical Oracle Wallet Manager (OWM) to do it, but after many unsuccessful tries I realized that this tool cannot be used with self-signed certificates. Or maybe my knowledge is too poor…

The first time you execute OWM:

secure_database_network19

You will get a popup asking to create the default directory. I have chosen to do it and to use this directory to store the wallet I will finally create with command line tools. This way it is more convenient to check graphically that all is going well:

secure_database_network20

The tool to use is orapki. Start by creating a wallet. The wallet is created with the auto_login option so that no password has to be supplied to use it (the password is asked only for modifications):

[oracle@server1 ~]$ cd /u01/app/oracle/product/12.1.0/dbhome_1/owm/wallets/oracle
[oracle@server1 oracle]$ ll
total 0
[oracle@server1 oracle]$ orapki wallet create -wallet . -auto_login -pwd secure_password
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

[oracle@server1 oracle]$ ll
total 8
-rw------- 1 oracle dba 120 Feb  9 12:04 cwallet.sso
-rw-rw-rw- 1 oracle dba   0 Feb  9 12:04 cwallet.sso.lck
-rw------- 1 oracle dba  75 Feb  9 12:04 ewallet.p12
-rw-rw-rw- 1 oracle dba   0 Feb  9 12:04 ewallet.p12.lck

Add a self-signed certificate to your wallet:

[oracle@server1 oracle]$ orapki wallet add -wallet . -dn 'CN=server1.domain.com,O=MyCompany,L=Geneva,C=CH' -keysize 2048 -self_signed -validity 365 -pwd secure_password
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

[oracle@server1 oracle]$ ll
total 8
-rw------- 1 oracle dba 4085 Feb  9 12:06 cwallet.sso
-rw-rw-rw- 1 oracle dba    0 Feb  9 12:04 cwallet.sso.lck
-rw------- 1 oracle dba 4040 Feb  9 12:06 ewallet.p12
-rw-rw-rw- 1 oracle dba    0 Feb  9 12:04 ewallet.p12.lck
[oracle@server1 oracle]$ orapki wallet display -wallet .
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
User Certificates:
Subject:        CN=server1.domain.com,O=MyCompany,L=Geneva,C=CH
Trusted Certificates:
Subject:        CN=server1.domain.com,O=MyCompany,L=Geneva,C=CH

Export the server certificate to import it in your clients’ wallet:

[oracle@server1 oracle]$ orapki wallet export -wallet . -dn 'CN=server1.domain.com,O=MyCompany,L=Geneva,C=CH' -cert server_ca.cert
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

[oracle@server1 oracle]$ ll
total 12
-rw------- 1 oracle dba 4085 Feb  9 12:06 cwallet.sso
-rw-rw-rw- 1 oracle dba    0 Feb  9 12:04 cwallet.sso.lck
-rw------- 1 oracle dba 4040 Feb  9 12:06 ewallet.p12
-rw-rw-rw- 1 oracle dba    0 Feb  9 12:04 ewallet.p12.lck
-rw------- 1 oracle dba 1123 Feb  9 15:13 server_ca.cert
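
With self-signed certificates, the only protection during the exchange is that you really import the file you exported. Before importing, a quick check could be to print the subject, end date and fingerprint with openssl on both machines and compare them. This is a sketch assuming the exported file is Base64 (PEM) encoded, which I believe is what orapki produces; cert_summary is a name I made up.

```shell
# Hypothetical helper: print who a certificate is for, when it expires
# and its SHA-256 fingerprint.
cert_summary() {
  openssl x509 -in "$1" -noout -subject -nameopt RFC2253 -enddate -fingerprint -sha256
}
```

Run cert_summary server_ca.cert on the server, run it again on the copy received by the client, and import only if the fingerprints are identical. The same applies to the client certificate in the other direction.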

If you check graphically what has been done you get:

secure_database_network21

Do almost the same thing on your client. Start by creating an auto login wallet:

[oracle@server2 ~]$ cd /u01/app/oracle/product/12.1.0/dbhome_1/owm/wallets/oracle
[oracle@server2 oracle]$ ll
total 0
[oracle@server2 oracle]$ orapki wallet create -wallet . -auto_login -pwd secure_password
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

[oracle@server2 oracle]$ ll
total 8
-rw------- 1 oracle dba 120 Feb  9 15:20 cwallet.sso
-rw-rw-rw- 1 oracle dba   0 Feb  9 15:20 cwallet.sso.lck
-rw------- 1 oracle dba  75 Feb  9 15:20 ewallet.p12
-rw-rw-rw- 1 oracle dba   0 Feb  9 15:20 ewallet.p12.lck

Add a self-signed certificate to your wallet (here I create a certificate for a user I plan to call yjaquier):

[oracle@server2 oracle]$ orapki wallet add -wallet . -dn 'CN=yjaquier,O=MyCompany,L=Geneva,C=CH' -keysize 2048 -self_signed -validity 365 -pwd secure_password
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Export your client certificate for server wallet import:

[oracle@server2 oracle]$ orapki wallet export -wallet . -dn 'CN=yjaquier,O=MyCompany,L=Geneva,C=CH' -cert client_ca.cert
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

[oracle@server2 oracle]$ ll
total 12
-rw------- 1 oracle dba 1095 Feb  9 15:21 client_ca.cert
-rw------- 1 oracle dba 4037 Feb  9 15:20 cwallet.sso
-rw-rw-rw- 1 oracle dba    0 Feb  9 15:20 cwallet.sso.lck
-rw------- 1 oracle dba 3992 Feb  9 15:20 ewallet.p12
-rw-rw-rw- 1 oracle dba    0 Feb  9 15:20 ewallet.p12.lck

Then transfer the server certificate to the client with ssh, and vice versa. On the client, import the server certificate as a trusted one:

[oracle@server2 oracle]$ orapki wallet add -wallet . -trusted_cert -cert server_ca.cert -pwd secure_password
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

On the server, import the client certificate as a trusted one:

[oracle@server1 oracle]$ orapki wallet add -wallet . -trusted_cert -cert client_ca.cert -pwd secure_password
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

At the end you should get on server:

secure_database_network22

Now comes the Oracle network configuration. In the listener.ora file I have added a listening endpoint on a secure TCP port (tcps) and the wallet location:

LISTENER_ORCL =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = server1.domain.com)(PORT = 1531))
      (ADDRESS = (PROTOCOL = TCPS)(HOST = server1.domain.com)(PORT = 1532))
    )
  )

WALLET_LOCATION =
  (SOURCE=
    (METHOD=file)
    (METHOD_DATA=
        (DIRECTORY=/u01/app/oracle/product/12.1.0/dbhome_1/owm/wallets/oracle)
     )
   )

In the sqlnet.ora of the server I also add the wallet location:

WALLET_LOCATION =
  (SOURCE=
    (METHOD=file)
    (METHOD_DATA=
        (DIRECTORY=/u01/app/oracle/product/12.1.0/dbhome_1/owm/wallets/oracle)
     )
   )

I had to stop and restart the listener to make it work, and you can check that it is activated with:

[oracle@server1 admin]$ lsnrctl status listener_orcl

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 09-MAR-2017 15:08:48

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=server1.domain.com)(PORT=1531)))
STATUS of the LISTENER
------------------------
Alias                     listener_orcl
Version                   TNSLSNR for Linux: Version 12.1.0.2.0 - Production
Start Date                09-MAR-2017 12:52:53
Uptime                    0 days 2 hr. 15 min. 55 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/server1/listener_orcl/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=server1)(PORT=1531)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=server1)(PORT=1532)))
Services Summary...
Service "orcl" has 1 instance(s).
  Instance "orcl", status READY, has 1 handler(s) for this service...
Service "pdb1" has 1 instance(s).
  Instance "orcl", status READY, has 1 handler(s) for this service...
The command completed successfully

A few parameters must be set at instance level:

SQL> show parameter authent

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
os_authent_prefix                    string      ops$
remote_os_authent                    boolean     FALSE
SQL> alter system set os_authent_prefix='' scope=spfile;

System altered.

SQL> show parameter authent

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
os_authent_prefix                    string      ops$
remote_os_authent                    boolean     FALSE
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 1073741824 bytes
Fixed Size                  2932632 bytes
Variable Size             796917864 bytes
Database Buffers          268435456 bytes
Redo Buffers                5455872 bytes
Database mounted.
Database opened.
SQL> show parameter authent

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
os_authent_prefix                    string
remote_os_authent                    boolean     FALSE

SQL> alter pluggable database pdb1 open;

Pluggable database altered.

I also create the test user (yjaquier, same as in the client certificate) for network strong authentication:

SQL> alter session set container=pdb1;

Session altered.

SQL> create user yjaquier identified externally as 'CN=yjaquier,O=MyCompany,L=Geneva,C=CH';

User created.

SQL> grant dba to yjaquier;

Grant succeeded.

In the sqlnet.ora of the client I also add the wallet location:

WALLET_LOCATION =
  (SOURCE=
    (METHOD=file)
    (METHOD_DATA=
        (DIRECTORY=/u01/app/oracle/product/12.1.0/dbhome_1/owm/wallets/oracle)
     )
   )

If you check with netmgr you get this on the server:

secure_database_network23

And on the client:

secure_database_network24

Remark
The documentation on MOS asks you to set plenty of other parameters, but for most of them the default value is more than sufficient. In fact I have been able to make the feature work without setting them.

Then you can test that your listener is answering on both secure and unsecure ports:

[oracle@server2 ~]$ tnsping "(ADDRESS=(PROTOCOL=tcp)(HOST=server1.domain.com)(PORT=1531))"

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 09-MAR-2017 15:32:25

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Attempting to contact (ADDRESS=(PROTOCOL=tcp)(HOST=server1.domain.com)(PORT=1531))
OK (0 msec)
[oracle@server2 ~]$ tnsping "(ADDRESS=(PROTOCOL=tcps)(HOST=server1.domain.com)(PORT=1532))"

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 09-MAR-2017 15:32:30

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Attempting to contact (ADDRESS=(PROTOCOL=tcps)(HOST=server1.domain.com)(PORT=1532))
OK (30 msec)

To ease connection I create the entries below in the tnsnames.ora of my clients (with my configuration it is still possible to connect on the classic TCP port, which you should remove in the end):

PDB1 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = server1.domain.com)(PORT = 1531))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = pdb1)
    )
  )

PDB1S =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCPS)(HOST = server1.domain.com)(PORT = 1532))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = pdb1)
    )
  )

Connect through the secure alias, without supplying any password:

[oracle@server2 admin]$ sqlplus /@pdb1s

SQL*Plus: Release 12.1.0.2.0 Production on Thu Mar 9 15:30:21 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics, Real Application Testing
and Unified Auditing options

SQL> show user
USER is "YJAQUIER"
SQL> select sys_context('userenv','network_protocol') from dual;

SYS_CONTEXT('USERENV','NETWORK_PROTOCOL')
--------------------------------------------------------------------------------
tcps
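
The network protocol tells you the session went through the secure port, not how the user was authenticated. To see that as well, the USERENV context also exposes AUTHENTICATION_METHOD and AUTHENTICATED_IDENTITY. The sketch below only prints the query; auth_sql is a name I made up.

```shell
# Hypothetical helper: print a query showing how the current session was authenticated.
auth_sql() {
  cat <<'SQL'
select sys_context('userenv', 'authentication_method')  as auth_method,
       sys_context('userenv', 'authenticated_identity') as auth_identity
from   dual;
SQL
}

# Save the query so it can be run with @auth.sql from an open SQL*Plus session.
auth_sql > auth.sql
```

For the wallet-based connection above I would expect the method to be SSL and the identity to be the certificate DN, while a session opened with a password should report PASSWORD.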

It is still possible to connect the standard way by supplying a password (obviously not on the command line!):

[oracle@server2 ~]$ sqlplus test/secure_password@pdb1s

SQL*Plus: Release 12.1.0.2.0 Production on Thu Mar 9 15:42:45 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics, Real Application Testing
and Unified Auditing options

SQL> select sys_context('userenv','network_protocol') from dual;

SYS_CONTEXT('USERENV','NETWORK_PROTOCOL')
--------------------------------------------------------------------------------
tcps

If you test this from a third client whose certificate has not been inserted in the server wallet, you should be unable to connect:

[oracle@server3 ~]$ tnsping "(ADDRESS=(PROTOCOL=tcp)(HOST=server1.domain.com)(PORT=1531))"

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 09-MAR-2017 15:32:44

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Attempting to contact (ADDRESS=(PROTOCOL=tcp)(HOST=server1.domain.com)(PORT=1531))
OK (50 msec)
[oracle@server3 ~]$ tnsping "(ADDRESS=(PROTOCOL=tcps)(HOST=server1.domain.com)(PORT=1532))"

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 09-MAR-2017 15:32:50

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Attempting to contact (ADDRESS=(PROTOCOL=tcps)(HOST=server1.domain.com)(PORT=1532))
TNS-12560: TNS:protocol adapter error
[oracle@server3 ~]$ sqlplus test/secure_password@pdb1s

SQL*Plus: Release 12.1.0.2.0 Production on Thu Mar 9 16:04:06 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

ERROR:
ORA-28759: failure to open file


Enter user-name:

The above error is the one you get if the wallet is not configured. If you configure it (without the server certificate) you get:

[oracle@server3 admin]$ sqlplus test/secure_password@pdb1s

SQL*Plus: Release 12.1.0.2.0 Production on Thu Mar 9 16:57:51 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

ERROR:
ORA-29024: Certificate validation failure


Enter user-name:

With openssl

Instead of using orapki I have tried to use the openssl command to generate certificates and to sign them, but it did not work as expected…

First I asked myself whether I should use DSA or RSA, and found the answer in the Arch Linux wiki:

OpenSSH 7.0 deprecated and disabled support for DSA keys due to discovered vulnerabilities, therefore the choice of cryptosystem lies within RSA or one of the two types of ECC.

I started by generating a 2048-bit RSA root key (I’m working in the ~oracle/openssl directory):

[oracle@server1 openssl]$ openssl genrsa -out root.key 2048
Generating RSA private key, 2048 bit long modulus
...............+++
......................................+++
e is 65537 (0x10001)
[oracle@server1 openssl]$ ll
total 4
-rw-r--r-- 1 oracle dba 1766 Feb  9 11:46 root.key

Self-sign the root certificate with:

[oracle@server1 openssl]$ openssl req -x509 -new -nodes -key root.key -sha256 -days 1024 -out self-root.pem
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [XX]:CH
State or Province Name (full name) []:
Locality Name (eg, city) [Default City]:Geneva
Organization Name (eg, company) [Default Company Ltd]:MyCompany
Organizational Unit Name (eg, section) []:
Common Name (eg, your name or your server's hostname) []:server1.domain.com
Email Address []:
[oracle@server1 openssl]$ ll
total 8
-rw-r--r-- 1 oracle dba 1675 Mar  8 15:33 root.key
-rw-r--r-- 1 oracle dba 1253 Mar  8 15:38 self-root.pem

The Distinguished Name (DN) of your self-signed root certificate is:

CN=server1.domain.com,O=MyCompany,L=Geneva,C=CH

Create the wallet (I have chosen the default OWM directory) with orapki (this can also be done with the graphical interface):

[oracle@server1 oracle]$ cd $ORACLE_HOME/owm/wallets/oracle
[oracle@server1 oracle]$ orapki wallet create -wallet . -pwd secure_password -auto_login
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

[oracle@server1 oracle]$ ll
total 8
-rw------- 1 oracle dba 120 Mar  8 15:42 cwallet.sso
-rw-rw-rw- 1 oracle dba   0 Mar  8 15:42 cwallet.sso.lck
-rw------- 1 oracle dba  75 Mar  8 15:42 ewallet.p12
-rw-rw-rw- 1 oracle dba   0 Mar  8 15:42 ewallet.p12.lck

Add the self-signed root certificate you have created before:

[oracle@server1 oracle]$ cd $ORACLE_HOME/owm/wallets/oracle
[oracle@server1 oracle]$ orapki wallet add -wallet . -trusted_cert -cert ~oracle/openssl/self-root.pem -pwd secure_password
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

[oracle@server1 oracle]$ ll
total 8
-rw------- 1 oracle dba 1189 Mar  8 15:44 cwallet.sso
-rw-rw-rw- 1 oracle dba    0 Mar  8 15:42 cwallet.sso.lck
-rw------- 1 oracle dba 1144 Mar  8 15:44 ewallet.p12
-rw-rw-rw- 1 oracle dba    0 Mar  8 15:42 ewallet.p12.lck
[oracle@server1 oracle]$ orapki wallet display -wallet .
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
User Certificates:
Trusted Certificates:
Subject:        CN=server1.domain.com,O=MyCompany,L=Geneva,C=CH

Generate the certificate signing request (CSR):

[oracle@server1 oracle]$ orapki wallet add -wallet . -dn "CN=server1.domain.com,O=MyCompany,L=Geneva,C=CH" -keysize 1024 -pwd secure_password
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

[oracle@server1 oracle]$ orapki wallet display -wallet .
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
Subject:        CN=server1.domain.com,O=MyCompany,L=Geneva,C=CH
User Certificates:
Trusted Certificates:
Subject:        CN=server1.domain.com,O=MyCompany,L=Geneva,C=CH

Export the CSR using:

[oracle@server1 oracle]$ orapki wallet export -wallet . -dn "CN=server1.domain.com,O=MyCompany,L=Geneva,C=CH" -request self-signed-oracle.csr -pwd secure_password
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

[oracle@server1 oracle]$ ll
total 12
-rw------- 1 oracle dba 2429 Mar  8 15:46 cwallet.sso
-rw-rw-rw- 1 oracle dba    0 Mar  8 15:42 cwallet.sso.lck
-rw------- 1 oracle dba 2384 Mar  8 15:46 ewallet.p12
-rw-rw-rw- 1 oracle dba    0 Mar  8 15:42 ewallet.p12.lck
-rw------- 1 oracle dba  621 Mar  8 15:47 self-signed-oracle.csr

Trying to sign the CSR with the server self-signed certificate did not work:

[oracle@server1 openssl]$ ll
total 12
-rw-r--r-- 1 oracle dba 1675 Mar  8 15:33 root.key
-rw-r--r-- 1 oracle dba 1253 Mar  8 15:38 self-root.pem
-rw------- 1 oracle dba  621 Mar  8 15:47 self-signed-oracle.csr
[oracle@server1 openssl]$ openssl x509 -req -in self-signed-oracle.csr -CA self-root.pem -CAkey root.key -CAcreateserial -out self-signed-oracle.crt -days 365 -sha256
Signature verification error
140156873422752:error:0D0C50A1:asn1 encoding routines:ASN1_item_verify:unknown message digest algorithm:a_verify.c:191:

My server is perfectly up to date (as of writing this blog post) and I have unfortunately not found how to overcome this error… If someone has an idea, feel free to comment…
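
The message complains about an unknown message digest, so a first diagnostic step could be to look at which digest was used to sign the CSR. This is only a sketch to narrow the problem down, not a confirmed fix, and csr_signature_algorithm is a name I made up.

```shell
# Hypothetical helper: print the signature algorithm recorded in a
# certificate signing request.
csr_signature_algorithm() {
  openssl req -in "$1" -noout -text | grep -m1 'Signature Algorithm'
}
```

Running csr_signature_algorithm self-signed-oracle.csr prints the digest and key type used by the wallet tool. If that digest is one the local openssl build refuses to verify, that would be consistent with the error above.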

With keytool

Keytool comes with Java and I have seen it used by Tim Hall. It would be by far my last choice, but I wanted to see if I would get the same issues as with openssl. Create a Java KeyStore (JKS) containing a self-signed certificate:

[oracle@server1 ~]$ cd $ORACLE_HOME/owm/wallets/oracle
[oracle@server1 oracle]$ keytool -genkeypair -keyalg RSA -keysize 2048 -alias selfsigned -keystore keystore.jks -dname 'CN=server1.domain.com,O=MyCompany,L=Geneva,C=CH' -storepass secure_password -validity 365 -keypass secure_password
[oracle@server1 oracle]$ ll
total 4
-rw-r--r-- 1 oracle dba 2190 Feb 10 12:55 keystore.jks

Create an empty wallet:

[oracle@server1 oracle]$ orapki wallet create -wallet . -pwd secure_password -auto_login
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

[oracle@server1 oracle]$ ll
total 12
-rw------- 1 oracle dba  120 Feb 10 12:56 cwallet.sso
-rw-rw-rw- 1 oracle dba    0 Feb 10 12:56 cwallet.sso.lck
-rw------- 1 oracle dba   75 Feb 10 12:56 ewallet.p12
-rw-rw-rw- 1 oracle dba    0 Feb 10 12:56 ewallet.p12.lck
-rw-r--r-- 1 oracle dba 2190 Feb 10 12:55 keystore.jks

Import the Java KeyStore (JKS) into the wallet:

[oracle@server1 oracle]$ orapki wallet jks_to_pkcs12 -wallet . -pwd secure_password -keystore keystore.jks -jkspwd secure_password
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

[oracle@server1 oracle]$ orapki wallet display -wallet .
Oracle PKI Tool : Version 12.1.0.2
Copyright (c) 2004, 2014, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
User Certificates:
Subject:        CN=server1.domain.com,O=MyCompany,L=Geneva,C=CH
Trusted Certificates:
Subject:        CN=server1.domain.com,O=MyCompany,L=Geneva,C=CH

All worked fine and I got the same result as with orapki, without encountering any issue!

Securing database network conclusion

Before 12cR1, even if you had a strong interest in network encryption, checksumming and strong authentication, the price of the Advanced Security option of Enterprise Edition might have been a blocker for their implementation.

This is no longer true, and nothing should prevent you from implementing at least network encryption and checksumming, as it requires very little effort. You can still implement it from your database server in an optimistic way, meaning without rejecting client connections that do not want to use it…

The post Restricting and securing your database network – part 2 appeared first on IT World.

]]>
https://blog.yannickjaquier.com/oracle/restricting-securing-database-network-part-2.html/feed 1
Restricting and securing your database network – part 1 https://blog.yannickjaquier.com/oracle/restricting-securing-database-network-part-1.html https://blog.yannickjaquier.com/oracle/restricting-securing-database-network-part-1.html#comments Fri, 16 Mar 2018 10:50:39 +0000 http://blog.yannickjaquier.com/?p=4002 Preamble In our never ending SOX (Sarbanes–Oxley Act) journey the need to restrict network access to our highly critical databases came onto the table. Thinking of it three different ideas came to my mind: Use default Linux firewall to create rule to allow or disallow access to hostname network directly. Restrict access to listener with […]

The post Restricting and securing your database network – part 1 appeared first on IT World.

]]>

Table of contents

Preamble

In our never-ending SOX (Sarbanes–Oxley Act) journey, the need to restrict network access to our highly critical databases came onto the table. Thinking about it, three different ideas came to my mind:

  • Use the default Linux firewall to create rules that allow or disallow network access to the host directly.
  • Restrict access to the listener with standard sqlnet.ora parameters, a feature called valid node checking.
  • Use Connection Manager (CMAN), which has better administrative options than the previous method. The other added values of CMAN are of little interest when restricting database network access.

This blog post is based on Oracle release 12cR1 (12.1.0.2.0). I have a database server called server1.domain.com (192.168.56.101), and Connection Manager has been installed in a different Oracle home than the database but on the same server. My two clients are server2.domain.com (192.168.56.102) and server3.domain.com (192.168.56.103). My three servers are virtual machines (VirtualBox) running Oracle Linux Server release 7.3.

In a second part we will see how to secure the network layer to ensure that no one has modified the content between your clients and your database server (man-in-the-middle (MITM) attack).

Linux firewall

Start, and optionally enable, the firewalld process on your database server with:

[root@server1 ~]# systemctl start firewalld
[root@server1 ~]# systemctl enable firewalld
Created symlink from /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service to /usr/lib/systemd/system/firewalld.service.
Created symlink from /etc/systemd/system/basic.target.wants/firewalld.service to /usr/lib/systemd/system/firewalld.service.
[root@server1 ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2017-01-24 16:39:49 CET; 1 day 1h ago
     Docs: man:firewalld(1)
 Main PID: 13157 (firewalld)
   CGroup: /system.slice/firewalld.service
           └─13157 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid

Jan 24 16:39:49 server1.domain.com systemd[1]: Starting firewalld - dynamic firewall daemon...
Jan 24 16:39:49 server1.domain.com systemd[1]: Started firewalld - dynamic firewall daemon.
[root@server1 ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: active (running) since Tue 2017-01-24 16:39:49 CET; 24h ago
     Docs: man:firewalld(1)
 Main PID: 13157 (firewalld)
   CGroup: /system.slice/firewalld.service
           └─13157 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid

Jan 24 16:39:49 server1.domain.com systemd[1]: Starting firewalld - dynamic firewall daemon...
Jan 24 16:39:49 server1.domain.com systemd[1]: Started firewalld - dynamic firewall daemon.

To ease administration I strongly suggest installing the Linux firewall graphical configuration tool (firewall-config), but you can still use the command line tool called firewall-cmd (I will supply the commands as much as possible):

[root@server1 ~]# yum install firewall-config

I solved the following font issue:

[root@server1 ~]# firewall-config

(firewall-config:10381): Pango-WARNING **: failed to choose a font, expect ugly output. engine-type='PangoRenderFc', script='common'

(firewall-config:10381): Pango-WARNING **: failed to choose a font, expect ugly output. engine-type='PangoRenderFc', script='latin'
GLib-GIO-Message: Using the 'memory' GSettings backend.  Your settings will not be saved or shared with other applications.

By issuing:

[root@server1 ~]# yum install dejavu-fonts-common dejavu-sans-fonts.noarch dejavu-serif-fonts.noarch

I finally got the firewall graphical interface working:

secure_database_network01

On my virtual machine I have two (three with loopback) network devices, one for internal non routable network and one for Internet access:

[root@server1 ~]# ip link
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp0s3:  mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 08:00:27:bc:4d:c3 brd ff:ff:ff:ff:ff:ff
3: enp0s8:  mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 08:00:27:20:2e:d6 brd ff:ff:ff:ff:ff:ff

By default they are all in public zone:

[root@server1 ~]# firewall-cmd --get-active-zones
public
  interfaces: enp0s3 enp0s8

Public zone is:

For use in public areas. You do not trust the other computers on the network to not harm your computer. Only selected incoming connections are accepted.

Once the firewall is activated your database is no longer accessible from outside:

[oracle@server2 ~]$ tnsping //server1.domain.com:1531/pdb1

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 25-JAN-2017 17:51:16

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/sqlnet.ora

Used HOSTNAME adapter to resolve the alias
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=pdb1))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.56.101)(PORT=1531)))
TNS-12543: TNS:destination host unreachable
[oracle@server3 ~]$ tnsping //server1.domain.com:1531/pdb1

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 25-JAN-2017 18:05:33

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:

Used HOSTNAME adapter to resolve the alias
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=pdb1))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.56.101)(PORT=1531)))
TNS-12543: TNS:destination host unreachable

The example we will set up gives access from server2.domain.com and keeps access from server3.domain.com forbidden. You can do it either with the graphical interface or from the command line:

[root@server1 ~]# firewall-cmd --zone=public --add-rich-rule='rule family="ipv4" source address="192.168.56.102" destination address="192.168.56.101" port port="1531" protocol="tcp" accept'
success
[root@server1 ~]# firewall-cmd --list-rich-rules
rule family="ipv4" source address="192.168.56.102" destination address="192.168.56.101" port port="1531" protocol="tcp" accept

Remark:
You need to issue the same command with the --permanent option to make the rule persistent across firewall restarts and server reboots.
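
As a rough sketch of the remark above, the rule text can be built once and applied to both the runtime and the permanent configuration. The helper names (build_rich_rule, run, apply_rich_rule) and the DRY_RUN variable are assumptions made for illustration; only firewall-cmd and its --zone, --permanent and --add-rich-rule options come from the tool itself:

```shell
#!/bin/sh
# Hypothetical helper: build the rich rule text for one client/server/port triple.
build_rich_rule() {
  printf 'rule family="ipv4" source address="%s" destination address="%s" port port="%s" protocol="tcp" accept' "$1" "$2" "$3"
}

# Hypothetical helper: print the command when DRY_RUN=1 (default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Add the rule to the runtime configuration, then to the permanent one.
apply_rich_rule() {
  rule=$(build_rich_rule "$1" "$2" "$3")
  run firewall-cmd --zone=public --add-rich-rule="$rule"
  run firewall-cmd --permanent --zone=public --add-rich-rule="$rule"
}

apply_rich_rule 192.168.56.102 192.168.56.101 1531
```

By default the two commands are only displayed; running the script as root with DRY_RUN=0 would execute them.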

From graphical interface:

secure_database_network02

Connection from server2.domain.com is now possible while connection from server3.domain.com is still forbidden:

[oracle@server2 ~]$ tnsping //server1.domain.com:1531/pdb1

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 30-JAN-2017 12:25:05

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/sqlnet.ora

Used HOSTNAME adapter to resolve the alias
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=pdb1))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.56.101)(PORT=1531)))
OK (0 msec)
[oracle@server2 ~]$ sqlplus yjaquier/'secure_password'@//server1.domain.com:1531/pdb1

SQL*Plus: Release 12.1.0.2.0 Production on Mon Jan 30 12:25:20 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Mon Jan 30 2017 12:24:54 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL>

This method is by far the simplest and requires very little effort. Its biggest drawback is that you are obliged to work with IP addresses. If your clients are servers with fixed IP addresses there is no issue, but if you need to allow end users' desktops or laptops, which most probably get their IP addresses from DHCP, it might become a nightmare…

Valid node checking

The listener.ora file I use is the one created at database installation, where I have just changed the listening port to avoid the default 1521:

LISTENER_ORCL =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = server1.domain.com)(PORT = 1531))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1531))
    )
  )

The valid node checking feature is based on three parameters that you set in the sqlnet.ora file:

  • TCP.VALIDNODE_CHECKING: To enable and disable valid node checking for incoming connections.
  • TCP.EXCLUDED_NODES: To specify which clients are denied access to the database.
  • TCP.INVITED_NODES: To specify which clients are allowed access to the database. This list takes precedence over the TCP.EXCLUDED_NODES parameter if both lists are present.

To reproduce a complete blackout from outside I have set:

tcp.validnode_checking=yes
tcp.excluded_nodes=(*)

Then reload the listener to activate the sqlnet.ora parameters:

[oracle@server1 ~]$ lsnrctl reload listener_orcl

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 30-JAN-2017 17:09:34

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=server1.domain.com)(PORT=1531)))
The command completed successfully

This is the first mistake to avoid: the listener now rejects even local SQL*Net connections, so it supports no services because the background process in charge of registration (PMON, or LREG since 12c) can no longer register the database:

[oracle@server1 ~]$ lsnrctl status listener_orcl

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 30-JAN-2017 17:07:05

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=server1.domain.com)(PORT=1531)))
TNS-12547: TNS:lost contact
 TNS-12560: TNS:protocol adapter error
  TNS-00517: Lost contact
   Linux Error: 104: Connection reset by peer
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1531)))
STATUS of the LISTENER
------------------------
Alias                     listener_orcl
Version                   TNSLSNR for Linux: Version 12.1.0.2.0 - Production
Start Date                30-JAN-2017 16:52:27
Uptime                    0 days 0 hr. 14 min. 38 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/server1/listener_orcl/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=server1)(PORT=1531)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1531)))
The listener supports no services
The command completed successfully

So you have to add the database server name to the tcp.invited_nodes parameter:

tcp.validnode_checking=yes
tcp.excluded_nodes=(*)
tcp.invited_nodes=(server1.domain.com)
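
To avoid forgetting the database server itself, the three parameters can be generated from a list of clients. This is only a sketch: the function name validnode_params is hypothetical and it simply prints the lines to paste into sqlnet.ora:

```shell
#!/bin/sh
# Hypothetical helper: print the valid node checking parameters for sqlnet.ora.
# $1 is the database server itself (always invited so that the instance can
# still register its services); the remaining arguments are the allowed clients.
validnode_params() {
  invited=$1
  shift
  for client in "$@"; do
    invited="$invited,$client"
  done
  echo "tcp.validnode_checking=yes"
  echo "tcp.excluded_nodes=(*)"
  echo "tcp.invited_nodes=($invited)"
}

validnode_params server1.domain.com server2.domain.com
```

Called with only the database server name, it produces the three lines shown just above.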

And reload configuration again:

[oracle@server1 ~]$ lsnrctl reload listener_orcl

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 30-JAN-2017 17:11:10

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=server1.domain.com)(PORT=1531)))
TNS-12547: TNS:lost contact
 TNS-12560: TNS:protocol adapter error
  TNS-00517: Lost contact
   Linux Error: 104: Connection reset by peer
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1531)))
The command completed successfully

After this, services are supported again and, as expected, connections from my two clients do not work:

[oracle@server2 ~]$ tnsping //server1.domain.com:1531/pdb1

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 30-JAN-2017 17:14:25

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/sqlnet.ora

Used HOSTNAME adapter to resolve the alias
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=pdb1))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.56.101)(PORT=1531)))
TNS-12547: TNS:lost contact

Same as before, if I want to give access only to server2.domain.com I change my sqlnet.ora to (and reload the listener to activate it):

tcp.validnode_checking=yes
tcp.excluded_nodes=(*)
tcp.invited_nodes=(server1.domain.com,server2.domain.com)

Now I can connect from server2.domain.com:

[oracle@server2 ~]$ tnsping //server1.domain.com:1531/pdb1

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 30-JAN-2017 17:21:33

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/sqlnet.ora

Used HOSTNAME adapter to resolve the alias
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=pdb1))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.56.101)(PORT=1531)))
OK (10 msec)
[oracle@server2 ~]$ sqlplus yjaquier/'secure_password'@//server1.domain.com:1531/pdb1

SQL*Plus: Release 12.1.0.2.0 Production on Mon Jan 30 17:21:43 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Mon Jan 30 2017 13:00:07 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL>

Access from server3.domain.com is still forbidden (note the different TNS message compared to the firewall method):

[oracle@server3 ~]$ tnsping //server1.domain.com:1531/pdb1

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 30-JAN-2017 18:13:10

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:

Used HOSTNAME adapter to resolve the alias
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=pdb1))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.56.101)(PORT=1531)))
TNS-12547: TNS:lost contact

One drawback I have found in MOS is that the listener fails to start if there is an invalid hostname in the TCP.INVITED_NODES parameter. For popular databases in your organisation, managing all client names in a single line might also become cumbersome…
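
One way to reduce that risk is to check that every name resolves before reloading the listener. A minimal sketch, assuming getent is available on the database server; the function names resolve_host and unresolved_hosts are hypothetical:

```shell
#!/bin/sh
# Succeeds when the name (or IP address) resolves on this host.
resolve_host() {
  getent hosts "$1" >/dev/null 2>&1
}

# Hypothetical helper: print every entry of a comma separated list
# (the content of tcp.invited_nodes) that does not resolve.
unresolved_hosts() {
  echo "$1" | tr ',' '\n' | while read -r host; do
    [ -n "$host" ] || continue
    resolve_host "$host" || echo "$host"
  done
}

bad=$(unresolved_hosts "server1.domain.com,server2.domain.com")
if [ -n "$bad" ]; then
  echo "Do not reload the listener, unresolved entries:"
  echo "$bad"
fi
```

The check only tells you that a name resolves on the database server at that moment; it does not replace testing the listener reload itself.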

Connection Manager

Connection Manager is a proxy server (SQL proxy) that forwards connections to databases (or to other proxy servers). You normally install it on a standalone server, different from your database server, but to ease my testing I will install it on my database server, which is clearly not recommended. After a few glitches (mainly due to the single shared sqlnet.ora file) while using the same Oracle home as my database, I decided to install Connection Manager in its own separate Oracle home. To install it I have used the Oracle Database 12c Release 1 Client (12.1.0.2.0) for Linux x86-64 (64-bit) zip file.

From the official documentation, Connection Manager main features are:

  • Access control: To use rule-based configuration to filter out user-specified client requests and accept others.
  • Session multiplexing: To funnel multiple client sessions through a network connection to a shared server destination.

Installation

Custom installation:

secure_database_network03

Language:

secure_database_network04

Choosing an Oracle home separate from the one of my Oracle database:

secure_database_network05

Selecting only Connection Manager, the other required components are automatically chosen by the installer:

secure_database_network06

Summary:

secure_database_network07

Installation:

secure_database_network08

Default network configuration (I had to stop the default listener afterward):

secure_database_network09

Root.sh:

secure_database_network10

Successful installation:

secure_database_network11

Configuration

To create the first Connection Manager configuration I have used the My Oracle Support (MOS) note called Connection Manager Configuration Utility (Doc ID 1435277.1). Execute the provided tool with something like (I have extracted the zip file in the /tmp directory):

[oracle@server1 ~]$ java -jar /tmp/CmanConfig.jar

Choose your configuration mode:

secure_database_network12

Fill in the few required pieces of information (name, port, Oracle home and hostname):

secure_database_network13

Successful execution message:

secure_database_network14

This small tool generates a file called CMAN_Basic_Next_Step.rtf in the directory where you have run it. This file only contains a copy command for the generated Connection Manager configuration file, /home/oracle/CMAN_Config/cman.ora in my case:

[oracle@server1 ~]$ cat /home/oracle/CMAN_Config/cman.ora
CMAN=
 (CONFIGURATION=
  (ADDRESS=(PROTOCOL=TCP)(HOST=server1.domain.com)(port=1541))
  (RULE_LIST =
    (RULE=(SRC=*) (DST=*) (SRV=*) (ACT=ACCEPT)))
 (PARAMETER_LIST =
  (LOG_LEVEL=OFF)
  (LOG_DIRECTORY=/u01/app/oracle/product/12.1.0/dbhome_2/network/log)
  (DIAG_ADR_ENABLED=OFF)
  (TRACE_LEVEL=OFF)
  (TRACE_TIMESTAMP=ON)
  (TRACE_DIRECTORY=OFF)
  ##(TRACE_FILELEN=1024)
  ##(TRACE_FILENO=10)
  (ASO_AUTHENTICATION_FILTER=OFF)
  (CONNECTION_STATISTICS=NO)
  (IDLE_TIMEOUT=0)
  (INBOUND_CONNECT_TIMEOUT=0)
  ##(MAX_CMCTL_SESSIONS=4)
  ##(MAX_CONNECTIONS=1024)
  ##(MAX_GATEWAY_PROCESSES=4)
  ##(MIN_GATEWAY_PROCESSES=4)
  (OUTBOUND_CONNECT_TIMEOUT=0)
  (SESSION_TIMEOUT=0)
  )
 )

I have customized it: in the following cman.ora example Connection Manager listens on port 1541 and forwards connections for the pdb1 service to the database running on the same server, for all clients whatever their IP address. You can add as many rules as you wish:

CMAN=
 (CONFIGURATION=
  (ADDRESS=(PROTOCOL=TCP)(HOST=server1.domain.com)(port=1541))
  (RULE_LIST =
    (RULE=(SRC=*) (DST=server1.domain.com) (SRV=pdb1) (ACT=ACCEPT))
  )
 (PARAMETER_LIST =
  (LOG_LEVEL=OFF)
  (LOG_DIRECTORY=/u01/app/oracle/product/12.1.0/dbhome_2/network/log)
  (DIAG_ADR_ENABLED=OFF)
  (TRACE_LEVEL=OFF)
  (TRACE_TIMESTAMP=ON)
  (TRACE_DIRECTORY=OFF)
  (ASO_AUTHENTICATION_FILTER=OFF)
  (CONNECTION_STATISTICS=NO)
  (IDLE_TIMEOUT=0)
  (INBOUND_CONNECT_TIMEOUT=0)
  (OUTBOUND_CONNECT_TIMEOUT=0)
  (SESSION_TIMEOUT=0)
  )
 )

To forbid direct connections to the listener you need to restrict it to the server where your Connection Manager is running, using the valid node checking feature we have seen above:

tcp.validnode_checking=yes
tcp.excluded_nodes=(*)
tcp.invited_nodes=(server1.domain.com)

On Unix the executable to control Connection Manager is called cmctl. My first startup attempt failed miserably:

[oracle@server1 ~]$ cmctl

CMCTL for Linux: Version 12.1.0.2.0 - Production on 31-JAN-2017 11:37:29

Copyright (c) 1996, 2014, Oracle.  All rights reserved.

Welcome to CMCTL, type "help" for information.

CMCTL> administer cman
Current instance cman is not yet started
Connections refer to (ADDRESS=(PROTOCOL=TCP)(HOST=server1.domain.com)(port=1541)).
The command completed successfully.
CMCTL:cman> startup
TNS-04012: Unable to start Oracle Connection Manager instance.
CMCTL:cman>

In the $ORACLE_HOME/network/log/cman_alert.log file I have found:

(LOG_RECORD=(TIMESTAMP=31-JAN-2017 11:37:51)(EVENT=CMAN.ORA contains no rule for local CMCTL connection)(Add (rule=(src=server1)(dst=127.0.0.1)(srv=cmon)(act=accept)) in rule_list)

So I added it to the cman.ora file:

    (RULE=(SRC=server1.domain.com) (DST=127.0.0.1) (SRV=cmon) (ACT=ACCEPT))

The second startup attempt also failed:

CMCTL:cman> startup
Starting Oracle Connection Manager instance cman. Please wait...
TNS-04013: CMCTL timed out waiting for Oracle Connection Manager to start

I replaced it with what is explained in CMAN Fails to Start and Throws the Following Errors: TNS-04013 and TNS-12529 (Doc ID 1059938.1). The issue is related to IPv6:

    (RULE=(SRC=server1.domain.com) (DST=::1) (SRV=cmon) (ACT=ACCEPT))

The third startup attempt failed with:

CMCTL:cman> startup
TNS-04012: Unable to start Oracle Connection Manager instance.

In the $ORACLE_HOME/network/log/cman_alert.log file I have found:

Error listening on: (ADDRESS=(PROTOCOL=TCP)(HOST=server1.domain.com)(port=1541))
TNS-12542: TNS:address already in use
 TNS-12560: TNS:protocol adapter error
  TNS-00512: Address already in use
   Linux Error: 98: Address already in use
)(OPN=77)(NS1=12564)(NS2=0)(NT1=0)(NT2=0))

I killed the remaining processes:

[root@server1 tmp]# ps -ef | grep cman |grep -v grep
oracle   21261     1  0 11:45 ?        00:00:00 /u01/app/oracle/product/12.1.0/dbhome_2/bin/cmadmin cman -inherit
oracle   21264     1  0 11:45 ?        00:00:00 /u01/app/oracle/product/12.1.0/dbhome_2/bin/tnslsnr ifile=/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/cman.ora cman -inherit -mode proxy
oracle   21267     1  0 11:45 ?        00:00:00 /u01/app/oracle/product/12.1.0/dbhome_2/bin/cmgw cmgw0 0 16 cman SNLSM:99224000
oracle   21269     1  0 11:45 ?        00:00:00 /u01/app/oracle/product/12.1.0/dbhome_2/bin/cmgw cmgw1 1 16 cman SNLSM:99224000

And this time it finally worked:

CMCTL:cman> startup
Starting Oracle Connection Manager instance cman. Please wait...
TNS-04077: WARNING: No password set for the Oracle Connection Manager instance.
CMAN for Linux: Version 12.1.0.2.0 - Production
Status of the Instance
----------------------
Instance name             cman
Version                   CMAN for Linux: Version 12.1.0.2.0 - Production
Start date                31-JAN-2017 11:56:21
Uptime                    0 days 0 hr. 0 min. 9 sec
Num of gateways started   2
Average Load level        0
Log Level                 OFF
Trace Level               OFF
Instance Config file      /u01/app/oracle/product/12.1.0/dbhome_2/network/admin/cman.ora
Instance Log directory    /u01/app/oracle/product/12.1.0/dbhome_2/network/log
Instance Trace directory  OFF
The command completed successfully.

In the tnsnames.ora file of my database server Oracle home I have added the CMAN entry (listener_orcl was already there):

LISTENER_ORCL =
  (ADDRESS = (PROTOCOL = TCP)(HOST = server1.domain.com)(PORT = 1531))

CMAN =
  (ADDRESS = (PROTOCOL = TCP)(HOST = server1.domain.com)(PORT = 1541))

Then I set Connection Manager as the remote listener of my Oracle database:

SQL> alter system set remote_listener=cman;

System altered.

SQL> show parameter listener

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
listener_networks                    string
local_listener                       string      LISTENER_ORCL
remote_listener                      string      CMAN

SQL> alter system register;

System altered.

Now services are well served by Connection Manager:

CMCTL:cman> show services
Services Summary...
Proxy service "cmgw" has 1 instance(s).
  Instance "cman", status READY, has 2 handler(s) for this service...
    Handler(s):
      "cmgw001" established:0 refused:0 current:0 max:256 state:ready
         
         (ADDRESS=(PROTOCOL=tcp)(HOST=::1)(PORT=31192))
      "cmgw000" established:0 refused:0 current:0 max:256 state:ready
         
         (ADDRESS=(PROTOCOL=tcp)(HOST=::1)(PORT=44850))
Service "cmon" has 1 instance(s).
  Instance "cman", status READY, has 1 handler(s) for this service...
    Handler(s):
      "cmon" established:1 refused:0 current:1 max:4 state:ready
         
         (ADDRESS=(PROTOCOL=tcp)(HOST=::1)(PORT=64226))
Service "orcl" has 1 instance(s).
  Instance "orcl", status READY, has 1 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0 state:ready
         REMOTE SERVER
         (ADDRESS=(PROTOCOL=TCP)(HOST=server1)(PORT=1531))
Service "pdb1" has 1 instance(s).
  Instance "orcl", status READY, has 1 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0 state:ready
         REMOTE SERVER
         (ADDRESS=(PROTOCOL=TCP)(HOST=server1)(PORT=1531))
The command completed successfully.

Connections are also correctly forwarded:

[oracle@server2 ~]$ tnsping //server1.domain.com:1541/pdb1

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 31-JAN-2017 16:06:28

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/sqlnet.ora

Used HOSTNAME adapter to resolve the alias
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=pdb1))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.56.101)(PORT=1541)))
OK (0 msec)

The problem is that direct access to the listener is still possible, and so far from any client:

[oracle@server2 ~]$ tnsping //server1.domain.com:1531/pdb1

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 31-JAN-2017 16:06:30

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/sqlnet.ora

Used HOSTNAME adapter to resolve the alias
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=pdb1))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.56.101)(PORT=1531)))
OK (0 msec)

To forbid direct access to my listener I use the valid node checking feature we have seen above and allow connections only from the Connection Manager host, which is also server1.domain.com (and I reload the listener to activate it):

tcp.validnode_checking=yes
tcp.excluded_nodes=(*)
tcp.invited_nodes=(server1.domain.com)

Now direct listener connections are no longer possible from any client and I must go through Connection Manager:

[oracle@server2 ~]$ tnsping //server1.domain.com:1531/pdb1

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 31-JAN-2017 17:02:57

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/sqlnet.ora

Used HOSTNAME adapter to resolve the alias
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=pdb1))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.56.101)(PORT=1531)))
TNS-12547: TNS:lost contact
[oracle@server2 ~]$ tnsping //server1.domain.com:1541/pdb1

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 31-JAN-2017 17:02:59

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/sqlnet.ora

Used HOSTNAME adapter to resolve the alias
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=pdb1))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.56.101)(PORT=1541)))
OK (10 msec)

Now, same as before, I want to forbid access from server3.domain.com only. I change the rules part of my cman.ora file to:

  (RULE_LIST =
    (RULE=(SRC=server2.domain.com) (DST=server1.domain.com) (SRV=pdb1) (ACT=ACCEPT))
    (RULE=(SRC=server1.domain.com) (DST=::1) (SRV=cmon) (ACT=ACCEPT))
  )
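
If the list of allowed clients grows, the RULE_LIST can be generated instead of edited by hand. A minimal sketch: the function name cman_rule_list is hypothetical, the destination and service are the ones of this post, and the cmon rule needed by cmctl is always appended:

```shell
#!/bin/sh
# Hypothetical helper: print a cman.ora RULE_LIST accepting the pdb1 service
# on server1.domain.com for each client given as argument.
cman_rule_list() {
  echo "  (RULE_LIST ="
  for src in "$@"; do
    echo "    (RULE=(SRC=$src) (DST=server1.domain.com) (SRV=pdb1) (ACT=ACCEPT))"
  done
  # Rule required for the local cmctl connection (IPv6 loopback in this post).
  echo "    (RULE=(SRC=server1.domain.com) (DST=::1) (SRV=cmon) (ACT=ACCEPT))"
  echo "  )"
}

cman_rule_list server2.domain.com
```

With server2.domain.com as the only argument, the output is the rule list shown above.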

You need to reload Connection Manager, and you can display the active rules with:

CMCTL:cman> reload
The command completed successfully.
CMCTL:cman> show rules
Number of filtering rules currently in effect: 2
(rule_list=
  (rule=
    (SRC=server2.domain.com)
    (DST=server1.domain.com)
    (SRV=pdb1)
    (ACT=ACCEPT)
  )
  (rule=
    (SRC=server1.domain.com)
    (DST=::1)
    (SRV=cmon)
    (ACT=ACCEPT)
  )
)
The command completed successfully.

Behavior remains unchanged for server2.domain.com, while for server3.domain.com it is working only partially, I would say. Tnsping still answers, while connections are rejected as expected:

[oracle@server3 ~]$ tnsping //server1.domain.com:1541/pdb1

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 31-JAN-2017 18:15:30

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:

Used HOSTNAME adapter to resolve the alias
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=pdb1))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.56.101)(PORT=1541)))
OK (0 msec)
[oracle@server3 ~]$ sqlplus yjaquier/'secure_password'@//server1.domain.com:1541/pdb1

SQL*Plus: Release 12.1.0.2.0 Production on Tue Jan 31 18:15:33 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

ERROR:
ORA-12529: TNS:connect request rejected based on current filtering rules

I have tried to explicitly drop or reject connections from any server with the rule below, but tnsping still answers positively:

  (rule=
    (SRC=*)
    (DST=server1.domain.com)
    (SRV=pdb1)
    (ACT=drop)
  )

If you wonder about the possible values of the ACT parameter, and more precisely about the difference between reject and drop, the Oracle official documentation says: accept to accept incoming requests, reject to reject incoming requests, or drop to reject incoming requests without sending an error message.

I would have expected the drop option to refuse even tnsping requests but it is not the case…

Restricting database network conclusion

The Linux firewall method is the simplest; its only drawback is that you are obliged to work with IP addresses, so it is not really compatible with a population of end-user clients using DHCP. This is the solution I would implement for a database that is not directly accessed by end users.

Valid node checking is better in this respect as you can specify hostnames as well as IP addresses. There is apparently no particular limitation on the parameter length in case you have a long list of clients. It also accepts wildcards in IP addresses, even if I have not tested it. The only (big) drawback is that the listener will not start if one of the hostnames or IP addresses cannot be resolved, which you can check with either ping or nslookup. It might also become complex to handle with a long list of clients. This would be my recommended solution for a simple intranet need.

Finally, Connection Manager is more the solution you would use with a demilitarized zone (DMZ), as it drastically decreases the number of rules you need to add in your firewall: all DMZ clients go through Connection Manager to access a database that could be in your intranet. Connection Manager accepts wildcards in IP addresses only in the form x.x.x.x/nn, where nn represents a subnet mask made of the nn left-most bits. This is the solution I would use for an internet need or when plenty of clients access the database directly, as the management would be a little easier. And with many clients, connection multiplexing could also be implemented.
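
To illustrate what the x.x.x.x/nn notation covers, here is a small sketch that tests whether an IPv4 address falls inside such a range. It only illustrates the arithmetic (the nn left-most bits are compared) and says nothing about how Connection Manager is implemented; the function names ip_to_int and ip_in_cidr are hypothetical:

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address (no leading zeros) to a 32-bit integer.
ip_to_int() {
  IFS=. read -r o1 o2 o3 o4 <<EOF
$1
EOF
  echo $(( (o1 << 24) | (o2 << 16) | (o3 << 8) | o4 ))
}

# Succeeds when address $1 belongs to range $2, written as x.x.x.x/nn.
ip_in_cidr() {
  ip=$1
  net=${2%/*}
  bits=${2#*/}
  # Mask with the nn left-most bits set to 1.
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

if ip_in_cidr 192.168.56.102 192.168.56.0/24; then
  echo "192.168.56.102 is inside 192.168.56.0/24"
fi
```

With /24 the first three bytes must match, so the two clients of this post (192.168.56.102 and 192.168.56.103) would both be covered by 192.168.56.0/24.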

The post Restricting and securing your database network – part 1 appeared first on IT World.

Secure external password store (SEPS) implementation https://blog.yannickjaquier.com/oracle/secure-external-password-store-implementation.html Mon, 19 Feb 2018 09:31:00 +0000

The post Secure external password store (SEPS) implementation appeared first on IT World.

Preamble

Warning: nothing recent in this blog post! But, as usual, in our continuous effort to secure our Oracle databases, and given the never-ending requests of our preferred SOX auditors, we have been looking for a solution to hide applicative account passwords from developers and end users.

You might argue that those applicative passwords should not be known by any unauthorized person, and you would be right, except that in real life there might be some divergence from this basic rule. So how would they know those applicative account passwords? Simply by displaying a batch job script or the running processes (ps command), for example. Introduced with Oracle 10gR2, the Oracle secure external password store (SEPS) feature targets exactly this problem: hiding clear text passwords in batch scripts and allowing people to access a database with an account without knowing its password.

This feature relies on Oracle Wallet: we store accounts and their associated passwords inside it, encrypted (3DES) of course, and connections are then passwordless on the command line, either interactively or in batch jobs.

Last but not least, this feature does not require the paid Oracle Advanced Security option. You can even use it for free on your Windows laptop client!

This blog post has been written using two virtual machines running Oracle 12cR2 (12.2.0.1.0) Enterprise Edition, one for the database and one for the client part. Both machines run Oracle Linux Server release 7.4. Below, server1.domain.com is my database server while the client part runs on server4.domain.com.

Wallet creation

Wallet management is made of three distinct tools:

  • Oracle Wallet Manager (OWM), only graphical tool in this list
  • orapki
  • mkstore

Very quickly you will realize that OWM has no menu to manage SEPS and, even if you can create an empty wallet, you cannot save it while it is empty… So we need to work with the command line tools…

The recommended tool is mkstore, but orapki has the interesting -auto_login_local option, which prevents the wallet from being used on another machine if it is copied there. I will use the OWM default directory, i.e. $ORACLE_HOME/owm/wallets/oracle.

Create the wallet with:

[oracle@server4 ~]$ mkstore -wrl $ORACLE_HOME/owm/wallets/oracle -create
Oracle Secret Store Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Enter password:
Enter password again:
[oracle@server4 ~]$ ll $ORACLE_HOME/owm/wallets/oracle
total 8
-rw------- 1 oracle dba 194 Feb  6 12:46 cwallet.sso
-rw------- 1 oracle dba   0 Feb  6 12:46 cwallet.sso.lck
-rw------- 1 oracle dba 149 Feb  6 12:46 ewallet.p12
-rw------- 1 oracle dba   0 Feb  6 12:46 ewallet.p12.lck

Or with orapki and the -auto_login_local option:

[oracle@server4 ~]$ orapki wallet create -wallet $ORACLE_HOME/owm/wallets/oracle -auto_login_local
Oracle PKI Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Enter password:
Enter password again:
Operation is successfully completed.
[oracle@server4 ~]$ ll $ORACLE_HOME/owm/wallets/oracle
total 8
-rw------- 1 oracle dba 194 Feb  6 12:48 cwallet.sso
-rw------- 1 oracle dba   0 Feb  6 12:48 cwallet.sso.lck
-rw------- 1 oracle dba 149 Feb  6 12:48 ewallet.p12
-rw------- 1 oracle dba   0 Feb  6 12:48 ewallet.p12.lck

You also need to modify your client sqlnet.ora to specify where your wallet is with WALLET_LOCATION, and to tell the client to override the supplied credentials with the ones stored in the wallet with SQLNET.WALLET_OVERRIDE:

[oracle@server4 ~]$ cat $ORACLE_HOME/network/admin/sqlnet.ora
WALLET_LOCATION=
  (SOURCE=
      (METHOD=file)
      (METHOD_DATA=
         (DIRECTORY=/u01/app/oracle/product/12.2.0/client_1/owm/wallets/oracle)))

SQLNET.WALLET_OVERRIDE=true
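
The same two parameters can be generated for any wallet directory. A sketch only: wallet_sqlnet is a hypothetical helper that prints the lines instead of editing sqlnet.ora, and the fallback path is the client Oracle home used in this post:

```shell
#!/bin/sh
# Hypothetical helper: print the sqlnet.ora parameters pointing to a wallet directory.
wallet_sqlnet() {
  cat <<EOF
WALLET_LOCATION=
  (SOURCE=
      (METHOD=file)
      (METHOD_DATA=
         (DIRECTORY=$1)))

SQLNET.WALLET_OVERRIDE=true
EOF
}

# Uses ORACLE_HOME when it is set, else the client home of this post.
wallet_sqlnet "${ORACLE_HOME:-/u01/app/oracle/product/12.2.0/client_1}/owm/wallets/oracle"
```

Review the output before appending it to $ORACLE_HOME/network/admin/sqlnet.ora, as the file may already contain a WALLET_LOCATION entry.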

Secure External Password Store credentials creation

To insert credentials into your wallet the only available tool is mkstore. When creating a credential you either supply the password on the command line, as in the first example, or, better, supply it interactively, as in the second example. As you have guessed, -deleteCredential is used to delete a credential and -listCredential to list them:

[oracle@server4 ~]$ mkstore
Oracle Secret Store Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

mkstore [-wrl wrl] [-create] [-createSSO] [-createLSSO] [-createALO] [-delete] [-deleteSSO] [-list] [-createEntry alias secret] [-viewEntry alias]
[-modifyEntry alias secret] [-deleteEntry alias] [-createCredential connect_string username password] [-listCredential]
[-modifyCredential connect_string username password] [-deleteCredential connect_string]  [-createUserCredential map key   password]
[-modifyUserCredential map key username password]  [-deleteUserCredential map key] [-help] [-nologo]
[oracle@server4 ~]$ mkstore -wrl $ORACLE_HOME/owm/wallets/oracle -createCredential pdb1_yjaquier yjaquier secure_password
Oracle Secret Store Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:
[oracle@server4 ~]$ mkstore -wrl $ORACLE_HOME/owm/wallets/oracle -deleteCredential pdb1_yjaquier
Oracle Secret Store Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:
[oracle@server4 ~]$ mkstore -wrl $ORACLE_HOME/owm/wallets/oracle -createCredential pdb1_yjaquier yjaquier
Oracle Secret Store Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Your secret/Password is missing in the command line
Enter your secret/Password:
Re-enter your secret/Password:
Enter wallet password:
[oracle@server4 ~]$ mkstore -wrl $ORACLE_HOME/owm/wallets/oracle -listCredential
Oracle Secret Store Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:
List credential (index: connect_string username)
1: pdb1_yjaquier yjaquier

The orapki equivalent is of little interest, and neither is the -list option of mkstore, as both only display the internal entry names:

[oracle@server4 ~]$ orapki wallet display -wallet /u01/app/oracle/product/12.2.0/client_1/owm/wallets/oracle
Oracle PKI Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
User Certificates:
Oracle Secret Store entries:
oracle.security.client.connect_string1
oracle.security.client.password1
oracle.security.client.username1
Trusted Certificates:
[oracle@server4 ~]$ mkstore -wrl /u01/app/oracle/product/12.2.0/client_1/owm/wallets/oracle -list
Oracle Secret Store Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:
Oracle Secret Store entries:
oracle.security.client.connect_string1
oracle.security.client.connect_string2
oracle.security.client.password1
oracle.security.client.password2
oracle.security.client.username1
oracle.security.client.username2

Remark:
If you want to modify or delete an entry, use the -modifyCredential and -deleteCredential options of mkstore.

You also need to insert a TNS entry in tnsnames.ora with exactly the same name as the credential you have just created:

[oracle@server4 ~]$ cat $ORACLE_HOME/network/admin/tnsnames.ora
pdb1_yjaquier =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = server1.domain.com)(PORT = 1531))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = pdb1)
    )
  )
[oracle@server4 ~]$ tnsping pdb1_yjaquier

TNS Ping Utility for Linux: Version 12.2.0.1.0 - Production on 06-FEB-2018 15:21:23

Copyright (c) 1997, 2016, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/12.2.0/client_1//network/admin/sqlnet.ora


Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = server1.domain.com)(PORT = 1531)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = pdb1)))
OK (10 msec)

Remark:
We see here that it is important to set up a good naming convention for credentials, or it might quickly become a mess. Here I have chosen <service name>_<account name>, which gives pdb1_yjaquier.
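
The naming convention above can be captured in a small helper so that every alias is built the same way. This is only a sketch: the class name SepsAlias and the method aliasFor are hypothetical names introduced for illustration, and they are not part of any Oracle tool.

```java
public class SepsAlias {
  // Hypothetical helper: builds "<service name>_<account name>", e.g. pdb1_yjaquier.
  public static String aliasFor(String serviceName, String accountName) {
    if (serviceName == null || serviceName.trim().isEmpty()
        || accountName == null || accountName.trim().isEmpty()) {
      throw new IllegalArgumentException("service name and account name are both required");
    }
    return serviceName.trim() + "_" + accountName.trim();
  }

  public static void main(String[] args) {
    // The resulting alias is used both for mkstore -createCredential and in tnsnames.ora
    System.out.println(aliasFor("pdb1", "yjaquier"));
  }
}
```

The same string must then be used for the mkstore credential and for the tnsnames.ora entry.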

You cannot create or edit those entries with Oracle Wallet Manager (OWM), but you can use it to display them:

seps01

One “funny” thing is that anyone who knows the wallet password can still display, in clear text, the passwords of the entries created in SEPS:

[oracle@server4 ~]$ mkstore -wrl /u01/app/oracle/product/12.2.0/client_1/owm/wallets/oracle -viewEntry oracle.security.client.username1
Oracle Secret Store Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:
oracle.security.client.username1 = yjaquier
[oracle@server4 ~]$ mkstore -wrl /u01/app/oracle/product/12.2.0/client_1/owm/wallets/oracle -viewEntry oracle.security.client.password1
Oracle Secret Store Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:
oracle.security.client.password1 = secure_password

Secure External Password Store testing

SQL*Plus

The simplest test is with SQL*Plus, which comes with my Linux client:

[oracle@server4 ~]$ sqlplus /@pdb1_yjaquier

SQL*Plus: Release 12.2.0.1.0 Production on Tue Feb 6 15:21:52 2018

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

Last Successful login time: Mon Feb 05 2018 15:02:21 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> show user
USER is "YJAQUIER"

JDBC OCI driver

The JDBC OCI driver is the simplest option when planning to use SEPS because you directly benefit from the Oracle client where SEPS has been configured through the Oracle wallet. The source code I have written is:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class seps_oci {
  public static void main(String[] args) {
    String query1 = "select user from dual";

    // "/@alias": no user or password in the URL, credentials come from the wallet
    try (Connection connection1 = DriverManager.getConnection("jdbc:oracle:oci:/@pdb1_yjaquier")) {
      System.out.println("Connected to Oracle database...");

      // try-with-resources closes statement and result set even when the query fails
      try (Statement statement1 = connection1.createStatement();
           ResultSet resultset1 = statement1.executeQuery(query1)) {
        while (resultset1.next()) {
          System.out.println("Connected user: "+resultset1.getString(1));
        }
      }
      catch (SQLException e) {
        System.out.println("Query has failed...");
        e.printStackTrace();
      }
    }
    catch (SQLException e) {
      System.out.println("Connection Failed! Check output console");
      e.printStackTrace();
      System.exit(1);
    }
  }
}

To compile and run it from the command line (I normally use Eclipse):

[oracle@server4 ~]$ javac -cp $ORACLE_HOME/jdbc/lib/ojdbc8.jar seps_oci.java
[oracle@server4 ~]$ java -cp $ORACLE_HOME/jdbc/lib/ojdbc8.jar:. seps_oci
Connected to Oracle database...
Connected user: YJAQUIER

JDBC Thin driver

Using the JDBC Thin driver is a little more complex because nothing is pre-configured by the installed Oracle client, as it is for the JDBC OCI driver. And here it is a little weird: with the Thin driver you are not supposed to need a client, but you do need one for the libraries and the wallet configuration. Please note that the Instant Client is not enough to do the job.

The first property to set when connecting is the wallet location, oracle.net.wallet_location:

props.setProperty("oracle.net.wallet_location","(SOURCE=(METHOD=file)(METHOD_DATA=(DIRECTORY=/u01/app/oracle/product/12.2.0/client_1/owm/wallets/oracle)))");

Then, to specify the connect string, you have two options. Either you insert the complete TNS entry in SEPS with something like:

[oracle@server4 ~]$ mkstore -wrl $ORACLE_HOME/owm/wallets/oracle -createCredential "(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=server1.domain.com)(PORT=1531))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=pdb1)))" yjaquier
Oracle Secret Store Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Your secret/Password is missing in the command line
Enter your secret/Password:
Re-enter your secret/Password:
Enter wallet password:
[oracle@server4 ~]$ mkstore -wrl $ORACLE_HOME/owm/wallets/oracle -listCredential
Oracle Secret Store Tool : Version 12.2.0.1.0
Copyright (c) 2004, 2016, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:
List credential (index: connect_string username)
2: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=server1.domain.com)(PORT=1531))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=pdb1))) yjaquier
1: pdb1_yjaquier yjaquier

Or you tell your Java program where the tnsnames.ora file is located with the oracle.net.tns_admin property:

System.setProperty("oracle.net.tns_admin","/u01/app/oracle/product/12.2.0/client_1/network/admin");

I have kept both options in the Java code below, the first one commented out:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

public class seps_thin {
  public static void main(String[] args) {
    String query1 = "select user from dual";
    // Option 1: complete connect descriptor, it must match the credential stored in the wallet
    String connect_string = "(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=server1.domain.com)(PORT=1531))"+
                            "(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=pdb1)))";
    Properties props = new Properties();

    // The Thin driver must be told where the wallet is
    props.setProperty("oracle.net.wallet_location","(SOURCE=(METHOD=file)(METHOD_DATA="+
                      "(DIRECTORY=/u01/app/oracle/product/12.2.0/client_1/owm/wallets/oracle)))");
    // Option 2: TNS alias, resolved through the tnsnames.ora found in this directory
    System.setProperty("oracle.net.tns_admin","/u01/app/oracle/product/12.2.0/client_1/network/admin");

    //try (Connection connection1 = DriverManager.getConnection("jdbc:oracle:thin:/@" + connect_string, props)) {
    try (Connection connection1 = DriverManager.getConnection("jdbc:oracle:thin:/@pdb1_yjaquier", props)) {
      System.out.println("Connected to Oracle database...");

      // try-with-resources closes statement and result set even when the query fails
      try (Statement statement1 = connection1.createStatement();
           ResultSet resultset1 = statement1.executeQuery(query1)) {
        while (resultset1.next()) {
          System.out.println("Connected user: "+resultset1.getString(1));
        }
      }
      catch (SQLException e) {
        System.out.println("Query has failed...");
        e.printStackTrace();
      }
    }
    catch (SQLException e) {
      System.out.println("Connection Failed! Check output console");
      e.printStackTrace();
      System.exit(1);
    }
  }
}

Compile and run it the same way as for the JDBC OCI driver, except that you need to add oraclepki.jar from the $ORACLE_HOME/jlib directory (not the one from the $ORACLE_HOME/oc4j/jlib directory):

[oracle@server4 ~]$ javac -cp $ORACLE_HOME/jdbc/lib/ojdbc8.jar:$ORACLE_HOME/jlib/oraclepki.jar seps_thin.java
[oracle@server4 ~]$ java -cp $ORACLE_HOME/jdbc/lib/ojdbc8.jar:$ORACLE_HOME/jlib/oraclepki.jar:. seps_thin
Connected to Oracle database...
Connected user: YJAQUIER
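
Because the wallet directory appears both in sqlnet.ora and in the oracle.net.wallet_location property, it can be convenient to build the property value from a single directory string. A minimal sketch, assuming the value format used in the code above; the class WalletProps and its two methods are hypothetical names, not part of the Oracle JDBC driver.

```java
import java.util.Properties;

public class WalletProps {
  // Formats the wallet descriptor in the form used for oracle.net.wallet_location above.
  public static String walletLocation(String directory) {
    return "(SOURCE=(METHOD=file)(METHOD_DATA=(DIRECTORY=" + directory + ")))";
  }

  // Returns connection properties that carry only the wallet location.
  public static Properties sepsProperties(String directory) {
    Properties props = new Properties();
    props.setProperty("oracle.net.wallet_location", walletLocation(directory));
    return props;
  }

  public static void main(String[] args) {
    String dir = "/u01/app/oracle/product/12.2.0/client_1/owm/wallets/oracle";
    System.out.println(sepsProperties(dir).getProperty("oracle.net.wallet_location"));
  }
}
```

The returned Properties object can be passed as the second argument of DriverManager.getConnection, exactly like props in seps_thin.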

AWR mining for performance trend analysis https://blog.yannickjaquier.com/oracle/awr-mining-performance-trends.html Sat, 20 Jan 2018 07:04:10 +0000


Preamble

Following a performance issue on a BI environment, we used Automatic Workload Repository (AWR) reports to extract a few SQL statements that run for tens of minutes, not to say multiple hours. Before the mess started we had a hardware storage issue (I/O switches) which triggered additional I/Os while the situation was being recovered. In parallel the application team, who had received no feedback that the recovery was in progress, started to reorganize tables to try to lower the High Water Mark and improve performance. Overall the effect was the opposite: once the storage issues were completely resolved, we did not get back the performance we had before it all started.

The big question you must answer is: how was the database behaving before the issue, and has the execution time of those SQL statements changed?

Ideally you would have a baseline taken when performance was good and you would be able to compare against it, as we have already seen in this blog post. Another option is to mine the AWR tables to compare how SQL statements have diverged over time. For this you obviously need historical AWR snapshots, so it is a must to keep at least 15 days of history (not to say one month) and to change the default retention, which is too low (8 days in 11g, 7 days in 10g). Both arguments of the procedure are expressed in minutes. Example with 1 hour snapshots and 30 days of history (30 x 24 x 60 = 43200 minutes):

SQL> exec dbms_workload_repository.modify_snapshot_settings (interval=>60, retention=>43200);

PL/SQL procedure successfully completed.
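
Since the retention is given in minutes, the value for any number of days is easy to get wrong by hand. The conversion is sketched here; the class AwrRetention and the method retentionMinutes are hypothetical names used only for illustration.

```java
import java.time.Duration;

public class AwrRetention {
  // Converts a retention expressed in days into the minutes expected by modify_snapshot_settings.
  public static long retentionMinutes(long days) {
    return Duration.ofDays(days).toMinutes();
  }

  public static void main(String[] args) {
    System.out.println("15 days = " + retentionMinutes(15) + " minutes");
    System.out.println("30 days = " + retentionMinutes(30) + " minutes");
  }
}
```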

Checking which SQL statements diverge is also very interesting information in itself. Even if you are not stuck in a performance issue, it can lead to nice discoveries about your batch job scheduling (too early, too late,…) and/or about jobs running in parallel.

So far testing has been done on Oracle Enterprise Edition 11.2.0.4 with the Tuning and Diagnostic packs, running on RedHat Linux release 6.4 64 bits. I will complement this post for other releases and operating systems in the future…

Parallel downgrades

If you are using Cloud Control, one thing you have surely noticed in the SQL Monitoring page (thanks to the red arrow) is the decrease in allocated parallel processes when your server is overloaded or when sessions use parallelism excessively:

awr_mining01

This can also be seen at SQL level with something like:

SQL> set lines 150 pages 1000
SQL> SELECT
  sql_id,
  sql_exec_id,
  TO_CHAR(sql_exec_start,'dd-mon-yyyy hh24:mi:ss') AS sql_exec_start,
  ROUND(elapsed_time/1000000) AS "Elapsed(s)",
  px_servers_requested AS "Requested DOP",
  px_servers_allocated AS "Allocated DOP",
  ROUND(cpu_time/1000000) AS "CPU(s)",
  buffer_gets AS "Buffer Gets",
  ROUND(physical_read_bytes /(1024*1024)) AS "Phys reads(MB)",
  ROUND(physical_write_bytes/(1024*1024)) AS "Phys writes(MB)",
  ROUND(user_io_wait_time/1000000) AS "IO Wait(s)"
FROM v$sql_monitor
WHERE px_servers_requested<>px_servers_allocated
ORDER BY sql_exec_start,sql_id;

SQL_ID        SQL_EXEC_ID SQL_EXEC_START       Elapsed(s) Requested DOP Allocated DOP     CPU(s) Buffer Gets Phys reads(MB) Phys writes(MB) IO Wait(s)
------------- ----------- -------------------- ---------- ------------- ------------- ---------- ----------- -------------- --------------- ----------
262dzg4ab75nt    16777216 02-dec-2016 15:22:34        263            32             0         10      511923           3625             810        250
883v5mk5bqwq8    16777216 02-dec-2016 15:23:43         41            64            14          8      156832           1426               0         27
7rykdg0zdyjz5    16777216 02-dec-2016 15:24:16        159            32             0          9      560335            802             856        151
amzmxuns5dctz    16777216 02-dec-2016 15:24:59        114            32             0          5       36383            569             548        108
414x9b7p4z10x    16777280 02-dec-2016 15:26:30          0            16            14          0           5              0               0          0
414x9b7p4z10x    16777281 02-dec-2016 15:26:31          0            16            10          0           5              0               0          0
414x9b7p4z10x    16777282 02-dec-2016 15:26:31          0            16            14          0           5              0               0          0
3472f0m6nm343    16778004 02-dec-2016 15:26:33          0            32            14          0         381              0               0          0
414x9b7p4z10x    16777283 02-dec-2016 15:26:35          0            16            14          0           5              0               0          0
3472f0m6nm343    16778005 02-dec-2016 15:26:40          0            32            14          0         381              0               0          0
3472f0m6nm343    16778006 02-dec-2016 15:26:46          0            32            14          0         381              0               0          0

11 rows selected.

The problem is that V$SQL_MONITOR has no historical version, so if you look at the database too late you will not be able to get any past information from it…

What you can get is the overall parallel situation of your database, cumulated since instance startup, with:

SQL> set lines 150 pages 1000
SQL> SELECT name, value
FROM V$SYSSTAT
WHERE lower(name) LIKE '%parallel%'
ORDER BY 1;

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
DBWR parallel query checkpoint buffers written                      1132544
DDL statements parallelized                                            1864
DFO trees parallelized                                               295265
DML statements parallelized                                              21
Parallel operations downgraded 1 to 25 pct                            55785
Parallel operations downgraded 25 to 50 pct                           12427
Parallel operations downgraded 50 to 75 pct                           78033
Parallel operations downgraded 75 to 99 pct                            8125
Parallel operations downgraded to serial                             150542
Parallel operations not downgraded                                   241815
queries parallelized                                                 208744

11 rows selected.
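
To put these counters in perspective, the share of parallel operations that ended up running serially can be computed from the six "Parallel operations ..." rows. A minimal sketch using the values above; the class and method names are hypothetical, and treating those six counters as the complete set of outcomes is my assumption.

```java
public class DowngradeShare {
  // Percentage that 'part' represents in the sum of all outcome counters.
  public static double percentOf(long part, long... allOutcomes) {
    long total = 0;
    for (long value : allOutcomes) {
      total += value;
    }
    if (total == 0) {
      return 0.0; // no parallel operation recorded yet
    }
    return 100.0 * part / total;
  }

  public static void main(String[] args) {
    // Values copied from the V$SYSSTAT output above
    long to25 = 55785, to50 = 12427, to75 = 78033, to99 = 8125;
    long toSerial = 150542, notDowngraded = 241815;
    double pct = percentOf(toSerial, to25, to50, to75, to99, toSerial, notDowngraded);
    System.out.println("Downgraded to serial (%): " + pct);
  }
}
```

With these figures more than a quarter of the parallel operations were fully downgraded to serial.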

As this view has a historical equivalent, DBA_HIST_SYSSTAT, you can get the (still cumulative) values for each AWR snapshot:

SQL> set lines 150 pages 1000
SQL> SELECT
  to_char(hsn.begin_interval_time,'dd-mon-yyyy hh24:mi:ss') AS begin_interval_time,
  to_char(hsn.end_interval_time,'dd-mon-yyyy hh24:mi:ss') AS end_interval_time,
  hsy.stat_name,
  hsy.value
FROM dba_hist_sysstat hsy, dba_hist_snapshot hsn
WHERE hsy.snap_id = hsn.snap_id
AND hsy.instance_number = hsn.instance_number
AND lower(hsy.stat_name) like '%parallel%'
ORDER BY hsn.snap_id DESC;


BEGIN_INTERVAL_TIME  END_INTERVAL_TIME    STAT_NAME                                                             VALUE
-------------------- -------------------- ---------------------------------------------------------------- ----------
30-nov-2016 23:00:18 01-dec-2016 00:00:48 queries parallelized                                                 182440
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations downgraded 25 to 50 pct                           10602
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations downgraded 1 to 25 pct                            48914
30-nov-2016 23:00:18 01-dec-2016 00:00:48 DML statements parallelized                                              17
30-nov-2016 23:00:18 01-dec-2016 00:00:48 DDL statements parallelized                                            1376
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations downgraded 50 to 75 pct                           56565
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations downgraded 75 to 99 pct                            7782
30-nov-2016 23:00:18 01-dec-2016 00:00:48 DBWR parallel query checkpoint buffers written                       936382
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations not downgraded                                   225266
30-nov-2016 23:00:18 01-dec-2016 00:00:48 DFO trees parallelized                                               248642
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations downgraded to serial                             117995
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations downgraded 50 to 75 pct                           56500
30-nov-2016 22:00:52 30-nov-2016 23:00:18 DFO trees parallelized                                               248352
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations not downgraded                                   225118
30-nov-2016 22:00:52 30-nov-2016 23:00:18 DBWR parallel query checkpoint buffers written                       919901
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations downgraded 75 to 99 pct                            7780
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations downgraded 25 to 50 pct                           10542
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations downgraded 1 to 25 pct                            48899
30-nov-2016 22:00:52 30-nov-2016 23:00:18 DML statements parallelized                                              17
30-nov-2016 22:00:52 30-nov-2016 23:00:18 queries parallelized                                                 182170
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations downgraded to serial                             117847
30-nov-2016 22:00:52 30-nov-2016 23:00:18 DDL statements parallelized                                            1356
.

With the LAG analytic function you can even get the trend of one particular system statistic. I have chosen Parallel operations downgraded to serial, which counts all the operations that were requested in parallel and finally ran serially. For some of them you can expect a big performance penalty:

SQL> set lines 150 pages 1000
SQL> SELECT
  to_char(hsn.begin_interval_time,'dd-mon-yyyy hh24:mi:ss') AS begin_interval_time,
  to_char(hsn.end_interval_time,'dd-mon-yyyy hh24:mi:ss') AS end_interval_time,
  hsy.stat_name,
  hsy.value - hsy.prev_value AS value
  FROM (SELECT snap_id,instance_number,stat_name,value,LAG(value,1,value) OVER (PARTITION BY instance_number ORDER BY snap_id) AS prev_value
        FROM dba_hist_sysstat
        WHERE stat_name = 'Parallel operations downgraded to serial') hsy,
        dba_hist_snapshot hsn
WHERE hsy.snap_id = hsn.snap_id
AND hsy.instance_number = hsn.instance_number
AND hsy.value - hsy.prev_value<>0
ORDER BY hsn.snap_id DESC;

BEGIN_INTERVAL_TIME  END_INTERVAL_TIME    STAT_NAME                                                             VALUE
-------------------- -------------------- ---------------------------------------------------------------- ----------
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations downgraded to serial                                148
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations downgraded to serial                                264
30-nov-2016 21:00:15 30-nov-2016 22:00:52 Parallel operations downgraded to serial                                160
30-nov-2016 20:00:42 30-nov-2016 21:00:15 Parallel operations downgraded to serial                                 65
30-nov-2016 19:00:25 30-nov-2016 20:00:42 Parallel operations downgraded to serial                                 31
30-nov-2016 18:00:29 30-nov-2016 19:00:25 Parallel operations downgraded to serial                                  8
30-nov-2016 17:00:57 30-nov-2016 18:00:29 Parallel operations downgraded to serial                               1789
30-nov-2016 16:00:11 30-nov-2016 17:00:57 Parallel operations downgraded to serial                                588
30-nov-2016 15:00:37 30-nov-2016 16:00:11 Parallel operations downgraded to serial                               1214
30-nov-2016 14:01:07 30-nov-2016 15:00:37 Parallel operations downgraded to serial                                856
30-nov-2016 09:00:24 30-nov-2016 10:00:26 Parallel operations downgraded to serial                                603
30-nov-2016 07:00:03 30-nov-2016 08:00:22 Parallel operations downgraded to serial                                  6
30-nov-2016 06:00:42 30-nov-2016 07:00:03 Parallel operations downgraded to serial                                502
30-nov-2016 05:00:24 30-nov-2016 06:00:42 Parallel operations downgraded to serial                                  8
30-nov-2016 04:00:04 30-nov-2016 05:00:24 Parallel operations downgraded to serial                               3032
30-nov-2016 03:00:37 30-nov-2016 04:00:04 Parallel operations downgraded to serial                               1161
30-nov-2016 02:00:39 30-nov-2016 03:00:37 Parallel operations downgraded to serial                               1243
30-nov-2016 01:01:04 30-nov-2016 02:00:39 Parallel operations downgraded to serial                                492
30-nov-2016 00:01:05 30-nov-2016 01:01:04 Parallel operations downgraded to serial                                 51
29-nov-2016 23:00:00 30-nov-2016 00:01:05 Parallel operations downgraded to serial                                 94
29-nov-2016 22:00:09 29-nov-2016 23:00:00 Parallel operations downgraded to serial                               7150
29-nov-2016 21:00:07 29-nov-2016 22:00:09 Parallel operations downgraded to serial                                167
29-nov-2016 20:00:33 29-nov-2016 21:00:07 Parallel operations downgraded to serial                                124
29-nov-2016 19:00:41 29-nov-2016 20:00:33 Parallel operations downgraded to serial                                157
29-nov-2016 18:00:22 29-nov-2016 19:00:41 Parallel operations downgraded to serial                                820
29-nov-2016 17:00:59 29-nov-2016 18:00:22 Parallel operations downgraded to serial                                 10
29-nov-2016 16:00:29 29-nov-2016 17:00:59 Parallel operations downgraded to serial                                 46
29-nov-2016 15:00:59 29-nov-2016 16:00:29 Parallel operations downgraded to serial                                761
29-nov-2016 14:00:23 29-nov-2016 15:00:59 Parallel operations downgraded to serial                              10342
29-nov-2016 11:00:08 29-nov-2016 12:00:14 Parallel operations downgraded to serial                                  8
29-nov-2016 09:00:31 29-nov-2016 10:00:45 Parallel operations downgraded to serial                                163
29-nov-2016 08:00:33 29-nov-2016 09:00:31 Parallel operations downgraded to serial                                 47
29-nov-2016 07:00:07 29-nov-2016 08:00:33 Parallel operations downgraded to serial                              10920
29-nov-2016 06:00:17 29-nov-2016 07:00:07 Parallel operations downgraded to serial                               1409
.
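
The delta logic of LAG can also be sketched outside the database, which makes one caveat visible: the counters restart from zero when the instance restarts, so a plain difference would then be negative. In this sketch the names are hypothetical, and treating a negative difference as a restart (and taking the new value as the delta) is my assumption, not something the query above does.

```java
import java.util.ArrayList;
import java.util.List;

public class SnapshotDeltas {
  // Turns cumulative counter values, ordered by snap_id, into per-snapshot deltas.
  public static List<Long> deltas(long[] cumulative) {
    List<Long> result = new ArrayList<>();
    for (int i = 1; i < cumulative.length; i++) {
      long diff = cumulative[i] - cumulative[i - 1];
      // Assumption: a negative difference means the counter restarted from zero
      result.add(diff >= 0 ? diff : cumulative[i]);
    }
    return result;
  }

  public static void main(String[] args) {
    // The two most recent values of the DBA_HIST_SYSSTAT output shown earlier
    long[] downgradedToSerial = {117847L, 117995L};
    System.out.println(deltas(downgradedToSerial));
  }
}
```

The result for these two values is the 148 reported for the most recent snapshot in the output above.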

To be honest I was not expecting such high figures: for a few one-hour intervals I got more than ten thousand operations downgraded to serial (!!). The problem comes, I think, from the maximum number of parallel processes, which we have lowered to contain an application issue:

SQL> show parameter parallel_max_servers

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_max_servers                 integer     60

Unstable execution time

The goal of this chapter is to identify queries that have a highly divergent execution time. The view to use here is DBA_HIST_SQLSTAT, which contains tons of very useful information. In this view never ever use the xx_TOTAL columns: if your statement has been aged out of the library cache, the cumulative value restarts from 0. After discussing with teammates I decided to use the statistical function called standard deviation, available by default in Oracle as the STDDEV aggregate and analytic function. In plain English, the standard deviation measures how far values typically sit from their average (it is the square root of the averaged squared differences to the mean). In the query below the average elapsed time is computed per sql_id and execution plan (plan_hash_value). I have chosen to keep the sql_id whose maximum average elapsed time is greater than 5 minutes and whose standard deviation is more than twice the minimum average elapsed time, to keep only extreme values:

SQL> set lines 150 pages 1000
SQL> WITH sql_id_stdded AS (SELECT
  sql_id,
  SUM(total_exec) AS total_exec,
  ROUND(MIN(avg_elapsed_time),2) AS min_elapsed_time,
  ROUND(MAX(avg_elapsed_time),2) AS max_elapsed_time,
  ROUND(stddev_elapsed_time,2) AS stddev_elapsed_time
  FROM (SELECT
          sql_id,
          total_exec,
          avg_elapsed_time,
          STDDEV(avg_elapsed_time) OVER(PARTITION BY sql_id) AS stddev_elapsed_time
        FROM (SELECT
                hsq.sql_id,
                hsq.plan_hash_value,
                SUM(nvl(hsq.executions_delta,0)) AS total_exec,
                SUM(hsq.elapsed_time_delta)/DECODE(SUM(nvl(hsq.executions_delta,0)),0,1,SUM(hsq.executions_delta))/1000000 AS avg_elapsed_time
              FROM dba_hist_sqlstat hsq, dba_hist_snapshot hsn
              WHERE hsq.snap_id = hsn.snap_id
              AND hsq.instance_number = hsn.instance_number
              AND hsq.executions_delta > 0
              GROUP BY hsq.sql_id, hsq.plan_hash_value))
        GROUP BY sql_id, stddev_elapsed_time)
SELECT
  sql_id,
  total_exec,
  TO_CHAR(CAST(NUMTODSINTERVAL(min_elapsed_time,'second') AS interval day(2) to second(0))) AS min_elapsed_time,
  TO_CHAR(CAST(NUMTODSINTERVAL(max_elapsed_time,'second') AS interval day(2) to second(0))) AS max_elapsed_time,
  TO_CHAR(CAST(NUMTODSINTERVAL(stddev_elapsed_time,'second') AS interval day(2) to second(0))) AS stddev_elapsed_time
FROM sql_id_stdded
WHERE stddev_elapsed_time>2*min_elapsed_time
AND max_elapsed_time>5*60
AND total_exec>1
ORDER BY stddev_elapsed_time desc;

SQL_ID        TOTAL_EXEC MIN_ELAPSED_T MAX_ELAPSED_T STDDEV_ELAPSE
------------- ---------- ------------- ------------- -------------
6hds16zkc8cgm          9 +00 00:09:04  +00 05:29:46  +00 01:49:56
76g9pn3z3a35u          8 +00 00:07:23  +00 02:47:35  +00 01:13:09
6w05thcf4w6pm          2 +00 00:02:30  +00 01:38:45  +00 01:08:04
bmtkpjynyhs88          8 +00 00:02:19  +00 02:38:59  +00 01:08:00
7k9cjk1q686f2         10 +00 00:09:38  +00 02:20:49  +00 00:53:56
56xyy7uq071g6          4 +00 00:08:56  +00 01:30:50  +00 00:44:57
cvbq6vqk8dbf3         13 +00 00:01:36  +00 02:25:31  +00 00:44:04
g5khnky3q36m1         18 +00 00:07:37  +00 01:38:04  +00 00:37:19
cru4zku27tv0p          2 +00 00:00:55  +00 00:47:09  +00 00:32:42
2m2ww08p9btvc          2 +00 00:09:50  +00 00:48:35  +00 00:27:24
08kdqk1abqm0v          2 +00 00:07:34  +00 00:45:43  +00 00:26:59
12txp882z4ucy         16 +00 00:04:17  +00 00:57:28  +00 00:22:17
1gta6uu65u8nw         14 +00 00:06:16  +00 00:48:53  +00 00:20:15
gcdtt7t6rf0hg         16 +00 00:02:59  +00 00:51:18  +00 00:18:29
3kffqq3kwta74          3 +00 00:00:43  +00 00:32:37  +00 00:18:13
1nqaj68tn9xp2          2 +00 00:05:46  +00 00:28:15  +00 00:15:54
b8pyc5puhh9c7          8 +00 00:00:49  +00 00:29:44  +00 00:15:40
2svyb8a5n6qb5         10 +00 00:00:56  +00 00:45:44  +00 00:15:31
8zsu7t63hj1zp         14 +00 00:03:46  +00 00:33:57  +00 00:14:13
1tdc0da6km50h         67 +00 00:01:09  +00 00:30:28  +00 00:14:04
3uvcgnm27xf1c          3 +00 00:06:21  +00 00:33:03  +00 00:13:22
ch3xpjhf3y2bf         11 +00 00:03:51  +00 00:22:21  +00 00:13:05
fdcpq4j8kxd49         13 +00 00:04:31  +00 00:22:50  +00 00:12:57
csctx8ttu58d9         25 +00 00:00:30  +00 00:26:47  +00 00:12:15
dv1cvuw300ny4         11 +00 00:03:14  +00 00:28:38  +00 00:10:40
gsx2423tf2a7f          7 +00 00:04:35  +00 00:29:30  +00 00:10:33
62nt4sxdsy586         19 +00 00:02:29  +00 00:27:16  +00 00:10:29
1jb164khmsyzj         60 +00 00:00:40  +00 00:19:42  +00 00:09:15
6mgwwp95jaxzz         13 +00 00:03:56  +00 00:15:46  +00 00:08:22
8v38gy0u6a2mk          5 +00 00:01:02  +00 00:15:50  +00 00:08:20
9mt8p7z1wjjkn          9 +00 00:02:12  +00 00:17:34  +00 00:08:09
byd0g9xfpwcuj          2 +00 00:00:10  +00 00:11:15  +00 00:07:51
7pm2wxdvu3mc7         11 +00 00:02:55  +00 00:13:27  +00 00:07:26
4u50sp3h1szcp          5 +00 00:02:52  +00 00:11:38  +00 00:06:12
7430xmabdv8av          3 +00 00:02:25  +00 00:12:17  +00 00:05:01
cd3xmx3sk7vrv         21 +00 00:00:22  +00 00:05:06  +00 00:03:21

36 rows selected.

If we take the first sql_id, the query says that the minimum average execution time is 9 minutes and 4 seconds while the maximum one is 5 hours 29 minutes and 46 seconds. What an amazing difference!
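
To make the filter concrete, here is a sketch of the computation for sql_id 6w05thcf4w6pm, whose two averages are displayed as 2 min 30 s (150 s) and 1 h 38 min 45 s (5925 s). Oracle documents STDDEV as the sample standard deviation (division by n - 1), returning 0 for a single row. The class and method names are hypothetical, and the inputs are the values as displayed, rounded to the second.

```java
public class ElapsedSpread {
  // Sample standard deviation (divides by n - 1); returns 0 for fewer than two values.
  public static double sampleStddev(double[] values) {
    int n = values.length;
    if (n < 2) {
      return 0.0;
    }
    double mean = 0.0;
    for (double v : values) {
      mean += v;
    }
    mean /= n;
    double sumOfSquares = 0.0;
    for (double v : values) {
      sumOfSquares += (v - mean) * (v - mean);
    }
    return Math.sqrt(sumOfSquares / (n - 1));
  }

  // Same three conditions as the WHERE clause of the query above (times in seconds).
  public static boolean isUnstable(double min, double max, double stddev, long totalExec) {
    return stddev > 2 * min && max > 5 * 60 && totalExec > 1;
  }

  public static void main(String[] args) {
    double[] avgSeconds = {150.0, 5925.0};
    double stddev = sampleStddev(avgSeconds);
    System.out.println("stddev (s): " + Math.round(stddev));
    System.out.println("unstable: " + isUnstable(150.0, 5925.0, stddev, 2));
  }
}
```

The rounded result, 4084 seconds, is the 01:08:04 shown in the STDDEV_ELAPSE column for that sql_id.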

To focus on this sql_id you might use something like the query below. It is strongly suggested to execute it in a graphical query tool because the output is too wide to read in a pure command line. All timings are in seconds while sizes are in megabytes:

SQL> select
    hsq.sql_id,
    hsq.plan_hash_value,
    nvl(sum(hsq.executions_delta),0) as total_exec,
    round(sum(hsq.elapsed_time_delta)/1000000,2) as elapsed_time_total,
    round(sum(hsq.px_servers_execs_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta)),2) as avg_px_servers_execs,
    round(sum(hsq.elapsed_time_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta))/1000000,2) as avg_elapsed_time,
    round(sum(hsq.cpu_time_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta))/1000000,2) as avg_cpu_time,
    round(sum(hsq.iowait_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta))/1000000,2) as avg_iowait,
    round(sum(hsq.clwait_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta))/1000000,2) as avg_cluster_wait,
    round(sum(hsq.apwait_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta))/1000000,2) as avg_application_wait,
    round(sum(hsq.ccwait_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta))/1000000,2) as avg_concurrency_wait,
    round(sum(hsq.rows_processed_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta)),2) as avg_rows_processed,
    round(sum(hsq.buffer_gets_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta)),2) as avg_buffer_gets,
    round(sum(hsq.disk_reads_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta)),2) as avg_disk_reads,
    round(sum(hsq.direct_writes_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta)),2) as avg_direct_writes,
    round(sum(hsq.io_interconnect_bytes_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta))/(1024*1024),0) as avg_io_interconnect_mb,
    round(sum(hsq.physical_read_requests_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta)),0) as avg_phys_read_requests,
    round(sum(hsq.physical_read_bytes_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta))/(1024*1024),0) as avg_phys_read_mb,
    round(sum(hsq.physical_write_requests_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta)),0) as avg_phys_write_requests,
    round(sum(hsq.physical_write_bytes_delta)/decode(sum(hsq.executions_delta),0,null,sum(hsq.executions_delta))/(1024*1024),0) as avg_phys_write_mb
from dba_hist_sqlstat hsq
where hsq.sql_id='6hds16zkc8cgm'
group by hsq.sql_id, hsq.plan_hash_value;

SQL_ID        PLAN_HASH_VALUE TOTAL_EXEC ELAPSED_TIME_TOTAL AVG_PX_SERVERS_EXECS AVG_ELAPSED_TIME AVG_CPU_TIME AVG_IOWAIT AVG_CLUSTER_WAIT AVG_APPLICATION_WAIT AVG_CONCURRENCY_WAIT AVG_ROWS_PROCESSED AVG_BUFFER_GETS AVG_DISK_READS AVG_DIRECT_WRITES AVG_IO_INTERCONNECT_MB AVG_PHYS_READ_REQUESTS AVG_PHYS_READ_MB AVG_PHYS_WRITE_REQUESTS AVG_PHYS_WRITE_MB
------------- --------------- ---------- ------------------ -------------------- ---------------- ------------ ---------- ---------------- -------------------- -------------------- ------------------ --------------- -------------- ----------------- ---------------------- ---------------------- ---------------- ----------------------- -----------------
6hds16zkc8cgm       705417430          2           15342.95                   16          7671.48        42.59    7521.09                0                 3.14                77.09                959         7948659       739317.5                 0                      3                    187                3                       0                 0
6hds16zkc8cgm      2195747324          2            1087.57                   32           543.78         43.8     354.06                0                  1.6               140.49                961       6115419.5       739927.5                 0                      6                    299                6                       0                 0
6hds16zkc8cgm      4190635369          1           29581.68                   32         29581.68        44.39   29366.99                0                    0               131.88               1226         6573306         709640                 0                      7                    350                7                       0                 0
6hds16zkc8cgm      1445916266          1            8639.51                   12          8639.51        37.77    8538.27                0                 7.14                50.05                919         4620163         698482                 0                      7                    342                7                       0                 0
6hds16zkc8cgm      3246633284          1           15650.16                   12         15650.16        47.34   15533.76                0                  .42                51.99               1364         7318402         731331                 0                      1                     73                1                       0                 0
6hds16zkc8cgm       463109863          1            7905.76                   16          7905.76        41.31    7761.58                0                    0                72.88                890         4709496         756848                 0                      7                    330                7                       0                 0
6hds16zkc8cgm      2566346142          1            9315.98                   27          9315.98        53.88    9131.03                0                  5.2                93.75               1394        11788790         754218                 0                      5                    211                5                       0                 0
6hds16zkc8cgm      2387297713          1            5672.93                   12          5672.93        43.45    5565.33                0                27.04                35.22               1074         8563338         697194                 0                      0                     31                0                       0                 0

8 rows selected. 

PLAN_HASH_VALUE is the column to use to know whether two SQL plans are identical, rather than comparing the plans line by line. Here we see that the plan of this single SQL statement is almost never the same from one execution to the next, which could explain the huge difference in response time. We also see huge differences in I/O wait time, from 354 seconds to 29366 seconds…

To display all the different plans and start digging into them you can use (I’m not displaying mine as it would be too long):
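
To find other statements with the same kind of plan instability, one option is to count the distinct PLAN_HASH_VALUE recorded in AWR for each sql_id. The query below is only a minimal sketch on the same DBA_HIST_SQLSTAT view: the "more than one plan" threshold and the column aliases are illustrative assumptions, not something taken from the analysis above:

```sql
-- Sketch: statements that have used more than one execution plan in AWR.
-- The "> 1" threshold and the aliases are illustrative assumptions.
select
    hsq.sql_id,
    count(distinct hsq.plan_hash_value) as nb_plans,
    nvl(sum(hsq.executions_delta),0) as total_exec,
    round(sum(hsq.elapsed_time_delta)/1000000,2) as elapsed_time_total
from dba_hist_sqlstat hsq
where hsq.plan_hash_value <> 0   -- skip entries without a plan (typically PL/SQL blocks)
group by hsq.sql_id
having count(distinct hsq.plan_hash_value) > 1
order by nb_plans desc, elapsed_time_total desc;
```

A statement at the top of this list is not necessarily a problem, but it is a good candidate for the per-plan breakdown shown above.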

select * from table(dbms_xplan.display_awr(sql_id=>'6hds16zkc8cgm',format=>'all allstats'));

You should correlate the above figures with what you can find in DBA_HIST_ACTIVE_SESS_HISTORY. Do not use the TIME_WAITED column of this view but count 10 seconds per row instead. Why 10 seconds? Because AWR keeps only one sample out of ten from V$ACTIVE_SESSION_HISTORY, and V$ACTIVE_SESSION_HISTORY itself is sampled every second (that’s why you would simply use COUNT(*) when selecting from V$ACTIVE_SESSION_HISTORY):

SQL> set lines 150 pages 1000
SQL> col wait_class for a15
SQL> col event for a35
SQL> SELECT
     sql_id, sql_plan_hash_value, actual_dop, TO_CHAR(sql_exec_start,'dd-mon-yyyy hh24:mi:ss') AS sql_exec_start,
     wait_class, event, COUNT(*)*10 AS "time_waited (s)"
     FROM
     (SELECT sql_id,sql_plan_hash_value,trunc(px_flags / 2097152) AS actual_dop,
      sql_exec_start,
      DECODE(NVL(wait_class,'ON CPU'),'ON CPU',DECODE(session_type,'BACKGROUND','BCPU','CPU'),wait_class) AS wait_class,
      nvl(event,'ON CPU') AS event
      FROM dba_hist_active_sess_history
      WHERE sql_id='6hds16zkc8cgm') a
     GROUP BY sql_id, sql_plan_hash_value, actual_dop, sql_exec_start, wait_class,event
     ORDER BY sql_exec_start, sql_plan_hash_value, wait_class,event;

SQL_ID        SQL_PLAN_HASH_VALUE ACTUAL_DOP SQL_EXEC_START       WAIT_CLASS      EVENT                               time_waited (s)
------------- ------------------- ---------- -------------------- --------------- ----------------------------------- ---------------
6hds16zkc8cgm          4190635369         16 01-dec-2016 15:38:50 CPU             ON CPU                                           10
6hds16zkc8cgm          4190635369          0 01-dec-2016 15:38:50 Other           reliable message                                 20
6hds16zkc8cgm          4190635369         16 01-dec-2016 15:38:50 User I/O        db file parallel read                          1830
6hds16zkc8cgm          4190635369         16 01-dec-2016 15:38:50 User I/O        db file sequential read                         160
6hds16zkc8cgm          4190635369         16 01-dec-2016 15:38:50 User I/O        direct path read                              27030
6hds16zkc8cgm          4190635369         16 01-dec-2016 15:38:50 User I/O        read by other session                           180
6hds16zkc8cgm          2566346142          0 02-dec-2016 16:40:27 Application     enq: KO - fast object checkpoint                 10
6hds16zkc8cgm          2566346142         14 02-dec-2016 16:40:27 CPU             ON CPU                                          300
6hds16zkc8cgm          2566346142          0 02-dec-2016 16:40:27 Configuration   log buffer space                                 10
6hds16zkc8cgm          2566346142         14 02-dec-2016 16:40:27 User I/O        db file sequential read                          10
6hds16zkc8cgm          2566346142         14 02-dec-2016 16:40:27 User I/O        direct path read                               8240
6hds16zkc8cgm          2566346142         14 02-dec-2016 16:40:27 User I/O        read by other session                           650
6hds16zkc8cgm          2195747324         16 16-nov-2016 16:17:50 CPU             ON CPU                                           20
6hds16zkc8cgm          2195747324         16 16-nov-2016 16:17:50 User I/O        db file sequential read                         300
6hds16zkc8cgm          1759484988            17-nov-2016 18:11:25 CPU             ON CPU                                           30
6hds16zkc8cgm          1759484988          0 17-nov-2016 18:11:25 Other           reliable message                                 10
6hds16zkc8cgm          1759484988            17-nov-2016 18:11:25 User I/O        db file parallel read                           140
6hds16zkc8cgm          1759484988            17-nov-2016 18:11:25 User I/O        db file scattered read                           10
6hds16zkc8cgm          1759484988            17-nov-2016 18:11:25 User I/O        db file sequential read                          70
6hds16zkc8cgm          2195747324            18-nov-2016 18:06:45 CPU             ON CPU                                           50
6hds16zkc8cgm          2195747324            18-nov-2016 18:06:45 User I/O        db file parallel read                            30
6hds16zkc8cgm          2195747324            18-nov-2016 18:06:45 User I/O        db file sequential read                          50
6hds16zkc8cgm          3246633284          0 19-nov-2016 16:19:49 Application     enq: KO - fast object checkpoint                 10
6hds16zkc8cgm          3246633284            19-nov-2016 16:19:49 CPU             ON CPU                                           20
6hds16zkc8cgm          3246633284            19-nov-2016 16:19:49 Scheduler       resmgr:cpu quantum                               10
6hds16zkc8cgm          3246633284            19-nov-2016 16:19:49 User I/O        db file parallel read                           770
6hds16zkc8cgm          3246633284            19-nov-2016 16:19:49 User I/O        db file sequential read                          30
6hds16zkc8cgm          2195747324            20-nov-2016 15:41:33 CPU             ON CPU                                           20
6hds16zkc8cgm          2195747324          0 20-nov-2016 15:41:33 Other           reliable message                                 10
6hds16zkc8cgm          2195747324            20-nov-2016 15:41:33 User I/O        db file parallel read                           160
6hds16zkc8cgm          2195747324            20-nov-2016 15:41:33 User I/O        db file sequential read                          20
6hds16zkc8cgm          2195747324            20-nov-2016 15:41:33 User I/O        read by other session                            10
6hds16zkc8cgm          2195747324            21-nov-2016 15:49:06 CPU             ON CPU                                           20
6hds16zkc8cgm          2195747324          0 21-nov-2016 15:49:06 Other           reliable message                                 10
6hds16zkc8cgm          2195747324            21-nov-2016 15:49:06 User I/O        db file parallel read                           140
6hds16zkc8cgm          2195747324            21-nov-2016 15:49:06 User I/O        db file sequential read                          40
6hds16zkc8cgm           463109863          8 22-nov-2016 16:54:09 CPU             ON CPU                                           20
6hds16zkc8cgm           463109863          0 22-nov-2016 16:54:09 Other           reliable message                                 10
6hds16zkc8cgm           463109863          8 22-nov-2016 16:54:09 User I/O        db file parallel read                           550
6hds16zkc8cgm           463109863          8 22-nov-2016 16:54:09 User I/O        db file sequential read                          30
6hds16zkc8cgm           463109863          8 22-nov-2016 16:54:09 User I/O        direct path read                               7120
6hds16zkc8cgm          2387297713          0 23-nov-2016 17:06:32 Application     enq: KO - fast object checkpoint                 20
6hds16zkc8cgm          2387297713          6 23-nov-2016 17:06:32 CPU             ON CPU                                           30
6hds16zkc8cgm          2387297713          0 23-nov-2016 17:06:32 Configuration   log buffer space                                 10
6hds16zkc8cgm          2387297713          6 23-nov-2016 17:06:32 User I/O        db file sequential read                          40
6hds16zkc8cgm          2387297713          6 23-nov-2016 17:06:32 User I/O        direct path read                               5480
6hds16zkc8cgm          2387297713          6 23-nov-2016 17:06:32 User I/O        read by other session                            60
6hds16zkc8cgm           705417430          6 24-nov-2016 16:48:13 CPU             ON CPU                                           60
6hds16zkc8cgm           705417430          6 24-nov-2016 16:48:13 Configuration   log buffer space                                 20
6hds16zkc8cgm           705417430          6 24-nov-2016 16:48:13 User I/O        db file sequential read                         400
6hds16zkc8cgm           705417430          6 24-nov-2016 16:48:13 User I/O        direct path read                               5780
6hds16zkc8cgm           705417430          6 24-nov-2016 16:48:13 User I/O        read by other session                           170
6hds16zkc8cgm          3246633284          6 25-nov-2016 15:39:23 CPU             ON CPU                                           70
6hds16zkc8cgm          3246633284          6 25-nov-2016 15:39:23 User I/O        db file parallel read                          1980
6hds16zkc8cgm          3246633284          6 25-nov-2016 15:39:23 User I/O        db file sequential read                          80
6hds16zkc8cgm          3246633284          6 25-nov-2016 15:39:23 User I/O        direct path read                              13430
6hds16zkc8cgm          3246633284          6 25-nov-2016 15:39:23 User I/O        read by other session                            10
6hds16zkc8cgm          2195747324         16 26-nov-2016 15:32:14 CPU             ON CPU                                           50
6hds16zkc8cgm          2195747324         16 26-nov-2016 15:32:14 User I/O        db file sequential read                         140
6hds16zkc8cgm          2195747324         16 26-nov-2016 15:32:14 User I/O        direct path read                                360
6hds16zkc8cgm          2195747324         16 26-nov-2016 15:32:14 User I/O        read by other session                            10
6hds16zkc8cgm          2195747324         16 27-nov-2016 16:27:27 CPU             ON CPU                                           40
6hds16zkc8cgm          2195747324         16 27-nov-2016 16:27:27 User I/O        direct path read                                280
6hds16zkc8cgm           705417430          0 28-nov-2016 16:42:39 Application     enq: KO - fast object checkpoint                 10
6hds16zkc8cgm           705417430         10 28-nov-2016 16:42:39 CPU             ON CPU                                           10
6hds16zkc8cgm           705417430         10 28-nov-2016 16:42:39 Concurrency     buffer busy waits                                90
6hds16zkc8cgm           705417430         10 28-nov-2016 16:42:39 Configuration   log buffer space                                 10
6hds16zkc8cgm           705417430          0 28-nov-2016 16:42:39 Other           reliable message                                 10
6hds16zkc8cgm           705417430         10 28-nov-2016 16:42:39 User I/O        direct path read                               8520
6hds16zkc8cgm          1445916266          0 29-nov-2016 16:33:40 Application     enq: KO - fast object checkpoint                 10
6hds16zkc8cgm          1445916266          6 29-nov-2016 16:33:40 CPU             ON CPU                                           30
6hds16zkc8cgm          1445916266          6 29-nov-2016 16:33:40 User I/O        db file parallel read                           880
6hds16zkc8cgm          1445916266          6 29-nov-2016 16:33:40 User I/O        db file sequential read                         120
6hds16zkc8cgm          1445916266          6 29-nov-2016 16:33:40 User I/O        direct path read                               7520
6hds16zkc8cgm          1445916266          6 29-nov-2016 16:33:40 User I/O        read by other session                            30
6hds16zkc8cgm           463109863          0 30-nov-2016 16:33:40 CPU             ON CPU                                           10
6hds16zkc8cgm           463109863         15 30-nov-2016 16:33:40 CPU             ON CPU                                           10
6hds16zkc8cgm           463109863         15 30-nov-2016 16:33:40 User I/O        db file sequential read                         150
6hds16zkc8cgm           463109863         15 30-nov-2016 16:33:40 User I/O        direct path read                                190
6hds16zkc8cgm           463109863          0                      CPU             ON CPU                                           20
6hds16zkc8cgm           463109863                                 CPU             ON CPU                                           10
6hds16zkc8cgm           463109863          0                      Concurrency     cursor: pin S wait on X                         430
6hds16zkc8cgm           463109863          0                      Concurrency     library cache lock                               10
6hds16zkc8cgm           705417430          0                      CPU             ON CPU                                           10
6hds16zkc8cgm           705417430                                 CPU             ON CPU                                           10
6hds16zkc8cgm           705417430          0                      Concurrency     cursor: pin S wait on X                          90
6hds16zkc8cgm           705417430          0                      Concurrency     library cache lock                               20
6hds16zkc8cgm          1445916266                                 CPU             ON CPU                                           10
6hds16zkc8cgm          2195747324                                 CPU             ON CPU                                           40
6hds16zkc8cgm          2195747324          0                      CPU             ON CPU                                           10
6hds16zkc8cgm          2195747324          0                      Concurrency     cursor: pin S wait on X                         310
6hds16zkc8cgm          2387297713          0                      CPU             ON CPU                                           10
6hds16zkc8cgm          2387297713          0                      Concurrency     cursor: pin S wait on X                         110
6hds16zkc8cgm          3246633284                                 CPU             ON CPU                                           10
6hds16zkc8cgm          3246633284          0                      CPU             ON CPU                                           10
6hds16zkc8cgm          3246633284          0                      Concurrency     cursor: pin S wait on X                         110
6hds16zkc8cgm          3246633284                                 User I/O        db file sequential read                          10
6hds16zkc8cgm          4190635369          0                      CPU             ON CPU                                           10
6hds16zkc8cgm          4190635369          0                      Concurrency     cursor: pin S wait on X                         310

99 rows selected.
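
For a statement that ran recently enough to still be in memory, the same breakdown can be sketched against V$ACTIVE_SESSION_HISTORY. As this view is sampled every second, each row counts for one second and the multiplication by 10 disappears. This is an illustrative variant, not the query used to produce the output above; the exec_start alias is an assumption chosen so that the sort is done on the real date column rather than on its character representation:

```sql
-- Sketch: same breakdown from the in-memory ASH buffer (1 sample per second),
-- so COUNT(*) is directly a number of seconds.
SELECT
    sql_id, sql_plan_hash_value,
    TO_CHAR(sql_exec_start,'dd-mon-yyyy hh24:mi:ss') AS exec_start,
    NVL(wait_class,'ON CPU') AS wait_class,
    NVL(event,'ON CPU') AS event,
    COUNT(*) AS "time_waited (s)"
FROM v$active_session_history
WHERE sql_id='6hds16zkc8cgm'
GROUP BY sql_id, sql_plan_hash_value, sql_exec_start, NVL(wait_class,'ON CPU'), NVL(event,'ON CPU')
ORDER BY sql_exec_start, sql_plan_hash_value, wait_class, event;
```

Keep in mind that the in-memory buffer only covers a limited period, which depends on the activity of the instance.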

We see that the number one wait event while executing our query is direct path read. With this kind of read Oracle bypasses the buffer cache and reads the blocks directly into the PGA; since 11g it can also decide to do so for serial scans of large tables that do not fit well in the SGA. Overall it is not really an issue in itself: the SQL behind it must be tuned to favor better choices. See the Direct path chapter for more information.

Direct path

On one of our databases we had the Top Foreground wait events below:

awr_mining02

Even if you immediately focus on SQL tuning, you might want to know which SQL statements are mainly responsible for this:

SQL> set lines 150 pages 1000
SQL> col wait_class for a10
SQL> col event for a25
SQL> SELECT
     sql_id, sql_plan_hash_value, TO_CHAR(sql_exec_start,'dd-mon-yyyy hh24:mi:ss') AS sql_exec_start,
     wait_class, event, COUNT(*)*10 AS "time_waited (s)"
     FROM
     (SELECT sql_id,sql_plan_hash_value,sql_exec_start,
      DECODE(NVL(wait_class,'ON CPU'),'ON CPU',DECODE(session_type,'BACKGROUND','BCPU','CPU'),wait_class) AS wait_class,
      nvl(event,'ON CPU') AS event
      FROM dba_hist_active_sess_history
      WHERE event like '%direct%') a
     GROUP BY sql_id, sql_plan_hash_value, sql_exec_start, wait_class,event
     ORDER BY 6 desc;

SQL_ID        SQL_PLAN_HASH_VALUE SQL_EXEC_START       WAIT_CLASS EVENT                     time_waited (s)
------------- ------------------- -------------------- ---------- ------------------------- ---------------
amzmxuns5dctz          1006290636 21-nov-2016 15:30:51 User I/O   direct path read temp              117670
amzmxuns5dctz          2716462349 30-nov-2016 15:28:23 User I/O   direct path read temp              101540
amzmxuns5dctz          2716462349 03-dec-2016 15:40:04 User I/O   direct path read temp               96800
                                0                      User I/O   direct path read                    90720
amzmxuns5dctz          1006290636 25-nov-2016 06:39:34 User I/O   direct path write temp              50740
f92r4f37kn015          2776764876 30-nov-2016 03:22:25 User I/O   direct path write temp              48010
114w500wy5039          2301194886 04-dec-2016 16:34:14 User I/O   direct path write temp              45480
cvbq6vqk8dbf3           720130659 25-nov-2016 15:42:05 User I/O   direct path read                    36520
amzmxuns5dctz          2716462349 26-nov-2016 15:40:00 User I/O   direct path read temp               33610
f92r4f37kn015          2776764876 24-nov-2016 00:57:35 User I/O   direct path write temp              33300
b2cwuxca44yw8          1919045833 01-dec-2016 16:30:03 User I/O   direct path read                    31890
amzmxuns5dctz          2716462349 30-nov-2016 06:35:58 User I/O   direct path read temp               31830
amzmxuns5dctz          1006290636 24-nov-2016 15:10:44 User I/O   direct path read temp               28210
amzmxuns5dctz          2716462349 30-nov-2016 06:35:58 User I/O   direct path write temp              27350
6hds16zkc8cgm          4190635369 01-dec-2016 15:38:50 User I/O   direct path read                    27030
amzmxuns5dctz          2716462349 28-nov-2016 15:13:16 User I/O   direct path read temp               26850
6hds16zkc8cgm          3246633284 05-dec-2016 16:15:22 User I/O   direct path read                    23020
aqh9x3yay6khc          2336987607 21-nov-2016 20:54:01 User I/O   direct path read temp               21060
amzmxuns5dctz          2716462349 01-dec-2016 15:08:29 User I/O   direct path read temp               20510
9mt8p7z1wjjkn          2327812543 04-dec-2016 11:00:32 User I/O   direct path read                    20450
48091vbtjj4xa          1708748504 28-nov-2016 04:03:28 User I/O   direct path read temp               20090
amzmxuns5dctz          2716462349 03-dec-2016 06:36:36 User I/O   direct path write temp              17190
f92r4f37kn015          2776764876 27-nov-2016 00:41:30 User I/O   direct path read temp               17190
amzmxuns5dctz          2716462349 29-nov-2016 15:43:15 User I/O   direct path read temp               16860
1wzd30k3m1ghn          1433478644 07-dec-2016 06:29:49 User I/O   direct path read temp               15990
amzmxuns5dctz          2716462349 30-nov-2016 15:28:23 User I/O   direct path write temp              15830
amzmxuns5dctz          2716462349 07-dec-2016 06:35:28 User I/O   direct path write temp              14850
1wzd30k3m1ghn          1433478644 29-nov-2016 08:07:57 User I/O   direct path read temp               14850
08kdqk1abqm0v          2813508085 04-dec-2016 15:49:02 User I/O   direct path read                    14680
aqh9x3yay6khc          2336987607 22-nov-2016 21:07:34 User I/O   direct path read temp               14450
5tjqq1cggd0c2          3765923229 23-nov-2016 19:04:17 User I/O   direct path read                    14280
5tjqq1cggd0c2          1896303415 22-nov-2016 19:23:11 User I/O   direct path read                    14210
5tjqq1cggd0c2          3837189456 29-nov-2016 19:34:04 User I/O   direct path read                    14010
grb144xf2asf3           759327608 25-nov-2016 22:17:39 User I/O   direct path read                    13790
f92r4f37kn015          2776764876 05-dec-2016 00:11:15 User I/O   direct path read temp               13460
6hds16zkc8cgm          3246633284 25-nov-2016 15:39:23 User I/O   direct path read                    13430
amzmxuns5dctz          1006290636 21-nov-2016 07:14:12 User I/O   direct path read temp               12780
b4yrbsczwf9xw          3470921796 27-nov-2016 03:58:25 User I/O   direct path read                    12720
grb144xf2asf3          1075986364 05-dec-2016 22:48:32 User I/O   direct path read                    12310
g8n42y0duhggt           152545410 24-nov-2016 21:13:25 User I/O   direct path read                    12060
a2xhxhppc23y1           327407190 06-dec-2016 22:56:59 User I/O   direct path read                    11930
1wzd30k3m1ghn          1433478644 23-nov-2016 07:13:00 User I/O   direct path read temp               11820
0pmqzssfmmdcv          3444765716 04-dec-2016 01:57:54 User I/O   direct path read                    11480
1zzsqqwax38kk           660240898 27-nov-2016 06:58:14 User I/O   direct path read                    11200
amzmxuns5dctz          2716462349 07-dec-2016 06:35:28 User I/O   direct path read temp               11160
aqh9x3yay6khc          2336987607 29-nov-2016 20:01:40 User I/O   direct path read temp               11110
ckapap92tfy3n          2695238043 23-nov-2016 23:47:49 User I/O   direct path read                    11050
aqh9x3yay6khc          2336987607 03-dec-2016 22:33:33 User I/O   direct path read temp               10990
amzmxuns5dctz          2716462349 03-dec-2016 06:36:36 User I/O   direct path read temp               10950
1wzd30k3m1ghn          1433478644 26-nov-2016 06:48:59 User I/O   direct path read temp               10950
1wzd30k3m1ghn          1433478644 06-dec-2016 06:45:53 User I/O   direct path read temp               10870
1wzd30k3m1ghn          1433478644 02-dec-2016 06:41:28 User I/O   direct path read temp               10690
57svfqqryy7h0          3182238936 28-nov-2016 05:17:30 User I/O   direct path read                    10540
74447zmmmw0zk           316009422 29-nov-2016 02:41:26 User I/O   direct path read                    10470
f92r4f37kn015          2776764876 28-nov-2016 02:22:45 User I/O   direct path read temp               10310
0cy2upaz2mtp3           420311346 01-dec-2016 17:54:20 User I/O   direct path read                    10220
daxpwau39csac          2945512965 29-nov-2016 18:07:05 User I/O   direct path read                    10200
                                0                      User I/O   direct path write                   10060
.

The rows with a NULL sql_id are the checkpoint process.

Here we see that sql_id amzmxuns5dctz is the one to focus on…
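
Because the listing above has one row per execution start time, the same sql_id shows up many times. A possible complement is to sum the samples per sql_id and event over all executions. The sketch below keeps the 10 seconds per AWR sample rule; the nb_start_times alias and the narrower LIKE pattern are illustrative assumptions:

```sql
-- Sketch: total sampled time on direct path events per sql_id, all executions together.
-- Each DBA_HIST_ACTIVE_SESS_HISTORY row is counted as 10 seconds.
SELECT sql_id, event,
       COUNT(*)*10 AS "time_waited (s)",
       COUNT(DISTINCT sql_exec_start) AS nb_start_times  -- distinct execution start times sampled
FROM dba_hist_active_sess_history
WHERE event LIKE 'direct path%'
GROUP BY sql_id, event
ORDER BY 3 DESC;
```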

Checkpoint

After a restart of our database we (luckily) noticed a huge increase in physical writes, as shown in the HP Performance Manager application:

awr_mining03

We investigated in every direction we could: OS, database, SQL tuning… We noticed that we had mistakenly (and for a long time) set FAST_START_MTTR_TARGET to 300. To be honest I have never really understood the added value of this parameter, even if I understand its description. What is the point of tuning a situation (recovery) that (hopefully) rarely happens, at the price of impacting your performance all year long? I prefer to let checkpoints occur at redo log switch and set FAST_START_MTTR_TARGET to 0 (the default value).
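
To check what is currently configured and how the instance estimates its recovery time, something like the sketch below can be used. The selected columns are only a subset of V$INSTANCE_RECOVERY picked for illustration, and scope=both assumes the instance is started with an spfile:

```sql
-- Sketch: current setting and whether it is still the default.
select name, value, isdefault
from v$parameter
where name = 'fast_start_mttr_target';

-- Target and estimated MTTR, and the writes attributed to the MTTR target
-- versus the ones attributed to the redo log file size.
select target_mttr, estimated_mttr, writes_mttr, writes_logfile_size
from v$instance_recovery;

-- Back to the default value; scope=both assumes an spfile is in use.
alter system set fast_start_mttr_target=0 scope=both;
```

A high WRITES_MTTR figure is a hint that the MTTR target is driving a significant part of the database writer activity.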

That said, we reset the parameter and guess what? Physical writes decreased a lot:

awr_mining04

Then I wanted to see, on the Oracle side, the decrease in the number of checkpoints as well as the decrease in the number of writes due to incremental checkpointing activity. DBA_HIST_SYSSTAT comes to the rescue. In the meantime a teammate changed the AWR snapshot frequency, so I had to tune my query a bit to get the sum per hour:

SELECT TO_CHAR(TRUNC(begin_interval_time,'HH'),'dd-mon-yyyy hh24:mi:ss') AS time,
stat_name,
SUM(value) AS value
FROM(
SELECT
hsn.snap_id,
  hsn.begin_interval_time,
  hsn.end_interval_time,
  hsy.stat_name,
  hsy.value - hsy.prev_value AS value
  -- partition the LAG so that a delta never mixes two instances or two statistics
  FROM (SELECT dbid,snap_id,instance_number,stat_name,value,
               LAG(value,1,value) OVER (PARTITION BY dbid,instance_number,stat_name ORDER BY snap_id) AS prev_value
        FROM dba_hist_sysstat
        -- replace by 'DBWR checkpoint buffers written' for the second statistic
        WHERE stat_name in ('DBWR checkpoints')) hsy,
        dba_hist_snapshot hsn
WHERE hsy.snap_id = hsn.snap_id
AND hsy.dbid = hsn.dbid
AND hsy.instance_number = hsn.instance_number)
-- group by hour only (not by snap_id) to really get one sum per hour
group by trunc(begin_interval_time,'HH'),stat_name
order by trunc(begin_interval_time,'HH'),stat_name;

The two statistic names I will use are (please refer to the official documentation or V$STATNAME for a complete list of available ones):

Name Description
DBWR checkpoint buffers written Number of buffers that were written for checkpoints
DBWR checkpoints Number of times the DBWR was asked to scan the cache and write all blocks marked for a checkpoint or the end of recovery. This statistic is always larger than “background checkpoints completed”

I initially exported the result set in Excel format and built the graph directly in Excel, but I finally decided to test the graph capability of SQL Developer. To access it use the Reports tab, or activate it from the View menu, Reports entry. Then create a new report in User Defined Reports using the Chart style with Area as Chart Type, and any other option you like. A good trick is to connect to a database and check the Use Live Data option in Property/Data to directly see what your report looks like.

The queries to be displayed must be built to return the following three values: 'x axis value', 'series name' and 'y axis value'.

We see the number of checkpoints decreasing after the morning of December 7th (we did the change around 10:00 AM CET):

awr_mining05

Even more impressive is the number of buffers written:

awr_mining06

But we had not changed anything on the database before the application team started to complain that all their queries were slow. The change of FAST_START_MTTR_TARGET solved the performance troubles but does not explain the root cause of the issue. By restoring a backup we were able to extract and load old AWR figures (see the chapter on this) and finally generate AWR difference reports. Then, obviously, I focused on the Top Segments Comparison by Physical Writes paragraph of my AWR difference report (a good day before the issue versus a bad one) and saw this:

awr_mining07

I noticed many newcomers in the top physical writes, while the first one had a 500% increase in its number of physical writes…

The view to use in this situation is DBA_HIST_SEG_STAT, but you need the object id to query it:

SQL> set lines 150 pages 1000
SQL> col object_name for a30
SQL> select object_id,owner,object_name,object_type
     from dba_objects
     where owner||'.'||object_name in ('HUB.CRM_INVOICE_PREV_PK','E2DWH.BACKLOG_NEW',
     'E2DWH.E2_X_DWH_2_PK','E2DWH.E2_X_DWH2_IDX_SO_ITEM__ID','E2DWH.E2_X_DWH2_IDX_LAST_UPD');

 OBJECT_ID OWNER                          OBJECT_NAME                    OBJECT_TYPE
---------- ------------------------------ ------------------------------ -------------------
  43994766 E2DWH                          BACKLOG_NEW                    TABLE
  44003115 HUB                            CRM_INVOICE_PREV_PK            INDEX
  44005025 E2DWH                          E2_X_DWH2_IDX_SO_ITEM__ID      INDEX
  44005026 E2DWH                          E2_X_DWH2_IDX_LAST_UPD         INDEX
  44005027 E2DWH                          E2_X_DWH_2_PK                  INDEX

Or, as they claim, use the DBA_HIST_SEG_STAT_OBJ view (even if it is really difficult to guess which key this view has):

SQL> set lines 150 pages 1000
SQL> col object_name for a50
SQL> select distinct obj#,owner||'.'||object_name||' ('||nvl2(subobject_name,object_type || ': ' || subobject_name,object_type)||')' as object_name
     from DBA_HIST_SEG_STAT_OBJ
     where owner||'.'||OBJECT_NAME in ('HUB.CRM_INVOICE_PREV_PK','E2DWH.BACKLOG_NEW',
     'E2DWH.E2_X_DWH_2_PK','E2DWH.E2_X_DWH2_IDX_SO_ITEM__ID','E2DWH.E2_X_DWH2_IDX_LAST_UPD');

      OBJ# OBJECT_NAME
---------- --------------------------------------------------
  43994766 E2DWH.BACKLOG_NEW (TABLE)
  44005027 E2DWH.E2_X_DWH_2_PK (INDEX)
  44005026 E2DWH.E2_X_DWH2_IDX_LAST_UPD (INDEX)
  44003115 HUB.CRM_INVOICE_PREV_PK (INDEX)
  44005025 E2DWH.E2_X_DWH2_IDX_SO_ITEM__ID (INDEX)

Then you can get the physical writes of one of these objects with something like:

SQL> set lines 150 pages 1000
SQL> SELECT
     TO_CHAR(begin_interval_time,'dd-mon-yyyy hh24:mi:ss') AS begin_interval_time,
     TO_CHAR(end_interval_time,'dd-mon-yyyy hh24:mi:ss') AS end_interval_time,
     hss.physical_writes_delta,
     hss.physical_write_requests_delta
     FROM dba_hist_seg_stat hss, dba_hist_snapshot hsn
     WHERE hss.snap_id = hsn.snap_id
     AND hss.instance_number = hsn.instance_number
     AND hss.obj# = 44003115
     ORDER BY hss.snap_id;

BEGIN_INTERVAL_TIME  END_INTERVAL_TIME    PHYSICAL_WRITES_DELTA PHYSICAL_WRITE_REQUESTS_DELTA
-------------------- -------------------- --------------------- -----------------------------
04-nov-2016 14:00:31 04-nov-2016 15:00:48                742817                        658007
05-nov-2016 14:00:08 05-nov-2016 15:00:26                749335                        652146
05-nov-2016 15:00:26 05-nov-2016 16:00:38                     0                             0
06-nov-2016 14:00:41 06-nov-2016 15:01:00                815897                        718208
07-nov-2016 14:00:20 07-nov-2016 15:00:36                722403                        613129
08-nov-2016 14:00:37 08-nov-2016 15:00:56                608903                        546690
09-nov-2016 14:00:51 09-nov-2016 15:00:08                633603                        549895
10-nov-2016 14:00:39 10-nov-2016 15:00:59                722215                        656331
11-nov-2016 14:00:54 11-nov-2016 15:00:14                606513                        535992
11-nov-2016 15:00:14 11-nov-2016 16:00:28                     0                             0
11-nov-2016 16:00:28 11-nov-2016 17:00:46                     0                             0
11-nov-2016 17:00:46 11-nov-2016 18:02:20                     0                             0
11-nov-2016 18:02:20 11-nov-2016 19:00:48                     0                             0
11-nov-2016 19:00:48 11-nov-2016 20:00:15                     0                             0
18-nov-2016 14:00:40 18-nov-2016 15:00:19               4514625                       4298799
19-nov-2016 14:00:22 19-nov-2016 15:00:56               4811683                       4610613
20-nov-2016 14:00:28 20-nov-2016 15:00:10               2404822                       2267424
21-nov-2016 14:00:08 21-nov-2016 15:00:34               2300106                       2145016
22-nov-2016 14:00:06 22-nov-2016 15:00:39               3589330                       3428962
22-nov-2016 15:00:39 22-nov-2016 16:00:04               1422715                       1373614
23-nov-2016 14:00:39 23-nov-2016 15:00:06               4505033                       4318507
23-nov-2016 17:00:40 23-nov-2016 17:10:56                     0                             0
24-nov-2016 14:00:34 24-nov-2016 15:00:59               2027948                       1873524
25-nov-2016 14:00:07 25-nov-2016 14:55:52               4477194                       4288564
26-nov-2016 14:00:59 26-nov-2016 15:00:18               2350258                       2210634
27-nov-2016 14:00:37 27-nov-2016 15:00:14               4886175                       4681261
28-nov-2016 14:00:19 28-nov-2016 15:00:45               2512397                       2394625
29-nov-2016 14:00:23 29-nov-2016 15:00:59               3476692                       3323396
29-nov-2016 15:00:59 29-nov-2016 16:00:29               1686445                       1632668
30-nov-2016 14:01:07 30-nov-2016 15:00:37               4468593                       4279528
01-dec-2016 14:00:40 01-dec-2016 15:00:02               2549539                       2426793
02-dec-2016 14:00:18 02-dec-2016 15:00:01               4781775                       4589442

32 rows selected.

We clearly see the trend: from November 18th onwards the number of writes per snapshot is roughly 3 to 8 times what it was before.
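
To put a number on this trend rather than reading it off the listing, a minimal sketch could compare the averages on each side of the jump. The cut-off date of November 15th is my assumption (any date in the gap between the November 11th and November 18th snapshots would do), and the sample_period alias is a hypothetical name. Snapshots with no write at all are excluded so that they do not drag the averages down:

```sql
-- Sketch only: average and maximum physical writes before and after the assumed cut-off date
SELECT CASE
         WHEN hsn.begin_interval_time < TIMESTAMP '2016-11-15 00:00:00' THEN 'before'
         ELSE 'after'
       END AS sample_period,
       COUNT(*) AS nb_snapshots,
       ROUND(AVG(hss.physical_writes_delta)) AS avg_physical_writes_delta,
       MAX(hss.physical_writes_delta) AS max_physical_writes_delta
FROM dba_hist_seg_stat hss, dba_hist_snapshot hsn
WHERE hss.snap_id = hsn.snap_id
AND hss.dbid = hsn.dbid
AND hss.instance_number = hsn.instance_number
AND hss.obj# = 44003115
AND hss.physical_writes_delta > 0
GROUP BY CASE
           WHEN hsn.begin_interval_time < TIMESTAMP '2016-11-15 00:00:00' THEN 'before'
           ELSE 'after'
         END
ORDER BY sample_period DESC;
```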

If you check for a segment that has no figures for a period (appearing or disappearing objects) then the query is a bit more complex to build. In other words, building a query to report what appears in an AWR difference report is not so easy. Using standard deviation (STDDEV) I have tried to build a query that would show me the segments that have varied the most in physical writes. Again, this does not handle segments that appear or disappear:

SQL> set lines 150 pages 1000
SQL> col object_name for a50
SQL> SELECT
     TO_CHAR(hsn.begin_interval_time,'dd-mon-yyyy hh24:mi:ss') AS begin_interval_time,
     TO_CHAR(hsn.end_interval_time,'dd-mon-yyyy hh24:mi:ss') AS end_interval_time,
     --obj#,
     (SELECT distinct owner||'.'||object_name||' ('||nvl2(subobject_name,object_type || ': ' || subobject_name,object_type)||')'
      FROM dba_hist_seg_stat_obj
      WHERE obj#=hss.obj# AND dbid=hss.dbid AND dataobj#=hss.dataobj# AND ts#=hss.ts#) AS object_name,
     hss.physical_writes_delta,
     hss.stddev_physical_writes_delta
     FROM
     (SELECT
     snap_id,
     dbid,
     instance_number,
     obj#,
     dataobj#,
     ts#,
     physical_writes_delta,
     ROUND(STDDEV(physical_writes_delta) over (partition by obj#)) AS stddev_physical_writes_delta,
     COUNT(*) OVER (PARTITION BY obj#) AS nb
     FROM dba_hist_seg_stat
     GROUP BY snap_id,dbid,instance_number,obj#,dataobj#,ts#,physical_writes_delta) hss, dba_hist_snapshot hsn
     WHERE hss.snap_id = hsn.snap_id
     AND hss.instance_number = hsn.instance_number
     AND hss.nb >= 5
     ORDER BY stddev_physical_writes_delta desc,hss.snap_id;

BEGIN_INTERVAL_TIME  END_INTERVAL_TIME    OBJECT_NAME                                        PHYSICAL_WRITES_DELTA STDDEV_PHYSICAL_WRITES_DELTA
-------------------- -------------------- -------------------------------------------------- --------------------- ----------------------------
04-nov-2016 14:00:31 04-nov-2016 15:00:48 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   742817                      1763019
05-nov-2016 14:00:08 05-nov-2016 15:00:26 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   749335                      1763019
05-nov-2016 15:00:26 05-nov-2016 16:00:38 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
06-nov-2016 14:00:41 06-nov-2016 15:01:00 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   815897                      1763019
07-nov-2016 14:00:20 07-nov-2016 15:00:36 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   722403                      1763019
08-nov-2016 14:00:37 08-nov-2016 15:00:56 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   608903                      1763019
09-nov-2016 14:00:51 09-nov-2016 15:00:08 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   633603                      1763019
10-nov-2016 14:00:39 10-nov-2016 15:00:59 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   722215                      1763019
11-nov-2016 14:00:54 11-nov-2016 15:00:14 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   606513                      1763019
11-nov-2016 15:00:14 11-nov-2016 16:00:28 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
11-nov-2016 16:00:28 11-nov-2016 17:00:46 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
11-nov-2016 17:00:46 11-nov-2016 18:02:20 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
11-nov-2016 18:02:20 11-nov-2016 19:00:48 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
11-nov-2016 19:00:48 11-nov-2016 20:00:15 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
18-nov-2016 14:00:40 18-nov-2016 15:00:19 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4514625                      1763019
19-nov-2016 14:00:22 19-nov-2016 15:00:56 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4811683                      1763019
20-nov-2016 14:00:28 20-nov-2016 15:00:10 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  2404822                      1763019
21-nov-2016 14:00:08 21-nov-2016 15:00:34 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  2300106                      1763019
22-nov-2016 14:00:06 22-nov-2016 15:00:39 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  3589330                      1763019
22-nov-2016 15:00:39 22-nov-2016 16:00:04 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  1422715                      1763019
23-nov-2016 14:00:39 23-nov-2016 15:00:06 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4505033                      1763019
23-nov-2016 17:00:40 23-nov-2016 17:10:56 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
24-nov-2016 14:00:34 24-nov-2016 15:00:59 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  2027948                      1763019
25-nov-2016 14:00:07 25-nov-2016 14:55:52 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4477194                      1763019
26-nov-2016 14:00:59 26-nov-2016 15:00:18 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  2350258                      1763019
27-nov-2016 14:00:37 27-nov-2016 15:00:14 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4886175                      1763019
28-nov-2016 14:00:19 28-nov-2016 15:00:45 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  2512397                      1763019
29-nov-2016 14:00:23 29-nov-2016 15:00:59 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  3476692                      1763019
29-nov-2016 15:00:59 29-nov-2016 16:00:29 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  1686445                      1763019
30-nov-2016 14:01:07 30-nov-2016 15:00:37 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4468593                      1763019
01-dec-2016 14:00:40 01-dec-2016 15:00:02 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  2549539                      1763019
02-dec-2016 14:00:18 02-dec-2016 15:00:01 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4781775                      1763019
04-nov-2016 00:00:55 04-nov-2016 01:00:44 HUB.CRM_INVOICE_PK (INDEX)                                        474743                      1141594
04-nov-2016 20:00:26 04-nov-2016 21:00:47 HUB.CRM_INVOICE_PK (INDEX)                                        765938                      1141594
05-nov-2016 21:00:38 05-nov-2016 22:00:04 HUB.CRM_INVOICE_PK (INDEX)                                        919735                      1141594
06-nov-2016 19:00:44 06-nov-2016 20:00:11 HUB.CRM_INVOICE_PK (INDEX)                                        107619                      1141594
06-nov-2016 20:00:11 06-nov-2016 21:00:38 HUB.CRM_INVOICE_PK (INDEX)                                        474607                      1141594
07-nov-2016 20:00:30 07-nov-2016 21:00:45 HUB.CRM_INVOICE_PK (INDEX)                                        725956                      1141594
08-nov-2016 20:00:48 08-nov-2016 21:00:02 HUB.CRM_INVOICE_PK (INDEX)                                        113880                      1141594
08-nov-2016 20:00:48 08-nov-2016 21:00:02 HUB.CRM_INVOICE_PK (INDEX)                                        339013                      1141594
.

In the above query I have limited the sample size to at least 5 elements because below this limit I reckon a standard deviation makes no sense. This is almost what you get in an AWR difference report, except for the appearing/disappearing segments…
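
If you only want the ranking and not every snapshot of every segment, a shorter variant is possible. This is a sketch, not the query used above: it replaces the analytic functions with a plain GROUP BY and applies the same minimum of 5 samples through a HAVING clause, so it should return comparable standard deviation figures with a single row per obj#:

```sql
-- Sketch only: one row per segment, most variable segments first
SELECT obj#,
       COUNT(*) AS nb,
       ROUND(AVG(physical_writes_delta)) AS avg_physical_writes_delta,
       ROUND(STDDEV(physical_writes_delta)) AS stddev_physical_writes_delta
FROM dba_hist_seg_stat
GROUP BY obj#
-- same rule as above: ignore segments with fewer than 5 samples
HAVING COUNT(*) >= 5
ORDER BY stddev_physical_writes_delta DESC;
```

The obj# values returned can then be fed to the per-object query shown earlier to see the detail snapshot by snapshot.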

The Service Request we opened with Oracle support reached exactly the same conclusion…

AWR figures extract and load

If you really come too late to check performance and you lack the figures from when it was “good”, one solution is to restore a database backup and create a temporary database on a test server. You might not always be able to do it, but if you can afford it, do it (at the same time it will validate your backup/restore policy)! Once the restore has been started, export the newest AWR figures from your production database using the awrextr.sql script located in $ORACLE_HOME/rdbms/admin; you just need to create a directory and have a bit of free disk space.

Transfer the file to your test server where you have recovered the production backup (so containing the old AWR figures) and load the latest AWR figures into it with the awrload.sql script located in $ORACLE_HOME/rdbms/admin.
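
The extract and load steps could look like the sketch below, run from SQL*Plus as a privileged user. The directory object name awr_dump and the /tmp/awr_dump path are hypothetical, use whatever location has enough free space. Both scripts are interactive, so the values to supply are typed at the prompts and not shown here:

```sql
-- On the production database: directory object where the dump file will be written
-- (hypothetical name and path, the path must already exist on the server)
CREATE OR REPLACE DIRECTORY awr_dump AS '/tmp/awr_dump';
-- Prompts for the snapshot range, the directory object and the dump file name
@?/rdbms/admin/awrextr.sql

-- Copy the dump file to the test server, then on the restored database:
CREATE OR REPLACE DIRECTORY awr_dump AS '/tmp/awr_dump';
-- Prompts for the directory object, the dump file name and a staging schema
@?/rdbms/admin/awrload.sql
```

In SQL*Plus the ? character is a shortcut for $ORACLE_HOME, which saves typing the full path of the scripts.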

At first try I got the error message below:

ERROR at line 1:
ORA-20105: unable to move AWR data to SYS
ORA-06512: at "SYS.DBMS_SWRF_INTERNAL", line 2984
ORA-20107: not allowed to move AWR data for local dbid
ORA-06512: at line 3

So I changed the DBID of my restored database with the NID (database New ID) tool. Of course the restore must have completed properly, otherwise the validation fails as below:

[oraxyz@server11 ~]$ nid target=sys DBNAME=sidxyz

DBNEWID: Release 11.2.0.4.0 - Production on Fri Dec 9 15:11:09 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Password:
Connected to database EDWHUB (DBID=3256370993)

Connected to server version 11.2.0

Control Files in database:
    /ora_edwhub1/ctrl/edwhub/control01.ctl
    /ora_edwhub1/ctrl/edwhub/control02.ctl


The following datafiles are offline immediate:
    /ora_edwhub1/data02/edwhub/logdmfact2_01.db (284)
    /ora_edwhub/data02/edwhub/logdmfact2_01.db (290)

NID-00122: Database should have no offline immediate datafiles


Change of database name failed during validation - database is intact.
DBNEWID - Completed with validation errors.

Once the recovery has completed properly and the DBID has been changed, the import runs successfully and you can then generate AWR difference reports…
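
For reference, a DBID change on a cleanly recovered database generally follows the sequence sketched below. This is an outline under assumptions, not a transcript of what I ran: it supposes no datafile is left in offline immediate state, and it calls nid without the DBNAME parameter so that only the DBID changes, not the database name:

```sql
-- 1. The database must be mounted, not open
SHUTDOWN IMMEDIATE
STARTUP MOUNT

-- 2. From the operating system shell, run DBNEWID and answer the prompts:
--      nid target=sys

-- 3. DBNEWID shuts the database down on success: mount it again and open it with RESETLOGS
STARTUP MOUNT
ALTER DATABASE OPEN RESETLOGS;

-- 4. Check that the DBID now differs from the production one
SELECT dbid, name FROM v$database;
```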

The post AWR mining for performance trend analysis appeared first on IT World.

]]>
https://blog.yannickjaquier.com/oracle/awr-mining-performance-trends.html/feed 0