
ODA X8-2-HA cluster issue and NTP configuration


I have recently been deploying some new ODA X8-2-HA appliances. After reimaging and creating the appliance, I had to patch the ODA. The day after, while checking the status, I realized that my node 0 was not working as expected. I did some interesting troubleshooting that I want to share, hoping it might be helpful for someone.

Why patching an ODA after reimaging it?

It might seem curious to talk about patching an ODA right after a fresh installation (reimaging). The problem is that reimaging with the latest available version only installs the latest operating system and grid infrastructure versions. The ILOM and the BIOS are not updated. The need to patch can easily be seen when you check the installed components:

[root@ODA-node0 ~]# odacli describe-component
System Version
---------------
19.8.0.0.0

System node Name
---------------
ODA-node0

Local System Version
---------------
19.8.0.0.0

Component                                Installed Version    Available Version
---------------------------------------- -------------------- --------------------
OAK                                       19.8.0.0.0            up-to-date

GI                                        19.8.0.0.200714       up-to-date

DB                                        12.1.0.2.200714       up-to-date

DCSAGENT                                  19.8.0.0.0            up-to-date

ILOM                                      4.0.4.38.r130206      4.0.4.51.r134837

BIOS                                      52010400              52021300

OS                                        7.8                   up-to-date

FIRMWARECONTROLLER                        16.00.01.00           16.00.08.00

FIRMWAREEXPANDER                          0309                  0310

FIRMWAREDISK {
[ c2d0,c2d1 ]                             1120                  1102
[ c0d0,c0d1,c0d2,c0d3,c0d4,c0d5,c0d6,     A959                  up-to-date
c0d7,c0d8,c0d9,c0d10,c0d11,c0d12,c0d13,
c0d14,c0d15,c0d16,c0d17,c0d18,c0d19,
c0d20,c0d21,c0d22,c0d23,c1d0,c1d1,c1d2,
c1d3,c1d4,c1d5,c1d6,c1d7,c1d8,c1d9,
c1d10,c1d11,c1d12,c1d13,c1d14,c1d15,
c1d16,c1d17,c1d18,c1d19,c1d20,c1d21,
c1d22,c1d23 ]
}

HMP                                       2.4.5.0.1             up-to-date

System node Name
---------------
ODA-node1

Local System Version
---------------
19.8.0.0.0

Component                                Installed Version    Available Version
---------------------------------------- -------------------- --------------------
OAK                                       19.8.0.0.0            up-to-date

GI                                        19.8.0.0.200714       up-to-date

DB                                        12.1.0.2.200714       up-to-date

DCSAGENT                                  19.8.0.0.0            up-to-date

ILOM                                      4.0.4.38.r130206      4.0.4.51.r134837

BIOS                                      52010400              52021300

OS                                        7.8                   up-to-date

FIRMWARECONTROLLER                        16.00.01.00           16.00.08.00

FIRMWAREEXPANDER                          0309                  0310

FIRMWAREDISK {
[ c2d0,c2d1 ]                             1120                  1102
[ c0d0,c0d1,c0d2,c0d3,c0d4,c0d5,c0d6,     A959                  up-to-date
c0d7,c0d8,c0d9,c0d10,c0d11,c0d12,c0d13,
c0d14,c0d15,c0d16,c0d17,c0d18,c0d19,
c0d20,c0d21,c0d22,c0d23,c1d0,c1d1,c1d2,
c1d3,c1d4,c1d5,c1d6,c1d7,c1d8,c1d9,
c1d10,c1d11,c1d12,c1d13,c1d14,c1d15,
c1d16,c1d17,c1d18,c1d19,c1d20,c1d21,
c1d22,c1d23 ]
}

HMP                                       2.4.5.0.1             up-to-date

In my case here, I had redeployed the ODA with version 19.8 and there is a version gap for the ILOM, the BIOS and the storage firmware. There are some very good blogs about patching an ODA to version 19.x:
Patching ODA from 18.8 to 19.6
Reimaging ODA in version 19.8 and patching

In some cases the ODA patching fails to update the ILOM and the BIOS, and this needs to be done manually. Such an operation is also described in detail in the same blog:
ILOM and BIOS manual updates

Cluster offset issue

The day after, before moving forward, I checked the ODA status and could see a problem on my node 0. The ASM instance was not started, nor was the ASM listener, as shown below.

oracle@ODA-node0:/home/oracle/ [rdbms12102_1] ser
2020-09-16_10-23-05::dmk-run.pl::CheckListenerFile     ::ERROR ==> Couldn't open the listener.ora : /u01/app/oracle/product/12.1.0.2/dbhome_1/network/admin

Dummy:
------
rdbms12102_1         : DUMMY           (12.1.0.2/dbhome_1)


Database(s):
------------
+ASM1                : STOPPED         (grid)




oracle@ODA-node0:/home/oracle/ [rdbms12102_1]

Please note that our customer environments run the dbi DMK management kit, which is why some of these outputs might look unusual.

Somewhat surprised, I decided to simply try to start the ASM listener:

grid@ODA-node0:/home/grid/ [+ASM1] srvctl status listener -listener ASMNET1LSNR_ASM
PRCR-1070 : Failed to check if resource ora.ASMNET1LSNR_ASM.lsnr is registered
CRS-0184 : Cannot communicate with the CRS daemon.

Which was definitely unsuccessful.

I then checked the clusterware status:

grid@psrpri230:/home/grid/ [rdbms12102_1] crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

The cluster was having problems, so I tried to stop and start it.

[root@ODA-node0 bin]# ./crsctl start cluster
CRS-2672: Attempting to start 'ora.ctssd' on 'ODA-node0'
The clock on host ODA-node0 differs from mean cluster time by 21656584768 microseconds. The Cluster Time Synchronization Service will not perform time synchronization because the time difference is beyond the permissible offset of 600 seconds. Details in /u01/app/grid/diag/crs/ODA-node0/crs/trace/octssd.trc.
CRS-2674: Start of 'ora.ctssd' on 'ODA-node0' failed
CRS-2672: Attempting to start 'ora.ctssd' on 'ODA-node0'
The clock on host ODA-node0 differs from mean cluster time by 21656573817 microseconds. The Cluster Time Synchronization Service will not perform time synchronization because the time difference is beyond the permissible offset of 600 seconds. Details in /u01/app/grid/diag/crs/ODA-node0/crs/trace/octssd.trc.
CRS-2674: Start of 'ora.ctssd' on 'ODA-node0' failed
CRS-4000: Command Start failed, or completed with errors.

There was definitely a time synchronization issue between my two nodes, node0 and node1. This could be seen in the log file as well:

[root@ODA-node0 bin]# tail -20 /u01/app/grid/diag/crs/ODA-node0/crs/trace/octssd.trc
2020-09-16 10:51:56.830 : GIPCTLS:2817517312:  gipcmodTlsSetAuthFlags: nzcred auth (global 0) flags, feature: 32, optional:0
2020-09-16 10:51:56.830 : GIPCTLS:2817517312:  gipcmodTlsAuthInit: tls context initialized successfully
2020-09-16 10:51:56.839 : GIPCTLS:2817517312:  gipcmodTlsAuthStart: TLS HANDSHAKE - SUCCESSFUL
2020-09-16 10:51:56.839 : GIPCTLS:2817517312:  gipcmodTlsAuthStart: peerUser: NULL
2020-09-16 10:51:56.839 : GIPCTLS:2817517312:  gipcmodTlsAuthStart: name:CN=ea78a0117f76ff9cff4967dc0d8bf3cf_4294692300,O=Oracle Clusterware,
2020-09-16 10:51:56.839 : GIPCTLS:2817517312:  gipcmodTlsAuthStart: name:CN=ea78a0117f76ff9cff4967dc0d8bf3cf_1599664758,O=Oracle_Clusterware,
2020-09-16 10:51:56.839 : GIPCTLS:2817517312:  gipcmodTlsAuthStart: endpoint 0x7efe9003ac00 [0000000000000a28] { gipcEndpoint : localAddr 'gipcha://ODA-node0:3202-6a4f-ef55-4696', remoteAddr 'gipcha://ODA-node1:CTSSGROUP_2/a7fa-52e9-ad59-eca4', numPend 2, numReady 0, numDone 1, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj 0x7efe9003d410, sendp (nil) status 13flags 0x200b8602, flags-2 0x10, usrFlags 0x20020 }, auth state: gipcmodTlsAuthStateReady (3)
2020-09-16 10:51:56.839 : GIPCTLS:2817517312:  gipcmodTlsAuthReady: TLS Auth completed Successfully
2020-09-16 10:51:56.977 :  CRSCCL:2817517312: ConnAccepted from Peer:msgTag= 0xcccccccc version= 0 msgType= 4 msgId= 0 msglen = 0 clschdr.size_clscmsgh= 88 src= (2, 4294731470) dest= (1, 1099434)
2020-09-16 10:51:56.977 :    CTSS:2811213568: ctssslave_swm2_1: Waiting for time sync message from master. sync_state[2].
2020-09-16 10:51:56.977 :    CTSS:2815416064: ctsscomm_recv_cb4_2: Receive active version change msg. Old active version [318767104] New active version [318767104].
2020-09-16 10:51:56.977 :    CTSS:2815416064: ctssslave_msg_handler4_1: Waiting for slave_sync_with_master to finish sync process. sync_state[3].
2020-09-16 10:51:56.977 :    CTSS:2811213568: ctssslave_swm2_3: Received time sync message from master.
2020-09-16 10:51:56.977 :    CTSS:2811213568: ctssslave_swm: The magnitude [21656573817] of the offset [21656573817 usec] is larger than [600000000 usec] sec which is the CTSS limit. Hence CTSS is exiting.
2020-09-16 10:51:56.977 :    CTSS:2811213568: (:ctsselect_msm3:): Failed in clsctssslave_sync_with_master [12]: Time offset is too much to be corrected, exiting.
2020-09-16 10:51:56.977 :    CTSS:2815416064: ctssslave_msg_handler4_3: slave_sync_with_master finished sync process. Exiting clsctssslave_msg_handler
2020-09-16 10:51:57.310 :  CRSCCL:3293234944: clsCclGetPriMemberData: Detected pridata change for node[1]. Retrieving it to the cache.
2020-09-16 10:51:57.532 :    CTSS:3315066624: ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xf6], offset[21656573 ms]}, length=[8].
2020-09-16 10:51:57.532 :    CTSS:2811213568: ctsselect_msm: CTSS daemon exiting [12] as offset is to large.
2020-09-16 10:51:57.532 :    CTSS:2811213568: CTSS daemon aborting

I then checked the server time displayed by the OS:

grid@ODA-node0:/home/grid/ [rdbms12102_1] date
Wed Sep 16 11:10:17 CEST 2020

oracle@ODA-node1:/home/oracle/ [rdbms12102_1] date
Wed Sep 16 05:09:29 CEST 2020

Node 0 was on time but node 1 was definitely not.

But why? NTP had been configured on the appliance:

[root@ODA-node1 ~]# odacli describe-system

Appliance Information
----------------------------------------------------------------
                     ID: 74352d02-f2f6-4547-a5cf-71b40f7b7879
               Platform: X8-2-HA
        Data Disk Count: 24
         CPU Core Count: 32
                Created: September 9, 2020 2:04:08 AM CEST

System Information
----------------------------------------------------------------
                   Name: 
            Domain Name: domain_name
              Time Zone: Europe/Zurich
             DB Edition: EE
            DNS Servers: 10.X.X.X 10.X.X.X
            NTP Servers: 10.10.40.2 10.7.40.23 10.7.40.25

Disk Group Information
----------------------------------------------------------------
DG Name                   Redundancy                Percentage
------------------------- ------------------------- ------------
Data                      Normal                    50
Reco                      Normal                    50

Please note that the IP addresses have been changed so as not to display the customer's ones.

NTP configuration on the ODA

The NTP configuration on the ODA is a standard Linux NTP configuration.

I checked the NTP configuration and realized that the Red Hat NTP pool servers are configured and active by default. Most customers won't accept such external connections and have their own internal NTP servers. This was the case for this customer, so I commented out those lines in the NTP configuration file.

[root@ODA-node1 bin]# cp -p /etc/ntp.conf /etc/ntp.conf.20200916

[root@ODA-node1 bin]# vi /etc/ntp.conf

[root@ODA-node1 bin]# diff /etc/ntp.conf /etc/ntp.conf.20200916
21,24c21,24
< #server 0.rhel.pool.ntp.org iburst
< #server 1.rhel.pool.ntp.org iburst
< #server 2.rhel.pool.ntp.org iburst
< #server 3.rhel.pool.ntp.org iburst
---
> server 0.rhel.pool.ntp.org iburst
> server 1.rhel.pool.ntp.org iburst
> server 2.rhel.pool.ntp.org iburst
> server 3.rhel.pool.ntp.org iburst

The NTP configuration file then looks like this:

[root@ODA-node1 bin]# cat /etc/ntp.conf
# For more information about this file, see the man pages
# ntp.conf(5), ntp_acc(5), ntp_auth(5), ntp_clock(5), ntp_misc(5), ntp_mon(5).

driftfile /var/lib/ntp/drift

# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
restrict default nomodify notrap nopeer noquery

# Permit all access over the loopback interface.  This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
restrict 127.0.0.1
restrict ::1

# Hosts on local network are less restricted.
#restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap

# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.rhel.pool.ntp.org iburst
#server 1.rhel.pool.ntp.org iburst
#server 2.rhel.pool.ntp.org iburst
#server 3.rhel.pool.ntp.org iburst

#broadcast 192.168.1.255 autokey        # broadcast server
#broadcastclient                        # broadcast client
#broadcast 224.0.1.1 autokey            # multicast server
#multicastclient 224.0.1.1              # multicast client
#manycastserver 239.255.254.254         # manycast server
#manycastclient 239.255.254.254 autokey # manycast client

# Enable public key cryptography.
#crypto

includefile /etc/ntp/crypto/pw

# Key file containing the keys and key identifiers used when operating
# with symmetric key cryptography.
keys /etc/ntp/keys

# Specify the key identifiers which are trusted.
#trustedkey 4 8 42

# Specify the key identifier to use with the ntpdc utility.
#requestkey 8

# Specify the key identifier to use with the ntpq utility.
#controlkey 8

# Enable writing of statistics records.
#statistics clockstats cryptostats loopstats peerstats

# Disable the monitoring facility to prevent amplification attacks using ntpdc
# monlist command when default restrict does not include the noquery flag. See
# CVE-2013-5211 for more details.
# Note: Monitoring will not be disabled with the limited restriction flag.
disable monitor
server 10.10.40.2 prefer
server 10.7.40.23
server 10.7.40.25

I wanted to check how the server was synchronizing, using the ntpq tool.

[root@ODA-node1 bin]# ntpq -p
ntpq: read: Connection refused
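
The "Connection refused" from ntpq usually means that the ntpd daemon is not running (or not listening on localhost). A quick way to confirm it, which is my own addition and not part of the original troubleshooting, would be:

systemctl status ntpd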

So I had to restart the service first:

[root@ODA-node1 bin]# service ntpd restart
Redirecting to /bin/systemctl restart ntpd.service
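
In my case, simply restarting ntpd was enough for the clock to be stepped afterwards (as shown further down). If the daemon refuses to correct a very large offset, a one-time manual sync is a common fallback; this is my own suggestion and was not needed here:

service ntpd stop
ntpdate 10.7.40.23    # one-off sync against one of the internal NTP servers
service ntpd start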

I could then check the NTP synchronization status.

[root@ODA-node1 bin]# ntpq -p
     remote           refid          st t when poll reach   delay   offset  jitter
==============================================================================
 lantime.domain_name .INIT.          16 u    -  512    0    0.000    0.000   0.000
+ntp1.domain_name    131.188.3.223    2 u   32   64  377    0.364    0.635   0.247
*ntp2.domain_name    131.188.3.223    2 u   26   64  377    0.385    0.851   0.173

remote

The remote column displays the NTP server names, which should be resolvable by the DNS server. In my case these are the NTP servers I configured when creating the appliance; see the previous odacli describe-system output and the nslookup commands below.

[root@psrpri231 ~]# nslookup ntp2.domain_name
Server:         10.7.9.9
Address:        10.7.9.9#53

Name:   ntp2.domain_name
Address: 10.7.40.25

[root@psrpri231 ~]# nslookup ntp1.domain_name
Server:         10.7.9.9
Address:        10.7.9.9#53

Name:   ntp1.domain_name
Address: 10.7.40.23

At that time there was a network routing issue to the preferred NTP server, which is why it shows .INIT., stratum 16 and a reach of 0, and why only the two other NTP servers are effectively used in the ntpq output. This was certainly the root cause of the synchronization problem between the two nodes.
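
A quick way (my own suggestion, not part of the original troubleshooting) to test whether a given server answers NTP queries at all, without touching the local clock, is a query-only ntpdate:

ntpdate -q 10.10.40.2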

refid

The refid entry shows the current source of synchronization for each peer. In my case the source was 131.188.3.223, which is an NTP server of the Friedrich-Alexander University in Erlangen (Germany). This can be confirmed by doing a simple nslookup from my laptop.

C:\Users\maw>nslookup 131.188.3.223
Server:  UnKnown
Address:  192.168.1.254

Name:    ntp3.rrze.uni-erlangen.de
Address:  131.188.3.223

The character at the left margin displays the synchronization status of the peer.
The character * highlights the peer that is currently selected. It is named the system peer.
The character + highlights peers that are designated as acceptable for synchronization but not currently selected. They are considered good reference time sources and are survivors of the selection process. They are potential system peers and are then called candidates.
The character – highlights a peer that has been discarded from the selection by the clustering algorithm because its time differs too much from the survivors' time. Peers whose time is clearly wrong are marked with x and are called falsetickers.

st

The st entry displays the peer stratum. The stratum measures the distance from the peer to the reference clock, i.e. how many hops/servers there are up to the hardware refclock. In my case, the customer NTP servers are stratum 2 servers. This indicates that they are synchronized directly with the NTP server of the Friedrich-Alexander University, which is itself a stratum 1 server attached to a hardware refclock (stratum 0). The stratum can be seen as a level in the timing hierarchy.

t

The t entry displays the type of the peer.
u=unicast
m=multicast
l=local
-=don’t know

when

The when entry will display the time in seconds since the peer was last heard.

poll

The poll entry displays the polling interval in seconds, that is to say the delay between two successive requests sent to the peer.

reach

The reach entry displays, in octal format, the status of the last synchronization attempts against the peer. After 8 consecutive successful attempts the value is 377.
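
As an illustration (my own addition), the octal value is an 8-bit shift register, one bit per poll, and can be expanded to binary to see the result of each of the last eight polls:

echo 'obase=2; ibase=8; 377' | bc    # prints 11111111: the last 8 polls all succeeded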

delay

The delay entry displays the round-trip latency of the NTP request, from the client to the server and back again. This value is important so that NTP can take the network latency into account and adjust the timing accordingly.

offset

The offset entry will give the time difference between the ODA itself and the peer (the synchronization server).

jitter

The jitter entry gives the dispersion, the magnitude of the variance. Each peer has a different amount of jitter. The lower the jitter value, the more accurate the peer.

Resolving and checking the cluster offset issue

Now that the NTP issue was resolved, I could restart the Oracle cluster and check that the nodes were up and running again.

I could first see that the ODA date had been automatically adjusted by NTP:

[root@ODA-node1 ~]# date
Wed Sep 16 11:29:31 CEST 2020

I could then restart the cluster:

[root@ODA-node0 bin]# cd /u01/app/19.0.0.0/grid/bin/

[root@ODA-node0 bin]#  ./crsctl start cluster
CRS-2672: Attempting to start 'ora.ctssd' on 'ODA-node0'
CRS-2676: Start of 'ora.ctssd' on 'ODA-node0' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'ODA-node0'
CRS-2672: Attempting to start 'ora.crsd' on 'ODA-node0'
CRS-2676: Start of 'ora.crsd' on 'ODA-node0' succeeded
CRS-2676: Start of 'ora.asm' on 'ODA-node0' succeeded

And I could check that the cluster was up and running:

[root@ODA-node0 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

[root@ODA-node0 bin]# ./crsctl status resource -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.COMMONSTORE.advm
               ONLINE  ONLINE       ODA-node0                STABLE
               ONLINE  ONLINE       ODA-node1                STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       ODA-node0                STABLE
               ONLINE  ONLINE       ODA-node1                STABLE
ora.chad
               ONLINE  ONLINE       ODA-node0                STABLE
               ONLINE  ONLINE       ODA-node1                STABLE
ora.data.commonstore.acfs
               ONLINE  ONLINE       ODA-node0                mounted on /opt/orac
                                                             le/dcs/commonstore,S
                                                             TABLE
               ONLINE  ONLINE       ODA-node1                mounted on /opt/orac
                                                             le/dcs/commonstore,S
                                                             TABLE
ora.net1.network
               ONLINE  ONLINE       ODA-node0                STABLE
               ONLINE  ONLINE       ODA-node1                STABLE
ora.ons
               ONLINE  ONLINE       ODA-node0                STABLE
               ONLINE  ONLINE       ODA-node1                STABLE
ora.proxy_advm
               ONLINE  ONLINE       ODA-node0                STABLE
               ONLINE  ONLINE       ODA-node1                STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       ODA-node0                STABLE
      2        ONLINE  ONLINE       ODA-node1                STABLE
ora.DATA.dg(ora.asmgroup)
      1        ONLINE  ONLINE       ODA-node0                STABLE
      2        ONLINE  ONLINE       ODA-node1                STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       ODA-node0                STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       ODA-node1                STABLE
ora.RECO.dg(ora.asmgroup)
      1        ONLINE  ONLINE       ODA-node0                STABLE
      2        OFFLINE OFFLINE                               STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  ONLINE       ODA-node0                Started,STABLE
      2        ONLINE  ONLINE       ODA-node1                STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       ODA-node0                STABLE
      2        ONLINE  ONLINE       ODA-node1                STABLE
ora.cvu
      1        ONLINE  ONLINE       ODA-node1                STABLE
ora.ODA-node0.vip
      1        ONLINE  ONLINE       ODA-node0                STABLE
ora.ODA-node1.vip
      1        ONLINE  ONLINE       ODA-node1                STABLE
ora.qosmserver
      1        ONLINE  ONLINE       ODA-node1                STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       ODA-node0                STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       ODA-node1                STABLE
--------------------------------------------------------------------------------

And finally, the ASM instance and the ASM listener were up and running again:

oracle@ODA-node0:/home/oracle/ [rdbms12102_1] ser
2020-09-16_11-34-31::dmk-run.pl::CheckListenerFile     ::ERROR ==> Couldn't open the listener.ora : /u01/app/oracle/product/12.1.0.2/dbhome_1/network/admin

Dummy:
------
rdbms12102_1         : DUMMY           (12.1.0.2/dbhome_1)


Database(s):
------------
+ASM1                : STARTED         (grid)


Listener(s):
------------
ASMNET1LSNR_ASM      : Up              (grid)
LISTENER             : Up              (grid)
LISTENER_SCAN1       : Up              (grid)

This article ODA X8-2-HA cluster issue and NTP configuration appeared first on Blog dbi services.


Upgrade to Oracle 19c – performance issue


In this blog I want to introduce you to a workaround for a performance issue which randomly appeared during the upgrades of several Oracle 12c databases to 19c that I performed for a financial services provider. During the upgrades we ran into a severe performance issue after the upgrades of more than 40 databases had worked just fine. While most of them finished in less than one hour, we ran into one which would have taken days to complete.

Issue

After starting the database upgrade from Oracle 12.2.0.1.0 to production version 19.8.0.0.0, the upgrade locked up while recompiling invalid objects:

@utlrp

 

Reason

One SELECT statement on the unified audit trail was running for hours without returning any result, blocking the upgrade progress and consuming nearly all database resources. The size of the audit trail itself was about 35MB, so not the size you would expect such a bottleneck from:

SQL> SELECT count(*) from gv$unified_audit_trail;

 

Solution

After some research and testing (see notes below) I found the following workaround (after killing the upgrade process):

SQL> begin
DBMS_AUDIT_MGMT.CLEAN_AUDIT_TRAIL(
audit_trail_type => DBMS_AUDIT_MGMT.AUDIT_TRAIL_UNIFIED,
use_last_arch_timestamp => TRUE);
end;
/
SQL> set timing on;
SELECT count(*) from gv$unified_audit_trail;
exec DBMS_AUDIT_MGMT.FLUSH_UNIFIED_AUDIT_TRAIL;

 

Note

As a first attempt I used the procedure below, described in Note 2212196.1.

But flush_unified_audit_trail lasted too long, so I killed the process after it had run for one hour. The flush procedure worked fine again after using clean_audit_trail as described above:

SQL> begin
DBMS_AUDIT_MGMT.FLUSH_UNIFIED_AUDIT_TRAIL;
for i in 1..10 loop
DBMS_AUDIT_MGMT.TRANSFER_UNIFIED_AUDIT_RECORDS;
end loop;
end;
/

 

 

A few days later we encountered the same issue on an Oracle 12.1.0.2 database which requires Patch 25985768 for executing dbms_audit_mgmt.transfer_unified_audit_records.

This procedure is available out of the box in the Oracle 12.2 database and in the Oracle 12.1.0.2 databases which have been patched with Patch 25985768.

To avoid getting caught in this trap, my advice is to gather all relevant statistics before any upgrade from Oracle 12c to 19c and to query gv$unified_audit_trail in advance. This query usually finishes within a few seconds.
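
A minimal pre-upgrade check along those lines, as I would script it (my own sketch, not part of the original procedure; it assumes sqlplus is in the PATH, SYSDBA access on the database to be upgraded, and that dictionary and fixed-object statistics are the relevant ones to gather):

#!/bin/bash
# Gather dictionary and fixed-object statistics, then time the audit trail query.
# If the SELECT does not return within a few seconds, investigate before upgrading.
sqlplus -s / as sysdba <<'EOF'
set timing on
exec DBMS_STATS.GATHER_DICTIONARY_STATS;
exec DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;
SELECT count(*) FROM gv$unified_audit_trail;
EOF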

 

Related documents

Doc ID 2212196.1

https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=257639407234852&id=2212196.1&_afrWindowMode=0&_adf.ctrl-state=rd4zvw12p_4

Master Note For Database Unified Auditing (Doc ID 2351084.1)

Bug 18920838 : 12C POOR QUERY PERFORMANCE ON DICTIONARY TABLE SYS.X$UNIFIED_AUDIT_TRAIL

Bug 21119008 : POOR QUERY PERFORMANCE ON UNIFIED_AUDIT_TRAIL

Performance Issues While Monitoring the Unified Audit Trail of an Oracle12c Database (Doc ID 2063340.1)

This article Upgrade to Oracle 19c – performance issue appeared first on Blog dbi services.

Oracle Database Appliance vs Oracle Cloud Infrastructure


Introduction

Oracle Database Appliances are very popular these days, and not only among customers new to this kind of engineered system. Almost all customers already using old-generation ODAs renew their infrastructure by choosing ODAs again, meaning that the solution is good enough and probably better than anything else. But public clouds are now a real alternative to on-premises servers, and Oracle Cloud Infrastructure is a solid competitor to the Amazon and Azure public clouds. So what's the best solution for your databases, ODA or OCI? Let's do the match.

Round 1 – Cost

Yes, cost is important. You will need to buy ODAs and you will need a budget for that. Nothing new regarding this platform, it requires an investment. The ODA is cheaper since light models became available, but if you need a significant amount of storage, it comes at a cost. Still, the cost is quite similar to another x86 platform, and the ODA doesn't have the hidden costs caused by additional work for troubleshooting compatibility issues.
Cost works differently on OCI. Basically, you pay for servers, storage and services on a monthly basis. No initial investment is needed, and that is one of the advantages of OCI. However, don't expect the TCO to be lower than acquiring your own hardware. I do not mean that cloud solutions are expensive, but the cost will be quite similar to an on-premises solution after some years. Going to the cloud is mainly changing your mind about what an infrastructure is. Is it servers you manage on your own, or is it a platform for running your information system?
There is no winner in this round; you will only know after several years which solution would have been the less expensive.

Winner: none

Round 2 – Deployment speed

ODA allows fast deployment of a new database infrastructure. Actually, it's the best on-premises solution regarding that point. And it's a serious advantage over DIY platforms. Being able to create your first database the same day you open the box is quite nice. But OCI is even better, because at this very moment, your future servers are already available, terabytes of storage are waiting for you, and databases are almost there, just a few clicks away. If you're looking for fast deployment, OCI is an easy winner.

Winner: OCI

Round 3 – Security

Everybody is talking about security. Is my database safer in the cloud than in my own datacenter? Actually, it's quite hard to tell. For sure, OCI is a public cloud, meaning that your database can be reached from virtually everywhere. But you will probably build strong security rules to protect your cloud infrastructure. You will use an IPSec VPN between OCI and your on-premises site, or a FastConnect channel to dedicate a link between your on-premises equipment and OCI, avoiding data transiting through the internet. Putting your database in the cloud is not less secure than giving remote access on your infrastructure to your employees or providers. Furthermore, databases in OCI are stored encrypted, even with Standard Edition and without the need for the Advanced Security option.
On ODA, your database is in your own network, meaning not on something public and therefore less visible. This is good, but again, only if you have good security rules inside your company.

Winner: none

Round 4 – Performance

ODA is a strong performer, especially the X8-2M model. With up to 75TB of NVMe SSD, it's quite tough to achieve better performance with anything else. Yes, you could grab a few MB/s more or a few ms less with a few other solutions, but do you really think that your users will see the difference? No. And what about OCI? OCI relies on SSD storage only, which is a very good start. And does it offer NVMe? Yes, for sure. Bare metal servers (BM Dense I/O) provide up to 51TB of raw capacity based on 8 NVMe drives. And something tells me that these servers are actually nearly the same as ODA X7-2Ms. So expect similar performance on both solutions.

Winner: none

Round 5 – License management

No doubt that on-demand capacity for Enterprise licenses is one of the key features of the ODA. You can start with only 1 Enterprise license on each ODA and increase the number of licenses when you need more resources. A kind of fine tuning for the licenses.

On OCI, you can choose to bring your own license, the one you bought a long time ago, and keep your investment for later if for some reason you would like to go back to an on-premises infrastructure. Or you can choose to include the license fees in the monthly fees. With the first option, you manage your licenses as you always did, and you should be careful when you increase the cloud resources dedicated to your databases (mainly the OCPUs). With the second option, you don't have to manage your licenses anymore: you don't need to buy them, pay the yearly support, or review them regularly, because everything is included with your OCI database services. It's simply a great feature.

Winner: OCI

Round 6 – Simplicity

ODA and OCI share the same goal: simplify your database infrastructure. ODA simplifies by providing the best automation available for deploying the complex Oracle stack. And when you come from an existing on-premises infrastructure, migration to ODA will be quite easy. OCI looks even simpler, but even though you will not have to work on the servers, you'll have to think about how to implement your infrastructure. Which subnet for my databases? Should I also move my application servers? What kind of network connectivity with my on-premises environment? Which kind of database service fits my needs?

If you're starting from scratch with Oracle databases, it's probably simpler to go directly to OCI. If you're migrating from an existing on-premises environment, it's probably simpler to replace your existing servers with ODAs. No winner here.

Winner: none

Round 7 – Control

For some customers, being able to control their infrastructure is vital. On public clouds, you will not have control over everything, because someone else will do part of the maintenance job, mostly through automated tasks. And for some other customers this is precisely something they don't want to manage themselves. On ODA, you control everything on your server: first of all, it's not even mandatory to connect it to the internet. Updates on the ODA cannot be automated and will be applied manually through good old zipfiles, and in case of serious problems, an ODA is fast to redeploy. So if you need total control over your infrastructure, the ODA is the best solution for you.

OCI is only a good solution if you have already planned to give up some control, which obviously reduces your operational workload.

Winner: ODA

Round 8 – Resilience

Disaster recovery solutions were not so common 10 years ago. People relied on tape backups, were confident about this solution and believed they would be able to restore the tape "somewhere", without asking themselves where "somewhere" actually was. At best, the old servers were kept for disaster recovery usage, in the same rack.
This has definitely changed, and now disaster recovery is part of each new infrastructure design. And regarding the software side of the database, this is something mature and highly reliable (with Data Guard or Dbvisit Standby). The most complex part is designing the split across multiple datacenters (two, most of the time). Implementing that cleverly, avoiding any single point of failure that could wipe out the efforts made to achieve high resiliency, is a tough challenge. The ODA is a server like any other, and you will have to do the same amount of work to design a highly resilient infrastructure.

Cloud providers have been thinking about disaster recovery since the very beginning. The datacenters are spread all around the world, and each one has separate availability domains (isolated building blocks), allowing multiple levels of disaster recovery scenarios. Furthermore, storage and backups naturally embed this high resilience. And as everyone uses the same mechanisms, you can trust OCI regarding resilience.

As a conclusion, it's nearly impossible to reach the level of resilience of OCI on your on-premises ODA infrastructure, that must be said…

Winner: OCI

What about the other solutions?

For sure, it is still possible to build your own database infrastructure with classic servers. But do you really have time for that?
Exadata is also a nice solution if you need such a beast for multi-TB databases with a high number of transactions or for the fastest BI platform. And it can now bring you the advantages of both OCI and the appliance with the Cloud@Customer mode: Oracle puts the server in your datacenter, and you only pay for it monthly as if you were using it in the cloud.
A hybrid solution with a mix of ODA and OCI could also fit your needs, but you'll have to manage both technologies, and that's not so smart. Unless you need this kind of solution for a transition to the cloud…

Conclusion

Is ODA better than OCI? Is OCI better than ODA? Both solutions are smart choices and neither will disappoint you if you manage to leverage the advantages and avoid the constraints of each one. On OCI, you will benefit from immediate availability of the resources, fast provisioning, flexibility and no-brainer license management. With ODA, you will keep your database infrastructure at home, and you will have strong performance and full control over your servers, including over the cost. Choosing between these two solutions is only a matter of strategy, and this does not only concern the DBA.

This article Oracle Database Appliance vs Oracle Cloud Infrastructure appeared first on Blog dbi services.

Disk usage on ODA – Free MB and usable MB


Introduction

Oracle Database Appliances rely on ASM to manage disk redundancy. And ASM is brilliant. Compared to RAID, redundancy is managed at the block level. For NORMAL redundancy, which is similar to RAID1, you need at least 2 disks, but it can also work with 3 disks, 4 disks, 5 disks and so on. There is no need for parity at the disk level. HIGH redundancy, which does not exist in RAID technology, is basically triple security: each block is written on 3 different disks. For this kind of redundancy, you need at least 3 disks, but you can also use 4 disks, 5 disks, 6 disks and so on. You can add and remove disks online, without any downtime, using various degrees of parallelism to increase speed or to lower CPU usage during the rebalancing operations.

RAW space vs usable space

As there is no RAID controller in your ODA, what you see from the system, and more precisely from the ASM instance, is the RAW space available. For example, on an ODA X8-2M with 4 disks, the RAW capacity is 25.6TB. This is the free space you would see on this kind of ODA if there were no databases configured on it. This is not a problem as long as you understand that you don't really have these 25.6TB. There is also a usable space notion. One could think it is the space available once redundancy has been taken into account, but it's not exactly that. It can actually be quite different depending on your ODA.

Real world example

For my example, I will use an ODA X8-2M with 4 disks running on 19.6. Redundancy has been set to NORMAL, and the DATA/RECO ratio to 90/10. Several databases are running on this ODA. Regarding the spec sheet of this server, the ODA X8-2M comes with 2x 6.4TB disks as standard, and you can add up to 5 expansions, each expansion being a bundle of 2x 6.4TB disks. RAW capacity therefore starts at 12.8TB and goes up to 76.8TB. As you probably know, a 6.4TB disk doesn't have 6.4TB of real usable capacity, so don't expect to store more than 5.8TB on each disk. But this is not related to the ODA: it's been years that disk manufacturers write optimistic sizes on their disks.

I'm using the V$ASM_DISKGROUP dynamic view from the +ASM1 instance to check total and free space.

desc v$asm_diskgroup
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 GROUP_NUMBER                                       NUMBER
 NAME                                               VARCHAR2(30)
 SECTOR_SIZE                                        NUMBER
 LOGICAL_SECTOR_SIZE                                NUMBER
 BLOCK_SIZE                                         NUMBER
 ALLOCATION_UNIT_SIZE                               NUMBER
 STATE                                              VARCHAR2(11)
 TYPE                                               VARCHAR2(6)
 TOTAL_MB                                           NUMBER
 FREE_MB                                            NUMBER
 HOT_USED_MB                                        NUMBER
 COLD_USED_MB                                       NUMBER
 REQUIRED_MIRROR_FREE_MB                            NUMBER
 USABLE_FILE_MB                                     NUMBER
 OFFLINE_DISKS                                      NUMBER
 COMPATIBILITY                                      VARCHAR2(60)
 DATABASE_COMPATIBILITY                             VARCHAR2(60)
 VOTING_FILES                                       VARCHAR2(1)
 CON_ID                                             NUMBER

One can guess that real diskgroup free % is normally FREE_MB/TOTAL_MB:

SQL> set lines 200
SQL> select GROUP_NUMBER, NAME, TOTAL_MB, FREE_MB, USABLE_FILE_MB, TYPE from v$asm_diskgroup;

GROUP_NUMBER NAME                             TOTAL_MB    FREE_MB USABLE_FILE_MB TYPE
------------ ------------------------------ ---------- ---------- -------------- ------
           1 DATA                             21977088    9851876        2178802 NORMAL
           2 RECO                              2441216    1421004         405350 NORMAL


Select round(9851876/21977088*100,1) "% Free"  from dual;

    % Free
----------
      44.8


Free space is more than 44% on my ODA. Not bad.

And when I use USABLE_FILE_MB to get another metric for the same thing:

SQL> set lines 200
SQL> select GROUP_NUMBER, NAME, TOTAL_MB, FREE_MB, USABLE_FILE_MB, TYPE from v$asm_diskgroup;

GROUP_NUMBER NAME                             TOTAL_MB    FREE_MB USABLE_FILE_MB TYPE
------------ ------------------------------ ---------- ---------- -------------- ------
           1 DATA                             21977088    9851876        2178802 NORMAL
           2 RECO                              2441216    1421004         405350 NORMAL


Select round(2178801/21977088*100,1) "% Free"  from dual;

    % Free
----------
       9.9

This is bad. According to this metric, I have less than 10% free in that diskgroup. I’m getting anxious… I thought I was fine but I’m now critical?

What is really USABLE_FILE_MB?

When you look into the documentation, it’s quite clear:

  • USABLE_FILE_MB is free MB according to the diskgroup redundancy. Out of the 9'851'876 MB free, only half, i.e. 4'925'938 MB of data, can actually be stored in that diskgroup. This is for NORMAL redundancy (each block exists on 2 different disks), which is consistent with what has been said before.
  • USABLE_FILE_MB also takes into account the ability to lose one disk while still guaranteeing redundancy. On this ODA with 4 disks, 1/4 of the total disk space should therefore not be considered as available, unlike on a RAID system (where the loss of one disk is not visible to the system). Out of a total of 21'977'088 MB, only 16'482'816 MB should be considered usable for DATA.
  • Finally, USABLE_FILE_MB is the combination of these 2 facts. For NORMAL redundancy, the formula is USABLE_FILE_MB = (FREE_MB – TOTAL_MB/nb_disks) / 2 = (9'851'876 MB – 5'494'272 MB) / 2 = 2'178'802 MB

Let’s take another example to be sure. This time it’s an ODA X8-2M with 6 disks in NORMAL redundancy. Let’s do the math:

SQL> set lines 200
SQL> select GROUP_NUMBER, NAME, TOTAL_MB, FREE_MB, USABLE_FILE_MB, TYPE from v$asm_diskgroup;

GROUP_NUMBER NAME                             TOTAL_MB    FREE_MB USABLE_FILE_MB TYPE
------------ ------------------------------ ---------- ---------- -------------- ------
           1 DATA                             32965632   25028756        9767242 NORMAL
           2 RECO                              3661824    2549852         969774 NORMAL

select round((25028756 - 32965632/6)/2,1) "DATA_USABLE_FILE_MB" from v$asm_diskgroup where name='DATA';

DATA_USABLE_FILE_MB
-------------------
            9767242

The formula is correct.

Should I use USABLE_FILE_MB for monitoring?

That's a good question. Using USABLE_FILE_MB for monitoring means considering the worst case. Using FREE_MB/TOTAL_MB means considering the best case. Using FREE_MB seems recommended, but with lower thresholds than for a normal filesystem: a WARNING should be triggered when 65/70% usage is reached, and a CRITICAL when 80/85% is reached. For 2 reasons: because the volume will fill up 2 times faster than it appears through a RAID system (3 times faster with HIGH redundancy), and because when your disks are nearly full, the only way to extend the volume is to buy new disks from Oracle (if you have not already reached the limit).
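
A minimal monitoring sketch along those lines (my own example, not part of the original post; it assumes it runs as the grid owner with the +ASM1 environment set, and uses the FREE_MB/TOTAL_MB ratio with the thresholds mentioned above):

#!/bin/bash
# Report diskgroup usage based on FREE_MB/TOTAL_MB: warning at 70%, critical at 85%.
sqlplus -s / as sysasm <<'EOF' |
set pages 0 feedback off
select name, round((total_mb - free_mb) / total_mb * 100) from v$asm_diskgroup;
EOF
while read -r dg used_pct; do
  [ -z "$dg" ] && continue
  if   [ "$used_pct" -ge 85 ]; then echo "CRITICAL: diskgroup $dg is ${used_pct}% used"
  elif [ "$used_pct" -ge 70 ]; then echo "WARNING : diskgroup $dg is ${used_pct}% used"
  else                              echo "OK      : diskgroup $dg is ${used_pct}% used"
  fi
done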

Remember that the real resilience guarantee for an ODA is not having enough space in the diskgroups to survive losing one disk, but having a functional Data Guard configuration. That's why I never configure HIGH redundancy on an ODA: it's a waste of disk space and it does not provide much higher failure tolerance (I still have "only" 2 power supplies and 2 network interfaces).

To make it crystal clear, let's compare again with a RAID system. Imagine you have a RAID1 system with 4x 6TB disks. These 4 disks have a RAW capacity of 24TB, but only 12TB are usable. If you lose one disk, 12TB are still usable, but you've lost redundancy for half of the data. With ASM in NORMAL redundancy, you see a total of 24TB, but only 12TB is really available for your databases. And if you look at USABLE_FILE_MB, you will find that only 9TB is usable, because then your redundancy can survive a disk crash. RAID is simply not able to do that.

Furthermore, you could do the same with RAID1, but it means that you would need 5 disks instead of 4, the fifth one being a spare disk to rebuild redundancy in case one of the four disks fails.

Should I use storage even if USABLE_FILE_MB is 0?

Yes, you can. But you have to know that if you lose a disk, redundancy cannot be guaranteed anymore, just as on a RAID system. You can also see negative values in USABLE_FILE_MB.

And what about the number of disks?

For sure, the more disks you have, the less space you "lose" from the USABLE_FILE_MB point of view. An ODA with 3 or 4 disks in NORMAL redundancy is definitely not very comfortable, but starting from 6 disks, USABLE_FILE_MB becomes much more comfortable.

On a 2-disk ODA with NORMAL redundancy, there is no way of keeping redundancy after losing a disk. That's quite obvious. X8-2S and X8-2M with the base disk configuration are not that nice for this reason.

The number of disks is not only a matter of the storage size you need, it also brings an increased level of security for your databases. The more disks you have, the more disk failures you can survive while keeping redundancy (provided the disks do not fail simultaneously, of course).

Conclusion

ODA storage is commonly misunderstood because it does not use classic RAID. ASM is very powerful and more secure than a RAID system. Don't hesitate to order more disks than needed on your ODAs. Yes, it's expensive, but it is a good investment for the next 5 years. And it's usually cheaper to order additional disks with the ODA than to order them later.

This article Disk usage on ODA – Free MB and usable MB appeared first on Blog dbi services.

Oracle Restart 19c and ACFS


In this blog we are going to install Oracle Restart 19c with ASM Filter Driver (AFD). I am using the following configuration:
-Oracle 19c
-Oracle Linux Server 7.6
-Kernel 4.14.35-1902.2.0.el7uek.x86_64

Once Oracle Restart is installed, I will create an ACFS filesystem.

We will have to configure the following disks for AFD:
-/dev/sdc
-/dev/sdd

The downloaded Oracle software was unzipped into the GRID_HOME:

[oracle@oraadserver u01]$ unzip -d /u01/app/19.0.0.0/grid LINUX.X64_193000_grid_home.zip

After setting the ORACLE_HOME and ORACLE_BASE variables, we use the command asmcmd afd_label as the root user to provision the disk devices for use with Oracle ASM Filter Driver.

[root@oraadserver dev]# export ORACLE_HOME=/u01/app/19.0.0.0/grid
[root@oraadserver dev]# export ORACLE_BASE=/tmp
[root@oraadserver dev]# /u01/app/19.0.0.0/grid/bin/asmcmd afd_label DATA /dev/sdc --init
[root@oraadserver dev]# /u01/app/19.0.0.0/grid/bin/asmcmd afd_label CRS /dev/sdd --init

We can verify the status of the disks with the command afd_lslbl:

[root@oraadserver ~]# /u01/app/19.0.0.0/grid/bin/asmcmd afd_lslbl /dev/sdc
--------------------------------------------------------------------------------
Label                     Duplicate  Path
================================================================================
DATA                                  /dev/sdc
[root@oraadserver ~]# /u01/app/19.0.0.0/grid/bin/asmcmd afd_lslbl /dev/sdd
--------------------------------------------------------------------------------
Label                     Duplicate  Path
================================================================================
CRS                                   /dev/sdd
[root@oraadserver ~]#

In my case, with kernel 4.14.35-1902.2.0.el7uek.x86_64, I had to apply the following patch:
Patch 27494830: BUILD UEK5U2 COMPATIBLE ACFS GRID KERNEL MODULES
This avoids the following error during the installation:

Action - To proceed, do not specify or select the Oracle ASM Filter Driver option.  Additional Information:
 - AFD-620: AFD is not supported on this operating system version: '4.14.35-1902.2.0.el7uek.x86_64' 

The patch was applied before the installation, as described in the following documentation:
How to Apply a Grid Infrastructure Patch Before Grid Infrastructure Configuration (before root.sh or rootupgrade.sh or gridsetup.bat) is Executed (Doc ID 1410202.1)
The OPatch version was verified first:

[oracle@oraadserver u01]$ /u01/app/19.0.0.0/grid/OPatch/opatch version
OPatch Version: 12.2.0.1.17

OPatch succeeded.
[oracle@oraadserver u01]$

The patch was then unpacked:

[oracle@oraadserver u01]$ unzip p27494830_193000ACFSRU_Linux-x86-64.zip

And then the patch was applied:

[oracle@oraadserver u01]$ /u01/app/19.0.0.0/grid/gridSetup.sh -silent -applyRU /u01/27494830/
Preparing the home to patch...
Applying the patch /u01/27494830/...
Successfully applied the patch.
The log can be found at: /tmp/GridSetupActions2020-10-20_09-32-54PM/installerPatchActions_2020-10-20_09-32-54PM.log

And now we launch the installation. Note that only some screenshots are shown; for the screens not shown, the defaults were kept.

[oracle@oraadserver grid]$ pwd
/u01/app/19.0.0.0/grid
[oracle@oraadserver grid]$ ./gridSetup.sh

For the ASM diskgroups configuration

The groups of my installation

I got some warnings that I decided to ignore (this is just a test environment)



The two root scripts are then executed:

[root@oraadserver oraInventory]# /u01/app/oraInventory/orainstRoot.sh
Changing permissions of /u01/app/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.

Changing groupname of /u01/app/oraInventory to oinstall.
The execution of the script is complete.
[root@oraadserver oraInventory]#
[root@oraadserver oraInventory]# /u01/app/19.0.0.0/grid/root.sh
Performing root user operation.

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /u01/app/19.0.0.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.


Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/19.0.0.0/grid/crs/install/crsconfig_params
The log of current session can be found at:
  /u01/app/oracle/crsdata/oraadserver/crsconfig/roothas_2020-10-20_09-58-51PM.log
2020/10/20 21:58:55 CLSRSC-363: User ignored prerequisites during installation
LOCAL ADD MODE
Creating OCR keys for user 'oracle', privgrp 'oinstall'..
Operation successful.
LOCAL ONLY MODE
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4664: Node oraadserver successfully pinned.
2020/10/20 22:00:43 CLSRSC-330: Adding Clusterware entries to file 'oracle-ohasd.service'

oraadserver     2020/10/20 22:03:14     /u01/app/oracle/crsdata/oraadserver/olr/backup_20201020_220314.olr     2451816761
2020/10/20 22:03:49 CLSRSC-327: Successfully configured Oracle Restart for a standalone server
[root@oraadserver oraInventory]#


After the installation we can validate that ASM is up:

[root@oraadserver oraInventory]# crsctl check has
CRS-4638: Oracle High Availability Services is online
[root@oraadserver oraInventory]# ps -ef | grep pmon
root     10153  1730  0 22:10 pts/0    00:00:00 grep --color=auto pmon
oracle   26037     1  0 22:05 ?        00:00:00 asm_pmon_+ASM
[root@oraadserver oraInventory]#

Now, using asmca, let's configure an ACFS filesystem. Basically we:
-Create a CRS diskgroup
-Create a volume in the CRS diskgroup
-Create an ACFS filesystem in this volume and mount it

[oracle@oraadserver ~]$ which asmca
/u01/app/19.0.0.0/grid/bin/asmca
[oracle@oraadserver ~]$ asmca

The ACFS filesystem is now mounted

[root@oraadserver ~]# df -h /share_acfs/
Filesystem           Size  Used Avail Use% Mounted on
/dev/asm/acfs_vol-3  5.0G  319M  4.7G   7% /share_acfs
[root@oraadserver ~]#

Now let's reboot the system and verify whether the ACFS filesystem is mounted.

[root@oraadserver ~]# df -h
Filesystem           Size  Used Avail Use% Mounted on
devtmpfs             1.8G  8.0K  1.8G   1% /dev
tmpfs                1.8G  637M  1.2G  35% /dev/shm
tmpfs                1.8G  8.8M  1.8G   1% /run
tmpfs                1.8G     0  1.8G   0% /sys/fs/cgroup
/dev/mapper/ol-root   47G   14G   34G  29% /
/dev/sda1            497M  230M  268M  47% /boot
tmpfs                368M     0  368M   0% /run/user/54323
tmpfs                368M     0  368M   0% /run/user/0

It is not. When trying to mount it manually, I got this error:

[root@oraadserver ~]# /bin/mount -t acfs /dev/asm/acfs_vol-3 /share_acfs
mount.acfs: CLSU-00107: operating system function: open64; failed with error data: 2; at location: OOF_1
mount.acfs: CLSU-00101: operating system error message: No such file or directory
mount.acfs: CLSU-00104: additional error information: open64 (/dev/ofsctl)
mount.acfs: ACFS-00502: Failed to communicate with the ACFS driver.  Verify the ACFS driver has been loaded.
[root@oraadserver ~]#

I then loaded the ACFS drivers:

[root@oraadserver ~]#  /u01/app/19.0.0.0/grid/bin/acfsload start -s
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9314: Removing previous ADVM/ACFS installation.
ACFS-9315: Previous ADVM/ACFS components successfully removed.
ACFS-9294: updating file /etc/sysconfig/oracledrivers.conf
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9294: updating file /etc/sysconfig/oracledrivers.conf
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.
[root@oraadserver ~]#

When checking the volume, I saw it was disabled:

ASMCMD> volinfo --all
Diskgroup Name: CRS

         Volume Name: ACFS_VOL
         Volume Device: /dev/asm/acfs_vol-3
         State: DISABLED
         Size (MB): 5120
         Resize Unit (MB): 64
         Redundancy: UNPROT
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath: /share_acfs

So I enabled it:

ASMCMD> volenable --all
ASMCMD> volinfo --all
Diskgroup Name: CRS

         Volume Name: ACFS_VOL
         Volume Device: /dev/asm/acfs_vol-3
         State: ENABLED
         Size (MB): 5120
         Resize Unit (MB): 64
         Redundancy: UNPROT
         Stripe Columns: 8
         Stripe Width (K): 1024
         Usage: ACFS
         Mountpath: /share_acfs

ASMCMD>

And then I ran the mount command:

[root@oraadserver trace]# /bin/mount -t acfs /dev/asm/acfs_vol-3 /share_acfs

[root@oraadserver trace]# df -h /share_acfs/
Filesystem           Size  Used Avail Use% Mounted on
/dev/asm/acfs_vol-3  5.0G  319M  4.7G   7% /share_acfs
[root@oraadserver trace]#

It seemed strange to have to manually mount the ACFS filesystem after a system reboot, but it is mentioned in the Oracle documentation:
Starting with Oracle Database 12c, Oracle Restart configurations do not support the Oracle ACFS registry.
This means that the Oracle ACFS file system resource is supported only for Oracle Grid Infrastructure cluster configurations; it is not supported for Oracle Restart configurations. When using ACFS with Oracle Restart, we have to manually mount and umount the ACFS filesystem.

Conclusion

When using ACFS with Oracle Restart, we have to manually mount and umount the ACFS filesystem when the system stops or starts. This can be done with some customized scripts, as sketched below.
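
Here is a minimal sketch of such a script (my own example, not from the Oracle documentation). Paths, volume name and mount point match the environment used in this blog; a real script should also wait until the ASM instance is up before running the asmcmd and mount commands.

#!/bin/bash
# Load the ACFS drivers, enable the ADVM volume and mount the ACFS filesystem.
GRID_HOME=/u01/app/19.0.0.0/grid
$GRID_HOME/bin/acfsload start -s
su - oracle -c "ORACLE_SID=+ASM ORACLE_HOME=$GRID_HOME $GRID_HOME/bin/asmcmd volenable -G CRS ACFS_VOL"
/bin/mount -t acfs /dev/asm/acfs_vol-3 /share_acfs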

This article Oracle Restart 19c and ACFS appeared first on Blog dbi services.

ODA 19.9 software release is out


Introduction

Six months after 19.6, the first 19c production release, here comes the 19.9 Oracle Database Appliance patch with major improvements.

Will my ODA support this 19.9 patch?

First, the 19.9 release is the same for all ODAs, as usual. Like for 19.6, the oldest ODA compatible with this release is the X5-2. Don't expect to install 19c on older models: the X4-2 is stuck at 18.8, the X3-2 is stuck at 18.5 and the V1 is stuck at 12.2. If you are still using these models, please consider an upgrade to X8-2 and 19c to get back to supported hardware and software.

What are the new features?

For sure, 19.9 includes the latest patches for all database homes, including those versions no longer covered by Premier Support (the provided patches are the very latest, dated October 20, 2020).

It’s not a new feature, as odacli has supported Data Guard since 19.8, which was a great improvement over previous versions. Expect 19.9 to be more reliable regarding this feature.

The most important new feature is dedicated to those willing to use virtualization. The OLVM/KVM stack has now replaced OVM/Xen. It means the implicit death of oakcli and the end of having to manage 2 appliance tools for ODA, odacli finally becoming the main and only tool for appliance management, associated with the GUI if you need it. This OLVM/KVM virtualization stack comes, of course, with hard partitioning for your ODAs, useful if you have Enterprise Edition licenses: the cores not enabled for databases are available for running application VMs. And now it also works with ODA X8-2S and X8-2M: virtualization was previously limited to HA ODAs. This can be a game changer, as ODA lite, especially the X8-2M, has plenty of resources for purposes other than databases.

If you plan to use virtualization, you no longer have to deploy a specific ISO image for virtualized mode. All ODAs are now deployed as Bare Metal, and virtualization runs on top of this Bare Metal deployment. Your databases will continue to run on Bare Metal, combining the advantages of both previous deployment types (fully virtualized, or Bare Metal without any VM).

What is also interesting is the ability to dedicate additional Bare Metal CPU pools to other DB homes for better database isolation. For example, you can imagine having a 2-core pool for test databases and a 4-core pool for production databases. CPU pools also come in a VM CPU pool flavour, for dedicating cores to groups of VMs on your ODA; a sketch follows below.
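
For illustration only, and with the caveat that the exact options (including whether a pool-type flag such as -bm is required) must be verified with odacli create-cpupool -h on your release, the pool names and core counts below being made up:

# assumed syntax - verify the exact options with "odacli create-cpupool -h"
[root@oda ~]# odacli create-cpupool -n pool_test -c 2 -bm
[root@oda ~]# odacli create-cpupool -n pool_prod -c 4 -bm
[root@oda ~]# odacli list-cpupools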

Finally, the SE-HA high availability feature for SE2 (the one that replaced RAC, which was recently removed from Standard Edition) also seems to be supported by odacli.

Still able to run older databases with 19.9?

Yes, 19.9 will let you run all database versions starting from 11.2.0.4. However, it’s highly recommended to migrate to 19c, as it’s the only version with long-term support available now. Deploying 19.9 and planning to migrate your databases in the coming months is definitely a brilliant idea. With ODA you can easily migrate your databases with odacli move-database: this move to another home running 19c will update your database to 19c accordingly (see the sketch below).
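
As a hedged sketch (the identifiers are placeholders, and the options should be checked with odacli move-database -h), the move could look like:

[root@oda ~]# odacli list-databases        # note the id of the database to move
[root@oda ~]# odacli list-dbhomes          # note the id of the destination 19c home
# assumed options - verify with "odacli move-database -h"
[root@oda ~]# odacli move-database -i <database_id> -dh <destination_dbhome_id>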

Is it possible to upgrade to 19.9 from my current release?

You will need to be already running a 19.x release, starting from 19.5, to apply this patch. If your ODA is running 18.8, you will have to patch to 19.6 before applying 19.9. If your ODA is running 18.7 or an older 18.x release, an upgrade to 18.8 will be needed before patching to 19.6. If you are using older versions, it’s highly recommended to reimage your ODA: it will be easier than applying 3+ patches, and you’ll benefit from a brand new and clean ODA. Patching is still a lot of work, and if you didn’t patch regularly, it can be tough to bring your ODA to the latest version. Reimaging is a lot of work too, but its success is guaranteed.

If you are using a virtualized ODA with OVM/Xen, you will not be able to patch. A complete reimaging is needed. But it’s worth it.

Conclusion

19.9 is a major release for customers using ODAs. Apart from the increased maturity of the included database releases, 19c among them, you will benefit from virtualization even on lite ODA models. And virtualization that keeps the databases on Bare Metal is a great solution.

This article ODA 19.9 software release is out appeared first on the dbi services blog.

Oracle 19c Grid Infrastructure for a cluster


In this previous blog, we dealt with Oracle Restart and ACFS. We saw that using ACFS with Oracle Restart requires some manual tasks. The solution is to install Grid Infrastructure for a cluster, even if we are only using one node.

I am using the same configuration
-Oracle 19c
-Oracle Linux Server 7.6
-Kernel 4.14.35-1902.2.0.el7uek.x86_64

As we are going to install a GI for a cluster, we will need

-Public IP: 192.168.2.21
-Virtual IP: 192.168.2.23
-SCAN: 192.168.2.22
-Interconnect: 10.14.163.67

The installation was done as in the previous blog; we will only show screenshots of the differences.

-The disks were configured to use AFD (see this blog )
-The patch 27494830 was applied (see this blog )

During my first try, I got the following errors when executing the root.sh script:

[root@oraadserver network-scripts]# /u01/app/19.0.0.0/grid/root.sh
Performing root user operation.

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /u01/app/19.0.0.0/grid

…
…
CRS-2672: Attempting to start 'ora.gpnpd' on 'oraadserver'
CRS-2676: Start of 'ora.gpnpd' on 'oraadserver' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'oraadserver'
CRS-2674: Start of 'ora.gipcd' on 'oraadserver' failed
CRS-2673: Attempting to stop 'ora.gpnpd' on 'oraadserver'
CRS-2677: Stop of 'ora.gpnpd' on 'oraadserver' succeeded
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'oraadserver'
CRS-2677: Stop of 'ora.cssdmonitor' on 'oraadserver' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'oraadserver'
CRS-2677: Stop of 'ora.mdnsd' on 'oraadserver' succeeded
CRS-2673: Attempting to stop 'ora.evmd' on 'oraadserver'
CRS-2677: Stop of 'ora.evmd' on 'oraadserver' succeeded
CRS-2673: Attempting to stop 'ora.driver.afd' on 'oraadserver'
CRS-2677: Stop of 'ora.driver.afd' on 'oraadserver' succeeded
CRS-4000: Command Start failed, or completed with errors.
2020/10/23 09:54:11 CLSRSC-119: Start of the exclusive mode cluster failed
Died at /u01/app/19.0.0.0/grid/crs/install/crsinstall.pm line 2439.
[root@oraadserver network-scripts]#

According to the following document, it seems there is a bug that is normally fixed starting with 18.8; in my case with Oracle 19c, it was not. As a workaround, I chose a cluster name shorter than 15 characters:
CLSRSC-119: Start of the exclusive mode cluster failed While Running root.sh While Installing Grid Infrastructure 19c (Doc ID 2568395.1)

The root.sh script was then executed successfully:

Performing root user operation.

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /u01/app/19.0.0.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Relinking oracle with rac_on option
Using configuration parameter file: /u01/app/19.0.0.0/grid/crs/install/crsconfig_params
The log of current session can be found at:
  /u01/app/oracle/crsdata/oraadserver/crsconfig/rootcrs_oraadserver_2020-10-23_02-43-14PM.log
2020/10/23 14:43:19 CLSRSC-594: Executing installation step 1 of 19: 'SetupTFA'.
2020/10/23 14:43:19 CLSRSC-594: Executing installation step 2 of 19: 'ValidateEnv'.
2020/10/23 14:43:19 CLSRSC-363: User ignored prerequisites during installation
2020/10/23 14:43:19 CLSRSC-594: Executing installation step 3 of 19: 'CheckFirstNode'.
2020/10/23 14:43:21 CLSRSC-594: Executing installation step 4 of 19: 'GenSiteGUIDs'.
2020/10/23 14:43:21 CLSRSC-594: Executing installation step 5 of 19: 'SetupOSD'.
2020/10/23 14:43:21 CLSRSC-594: Executing installation step 6 of 19: 'CheckCRSConfig'.
2020/10/23 14:43:22 CLSRSC-594: Executing installation step 7 of 19: 'SetupLocalGPNP'.
2020/10/23 14:43:41 CLSRSC-594: Executing installation step 8 of 19: 'CreateRootCert'.
2020/10/23 14:43:46 CLSRSC-594: Executing installation step 9 of 19: 'ConfigOLR'.
2020/10/23 14:43:52 CLSRSC-594: Executing installation step 10 of 19: 'ConfigCHMOS'.
2020/10/23 14:43:52 CLSRSC-594: Executing installation step 11 of 19: 'CreateOHASD'.
2020/10/23 14:43:55 CLSRSC-594: Executing installation step 12 of 19: 'ConfigOHASD'.
2020/10/23 14:43:56 CLSRSC-330: Adding Clusterware entries to file 'oracle-ohasd.service'
2020/10/23 14:44:03 CLSRSC-4002: Successfully installed Oracle Trace File Analyzer (TFA) Collector.
2020/10/23 14:45:25 CLSRSC-594: Executing installation step 13 of 19: 'InstallAFD'.
2020/10/23 14:46:46 CLSRSC-594: Executing installation step 14 of 19: 'InstallACFS'.
2020/10/23 14:48:03 CLSRSC-594: Executing installation step 15 of 19: 'InstallKA'.
2020/10/23 14:48:06 CLSRSC-594: Executing installation step 16 of 19: 'InitConfig'.

[INFO] [DBT-30161] Disk label(s) created successfully. Check /u01/app/oracle/cfgtoollogs/asmca/asmca-201023PM024840.log for details.


2020/10/23 14:49:55 CLSRSC-482: Running command: '/u01/app/19.0.0.0/grid/bin/ocrconfig -upgrade oracle oinstall'
CRS-4256: Updating the profile
Successful addition of voting disk aecbe50ed1474f05bfbb0fd8192e5358.
Successfully replaced voting disk group with +CRS_DSKGP.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   aecbe50ed1474f05bfbb0fd8192e5358 (AFD:CRS_DSK) [CRS_DSKGP]
Located 1 voting disk(s).
2020/10/23 14:51:03 CLSRSC-594: Executing installation step 17 of 19: 'StartCluster'.
2020/10/23 14:52:36 CLSRSC-343: Successfully started Oracle Clusterware stack
2020/10/23 14:52:36 CLSRSC-594: Executing installation step 18 of 19: 'ConfigNode'.
2020/10/23 14:57:38 CLSRSC-594: Executing installation step 19 of 19: 'PostConfig'.
2020/10/23 15:00:11 CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded
[root@oraadserver oracle]#

The Oracle cluster verification failed because some prerequisites were ignored

At the end of the installation we can check the status of the cluster

[oracle@oraadserver ~]$ crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA_DSKGP.VOL_ACFS.advm
               ONLINE  ONLINE       oraadserver              STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       oraadserver              STABLE
ora.chad
               ONLINE  ONLINE       oraadserver              STABLE
ora.data_dskgp.vol_acfs.acfs
               ONLINE  ONLINE       oraadserver              mounted on /share_ac
                                                             fs,STABLE
ora.net1.network
               ONLINE  ONLINE       oraadserver              STABLE
ora.ons
               ONLINE  ONLINE       oraadserver              STABLE
ora.proxy_advm
               ONLINE  ONLINE       oraadserver              STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       oraadserver              STABLE
      2        ONLINE  OFFLINE                               STABLE
      3        ONLINE  OFFLINE                               STABLE
ora.CRS_DSKGP.dg(ora.asmgroup)
      1        ONLINE  ONLINE       oraadserver              STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.DATA_DSKGP.dg(ora.asmgroup)
      1        ONLINE  ONLINE       oraadserver              STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  OFFLINE                               STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  ONLINE       oraadserver              Started,STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       oraadserver              STABLE
      2        OFFLINE OFFLINE                               STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       oraadserver              STABLE
ora.oraadserver.vip
      1        ONLINE  ONLINE       oraadserver              STABLE
ora.qosmserver
      1        ONLINE  ONLINE       oraadserver              STABLE
ora.scan1.vip
      1        ONLINE  OFFLINE                               STABLE
--------------------------------------------------------------------------------
[oracle@oraadserver ~]$

Once the installation is done, we are able to create an ACFS filesystem which will be automatically mounted at server startup. See here for the creation of ACFS; a rough sketch follows below.
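
As a rough sketch of what this looks like once the cluster stack is there (sizes are examples, the volume device name below is hypothetical, and the srvctl option names may vary slightly between releases):

ASMCMD> volcreate -G DATA_DSKGP -s 10G VOL_ACFS
ASMCMD> volinfo -G DATA_DSKGP VOL_ACFS

[root@oraadserver ~]# /sbin/mkfs -t acfs /dev/asm/vol_acfs-123
[root@oraadserver ~]# srvctl add filesystem -device /dev/asm/vol_acfs-123 -path /share_acfs -user oracle
[root@oraadserver ~]# srvctl start filesystem -device /dev/asm/vol_acfs-123

Once registered this way, the filesystem appears as an ora.*.acfs resource (like ora.data_dskgp.vol_acfs.acfs in the output above) and is mounted automatically by the clusterware at startup.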

Conclusion

To use ACFS with automatic mounting, a Grid Infrastructure for a cluster must be configured (starting with Oracle 12). I guess this is the reason why we have a SCAN and an interconnect configured on non-HA ODAs, even if this SCAN is disabled on ODAs.

This article Oracle 19c Grid Infrastructure for a cluster appeared first on the dbi services blog.

NoSQL and SQL: key-value access always scale


By Franck Pachot

I have written about some NoSQL myths in previous posts (here and here) and I got some feedback from people mentioning that the test case was on relatively small data. This is true. In order to understand how it works, we need to explain and trace the execution, and that is easier on a small test case. Once the algorithm is understood, it is easy to infer how it scales. Then, if readers want to test it on huge data, they can. This may require a lot of cloud credits, and I usually don’t feel the need to do this test for a blog post, especially when I include all the code to reproduce it on a larger scale.

But this approach may be biased by the fact that I’ve been working a lot with RDBMS, where we have all the tools to understand how things work. When you look at the execution plan, you know the algorithm and can extrapolate the numbers to larger tables. When you look at the wait events, you know on which resource they can scale with higher concurrency. But times change; NoSQL databases, especially the ones managed by the cloud providers, provide only a simple API with limited execution statistics. Because that’s the goal: simple usage. And this is why people prefer to look at real-scale executions when talking about performance. And, as it is not easy to run a real-scale Proof of Concept, they look at the well-known big data users like Google, Facebook, Amazon…

I was preparing my DOAG presentation “#KnowSQL: Where are we with SQL, NoSQL, NewSQL in 2020?”, where I’ll mention a 4TB table I’ve seen at a customer. The table was far too big (30 million extents!) because it was not managed for years (a purge job failing, not being monitored, and the outsourcing company adding datafiles for years without trying to understand why). But the application was still working well, with happy users, because they use key-value access, and this has always been scalable. Here is a common misconception: NoSQL databases didn’t invent new techniques to store data. Hash partitioning and indexes are the way to scale this, and they have existed in RDBMS for a long time. What NoSQL did was provide easy access to this limited API, and restrict data access to this simple API in order to guarantee predictable performance.

By coincidence, with my presentation in mind, I had access to an Oracle Exadata that was not yet used, and I took the opportunity to create a similar table containing billions of items:


SQL> info+ BIG

TABLE: BIG
         LAST ANALYZED:2020-10-29 22:32:06.0
         ROWS         :3049226754
         SAMPLE SIZE  :3049226754
         INMEMORY     :
         COMMENTS     :

Columns

NAME         DATA TYPE      NULL  DEFAULT    LOW_VALUE   HIGH_VALUE   NUM_DISTINCT   HISTOGRAM
------------ -------------- ----- ---------- ----------- ------------ -------------- ---------
*K           RAW(16 BYTE)   No                                            3049226754     HYBRID
 V           BLOB           Yes                                           0              NONE

Indexes
INDEX_NAME             UNIQUENESS   STATUS   FUNCIDX_STATUS   COLUMNS
---------------------- ------------ -------- ---------------- ------
FRANCK.SYS_C00205258   UNIQUE       N/A                       K

Just two columns: one RAW(16) to store the key as a UUID and one BLOB to store any document value. Exactly like a key-value document store today, and similar to the table I’ve seen at my customer. Well, at this customer, this was designed in the past century, with LONG RAW instead of BLOB, but this would make no sense today. And this table was not partitioned because they didn’t expect this size. In my test I did what we should do today for this key-value use case: partition by HASH:


create table BIG ( K RAW(16) primary key using index local, V BLOB ) tablespace users
LOB (V) store as securefile (enable storage in row nocache compress high)
partition by hash (K) partitions 100000 parallel 20

It is probably not useful to have 100000 partitions for a few-terabyte table, but then this table is ready for a lot more data. And in Oracle, 100000 partitions is far from the limit, which is 1 million partitions. Note that this is a lab: I am not recommending creating 100000 partitions if you don’t need to. I’m just saying that it is easy to create a multi-terabyte table with the performance of really small tables when it is accessed with the partitioning key.

So, here is the size:


14:58:58 SQL> select segment_type,segment_name,dbms_xplan.format_size(sum(bytes)) "SIZE",count(*)
from dba_segments where owner='FRANCK'
group by grouping sets ((),(segment_type),(segment_type,segment_name))
;

SEGMENT_TYPE                   SEGMENT_NAME                   SIZE         COUNT(*)
------------------------------ ------------------------------ ---------- ----------
LOB PARTITION                  SECURFILE                      781G           100000
LOB PARTITION                                                 781G           100000
INDEX PARTITION                SYS_C00205258                  270G           100000
INDEX PARTITION                SYS_IL0000826625C00002$$       6250M          100000
INDEX PARTITION                                               276G           200000
TABLE PARTITION                BIG                            7691G          100000
TABLE PARTITION                                               7691G          100000
                                                              8749G          400000
8 rows selected.

There’s 8.5 TB in total here. The table, named “BIG”, has 100000 partitions for a total of 7691 GB. The primary key index, “SYS_C00205258”, is 270 GB as it contains the key (so 16 bytes, plus the ROWID to address the table, per entry). It is a local index, with the same HASH partitioning as the table. Documents that are larger than the table block can be stored in the LOB partitions. Here I have mostly small documents, which are stored in the table.

I inserted the rows quickly with a bulk load which I didn’t really tune or monitor. But here is an excerpt from the AWR report while the insert was running:



Plan Statistics                                          DB/Inst: EXA19C/EXA19C1  Snaps: 16288-16289
-> % Snap Total shows the % of the statistic for the SQL statement compared to the instance total

Stat Name                                Statement   Per Execution % Snap
---------------------------------------- ---------- -------------- -------
Elapsed Time (ms)                        3.2218E+07   16,108,804.2    99.4
CPU Time (ms)                            2.1931E+07   10,965,611.8    99.4
Executions                                        2            1.0     0.0
Buffer Gets                              6.5240E+08  326,201,777.5   115.6
Disk Reads                               4.0795E+07   20,397,517.5   100.2
Parse Calls                                      48           24.0     0.0
Rows                                     5.1202E+08  256,008,970.0     N/A
User I/O Wait Time (ms)                   2,024,535    1,012,267.6    96.9
Cluster Wait Time (ms)                    4,293,684    2,146,841.9    99.9
Application Wait Time (ms)                      517          258.5    20.6
Concurrency Wait Time (ms)                4,940,260    2,470,130.0    96.0
          -------------------------------------------------------------

This is about 16 key-value pairs ingested per millisecond (256,008,970.0/16,108,804.2). And it can go further, as I have 15% buffer contention that I can easily get rid of if I take care of the index definition.

After running this for a few days, I have nearly 5 billion rows here:

SQL> select count(*) from BIG;

  COUNT(*)
----------
4967207817

Elapsed: 00:57:05.949

The full scan to get the exact count lasted one hour here because I ran it without parallel query (the equivalent of map-reduce), so the count was done on one CPU only. Anyway, if counting the rows were a use case, I would create a materialized view to aggregate some metrics, along the lines of the sketch below.
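
A minimal sketch of such a materialized view (the name is made up, and refresh scheduling or query-rewrite settings are left aside):

SQL> create materialized view BIG_METRICS build immediate refresh complete on demand
     as select count(*) items from BIG;

SQL> select items from BIG_METRICS;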

Out of curiosity, I ran the same with parallel query: 6 minutes to count the 5 billion documents with 20 parallel processes.

My goal is to test reads. In order to have predictable results, I flush the buffer cache:


5:10:51 SQL> alter system flush buffer_cache;

System FLUSH altered.

Elapsed: 00:00:04.817

Of course, in real life, there’s a good chance that all the index branches stay in memory.


15:11:11 SQL> select * from BIG where K=hextoraw('B23375823AD741B3E0532900000A7499');

K                                V
-------------------------------- --------------------------------------------------------------------------------
B23375823AD741B3E0532900000A7499 4E2447354728705E776178525C7541354640695C577D2F2C3F45686264226640657C3E5D2453216A

Elapsed: 00:00:00.011

“Elapsed” is the elapsed time in seconds: here, 11 milliseconds. NoSQL databases advertise their “single-digit millisecond” latency and that’s right, because “No SQL” provides a very simple API (key-value access). Any database, NoSQL or RDBMS, can be optimized for this key-value access. An index on the key ensures O(log N) scalability and, when you can hash-partition it, you can keep this cost constant when data grows, which is then O(1).
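
To put rough numbers on this, using the figures above (a sketch, assuming the hash spreads the keys evenly across the partitions): 5 billion rows / 100,000 partitions ≈ 50,000 entries per local index partition. A B*Tree over a few tens of thousands of 16-byte keys needs only a root block and one leaf level, which matches the 2 index buffer reads in the execution plan below. And if data keeps growing, re-hashing to more partitions (or letting a managed service split them) keeps each local index at that size, which is the O(1) behaviour just mentioned.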

In order to understand not only the time, but also how it scales with more data or high throughput, I look at the execution plan:



15:11:27 SQL> select * from dbms_xplan.display_cursor(format=>'allstats last');

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  gqcazx39y5jnt, child number 21
--------------------------------------
select * from BIG where K=hextoraw('B23375823AD741B3E0532900000A7499')

Plan hash value: 2410449747

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |               |      1 |        |      1 |00:00:00.01 |       3 |     3 |
|   1 |  PARTITION HASH SINGLE             |               |      1 |      1 |      1 |00:00:00.01 |       3 |     3 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| BIG           |      1 |      1 |      1 |00:00:00.01 |       3 |     3 |
|*  3 |    INDEX UNIQUE SCAN               | SYS_C00205258 |      1 |      1 |      1 |00:00:00.01 |       2 |     2 |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("K"=HEXTORAW('B23375823AD741B3E0532900000A7499'))

I’ve read only 3 “Buffers” here. Thanks to the partitioning (PARTITION HASH SINGLE), each local index is small, with a root branch and a leaf block: 2 buffers read. This B*Tree index (INDEX UNIQUE SCAN) returns the physical address in the table (TABLE ACCESS BY LOCAL INDEX ROWID) in order to get the additional column.

Finally, I insert one row:


SQL> set timing on autotrace on
SQL> insert into BIG values(hextoraw('1D15EA5E8BADF00D8BADF00DFF'),utl_raw.cast_to_raw(dbms_random.string('p',1024)));

1 row created.

Elapsed: 00:00:00.04

This takes 4 milliseconds

The autotrace shows what is behind:


Execution Plan
----------------------------------------------------------

---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |      |     1 |  1177 |     1   (0)| 00:00:01 |
|   1 |  LOAD TABLE CONVENTIONAL | BIG  |       |       |            |          |
---------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          8  db block gets
          1  consistent gets
          7  physical reads
       1864  redo size
        872  bytes sent via SQL*Net to client
       1048  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

There are a few blocks to maintain (db block gets) when adding a new entry into a B*Tree index, especially when there are some blocks to split to allocate more space in the tree. In RDBMS you should categorize the data ingestion into:

  • high throughput for big data, like metrics and logs from IoT, at the bulk-insert rate I used to fill in the table
  • fast response time to put one or a few items, and this is milliseconds, scaling thanks to local index partitioning

I’m talking about the roots of NoSQL here: providing the simplest key-value access in order to scale. But the most advanced NoSQL managed services went further, pushing the data ingest performance with LSM (log-structured merge) indexes rather than B*Tree in-place index maintenance. They have also implemented many features to autonomously maintain the partitions at their best for storage, performance and high availability. This presentation explains a few in the context of AWS DynamoDB:

With DynamoDB you can’t get the execution plan, but you can ask for the ConsumedCapacity to be returned with the result. This helps to validate your understanding of the data access even without running on a huge volume with expensive provisioned capacity. This is what I did in https://blog.dbi-services.com/rdbms-scales-the-algorithm/, measuring the linear increase of RCU on a small 2-million-item table, which is sufficient to extrapolate to larger data sets; a minimal example follows below. Key-value access always scales in this way: response time remains constant when data grows. And it can remain constant when the number of users grows as well, by splitting partitions across more storage.
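
For reference, a minimal sketch of requesting it with the AWS CLI (the table name and key reuse the example above and are not a real DynamoDB table):

aws dynamodb get-item \
  --table-name BIG \
  --key '{"K": {"S": "B23375823AD741B3E0532900000A7499"}}' \
  --return-consumed-capacity TOTAL

The response then contains a ConsumedCapacity section with the capacity units used by the call.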

This article NoSQL and SQL: key-value access always scale appeared first on the dbi services blog.


Patching Oracle Database Appliance to 19.9


ODA 19.9 was released for Bare Metal just yesterday, and I already had the opportunity to patch a customer’s production ODA to this latest version. Through this blog I want to share my experience of patching an ODA to 19.9, as well as a new and tricky skip-orachk option.

Patching requirement

To patch a Bare Metal ODA to version 19.9 (patch 31922078), we need to be on version 19.5, 19.6, 19.7 or 19.8. This is described in the ODA documentation.

First of all, we need to ensure we have enough space on the /, /u01 and /opt file systems. At least 20 GB should be available. If not, we can do some cleaning or extend the LVM partitions (see the sketch after the output below).

[root@ODA01 /]# df -h / /u01 /opt
Filesystem                          Size  Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolRoot   30G  9.5G   19G  34% /
/dev/mapper/VolGroupSys-LogVolU01    99G   55G   40G  59% /u01
/dev/mapper/VolGroupSys-LogVolOpt    75G   43G   29G  60% /opt
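
If one of them is short on space, a hedged sketch of extending a logical volume (assuming ext4, which recent ODA images use for these filesystems, and assuming free extents remain in the volume group; check with df -T and vgs first):

[root@ODA01 /]# lvextend -L +10G /dev/mapper/VolGroupSys-LogVolOpt
[root@ODA01 /]# resize2fs /dev/mapper/VolGroupSys-LogVolOpt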


Then we check that no hardware failure exists on the ODA. This can be done through the ILOM GUI or using an SSH connection to the ILOM:

-> show /SP/faultmgmt

 /SP/faultmgmt
    Targets:
        shell

    Properties:

    Commands:
        cd
        show

-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n)? y

faultmgmtsp> fmadm faulty
No faults found


The recommendation is to use the odabr tool and perform a snapshot backup:

[root@ODA01 /]# /opt/odabr/odabr backup -snap
INFO: 2020-11-04 16:30:42: Please check the logfile '/opt/odabr/out/log/odabr_37159.log' for more details


--------------------------------------------------------
odabr - ODA node Backup Restore
Author: Ruggero Citton 
RAC Pack, Cloud Innovation and Solution Engineering Team
Copyright Oracle, Inc. 2013, 2019
Version: 2.0.1-47
--------------------------------------------------------

INFO: 2020-11-04 16:30:42: Checking superuser
INFO: 2020-11-04 16:30:42: Checking Bare Metal
INFO: 2020-11-04 16:30:42: Removing existing LVM snapshots
WARNING: 2020-11-04 16:30:42: LVM snapshot for 'opt' does not exist
WARNING: 2020-11-04 16:30:42: LVM snapshot for 'u01' does not exist
WARNING: 2020-11-04 16:30:42: LVM snapshot for 'root' does not exist
INFO: 2020-11-04 16:30:42: Checking LVM size
INFO: 2020-11-04 16:30:42: Doing a snapshot backup only
INFO: 2020-11-04 16:30:42: Boot device backup
INFO: 2020-11-04 16:30:42: ...getting boot device
INFO: 2020-11-04 16:30:42: ...making boot device backup
INFO: 2020-11-04 16:30:44: ...boot device backup saved as '/opt/odabr/out/hbi/boot.img'
INFO: 2020-11-04 16:30:44: Getting EFI device
INFO: 2020-11-04 16:30:44: ...making efi device backup
INFO: 2020-11-04 16:30:46: EFI device backup saved as '/opt/odabr/out/hbi/efi.img'
INFO: 2020-11-04 16:30:46: OCR backup
INFO: 2020-11-04 16:30:47: ...ocr backup saved as '/opt/odabr/out/hbi/ocrbackup_37159.bck'
INFO: 2020-11-04 16:30:47: Making LVM snapshot backup
SUCCESS: 2020-11-04 16:30:49: ...snapshot backup for 'opt' created successfully
SUCCESS: 2020-11-04 16:30:49: ...snapshot backup for 'u01' created successfully
SUCCESS: 2020-11-04 16:30:49: ...snapshot backup for 'root' created successfully
SUCCESS: 2020-11-04 16:30:49: LVM snapshots backup done successfully

[root@ODA01 /]# /opt/odabr/odabr infosnap

--------------------------------------------------------
odabr - ODA node Backup Restore
Author: Ruggero Citton 
RAC Pack, Cloud Innovation and Solution Engineering Team
Copyright Oracle, Inc. 2013, 2019
Version: 2.0.1-47
--------------------------------------------------------


LVM snap name         Status                COW Size              Data%
-------------         ----------            ----------            ------
root_snap             active                30.00 GiB             0.01%
opt_snap              active                60.00 GiB             0.01%
u01_snap              active                100.00 GiB            0.01%


We can also run orachk, excluding the RDBMS checks:

[root@ODA01 /]# cd /opt/oracle/dcs/oracle.ahf/orachk

[root@ODA01 orachk]# ./orachk -nordbms

.  .  .  .  .  .

Either Cluster Verification Utility pack (cvupack) does not exist at /opt/oracle/dcs/oracle.ahf/common/cvu or it is an old or invalid cvupack

Checking Cluster Verification Utility (CVU) version at CRS Home - /u01/app/19.0.0.0/grid

This version of Cluster Verification Utility (CVU) was released on 10-Mar-2020 and it is older than 180 days. It is highly recommended that you download the latest version of CVU from MOS patch 30166242 to ensure the highest level of accuracy of the data contained within the report

Do you want to download latest version of Cluster Verification Utility (CVU) from my oracle support? [y/n] [y] n

Running older version of Cluster Verification Utility (CVU) from CRS Home - /u01/app/19.0.0.0/grid

Checking Status of Oracle Software Stack - Clusterware, ASM, RDBMS on oda01

.
.  .  . . . .  .  .  .
-------------------------------------------------------------------------------------------------------
                                                 Oracle Stack Status
-------------------------------------------------------------------------------------------------------
  Host Name       CRS Installed  RDBMS Installed    CRS UP    ASM UP  RDBMS UP    DB Instance Name
-------------------------------------------------------------------------------------------------------
oda01               Yes           No          Yes      Yes       No
-------------------------------------------------------------------------------------------------------
.
.  .  .  .  .  .


.
.
.
.

.



*** Checking Best Practice Recommendations ( Pass / Warning / Fail ) ***


Collections and audit checks log file is
/opt/oracle/dcs/oracle.ahf/data/oda01/orachk/orachk_oda01_110420_163217/log/orachk.log

============================================================
           Node name - oda01
============================================================

 Collecting - ASM Disk Group for Infrastructure Software and Configuration
 Collecting - ASM Diskgroup Attributes
 Collecting - ASM initialization parameters
 Collecting - Disk I/O Scheduler on Linux
 Collecting - Interconnect network card speed
 Collecting - Kernel parameters
 Collecting - Maximum number of semaphore sets on system
 Collecting - Maximum number of semaphores on system
 Collecting - Maximum number of semaphores per semaphore set
 Collecting - OS Packages
 Collecting - Patches for Grid Infrastructure
 Collecting - number of semaphore operations per semop system call
 Collecting - CRS user limits configuration
 Collecting - Database Server Infrastructure Software and Configuration
 Collecting - umask setting for GI owner

Data collections completed. Checking best practices on oda01.
------------------------------------------------------------

 INFO =>     Oracle Database Appliance Best Practice References
 INFO =>     Oracle Data Pump Best practices.
 INFO =>     Important Storage Minimum Requirements for Grid & Database Homes
 WARNING =>  soft or hard memlock are not configured according to recommendation
 INFO =>     CSS disktimeout is not set to the default value
 WARNING =>  OCR is not being backed up daily
 INFO =>     CSS misscount is not set to the default value of 30
 INFO =>     Jumbo frames (MTU >= 9000) are not configured for interconnect
 INFO =>     Information about hanganalyze and systemstate dump
 WARNING =>  One or more diskgroups from v$asm_diskgroups are not registered in clusterware registry
Best Practice checking completed. Checking recommended patches on oda01
--------------------------------------------------------------------------------
Collecting patch inventory on CRS_HOME /u01/app/19.0.0.0/grid
Collecting patch inventory on ASM_HOME /u01/app/19.0.0.0/grid

------------------------------------------------------------
                      CLUSTERWIDE CHECKS
------------------------------------------------------------

------------------------------------------------------------
Detailed report (html) -  /opt/oracle/dcs/oracle.ahf/data/oda01/orachk/orachk_oda01_110420_163217/orachk_oda01_110420_163217.html

UPLOAD [if required] - /opt/oracle/dcs/oracle.ahf/data/oda01/orachk/orachk_oda01_110420_163217.zip

Then we need to ensure we have a good backup of the open databases running on the ODA. If we are patching an ODA with high availability in place (Data Guard for Enterprise Edition or Dbvisit for Standard Edition), we make sure a switchover has been run so that only standby databases are running on the ODA, and in that case we also stop the databases’ synchronization (a sketch follows below).
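
A hedged sketch with the Data Guard broker, DB_ODA01 being the copy hosted on the ODA to be patched and DB_DR its counterpart on the other site (both placeholders; on 19.8 and later, odacli also provides Data Guard commands such as switchover-dataguard):

DGMGRL> switchover to 'DB_DR';
DGMGRL> edit database 'DB_ODA01' set state='APPLY-OFF';

The first command leaves only a standby database on the ODA being patched; the second pauses redo apply so that this standby is not synchronizing during the patching.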

Patching the ODA to 19.9

Once these requirements are met, we can start the patching.

We first need to unzip the downloaded patch files. The patch 31922078 files can be downloaded from the My Oracle Support portal.

[root@ODA01 orachk]# cd /u01/app/patch/

[root@ODA01 patch]# ls -ltrh
total 16G
-rw-r--r-- 1 root root 6.7G Nov  4 14:11 p31922078_199000_Linux-x86-64_2of2.zip
-rw-r--r-- 1 root root 9.2G Nov  4 15:17 p31922078_199000_Linux-x86-64_1of2.zip

[root@ODA01 patch]# unzip p31922078_199000_Linux-x86-64_1of2.zip
Archive:  p31922078_199000_Linux-x86-64_1of2.zip
 extracting: oda-sm-19.9.0.0.0-201023-server1of2.zip
  inflating: README.txt
  
[root@ODA01 patch]# unzip p31922078_199000_Linux-x86-64_2of2.zip
Archive:  p31922078_199000_Linux-x86-64_2of2.zip
 extracting: oda-sm-19.9.0.0.0-201023-server2of2.zip

[root@ODA01 patch]# ls -ltrh
total 32G
-rw-r--r-- 1 root root 9.2G Oct 29 04:51 oda-sm-19.9.0.0.0-201023-server1of2.zip
-rw-r--r-- 1 root root 6.7G Oct 29 04:53 oda-sm-19.9.0.0.0-201023-server2of2.zip
-rw-r--r-- 1 root root  190 Oct 29 06:17 README.txt
-rw-r--r-- 1 root root 6.7G Nov  4 14:11 p31922078_199000_Linux-x86-64_2of2.zip
-rw-r--r-- 1 root root 9.2G Nov  4 15:17 p31922078_199000_Linux-x86-64_1of2.zip

[root@ODA01 patch]# rm -f p31922078_199000_Linux-x86-64_2of2.zip

[root@ODA01 patch]# rm -f p31922078_199000_Linux-x86-64_1of2.zip


We can then update the ODA repository with the patch files:

[root@ODA01 patch]# odacli update-repository -f /u01/app/patch/oda-sm-19.9.0.0.0-201023-server1of2.zip
{
  "jobId" : "0c23cb4e-2455-4ad2-832b-168edce2f40c",
  "status" : "Created",
  "message" : "/u01/app/patch/oda-sm-19.9.0.0.0-201023-server1of2.zip",
  "reports" : [ ],
  "createTimestamp" : "November 04, 2020 16:55:52 PM CET",
  "resourceList" : [ ],
  "description" : "Repository Update",
  "updatedTime" : "November 04, 2020 16:55:52 PM CET"
}

[root@ODA01 patch]# odacli describe-job -i "0c23cb4e-2455-4ad2-832b-168edce2f40c"

Job details
----------------------------------------------------------------
                     ID:  0c23cb4e-2455-4ad2-832b-168edce2f40c
            Description:  Repository Update
                 Status:  Success
                Created:  November 4, 2020 4:55:52 PM CET
                Message:  /u01/app/patch/oda-sm-19.9.0.0.0-201023-server1of2.zip

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------

[root@ODA01 patch]# odacli update-repository -f /u01/app/patch/oda-sm-19.9.0.0.0-201023-server2of2.zip
{
  "jobId" : "04ecd45d-6b92-475c-acd9-202f0137474f",
  "status" : "Created",
  "message" : "/u01/app/patch/oda-sm-19.9.0.0.0-201023-server2of2.zip",
  "reports" : [ ],
  "createTimestamp" : "November 04, 2020 16:58:05 PM CET",
  "resourceList" : [ ],
  "description" : "Repository Update",
  "updatedTime" : "November 04, 2020 16:58:05 PM CET"
}

[root@ODA01 patch]# odacli describe-job -i "04ecd45d-6b92-475c-acd9-202f0137474f"

Job details
----------------------------------------------------------------
                     ID:  04ecd45d-6b92-475c-acd9-202f0137474f
            Description:  Repository Update
                 Status:  Success
                Created:  November 4, 2020 4:58:05 PM CET
                Message:  /u01/app/patch/oda-sm-19.9.0.0.0-201023-server2of2.zip

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------

[root@ODA01 patch]# odacli list-jobs | head -n 3;  odacli list-jobs | tail -n 3

ID                                       Description                                                                 Created                             Status
---------------------------------------- --------------------------------------------------------------------------- ----------------------------------- ----------
0c23cb4e-2455-4ad2-832b-168edce2f40c     Repository Update                                                           November 4, 2020 4:55:52 PM CET     Success
04ecd45d-6b92-475c-acd9-202f0137474f     Repository Update                                                           November 4, 2020 4:58:05 PM CET     Success


We can already clean up the patch folder, as the files are not needed anymore:

[root@ODA01 patch]# ls -ltrh
total 16G
-rw-r--r-- 1 root root 9.2G Oct 29 04:51 oda-sm-19.9.0.0.0-201023-server1of2.zip
-rw-r--r-- 1 root root 6.7G Oct 29 04:53 oda-sm-19.9.0.0.0-201023-server2of2.zip
-rw-r--r-- 1 root root  190 Oct 29 06:17 README.txt

[root@ODA01 patch]# rm -f *.zip

[root@ODA01 patch]# rm -f README.txt


We will check the current version and the new available version:

[root@ODA01 patch]# odacli describe-component
System Version
---------------
19.6.0.0.0

Component                                Installed Version    Available Version
---------------------------------------- -------------------- --------------------
OAK                                       19.6.0.0.0            19.9.0.0.0
GI                                        19.6.0.0.200114       19.9.0.0.201020
DB                                        18.7.0.0.190716       18.12.0.0.201020
DCSAGENT                                  19.6.0.0.0            19.9.0.0.0
ILOM                                      4.0.4.51.r133528      5.0.1.21.r136383
BIOS                                      52021000              52030400
OS                                        7.7                   7.8
FIRMWARECONTROLLER                        VDV1RL02              VDV1RL04
FIRMWAREDISK                              1102                  1132
HMP                                       2.4.5.0.1             2.4.7.0.1


I usually stop the databases at this point. It is not mandatory, but I personally prefer it. This can be achieved by stopping each database with the srvctl stop database command, or with the srvctl stop home command to stop all databases from the same RDBMS home, as sketched below.
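
A minimal sketch (the database name, home path, state file and node name are placeholders):

[oracle@ODA01 ~]$ srvctl stop database -db DBTEST
[oracle@ODA01 ~]$ srvctl stop home -oraclehome /u01/app/oracle/product/18.0.0.0/dbhome_1 -statefile /tmp/dbhome_1.state -node oda01

The state file records which databases were running from that home, so that srvctl start home can restart exactly those after the patching.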

Now we can update the dcs-agent:

[root@ODA01 patch]# /opt/oracle/dcs/bin/odacli update-dcsagent -v 19.9.0.0.0
{
  "jobId" : "fa6c5e53-b0b7-470e-b856-ccf19a0305ef",
  "status" : "Created",
  "message" : "Dcs agent will be restarted after the update. Please wait for 2-3 mins before executing the other commands",
  "reports" : [ ],
  "createTimestamp" : "November 04, 2020 17:02:53 PM CET",
  "resourceList" : [ ],
  "description" : "DcsAgent patching",
  "updatedTime" : "November 04, 2020 17:02:53 PM CET"
}

[root@ODA01 patch]# odacli describe-job -i "fa6c5e53-b0b7-470e-b856-ccf19a0305ef"

Job details
----------------------------------------------------------------
                     ID:  fa6c5e53-b0b7-470e-b856-ccf19a0305ef
            Description:  DcsAgent patching
                 Status:  Success
                Created:  November 4, 2020 5:02:53 PM CET
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
dcs-agent upgrade  to version 19.9.0.0.0 November 4, 2020 5:02:53 PM CET     November 4, 2020 5:04:28 PM CET     Success
Update System version                    November 4, 2020 5:04:28 PM CET     November 4, 2020 5:04:28 PM CET     Success


We will now update the DCS admin:

[root@ODA01 patch]# /opt/oracle/dcs/bin/odacli update-dcsadmin -v 19.9.0.0.0
{
  "jobId" : "bdcbda55-d325-44ca-8bed-f0b15eeacfae",
  "status" : "Created",
  "message" : null,
  "reports" : [ ],
  "createTimestamp" : "November 04, 2020 17:04:57 PM CET",
  "resourceList" : [ ],
  "description" : "DcsAdmin patching",
  "updatedTime" : "November 04, 2020 17:04:57 PM CET"
}

[root@ODA01 patch]# odacli describe-job -i "bdcbda55-d325-44ca-8bed-f0b15eeacfae"

Job details
----------------------------------------------------------------
                     ID:  bdcbda55-d325-44ca-8bed-f0b15eeacfae
            Description:  DcsAdmin patching
                 Status:  Success
                Created:  November 4, 2020 5:04:57 PM CET
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Patch location validation                November 4, 2020 5:04:58 PM CET     November 4, 2020 5:04:58 PM CET     Success
dcs-admin upgrade                        November 4, 2020 5:04:58 PM CET     November 4, 2020 5:05:04 PM CET     Success


We will update the DCS components:

[root@ODA01 patch]# /opt/oracle/dcs/bin/odacli update-dcscomponents -v 19.9.0.0.0
{
  "jobId" : "4782c035-86fd-496b-b9f1-1055d77071b3",
  "status" : "Success",
  "message" : null,
  "reports" : null,
  "createTimestamp" : "November 04, 2020 17:05:48 PM CET",
  "description" : "Job completed and is not part of Agent job list",
  "updatedTime" : "November 04, 2020 17:05:48 PM CET"
}


We will run the prepatch report:

[root@ODA01 patch]# /opt/oracle/dcs/bin/odacli create-prepatchreport -s -v 19.9.0.0.0

Job details
----------------------------------------------------------------
                     ID:  d836f326-aba3-44e6-9be4-aaa031b5d730
            Description:  Patch pre-checks for [OS, ILOM, GI, ORACHKSERVER]
                 Status:  Created
                Created:  November 4, 2020 5:07:37 PM CET
                Message:  Use 'odacli describe-prepatchreport -i d836f326-aba3-44e6-9be4-aaa031b5d730' to check details of results

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------

And we will check the report:

[root@ODA01 patch]# odacli describe-prepatchreport -i d836f326-aba3-44e6-9be4-aaa031b5d730

Patch pre-check report
------------------------------------------------------------------------
                 Job ID:  d836f326-aba3-44e6-9be4-aaa031b5d730
            Description:  Patch pre-checks for [OS, ILOM, GI, ORACHKSERVER]
                 Status:  FAILED
                Created:  November 4, 2020 5:07:37 PM CET
                 Result:  One or more pre-checks failed for [ORACHK]

Node Name
---------------
ODA01

Pre-Check                      Status   Comments
------------------------------ -------- --------------------------------------
__OS__
Validate supported versions     Success   Validated minimum supported versions.
Validate patching tag           Success   Validated patching tag: 19.9.0.0.0.
Is patch location available     Success   Patch location is available.
Verify OS patch                 Success   Verified OS patch
Validate command execution      Success   Validated command execution

__ILOM__
Validate supported versions     Success   Validated minimum supported versions.
Validate patching tag           Success   Validated patching tag: 19.9.0.0.0.
Is patch location available     Success   Patch location is available.
Checking Ilom patch Version     Success   Successfully verified the versions
Patch location validation       Success   Successfully validated location
Validate command execution      Success   Validated command execution

__GI__
Validate supported GI versions  Success   Validated minimum supported versions.
Validate available space        Success   Validated free space under /u01
Is clusterware running          Success   Clusterware is running
Validate patching tag           Success   Validated patching tag: 19.9.0.0.0.
Is system provisioned           Success   Verified system is provisioned
Validate ASM in online          Success   ASM is online
Validate minimum agent version  Success   GI patching enabled in current
                                          DCSAGENT version
Validate GI patch metadata      Success   Validated patching tag: 19.9.0.0.0.
Validate clones location exist  Success   Validated clones location
Is patch location available     Success   Patch location is available.
Patch location validation       Success   Successfully validated location
Patch verification              Success   Patches 31771877 not applied on GI
                                          home /u01/app/19.0.0.0/grid on node
                                          ODA01
Validate Opatch update          Success   Successfully updated the opatch in
                                          GiHome /u01/app/19.0.0.0/grid on node
                                          ODA01
Patch conflict check            Success   No patch conflicts found on GiHome
                                          /u01/app/19.0.0.0/grid on node
                                          ODA01
Validate command execution      Success   Validated command execution

__ORACHK__
Running orachk                  Failed    Orachk validation failed: .
Validate command execution      Success   Validated command execution
Software home                   Failed    Software home check failed


The prepatch report failed on orachk and on the software home check. In the HTML report from orachk I could see that the software home check failed due to missing files:

FAIL => Software home check failed
 
Error Message:
File "/u01/app/19.0.0.0/grid/jdk/jre/lib/amd64/libjavafx_font_t2k.so" could not be verified on node "oda01". OS error: "No such file or directory"
Error Message:
File "/u01/app/19.0.0.0/grid/jdk/jre/lib/amd64/libkcms.so" could not be verified on node "oda01". OS error: "No such file or directory"
Error Message:
File "/u01/app/19.0.0.0/grid/rdbms/lib/ksms.o" could not be verified on node "oda01". OS error: "No such file or directory"


These files are expected by the orachk check, as referenced in the XML files:

[root@ODA01 ~]# grep ksms /u01/app/19.0.0.0/grid/cv/cvdata/ora_software_cfg.xml
         <File Path="rdbms/lib/" Name="ksms.o" Permissions="644"/>
         <File Path="bin/" Name="genksms" Permissions="755"/>
         <File Path="rdbms/lib/" Name="genksms.o"/>
         <File Path="rdbms/lib/" Name="ksms.o" Permissions="644"/>
         <File Path="bin/" Name="genksms" Permissions="755"/>
         <File Path="rdbms/lib/" Name="genksms.o"/>
         <File Path="rdbms/lib/" Name="ksms.o" Permissions="644"/>
         <File Path="bin/" Name="genksms" Permissions="755"/>
         <File Path="rdbms/lib/" Name="genksms.o"/>
		 
[root@ODA01 ~]# grep ksms /u01/app/19.0.0.0/grid/cv/cvdata/19/ora_software_cfg.xml
         <File Path="rdbms/lib/" Name="ksms.o" Permissions="644"/>
         <File Path="rdbms/lib/" Name="genksms.o"/>
         <File Path="bin/" Name="genksms" Permissions="755"/>
         <File Path="rdbms/lib/" Name="ksms.o" Permissions="644"/>
         <File Path="rdbms/lib/" Name="genksms.o"/>
         <File Path="bin/" Name="genksms" Permissions="755"/>
         <File Path="rdbms/lib/" Name="ksms.o" Permissions="644"/>
         <File Path="rdbms/lib/" Name="genksms.o"/>
         <File Path="bin/" Name="genksms" Permissions="755"/>


I found the following MOS note that could be related to the same problem: File "$GRID_HOME/rdbms/lib/ksms.o" could not be verified on node (Doc ID 1908505.1).

As per this note, we can ignore this error and move forward. I then decided to proceed with the server patching.

I patched the server:

[root@ODA01 patch]# /opt/oracle/dcs/bin/odacli update-server -v 19.9.0.0.0
{
  "jobId" : "78f3ea84-4e31-4e1f-b195-eb4e75429102",
  "status" : "Created",
  "message" : "Success of server update will trigger reboot of the node after 4-5 minutes. Please wait until the node reboots.",
  "reports" : [ ],
  "createTimestamp" : "November 04, 2020 17:17:57 PM CET",
  "resourceList" : [ ],
  "description" : "Server Patching",
  "updatedTime" : "November 04, 2020 17:17:57 PM CET"
}


But the patching failed immediately, as orachk was not successful due to the problem described above:

[root@ODA01 patch]# odacli describe-job -i "78f3ea84-4e31-4e1f-b195-eb4e75429102"

Job details
----------------------------------------------------------------
                     ID:  78f3ea84-4e31-4e1f-b195-eb4e75429102
            Description:  Server Patching
                 Status:  Failure
                Created:  November 4, 2020 5:17:57 PM CET
                Message:  DCS-10702:Orachk validation failed: Please run describe-prepatchreport 78f3ea84-4e31-4e1f-b195-eb4e75429102 to see details.

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Server patching                          November 4, 2020 5:18:05 PM CET     November 4, 2020 5:22:05 PM CET     Failure
Orachk Server Patching                   November 4, 2020 5:18:05 PM CET     November 4, 2020 5:22:05 PM CET     Failure


So, starting with 19.9, it seems that orachk is mandatory before any patching, and if orachk is not successful, the patching will fail.

Fortunately, there is a new skip-orachk option to skip the orachk validation during server patching:

[root@ODA01 patch]# /opt/oracle/dcs/bin/odacli update-server -v 19.9.0.0.0 -h
Usage: update-server [options]
  Options:
    --component, -c
      The component that is requested for update. The supported components
      include: OS
    --force, -f
      Ignore precheck error and force patching
    --help, -h
      get help
    --json, -j
      json output
    --local, -l
      Update Server Components Locally
    --node, -n
      Node to be updated
    --precheck, -p
      Obsolete flag
    --skip-orachk, -sko
      Option to skip orachk validations
    --version, -v
      Version to be updated


I could then successfully patch the server:

[root@ODA01 patch]# /opt/oracle/dcs/bin/odacli update-server -v 19.9.0.0.0 -sko
{
  "jobId" : "878fac12-a2a0-4302-955c-7df3d4fdd517",
  "status" : "Created",
  "message" : "Success of server update will trigger reboot of the node after 4-5 minutes. Please wait until the node reboots.",
  "reports" : [ ],
  "createTimestamp" : "November 04, 2020 18:03:15 PM CET",
  "resourceList" : [ ],
  "description" : "Server Patching",
  "updatedTime" : "November 04, 2020 18:03:15 PM CET"
}
[root@ODA01 ~]# uptime
 19:06:00 up 2 min,  1 user,  load average: 2.58, 1.32, 0.52
 
[root@ODA01 ~]# odacli describe-job -i "878fac12-a2a0-4302-955c-7df3d4fdd517"

Job details
----------------------------------------------------------------
                     ID:  878fac12-a2a0-4302-955c-7df3d4fdd517
            Description:  Server Patching
                 Status:  Success
                Created:  November 4, 2020 6:03:15 PM CET
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Patch location validation                November 4, 2020 6:03:23 PM CET     November 4, 2020 6:03:23 PM CET     Success
dcs-controller upgrade                   November 4, 2020 6:03:23 PM CET     November 4, 2020 6:03:28 PM CET     Success
Patch location validation                November 4, 2020 6:03:30 PM CET     November 4, 2020 6:03:30 PM CET     Success
dcs-cli upgrade                          November 4, 2020 6:03:30 PM CET     November 4, 2020 6:03:30 PM CET     Success
Creating repositories using yum          November 4, 2020 6:03:30 PM CET     November 4, 2020 6:03:33 PM CET     Success
Updating YumPluginVersionLock rpm        November 4, 2020 6:03:33 PM CET     November 4, 2020 6:03:33 PM CET     Success
Applying OS Patches                      November 4, 2020 6:03:33 PM CET     November 4, 2020 6:13:18 PM CET     Success
Creating repositories using yum          November 4, 2020 6:13:18 PM CET     November 4, 2020 6:13:18 PM CET     Success
Applying HMP Patches                     November 4, 2020 6:13:18 PM CET     November 4, 2020 6:13:38 PM CET     Success
Client root Set up                       November 4, 2020 6:13:38 PM CET     November 4, 2020 6:13:41 PM CET     Success
Client grid Set up                       November 4, 2020 6:13:41 PM CET     November 4, 2020 6:13:46 PM CET     Success
Patch location validation                November 4, 2020 6:13:46 PM CET     November 4, 2020 6:13:46 PM CET     Success
oda-hw-mgmt upgrade                      November 4, 2020 6:13:46 PM CET     November 4, 2020 6:14:17 PM CET     Success
OSS Patching                             November 4, 2020 6:14:17 PM CET     November 4, 2020 6:14:18 PM CET     Success
Applying Firmware Disk Patches           November 4, 2020 6:14:18 PM CET     November 4, 2020 6:14:21 PM CET     Success
Applying Firmware Controller Patches     November 4, 2020 6:14:21 PM CET     November 4, 2020 6:14:24 PM CET     Success
Checking Ilom patch Version              November 4, 2020 6:14:25 PM CET     November 4, 2020 6:14:27 PM CET     Success
Patch location validation                November 4, 2020 6:14:27 PM CET     November 4, 2020 6:14:28 PM CET     Success
Save password in Wallet                  November 4, 2020 6:14:29 PM CET     November 4, 2020 6:14:30 PM CET     Success
Apply Ilom patch                         November 4, 2020 6:14:30 PM CET     November 4, 2020 6:22:34 PM CET     Success
Copying Flash Bios to Temp location      November 4, 2020 6:22:34 PM CET     November 4, 2020 6:22:34 PM CET     Success
Starting the clusterware                 November 4, 2020 6:22:35 PM CET     November 4, 2020 6:23:58 PM CET     Success
clusterware patch verification           November 4, 2020 6:23:58 PM CET     November 4, 2020 6:24:01 PM CET     Success
Patch location validation                November 4, 2020 6:24:01 PM CET     November 4, 2020 6:24:01 PM CET     Success
Opatch update                            November 4, 2020 6:24:43 PM CET     November 4, 2020 6:24:46 PM CET     Success
Patch conflict check                     November 4, 2020 6:24:46 PM CET     November 4, 2020 6:25:31 PM CET     Success
clusterware upgrade                      November 4, 2020 6:25:52 PM CET     November 4, 2020 6:50:57 PM CET     Success
Updating GiHome version                  November 4, 2020 6:50:57 PM CET     November 4, 2020 6:51:12 PM CET     Success
Update System version                    November 4, 2020 6:51:16 PM CET     November 4, 2020 6:51:16 PM CET     Success
Cleanup JRE Home                         November 4, 2020 6:51:16 PM CET     November 4, 2020 6:51:16 PM CET     Success
preRebootNode Actions                    November 4, 2020 6:51:16 PM CET     November 4, 2020 6:51:57 PM CET     Success
Reboot Ilom                              November 4, 2020 6:51:57 PM CET     November 4, 2020 6:51:57 PM CET     Success


I could then check the newly installed versions:

[root@ODA01 ~]# odacli describe-component
System Version
---------------
19.9.0.0.0

Component                                Installed Version    Available Version
---------------------------------------- -------------------- --------------------
OAK                                       19.9.0.0.0            up-to-date
GI                                        19.9.0.0.201020       up-to-date
DB                                        18.7.0.0.190716       18.12.0.0.201020
DCSAGENT                                  19.9.0.0.0            up-to-date
ILOM                                      5.0.1.21.r136383      up-to-date
BIOS                                      52030400              up-to-date
OS                                        7.8                   up-to-date
FIRMWARECONTROLLER                        VDV1RL02              VDV1RL04
FIRMWAREDISK                              1102                  1132
HMP                                       2.4.7.0.1             up-to-date


I then patched the storage:

[root@ODA01 ~]# odacli update-storage -v 19.9.0.0.0
{
  "jobId" : "61871e3d-088b-43af-8b91-94dc4fa1331a",
  "status" : "Created",
  "message" : "Success of Storage Update may trigger reboot of node after 4-5 minutes. Please wait till node restart",
  "reports" : [ ],
  "createTimestamp" : "November 04, 2020 19:07:17 PM CET",
  "resourceList" : [ ],
  "description" : "Storage Firmware Patching",
  "updatedTime" : "November 04, 2020 19:07:17 PM CET"
}

[root@ODA01 ~]# odacli describe-job -i "61871e3d-088b-43af-8b91-94dc4fa1331a"

Job details
----------------------------------------------------------------
                     ID:  61871e3d-088b-43af-8b91-94dc4fa1331a
            Description:  Storage Firmware Patching
                 Status:  Success
                Created:  November 4, 2020 7:07:17 PM CET
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Applying Firmware Disk Patches           November 4, 2020 7:07:20 PM CET     November 4, 2020 7:07:21 PM CET     Success
preRebootNode Actions                    November 4, 2020 7:07:21 PM CET     November 4, 2020 7:07:21 PM CET     Success
Reboot Ilom                              November 4, 2020 7:07:21 PM CET     November 4, 2020 7:07:21 PM CET     Success


Surprisingly, the storage patching completed immediately and without a reboot.

I checked the versions and could see that the storage was indeed still running the old firmware versions:

[root@ODA01 ~]# odacli describe-component
System Version
---------------
19.9.0.0.0

Component                                Installed Version    Available Version
---------------------------------------- -------------------- --------------------
OAK                                       19.9.0.0.0            up-to-date
GI                                        19.9.0.0.201020       up-to-date
DB                                        18.7.0.0.190716       18.12.0.0.201020
DCSAGENT                                  19.9.0.0.0            up-to-date
ILOM                                      5.0.1.21.r136383      up-to-date
BIOS                                      52030400              up-to-date
OS                                        7.8                   up-to-date
FIRMWARECONTROLLER                        VDV1RL02              VDV1RL04
FIRMWAREDISK                              1102                  1132
HMP                                       2.4.7.0.1             up-to-date


I opened an SR and got confirmation from support that this is a bug and can be left as is. It will not impact any functionality.
The bug is the following one:
Bug 32017186 – LNX64-199-CMT : FIRMWARECONTROLLER NOT PATCHED FOR 19.9

As for the RDBMS homes, they can be patched later. If we are using a high availability solution, both the primary and the standby databases' homes need to be patched during the same maintenance window, as sketched below.
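A minimal sketch of those later steps, assuming the DB home ID reported by odacli list-dbhomes (the identifiers below are placeholders, not values from this appliance):

[root@ODA01 ~]# odacli list-dbhomes
[root@ODA01 ~]# odacli update-dbhome -i <dbhome-id> -v 19.9.0.0.0
[root@ODA01 ~]# odacli describe-job -i <job-id>

The same update-dbhome would be run on the standby side during the same maintenance window.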

Post patching activities

We can now run the post-patching activities.

We will first ensure there is no new hardware problem by checking the ILOM fault management:

login as: root
Keyboard-interactive authentication prompts from server:
| Password:
End of keyboard-interactive prompts from server

Oracle(R) Integrated Lights Out Manager

Version 5.0.1.21 r136383

Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved.

Warning: HTTPS certificate is set to factory default.

Hostname: ODA01-ILOM

-> show /SP/faultmgmt

 /SP/faultmgmt
    Targets:
        shell

    Properties:

    Commands:
        cd
        show

->


We can also remove our odabr snapshot backup:

[root@ODA01 ~]# export PATH=/opt/odabr:$PATH

[root@ODA01 ~]# odabr infosnap

--------------------------------------------------------
odabr - ODA node Backup Restore
Author: Ruggero Citton 
RAC Pack, Cloud Innovation and Solution Engineering Team
Copyright Oracle, Inc. 2013, 2019
Version: 2.0.1-47
--------------------------------------------------------


LVM snap name         Status                COW Size              Data%
-------------         ----------            ----------            ------
root_snap             active                30.00 GiB             22.79%
opt_snap              active                60.00 GiB             34.37%
u01_snap              active                100.00 GiB            35.58%


[root@ODA01 ~]# odabr delsnap
INFO: 2020-11-04 19:31:46: Please check the logfile '/opt/odabr/out/log/odabr_81687.log' for more details

INFO: 2020-11-04 19:31:46: Removing LVM snapshots
INFO: 2020-11-04 19:31:46: ...removing LVM snapshot for 'opt'
SUCCESS: 2020-11-04 19:31:46: ...snapshot for 'opt' removed successfully
INFO: 2020-11-04 19:31:46: ...removing LVM snapshot for 'u01'
SUCCESS: 2020-11-04 19:31:47: ...snapshot for 'u01' removed successfully
INFO: 2020-11-04 19:31:47: ...removing LVM snapshot for 'root'
SUCCESS: 2020-11-04 19:31:47: ...snapshot for 'root' removed successfully
SUCCESS: 2020-11-04 19:31:47: Remove LVM snapshots done successfully


We can clean up the previous patching versions from the repository and free up additional space in /opt:

[root@ODA01 ~]# df -h / /u01 /opt
Filesystem                          Size  Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolRoot   30G   11G   18G  38% /
/dev/mapper/VolGroupSys-LogVolU01    99G   59G   35G  63% /u01
/dev/mapper/VolGroupSys-LogVolOpt    75G   60G   12G  84% /opt

[root@ODA01 ~]# odacli cleanup-patchrepo -comp GI,DB -v 19.6.0.0.0
{
  "jobId" : "97b9669b-6945-4358-938e-a3a3f3b73693",
  "status" : "Created",
  "message" : null,
  "reports" : [ ],
  "createTimestamp" : "November 04, 2020 19:32:16 PM CET",
  "resourceList" : [ ],
  "description" : "Cleanup patchrepos",
  "updatedTime" : "November 04, 2020 19:32:16 PM CET"
}

[root@ODA01 ~]# odacli describe-job -i "97b9669b-6945-4358-938e-a3a3f3b73693"

Job details
----------------------------------------------------------------
                     ID:  97b9669b-6945-4358-938e-a3a3f3b73693
            Description:  Cleanup patchrepos
                 Status:  Success
                Created:  November 4, 2020 7:32:16 PM CET
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Cleanup Repository                       November 4, 2020 7:32:17 PM CET     November 4, 2020 7:32:17 PM CET     Success
Cleanup JRE Home                         November 4, 2020 7:32:17 PM CET     November 4, 2020 7:32:17 PM CET     Success

[root@ODA01 ~]# df -h / /u01 /opt
Filesystem                          Size  Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolRoot   30G   11G   18G  38% /
/dev/mapper/VolGroupSys-LogVolU01    99G   59G   35G  63% /u01
/dev/mapper/VolGroupSys-LogVolOpt    75G   49G   23G  68% /opt

We can restart our databases with the srvctl start database or srvctl start home commands, as sketched below.
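A minimal sketch, with hypothetical database and home names (the state file is the one written by a previous srvctl stop home):

[oracle@ODA01 ~]$ srvctl start database -d MYDB
[oracle@ODA01 ~]$ srvctl start home -o <oracle_home> -s <state_file> -n ODA01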

Finally we will activate our database synchronization if using Data Guard or dbvisit.

The article "Patching Oracle Database Appliance to 19.9" appeared first on Blog dbi services.

Oracle Grid Infrastructure on Windows With 2 Nodes


Oracle Grid Infrastructure can also be installed on Windows servers. In this blog I explain how this installation can be done. I am going to install an environment with two nodes, using Oracle 19c. We have two servers with the same characteristics:
winrac1
winrac2


Note that this is just a test environment on virtual machines in VirtualBox. I did not have any DNS server and I just have 1 SCAN address instead of the 3 recommended. Below is my hosts file with the IPs used:

PS C:\Windows\System32\drivers\etc> Get-Content .\hosts | findstr 192
192.168.168.100         winrac1
192.168.168.101         winrac2
192.168.1.100           winrac1-priv
192.168.1.101           winrac2-priv
192.168.168.110         winrac1-vip
192.168.168.111         winrac2-vip
192.168.168.120         winrac-scan
PS C:\Windows\System32\drivers\etc>

The installation user can be a local user or a domain user. A local user should:
- be a member of the Administrators group
- exist on both nodes
- have the same password on both nodes

In my case the Administrator user was used.
The Oracle Grid software WINDOWS.X64_193000_grid_home is already downloaded and unpacked to the GRID_HOME:

C:\app\19.0.0.0\grid

The shared disks Disk1 and Disk2 are already presented to both nodes winrac1 and winrac2.

We have to disable write caching on each shared disk if supported by the system. For this, right click on Disk1 for example and uncheck "Enable write caching on the device", on both nodes.

The next step is to create a volume on the shared disks. On the first node, do the following steps for all shared disks: right click on the shared disk to create a New Simple Volume.






Once done, rescan the disks on all other nodes.


We now have to create logical partitions on the shared disks:

DISKPART> list disk

  Disk ###  Status         Size     Free     Dyn  Gpt
  --------  -------------  -------  -------  ---  ---
  Disk 0    Online           60 GB      0 B
* Disk 1    Online         5120 MB  5118 MB
  Disk 2    Online           32 GB    31 GB

DISKPART> select disk 1

Disk 1 is now the selected disk.

DISKPART> create partition extended

DiskPart succeeded in creating the specified partition.

DISKPART> create partition logical

DiskPart succeeded in creating the specified partition.

DISKPART>

DISKPART> select disk 2

Disk 2 is now the selected disk.

DISKPART> create partition extended

DiskPart succeeded in creating the specified partition.

DISKPART> create partition logical

DiskPart succeeded in creating the specified partition.

Do a rescan from all other nodes

Now we are going to prepare our disks to be used with ASM. For this we use the asmtoolg.exe tool. Just launch it on the first node:

c:\app\19.0.0.0\grid\bin>c:\app\19.0.0.0\grid\bin\asmtoolg.exe





Repeat these steps for all disks you will use with ASM. You can list your labelled disks with the following command:

c:\app\19.0.0.0\grid\bin>asmtool.exe  -list
NTFS                             \Device\Harddisk0\Partition1              549M
NTFS                             \Device\Harddisk0\Partition2            60889M
ORCLDISKVOTOCR0                  \Device\Harddisk1\Partition1             5117M
ORCLDISKDATA0                    \Device\Harddisk2\Partition1            32765M

Before launching the installation, we have to set the Windows Time service registry values on both nodes, under the following key:

Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config



And stop the firewall on both nodes as Administrator:

C:\Users\Administrator>netsh advfirewall set allprofiles state off
Ok.

Log in with the installation user, open a terminal as Administrator and launch the install command:

c:\app\19.0.0.0\grid>gridSetup.bat













As it is just a test environment, I decided to ignore the errors and to continue


The verification failed because of errors I ignored

But the installation is fine

At the end of the installation I can validate the cluster:

C:\Users\Administrator>crsctl check cluster -all
**************************************************************
winrac1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
winrac2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************

C:\Users\Administrator>

The status of the different resources

C:\Users\Administrator>crsctl status res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       winrac1                  STABLE
               ONLINE  ONLINE       winrac2                  STABLE
ora.net1.network
               ONLINE  ONLINE       winrac1                  STABLE
               ONLINE  ONLINE       winrac2                  STABLE
ora.ons
               ONLINE  ONLINE       winrac1                  STABLE
               ONLINE  ONLINE       winrac2                  STABLE
ora.proxy_advm
               OFFLINE OFFLINE      winrac1                  STABLE
               OFFLINE OFFLINE      winrac2                  STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)
      1        ONLINE  ONLINE       winrac1                  STABLE
      2        ONLINE  ONLINE       winrac2                  STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       winrac1                  STABLE
ora.VOTOCR.dg(ora.asmgroup)
      1        ONLINE  ONLINE       winrac1                  STABLE
      2        ONLINE  ONLINE       winrac2                  STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asm(ora.asmgroup)
      1        ONLINE  ONLINE       winrac1                  Started,STABLE
      2        ONLINE  ONLINE       winrac2                  Started,STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.asmnet1.asmnetwork(ora.asmgroup)
      1        ONLINE  ONLINE       winrac1                  STABLE
      2        ONLINE  ONLINE       winrac2                  STABLE
      3        OFFLINE OFFLINE                               STABLE
ora.cvu
      1        ONLINE  ONLINE       winrac1                  STABLE
ora.qosmserver
      1        ONLINE  ONLINE       winrac1                  STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       winrac1                  STABLE
ora.winrac1.vip
      1        ONLINE  ONLINE       winrac1                  STABLE
ora.winrac2.vip
      1        ONLINE  ONLINE       winrac2                  STABLE
--------------------------------------------------------------------------------

C:\Users\Administrator>

The voting disk

C:\Users\Administrator>crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   01909d45161e4f74bfad14dff099dcc0 (\\.\ORCLDISKVOTOCR0) [VOTOCR]
Located 1 voting disk(s).

The OCR

C:\Users\Administrator>ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     491684
         Used space (kbytes)      :      84300
         Available space (kbytes) :     407384
         ID                       :  461265202
         Device/File Name         :    +VOTOCR
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded


C:\Users\Administrator>

Conclusion

We have just seen that Oracle Clusterware can be configured in a Windows environment. In coming blogs we will show how to configure a database on it, but it will be the same as on a Linux environment.

The article "Oracle Grid Infrastructure on Windows With 2 Nodes" appeared first on Blog dbi services.

AWS DynamoDB -> S3 -> OCI Autonomous Database


By Franck Pachot

.
I contribute to multiple technology communities. I have been an Oracle ACE Director for many years, and I recently became an AWS Data Hero 🎉. I have been asked whether there is a conflict with that, as Amazon and Oracle are competitors. Actually, there is no conflict at all. Those advocacy programs are not about sales, but technology. Many database developers and administrators have to interact with more than one cloud provider, and those cloud providers expand their interoperability for a better user experience. Here is an example with two recent news items from the past months:

Imagine that your application stores some data in DynamoDB because it is one of the easiest serverless datastores that can scale to millions of key-value queries per second with great availability and performance. You may want to export it to the S3 Object Storage, and this new DynamoDB feature can export it without any code (no lambda, no pipeline, no ETL…). Once in S3, you may want to query it. Of course, there are many possibilities. One is Amazon Athena with the Presto query engine, but that's for another post. Here I'm using my free Oracle Autonomous Database to query directly from Amazon S3 with the full SQL power of the Oracle Database.

Latest AWS CLI

This is a new feature, my current AWS CLI doesn’t know about it:


[opc@a demo]$ aws --version
aws-cli/2.0.50 Python/3.7.3 Linux/4.14.35-2025.400.9.el7uek.x86_64 exe/x86_64.oracle.7

[opc@a demo]$ aws dynamodb export-table-to-point-in-time
Invalid choice: 'export-table-to-point-in-time', maybe you meant:

Let’s update it to the latest version:


cd /var/tmp
wget -c https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip
unzip awscli-exe-linux-x86_64.zip
sudo ./aws/install --bin-dir /usr/local/bin --install-dir /usr/local/aws-cli --update
aws --version
cd -

If you don’t already have the AWS CLI, the initial install is the same as the upgrade procedure. You will just have to “aws configure” it.


[opc@a demo]$ aws dynamodb export-table-to-point-in-time help | awk '/DESCRIPTION/,/^$/'
DESCRIPTION
       Exports  table  data to an S3 bucket. The table must have point in time
       recovery enabled, and you can export data  from  any  time  within  the
       point in time recovery window.

This is the inline help. The online version is: https://docs.aws.amazon.com/de_de/cli/latest/reference/dynamodb/export-table-to-point-in-time.html

DynamoDB Table

I'm quickly creating a DynamoDB table. I'm doing that in the AWS free tier and I set the throughput (RCU/WCU) to the maximum I can use for free here, as I have no other tables:


[opc@a demo]$ aws dynamodb create-table --attribute-definitions  AttributeName=K,AttributeType=S  --key-schema  AttributeName=K,KeyType=HASH --billing-mode PROVISIONED --provisioned-throughput ReadCapacityUnits=25,WriteCapacityUnits=25 --table-name Demo

TABLEDESCRIPTION        2020-11-15T23:02:08.543000+01:00        0       arn:aws:dynamodb:eu-central-1:802756008554:table/Demo   b634ffcb-1ae3-4de0-8e42-5d8262b04a38    Demo    0       CREATING
ATTRIBUTEDEFINITIONS    K       S
KEYSCHEMA       K       HASH
PROVISIONEDTHROUGHPUT   0       25      25

This creates a DynamoDB table with a hash partition key named “K”.

Enable PIT (Point-In-Time Recovery) for consistent view (MVCC)

When you export a whole table, you want a consistent view of it. If you just scan it, you may see data as of different points in time because there are concurrent updates, and DynamoDB is NoSQL: no ACID (Durability is there but Atomicity is limited, Consistency has another meaning, and there's no snapshot Isolation), no locks (except with a lot of DIY code), no MVCC (Multi-Version Concurrency Control). However, Amazon DynamoDB provides MVCC-like consistent snapshots for recovery purposes when enabling Continuous Backups, which is like Copy-on-Write on the storage. This can provide a consistent point-in-time snapshot of the whole table, and the "Export to S3" feature is based on that. This means that if you don't enable Point-In-Time Recovery you can't use this feature:


[opc@a demo]$ aws dynamodb export-table-to-point-in-time --table-arn arn:aws:dynamodb:eu-central-1:802756008554:table/Demo --s3-bucket franck-pachot-free-tier --export-format DYNAMODB_JSON

An error occurred (PointInTimeRecoveryUnavailableException) when calling the ExportTableToPointInTime operation: Point in time recovery is not enabled for table 'Demo'

Here is how to enable it (which can also be done from the console in table “Backups” tab):


[opc@a demo]$ aws dynamodb update-continuous-backups --table-name Demo --point-in-time-recovery-specification PointInTimeRecoveryEnabled=true

CONTINUOUSBACKUPSDESCRIPTION    ENABLED
POINTINTIMERECOVERYDESCRIPTION  2020-11-15T23:02:15+01:00       2020-11-15T23:02:15+01:00       ENABLED

Put items

I quickly insert a few items just to test this feature:


python3
import boto3, botocore.config, datetime
print(datetime.datetime.utcnow().isoformat())
dynamodb = boto3.resource('dynamodb',config=botocore.config.Config(retries={'mode':'adaptive','total_max_attempts': 10}))
for k in range(1,1000):
 dynamodb.Table('Demo').put_item(Item={'K':f"K-{k:08}",'V':datetime.datetime.utcnow().isoformat()})

quit()

This has created 999 items (the loop above goes from K-00000001 to K-00000999), each a simple item with one attribute "V" where I put the current timestamp.

They have been inserted sequentially. I noted one item inserted in the middle of the run, with its timestamp. The reason is to test the consistent snapshot: only the items inserted before this one should be visible from this point in time.
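A quick way to look at one item and its timestamp from the CLI (K-00000548 is the item that serves as the reference point later in this post):

[opc@a demo]$ aws dynamodb get-item --table-name Demo --key '{"K":{"S":"K-00000548"}}'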

Export to S3

I use the timestamp of that item as the "Export Time" for the export to S3:


[opc@a demo]$ aws dynamodb export-table-to-point-in-time --export-time 2020-11-15T22:02:33.011659 --table-arn arn:aws:dynamodb:eu-central-1:802756008554:table/Demo --s3-bucket franck-pachot-free-tier --export-format DYNAMODB_JSON

EXPORTDESCRIPTION       e8839402-f7b0-4732-b2cb-63e74f8a4c7b    arn:aws:dynamodb:eu-central-1:802756008554:table/Demo/export/01605477788673-a117c9b4    DYNAMODB_JSON   IN_PROGRESS     2020-11-15T23:02:33+01:00       franck-pachot-free-tier AES256  2020-11-15T23:03:08.673000+01:00
        arn:aws:dynamodb:eu-central-1:802756008554:table/Demo   b634ffcb-1ae3-4de0-8e42-5d8262b04a38

You can do the same from the console (use the preview of the new DynamoDB console).
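The export runs asynchronously. Assuming the CLI version installed above, its progress can be followed with the export ARN returned by the previous command:

[opc@a demo]$ aws dynamodb describe-export --export-arn arn:aws:dynamodb:eu-central-1:802756008554:table/Demo/export/01605477788673-a117c9b4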

And that’s all, I have a new AWSDynamoDB folder in the S3 bucket I’ve mentioned:


[opc@a demo]$ aws s3 ls franck-pachot-free-tier/AWSDynamoDB/                                                                                                             
                          PRE 01605477788673-a117c9b4/

The subfolder 01605477788673-a117c9b4 is created for this export. A new export will create another one.

[opc@a demo]$ aws s3 ls franck-pachot-free-tier/AWSDynamoDB/01605477788673-a117c9b4/

                           PRE data/
2020-11-15 23:04:09          0 _started
2020-11-15 23:06:09        197 manifest-files.json
2020-11-15 23:06:10         24 manifest-files.md5
2020-11-15 23:06:10        601 manifest-summary.json
2020-11-15 23:06:10         24 manifest-summary.md5

There’s a lot of metadata coming with this, and a “data” folder.

[opc@a demo]$ aws s3 ls franck-pachot-free-tier/AWSDynamoDB/01605477788673-a117c9b4/data/
2020-11-15 23:05:35       4817 vovfdgilxiy6xkmzy3isnpzqgu.json.gz

This is where my DynamoDB table items have been exported: a text file, with one record per item, each as a JSON object, and gzipped:


{"Item":{"K":{"S":"K-00000412"},"V":{"S":"2020-11-15T22:02:30.725004"}}}
{"Item":{"K":{"S":"K-00000257"},"V":{"S":"2020-11-15T22:02:28.046544"}}}
{"Item":{"K":{"S":"K-00000179"},"V":{"S":"2020-11-15T22:02:26.715717"}}}
{"Item":{"K":{"S":"K-00000274"},"V":{"S":"2020-11-15T22:02:28.333683"}}}
{"Item":{"K":{"S":"K-00000364"},"V":{"S":"2020-11-15T22:02:29.889403"}}}
{"Item":{"K":{"S":"K-00000169"},"V":{"S":"2020-11-15T22:02:26.546318"}}}
{"Item":{"K":{"S":"K-00000121"},"V":{"S":"2020-11-15T22:02:25.628021"}}}
{"Item":{"K":{"S":"K-00000018"},"V":{"S":"2020-11-15T22:02:23.613737"}}}
{"Item":{"K":{"S":"K-00000459"},"V":{"S":"2020-11-15T22:02:31.513156"}}}
{"Item":{"K":{"S":"K-00000367"},"V":{"S":"2020-11-15T22:02:29.934015"}}}
{"Item":{"K":{"S":"K-00000413"},"V":{"S":"2020-11-15T22:02:30.739685"}}}
{"Item":{"K":{"S":"K-00000118"},"V":{"S":"2020-11-15T22:02:25.580729"}}}
{"Item":{"K":{"S":"K-00000303"},"V":{"S":"2020-11-15T22:02:28.811959"}}}
{"Item":{"K":{"S":"K-00000021"},"V":{"S":"2020-11-15T22:02:23.676213"}}}
{"Item":{"K":{"S":"K-00000167"},"V":{"S":"2020-11-15T22:02:26.510452"}}}
...

As you can see, the items are in the DynamoDB API format, mentioning the attribute name (I've defined "K" and "V") and the datatype ("S" for string here).
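If you just want to peek at the exported file from a shell, something like this works, streaming the object listed above to stdout:

[opc@a demo]$ aws s3 cp s3://franck-pachot-free-tier/AWSDynamoDB/01605477788673-a117c9b4/data/vovfdgilxiy6xkmzy3isnpzqgu.json.gz - | gunzip -c | head -3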

Credentials to access S3

As my goal is to access it through the internet, I've defined an IAM user for that. The credentials I'll need are the IAM user's Access key ID and Secret access key.
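I did this in the console, but a rough sketch of the same from the CLI could look like this (the user name is hypothetical, and a read-only S3 policy would be enough for this use case):

[opc@a demo]$ aws iam create-user --user-name demo-s3-reader
[opc@a demo]$ aws iam attach-user-policy --user-name demo-s3-reader --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
[opc@a demo]$ aws iam create-access-key --user-name demo-s3-reader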

As my goal is to access it from my Oracle Autonomous Database, I connect to it:


[opc@a demo]$ TNS_ADMIN=/home/opc/wallet ~/sqlcl/bin/sql demo@atp1_tp

SQLcl: Release 20.3 Production on Sun Nov 15 22:34:32 2020

Copyright (c) 1982, 2020, Oracle.  All rights reserved.
Password? (**********?) ************
Last Successful login time: Sun Nov 15 2020 22:34:39 +01:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.5.0.0.0

DEMO@atp1_tp>

I use SQLcl command line here, with the downloaded wallet, but you can do the same from SQL Developer Web of course.

Here is how I declare those credentials from the Oracle database:


DEMO@atp1_tp> exec dbms_cloud.create_credential('AmazonS3FullAccess','AKIA3VZ74TJVKPXKVFRK','RTcJaqRZ+8BTwdatQeUe4AHpJziR5xVrRGl7pmgd');

PL/SQL procedure successfully completed.

Currently, this DBMS_CLOUD package is available only on the Oracle Cloud-managed databases, but I'm quite sure that it will come to the on-premises version soon. Data movement through an Object Storage is a common thing today.

Access to S3 from Oracle Autonomous Database

OK, many things are possible now with this DBMS_CLOUD package API, like creating a view on it (also known as an External Table) to query directly from this remote file:


DEMO@atp1_tp> 
begin
 dbms_cloud.create_external_table(
  table_name=>'DEMO',
  credential_name=>'AmazonS3FullAccess',
  file_uri_list=>'https://franck-pachot-free-tier.s3.eu-central-1.amazonaws.com/AWSDynamoDB/01605477788673-a117c9b4/data/vovfdgilxiy6xkmzy3isnpzqgu.json.gz',
  format=>json_object( 'compression' value 'gzip' ),column_list=>'ITEM VARCHAR2(32767)'
 );
end;
/

PL/SQL procedure successfully completed.

This is straightforward: the credential name, the file URL, and the format (compressed JSON). As we are in an RDBMS here, I define the structure: the name of the External Table (we query it as a table, but nothing is stored locally) and its definition (here only one ITEM column for the JSON string). Please remember that SQL has evolved from its initial definition. I can do schema-on-read here by just defining an unstructured character string as my VARCHAR2(32767). But I still benefit from RDBMS logical independence: I query as a table something that can be in a different format, in a different region, with the same SQL language.

Here is what I can view from this External Table:


DEMO@atp1_tp> select * from DEMO order by 1;
                                                                       ITEM
___________________________________________________________________________
{"Item":{"K":{"S":"K-00000001"},"V":{"S":"2020-11-15T22:02:23.202348"}}}
{"Item":{"K":{"S":"K-00000002"},"V":{"S":"2020-11-15T22:02:23.273001"}}}
{"Item":{"K":{"S":"K-00000003"},"V":{"S":"2020-11-15T22:02:23.292682"}}}
{"Item":{"K":{"S":"K-00000004"},"V":{"S":"2020-11-15T22:02:23.312492"}}}
{"Item":{"K":{"S":"K-00000005"},"V":{"S":"2020-11-15T22:02:23.332054"}}}
...
{"Item":{"K":{"S":"K-00000543"},"V":{"S":"2020-11-15T22:02:32.920001"}}}
{"Item":{"K":{"S":"K-00000544"},"V":{"S":"2020-11-15T22:02:32.940199"}}}
{"Item":{"K":{"S":"K-00000545"},"V":{"S":"2020-11-15T22:02:32.961124"}}}
{"Item":{"K":{"S":"K-00000546"},"V":{"S":"2020-11-15T22:02:32.976036"}}}
{"Item":{"K":{"S":"K-00000547"},"V":{"S":"2020-11-15T22:02:32.992915"}}}

547 rows selected.

DEMO@atp1_tp>

I validate the precision of the DynamoDB Point-In-Time snapshot: I exported as of 2020-11-15T22:02:33.011659 which was the point in time where I inserted K-00000548. At that time only K-00000001 to K-00000547 were there.

With SQL you can easily transform a non-structured JSON collection of items into a structured two-dimensional table:


DEMO@atp1_tp> 
  select json_value(ITEM,'$."Item"."K"."S"'),
         json_value(ITEM,'$."Item"."V"."S"') 
  from DEMO 
  order by 1 fetch first 10 rows only;

   JSON_VALUE(ITEM,'$."ITEM"."K"."S"')    JSON_VALUE(ITEM,'$."ITEM"."V"."S"')
______________________________________ ______________________________________
K-00000001                             2020-11-15T22:02:23.202348
K-00000002                             2020-11-15T22:02:23.273001
K-00000003                             2020-11-15T22:02:23.292682
K-00000004                             2020-11-15T22:02:23.312492
K-00000005                             2020-11-15T22:02:23.332054
K-00000006                             2020-11-15T22:02:23.352470
K-00000007                             2020-11-15T22:02:23.378414
K-00000008                             2020-11-15T22:02:23.395848
K-00000009                             2020-11-15T22:02:23.427374
K-00000010                             2020-11-15T22:02:23.447688

10 rows selected.

I used JSON_VALUE to extract attribute string values through JSON Path, but it will be easier to query with a relational view and proper data types, thanks to the JSON_TABLE function:


DEMO@atp1_tp> 
  select K,V, V-lag(V) over (order by V) "lag"
  from DEMO,
  json_table( ITEM, '$' columns ( 
   K varchar2(10) path '$."Item"."K"."S"' error on error, 
   V timestamp path '$."Item"."V"."S"' error on error
  )) order by K fetch first 10 rows only;

            K                                  V                    lag
_____________ __________________________________ ______________________
K-00000001    15-NOV-20 10.02.23.202348000 PM
K-00000002    15-NOV-20 10.02.23.273001000 PM    +00 00:00:00.070653
K-00000003    15-NOV-20 10.02.23.292682000 PM    +00 00:00:00.019681
K-00000004    15-NOV-20 10.02.23.312492000 PM    +00 00:00:00.019810
K-00000005    15-NOV-20 10.02.23.332054000 PM    +00 00:00:00.019562
K-00000006    15-NOV-20 10.02.23.352470000 PM    +00 00:00:00.020416
K-00000007    15-NOV-20 10.02.23.378414000 PM    +00 00:00:00.025944
K-00000008    15-NOV-20 10.02.23.395848000 PM    +00 00:00:00.017434
K-00000009    15-NOV-20 10.02.23.427374000 PM    +00 00:00:00.031526
K-00000010    15-NOV-20 10.02.23.447688000 PM    +00 00:00:00.020314

10 rows selected.

I’ve declared the timestamp as TIMESTAMP and then further arithmetic can be done, like I did here calculating the time interval between two items.
And yes, that's a shout-out for the power of SQL for analyzing data, as well as a shout-out for the millisecond data-ingest I did on a NoSQL free tier 😋 Be careful, I sometimes put hidden messages like this in my demos…

Copy from S3 to Oracle Autonomous Database

I can use the query above to store it locally with a simple CREATE TABLE … AS SELECT:


DEMO@atp1_tp>
  create table DEMO_COPY(K primary key using index local, V) 
  partition by hash(K) partitions 8 
  as select K,V from DEMO, json_table(ITEM,'$' columns ( K varchar2(10) path '$."Item"."K"."S"' , V timestamp path '$."Item"."V"."S"' ));

Table DEMO_COPY created.

DEMO@atp1_tp> desc DEMO_COPY

   Name       Null?            Type
_______ ___________ _______________
K       NOT NULL    VARCHAR2(10)
V                   TIMESTAMP(6)

In order to mimic the DynamoDB scalability, I've created it as a hash-partitioned table with a local primary key index. Note that there's a big difference here: the number of partitions is fixed at creation and requires an ALTER TABLE to increase it. This is probably something that will evolve in the Oracle Autonomous Database, but for the moment DynamoDB still provides the simplest API for automatic sharding.

Here is the execution plan:


DEMO@atp1_tp> select * from DEMO_COPY where K='K-00000042';

            K                                  V
_____________ __________________________________
K-00000042    15-NOV-20 10.02.24.109384000 PM

DEMO@atp1_tp> xc
DEMO@atp1_tp> select plan_table_output from dbms_xplan.display_cursor(format=>'allstats last')

                                                                                                        PLAN_TABLE_OUTPUT
_________________________________________________________________________________________________________________________
SQL_ID  cn4b2uar7hxm5, child number 0
-------------------------------------
select * from DEMO_COPY where K='K-00000042'

Plan hash value: 3000930534

----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name         | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |              |      1 |        |      1 |00:00:00.01 |       3 |      1 |
|   1 |  TABLE ACCESS BY GLOBAL INDEX ROWID| DEMO_COPY    |      1 |      1 |      1 |00:00:00.01 |       3 |      1 |
|*  2 |   INDEX UNIQUE SCAN                | SYS_C0037357 |      1 |      1 |      1 |00:00:00.01 |       2 |      1 |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("K"='K-00000042')

19 rows selected.

3 buffer reads are, of course, really fast, and this stays true when increasing the number of partitions (always a power of two because a linear hashing algorithm is used). But do not stick to the same data model as the source NoSQL one: if you export data to a relational database, you probably have other access paths.

Remote access

While we are looking at execution plans, here is the one from the External Table:


DEMO@atp1_tp> 
  select K,V from DEMO, 
  json_table(ITEM,'$' columns ( K varchar2(10) path '$."Item"."K"."S"' , V timestamp path '$."Item"."V"."S"' )) 
  where K='K-00000042';

            K                                  V
_____________ __________________________________
K-00000042    15-NOV-20 10.02.24.109384000 PM

DEMO@atp1_tp> select plan_table_output from dbms_xplan.display_cursor(format=>'allstats last')

                                                                                       PLAN_TABLE_OUTPUT
________________________________________________________________________________________________________
SQL_ID  1vfus4x9mpwp5, child number 0
-------------------------------------
select K,V from DEMO, json_table(ITEM,'$' columns ( K varchar2(10) path
'$."Item"."K"."S"' , V timestamp path '$."Item"."V"."S"' )) where
K='K-00000042'

Plan hash value: 1420242269

-----------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |      1 |        |      1 |00:00:00.29 |    3053 |
|   1 |  PX COORDINATOR                |          |      1 |        |      1 |00:00:00.29 |    3053 |
|   2 |   PX SEND QC (RANDOM)          | :TQ10000 |      1 |   1067K|      1 |00:00:00.15 |    2024 |
|   3 |    NESTED LOOPS                |          |      1 |   1067K|      1 |00:00:00.15 |    2024 |
|   4 |     PX BLOCK ITERATOR          |          |      1 |        |    547 |00:00:00.15 |    2024 |
|*  5 |      EXTERNAL TABLE ACCESS FULL| DEMO     |      1 |  13069 |    547 |00:00:00.15 |    2024 |
|   6 |     JSONTABLE EVALUATION       |          |    547 |        |      1 |00:00:00.01 |       0 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter(JSON_EXISTS2("ITEM" FORMAT JSON , '$?(@.Item.K.S=="K-00000042")' FALSE ON
              ERROR)=1)

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 2 because of degree limit

Even if the plan is ready for parallel access, here I have one file to read and unzip, and it was not parallelized (Starts=1). There are no statistics and that's why the row estimation (E-Rows) is wrong. What is important to understand is that the whole file is read and unzipped (A-Rows=547) before the non-matching rows are filtered out. However, we can see that the condition on the relational columns has been pushed down as a JSON Path predicate.

I did everything from the command line here, but you can use SQL Developer Web:

Note that rather than creating an External Table and creating a table from it, you can also insert the remote data into an existing table with DBMS_CLOUD:


DEMO@atp1_tp> create table DEMO_ITEM("Item" varchar2(32767));
Table DEMO_ITEM created.

DEMO@atp1_tp> exec dbms_cloud.copy_data(table_name=>'DEMO_ITEM',credential_name=>'AmazonS3FullAccess',file_uri_list=>'https://franck-pachot-free-tier.s3.eu-central-1.amazonaws.com/AWSDynamoDB/01605477788673-a117c9b4/data/vovfdgilxiy6xkmzy3isnpzqgu.json.gz',format=>json_object( 'compression' value 'gzip' ));

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.130

Pricing and summary

I used all free tier here, but be careful that you may have multiple services billed when doing this.
First, I used two options of DynamoDB:

  • I enabled “Continuous backups (PITR)” which is $0.2448 per GB per months in my region,
  • and I used “Data Export to Amazon S3” which is $0.1224 per GB there.

And when taking data out of S3 you should check the egress costs.
On the Oracle Autonomous Database side, I used the always free database (20GB free for life) and ingress transfer is free as well.
Conclusion: it is better to do the export once and copy the data rather than query the External Table multiple times. This is not the right solution if you want a frequently refreshed copy. But it is very satisfying to know that you can move data from one cloud provider to the other, and from one technology to the other. I'll probably blog about the other way soon.

The article "AWS DynamoDB -> S3 -> OCI Autonomous Database" appeared first on Blog dbi services.

Recovery in the ☁ with Oracle Autonomous Database


By Franck Pachot

.
I'll start this series with the Oracle Autonomous Database, but my goal is to cover Point In Time recovery for many managed databases in the major cloud providers, because I've seen a lot of confusion about database backups (see What is a database backup (back to the basics)). On one side, in a managed database, we should not have to care about backups, but only about recovery. The way it is done (backups, redo archiving, snapshots, replication…) is not our responsibility on a managed service. What we want is an easy way to recover the database to a past Point In Time within a recovery window (the recovery point – RPO) and in a predictable time (the recovery time – RTO). However, I'm certain that it is important to know how it is implemented behind this "recovery" service.

Clone from a backup

What I like the most with the Oracle Autonomous Database is that there is no “restore” or “recovery” button on the database service actions. There is one when you go to the list of backups, in order to restore the database in-place from a previous backup. But this is not the common case of recovery. When you need recovery, you probably want to create a new database. The reason is that when you recover to a Point In Time in the past, you probably want to connect to it, check it, read or export data from it, without discarding the transactions that happened since then in the original database. And you are in the cloud: it is easy to provision a new clone and keep the original database until you are sure to know what you need to do (merging recent transactions into the clone or fixing the original database from the clone data). In the cloud, a Point-In-Time Recovery (PITR) is actually a “clone from backup” or “clone from timestamp”. And this is exactly how you do it in the Oracle Autonomous Database: you create a new clone, and rather than cloning from the current state you choose to clone from a previous state, by mentioning a timestamp or a backup.

Point In Time (PIT)


Actually, I think that the "clone from backup" should not even be there. It gives a false impression that you restore a backup like a snapshot of the database. But that's wrong: the "restore from backup" is just a "restore from timestamp" where the timestamp is the one from the end of the copy of the datafiles to the backup set. It is not a special Recovery Point. It is just a point where the Recovery Time is minimal. From a user point of view, you decide to clone from a timestamp. The timestamp is probably the point just before the failure (like an erroneous update for example). Additionally, I hope that one day more utilities will be provided to define this point. Log Miner can search the journal of changes for a specific point. Imagine a recovery service where you mention a time window, the name of a table, an operation (like UPDATE),… and it finds the exact point-in-time to recover. Oracle has Log Miner and Flashback Transaction and a DBA can use them. A managed service could provide the same. But currently, you need to know the timestamp you want to recover to. The console shows times in half-hour steps but you can enter a finer granularity: to the second in the console, and to the millisecond with the CLI:

I have a PL/SQL loop constantly updating a timestamp in a DEMO table:


SQL> exec loop update demo set ts=sys_extract_utc(current_timestamp); commit; end loop;

I just kept that running.

Later I created a clone with 21:21:21.000 as the recovery point in time:


[opc@a demo]$ oci db autonomous-database create-from-backup-timestamp --clone-type FULL --timestamp 2020-11-14T21:21:21.000Z --db-name clone --cpu-core-count 1 --data-storage-size-in-tbs 1 --admin-password COVID-19-Go-Home --autonomous-database-id ocid1.autonomousdatabase.oc1.iad.abuw...

and once opened, I checked the state of my table:


SQL> select * from demo;

TS                       
------------------------ 
2020-11-14T21:21:19.368Z 

This is less than 2 seconds before the mentioned recovery point-in-time. RPO is minimal.

Recovery window

With the Oracle Autonomous Database, you always have the possibility to recover to any point in time during the past 2 months. Always: without any additional configuration, without additional cost, and even in the always free tier. You can run additional manual backups but you cannot remove those automatic ones, and you cannot reduce this recovery window.
Here is the documentation:
About Backup and Recovery on Autonomous Data Warehouse
Autonomous Data Warehouse automatically backs up your database for you. The retention period for backups is 60 days. You can restore and recover your database to any point-in-time in this retention period.

This is great. You don't have to worry about anything in advance. And if something happens (it can be a user error dropping a table, updating the wrong rows, an application release that messes up everything, a legal request to look at past data,…) you can create a clone of the database at any state within this 60-day timeframe. And these 60 days are the guaranteed minimum. I have a small free-tier database where I can still see the backups from the past 6 months:
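As a hedged sketch, the automatic backups can also be listed from the OCI CLI (the OCID below is a placeholder; check the current CLI reference for the exact parameters):

[opc@a demo]$ oci db autonomous-database-backup list --autonomous-database-id ocid1.autonomousdatabase.oc1.iad.xxxx --all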

I have another database that is stopped for 2 months:

Only one backup is there because the database has not been opened in the past two months (my current timestamp is 2020-09-17 15:45:50). This is important to know. Some cloud providers will start your managed database to back it up and apply updates. For example, in AWS RDS a stopped database stays stopped for only seven days and is automatically started afterwards. But for the Oracle Autonomous Database, the database is actually a PDB, and stopped means closed while the instance keeps running. But at least one backup remains!

Multiple backups

This is a very important point. If you rely only on restoring a backup, without the possibility to recover to a further Point In Time, you run the risk that any problem in the backup compromises your recovery. Here, even if the last backup has a problem (you should always design for failure, even if its probability is low), I can restore from the previous one and recover. It will take longer (24 hours of activity to recover) but the data will be there, consistent to the requested point in time.

When you know how Oracle Database backups work, you know that any backup configuration mentions either a number of backups (please, always more than one) or a recovery window (which hopefully will contain more than one), but not both:


RMAN> show all;

using target database control file instead of recovery catalog
RMAN configuration parameters for database with db_unique_name CDB1A_IAD154 are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default

RMAN> configure retention policy to recovery window of 60 days;

new RMAN configuration parameters:
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 60 DAYS;
new RMAN configuration parameters are successfully stored

In a managed database you don’t have to configure this. But knowing how it works will give you confidence and trust in the data protection. Here for my closed database I can see only one backup.

Limitation in RTO (Recovery Time Objective)

How long does it take to restore the database? This is proportional to the size. But when you understand that a restore is not sufficient, and that recovery must apply the redo logs to bring all database files to be consistent with the recovery point-in-time, you know it can take longer. As automatic backups are taken every day, you may have up to 24 hours of activity to recover. And understanding this may help: if you have the choice to select a point-in-time not too far after the end of a backup, then you can minimize the RTO. This is where you may choose "select the backup from the list" rather than a "Point In Time clone". But then, you may increase the RPO.

My main message is: even in a managed service where you are not the DBA, it is important to understand how it works. There is no such thing as just "restoring a backup". Recovery always happens, and your goal when you select the recovery point-in-time is to minimize the RTO for the required RPO. Recovery has to restore the archived logs, read them and apply the relevant redo vectors. And with the Oracle Autonomous Database, this redo stream is common to all databases (redo is at CDB level if you know the multitenant architecture). This can take time for your database even if you didn't have a lot of updates in it (it is a PDB). Most of the redo will be discarded, but it has to be restored and read sequentially. In order to verify that, I've selected a point-in-time from this morning for the database that was stopped 2 months ago. The last backup is 2 months before the Point In Time to recover to. If you take this as a black box, you may think that this will be very fast: no updates to recover because the database was closed most of the time. But actually, this has to go through 2 months of redo for this database shared by many users. While writing this, the progress bar has been showing "18%" for three hours… This could be avoided if the whole CDB was backed up every day, because a closed autonomous database is just a closed PDB – files are still accessible for backup. Is this the case? I don't know.

Finally, this recovery took nearly 5 hours. And I am not sure whether it really restored the datafiles from 2 months ago and then restored this amount of archived logs really fast, or whether it used a more recent backup taken while the database was closed, not visible in the list of backups. Because I see 74 terabytes of archived logs during those 60 days:


select dbms_xplan.format_size(sum(blocks*block_size))
from gv$archived_log where dest_id=1 and first_time>sysdate-60;

DBMS_XPLAN.FORMAT_SIZE(SUM(BLOCKS*BLOCK_SIZE)) 
---------------------------------------------- 
74T                                            

And I would be very surprised if they could restore and recover this amount in less than 5 hours… But who knows? We don't have the logs in a managed database.

Note that this is a shared-infrastructure managed database. You also have the choice to provision a dedicated autonomous database when you need better control and isolation. What I'm saying is that if you don't know how it works, you will make mistakes. Like this one: as the database was stopped for 2 months, I should have selected a point-in-time closer to the last backup. It could have taken 1 hour instead of 5 hours (I tested it on the same database). I've been working on production databases for 20 years, in many companies. I can tell you that when a problem happens in production and you have to recover a database, you will be surrounded by managers asking, every five minutes, how long the recovery will take. The better you understand the recovery process, the more comfortable you will be.

Just imagine that you are in this situation and the only information you have is this 18% progress bar that is there for two hours:

When I understand what happens (2 months of shared archived logs to restore), I can take another action, or at least explain why it takes so long. Here, even if I have no access to the recovery logs in this managed database, I can guess that it restores and recovers all archived logs from the past 60 days. It cannot be fast, and it is hard to predict on a shared infrastructure, as it depends on the neighbours' activity.

Limitation in RPO (Recovery Point Objective)

Managed services always have more limitations because the automation behind them must standardize everything. When trying to recover to a point that was just 30 minutes ago, I hit a limitation. I've not seen it documented but the message is clear: Cloning operation failed because the timestamp specified is not at least 2 hours in the past.
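As a sketch, this is the kind of call that raises it: the same create-from-backup-timestamp command as above, but with a timestamp only 30 minutes in the past (the OCID and password are placeholders):

[opc@a demo]$ oci db autonomous-database create-from-backup-timestamp --clone-type FULL --timestamp 2020-11-14T21:55:00.000Z --db-name clone2 --cpu-core-count 1 --data-storage-size-in-tbs 1 --admin-password ************ --autonomous-database-id ocid1.autonomousdatabase.oc1.iad.xxxx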

I don’t understand exactly the reason. Point-In-Time recovery should be possible even if redo logs are archived every 2 hours, which is not the case anyway:


DEMO@atp1_tp> select * from v$archive_dest where status='VALID';

   DEST_ID             DEST_NAME    STATUS      BINDING    NAME_SPACE     TARGET    ARCHIVER    SCHEDULE                  DESTINATION    LOG_SEQUENCE    REOPEN_SECS    DELAY_MINS    MAX_CONNECTIONS    NET_TIMEOUT    PROCESS    REGISTER    FAIL_DATE    FAIL_SEQUENCE    FAIL_BLOCK    FAILURE_COUNT    MAX_FAILURE    ERROR    ALTERNATE    DEPENDENCY    REMOTE_TEMPLATE    QUOTA_SIZE    QUOTA_USED    MOUNTID    TRANSMIT_MODE    ASYNC_BLOCKS    AFFIRM      TYPE    VALID_NOW      VALID_TYPE    VALID_ROLE    DB_UNIQUE_NAME    VERIFY    COMPRESSION    APPLIED_SCN    CON_ID    ENCRYPTION
__________ _____________________ _________ ____________ _____________ __________ ___________ ___________ ____________________________ _______________ ______________ _____________ __________________ ______________ __________ ___________ ____________ ________________ _____________ ________________ ______________ ________ ____________ _____________ __________________ _____________ _____________ __________ ________________ _______________ _________ _________ ____________ _______________ _____________ _________________ _________ ______________ ______________ _________ _____________
         1 LOG_ARCHIVE_DEST_1    VALID     MANDATORY    SYSTEM        PRIMARY    ARCH        ACTIVE      USE_DB_RECOVERY_FILE_DEST              3,256            300             0                  1              0 ARCH       YES                                     0             0                0              0          NONE         NONE          NONE                           0             0          0 SYNCHRONOUS                    0 NO        PUBLIC    YES          ALL_LOGFILES    ALL_ROLES     NONE              NO        DISABLE                     0         0 DISABLE


DEMO@atp1_tp> select stamp,dest_id,thread#,sequence#,next_time,backup_count,name from v$archived_log where next_time > sysdate-2/24;

           STAMP    DEST_ID    THREAD#    SEQUENCE#              NEXT_TIME    BACKUP_COUNT                                                                     NAME
________________ __________ __________ ____________ ______________________ _______________ ________________________________________________________________________
   1,056,468,560          1          1        3,255 2020-11-14 15:29:14                  1 +RECO/FEIO1POD/ARCHIVELOG/2020_11_14/thread_1_seq_3255.350.1056468555
   1,056,468,562          1          2        3,212 2020-11-14 15:29:16                  1 +RECO/FEIO1POD/ARCHIVELOG/2020_11_14/thread_2_seq_3212.376.1056468557
   1,056,472,195          1          2        3,213 2020-11-14 16:29:51                  1 +RECO/FEIO1POD/ARCHIVELOG/2020_11_14/thread_2_seq_3213.334.1056472191
   1,056,472,199          1          1        3,256 2020-11-14 16:29:52                  1 +RECO/FEIO1POD/ARCHIVELOG/2020_11_14/thread_1_seq_3256.335.1056472193


DEMO@atp1_tp> select * from v$logfile;

   GROUP#    STATUS      TYPE                                             MEMBER    IS_RECOVERY_DEST_FILE    CON_ID
_________ _________ _________ __________________________________________________ ________________________ _________
        2           ONLINE    +DATA/FEIO1POD/ONLINELOG/group_2.276.1041244447    NO                               0
        2           ONLINE    +RECO/FEIO1POD/ONLINELOG/group_2.262.1041244471    YES                              0
        1           ONLINE    +DATA/FEIO1POD/ONLINELOG/group_1.275.1041244447    NO                               0
        1           ONLINE    +RECO/FEIO1POD/ONLINELOG/group_1.261.1041244471    YES                              0
        5           ONLINE    +DATA/FEIO1POD/ONLINELOG/group_5.266.1041245263    NO                               0
        5           ONLINE    +RECO/FEIO1POD/ONLINELOG/group_5.266.1041245277    YES                              0
        6           ONLINE    +DATA/FEIO1POD/ONLINELOG/group_6.268.1041245295    NO                               0
        6           ONLINE    +RECO/FEIO1POD/ONLINELOG/group_6.267.1041245307    YES                              0
        3           ONLINE    +DATA/FEIO1POD/ONLINELOG/group_3.281.1041245045    NO                               0
        3           ONLINE    +RECO/FEIO1POD/ONLINELOG/group_3.263.1041245057    YES                              0
        4           ONLINE    +DATA/FEIO1POD/ONLINELOG/group_4.282.1041245077    NO                               0
        4           ONLINE    +RECO/FEIO1POD/ONLINELOG/group_4.264.1041245089    YES                              0
        7           ONLINE    +DATA/FEIO1POD/ONLINELOG/group_7.267.1041245327    NO                               0
        7           ONLINE    +RECO/FEIO1POD/ONLINELOG/group_7.268.1041245339    YES                              0
        8           ONLINE    +DATA/FEIO1POD/ONLINELOG/group_8.279.1041245359    NO                               0
        8           ONLINE    +RECO/FEIO1POD/ONLINELOG/group_8.269.1041245371    YES                              0

As you can see, the Autonomous Database has two threads (a two-node RAC) with 4 online redo log groups per thread (the Autonomous Database is not protected by Data Guard even if you enable Autonomous Data Guard… but that’s another story) and two members each. And the online logs are archived and backed up. Everything is there to be able to recover within the past 2 hours. But those archived logs are probably shipped to a dedicated destination to be recovered by the PITR feature “clone from timestamp”.

Anyway, this is a managed service and you must live with it. You don’t want the DBA responsibility, and then you lack some control. Again, if you understand it, everything is fine. In case of failure, you can start by creating a clone from 2 hours ago and look at what you can do to repair the initial mistake. And two hours later, you know that you can create another clone recovered to the last second before the failure.

In summary

The Oracle Autonomous Database does not provide all the possibilities available when you manage the database yourself, but it is still at the top of the main cloud-managed database services: the RPO is at the second for any point in time between 2 hours and 60 days ago. The RTO is a few hours, even when dealing with terabytes. This, without anything to configure or pay in addition to the database service, and it includes the Always Free database. If the database is stopped during the backup window, the backup is skipped, but at least one backup remains even if it is out of the retention window.

The article Recovery in the ☁ with Oracle Autonomous Database first appeared on the dbi services blog.

A typical ODA project (and why I love Oracle Database Appliance)


Introduction

You can say everything about Oracle infrastructure possibilities, but nothing compares to experience based on real projects. I have done quite a lot of ODA projects during the past years (on other platforms too), and I would like to tell you why I trust this solution. And for that, I will tell you the story of one of my latest ODA projects.

Before choosing ODAs

A serious audit of the current infrastructure is needed, don’t miss that point. Sizing the storage, the memory and the licenses takes several days, but you will need that study. You cannot take good decisions blindly.

The metrics I would recommend to collect for designing your new ODA infrastructure are:
– the needed segregation (PROD/UAT/DEV/TEST/DR…)
– the DB time of the existing databases, consolidated for each group
– the size and growth forecast for each database
– the memory usage for SGA and PGA, and the memory advised by Oracle inside each database
– an overview of future projects coming in the next years
– an overview of legacy applications that will stop using Oracle in the next years

Once done, this audit will help you choose how many ODAs, which type of ODA, how many disks on each ODA and so on. For this particular project, the audit had been done several months before and led to an infrastructure composed of 6 X8-2M ODAs with various disk configurations.

Before delivery

I usually provide an Excel file to my customer for collecting essential information for the new environment:
– hostnames of the servers
– purpose of each server
– public network configuration (IP, netmask, gateway, DNS, NTP, domain)
– ILOM network configuration (IP, netmask, gateway – this is for the management interface)
– additional networks configuration (for backup or administration networks if additional interfaces have been ordered)
– the DATA/RECO repartition of the disks
– the target uid and gid for Linux users on ODA
– the version(s) of the database needed
– the number of cores to enable on each ODA (if using Enterprise Edition – this is how licenses are configured)

With this file, I’m pretty sure to have almost everything before the servers are delivered.

Delivery day

Once the ODAs are delivered, you’ll first need to rack them into your datacenters and plug in the power and network cables. If you are well organized, you’ll have done the network patching before and the cables will be ready to plug in. Racking an ODA S or M is easy, less than one hour. Even less if you’re used to racking servers. For an ODA HA it’s a little bit more complicated, because 1 ODA HA is actually 2 servers and 1 DAS storage enclosure, or 2 DAS enclosures if you ordered the maximum disk configuration. But these are normal servers, and it shouldn’t take too long or be too complicated.

1st Day

The first day is an important day because you can do a lot if everything is prepared.

You’ll need to download several zipfiles from MOS to deploy an ODA, and these zipfiles are quite big, 10 to 40GB depending on your configuration. Don’t wait to download them from the links in the documentation. You can choose an older software version than the current one, but it’s usually a good idea to deploy the very latest version. You need to download:
– the ISO file for bare metal reimaging
– the GI clone
– the DB clones for the database versions you planned to use
– the patch files, because you probably need them (we will see why later)

During the download, you can connect to the ILOM of the server and configure a static IP address. Getting the IP of the ILOM will need help from the network/system team, because each ILOM first looks for a dynamic IP from DHCP, therefore you need to change it.

Once everything is downloaded, you can start the reimaging of the first ODA from its ILOM, then a few minutes later on the second one if you’ve got multiple ODAs. After the reimaging, you will use the configure-firstnet script on each ODA to configure the basic network settings on the public interface, so that you can connect to the server itself.

Once the first ODA is ready, I usually prepare the json file for the appliance deployment from the template and according to the settings provided in the Excel file. It takes me about 1 hour to make sure nothing is wrong or missing, and then I start the appliance creation. I always create the appliance with a DBTEST database to make sure everything is fine up to database creation. During the first appliance creation, I copy the json file to the other ODA, change the few parameters that differ from the previous one, and start the appliance creation there as well.
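To give a rough idea, the command sequence looks like this. The hostnames, file names and job id are placeholders, configure-firstnet prompts interactively for the interface and IP settings, and the exact zipfile names depend on the version you downloaded:

# run as root on each freshly reimaged node
[root@oda01 ~]# odacli configure-firstnet

# register the GI clone in the repository, then create the appliance from the json file
[root@oda01 ~]# odacli update-repository -f /opt/dump/odacli-dcs-19.8.0.0.0-GI-clone.zip
[root@oda01 ~]# odacli create-appliance -r /opt/dump/oda01.json
[root@oda01 ~]# odacli describe-job -i <job_id>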

Once done, I deploy the additional dbhomes if needed on both ODAs in parallel.

Then, I check the version of the components with odacli describe-component. Be aware that a reimage does not update the microcodes of the ODA. If the firmware/BIOS/ILOM are not up to date, you need to apply the patch on top of your deployment, even if the software side is OK. So copy the patch to both ODAs, and apply it. It will probably need a reboot or two.
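As a sketch, once the patch zipfiles are copied to the server, the sequence is typically the following (the file name and the target version are only examples for 19.8):

[root@oda01 ~]# odacli update-repository -f /opt/dump/oda-sm-19.8.0.0.0-server.zip
[root@oda01 ~]# odacli update-dcsagent -v 19.8.0.0.0
[root@oda01 ~]# odacli update-server -v 19.8.0.0.0
[root@oda01 ~]# odacli update-storage -v 19.8.0.0.0
[root@oda01 ~]# odacli describe-component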

Once done, I usually configure the number of cores with odacli update-cpucores to match the license. If my customer only has Standard Edition licenses, I also decrease the number of cores to make sure to benefit from the maximum CPU speed. See why here.

Finally, I do a sanity check of the 2 ODAs, checking if everything is running fine.

At the end of this first day, 2 ODAs are ready to use.

2nd Day

I’m quite keen on keeping my ODAs exactly identical. So the next day, I deployed the other ODAs. For this particular project, it was a total of 6 ODAs with various disk and license configurations, but the deployment method is the same.

Based on what I did the previous day, I quite easily deployed the 4 other ODAs with their own json deployment files. Then I applied the patch, configured the license, and did the sanity checks.

At the end of this second day, my 6 ODAs were ready to use.

3rd day

2 ODAs came with hardware warnings, and most of the time these warnings are false positives. So I did a check from the ILOM CLI, reset the alerts and restarted the ILOM. That solved the problem.

As documentation is quite important for me and for my customer, I spent the rest of the day consolidating all the information and providing a first version of the documentation to the customer.

4th Day

The fourth day was actually the next week. In the meantime, my customer had created a first development database and put it in “production” without any problem.

With every new ODA software release, new features are available. And I think it’s worth testing these features because they can bring you something very useful.

This time, I was quite curious about the Data Guard feature now included with odacli. Manual Data Guard configuration is always quite long; it’s not very difficult, but a lot of steps are needed to achieve this kind of configuration. And you have to do it for each database, meaning that it can take days to put all your databases under Data Guard protection.

So I proposed to my customer to test the Data Guard implementation with odacli. I created a test database dedicated to that purpose and followed the online documentation. It took me all day, but I was able to create the standby, configure Data Guard, do a switchover, do the switchback and write a clean and complete procedure for the customer. You need to do that because the documentation has 1 or 2 steps that need more accuracy, and 1 or 2 others that need to be adapted to the customer’s environment.
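To give an idea of the workflow (this is only an outline, the identifiers and file names are placeholders, and the exact options can differ between ODA releases): a backup of the primary is restored as a standby on the second ODA, then configure-dataguard runs interactively and the role transitions are driven by odacli:

# on the primary ODA
[root@oda01 ~]# odacli create-backup -i <db_id> -bt Regular-L0

# on the standby ODA, restore the database as a standby from the backup report
[root@oda02 ~]# odacli irestore-database -r /opt/dump/backup_report.json -u DBTEST_S2 -ro STANDBY

# back on the primary ODA
[root@oda01 ~]# odacli configure-dataguard
[root@oda01 ~]# odacli list-dataguardstatus
[root@oda01 ~]# odacli switchover-dataguard -i <dataguard_config_id> -u DBTEST_S2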

This new feature will definitely simplify the Data Guard configuration, please take a look at my blog post if you’d like to test it.

5th Day

The goal of this day was to make a procedure for configuring Data Guard between the old platform and the new one. An ACFS to ASM conversion was needed as well. So we worked on that point, made a lot of tests and finally provided a procedure covering most cases. A DUPLICATE DATABASE FOR STANDBY with a BACKUP LOCATION was used in that procedure.
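As a minimal sketch of that approach (the path and connection are illustrative, not the customer’s actual settings), the standby is created on the new platform from a backup copied to a local or shared location, without connecting to the source database:

[oracle@oda01 ~]$ rman auxiliary /

RMAN> DUPLICATE TARGET DATABASE FOR STANDBY
        BACKUP LOCATION '/u99/backup/PROD'
        NOFILENAMECHECK DORECOVER;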

This procedure is not ODA specific, most of the advanced operations on ODA are done using classic tools.

6th, 7th and 8th days

These days were dedicated to the ODA workshop. It’s the perfect timing for this training, because the customer already got quite a lot of information during the deployment, and the servers are not yet in production, meaning that we can use them for demos and exercises. At dbi services, we make our own training material. The ODA workshop starts from the history of the ODA and goes up to the lifecycle management of the platform. You need 2 to 3 days to get a good overview of the solution and to be ready to work on it.

9th and 10th days

These days were actually extra days in my opinion, but extra days are not useless days. Most often, it’s a good addition because there’s always the need to go deeper into some topics.

This time, we needed to refine the Data Guard procedure and test it with older versions: 12.2, 12.1 and 11.2. We discovered that it didn’t work for 11.2, and we tried to debug it. Finally, we decided to use the ODA Data Guard feature only for 12.1 and later versions, which was OK because only a few databases will not be able to go higher than 11.2. We also found that configuring Data Guard from scratch only takes 40 minutes, including the backup/restore operations (for the smallest possible database), which definitely validated the efficiency of this method over manual configuration.

We also studied the best way to create additional listeners, because odacli does not include listener management. Using srvctl to do that was quite convenient and clean, so we provided a procedure to configure these listeners, and we also tested the Data Guard feature with these dedicated listeners: it was OK.
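For illustration, with Grid Infrastructure an additional listener can be registered and started like this (the listener name and port are just examples):

[grid@oda01 ~]$ srvctl add listener -listener listener2 -endpoints "TCP:1522"
[grid@oda01 ~]$ srvctl start listener -listener listener2
[grid@oda01 ~]$ srvctl status listener -listener listener2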

The last task was to provide a procedure for ACFS to ASM migration. Once migrated to 19c, 11gR2 databases can be moved to ASM to get rid of ACFS (as ASM has been chosen by the customer for all 12c and later databases). odacli does not provide a mechanism to move a database from ACFS to ASM, but it’s possible to restore an ACFS database to ASM quite easily.
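One possible sketch, assuming the database already runs from a 19c home and the target disk group is +DATA: an RMAN image copy to ASM followed by a switch. The redo logs, temp files, spfile and controlfiles then still have to be moved separately.

RMAN> BACKUP AS COPY DATABASE FORMAT '+DATA';
RMAN> SHUTDOWN IMMEDIATE;
RMAN> STARTUP MOUNT;
RMAN> SWITCH DATABASE TO COPY;
RMAN> RECOVER DATABASE;
RMAN> ALTER DATABASE OPEN;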

Actually, these 2 days were very efficient. And I also had enough time to send v1.1 of the documentation with all the procedures we had elaborated together with my customer.

The next days

For this project, my job was limited to deploying/configuring the ODAs, training my customer and giving him the best practices. With a team of experienced DBAs, it will be quite easy now to continue without me, the next task being to migrate each database to this new infrastructure. Even on ODA, the migration process takes days because there are a lot of databases coming from different versions.

Conclusion

ODA is a great platform to optimize your work, starting with the migration project. You don’t lose time because this solution is ready to use. I don’t think it’s possible to be as efficient with another on-premises platform. For sure, you will save a lot of time, you will have fewer problems, you will manage everything by yourself, and the performance will be good. ODA is probably the best option for achieving this kind of project with minimal risk and maximal efficiency.

The article A typical ODA project (and why I love Oracle Database Appliance) first appeared on the dbi services blog.

Oracle 21c : Create a New Database


Oracle 21c is now released on the cloud. And in this blog I am just testing my first database creation. As in earlier releases, dbca is still present. Just launch it:

[oracle@oraadserver admin]$ dbca
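The dbca wizard screenshots are not reproduced here. If you prefer the command line, a silent creation would look roughly like this (names, passwords, memory and file locations are only examples, not the settings used for this test):

[oracle@oraadserver ~]$ dbca -silent -createDatabase \
 -templateName General_Purpose.dbc \
 -gdbName DB21 -sid DB21 \
 -createAsContainerDatabase true \
 -numberOfPDBs 1 -pdbName PDB1 -pdbAdminPassword "**********" \
 -sysPassword "**********" -systemPassword "**********" \
 -totalMemory 4096 \
 -storageType FS -datafileDestination /u01/oradata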

After the creation, some queries to verify:

SQL> select comp_name,version,status from dba_registry;

COMP_NAME                                VERSION    STATUS
---------------------------------------- ---------- ----------
Oracle Database Catalog Views            21.0.0.0.0 VALID
Oracle Database Packages and Types       21.0.0.0.0 VALID
Oracle Real Application Clusters         21.0.0.0.0 OPTION OFF
JServer JAVA Virtual Machine             21.0.0.0.0 VALID
Oracle XDK                               21.0.0.0.0 VALID
Oracle Database Java Packages            21.0.0.0.0 VALID
OLAP Analytic Workspace                  21.0.0.0.0 VALID
Oracle XML Database                      21.0.0.0.0 VALID
Oracle Workspace Manager                 21.0.0.0.0 VALID
Oracle Text                              21.0.0.0.0 VALID
Oracle Multimedia                        21.0.0.0.0 VALID
Oracle OLAP API                          21.0.0.0.0 VALID
Spatial                                  21.0.0.0.0 VALID
Oracle Locator                           21.0.0.0.0 VALID
Oracle Label Security                    21.0.0.0.0 VALID
Oracle Database Vault                    21.0.0.0.0 VALID

16 rows selected.

In coming blogs we will see some new features. Just note that since Oracle 20c it is no longer possible to create a non-CDB instance.

The article Oracle 21c : Create a New Database first appeared on the dbi services blog.

Oracle 21c Security : ORA_STIG_PROFILE and ORA_CIS_PROFILE


In my previous blog I was testing the creation of a new Oracle 21c database. In this blog I am talking about two changes about the security.
In each new release Oracle strengthens security. That’s why, since Oracle 12.2, to meet Security Technical Implementation Guide (STIG) compliance, Oracle Database has provided the ORA_STIG_PROFILE profile.
With Oracle 21c the ORA_STIG_PROFILE profile was updated and Oracle has provided a new profile to meet the CIS standard: the ORA_CIS_PROFILE profile.
The ORA_STIG_PROFILE user profile has been updated with the latest Security Technical Implementation Guide (STIG) guidelines.
The ORA_CIS_PROFILE has the latest Center for Internet Security (CIS) guidelines.

ORA_STIG_PROFILE
In an Oracle 19c database, we can find the following for the ORA_STIG_PROFILE:

SQL> select profile,resource_name,limit from dba_profiles where profile='ORA_STIG_PROFILE' order by resource_name;

PROFILE                        RESOURCE_NAME                  LIMIT
------------------------------ ------------------------------ ------------------------------
ORA_STIG_PROFILE               COMPOSITE_LIMIT                DEFAULT
ORA_STIG_PROFILE               CONNECT_TIME                   DEFAULT
ORA_STIG_PROFILE               CPU_PER_CALL                   DEFAULT
ORA_STIG_PROFILE               CPU_PER_SESSION                DEFAULT
ORA_STIG_PROFILE               FAILED_LOGIN_ATTEMPTS          3
ORA_STIG_PROFILE               IDLE_TIME                      15
ORA_STIG_PROFILE               INACTIVE_ACCOUNT_TIME          35
ORA_STIG_PROFILE               LOGICAL_READS_PER_CALL         DEFAULT
ORA_STIG_PROFILE               LOGICAL_READS_PER_SESSION      DEFAULT
ORA_STIG_PROFILE               PASSWORD_GRACE_TIME            5
ORA_STIG_PROFILE               PASSWORD_LIFE_TIME             60
ORA_STIG_PROFILE               PASSWORD_LOCK_TIME             UNLIMITED
ORA_STIG_PROFILE               PASSWORD_REUSE_MAX             10
ORA_STIG_PROFILE               PASSWORD_REUSE_TIME            365
ORA_STIG_PROFILE               PASSWORD_VERIFY_FUNCTION       ORA12C_STIG_VERIFY_FUNCTION
ORA_STIG_PROFILE               PRIVATE_SGA                    DEFAULT
ORA_STIG_PROFILE               SESSIONS_PER_USER              DEFAULT

17 rows selected.

SQL>

Now, in Oracle 21c, we can see that there are some changes.

SQL> select profile,resource_name,limit from dba_profiles where profile='ORA_STIG_PROFILE' order by RESOURCE_NAME;

PROFILE                        RESOURCE_NAME                  LIMIT
------------------------------ ------------------------------ ------------------------------
ORA_STIG_PROFILE               COMPOSITE_LIMIT                DEFAULT
ORA_STIG_PROFILE               CONNECT_TIME                   DEFAULT
ORA_STIG_PROFILE               CPU_PER_CALL                   DEFAULT
ORA_STIG_PROFILE               CPU_PER_SESSION                DEFAULT
ORA_STIG_PROFILE               FAILED_LOGIN_ATTEMPTS          3
ORA_STIG_PROFILE               IDLE_TIME                      15
ORA_STIG_PROFILE               INACTIVE_ACCOUNT_TIME          35
ORA_STIG_PROFILE               LOGICAL_READS_PER_CALL         DEFAULT
ORA_STIG_PROFILE               LOGICAL_READS_PER_SESSION      DEFAULT
ORA_STIG_PROFILE               PASSWORD_GRACE_TIME            0
ORA_STIG_PROFILE               PASSWORD_LIFE_TIME             35
ORA_STIG_PROFILE               PASSWORD_LOCK_TIME             UNLIMITED
ORA_STIG_PROFILE               PASSWORD_REUSE_MAX             5
ORA_STIG_PROFILE               PASSWORD_REUSE_TIME            175
ORA_STIG_PROFILE               PASSWORD_ROLLOVER_TIME         DEFAULT
ORA_STIG_PROFILE               PASSWORD_VERIFY_FUNCTION       ORA12C_STIG_VERIFY_FUNCTION
ORA_STIG_PROFILE               PRIVATE_SGA                    DEFAULT
ORA_STIG_PROFILE               SESSIONS_PER_USER              DEFAULT

18 rows selected.

SQL>

The following parameters were updated

-PASSWORD_GRACE_TIME
-PASSWORD_LIFE_TIME
-PASSWORD_REUSE_MAX
-PASSWORD_REUSE_TIME
-And there is a new parameter PASSWORD_ROLLOVER_TIME

ORA_CIS_PROFILE
Below are the characteristics of the new profile:

SQL> select profile,resource_name,limit from dba_profiles where profile='ORA_CIS_PROFILE' order by RESOURCE_NAME;

PROFILE                        RESOURCE_NAME                  LIMIT
------------------------------ ------------------------------ ------------------------------
ORA_CIS_PROFILE                COMPOSITE_LIMIT                DEFAULT
ORA_CIS_PROFILE                CONNECT_TIME                   DEFAULT
ORA_CIS_PROFILE                CPU_PER_CALL                   DEFAULT
ORA_CIS_PROFILE                CPU_PER_SESSION                DEFAULT
ORA_CIS_PROFILE                FAILED_LOGIN_ATTEMPTS          5
ORA_CIS_PROFILE                IDLE_TIME                      DEFAULT
ORA_CIS_PROFILE                INACTIVE_ACCOUNT_TIME          120
ORA_CIS_PROFILE                LOGICAL_READS_PER_CALL         DEFAULT
ORA_CIS_PROFILE                LOGICAL_READS_PER_SESSION      DEFAULT
ORA_CIS_PROFILE                PASSWORD_GRACE_TIME            5
ORA_CIS_PROFILE                PASSWORD_LIFE_TIME             90
ORA_CIS_PROFILE                PASSWORD_LOCK_TIME             1
ORA_CIS_PROFILE                PASSWORD_REUSE_MAX             20
ORA_CIS_PROFILE                PASSWORD_REUSE_TIME            365
ORA_CIS_PROFILE                PASSWORD_ROLLOVER_TIME         DEFAULT
ORA_CIS_PROFILE                PASSWORD_VERIFY_FUNCTION       ORA12C_VERIFY_FUNCTION
ORA_CIS_PROFILE                PRIVATE_SGA                    DEFAULT
ORA_CIS_PROFILE                SESSIONS_PER_USER              10

18 rows selected.

SQL>

These user profiles can be used directly with the database users or as a basis for your own user profiles. Oracle keeps these profiles up to date to make it easier for you to implement password policies that meet the STIG and CIS guidelines.
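For example, assigning one of them to an existing account is a single statement (the user name is just an example):

SQL> alter user app_user profile ORA_STIG_PROFILE;

SQL> select username,profile from dba_users where username='APP_USER';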

The article Oracle 21c Security : ORA_STIG_PROFILE and ORA_CIS_PROFILE first appeared on the dbi services blog.


Oracle 21c Security : Gradual Database Password Rollover


Starting with Oracle 21c, the database password of an application can be changed without having to schedule downtime. This can be done by using the new profile parameter PASSWORD_ROLLOVER_TIME.
This sets a rollover period of time during which the application can log in using either the old password or the new password. With this enhancement, an administrator no longer needs to take the application down when the application database password is being rotated.
Let’s see in this blog how this works.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO
SQL>

First we create a profile in PDB1

SQL> show con_name;

CON_NAME
------------------------------
PDB1


SQL> CREATE PROFILE testgradualrollover LIMIT
 FAILED_LOGIN_ATTEMPTS 4
 PASSWORD_ROLLOVER_TIME 4;  

Profile created.

SQL>

Note that the parameter PASSWORD_ROLLOVER_TIME is specified in days. For example, 1/24 means 1 hour.
The minimum value for this parameter is 1 hour and the maximum value is 60 days or the lower of the PASSWORD_LIFE_TIME and PASSWORD_GRACE_TIME parameters.
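For example, the rollover period of the profile created above could be reduced to 1 hour like this (just to illustrate the unit, the demonstration below keeps 4 days):

SQL> ALTER PROFILE testgradualrollover LIMIT PASSWORD_ROLLOVER_TIME 1/24;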
Now let’s create a new user in PDB1 and assign him the profile we created:

SQL> create user edge identified by "Borftg8957##"  profile testgradualrollover;

User created.

SQL> grant create session to edge;

Grant succeeded.

SQL>

We can also verify the status of the account in the PDB

SQL>  select username,account_status from dba_users where username='EDGE';

USERNAME             ACCOUNT_STATUS
-------------------- --------------------
EDGE                 OPEN

SQL>

Now let’s log in with the new user:


[oracle@oraadserver admin]$ sqlplus edge/"Borftg8957##"@pdb1

SQL*Plus: Release 21.0.0.0.0 - Production on Thu Dec 10 11:14:07 2020
Version 21.1.0.0.0

Copyright (c) 1982, 2020, Oracle.  All rights reserved.


Connected to:
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.1.0.0.0

SQL> show con_name;

CON_NAME
------------------------------
PDB1
SQL> show user;
USER is "EDGE"
SQL>

Now let’s change the password of the user edge

SQL> alter user edge identified by "Morfgt5879!!";

User altered.

SQL>

As the rollover period is set to 4 days in the profile testgradualrollover, the user edge should be able to connect for 4 days with either the old password or the new one.
Let’s test with the old password

[oracle@oraadserver admin]$ sqlplus edge/"Borftg8957##"@pdb1

SQL*Plus: Release 21.0.0.0.0 - Production on Thu Dec 10 11:21:02 2020
Version 21.1.0.0.0

Copyright (c) 1982, 2020, Oracle.  All rights reserved.

Last Successful login time: Thu Dec 10 2020 11:14:07 +01:00

Connected to:
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.1.0.0.0

SQL> show con_name;

CON_NAME
------------------------------
PDB1
SQL> show user;
USER is "EDGE"
SQL>

Let’s test with the new password

[oracle@oraadserver ~]$ sqlplus edge/'Morfgt5879!!'@pdb1

SQL*Plus: Release 21.0.0.0.0 - Production on Thu Dec 10 11:24:52 2020
Version 21.1.0.0.0

Copyright (c) 1982, 2020, Oracle.  All rights reserved.

Last Successful login time: Thu Dec 10 2020 11:21:02 +01:00

Connected to:
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.1.0.0.0

SQL> show user;
USER is "EDGE"

SQL> show con_name;

CON_NAME
------------------------------
PDB1
SQL>

We can see that the connection is successful in both cases. If we query DBA_USERS, we can see the status of the rollover:

SQL> select username,account_status from dba_users where username='EDGE';

USERNAME             ACCOUNT_STATUS
-------------------- --------------------
EDGE                 OPEN & IN ROLLOVER

To end the password rollover period:
-Let the password rollover expire on its own
-As either the user or an administrator, run the command:

    Alter user edge expire password rollover period;

-As an administrator, expire the user password

Alter user edge password expire;

Database behavior during the gradual password rollover period can be found here in the documentation

The article Oracle 21c Security : Gradual Database Password Rollover first appeared on the dbi services blog.

Oracle 21c Security : Mandatory Profile


With Oracle 21c, it is now possible to enforce a password policy (length, number of digits…) for all pluggable databases or for specific pluggable databases via profiles. This is done by creating a mandatory profile in the root CDB; this profile is then attached to the corresponding PDBs.
The mandatory profile is a generic profile that can only have a single parameter, the PASSWORD_VERIFY_FUNCTION.
The password complexity verification function of the mandatory profile is checked before the password complexity function that is associated with the user account profile.
For example, the password length defined in the mandatory profile will take precedence over any password length defined in any other profile associated with the user.
When defined, the limit of the mandatory profile is enforced in addition to the limits of the actual profile of the user.
A mandatory profile cannot be assigned to a user but should be attached to a PDB.

In this demonstration we will consider an instance DB21 with 3 PDBs:
-PDB1
-PDB2
-PDB3

We will create 2 mandatory profiles:
c##mand_profile_pdb1_pdb2 which will be assigned to PDB1 and PDB2
c##mand_profile_pdb3 which will be assigned to PDB3

SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO
         4 PDB2                           READ WRITE NO
         5 PDB3                           READ WRITE NO
SQL>

We will create two verification functions in the root container and associate them with our mandatory profiles. The first function checks for a minimum password length of 6:

SQL> CREATE OR REPLACE FUNCTION func_pdb1_2_verify_function
 ( username     varchar2,
   password     varchar2,
   old_password varchar2)
 return boolean IS
BEGIN
   if not ora_complexity_check(password, chars => 6) then
      return(false);
   end if;
   return(true);
END;
/  

Function created.

SQL>

The second function checks for a minimum password length of 10:

SQL> CREATE OR REPLACE FUNCTION func_pdb3_verify_function
 ( username     varchar2,
   password     varchar2,
   old_password varchar2)
 return boolean IS
BEGIN
   if not ora_complexity_check(password, chars => 10) then
      return(false);
      end if;
   return(true);
END;
/ 

Function created.

SQL>

Now let’s create the two mandatory profiles in the root container

SQL>
CREATE MANDATORY PROFILE c##mand_profile_pdb1_pdb2
LIMIT PASSWORD_VERIFY_FUNCTION func_pdb1_2_verify_function
CONTAINER = ALL;

Profile created.

And we create the second mandatory profile, c##mand_profile_pdb3, in the same way:

SQL> CREATE MANDATORY PROFILE c##mand_profile_pdb3
LIMIT PASSWORD_VERIFY_FUNCTION func_pdb3_verify_function
CONTAINER = ALL;  

Profile created.

Remember that we want to associate the mandatory profile c##mand_profile_pdb1_pdb2 with PDB1 and PDB2. So we can first attach this profile to all PDBs, from the root container:

SQL> show con_name;

CON_NAME
------------------------------
CDB$ROOT

SQL> alter system set mandatory_user_profile=c##mand_profile_pdb1_pdb2;

System altered.

SQL>

To associate the profile c##mand_profile_pdb3 with PDB3, we can set the parameter directly in PDB3 (it is stored in the PDB spfile):

SQL> show con_name;

CON_NAME
------------------------------
PDB3
SQL>  alter system set mandatory_user_profile=c##mand_profile_pdb3;

System altered.

SQL>

We can then verify the values of the MANDATORY_USER_PROFILE parameter in the different PDBs:

SQL> show con_name;

CON_NAME
------------------------------
PDB3
SQL>  alter system set mandatory_user_profile=c##mand_profile_pdb3;

System altered.

SQL> show parameter mandatory;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
mandatory_user_profile               string      C##MAND_PROFILE_PDB3
SQL> alter session set container=PDB1;

Session altered.

SQL> show parameter mandatory;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
mandatory_user_profile               string      C##MAND_PROFILE_PDB1_PDB2
SQL>  alter session set container=PDB2;

Session altered.

SQL> show parameter mandatory;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
mandatory_user_profile               string      C##MAND_PROFILE_PDB1_PDB2
SQL>

To test it, we will try to create a user in PDB3, for example, with a password length < 10:

SQL> create user toto identified by "DGDTr##5";
create user toto identified by "DGDTr##5"
*
ERROR at line 1:
ORA-28219: password verification failed for mandatory profile
ORA-20000: password length less than 10 characters


SQL>

The article Oracle 21c Security : Mandatory Profile first appeared on the dbi services blog.

Oracle write consistency bug and multi-thread de-queuing


By Franck Pachot

.
This was initially posted on CERN Database blog where it seems to be lost. Here is a copy thanks to web.archive.org
Additional notes:
– I’ve tested and got the same behaviour in Oracle 21c
– you will probably enjoy reading Hatem Mahmoud going further on Write consistency and DML restart

Posted by Franck Pachot on Thursday, 27 September 2018

Here is a quick test I did after encountering an abnormal behavior in write consistency and before finding some references to a bug on StackOverflow (yes, write consistency questions on StackOverflow!) and AskTOM. And a bug opened by Tom Kyte in 2011, that is still there in 18c.

The original issue was with a task management system used to run jobs. Here is the simple table where all rows have a ‘NEW’ status, and the goal is to have several threads processing them, updating them to the ‘HOLDING’ status and adding the process name.


set echo on
drop table DEMO;
create table DEMO (ID primary key,STATUS,NAME,CREATED)
 as select rownum,cast('NEW' as varchar2(10)),cast(null as varchar2(10)),sysdate+rownum/24/60 from xmltable('1 to 10')
/

Now here is the query that selects the 5 oldest rows in status ‘NEW’ and updates them to the ‘HOLDING’ status:


UPDATE DEMO SET NAME = 'NUMBER1', STATUS = 'HOLDING' 
WHERE ID IN (
 SELECT ID FROM (
  SELECT ID, rownum as counter 
  FROM DEMO 
  WHERE STATUS = 'NEW' 
  ORDER BY CREATED
 ) 
WHERE counter <= 5) 
;

Note that the update also sets the name of the session which has processed the rows, here ‘NUMBER1’.

Once the query had started, and before the commit, I ran the same query from another session, but with ‘NUMBER2’.


UPDATE DEMO SET NAME = 'NUMBER2', STATUS = 'HOLDING' 
WHERE ID IN (
 SELECT ID FROM (
  SELECT ID, rownum as counter 
  FROM DEMO 
  WHERE STATUS = 'NEW' 
  ORDER BY CREATED
 ) 
WHERE counter <= 5) 
;

Of course, this waits on row lock from the first session as it has selected the same rows. Then I commit the first session, and check, from the first session what has been updated:


commit;
set pagesize 1000
select versions_operation,versions_xid,DEMO.* from DEMO versions between scn minvalue and maxvalue order by ID,2;

V VERSIONS_XID             ID STATUS     NAME       CREATED        
- ---------------- ---------- ---------- ---------- ---------------
U 0500110041040000          1 HOLDING    NUMBER1    27-SEP-18 16:48
                            1 NEW                   27-SEP-18 16:48
U 0500110041040000          2 HOLDING    NUMBER1    27-SEP-18 16:49
                            2 NEW                   27-SEP-18 16:49
U 0500110041040000          3 HOLDING    NUMBER1    27-SEP-18 16:50
                            3 NEW                   27-SEP-18 16:50
U 0500110041040000          4 HOLDING    NUMBER1    27-SEP-18 16:51
                            4 NEW                   27-SEP-18 16:51
U 0500110041040000          5 HOLDING    NUMBER1    27-SEP-18 16:52
                            5 NEW                   27-SEP-18 16:52
                            6 NEW                   27-SEP-18 16:53
                            7 NEW                   27-SEP-18 16:54
                            8 NEW                   27-SEP-18 16:55
                            9 NEW                   27-SEP-18 16:56
                           10 NEW                   27-SEP-18 16:57

I have used a flashback query to see all versions of the rows. All 10 have been created and the first 5 of them have been updated by NUMBER1.

Now, my second session continues, updating to NUMBER2. I commit and look at the row versions again:


commit;
set pagesize 1000
select versions_operation,versions_xid,DEMO.* from DEMO versions between scn minvalue and maxvalue order by ID,2;

V VERSIONS_XID             ID STATUS     NAME       CREATED        
- ---------------- ---------- ---------- ---------- ---------------
U 04001B0057030000          1 HOLDING    NUMBER2    27-SEP-18 16:48
U 0500110041040000          1 HOLDING    NUMBER1    27-SEP-18 16:48
                            1 NEW                   27-SEP-18 16:48
U 04001B0057030000          2 HOLDING    NUMBER2    27-SEP-18 16:49
U 0500110041040000          2 HOLDING    NUMBER1    27-SEP-18 16:49
                            2 NEW                   27-SEP-18 16:49
U 04001B0057030000          3 HOLDING    NUMBER2    27-SEP-18 16:50
U 0500110041040000          3 HOLDING    NUMBER1    27-SEP-18 16:50
                            3 NEW                   27-SEP-18 16:50
U 04001B0057030000          4 HOLDING    NUMBER2    27-SEP-18 16:51
U 0500110041040000          4 HOLDING    NUMBER1    27-SEP-18 16:51
                            4 NEW                   27-SEP-18 16:51
U 04001B0057030000          5 HOLDING    NUMBER2    27-SEP-18 16:52
U 0500110041040000          5 HOLDING    NUMBER1    27-SEP-18 16:52
                            5 NEW                   27-SEP-18 16:52
                            6 NEW                   27-SEP-18 16:53
                            7 NEW                   27-SEP-18 16:54
                            8 NEW                   27-SEP-18 16:55
                            9 NEW                   27-SEP-18 16:56
                           10 NEW                   27-SEP-18 16:57

This is not what I expected. I wanted my second session to process the other rows, but here it seems that it has processed the same rows as the first one. What was done by NUMBER1 has been lost and overwritten by NUMBER2. This is inconsistent, it violates ACID properties, and it should not happen. An SQL statement must ensure write consistency: either by locking all the rows as soon as they are read (for non-MVCC databases where reads block writes), or by restarting the update when a mutating row is encountered. Oracle’s default behaviour is the second case: the NUMBER2 query reads rows 1 to 5, because the changes made by NUMBER1, not committed yet, are invisible to NUMBER2. But the execution should keep track of the columns referenced in the WHERE clause. When attempting to update a row, now that the concurrent change is visible, the update is possible only if the WHERE clause used to select the rows still selects this row. If not, the database should raise an error (this is what happens in the serializable isolation level) or restart the update when in the default statement-level consistency.

Here, probably because of the nested subquery, write consistency is not guaranteed, and this is a bug.

One workaround is not to use subqueries. However, as we need to ORDER BY the rows in order to process the oldest first, we cannot avoid the subquery. The workaround for this is to add STATUS = ‘NEW’ in the WHERE clause of the update itself, so that the update restart works correctly.
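Here is what this workaround looks like on the same statement, with the extra predicate at the UPDATE level so that a restart re-evaluates the status:

UPDATE DEMO SET NAME = 'NUMBER2', STATUS = 'HOLDING'
WHERE ID IN (
 SELECT ID FROM (
  SELECT ID, rownum as counter
  FROM DEMO
  WHERE STATUS = 'NEW'
  ORDER BY CREATED
 )
WHERE counter <= 5)
AND STATUS = 'NEW'
;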

However, the goal of multithreading those processes is to be scalable, and multiple update restarts may finally serialize all those updates.

The preferred solution for this is to ensure that the updates do not attempt to touch the same rows. This can be achieved by a SELECT … FOR UPDATE SKIP LOCKED. As this cannot be added directly to the update statement, we need a cursor. Something like this can do the job:


declare counter number:=5;
begin
 for c in (select /*+ first_rows(5) */ ID FROM DEMO 
           where STATUS = 'NEW' 
           order by CREATED
           for update skip locked)
 loop
  counter:=counter-1;
  update DEMO set NAME = 'NUMBER1', STATUS = 'HOLDING'  where ID = c.ID and STATUS = 'NEW';
  exit when counter=0;
 end loop;
end;
/
commit;

This can be optimized further but just gives an idea of what is needed for a scalable solution. Waiting for locks is not scalable.

The article Oracle write consistency bug and multi-thread de-queuing first appeared on the dbi services blog.

Efficiently query DBA_EXTENTS for FILE_ID / BLOCK_ID


By Franck Pachot

.
This was initially posted to CERN Database blog on Thursday, 27 September 2018 where it seems to be lost. Here is a copy thanks to web.archive.org

Did you ever try to query DBA_EXTENTS on a very large database with LMT tablespaces? I had to, in the past, in order to find which segment a corrupt block belonged to. The information about extent allocation is stored in the datafile headers, visible through X$KTFBUE, and queries on it can be very expensive. In addition to that, the optimizer tends to start with the segments and get to this X$KTFBUE for each of them. At that time, I had quickly created a view on the internal dictionary tables, forcing it to start with X$KTFBUE thanks to a materialized CTE, to replace DBA_EXTENTS. I published this on dba-village in 2006.

I recently wanted to know the segment/extent for a hot block, identified by its file_id and block_id, on a 900TB database with 7000 datafiles and 90000 extents, so I went back to this old query and got my result in 1 second. The idea is to be sure that we start with the file (X$KCCFE) and then get to the extent allocation (X$KTFBUE) before going to the segments.

So here is the query:


column owner format a6
column segment_type format a20
column segment_name format a15
column partition_name format a15
set linesize 200
set timing on time on echo on autotrace on stat
WITH
 l AS ( /* LMT extents indexed on ktfbuesegtsn,ktfbuesegfno,ktfbuesegbno */
  SELECT ktfbuesegtsn segtsn,ktfbuesegfno segrfn,ktfbuesegbno segbid, ktfbuefno extrfn,
         ktfbuebno fstbid,ktfbuebno + ktfbueblks - 1 lstbid,ktfbueblks extblks,ktfbueextno extno
  FROM sys.x$ktfbue
 ),
 d AS ( /* DMT extents ts#, segfile#, segblock# */
  SELECT ts# segtsn,segfile# segrfn,segblock# segbid, file# extrfn,
         block# fstbid,block# + length - 1 lstbid,length extblks, ext# extno
  FROM sys.uet$
 ),
 s AS ( /* segment information for the tablespace that contains afn file */
  SELECT /*+ materialized */
  f1.fenum afn,f1.ferfn rfn,s.ts# segtsn,s.FILE# segrfn,s.BLOCK# segbid ,s.TYPE# segtype,f2.fenum segafn,t.name tsname,blocksize
  FROM sys.seg$ s, sys.ts$ t, sys.x$kccfe f1,sys.x$kccfe f2 
  WHERE s.ts#=t.ts# AND t.ts#=f1.fetsn AND s.FILE#=f2.ferfn AND s.ts#=f2.fetsn
 ),
 m AS ( /* extent mapping for the tablespace that contains afn file */
SELECT /*+ use_nl(e) ordered */
 s.afn,s.segtsn,s.segrfn,s.segbid,extrfn,fstbid,lstbid,extblks,extno, segtype,s.rfn, tsname,blocksize
 FROM s,l e
 WHERE e.segtsn=s.segtsn AND e.segrfn=s.segrfn AND e.segbid=s.segbid
 UNION ALL
 SELECT /*+ use_nl(e) ordered */ 
 s.afn,s.segtsn,s.segrfn,s.segbid,extrfn,fstbid,lstbid,extblks,extno, segtype,s.rfn, tsname,blocksize
 FROM s,d e
  WHERE e.segtsn=s.segtsn AND e.segrfn=s.segrfn AND e.segbid=s.segbid
 UNION ALL
 SELECT /*+ use_nl(e) use_nl(t) ordered */
 f.fenum afn,null segtsn,null segrfn,null segbid,f.ferfn extrfn,e.ktfbfebno fstbid,e.ktfbfebno+e.ktfbfeblks-1 lstbid,e.ktfbfeblks extblks,null extno, null segtype,f.ferfn rfn,name tsname,blocksize
 FROM sys.x$kccfe f,sys.x$ktfbfe e,sys.ts$ t
 WHERE t.ts#=f.fetsn and e.ktfbfetsn=f.fetsn and e.ktfbfefno=f.ferfn
 UNION ALL
 SELECT /*+ use_nl(e) use_nl(t) ordered */
 f.fenum afn,null segtsn,null segrfn,null segbid,f.ferfn extrfn,e.block# fstbid,e.block#+e.length-1 lstbid,e.length extblks,null extno, null segtype,f.ferfn rfn,name tsname,blocksize
 FROM sys.x$kccfe f,sys.fet$ e,sys.ts$ t
 WHERE t.ts#=f.fetsn and e.ts#=f.fetsn and e.file#=f.ferfn
 ),
 o AS (
  SELECT s.tablespace_id segtsn,s.relative_fno segrfn,s.header_block   segbid,s.segment_type,s.owner,s.segment_name,s.partition_name
  FROM SYS_DBA_SEGS s
 ),
datafile_map as (
SELECT
 afn file_id,fstbid block_id,extblks blocks,nvl(segment_type,decode(segtype,null,'free space','type='||segtype)) segment_type,
 owner,segment_name,partition_name,extno extent_id,extblks*blocksize bytes,
 tsname tablespace_name,rfn relative_fno,m.segtsn,m.segrfn,m.segbid
 FROM m,o WHERE extrfn=rfn and m.segtsn=o.segtsn(+) AND m.segrfn=o.segrfn(+) AND m.segbid=o.segbid(+)
UNION ALL
SELECT
 file_id+(select to_number(value) from v$parameter WHERE name='db_files') file_id,
 1 block_id,blocks,'tempfile' segment_type,
 '' owner,file_name segment_name,'' partition_name,0 extent_id,bytes,
  tablespace_name,relative_fno,0 segtsn,0 segrfn,0 segbid
 FROM dba_temp_files
)
select * from datafile_map where file_id=5495 and 11970455 between block_id and block_id+blocks

And here is the result, with execution statistics:



   FILE_ID   BLOCK_ID     BLOCKS SEGMENT_TYPE         OWNER  SEGMENT_NAME    PARTITION_NAME    EXTENT_ID      BYTES TABLESPACE_NAME      RELATIVE_FNO     SEGTSN     SEGRFN    SEGBID
---------- ---------- ---------- -------------------- ------ --------------- ---------------- ---------- ---------- -------------------- ------------ ---------- ---------- ----------
      5495   11964544            8192 INDEX PARTITION LHCLOG DN_PK           PART_DN_20161022 1342         67108864 LOG_DATA_20161022            1024       6364       1024        162

Elapsed: 00:00:01.25

Statistics
----------------------------------------------------------
        103  recursive calls
       1071  db block gets
      21685  consistent gets
        782  physical reads
        840  redo size
       1548  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Knowing the segment from the block address is important in performance tuning, when we get the file_id/block_id from wait event parameters. It is even more important when a block corruption is detected, and having a fast query may help.

The article Efficiently query DBA_EXTENTS for FILE_ID / BLOCK_ID first appeared on the dbi services blog.

NTP is not working for ODA new deployment (reimage) in version 19.8?


Having recently reimaged and patched several ODAs in version 19.8 and 19.9, I could see an issue with NTP. During my troubleshooting I could determine the root cause and find the appropriate solution. Through this blog I would like to share my experience with you.

Symptom/Analysis

ODA version 19.6 or higher comes with Oracle Linux 7. Since Oracle Linux 7, the default time synchronization service is no longer ntp but chrony. In Oracle Linux 7, ntp is still available and can still be used, but the ntp service will disappear in Oracle Linux 8.

What I could see from my last deployments and patchings is that:

  • Patching your ODA to version 19.8 or 19.9 from 19.6: the system will still use ntpd and the chronyd service will be deactivated. All is working fine.
  • You reimage your ODA to version 19.8: chronyd will be activated and NTP will not work any more.
  • You reimage your ODA to version 19.9: ntpd will be activated and NTP will be working with no problem.

So the problem is only if you reimage your ODA to version 19.8.

Problem explanation

The problem is due to the fact that the odacli script deploying the appliance still updates the ntpd configuration (/etc/ntp.conf) with the IP addresses provided, and not the chronyd one. But chronyd will be, by default, activated and started with no configuration.

Solving the problem

There are 2 solutions.

A/ Configure and use chronyd

You configure /etc/chrony.conf with the NTP server addresses given during the appliance creation and you restart the chronyd service.

Configure chrony:

oracle@ODA01:/u01/app/oracle/local/dmk/etc/ [rdbms19.8.0.0] vi /etc/chrony.conf

oracle@ODA01:/u01/app/oracle/local/dmk/etc/ [rdbms19.8.0.0] cat /etc/chrony.conf
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.pool.ntp.org iburst
#server 1.pool.ntp.org iburst
#server 2.pool.ntp.org iburst
#server 3.pool.ntp.org iburst
server 212.X.X.X.103 prefer
server 212.X.X.X.100
server 212.X.X.X.101


# Record the rate at which the system clock gains/losses time.
driftfile /var/lib/chrony/drift

# Allow the system clock to be stepped in the first three updates
# if its offset is larger than 1 second.
makestep 1.0 3

# Enable kernel synchronization of the real-time clock (RTC).
rtcsync

# Enable hardware timestamping on all interfaces that support it.
#hwtimestamp *

# Increase the minimum number of selectable sources required to adjust
# the system clock.
#minsources 2

# Allow NTP client access from local network.
#allow 192.168.0.0/16

# Serve time even if not synchronized to a time source.
#local stratum 10

# Specify file containing keys for NTP authentication.
#keyfile /etc/chrony.keys

# Specify directory for log files.
logdir /var/log/chrony

# Select which information is logged.
#log measurements statistics tracking

And you restart the chrony service:

[root@ODA01 ~]# service chronyd restart
Redirecting to /bin/systemctl restart chronyd.service

B/ Start ntp

Starting ntpd will automatically stop the chrony service.

[root@ODA01 ~]# ntpq -p
ntpq: read: Connection refused

[root@ODA01 ~]# service ntpd restart
Redirecting to /bin/systemctl restart ntpd.service

Checking synchronization:

[root@ODA01 ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
lantime. domain_name .STEP.          16 u    - 1024    0    0.000    0.000   0.000
*ntp1. domain_name    131.188.3.223    2 u  929 1024  377    0.935   -0.053   0.914
+ntp2. domain_name    131.188.3.223    2 u  113 1024  377    0.766    0.184   2.779

Checking both ntp and chrony services:

[root@ODA01 ~]# service ntpd status
Redirecting to /bin/systemctl status ntpd.service
● ntpd.service - Network Time Service
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2020-11-27 09:40:08 CET; 31min ago
  Process: 68548 ExecStart=/usr/sbin/ntpd -u ntp:ntp $OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 68549 (ntpd)
    Tasks: 1
   CGroup: /system.slice/ntpd.service
           └─68549 /usr/sbin/ntpd -u ntp:ntp -g

Nov 27 09:40:08 ODA01 ntpd[68549]: ntp_io: estimated max descriptors: 1024, initial socket boundary: 16
Nov 27 09:40:08 ODA01 ntpd[68549]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
Nov 27 09:40:08 ODA01 ntpd[68549]: Listen normally on 1 lo 127.0.0.1 UDP 123
Nov 27 09:40:08 ODA01 ntpd[68549]: Listen normally on 2 btbond1 10.X.X.10 UDP 123
Nov 27 09:40:08 ODA01 ntpd[68549]: Listen normally on 3 priv0 192.X.X.24 UDP 123
Nov 27 09:40:08 ODA01 ntpd[68549]: Listen normally on 4 virbr0 192.X.X.1 UDP 123
Nov 27 09:40:08 ODA01 ntpd[68549]: Listening on routing socket on fd #21 for interface updates
Nov 27 09:40:08 ODA01 ntpd[68549]: 0.0.0.0 c016 06 restart
Nov 27 09:40:08 ODA01 ntpd[68549]: 0.0.0.0 c012 02 freq_set kernel 0.000 PPM
Nov 27 09:40:08 ODA01 ntpd[68549]: 0.0.0.0 c011 01 freq_not_set

[root@ODA01 ~]# service chronyd status
Redirecting to /bin/systemctl status chronyd.service
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Fri 2020-11-27 09:40:08 CET; 32min ago
     Docs: man:chronyd(8)
           man:chrony.conf(5)
  Process: 46183 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
  Process: 46180 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 46182 (code=exited, status=0/SUCCESS)

Nov 27 09:18:25 ODA01 systemd[1]: Starting NTP client/server...
Nov 27 09:18:25 ODA01 chronyd[46182]: chronyd version 3.4 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +SECHASH +IPV6 +DEBUG)
Nov 27 09:18:25 ODA01 chronyd[46182]: Frequency 0.000 +/- 1000000.000 ppm read from /var/lib/chrony/drift
Nov 27 09:18:25 ODA01 systemd[1]: Started NTP client/server.
Nov 27 09:40:08 ODA01 systemd[1]: Stopping NTP client/server...
Nov 27 09:40:08 ODA01 systemd[1]: Stopped NTP client/server.

You might need to disable the chronyd service with systemctl to avoid chronyd starting automatically after a server reboot.
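For example, assuming you want ntpd to stay the active service across reboots:

[root@ODA01 ~]# systemctl disable chronyd
[root@ODA01 ~]# systemctl enable ntpd
[root@ODA01 ~]# systemctl is-enabled chronyd ntpd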

Are you getting a socket error with chrony?

If you are getting the following error when starting chrony, you will need to pass the appropriate option to start chronyd with IPv4 only:

Nov 27 09:09:19 ODA01 chronyd[35107]: Could not open IPv6 command socket : Address family not supported by protocol.

Example of the error encountered:

[root@ODA01 ~]# service chronyd status
Redirecting to /bin/systemctl status chronyd.service
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2020-11-27 09:09:19 CET; 5min ago
     Docs: man:chronyd(8)
           man:chrony.conf(5)
  Process: 35109 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
  Process: 35105 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 35107 (chronyd)
    Tasks: 1
   CGroup: /system.slice/chronyd.service
           └─35107 /usr/sbin/chronyd

Nov 27 09:09:19 ODA01 systemd[1]: Starting NTP client/server...
Nov 27 09:09:19 ODA01 chronyd[35107]: chronyd version 3.4 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +SECHASH +IPV6 +DEBUG)
Nov 27 09:09:19 ODA01 chronyd[35107]: Could not open IPv6 command socket : Address family not supported by protocol
Nov 27 09:09:19 ODA01 chronyd[35107]: Frequency 0.000 +/- 1000000.000 ppm read from /var/lib/chrony/drift
Nov 27 09:09:19 ODA01 systemd[1]: Started NTP client/server.

The chronyd system service uses a variable to set its options:

[root@ODA01 ~]# cat /usr/lib/systemd/system/chronyd.service
[Unit]
Description=NTP client/server
Documentation=man:chronyd(8) man:chrony.conf(5)
After=ntpdate.service sntp.service ntpd.service
Conflicts=ntpd.service systemd-timesyncd.service
ConditionCapability=CAP_SYS_TIME

[Service]
Type=forking
PIDFile=/var/run/chrony/chronyd.pid
EnvironmentFile=-/etc/sysconfig/chronyd
ExecStart=/usr/sbin/chronyd $OPTIONS
ExecStartPost=/usr/libexec/chrony-helper update-daemon
PrivateTmp=yes
ProtectHome=yes
ProtectSystem=full

[Install]
WantedBy=multi-user.target

We need to add the -4 option to the chronyd service configuration file:

[root@ODA01 ~]# cat /etc/sysconfig/chronyd
# Command-line options for chronyd
OPTIONS=""

[root@ODA01 ~]# vi /etc/sysconfig/chronyd

[root@ODA01 ~]# cat /etc/sysconfig/chronyd
# Command-line options for chronyd
OPTIONS="-4"

You then just need to restart the chrony service:

[root@ODA01 ~]# service chronyd restart
Redirecting to /bin/systemctl restart chronyd.service

[root@ODA01 ~]# service chronyd status
Redirecting to /bin/systemctl status chronyd.service
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2020-11-27 09:18:25 CET; 4s ago
     Docs: man:chronyd(8)
           man:chrony.conf(5)
  Process: 46183 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
  Process: 46180 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 46182 (chronyd)
    Tasks: 1
   CGroup: /system.slice/chronyd.service
           └─46182 /usr/sbin/chronyd -4

Nov 27 09:18:25 ODA01 systemd[1]: Starting NTP client/server...
Nov 27 09:18:25 ODA01 chronyd[46182]: chronyd version 3.4 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +SECHASH +IPV6 +DEBUG)
Nov 27 09:18:25 ODA01 chronyd[46182]: Frequency 0.000 +/- 1000000.000 ppm read from /var/lib/chrony/drift
Nov 27 09:18:25 ODA01 systemd[1]: Started NTP client/server.

Finally, you can use the following command to check NTP synchronization with chronyd:

[root@ODA01 ~]# chronyc tracking

The article NTP is not working for ODA new deployment (reimage) in version 19.8? first appeared on the dbi services blog.
