
Oracle Database Appliance: which storage capacity to choose?


Introduction

If you’re considering ODA for your next platform, you surely already appreciate the simplicity of the offer: 3 models with few options, which makes it easy to choose from.

Another benefit is the 5-year hardware support, and combined with software updates generally available for ODAs up to 7 years old, you can keep your ODA running even longer for non-critical databases and/or if you have a strong Disaster Recovery solution (including Data Guard or Dbvisit standby). Some of my customers are still using X4-2s and are confident in their ODAs because they have been quite reliable across the years.

Models and storage limits

One of the main drawbacks of the ODA: it doesn’t have unlimited storage. Disks are local NVMe SSDs (or sit in a dedicated enclosure), and adding storage through a NAS connection is not recommended (although technically possible).

3 ODA models are available: X8-2S and X8-2M are one-node ODAs, while X8-2HA is a two-node ODA with DAS storage including SSDs and/or HDDs (High Performance or High Capacity version).

Please refer to my previous blog post for more information about the current generation.

Storage on ODA is always dedicated to database-related files: datafiles, redo logs, controlfiles, archivelogs, flashback logs, backups (if you do them locally on the ODA), etc. The Linux system, Oracle products (Grid Infrastructure and Oracle database engines), home folders and so on reside on internal M.2 SSD disks, large enough for normal use.

X8-2S/X8-2M storage limit

ODA X8-2S is the entry-level ODA. It only has one CPU, but with 16 powerful cores and 192GB of memory it’s anything but a low-end server. 10 empty storage slots are available in the front panel, but don’t expect to extend the storage: this ODA is delivered with 2 disks and doesn’t support adding more. That’s it. With the two 6.4TB disks, you’ll have a RAW capacity of 12.8TB.

ODA X8-2M is much more capable than its little brother. Physically identical to the X8-2S, it has two CPUs and twice the amount of RAM. This 32-core server fitted with 384GB of RAM is a serious player. It’s still delivered with two 6.4TB disks, but unlike the S version, all 10 empty storage slots can be populated to reach a stunning 76.8TB of RAW storage. This is still not unlimited, but the limit is actually quite high. Disks are added in pairs, so you can have 2-4-6-8-10-12 disks for various configurations, up to a maximum of 76.8TB RAW capacity. Only disks dedicated to the ODA are suitable, and don’t expect to put in bigger disks: it only supports the same 6.4TB disks as those delivered with the base server.

RAW capacity means without redundancy, and you will lose half of the capacity with ASM redundancy. Running an ODA without redundancy is not possible, in case you were wondering: ASM redundancy is the only way to secure data, as there is no RAID controller inside the server. You already know that disk capacity and real capacity always differ, so several years ago Oracle included in the documentation the usable capacity depending on your configuration. The usable capacity includes reserved space for a single disk failure (15% starting from 4 disks).

On base ODAs (X8-2S and X8-2M with 2 disks only), the usable storage capacity is actually 5.8TB and no space is reserved for disk failure: if a disk fails, there is no way to rebuild redundancy as only one disk survives.

Usable storage is not database storage, don’t miss that point. You’ll need to split this usable storage between a DATA area and a RECO area (actually ASM diskgroups). Most often, RECO is sized between 10% and 30% of usable storage.

Here is a table with various configurations. Note that I didn’t include ASM high redundancy configurations here, I’ll explain that later.

Nb disks | Disk size (TB) | RAW cap. (TB) | Official cap. (TB) | DATA ratio | DATA (TB) | RECO (TB)
2        | 6.4            | 12.8          | 5.8                | 90%        | 5.22      | 0.58
2        | 6.4            | 12.8          | 5.8                | 80%        | 4.64      | 1.16
2        | 6.4            | 12.8          | 5.8                | 70%        | 4.06      | 1.74
4        | 6.4            | 25.6          | 9.9                | 90%        | 8.91      | 0.99
4        | 6.4            | 25.6          | 9.9                | 80%        | 7.92      | 1.98
4        | 6.4            | 25.6          | 9.9                | 70%        | 6.93      | 2.97
6        | 6.4            | 38.4          | 14.8               | 90%        | 13.32     | 1.48
6        | 6.4            | 38.4          | 14.8               | 80%        | 11.84     | 2.96
6        | 6.4            | 38.4          | 14.8               | 70%        | 10.36     | 4.44
8        | 6.4            | 51.2          | 19.8               | 90%        | 17.82     | 1.98
8        | 6.4            | 51.2          | 19.8               | 80%        | 15.84     | 3.96
8        | 6.4            | 51.2          | 19.8               | 70%        | 13.86     | 5.94
10       | 6.4            | 64.0          | 24.7               | 90%        | 22.23     | 2.47
10       | 6.4            | 64.0          | 24.7               | 80%        | 19.76     | 4.94
10       | 6.4            | 64.0          | 24.7               | 70%        | 17.29     | 7.41
12       | 6.4            | 76.8          | 29.7               | 90%        | 26.73     | 2.97
12       | 6.4            | 76.8          | 29.7               | 80%        | 23.76     | 5.94
12       | 6.4            | 76.8          | 29.7               | 70%        | 20.79     | 8.91
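
The DATA/RECO arithmetic of this table boils down to multiplying the documented usable capacity by the ratio you choose at deployment. A minimal SQL sketch to replay one line of the table (24.7TB is the documented usable capacity of a 10-disk configuration; the 80% ratio is a deployment choice):

-- Replay one line of the table: split the usable capacity between DATA and RECO
SELECT usable_tb,
       ROUND(usable_tb * data_ratio, 2)       AS data_tb,
       ROUND(usable_tb * (1 - data_ratio), 2) AS reco_tb
  FROM (SELECT 24.7 AS usable_tb, 0.80 AS data_ratio FROM dual);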

X8-2HA storage limit

Storage is more complex on X8-2HA. If you’re looking for complete information about its storage, review the ODA documentation for all the possibilities.

Briefly, the X8-2HA is available in two flavors: High Performance, the one I highly recommend, or High Capacity, which is nice if you have really big databases you want to store on a single ODA. But this High Capacity version makes use of spinning disks to achieve such an amount of TB: definitely not the best solution for performance. The 2 nodes of this ODA are empty, no disk in the front panel, just empty space. All data disks are in a separate enclosure connected to both nodes with SAS cables. Depending on your configuration, you’ll have 6 to 24 SSDs (HP) or a mix of 6 SSDs and 18 HDDs (HC). When your first enclosure is filled with disks, you can add another storage enclosure of the same kind to eventually double the total capacity. Usable storage ranges from 17.8TB to 142.5TB for HP, and from 114.8TB to 230.6TB for HC.

Best practice for storage usage

First you should consider that ODA storage is high-performance storage for high database throughput. Thus, storing backups on the ODA is nonsense: backups are files written once and mostly destined to be erased without ever being used. Don’t lose precious TB for that. Moreover, if backups are done in the FRA, they are actually located on the same disks as DATA. That’s why most configurations are done with 10% to 20% of RECO, not more: we definitely won’t put backups on the same disks as DATA. 10% for RECO is a minimum, and I wouldn’t recommend setting less than that, a Fast Recovery Area being always a problem if too small.

During deployment you’ll have to choose between NORMAL and HIGH redundancy. NORMAL is quite similar to RAID1, but at the block level and without requiring paired disks (you need 2 or more disks). HIGH is available starting from 3 disks and keeps each block 3 times on 3 different disks. HIGH seems better, but you lose even more precious space, and it doesn’t protect you from other failures like a disaster in your datacenter or user errors. Most of the failure protection systems embedded in the servers actually work by doubling the components: power supplies, network interfaces, system disks, and so on. So increasing the security of block redundancy without increasing the security of the other components is not necessary in my opinion. The real solution for increased failure protection is Data Guard or Dbvisit: 2 ODAs, in 2 different geographical regions, with databases replicated from 1 site to the other.

Estimate your storage needs for the next 5 years, and even more

Are you able to do that? It’s not that simple. Most of the time you can estimate for the next 2-3 years, but beyond that is highly uncertain. Maybe a new project will start and require much more storage? Maybe you will have to provision more databases for testing purposes? Maybe your main software will leave Oracle for MS SQL or PostgreSQL in 2 years? Maybe a new CTO will arrive, decide that Oracle is too expensive, and build a plan to get rid of it. You never know what will happen over such a long period. But at least you can provide an estimate with all the information you have now, plus your own margin.

Which margin should I choose?

You probably plan to monitor the free space on your ODA. Based on classic thresholds, more than 85% disk usage is something you should never reach, because you may not have a solution for expanding storage. In my opinion, 75% is the maximum space usage you should reach on an ODA. So count on 25% less usable space than available when you do your calculations.

Get bigger to last longer

I don’t like wasting money or resources on things that don’t need it, but in this particular case, I mean on ODA, after years working on X3-2, X4-2 and newer versions, I strongly advise choosing the maximum number of extensions you can. Maybe not 76TB on an ODA X8-2M if you only need 10TB, but 50TB is definitely more secure for 5 years and more. Buying new extensions could be challenging after 3 or 4 years, because you have no guarantee that these extensions will still be available. You can live with memory or CPU contention, but without enough disk space, it’s much more difficult. Order your ODA fully loaded to make sure no extension will be needed.

The more disks you get, the faster and more secure you are

Last but not least, having more disks on your ODA maximizes the throughput, because ASM is mirroring and striping blocks across all the disks. Granted, on NVMe disks you probably won’t use all that bandwidth. More disks also add more security for your data. Losing one disk in a 4-disk ODA requires rebalancing 25% of your data to the 3 safe disks, and rebalancing is not immediate. Losing one disk in an 8-disk ODA requires rebalancing much less data, actually 12.5%, assuming you have the same amount of data on the 2 configurations.

A simple example

You need a single-node ODA with expandable storage. So ODA X8-2M seems fine.

You have an overview of your databases’ growth trend and plan to double the size in 5 years. Starting from 6TB, you plan to reach 12TB at a maximum. As you are aware of the threshold you shouldn’t reach, you know that you’ll need 16TB of usable space for DATA (maximum of 75% of disk space used). You want to make sure to have enough FRA, so you plan to set the DATA/RECO ratio to 80%/20%. Your RECO should be set to 4TB. Your ODA disk configuration should then have at least 20TB of usable disk space. An 8-disk ODA has 19.8TB of usable space: not enough. A 10-disk ODA has 24.7TB of usable space, for 19.76TB of DATA and 4.94TB of RECO, 23% more than needed, a comfortable additional margin. And don’t hesitate to take a 12-disk ODA (1 more extension) if you want to secure your choice and be ready for unplanned changes.
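
If you want to replay this example’s arithmetic, here is a throwaway SQL sketch of it (nothing more than the numbers above):

-- 12TB of data, 75% maximum disk usage, 80/20 DATA/RECO split
SELECT 12 / 0.75        AS data_needed_tb,   -- 16TB of DATA space needed
       12 / 0.75 / 0.80 AS usable_needed_tb  -- 20TB of usable space needed
  FROM dual;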

Conclusion

Storage on the ODA is quite expensive, but don’t forget that you may not find a solution for an ODA with insufficient storage. Take the time to do your calculations, keep a strong margin, and think long-term. Lasting for the long term is definitely the purpose of an ODA.



Always free / always up tmux in the Oracle Cloud with KSplice updates


By Franck Pachot

I used to have many VirtualBox VMs on my laptop. But now, most of my labs are in the Cloud: easy access from everywhere.

GCP

There’s the Google Cloud free VM, which is not limited in time (I still have the 11g XE I created 2 years ago running there) and can use 40% of a CPU with 2GB of RAM:


top - 21:53:10 up 16 min,  4 users,  load average: 9.39, 4.93, 2.09
Tasks:  58 total,   2 running,  56 sleeping,   0 stopped,   0 zombie
%Cpu(s): 12.9 us,  8.2 sy,  0.0 ni, 12.6 id,  0.0 wa,  0.0 hi,  0.3 si, 66.0 st
GiB Mem :    1.949 total,    0.072 free,    0.797 used,    1.080 buff/cache
GiB Swap:    0.750 total,    0.660 free,    0.090 used.    0.855 avail Mem

This is cool and always free, but I cannot ssh to it: it’s only accessible with Cloud Shell, which may take a few minutes to start.

AWS

I also use an AWS free tier, but this one is limited in time: 1 year.


top - 19:56:11 up 2 days, 13:09,  2 users,  load average: 1.00, 1.00, 1.00
Tasks: 110 total,   2 running,  72 sleeping,   0 stopped,   0 zombie
%Cpu(s): 12.4 us,  0.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si, 87.4 st
GiB Mem :      1.0 total,      0.1 free,      0.7 used,      0.2 buff/cache
GiB Swap:      0.0 total,      0.0 free,      0.0 used.      0.2 avail Mem

A bit more CPU throttled by the hypervisor, and only 1GB of RAM. However, I can create it with a public IP and access it through SSH. This is especially interesting to run other AWS services, as the AWS CLI is installed in this Amazon Linux. But it is limited in time and in credits.

OCI

Not limited in time, the Oracle Cloud OCI free tier allows 2 VMs with 1GB RAM and 2 vCPUs throttled to 1/8:


top - 20:01:37 up 54 days,  6:47,  1 user,  load average: 0.91, 0.64, 0.43
Tasks: 113 total,   4 running,  58 sleeping,   0 stopped,   0 zombie
%Cpu(s): 24.8 us,  0.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.2 si, 74.8 st
KiB Mem :   994500 total,   253108 free,   363664 used,   377728 buff/cache
KiB Swap:  8388604 total,  8336892 free,    51712 used.   396424 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
18666 opc       20   0  112124    748    660 R  25.2  0.1   6:20.18 yes
18927 opc       20   0  112124    700    612 R  24.8  0.1   0:06.53 yes

This is what I definitely chose as a bastion host for my labs.

tmux

I run mainly one thing here: tmux. I have windows and panes for all my work, and they stay there in this always free VM that never stops. I just ssh to it, run ‘tmux attach’, and I’m back to all my open terminals.

Ksplice

This machine, always up, with ssh open to the internet, must be secured. And as I have all my tmux windows open, I don’t want to reboot. No problem: I can update the kernel with the latest security patches without a reboot. This is Oracle Linux and it includes Ksplice.
My VM has been up since February:


[opc@b ~]$ uptime
 20:15:09 up 84 days, 15:29,  1 user,  load average: 0.16, 0.10, 0.03

The kernel is from October:


[opc@b ~]$ uname -r
4.14.35-1902.6.6.el7uek.x86_64

But I’m actually running a newer kernel:


[opc@b ~]$ sudo uptrack-show --available
Available updates:
None

Effective kernel version is 4.14.35-1902.301.1.el7uek
[opc@b ~]$

This kernel is from April: yes, 2 months more recent than the last reboot. I simply updated it with uptrack (a ‘sudo uptrack-upgrade’), which downloads and installs the latest Ksplice rebootless kernel updates.

Because there’s no time limit, no credit limit, it’s always up, ready to ssh to, and running the latest security patches without a reboot or even quitting my tmux sessions, this Oracle Autonomous Linux is the free tier compute instance I use every day. You need to open a free trial to get it (https://www.oracle.com/cloud/free/) but it is easy to create a compute instance which is flagged ‘always free’.


Oracle Text : Using and Indexing – the CONTEXT Index


Everybody has already faced performance problems with Oracle CLOB columns.

The aim of this blog is to show you (as always, based on a real user case) how to use one of the Oracle Text indexes (the CONTEXT index) to solve a performance problem with a CLOB column.

The complete Oracle Text documentation is here: Text Application Developer’s Guide

Let’s start with the following SQL query, which takes more than 6 minutes (6:18) to execute:

SQL> set timing on
SQL> set autotrace traceonly
SQL> select * from v_sc_case_pat_hist pat_hist where upper(pat_hist.note) LIKE '%FC IV%';

168 rows selected.

Elapsed: 00:06:18.09

Execution Plan
----------------------------------------------------------
Plan hash value: 1557300260

-----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |               |   521 |  9455K| 24285   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| DWH_CODE      |     1 |    28 |     2   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN                   | PK_DWH_CODE   |     1 |       |     1   (0)| 00:00:01 |
|*  3 |  VIEW                               |               |   521 |  9455K| 24285   (1)| 00:00:01 |
|*  4 |   TABLE ACCESS FULL                 | CASE_PAT_HIST |   521 |   241K| 24283   (1)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DWH_CODE_NAME"='DWH_PIT_DATE')
       filter("DWH_CODE_NAME"='DWH_PIT_DATE')
   3 - filter("DWH_VALID_FROM"<=TO_NUMBER("PT"."DWH_PIT_DATE") AND
              "DWH_VALID_TO">TO_NUMBER("PT"."DWH_PIT_DATE"))
   4 - filter(UPPER("NOTE") LIKE '%FC IV%')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     134922  consistent gets
     106327  physical reads
        124  redo size
      94346  bytes sent via SQL*Net to client
      37657  bytes received via SQL*Net from client
        338  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        168  rows processed

SQL>

 

Checking the execution plan, the Oracle optimizer does a full scan to access the table CASE_PAT_HIST.

As the SQL function upper(pat_hist.note) is used, let’s try to create a function-based index:

CREATE INDEX IDX_FBI_I1 ON SCORE.CASE_PAT_HIST (UPPER(note))
                                                *
ERROR at line 1:
ORA-02327: cannot create index on expression with datatype LOB

 

The message is clear: we cannot create an index on a column with datatype LOB.

Moreover:

  1. Even if my column datatype were not a CLOB, a search criterion like LIKE ‘%FC IV%’ prevents the use of any index, since the Oracle optimizer has no idea which letter the string starts with, so it will scan the whole table.
  2. Indeed, only search criteria with a known prefix could use an index:
    1. LIKE ‘FC IV%’
    2. LIKE ‘FC%IV’

So, to improve the performance of my SQL query and to index my CLOB column, the solution is to create an Oracle Text index.

In the Oracle 12.1 release, three different Oracle Text index types exist:

  • CONTEXT: suited for indexing collections or large coherent documents.
  • CTXCAT: suited for small documents or text fragments.
  • CTXRULE: used to build a document classification or routing application.

So, let’s create an Oracle Text index of type CONTEXT:

SQL> CREATE INDEX IDX_CPH_I3 ON SCORE.CASE_PAT_HIST(note) INDEXTYPE IS CTXSYS.CONTEXT;

Index created.

Elapsed: 00:00:51.76
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SCORE','CASE_PAT_HIST');

PL/SQL procedure successfully completed.

Elapsed: 00:00:25.20

 

Now we have to change the query’s WHERE clause in order to use the CONTAINS operator:

SQL> SELECT * from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%FC IV%', 1) > 0;

170 rows selected.

Elapsed: 00:00:00.82

Execution Plan
----------------------------------------------------------
Plan hash value: 768870586

-----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |               |  2770 |    49M|  3355   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| DWH_CODE      |     1 |    28 |     2   (0)| 00:00:01 |
|*  2 |   INDEX SKIP SCAN                   | PK_DWH_CODE   |     1 |       |     1   (0)| 00:00:01 |
|*  3 |  VIEW                               |               |  2770 |    49M|  3355   (1)| 00:00:01 |
|   4 |   TABLE ACCESS BY INDEX ROWID       | CASE_PAT_HIST |  2770 |  1284K|  3353   (1)| 00:00:01 |
|*  5 |    DOMAIN INDEX                     | IDX_CPH_I3    |       |       |   483   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DWH_CODE_NAME"='DWH_PIT_DATE')
       filter("DWH_CODE_NAME"='DWH_PIT_DATE')
   3 - filter("DWH_VALID_FROM"<=TO_NUMBER("PT"."DWH_PIT_DATE") AND
              "DWH_VALID_TO">TO_NUMBER("PT"."DWH_PIT_DATE"))
   5 - access("CTXSYS"."CONTAINS"("NOTE",'%FC IV%',1)>0)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement


Statistics
----------------------------------------------------------
         59  recursive calls
          0  db block gets
       3175  consistent gets
        417  physical reads
        176  redo size
      95406  bytes sent via SQL*Net to client
      38098  bytes received via SQL*Net from client
        342  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        170  rows processed

 

Now the query executes in less than a second.

By checking the execution plan, we note that the access to the table SCORE.CASE_PAT_HIST is now done through a DOMAIN INDEX, represented by our CONTEXT index (IDX_CPH_I3).

Now let’s compare the results given by the LIKE and the CONTAINS operators:

SQL> SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE upper(note) LIKE '%FC IV%'
  2  UNION ALL
  3  SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%FC IV%', 1) > 1;

  COUNT(*)
----------
       168
       170

 

The CONTAINS clause returns 2 extra rows; let’s check them:

SQL> select note from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%FC IV%', 1) > 1 and case_id in (1,2);

NOTE
--------------------------------------------------------------------------------
Text Before , functional class (FC) IV
Text Before , WHO FC-IV

 

For the LIKE clause, the wildcard %FC IV% returns:

  • Text Before FC IV
  • FC IV Text After
  • Text Before FC IV Text After

For the CONTAINS clause, the wildcard %FC IV% returns:

  • Text Before FC IV
  • FC IV Text After
  • Text Before FC IV Text After
  • Text Before FC%IV
  • FC%IV Text After
  • Text Before FC%IV Text After

So, in terms of functionality, the LIKE and CONTAINS clauses are not exactly the same, since the former returns less data than the latter.

If we translate the CONTAINS clause into a LIKE clause, we should write: “LIKE ‘%FC%IV%’”.
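
As a quick sanity check on your own data, you can compare both predicates side by side (a throwaway query reusing the view from this example):

SQL> SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE upper(note) LIKE '%FC%IV%'
  2  UNION ALL
  3  SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%FC IV%', 1) > 0;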

For my customer’s case, the CONTAINS clause is correct; the business confirmed that this data must be returned.

The CONTEXT Oracle Text index has some limitations:

  • You cannot combine several CONTAINS predicates with the OR / AND operators; you will face Oracle error “ORA-29907: found duplicate labels in primary invocations”:
  • SQL> SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%FC IV%', 1) > 1 or CONTAINS(note, '%FC TZ%', 1) > 1;
    SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%FC IV%', 1) > 1 or CONTAINS(note, '%BE TZ%', 1) > 1
                                                                                               *
    ERROR at line 1:
    ORA-29907: found duplicate labels in primary invocations

 

To work around this issue, let’s rewrite the SQL with a UNION clause:

SQL> SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%FC IV%', 1) > 1
  2  UNION
  3  SELECT count(*) from v_sc_case_pat_hist pat_hist WHERE CONTAINS(note, '%BE TZ%', 1) > 1;

  COUNT(*)
----------
       170
       112

 

Conclusion:

  • The benefits of creating an Oracle Text index (a CONTEXT index in our case) include fast response times for text queries with the CONTAINS, CATSEARCH and MATCHES Oracle Text operators. We decreased the response time from more than 6 minutes to less than a second.
  • CATSEARCH and MATCHES are the operators used for the CTXCAT and CTXRULE indexes respectively, which I will present in a future blog post.
  • Transparent Data Encryption-enabled columns do not support Oracle Text indexes.
  • Always check that the data returned by the CONTAINS clause corresponds to your business needs.


Install & configure a Nagios Server


What is Nagios?

“Nagios is a powerful monitoring system that enables organizations to identify and resolve IT infrastructure problems before they affect critical business processes.” https://www.nagios.org/

In simple words, you can monitor your servers (Linux, etc.) and databases (Oracle, SQL Server, PostgreSQL, MySQL, MariaDB, etc.) with Nagios.

Nagios architecture

 

We use the free version!!! 😀

 

VM configuration:

OS     : CentOS Linux 7 
CPU    : Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz 
Memory : 3GB 
Disk   : 15GB

What we need:

  1. We need Nagios Core. This is the brain of our Nagios server. All the configuration will be done in this part (set the contacts, set the notification messages, etc.)
  2. We must install the Nagios plugins. Plugins are standalone extensions to Nagios Core that make it possible to monitor anything and everything with Core. Plugins process command-line arguments, perform a specific check, and then return the results to Nagios Core.
  3. The NRPE addon is designed to allow you to execute Nagios plugins on remote Linux/Unix machines. The main reason for doing this is to allow Nagios to monitor “local” resources (like CPU load, memory usage, etc.) on remote machines. Since these resources are not usually exposed to external machines, an agent like NRPE must be installed on the remote Linux/Unix machines. A sample NRPE check definition is shown below.
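
For illustration, here is how a remote check is typically declared in the nrpe.cfg file of a monitored host (this is the standard check_load sample shipped with NRPE; the thresholds are examples to adapt):

command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20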

Preconditions

All installations below are done as the root user.

Installation steps

  1. Install Nagios Core (I will not explain it. Because their documentation is complete)
    support.nagios.com/kb/article/nagios-core-installing-nagios-core-from-source-96.html#CentOS
  2. Install Nagios Plugins

    support.nagios.com/kb/article/nagios-core-installing-nagios-core-from-source-96.html#CentOS

  3. Install NRPE
    https://support.nagios.com/kb/article.php?id=515
  4. Now you must decide which type of database you want to monitor, then install the check_health plugin for it (you can install all of them if you want).
    Here we install the Oracle and SQL Server check_health plugins.
Install and configure Oracle Check_health

You need an Oracle client to communicate with an Oracle instance.

  • Download and install check_oracle_health (https://labs.consol.de/nagios/check_oracle_health/index.html)
    wget https://labs.consol.de/assets/downloads/nagios/check_oracle_health-3.2.1.2.tar.gz 
    tar xvfz check_oracle_health-3.2.1.2.tar.gz 
    cd check_oracle_health-3.2.1.2 
    ./configure 
    make 
    make install
  • Download and install the Oracle Client (here we installed the 12c version – https://www.oracle.com/database/technologies/oracle12c-linux-12201-downloads.html)
  • Create the check_oracle_health_wrapped file in /usr/local/nagios/libexec
  • Set the parameters and variables needed to start the check_oracle_health plugin:
    #!/bin/sh
    ### -------------------------------------------------------------------------------- ###
    ### We set some environment variable before to start the check_oracle_health script.
    ### -------------------------------------------------------------------------------- ###
    ### Set parameters+variables needed to start the plugin check_oracle_health:
    
    export ORACLE_HOME=/u01/app/oracle/product/12.2.0/client_1
    export LD_LIBRARY_PATH=$ORACLE_HOME/lib
    export TNS_ADMIN=/usr/local/nagios/tns
    export PATH=$PATH:$ORACLE_HOME/bin
    
    export ARGS="$*"
    
    ### start the plugin check_oracle_health with the arguments of the Nagios Service:
    
    /usr/local/nagios/libexec/check_oracle_health $ARGS
  • Create the tns folder in /usr/local/nagios, then create a tnsnames.ora file and add a TNS entry:
    DBTEST =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 172.22.10.2)(PORT = 1521))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SID = DBTEST)
        )
      )
  • Test the connection
    [nagios@vmnagios objects]$ check_oracle_health_wrapped --connect DBTEST --mode tnsping

 

Install and configure MSSQL Check_health
  • Download and install the check_mssql_health
    wget https://labs.consol.de/assets/downloads/nagios/check_mssql_health-2.6.4.16.tar.gz
    tar xvfz check_mssql_health-2.6.4.16.tar.gz
    cd check_mssql_health-2.6.4.16
    ./configure 
    make 
    make install
  • Download and install FreeTDS (www.freetds.org/software.html)
    yum install freetds freetds-devel gcc make perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker
    wget ftp://ftp.freetds.org/pub/freetds/stable/freetds-1.1.20.tar.gz
    tar xvfz freetds-1.1.20.tar.gz
    cd freetds-1.1.20
    ./configure --prefix=/usr/local/freetds
    make
    make install
  • Download and install DBD-Sybase
    wget http://search.cpan.org/CPAN/authors/id/M/ME/MEWP/DBD-Sybase-1.15.tar.gz
    tar xvfz DBD-Sybase-1.15.tar.gz
    cd DBD-Sybase-1.15
    export SYBASE=/usr/local/freetds
    perl Makefile.PL
    make
    make install
  • Add your instance information in /usr/local/freetds/etc/freetds.conf, as in the sample below.
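    For example, a minimal freetds.conf entry could look like this (the server name, host and port are placeholders to adapt):
    [MSSQLTEST]
        host = 172.22.10.3
        port = 1433
        tds version = 7.3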

Configuration steps

  1. Set your domain name
    [root@vmnagios ~]# cat /etc/resolv.conf
    # Generated by NetworkManager
    search exemple.ads
    nameserver 10.175.222.10
  2. Set SMTP and Postfix
    [root@vmnagios ~]# /etc/postfix/main.cf
    ### ----------------- added by dbi-services ---------------- ###
    relayhost = smtp.exemple.net
    smtp_generic_maps = hash:/etc/postfix/generic
    sender_canonical_maps = hash:/etc/postfix/canonical
    -------------------------------------------------------- ###
  3. Configure Postfix
    [root@vmnagios ~]# cat /etc/postfix/generic
    @localdomain.local dba@exemple.com
    @.exemple.ads dba@exemple.com
    
    [root@vmnagios ~]# cat /etc/postfix/canonical
    root dba@exemple.com
    nagios dba@exemple.com

    Then we need to generate generic.db and canonical.db:
    postmap /etc/postfix/generic
    postmap /etc/postfix/canonical

 

Done! 🙂

Now you have a new Nagios server. All that’s left is to configure your clients and create a config file on your brand-new Nagios server, along the lines of the sketch below.
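
As a starting point, a service check based on the Oracle wrapper above could be declared like this on the Nagios server (a minimal sketch: the host object dbserver01 and the generic-service template are assumptions, not part of the original setup):

define command {
    command_name    check_oracle_tnsping
    command_line    $USER1$/check_oracle_health_wrapped --connect $ARG1$ --mode tnsping
}

define service {
    use                     generic-service
    host_name               dbserver01
    service_description     Oracle tnsping DBTEST
    check_command           check_oracle_tnsping!DBTEST
}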


Oracle Standard Edition on AWS ☁ socket arithmetic


By Franck Pachot

I’ve written about Oracle Standard Edition 2 licensing before, but a few rules have changed. This was written in May 2020.
TL;DR: 4 vCPU count for 1 socket, and 2 sockets count for 1 server, whether hyper-threading is enabled or not.

The SE2 rules

I think the Standard Edition rules are quite clear now: maximum server capacity, cluster limit, minimum NUP, and processor metric. Oracle has them in the Database Licensing guideline.

2 socket capacity per server

Oracle Database Standard Edition 2 may only be licensed on servers that have a maximum capacity of 2 sockets.

We are talking about capacity, which means that even when you remove a processor from a 4-socket server, it is still a 4-socket server. You cannot run Standard Edition if the server has the capacity for more than 2 sockets, whether there is a processor in the socket or not.

2 socket used per cluster

When used with Oracle Real Application Clusters, Oracle Database Standard Edition 2 may only be licensed on a maximum of 2 one-socket servers

This one is not about capacity. You can remove a processor from a bi-socket server to make it a one-socket server, and then build a cluster running RAC in Standard Edition with 2 of those nodes. The good thing is that you can even use an Oracle hypervisor (OVM or KVM), LPAR or Zones to pin one socket only for the usage of Oracle, and use the other for something else. The bad thing is that as of 19c, RAC with Standard Edition is not possible anymore. You can run the new SE HA, which allows more on one node (up to the 2-socket rule) because the other node is stopped (the 10-day rule).

At least 10 NUP per server

The minimum when licensing by Named User Plus (NUP) metric is 10 NUP licenses per server.

Even when you didn’t choose the processor metric, you need to count the servers. For example, if your vSphere cluster runs on 4 bi-socket servers, you need to buy 40 NUP licenses even if you count a smaller population of users.

Processor metric

When licensing Oracle programs with … Standard Edition in the product name, a processor is counted equivalent to a socket; however, in the case of multi-chip modules, each chip in the multi-chip module is counted as one occupied socket.

A socket is the physical slot where you can put a processor. This is what counts for the “2 socket capacity per server” rule. An occupied socket is one with a processor, physically or pinned with an accepted hard partitioning hypervisor method (Solaris Zones, IBM LPAR, Oracle OVM or KVM, …). This is what counts for the “2 sockets occupied per cluster” rule. Intel is not concerned by the multi-chip module exception.

What about the cloud?

So, the rules mention servers, sockets and processors. How does this apply to modern computing, where you provision a number of vCPUs without knowing anything about the underlying hardware? In the AWS shared responsibility model you are responsible for the Oracle licenses (BYOL – Bring Your Own License) but they are responsible for the physical servers.

Oracle established the rules (which may or may not be referenced by your contract) in the Licensing Oracle Software in the Cloud Computing Environment document (for educational purposes only – refer to your contract if you want the legal interpretation).

This document only covers AWS and Azure. There’s no agreement with Google Cloud, so you cannot run Oracle software under license there. Same with your local cloud provider: you are reduced to hosting on physical servers. The Oracle Public Cloud has its own rules: you can license Standard Edition on a compute instance with up to 16 OCPU, and one processor license covers 4 OCPU (which is 2 hyper-threaded Intel cores).

Oracle authorizes running on those 2 competitor public clouds, but they generally double the licenses required on competitor platforms in order to be cheaper on their own. They did that on-premises a long time ago for IBM processors, and they do it now for Amazon AWS and Microsoft Azure.

So, the arithmetic is based on the following idea: 4 vCPU count for 1 socket, and 2 sockets count for 1 server.

Note that there was a time when it was 1 socket = 2 cores, which meant 4 vCPU when hyper-threading was enabled but 2 vCPU when not. They have changed the document and we now count vCPU without looking at cores or threads. Needless to say, for an optimal performance/price ratio in SE you should disable hyper-threading in AWS in order to have your processes running on full cores. And use instance caging to limit the user sessions in order to leave a core available for the background processes, as sketched below.
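
Instance caging itself is just two parameters, set inside the instance. A sketch (the plan name and the CPU count are examples to adapt; here a 4-vCPU instance is caged to 3 CPUs):

-- Instance caging requires an active resource manager plan plus a CPU limit
ALTER SYSTEM SET resource_manager_plan = 'DEFAULT_PLAN' SCOPE=BOTH;
ALTER SYSTEM SET cpu_count = 3 SCOPE=BOTH;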

Here are the rules:

  • 2 socket capacity per server: maximum 8 vCPU
  • 2 socket occupied per cluster: forget about RAC in Standard Edition and RAC in AWS
  • Minimum NUP: 10 NUP are ok to cover the maximum allowed 8 vCPU
  • Processor metric: 1 license covers 4 vCPU
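
So the processor-metric arithmetic reduces to a ceiling division, as this throwaway query illustrates (8 vCPU being the SE2 maximum on AWS):

-- 1 SE2 processor license per 4 AWS vCPU
SELECT vcpu, CEIL(vcpu / 4) AS se2_processor_licenses
  FROM (SELECT column_value AS vcpu FROM TABLE(sys.odcinumberlist(2, 4, 6, 8)));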

Example

The maximum you can use for one database:
2 SE2 processor licences = 1 server = 2 sockets = 8 AWS vCPU
2 SE2 processor licences = 8 cores = 16 OCPU in Oracle Cloud

The cheaper option means smaller capacity:
1 SE2 processor licence = 1 socket = 4 AWS vCPU
1 SE2 processor licences = 4 cores = 8 OCPU in Oracle Cloud

As you can see, the difference between Standard and Enterprise Edition in the clouds is much smaller than on-premises, where a socket can run more and more cores. The per-socket licensing was defined at a time when processors had only a few cores. With the evolution, Oracle realized that SE was too cheap. They caged SE2 usage to 16 threads per database and limit it further on their competitors’ clouds. Those limits are not technical but governed by revenue management: they provide a lot of features in SE, but also need to ensure that large companies still require EE.

But…

… there’s always an exception. It seems that Amazon has a special deal to allow Oracle Standard Edition on AWS RDS with EC2 instances up to 16 vCPU:

You know that I always try to test what I write in a blog post. So, at least as of the publishing date and with the tested versions, it contains some verified facts.
I started an AWS RDS Oracle database on db.m4.4xlarge, which is 16 vCPU. I installed the instant client on my bastion console to access it:


sudo yum localinstall http://yum.oracle.com/repo/OracleLinux/OL7/oracle/instantclient/x86_64/getPackage/oracle-instantclient19.5-basic-19.5.0.0.0-1.x86_64.rpm
sudo yum localinstall http://yum.oracle.com/repo/OracleLinux/OL7/oracle/instantclient/x86_64/getPackage/oracle-instantclient19.5-sqlplus-19.5.0.0.0-1.x86_64.rpm

This is Standard Edition 2:


[ec2-user@ip-10-0-2-28 ~]$ sqlplus admin/FranckPachot@//database-1.ce45l0qjpoax.us-east-1.rds.amazonaws.com/ORCL

SQL*Plus: Release 19.0.0.0.0 - Production on Tue May 19 21:32:47 2020
Version 19.5.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Last Successful login time: Tue May 19 2020 21:32:38 +00:00

Connected to:
Oracle Database 19c Standard Edition 2 Release 19.0.0.0.0 - Production
Version 19.7.0.0.0

On 16 vCPU:


SQL> show parameter cpu_count

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cpu_count                            integer     16

On AWS:

SQL> host curl http://169.254.169.254/latest/meta-data/services/domain
amazonaws.com

With more than 16 threads on CPU:

SQL> @ snapper.sql ash 10 1 all
Sampling SID all with interval 10 seconds, taking 1 snapshots...

-- Session Snapper v4.31 - by Tanel Poder ( http://blog.tanelpoder.com/snapper ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :)


---------------------------------------------------------------------------------------------------------------
  ActSes   %Thread | INST | SQL_ID          | SQL_CHILD | EVENT                               | WAIT_CLASS
---------------------------------------------------------------------------------------------------------------
   19.29   (1929%) |    1 | 3zkr1jbq4ufuk   | 0         | ON CPU                              | ON CPU
    2.71    (271%) |    1 | 3zkr1jbq4ufuk   | 0         | resmgr:cpu quantum                  | Scheduler
     .06      (6%) |    1 |                 | 0         | ON CPU                              | ON CPU

--  End of ASH snap 1, end=2020-05-19 21:34:00, seconds=10, samples_taken=49, AAS=22.1

PL/SQL procedure successfully completed.

I also checked on CloudWatch (the AWS monitoring from the hypervisor) that I am running 100% on CPU.

I tested this on a very time-limited free lab environment (this configuration is expensive) and didn’t check whether hyper-threading was enabled or not (my guess: disabled), and I didn’t test whether setting CPU_COUNT would enable instance caging (SE2 is supposed to be internally caged at 16 CPUs, but I see more sessions on CPU there).

Of course, I shared my surprise on Twitter (follow me there if you like this kind of short info about databases – I don’t really look at the numbers, but it seems I may reach 5000 followers soon, so I’ll continue at the same rate), and I’ll update this post when I have more info about this.


How to use DBMS_SCHEDULER to improve performance?


From an application point of view, the Oracle scheduler DBMS_SCHEDULER allows you to reach better performance by parallelizing your processing.

Let’s start with the following PL/SQL code, which inserts in serial several rows from a metadata table into a target table. In my example, the metadata table does not contain the data “directly” but a set of SQL statements to be executed, whose returned rows must be inserted into the target table My_Target_Table_Serial.

Let’s verify the contents of the source table called My_Metadata_Table:

SQL> SELECT priority,dwh_id, amq_name, sql_statement,scope from dwh_amq_v2;
ROWNUM  DWH_ID  AMQ_NAME SQL_STATEMENT          SCOPE
1	7	AAA1	 SELECT SUM(P.age pt.p	TYPE1
2	28	BBB2  	 SELECT CASE WHEN pt.p	TYPE1
3	37	CCC3	 "select cm.case_id fr"	TYPE2
4	48	DDD4	 "select cm.case_id fr"	TYPE2
5	73	EEE5	 SELECT DISTINCT pt.p	TYPE1
6	90	FFF6 	 SELECT LAG(ORW pt.p	TYPE1
7	114	GGG7	 SELECT distinct pt.	TYPE1
8	125	HHH8	 SELECT DISTINCT pt.p	TYPE1
...
148    115     ZZZ48    SELECT ROUND(TO_NUMBER TYPE2

Now let’s check the PL/SQL program:

DECLARE
  l_errm VARCHAR2(200);
  l_sql  VARCHAR2(32767) := NULL;
  sql_statement_1  VARCHAR2(32767) := NULL;
  sql_statement_2  VARCHAR2(32767) := NULL;
  l_amq_name VARCHAR2(200);
  l_date NUMBER;
BEGIN
  SELECT TO_NUMBER(TO_CHAR(SYSDATE,'YYYYMMDDHH24MISS')) INTO l_date FROM dual;
  FOR rec IN (SELECT dwh_id, amq_name, sql_statement,scope 
                FROM My_Metadata_Table,
                     (SELECT dwh_pit_date FROM dwh_code_mv) pt
               WHERE dwh_status = 1
                 AND (pt.dwh_pit_date >= dwh_valid_from AND pt.dwh_pit_date < dwh_valid_to) 
               ORDER BY priority, dwh_id) LOOP
    ...
    sql_statement_1 := substr(rec.sql_statement, 1, 32000);
    sql_statement_2 := substr(rec.sql_statement, 32001);
    IF rec.SCOPE = 'TYPE1' THEN 
      -- TYPE1 LEVEL SELECT
      l_sql := 'INSERT /*+ APPEND */ INTO My_Target_Table_Serial (dwh_pit_date, AMQ_ID, AMQ_TEXT, CASE_ID, ENTERPRISE_ID)'||CHR(13)|| 'SELECT DISTINCT TO_DATE(code.dwh_pit_date, ''YYYYMMDDHH24MISS''),'||rec.dwh_id|| ',''' ||rec.amq_name ||''', case_id, 1'||CHR(13)
      || ' FROM (SELECT dwh_pit_date FROM dwh_code) code, ('||sql_statement_1;
      EXECUTE IMMEDIATE l_sql || sql_statement_2 || ')';
      COMMIT;    
    ELSE 
      -- TYPE2 LEVEL SELECT
      l_sql :=  'INSERT /*+ APPEND */ INTO My_Target_Table_Serial (dwh_pit_date, AMQ_ID, AMQ_TEXT, CASE_ID, ENTERPRISE_ID)
      SELECT DISTINCT TO_DATE(code.dwh_pit_date, ''YYYYMMDDHH24MISS''), '||rec.dwh_id|| ',''' ||rec.amq_name || ''', cm.case_id, cm.enterprise_id'||CHR(13)
      || '  FROM (SELECT dwh_pit_date FROM dwh_code) code, v_sc_case_master cm, v_sc_case_event ce, ('||sql_statement_1;
              
      EXECUTE IMMEDIATE l_sql || sql_statement_2 || ') pt'||CHR(13)
      || ' WHERE cm.case_id = ce.case_id'||CHR(13) 
      || '   AND cm.deleted IS NULL AND cm.state_id <> 1'||CHR(13)
      || '   AND ce.deleted IS NULL AND ce.pref_term = pt.pt_name';
      COMMIT;         
    END IF;
    ...
   END LOOP:
END;
Number of Rows Read : 148 (meaning 148 SQL statements to execute)
START : 16:17:46
END : 16:57:42
Total :  40 mins

 

As we can see, each SQL statement is executed in serial. Let’s check the audit table recording the loading time (insert time) and the “scheduling”:

CREATE_DATE		NAME	START_DATE		END_DATE            LOADING_TIME
22.05.2020 16:46:34	AAA1	22.05.2020 16:46:34	22.05.2020 16:57:42    11.08mins
22.05.2020 16:42:05	BBB2	22.05.2020 16:42:05	22.05.2020 16:46:34    04.29mins
22.05.2020 16:41:15	CCC3	22.05.2020 16:41:15	22.05.2020 16:42:05    50sec
22.05.2020 16:40:42	DDD4	22.05.2020 16:40:42	22.05.2020 16:41:15    32sec
22.05.2020 16:40:20	EEE5	22.05.2020 16:40:20	22.05.2020 16:40:42    22sec
22.05.2020 16:37:23	FFF6	22.05.2020 16:37:23	22.05.2020 16:40:20    02.57mins
22.05.2020 16:37:12	GGG7	22.05.2020 16:37:12	22.05.2020 16:37:23    11sec
...
22.05.2020 16:36:03	ZZZ148	22.05.2020 16:17:35	22.05.2020 16:17:46    11sec

To resume:

  • The 148 rows (148 SQL statements) coming from the source table are loaded in serial in 40 mins.
  • The majority of rows took less than 1 min to load (e.g.: Name = CCC3, DDD4, EEE5, GGG7 and ZZZ148).
  • A few rows took more than a couple of minutes to load.
  • The maximum loading time is 11.08 mins, for the name “AAA1”.
  • Each row must wait for the previous row to complete its loading before starting its own (compare the previous END_DATE with the current START_DATE).

To optimize the process, let’s try to load all the rows coming from the source table in parallel by using the Oracle scheduler DBMS_SCHEDULER.

Instead of executing the insert command directly in the loop, let’s create a job through DBMS_SCHEDULER:

FOR rec IN (SELECT priority,dwh_id, amq_name, sql_statement,scope 
                FROM My_Metadata_Table,
                     (SELECT dwh_pit_date FROM dwh_code_mv) pt
               WHERE dwh_status = 1
                 AND (pt.dwh_pit_date >= dwh_valid_from AND pt.dwh_pit_date < dwh_valid_to) 
               ORDER BY priority, dwh_id) LOOP

     l_amq_name := rec.amq_name;
       IF rec.SCOPE = 'TYPE1' THEN 
        -- TYPE1 LEVEL SELECT
         ...
  
            --Execute Job to insert the AMQ : Background process
            DBMS_SCHEDULER.CREATE_JOB (
            job_name             => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
            job_type             => 'PLSQL_BLOCK',
            job_action           => 'BEGIN
                                      LOAD_DATA(''CASE'','||''''||l_amq_name||''''||','||rec.priority||','||l_date||','||v_SESSION_ID||','||i||');
                                     END;',
            start_date    =>  sysdate,  
            enabled       =>  TRUE,  
            auto_drop     =>  TRUE,  
            comments      =>  'job for amq '||l_amq_name);
          END IF;
        ELSE 
            ...
            END IF;
        END IF; 
      i := i +1;
  END LOOP;
Number of Rows Read : 148 (meaning 148 SQL statements to execute)
START : 08:14:03
END : 08:42:32
Total :  27.57 mins

To resume:

  • The 148 rows (148 SQL statements) coming from the source table are now loaded in parallel in 27.57 mins instead of 40 mins in serial.
  • The notable DBMS_SCHEDULER options are:
    • As we are limited in the number of characters for the “job_action” parameter, we insert the data through a PL/SQL procedure, LOAD_DATA.
    • The job is executed immediately (start_date=sysdate) and purged immediately after its execution (auto_drop=TRUE).
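
One detail the listing above glosses over: the calling session returns as soon as the jobs are submitted, so if a subsequent step depends on the loaded data, you have to wait for job completion yourself. A minimal sketch, assuming the ‘AMQ_P’ job name prefix used above (finished jobs disappear from the view thanks to auto_drop=TRUE):

DECLARE
  l_running NUMBER;
BEGIN
  LOOP
    -- only running or pending jobs remain visible, finished ones are auto-dropped
    SELECT COUNT(*) INTO l_running
      FROM user_scheduler_jobs
     WHERE job_name LIKE 'AMQ_P%';
    EXIT WHEN l_running = 0;
    DBMS_SESSION.SLEEP(5);  -- use DBMS_LOCK.SLEEP on releases before 18c
  END LOOP;
END;
/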

Let’s now check how the jobs are scheduled. Since we loop 148 times, I expect to have 148 jobs.

First, let’s check if the rows (remember: one row = one insert into the target table from the source table) are loaded in parallel:

CREATE_DATE 	    NAME START_DATE 	        END_DATE 				       
22.05.2020 16:46:34 AAA1 23.05.2020 08:14:04	23.05.2020 08:21:19
22.05.2020 16:42:05 BBB2 23.05.2020 08:14:04	23.05.2020 08:20:43
22.05.2020 16:41:15 CCC3 23.05.2020 08:14:04	23.05.2020 08:21:59
22.05.2020 16:40:42 DDD4 23.05.2020 08:14:03	23.05.2020 08:15:29
22.05.2020 16:40:20 EEE5 23.05.2020 08:14:03	23.05.2020 08:15:05
22.05.2020 16:37:23 FFF6 23.05.2020 08:14:03	23.05.2020 08:14:47
22.05.2020 16:37:12 GGG7 23.05.2020 08:14:03	23.05.2020 08:15:59
...                     
22.05.2020 16:36:03 ZZZ148 22.05.2020 16:17:35 22.05.2020 16:17:46

This is the case: all rows have the same start_date, meaning all rows start in parallel. Let’s query “all_scheduler_job_run_details” to check that we have our 148 jobs:

SQL> select count(*) from all_scheduler_job_run_details where job_name like '%20200523081403';

  COUNT(*)
----------
       148
SQL> select log_date,job_name,status,req_start_date from all_scheduler_job_run_details where job_name like '%20200523081403';
LOG_DATE		JOB_NAME		        STATUS		REQ_START_DATE
23-MAY-20 08.42.41	AMQ_P3J147_20200523081403	SUCCEEDED	23-MAY-20 02.42.32
23-MAY-20 08.42.32	AMQ_P2J146_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
23-MAY-20 08.37.56	AMQ_P2J145_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
23-MAY-20 08.37.33	AMQ_P2J144_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
23-MAY-20 08.37.22	AMQ_P2J143_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
23-MAY-20 08.37.03	AMQ_P2J141_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
23-MAY-20 08.36.50	AMQ_P2J142_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
23-MAY-20 08.33.57	AMQ_P2J140_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
--Only the first 8 rows are displayed

To resume:

  • We have 148 jobs, all started, most of the time in parallel (jobs with the same REQ_START_DATE; Oracle parallelizes jobs per block, randomly).
  • My PL/SQL process now took 27.57 mins instead of 40 mins.

But if we look at the details, we have a lot of small jobs. Those are jobs where run_duration is less than 1 min:

SQL> select run_duration from all_scheduler_job_run_details where job_name like '%20200523081403' order by run_duration;

RUN_DURATION
+00 00:00:04.000000
+00 00:00:07.000000
+00 00:00:09.000000
+00 00:00:10.000000
+00 00:00:13.000000
+00 00:00:15.000000
+00 00:00:20.000000
+00 00:00:27.000000
+00 00:00:33.000000
+00 00:00:35.000000
+00 00:00:36.000000
+00 00:00:38.000000
+00 00:00:43.000000
+00 00:00:46.000000
+00 00:00:51.000000
+00 00:00:52.000000

As we have a lot of small jobs (short-lived jobs), it is more interesting to use lightweight jobs instead of regular jobs.

Contrary to regular jobs, lightweight jobs:

  • Require less metadata, so they have quicker create and drop times.
  • Are suited for short-lived jobs (small jobs, jobs where run_duration is low).

Let’s rewrite our PL/SQL process using lightweight jobs.

To use lightweight jobs, first create a program suitable for a lightweight job:

begin
dbms_scheduler.create_program
(
    program_name=>'LIGHTWEIGHT_PROGRAM',
    program_action=>'LOAD_AMQ',
    program_type=>'STORED_PROCEDURE',
    number_of_arguments=>6, 
    enabled=>FALSE);
END;

Add the arguments (parameters) and enable the program:

BEGIN
dbms_scheduler.DEFINE_PROGRAM_ARGUMENT(
program_name=>'lightweight_program',
argument_position=>1,
argument_type=>'VARCHAR2',
DEFAULT_VALUE=>NULL);

dbms_scheduler.DEFINE_PROGRAM_ARGUMENT(
program_name=>'lightweight_program',
argument_position=>2,
argument_type=>'VARCHAR2');

dbms_scheduler.DEFINE_PROGRAM_ARGUMENT(
program_name=>'lightweight_program',
argument_position=>3,
argument_type=>'NUMBER');

dbms_scheduler.DEFINE_PROGRAM_ARGUMENT(
program_name=>'lightweight_program',
argument_position=>4,
argument_type=>'NUMBER');

dbms_scheduler.DEFINE_PROGRAM_ARGUMENT(
program_name=>'lightweight_program',
argument_position=>5,
argument_type=>'VARCHAR');

dbms_scheduler.DEFINE_PROGRAM_ARGUMENT(
program_name=>'lightweight_program',
argument_position=>6,
argument_type=>'NUMBER');

dbms_scheduler.enable('lightweight_program');  
end;

In the PL/SQL code, let’s create the lightweight job, without forgetting to set the argument values before running the job:

DECLARE
...
BEGIN
....
LOOP
DBMS_SCHEDULER.create_job (
job_name        => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
program_name    => 'LIGHTWEIGHT_PROGRAM',
job_style       => 'LIGHTWEIGHT',
enabled         => FALSE);
                  
 DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE (
   job_name                => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
   argument_position       => 1,
   argument_value          => rec.scope);
   
DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE (
   job_name                => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
   argument_position       => 2,
   argument_value          => l_amq_name);
   
DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE (
   job_name                => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
   argument_position       => 3,
   argument_value          => rec.priority);

DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE (
   job_name                => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
   argument_position       => 4,
   argument_value          => l_date);   

DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE (
   job_name                => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
   argument_position       => 5,
   argument_value          => v_SESSION_ID);  

DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE (
   job_name                => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
   argument_position       => 6,
   argument_value          => i); 

dbms_scheduler.run_job('AMQ_P'||rec.priority||'j'||i||'_'||l_date,TRUE);
...
END LOOP;
Number of Rows Read : 148 (meaning 148 SQL statements to execute)
START : 18:08:56
END : 18:27:40
Total : 18.84 mins

 

Let’s check that we still have our 148 jobs:

SQL> select count(*) from all_scheduler_job_run_details where job_name like '%20200524175036';

  COUNT(*)
----------
       148
SQL> select log_date,job_name,status,req_start_date from all_scheduler_job_run_details where job_name like '%20200524175036';

LOG_DATE           JOB_NAME     STATUS	        REQ_START_DATE
24-MAY-20 05.50.51 AB1C		SUCCEEDED	24-MAY-20 05.50.36
24-MAY-20 05.50.56 AB1D		SUCCEEDED	24-MAY-20 05.50.51
24-MAY-20 05.51.14 AB1E		SUCCEEDED	24-MAY-20 05.50.56
24-MAY-20 05.51.49 AB1I		SUCCEEDED	24-MAY-20 05.51.14
24-MAY-20 05.52.14 AB1P		SUCCEEDED	24-MAY-20 05.51.49
24-MAY-20 05.52.34 AB1L		SUCCEEDED	24-MAY-20 05.52.14
24-MAY-20 05.52.55 AB1N		SUCCEEDED	24-MAY-20 05.52.34
24-MAY-20 05.53.17 AB1M		SUCCEEDED	24-MAY-20 05.52.55
24-MAY-20 05.53.29 AB1K		SUCCEEDED	24-MAY-20 05.53.17
24-MAY-20 05.53.39 AB1O		SUCCEEDED	24-MAY-20 05.53.29
24-MAY-20 05.53.57 AB1U		SUCCEEDED	24-MAY-20 05.53.39
24-MAY-20 05.54.07 AB1V		SUCCEEDED	24-MAY-20 05.53.57

To resume:

  • We have 148 jobs, all started, most of the time in parallel.
  • My PL/SQL process now took 18.84 mins (lightweight jobs) instead of 27.57 mins (regular jobs).
  • If we compare regular jobs vs lightweight jobs, the former seems to schedule the jobs randomly (starting jobs in blocks of 4 to 8) while the latter schedules jobs in blocks of 3 or 4 (as we can see above).

Conclusion:

  • DBMS_SCHEDULER (regular jobs or lightweight jobs) can significantly improve your PL/SQL performance by transforming your serial process into a parallel one.
  • If you have small jobs (short-lived jobs), use lightweight jobs instead of regular jobs.
  • Don’t underestimate the development time (development, test, bug fixing) needed to transform your serial process into a parallel one. Creating 1 job is different from creating more than 100 or 1000 jobs through a PL/SQL loop (concurrency problems, CPU used to create/drop the jobs).
  • As a developer, you are responsible for managing your jobs (create, drop, purge) in order not to saturate the Oracle parameter job_queue_processes (used by a lot of critical Oracle processes).


How to configure additional network card on an ODA X8 family


During a past project, we were using an ODA X8-2M with one additional network card. To my knowledge, on an appliance, additional cards are used to extend connectivity to additional networks. The customer was really insisting on having network redundancy between the 2 cards, so I took the opportunity to run some tests. In this post, I would like to share my experience from these tests and how to properly configure a network card extension on an ODA.

Introduction

On an appliance, we can use RJ45 or optical fiber connectivity. On the ODA X8 family, optical fiber cards have 2 ports and RJ45 cards have 4 ports. The first card is installed in PCIe slot 7.
On the ODA X8-2S, 2 additional cards can be installed, in slot 8 and slot 10.
On the ODA X8-2M, 2 additional cards can be installed, in slot 2 and slot 10.
The only requirement is that all cards in the same server must be of the same type: you cannot mix RJ45 and optical fiber on the same ODA.

In my case, as I have 2 optical fiber network cards, my first card has ports p7p1 and p7p2 configured (btbond1) and my second card has ports p2p1 and p2p2 configured (btbond3).

ODA bonding is configured to use active-backup mode with no LACP.

Is bonding redundancy possible on an ODA?

Configuring bonding across network cards

My first experiment was to edit the ifcfg-p7p2 and ifcfg-p2p1 Linux network script configuration files and assign p7p2 to btbond3 and p2p1 to btbond1. Of course, this was just an experiment: keeping such a solution permanently would mean rolling back this unsupported configuration before any ODA patching, and I really do not encourage doing so. 😉
Anyway, my experiment was not successful: after a reboot, both files were restored to their original configuration.

Configuring a different IP address for each network on each card

I then tried to configure both cards with 2 IP addresses from the 2 networks, as below:
Bond 1 IP 10.3.1.20 (no VLAN)
Bond 3 IP 10.3.1.21 (no VLAN)
Bond 1 IP 192.20.30.2 (VLAN 723)
Bond 3 IP 192.20.30.3 (VLAN 723)

The configuration could be successfully applied and everything was configured as expected :

[root@ODA01 network-scripts]# ip addr show btbond1
8: btbond1: mtu 1500 qdisc noqueue state UP
link/ether b0:26:28:72:0e:d0 brd ff:ff:ff:ff:ff:ff
inet 10.3.1.20/24 brd 10.3.1.255 scope global btbond1
valid_lft forever preferred_lft forever
 
[root@ODA01 network-scripts]# ip addr show btbond1.723
17: btbond1.723@btbond1: mtu 1500 qdisc noqueue state UP
link/ether b0:26:28:72:0e:d0 brd ff:ff:ff:ff:ff:ff
inet 192.20.30.2/29 brd 192.20.30.7 scope global btbond1.723
valid_lft forever preferred_lft forever
 
[root@ODA01 network-scripts]# ip addr show btbond3
9: btbond3: mtu 1500 qdisc noqueue state UP
link/ether b0:26:28:7c:8a:50 brd ff:ff:ff:ff:ff:ff
inet 10.3.1.21/24 brd 10.3.1.255 scope global btbond3
valid_lft forever preferred_lft forever
 
[root@ODA01 network-scripts]# ip addr show btbond3.723
18: btbond3.723@btbond3: mtu 1500 qdisc noqueue state UP
link/ether b0:26:28:7c:8a:50 brd ff:ff:ff:ff:ff:ff
inet 192.20.30.3/29 brd 192.20.30.7 scope global btbond3.723
valid_lft forever preferred_lft forever

But, unfortunately, such a configuration does not work reliably: some packets get dropped by the kernel when routing between both bonds.

So, what’s next?

By opening an SR with Oracle, I got confirmation that network card redundancy, i.e. physical failover between cards in case of total loss of a network card, is not possible: the appliance does not support it.

Configure an ODA with 2 network cards

In this part I would like to share how to configure an additional network card on an ODA. The IP addresses used are:
Bond 1 IP 10.3.1.20 (no VLAN) for the application and backup network
Bond 3 IP 192.20.30.2 (VLAN 723) for the replication network

btbond1 configuration

The btbond1 configuration was done through configure-firstnet after reimaging the ODA:
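
The screenshot of that step is not reproduced here; for reference, the interactive session looks roughly like this (prompts paraphrased, not a verbatim transcript; values match the addresses above):

[root@ODA01 ~]# /opt/oracle/dcs/bin/configure-firstnet
# Interactive prompts (paraphrased):
#   Select the interface to configure the network on : btbond1
#   Configure DHCP on btbond1 (yes/no)               : no
#   Enter the IP address to configure                : 10.3.1.20
#   Enter the netmask address                        : 255.255.255.0
#   Enter the gateway address                        : 10.3.1.1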

btbond3 configuration

To configure an additional card on the ODA we will use the odacli create-network command.

[root@ODA01 network-scripts]# odacli create-network -n btbond3 -t BOND -g 192.20.30.1 -p 192.20.30.2 -v 723 -m Replication -s 255.255.255.248 -w Dataguard
{
"jobId" : "d4695212-82f9-4ad1-9570-a0b7cd2faca5",
"status" : "Created",
"message" : null,
"reports" : [ ],
"createTimestamp" : "March 19, 2020 11:04:17 AM CET",
"resourceList" : [ ],
"description" : "Network service creation with names btbond3:Replication ",
"updatedTime" : "March 19, 2020 11:04:17 AM CET"
}
 
[root@ODA01 network-scripts]# odacli describe-job -i "d4695212-82f9-4ad1-9570-a0b7cd2faca5"
 
Job details
----------------------------------------------------------------
ID: d4695212-82f9-4ad1-9570-a0b7cd2faca5
Description: Network service creation with names btbond3:Replication
Status: Success
Created: March 19, 2020 11:04:17 AM CET
Message:
 
Task Name Start Time End Time Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Setting network March 19, 2020 11:04:17 AM CET March 19, 2020 11:04:27 AM CET Success
Setting up Network March 19, 2020 11:04:17 AM CET March 19, 2020 11:04:17 AM CET Success

Check network

[root@ODA01 network-scripts]# odacli list-networks
 
ID Name NIC InterfaceType IP Address Subnet Mask Gateway VlanId
---------------------------------------- -------------------- ---------- ---------- ------------------ ------------------ ------------------ ----------
e920fc5c-62b4-4877-9008-ef3df96722ff Private-network priv0 INTERNAL 192.168.16.24 255.255.255.240
28446275-ad88-4a62-ae5d-9835bbc73d8a Public-network btbond1 BOND 10.3.1.20 255.255.255.0 10.3.1.1
9faf494b-d319-4190-8ba9-088f24e58918 Replication btbond3 BOND 192.20.30.2 255.255.255.248 192.20.30.1 723

Check IP configuration from OS

[root@ODA01 ~]# ip addr sh
1: lo: mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: em1: mtu 1500 qdisc mq state DOWN qlen 1000
link/ether 00:10:e0:ef:52:f6 brd ff:ff:ff:ff:ff:ff
3: p7p1: mtu 1500 qdisc mq master btbond1 state UP qlen 1000
link/ether b0:26:28:72:0e:d0 brd ff:ff:ff:ff:ff:ff
4: p7p2: mtu 1500 qdisc mq master btbond1 state UP qlen 1000
link/ether b0:26:28:72:0e:d0 brd ff:ff:ff:ff:ff:ff
5: p2p1: mtu 1500 qdisc mq master btbond3 state DOWN qlen 1000
link/ether b0:26:28:7c:8a:50 brd ff:ff:ff:ff:ff:ff
6: p2p2: mtu 1500 qdisc mq master btbond3 state UP qlen 1000
link/ether b0:26:28:7c:8a:50 brd ff:ff:ff:ff:ff:ff
7: bond0: mtu 1500 qdisc noop state DOWN
link/ether f2:30:97:19:bc:f4 brd ff:ff:ff:ff:ff:ff
8: btbond1: mtu 1500 qdisc noqueue state UP
link/ether b0:26:28:72:0e:d0 brd ff:ff:ff:ff:ff:ff
inet 10.3.1.20/24 brd 10.3.1.255 scope global btbond1
valid_lft forever preferred_lft forever
9: btbond3: mtu 1500 qdisc noqueue state UP
link/ether b0:26:28:7c:8a:50 brd ff:ff:ff:ff:ff:ff
inet 192.20.30.2/29 brd 192.20.30.7 scope global btbond3
valid_lft forever preferred_lft forever
12: priv0: mtu 1500 qdisc noqueue state UNKNOWN
link/ether 8e:7e:04:71:1b:96 brd ff:ff:ff:ff:ff:ff
inet 192.168.16.24/28 brd 192.168.16.31 scope global priv0
valid_lft forever preferred_lft forever
13: virbr0: mtu 1500 qdisc noqueue state DOWN
link/ether 52:54:00:c3:87:7d brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
14: virbr0-nic: mtu 1500 qdisc noop master virbr0 state DOWN qlen 500
link/ether 52:54:00:c3:87:7d brd ff:ff:ff:ff:ff:ff

Ping the IP address

From the other ODA we can check that ODA01 is answering on both IPs.

[root@ODA02 network-scripts]# ping 192.20.30.2
PING 192.20.30.2 (192.20.30.2) 56(84) bytes of data.
64 bytes from 192.20.30.2: icmp_seq=1 ttl=64 time=0.144 ms
64 bytes from 192.20.30.2: icmp_seq=2 ttl=64 time=0.078 ms
64 bytes from 192.20.30.2: icmp_seq=3 ttl=64 time=0.082 ms
^C
--- 192.20.30.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2575ms
rtt min/avg/max/mdev = 0.078/0.101/0.144/0.031 ms
 
[root@ODA02 network-scripts]# ping 10.3.1.20
PING 10.3.1.20 (10.3.1.20) 56(84) bytes of data.
64 bytes from 10.3.1.20: icmp_seq=1 ttl=64 time=0.153 ms
64 bytes from 10.3.1.20: icmp_seq=2 ttl=64 time=0.077 ms
64 bytes from 10.3.1.20: icmp_seq=3 ttl=64 time=0.075 ms
^C
--- 10.3.1.20 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2182ms
rtt min/avg/max/mdev = 0.075/0.101/0.153/0.038 ms

Conclusion

On the Oracle Database Appliance, network redundancy is only possible on the port side, i.e. it covers a faulty cable. Today, there is no solution to cover a faulty network card or the loss of a whole card. Having a bond span several network cards is not possible, and neither can we configure multiple IP addresses of the same network over several network cards.
Of course, adding a network IP from a different VLAN on a card is still possible using odaadmcli, as long as all bondings are on different subnets.

Cet article How to configure additional network card on an ODA X8 family est apparu en premier sur Blog dbi services.

Issue deleting a database on ODA?


I have recently faced an issue deleting databases on an ODA. I was getting the following error whatever database I wanted to delete: DCS-10001:Internal error encountered: null.

Through this blog, I would like to share my experience on this case, hoping it will help you if you are facing the same problem. On this project I was using ODA releases 18.5 and 18.8 and faced the same problem on both versions. On 18.3 and previous releases this was not the case.

Deleting the database

With odacli I tried to delete my TEST database, running the following command:

[root@ODA01 bin]# odacli delete-database -in TEST -fd
{
"jobId" : "bcdcbf59-0fe6-44b7-af7f-91f68c7697ed",
"status" : "Running",
"message" : null,
"reports" : [ {
"taskId" : "TaskZJsonRpcExt_858",
"taskName" : "Validate db d6542252-dfa4-47f9-9cfc-22b4f0575c51 for deletion",
"taskResult" : "",
"startTime" : "May 06, 2020 11:36:38 AM CEST",
"endTime" : "May 06, 2020 11:36:38 AM CEST",
"status" : "Success",
"taskDescription" : null,
"parentTaskId" : "TaskSequential_856",
"jobId" : "bcdcbf59-0fe6-44b7-af7f-91f68c7697ed",
"tags" : [ ],
"reportLevel" : "Info",
"updatedTime" : "May 06, 2020 11:36:38 AM CEST"
} ],
"createTimestamp" : "May 06, 2020 11:36:38 AM CEST",
"resourceList" : [ ],
"description" : "Database service deletion with db name: TEST with id : d6542252-dfa4-47f9-9cfc-22b4f0575c51",
"updatedTime" : "May 06, 2020 11:36:38 AM CEST"
}

The job failed with a DCS-10001 error:

[root@ODA01 bin]# odacli describe-job -i "bcdcbf59-0fe6-44b7-af7f-91f68c7697ed"
 
Job details
----------------------------------------------------------------
ID: bcdcbf59-0fe6-44b7-af7f-91f68c7697ed
Description: Database service deletion with db name: TEST with id : d6542252-dfa4-47f9-9cfc-22b4f0575c51
Status: Failure
Created: May 6, 2020 11:36:38 AM CEST
Message: DCS-10001:Internal error encountered: null.
 
Task Name Start Time End Time Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
database Service deletion for d6542252-dfa4-47f9-9cfc-22b4f0575c51 May 6, 2020 11:36:38 AM CEST May 6, 2020 11:36:50 AM CEST Failure
database Service deletion for d6542252-dfa4-47f9-9cfc-22b4f0575c51 May 6, 2020 11:36:38 AM CEST May 6, 2020 11:36:50 AM CEST Failure
Validate db d6542252-dfa4-47f9-9cfc-22b4f0575c51 for deletion May 6, 2020 11:36:38 AM CEST May 6, 2020 11:36:38 AM CEST Success
Database Deletion May 6, 2020 11:36:39 AM CEST May 6, 2020 11:36:39 AM CEST Success
Unregister Db From Cluster May 6, 2020 11:36:39 AM CEST May 6, 2020 11:36:39 AM CEST Success
Kill Pmon Process May 6, 2020 11:36:39 AM CEST May 6, 2020 11:36:39 AM CEST Success
Database Files Deletion May 6, 2020 11:36:39 AM CEST May 6, 2020 11:36:40 AM CEST Success
Deleting Volume May 6, 2020 11:36:47 AM CEST May 6, 2020 11:36:50 AM CEST Success
database Service deletion for d6542252-dfa4-47f9-9cfc-22b4f0575c51 May 6, 2020 11:36:50 AM CEST May 6, 2020 11:36:50 AM CEST Failure

Troubleshooting

In the dcs-agent.log, located in the /opt/oracle/dcs/log folder, you might see the following errors:

2019-11-27 13:54:30,106 ERROR [database Service deletion for 89e11f5d-9789-44a3-a09d-2444f0fda99e : JobId=05a2d017-9b64-4e92-a7df-3ded603d0644] [] c.o.d.c.j.JsonRequestProcessor: RPC request invocation failed on request: {"classz":"com.oracle.dcs.agent.rpc.service.dataguard.DataguardActions","method":"deleteListenerEntry","params":[{"type":"com.oracle.dcs.agent.model.DB","value":{"updatedTime":1573023492194,"id":"89e11f5d-9789-44a3-a09d-2444f0fda99e","name":"TEST","createTime":1573023439244,"state":{"status":"CONFIGURED"},"dbName":"TEST","databaseUniqueName":"TEST_RZB","dbVersion":"11.2.0.4.190115","dbHomeId":"c58cdcfd-e5b2-4041-b993-8df5a5d5ada4","dbId":null,"isCdb":false,"pdBName":null,"pdbAdminUserName":null,"enableTDE":false,"isBcfgInSync":null,"dbType":"SI","dbTargetNodeNumber":"0","dbClass":"OLTP","dbShape":"odb1","dbStorage":"ACFS","dbOnFlashStorage":false,"level0BackupDay":"sunday","instanceOnly":true,"registerOnly":false,"rmanBkupPassword":null,"dbEdition":"SE","dbDomainName":"ksbl.local","dbRedundancy":null,"dbCharacterSet":{"characterSet":"AL32UTF8","nlsCharacterset":"AL16UTF16","dbTerritory":"AMERICA","dbLanguage":"AMERICAN"},"dbConsoleEnable":false,"backupDestination":"NONE","cloudStorageContainer":null,"backupConfigId":null,"isAutoBackupDisabled":false}}],"revertable":false,"threadId":111}
! java.lang.NullPointerException: null
! at com.oracle.dcs.agent.rpc.service.dataguard.DataguardOperations.deleteListenerEntry(DataguardOperations.java:2258)
! at com.oracle.dcs.agent.rpc.service.dataguard.DataguardActions.deleteListenerEntry(DataguardActions.java:24)
! at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
! at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
! at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
! at java.lang.reflect.Method.invoke(Method.java:498)
! at com.oracle.dcs.commons.jrpc.JsonRequestProcessor.invokeRequest(JsonRequestProcessor.java:33)
! ... 23 common frames omitted
! Causing: com.oracle.dcs.commons.exception.DcsException: DCS-10001:Internal error encountered: null.
! at com.oracle.dcs.commons.exception.DcsException$Builder.build(DcsException.java:68)
! at com.oracle.dcs.commons.jrpc.JsonRequestProcessor.invokeRequest(JsonRequestProcessor.java:45)
! at com.oracle.dcs.commons.jrpc.JsonRequestProcessor.process(JsonRequestProcessor.java:74)
! at com.oracle.dcs.agent.task.TaskZJsonRpcExt.callInternal(TaskZJsonRpcExt.java:65)
! at com.oracle.dcs.agent.task.TaskZJsonRpc.call(TaskZJsonRpc.java:182)
! at com.oracle.dcs.agent.task.TaskZJsonRpc.call(TaskZJsonRpc.java:26)
! at com.oracle.dcs.commons.task.TaskWrapper.call(TaskWrapper.java:82)
! at com.oracle.dcs.commons.task.TaskApi.call(TaskApi.java:37)
! at com.oracle.dcs.commons.task.TaskSequential.call(TaskSequential.java:39)
! at com.oracle.dcs.commons.task.TaskSequential.call(TaskSequential.java:10)
! at com.oracle.dcs.commons.task.TaskWrapper.call(TaskWrapper.java:82)
! at com.oracle.dcs.commons.task.TaskApi.call(TaskApi.java:37)
! at com.oracle.dcs.commons.task.TaskSequential.call(TaskSequential.java:39)
! at com.oracle.dcs.agent.task.TaskZLockWrapper.call(TaskZLockWrapper.java:64)
! at com.oracle.dcs.agent.task.TaskZLockWrapper.call(TaskZLockWrapper.java:21)
! at com.oracle.dcs.commons.task.TaskWrapper.call(TaskWrapper.java:82)
! at com.oracle.dcs.commons.task.TaskApi.call(TaskApi.java:37)
! at com.oracle.dcs.commons.task.TaskSequential.call(TaskSequential.java:39)
! at com.oracle.dcs.commons.task.TaskSequential.call(TaskSequential.java:10)
! at com.oracle.dcs.commons.task.TaskWrapper.call(TaskWrapper.java:82)
! at com.oracle.dcs.commons.task.TaskWrapper.call(TaskWrapper.java:17)
! at java.util.concurrent.FutureTask.run(FutureTask.java:266)
! at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
! at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
! at java.lang.Thread.run(Thread.java:748)
2019-11-27 13:54:30,106 INFO [database Service deletion for 89e11f5d-9789-44a3-a09d-2444f0fda99e : JobId=05a2d017-9b64-4e92-a7df-3ded603d0644] [] c.o.d.a.z.DCSZooKeeper: DCS node id is - node_0
2019-11-27 13:54:30,106 DEBUG [database Service deletion for 89e11f5d-9789-44a3-a09d-2444f0fda99e : JobId=05a2d017-9b64-4e92-a7df-3ded603d0644] [] c.o.d.a.t.TaskZJsonRpc: Task[TaskZJsonRpcExt_124] RPC request 'Local:node_0@deleteListenerEntry()' completed: Failure

The key error to note is: Local:node_0@deleteListenerEntry() completed: Failure

Explanation

This problem comes from the fact that the listener.ora file has been customized. As per Oracle Support, on an ODA the listener.ora should never be customized and the default listener.ora file should be used. I still have an SR open with Oracle Support to clarify the situation, as I’m fully convinced that this is a regression:

  1. It was always possible in previous ODA versions to delete a database with a customized listener file.
  2. We need to customize the listener when setting up Data Guard on Oracle version 11.2.0.4 (still supported on ODA).
  3. We need to customize the listener when doing a duplication, as dynamic registration is not possible while the database is in nomount state and the database is restarted during the duplication (a typical static entry is sketched below).
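
For context, the customization in question is typically a static SID_LIST entry in listener.ora, along these lines (a sketch; names and paths are examples only):

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = TEST_DGMGRL)   # static service, reachable even in nomount state
      (SID_NAME = TEST)
      (ORACLE_HOME = /u01/app/oracle/product/11.2.0.4/dbhome_1)
    )
  )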

Moreover, other ODA documents still refer to customizing the listener.ora file when using an ODA:
White paper: STEPS TO MIGRATE NON-CDB DATABASES TO ACFS ON ORACLE DATABASE APPLIANCE 12.1.2
Deploying Oracle Data Guard with Oracle Database Appliance – A WhitePaper (2016-7) (Doc ID 2392307.1)

I will update the post as soon as I have some feedback from Oracle support on this.

The workaround is to put back the default listener.ora file at the time of the deletion, which may require a maintenance window for some customers.

Solution/Workaround

Backup of the current listener configuration

OK, so let’s back up our current listener configuration first:

grid@ODA01:/home/grid/ [+ASM1] cd $TNS_ADMIN
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] cp -p listener.ora ./history/listener.ora.20200506

Default ODA listener configuration

The backup of the default listener configuration is the following one :

grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] cat listener19071611AM2747.bak
LISTENER=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))) # line added by Agent
ASMNET1LSNR_ASM=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=ASMNET1LSNR_ASM)))) # line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_ASMNET1LSNR_ASM=ON # line added by Agent
VALID_NODE_CHECKING_REGISTRATION_ASMNET1LSNR_ASM=SUBNET # line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=ON # line added by Agent
VALID_NODE_CHECKING_REGISTRATION_LISTENER=SUBNET # line added by Agent

Stopping the listener

Let’s stop the listener :

grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl stop listener -listener listener
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl status listener -listener listener
Listener LISTENER is enabled
Listener LISTENER is not running

Put default listener configuration

grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] mv listener.ora listener.ora.before_db_del_20200506
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] cp -p listener19071611AM2747.bak listener.ora

Start the listener

grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl start listener -listener listener
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl status listener -listener listener
Listener LISTENER is enabled
Listener LISTENER is running on node(s): oda01

Delete database

We will try to delete the database again by running the same odacli command :

[root@ODA01 bin]# odacli delete-database -in TEST -fd
{
"jobId" : "5655be19-e0fe-4452-b8a9-35382c67bf96",
"status" : "Running",
"message" : null,
"reports" : [ {
"taskId" : "TaskZJsonRpcExt_1167",
"taskName" : "Validate db d6542252-dfa4-47f9-9cfc-22b4f0575c51 for deletion",
"taskResult" : "",
"startTime" : "May 06, 2020 11:45:01 AM CEST",
"endTime" : "May 06, 2020 11:45:01 AM CEST",
"status" : "Success",
"taskDescription" : null,
"parentTaskId" : "TaskSequential_1165",
"jobId" : "5655be19-e0fe-4452-b8a9-35382c67bf96",
"tags" : [ ],
"reportLevel" : "Info",
"updatedTime" : "May 06, 2020 11:45:01 AM CEST"
} ],
"createTimestamp" : "May 06, 2020 11:45:01 AM CEST",
"resourceList" : [ ],
"description" : "Database service deletion with db name: TEST with id : d6542252-dfa4-47f9-9cfc-22b4f0575c51",
"updatedTime" : "May 06, 2020 11:45:01 AM CEST"
}

Unfortunately, the deletion fails again with another error: DCS-10011:Input parameter ‘ACFS Device for delete’ cannot be NULL.

This is due to the fact that the previous deletion attempt already removed the corresponding ACFS volumes for the database (DATA and REDO). We will have to create them manually again. I have already described this solution in a previous post: Database deletion stuck in deleting-status.
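
As a reminder, and hedged since volume names, sizes and mount points depend on your environment, recreating a missing volume goes along these lines (see the referenced post for the full procedure):

grid@ODA01:/home/grid/ [+ASM1] asmcmd volcreate -G DATA -s 10G datTEST   # name and size illustrative
grid@ODA01:/home/grid/ [+ASM1] asmcmd volinfo -G DATA datTEST            # note the "Volume Device" path
[root@ODA01 ~]# mkfs.acfs /dev/asm/dattest-256                           # device path taken from volinfo
[root@ODA01 ~]# mount -t acfs /dev/asm/dattest-256 /u02/app/oracle/oradata/datTEST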

After restoring the corresponding ACFS volumes, we can retry our database deletion:

[root@ODA01 bin]# odacli delete-database -in TEST -fd
{
"jobId" : "5e227755-478b-46c5-a5cd-36687cb21ed8",
"status" : "Running",
"message" : null,
"reports" : [ {
"taskId" : "TaskZJsonRpcExt_1443",
"taskName" : "Validate db d6542252-dfa4-47f9-9cfc-22b4f0575c51 for deletion",
"taskResult" : "",
"startTime" : "May 06, 2020 11:47:53 AM CEST",
"endTime" : "May 06, 2020 11:47:53 AM CEST",
"status" : "Success",
"taskDescription" : null,
"parentTaskId" : "TaskSequential_1441",
"jobId" : "5e227755-478b-46c5-a5cd-36687cb21ed8",
"tags" : [ ],
"reportLevel" : "Info",
"updatedTime" : "May 06, 2020 11:47:53 AM CEST"
} ],
"createTimestamp" : "May 06, 2020 11:47:53 AM CEST",
"resourceList" : [ ],
"description" : "Database service deletion with db name: TEST with id : d6542252-dfa4-47f9-9cfc-22b4f0575c51",
"updatedTime" : "May 06, 2020 11:47:53 AM CEST"
}

This time, the deletion completes successfully:

[root@ODA01 bin]# odacli describe-job -i "5e227755-478b-46c5-a5cd-36687cb21ed8"
 
Job details
----------------------------------------------------------------
ID: 5e227755-478b-46c5-a5cd-36687cb21ed8
Description: Database service deletion with db name: TEST with id : d6542252-dfa4-47f9-9cfc-22b4f0575c51
Status: Success
Created: May 6, 2020 11:47:53 AM CEST
Message:
 
Task Name Start Time End Time Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Validate db d6542252-dfa4-47f9-9cfc-22b4f0575c51 for deletion May 6, 2020 11:47:53 AM CEST May 6, 2020 11:47:53 AM CEST Success
Database Deletion May 6, 2020 11:47:53 AM CEST May 6, 2020 11:47:54 AM CEST Success
Unregister Db From Cluster May 6, 2020 11:47:54 AM CEST May 6, 2020 11:47:54 AM CEST Success
Kill Pmon Process May 6, 2020 11:47:54 AM CEST May 6, 2020 11:47:54 AM CEST Success
Database Files Deletion May 6, 2020 11:47:54 AM CEST May 6, 2020 11:47:54 AM CEST Success
Deleting Volume May 6, 2020 11:48:01 AM CEST May 6, 2020 11:48:05 AM CEST Success
Delete File Groups of Database TEST May 6, 2020 11:48:05 AM CEST May 6, 2020 11:48:05 AM CEST Success

Restore our customized listener configuration

We can now restore our customized configuration as follows :

grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl stop listener -listener listener
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl status listener -listener listener
Listener LISTENER is enabled
Listener LISTENER is not running
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] mv listener.ora.before_db_del_20200506 listener.ora
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl start listener -listener listener
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl status listener -listener listener
Listener LISTENER is enabled
Listener LISTENER is running on node(s): oda01

We can also confirm that the listener started successfully by looking at the running tnslsnr processes:

grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] ps -ef | grep tnslsnr | grep -v grep
grid 14922 1 0 10:52 ? 00:00:00 /u01/app/18.0.0.0/grid/bin/tnslsnr ASMNET1LSNR_ASM -no_crs_notify -inherit
grid 97812 1 0 12:07 ? 00:00:00 /u01/app/18.0.0.0/grid/bin/tnslsnr LISTENER -no_crs_notify -inherit

Conclusion

Starting with ODA release 18.5, database deletion fails if the listener has been customized. The workaround is to restore the default listener configuration before executing the deletion. For some customers this might imply a maintenance window.

Cet article Issue deleting a database on ODA? est apparu en premier sur Blog dbi services.


6 things Oracle could do for a better ODA


Introduction

With the latest ODA X8 range, at least 80% of customers could find an ODA configuration that fits their needs. For the others, either they can’t afford it, or they are already in the Cloud, or they need extremely large storage or EXADATA-level performance. Among these 80% of customers, only a few choose ODA. Let’s see how Oracle could improve the ODA to make it a must.

1) Make robust and reliable releases

This is the main point. The ODA is built on Linux, Grid Infrastructure and database software, nearly identical to what you can find on a classic Linux server. But it comes bundled with odacli, a central CLI to manage deployment, database creation, updates, migrations and so on. And it sometimes has annoying bugs. More reliable releases would make patching less tricky, and customers much more confident in this kind of operation.

It would also be nice to have long-term releases on ODA, like for the database: one deeply tested long-term release each year, with only bug fixes, for those customers who prefer stability and reliability (most of them).

2) Make a real GUI

An appliance is something that eases your life. You unpack the server, plug it in, press the power button, and start configuring it with a smart GUI. The ODA is not quite there yet. The GUI is quite basic, and most of us use the CLI to have complete control over all the features. So please Oracle, make the GUI a real strength of the ODA, with a pinch of Cloud Control features but without the complexity of Cloud Control. That would be a game-changer.

3) Integrate Data Guard management

Data Guard works fine on ODA, but you’ll have to set up the configuration yourself. Most customers plan to use Data Guard if they are using Enterprise Edition. And actually, the ODA doesn’t know about Data Guard: you’ll need to configure everything as if it were a standard server. In my dreams, an ODA could be paired up with another one, and standby databases would automatically be created on the paired ODA, duplicated and synchronized with the primaries. Later we could easily switch over and switch back from the GUI, without any specific knowledge.

There is a lot of work to achieve this, but it could be a killer feature.

4) Get rid of GI for ODA lites

Yes, Grid Infrastructure adds complexity, and a “lite” appliance should mean a simplified appliance. GI is needed mainly because we need ASM redundancy, and ASM is really nice. It’s actually better than RAID. But do you remember how ASM was configured in 10g? Just by deploying a standalone DB home and creating a pfile with instance_type=ASM. That’s it. No dependencies between ASM and the other DB homes. Once ASM is on the server, each instance can use it. And it would make patching easier for sure.

5) Make IPs modifiable

Because sometimes you need to change the public IP address of an ODA, or its name; moving to another datacenter is a good example. For now, changing IPs is only possible when the appliance is not yet deployed, meaning unused. You can still change the network configuration manually, but don’t assume future patches will work afterwards. An easy function to change the network configuration on a deployed ODA would be welcome.

6) Be proud of this product!

Last but not least. Yes, Cloud is the future. And Oracle Cloud Infrastructure is a great piece of Cloud. But it will take time for customers to migrate to the Cloud. Some of them are not even considering Cloud at all for the moment. They want on-premises solutions. ODA is THE solution that fits perfectly between OCI and EXADATA. It’s a great product, it’s worth the money and it has many years to live. To promote these appliances, Oracle could make the ODA better integrated with OCI, as a cross-technology solution: being able to back up and restore the ODA configuration to the Cloud, to put a standby database in OCI from the GUI, to duplicate a complete environment to the Cloud for testing purposes, …

Conclusion

The ODA is a serious product, but it still needs several improvements to boost its popularity.

Cet article 6 things Oracle could do for a better ODA est apparu en premier sur Blog dbi services.

How to add storage on ODA X8-2M


Recently I had to add some storage on an ODA X8-2M that I had deployed early February. At that time the latest available release was ODA 18.7. In this post I would like to share my experience and the challenges I faced.

ODA X8-2M storage extension

As per the Oracle datasheet, the ODA X8-2M initially comes with 2 NVMe SSDs installed, giving a usable capacity of 5.8 TB. The storage can be extended, by pairs of disks, up to 12 NVMe SSDs, which brings the usable ASM storage up to 29.7 TB.
In my configuration we initially had 4 NVMe SSDs and wanted to add 2 more.

Challenge

During the procedure to add the disks, I was surprised to see that with release 18.7 the usual expand storage command was not recognized.

[root@ODA01 ~]# odaadmcli expand storage -ndisk 2
Command 'odaadmcli expand storage' is not supported

What the hell is going on here? This was always possible on previous ODA generations and releases!
Looking closer at the documentation, I found the following note:
Note:In this release, you can add storage as per your requirement, or deploy the full storage capacity for Oracle Database Appliance X8-2HA and X8-2M hardware models at the time of initial deployment of the appliance. You can only utilize whatever storage you configured during the initial deployment of the appliance (before the initial system power ON and software provisioning and configuration). You cannot add additional storage after the initial deployment of the X8-2HA and X8-2M hardware models, in this release of Oracle Database Appliance, even if the expandable storage slots are present as empty.

Hmmm, 18.5 still allowed it. Fortunately, version 18.8 had just been released at that time, and post-installation storage expansion is possible again with that release.
I therefore had to patch my ODA to release 18.8 first. A good blog post about ODA 18.8 patching from one of my colleagues can be found here: Patching ODA from 18.3 to 18.8. Coming from 18.3, 18.5, or 18.7 follows the same process.

Adding disks on the ODA

Checking ASM usage

Let’s first check the current ASM usage :

grid@ODA01:/home/grid/ [+ASM1] asmcmd
 
ASMCMD> lsdg
State Type Rebal Sector Logical_Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED NORMAL N 512 512 4096 4194304 12211200 7550792 3052544 2248618 0 Y DATA/
MOUNTED NORMAL N 512 512 4096 4194304 12209152 2848956 3052032 -102044 0 N RECO/

Check state of the disk

Before adding a new disk, all current disks need to be healthy.

[root@ODA01 ~]# odaadmcli show disk
NAME PATH TYPE STATE STATE_DETAILS
 
pd_00 /dev/nvme0n1 NVD ONLINE Good
pd_01 /dev/nvme1n1 NVD ONLINE Good
pd_02 /dev/nvme3n1 NVD ONLINE Good
pd_03 /dev/nvme2n1 NVD ONLINE Good

We are using 2 ASM disk groups:
[root@ODA01 ~]# odaadmcli show diskgroup
DiskGroups
----------
DATA
RECO

Run orachk

It is recommended to run orachk and make sure the ODA is healthy before adding any new disks:

[root@ODA01 ~]# cd /opt/oracle.SupportTools/orachk/oracle.ahf/orachk
[root@ODA01 orachk]# ./orachk -nordbms

Physical disk installation

In my configuration I already have 4 disks, so the 2 additional disks will be installed in slots 4 and 5. After a disk has been plugged in, we need to power it on:

[root@ODA01 orachk]# odaadmcli power disk on pd_04
Disk 'pd_04' already powered on

It is recommended to wait at least one minute before plugging in the next disk. The LED of the disk should also turn green. Similarly, we can power on the next disk once it has been plugged into slot 5 of the server:

[root@ODA01 orachk]# odaadmcli power disk on pd_05
Disk 'pd_05' already powered on

Expand the storage

The following command is used to expand the storage with the 2 new disks:
[root@ODA01 orachk]# odaadmcli expand storage -ndisk 2
Precheck passed.
Check the progress of expansion of storage by executing 'odaadmcli show disk'
Waiting for expansion to finish ...

Check expansion

At the beginning of the expansion, we can see that the 2 new disks have been detected and are being initialized:
[root@ODA01 ~]# odaadmcli show disk
NAME PATH TYPE STATE STATE_DETAILS
 
pd_00 /dev/nvme0n1 NVD ONLINE Good
pd_01 /dev/nvme1n1 NVD ONLINE Good
pd_02 /dev/nvme3n1 NVD ONLINE Good
pd_03 /dev/nvme2n1 NVD ONLINE Good
pd_04 /dev/nvme4n1 NVD UNINITIALIZED NewDiskInserted
pd_05 /dev/nvme5n1 NVD UNINITIALIZED NewDiskInserted

Once the expansion is finished, we can check that all our disks, including the new ones, are OK:
[root@ODA01 ~]# odaadmcli show disk
NAME PATH TYPE STATE STATE_DETAILS
 
pd_00 /dev/nvme0n1 NVD ONLINE Good
pd_01 /dev/nvme1n1 NVD ONLINE Good
pd_02 /dev/nvme3n1 NVD ONLINE Good
pd_03 /dev/nvme2n1 NVD ONLINE Good
pd_04 /dev/nvme4n1 NVD ONLINE Good
pd_05 /dev/nvme5n1 NVD ONLINE Good

We can also query the ASM instance and see that the 2 new disks in slots 4 and 5 are online:
SQL> col PATH format a50
SQL> set line 300
SQL> set pagesize 500
SQL> select mount_status, header_status, mode_status, state, name, path, label from v$ASM_DISK order by name;
 
MOUNT_S HEADER_STATU MODE_ST STATE NAME PATH LABEL
------- ------------ ------- -------- ------------------------------ -------------------------------------------------- -------------------------------
CACHED MEMBER ONLINE NORMAL NVD_S00_PHLN9440011FP1 AFD:NVD_S00_PHLN9440011FP1 NVD_S00_PHLN9440011FP1
CACHED MEMBER ONLINE NORMAL NVD_S00_PHLN9440011FP2 AFD:NVD_S00_PHLN9440011FP2 NVD_S00_PHLN9440011FP2
CACHED MEMBER ONLINE NORMAL NVD_S01_PHLN94410040P1 AFD:NVD_S01_PHLN94410040P1 NVD_S01_PHLN94410040P1
CACHED MEMBER ONLINE NORMAL NVD_S01_PHLN94410040P2 AFD:NVD_S01_PHLN94410040P2 NVD_S01_PHLN94410040P2
CACHED MEMBER ONLINE NORMAL NVD_S02_PHLN9490009MP1 AFD:NVD_S02_PHLN9490009MP1 NVD_S02_PHLN9490009MP1
CACHED MEMBER ONLINE NORMAL NVD_S02_PHLN9490009MP2 AFD:NVD_S02_PHLN9490009MP2 NVD_S02_PHLN9490009MP2
CACHED MEMBER ONLINE NORMAL NVD_S03_PHLN944000SQP1 AFD:NVD_S03_PHLN944000SQP1 NVD_S03_PHLN944000SQP1
CACHED MEMBER ONLINE NORMAL NVD_S03_PHLN944000SQP2 AFD:NVD_S03_PHLN944000SQP2 NVD_S03_PHLN944000SQP2
CACHED MEMBER ONLINE NORMAL NVD_S04_PHLN947101TZP1 AFD:NVD_S04_PHLN947101TZP1 NVD_S04_PHLN947101TZP1
CACHED MEMBER ONLINE NORMAL NVD_S04_PHLN947101TZP2 AFD:NVD_S04_PHLN947101TZP2 NVD_S04_PHLN947101TZP2
CACHED MEMBER ONLINE NORMAL NVD_S05_PHLN947100BXP1 AFD:NVD_S05_PHLN947100BXP1 NVD_S05_PHLN947100BXP1
CACHED MEMBER ONLINE NORMAL NVD_S05_PHLN947100BXP2 AFD:NVD_S05_PHLN947100BXP2 NVD_S05_PHLN947100BXP2

CACHED MEMBER ONLINE DROPPING SSD_QRMDSK_P1 AFD:SSD_QRMDSK_P1 SSD_QRMDSK_P1
CACHED MEMBER ONLINE DROPPING SSD_QRMDSK_P2 AFD:SSD_QRMDSK_P2 SSD_QRMDSK_P2
 
14 rows selected.

The operating system recognizes the new disks as well:
grid@ODA01:/home/grid/ [+ASM1] cd /dev
 
grid@ODA01:/dev/ [+ASM1] ls -l nvme*
crw-rw---- 1 root root 246, 0 May 14 10:31 nvme0
brw-rw---- 1 grid asmadmin 259, 0 May 14 10:31 nvme0n1
brw-rw---- 1 grid asmadmin 259, 1 May 14 10:31 nvme0n1p1
brw-rw---- 1 grid asmadmin 259, 2 May 14 10:31 nvme0n1p2
crw-rw---- 1 root root 246, 1 May 14 10:31 nvme1
brw-rw---- 1 grid asmadmin 259, 5 May 14 10:31 nvme1n1
brw-rw---- 1 grid asmadmin 259, 10 May 14 10:31 nvme1n1p1
brw-rw---- 1 grid asmadmin 259, 11 May 14 14:38 nvme1n1p2
crw-rw---- 1 root root 246, 2 May 14 10:31 nvme2
brw-rw---- 1 grid asmadmin 259, 4 May 14 10:31 nvme2n1
brw-rw---- 1 grid asmadmin 259, 7 May 14 14:38 nvme2n1p1
brw-rw---- 1 grid asmadmin 259, 9 May 14 14:38 nvme2n1p2
crw-rw---- 1 root root 246, 3 May 14 10:31 nvme3
brw-rw---- 1 grid asmadmin 259, 3 May 14 10:31 nvme3n1
brw-rw---- 1 grid asmadmin 259, 6 May 14 10:31 nvme3n1p1
brw-rw---- 1 grid asmadmin 259, 8 May 14 10:31 nvme3n1p2
crw-rw---- 1 root root 246, 4 May 14 14:30 nvme4
brw-rw---- 1 grid asmadmin 259, 15 May 14 14:35 nvme4n1
brw-rw---- 1 grid asmadmin 259, 17 May 14 14:38 nvme4n1p1
brw-rw---- 1 grid asmadmin 259, 18 May 14 14:38 nvme4n1p2
crw-rw---- 1 root root 246, 5 May 14 14:31 nvme5
brw-rw---- 1 grid asmadmin 259, 16 May 14 14:35 nvme5n1
brw-rw---- 1 grid asmadmin 259, 19 May 14 14:38 nvme5n1p1
brw-rw---- 1 grid asmadmin 259, 20 May 14 14:38 nvme5n1p2

Check ASM space

Querying the ASM disk groups, we can see that both disk groups have gained additional space according to the percentage assigned to the DATA and RECO disk groups during appliance creation. In my case the split was 50-50 between DATA and RECO.

grid@ODA01:/dev/ [+ASM1] asmcmd
 
ASMCMD> lsdg
State Type Rebal Sector Logical_Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED NORMAL Y 512 512 4096 4194304 18316288 13655792 3052544 5301118 0 Y DATA/
MOUNTED NORMAL Y 512 512 4096 4194304 18313216 8952932 3052032 2949944 0 N RECO/
ASMCMD>
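
Note the Rebal column showing Y: ASM is still rebalancing data onto the new disks. The progress can be followed, for instance, like this (a sketch; v$asm_operation returns no rows once the rebalance is finished):

grid@ODA01:/dev/ [+ASM1] asmcmd lsop        # lists ongoing rebalance operations per disk group

SQL> select group_number, operation, state, power, est_minutes
     from v$asm_operation;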

Conclusion

Adding new disks to an ODA is quite easy and fast. Surprisingly, with ODA release 18.7 you are not able to expand ASM storage once the appliance is installed. This is really a regression, as you lose the ability to extend the storage. Fortunately, this has been solved in ODA release 18.8.

Cet article How to add storage on ODA X8-2M est apparu en premier sur Blog dbi services.

Error getting repository data for ol6_x86_64_userspace_ksplice, repository not found


During ODA deployments I noticed that, starting with release 18.7, immediately after reimaging or patching the ODA I was getting regular errors in the root mailbox. The error message came twice an hour, at 13 and 43 minutes past the hour.

Problem analysis

ksplice is implemented and used starting with ODA version 18.7. It is an open-source extension of the Linux kernel that allows security patches to be applied to a running kernel without the need for reboots, avoiding downtime and improving availability.

Unfortunately, there is an implementation bug, and an email alert is generated every 30 minutes in the root Linux user’s mailbox.

The error message is the following:
1 Cron Daemon Mon Jan 6 18:43 26/1176 "Cron export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin && [ -x /usr/bin/ksplice ] && (/usr/bin/ksplice --cron user upgrade; /usr/bin/ksp"
 
 
Message 910:
From root@ODA01.local Mon Jan 6 17:43:02 2020
Return-Path:
Date: Mon, 6 Jan 2020 17:43:02 +0100
From: root@ODA01.local (Cron Daemon)
To: root@ODA01.local
Subject: Cron export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin && [ -x /usr/bin/ksplice ] && (/usr/bin/ksplice --cron user upgrade; /usr/bin/ksplice --cron xen upgrade)
Content-Type: text/plain; charset=UTF-8
Auto-Submitted: auto-generated
X-Cron-Env:
X-Cron-Env:
X-Cron-Env:
X-Cron-Env:
X-Cron-Env:
X-Cron-Env:
Status: R
 
Error getting repository data for ol6_x86_64_userspace_ksplice, repository not found

During my troubleshooting, I found the following community discussion: https://community.oracle.com/thread/4300505?parent=MOSC_EXTERNAL&sourceId=MOSC&id=4300505
This is considered by Oracle as a bug in the 18.7 release, tracked under Bug 30147824:
Bug 30147824 – ENABLING AUTOINSTALL=YES WITH KSPLICE SENDS FREQUENT EMAIL TO ROOT ABOUT MISSING OL6_X86_64_USERSPACE_KSPLICE REPO

Workaround

Until a final fix is available, the following workaround can be implemented. Oracle’s workaround is simply not to execute ksplice:

[root@ODA01 ~]# cd /etc/cron.d
 
[root@ODA01 cron.d]# ls -l
total 20
-rw-r--r--. 1 root root 113 Aug 23 2016 0hourly
-rw-r--r--. 1 root root 818 Dec 18 19:08 ksplice
-rw-------. 1 root root 108 Mar 22 2017 raid-check
-rw-------. 1 root root 235 Jan 25 2018 sysstat
-rw-r--r--. 1 root root 747 Dec 18 19:08 uptrack
 
[root@ODA01 cron.d]# more ksplice
# Replaced by Ksplice on 2019-12-18
# /etc/cron.d/ksplice: cron job for the Ksplice client
#
# PLEASE DO NOT MODIFY THIS CRON JOB.
# Instead, contact Ksplice Support at ksplice-support_ww@oracle.com.
#
# The offsets below are chosen specifically to distribute server load
# and allow for Ksplice server maintenance windows. This cron job
# also only contacts the Ksplice server every Nth time it runs,
# depending on a load control setting on the Ksplice server.
#
# If you would like to adjust the frequency with which your
# systems check for updates, please contact Ksplice Support at
# ksplice-support_ww@oracle.com
13,43 * * * * root export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin && [ -x /usr/bin/ksplice ] && (/usr/bin/ksplice --cron user upgrade; /usr/bin/ksplice --cron xen upgrade)
 
[root@ODA01 cron.d]# mv ksplice /root/Extras/
 
[root@ODA01 cron.d]# ls -l
total 16
-rw-r--r--. 1 root root 113 Aug 23 2016 0hourly
-rw-------. 1 root root 108 Mar 22 2017 raid-check
-rw-------. 1 root root 235 Jan 25 2018 sysstat
-rw-r--r--. 1 root root 747 Dec 18 19:08 uptrack
 
[root@ODA01 cron.d]# ls -l /root/Extras/ksplice
-rw-r--r--. 1 root root 818 Dec 18 19:08 /root/Extras/ksplice
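
Once a release containing the fix is installed, re-enabling ksplice should just be a matter of moving the cron file back (crond picks up changes in /etc/cron.d automatically):

[root@ODA01 cron.d]# mv /root/Extras/ksplice /etc/cron.d/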

Cet article Error getting repository data for ol6_x86_64_userspace_ksplice, repository not found est apparu en premier sur Blog dbi services.

Applying archive log on a standby is failing with error ORA-00756: recovery detected a lost write of a data block


During a recent ODA project, I was deploying dbvisit software version 9.0.10 on Oracle Database Standard Edition version 18.7. From time to time I was getting a lost write of a data block of type KTU UNDO on the standby. Through this blog, I would like to share my investigation and experience on this subject.

Problem description

Applying archive logs on the standby database generated the following output:
PID:80969
TRACEFILE:80969_dbvctl_DBTEST_202005141045.trc
SERVER:ODA01
ERROR_CODE:2001
ORA-00756: recovery detected a lost write of a data block

The full dbvisit output is the following:
oracle@ODA01:/u01/app/dbvisit/standby/ [rdbms18000_1] ./dbvctl -d DBTEST
=============================================================
Dbvisit Standby Database Technology (9.0.10_0_g064b53e) (pid 80969)
dbvctl started on ODA01: Thu May 14 10:45:24 2020
=============================================================
 
 
>>> Applying Log file(s) from ODA02 to TESTDB on ODA01:
 
thread 1 sequence 8258 (1_8258_1033287237.arc)... done
thread 1 sequence 8259 (1_8259_1033287237.arc)... done
...
...
...
 
<<<>>>
PID:80969
TRACEFILE:80969_dbvctl_SALESPRD_202005141045.trc
SERVER:SEERP1SOP011-replica
ERROR_CODE:2001
ORA-00756: recovery detected a lost write of a data block
 
 
>>>> Dbvisit Standby terminated <<<<

Note that this problem can also be faced:
  1. during the Data Guard MRP recovery process
  2. when doing a restore/recover of a database with RMAN

The consequence was that no more archive logs could be applied on the standby.

Troubleshooting

Alert log and trace file

In the alert log, the following errors could be found:
Additional information: 7
ORA-10567: Redo is inconsistent with data block (file# 7, block# 3483690, file offset is 2768584704 bytes)
ORA-10564: tablespace UNDOTBS1
ORA-01110: data file 7: '/u02/app/oracle/oradata/DBTEST_DC13/DBTEST_DC13/datafile/o1_mf_undotbs1_h59nykn5_.dbf'
ORA-10560: block type 'KTU UNDO BLOCK'
2020-05-14T10:56:46.756001+02:00
ERROR: ORA-00756 detected lost write of a data block
Recovery interrupted!

The following errors were displayed in the trace files:
oracle@ODA01:/u01/app/oracle/diag/rdbms/dbtest_dc13/DBTEST/trace/ [DBTEST] grep "KCOX_FUTURE" *
DBTEST_ora_10502.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_20942.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_22282.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_22525.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_37658.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_4482.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_50411.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_56399.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_64093.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_67930.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_78658.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_80717.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_91154.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_9180.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK
DBTEST_ora_95242.trc:KCOX_FUTURE: CHANGE IN FUTURE OF BLOCK

Checking corruption

Checking for corruption, we can see that:
  1. There is no block corruption
  2. The corruption is only raised on UNDO blocks

oracle@ODA01:/home/oracle/ [DBTEST] rmanh
 
Recovery Manager: Release 18.0.0.0.0 - Production on Tue May 12 16:28:28 2020
Version 18.7.0.0.0
 
Copyright (c) 1982, 2018, Oracle and/or its affiliates. All rights reserved.
 
RMAN> connect target /
 
connected to target database: DBTEST (DBID=3596833858, not open)
 
RMAN> validate check logical datafile 7;
 
Starting validate at 12-MAY-2020 16:28:44
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=396 device type=DISK
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00007 name=/u02/app/oracle/oradata/DBTEST_DC13/DBTEST_DC13/datafile/o1_mf_undotbs1_h59nykn5_.dbf
channel ORA_DISK_1: validation complete, elapsed time: 00:00:15
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
7 OK 0 1 3932160 818750393
File Name: /u02/app/oracle/oradata/DBTEST_DC13/DBTEST_DC13/datafile/o1_mf_undotbs1_h59nykn5_.dbf
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 0
Index 0 0
Other 0 3932159
 
Finished validate at 12-MAY-2020 16:29:00
 
RMAN> select * from V$DATABASE_BLOCK_CORRUPTION;
 
no rows selected
 
RMAN>

Root cause

This is a known 11.2.0.4 bug that affects 18.7 as well: Bug 21629064 – ORA-600 [3020] KCOX_FUTURE by RECOVERY for KTU UNDO BLOCK SEQ:254 sometime after RMAN Restore of UNDO datafile in Source Database (Doc ID 21629064.8)

Workaround

On both the primary and the standby database, set the _undo_block_compression hidden parameter to false:
SQL> alter system set "_undo_block_compression"=FALSE scope=both;
 
System altered.
 
SQL>
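
Once explicitly set, the hidden parameter should become visible in v$parameter, so a quick sanity check on both sites could be the following (a sketch, output illustrative):

SQL> select name, value from v$parameter where name = '_undo_block_compression';

NAME                      VALUE
------------------------- ------
_undo_block_compression   FALSE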

Knowing this is a hidden parameter, I would recommend you open an Oracle SR before setting it on your database. Neither the author (that’s me 🙂 ) nor dbi services 😉 would be responsible for any issue or consequence of the commands described in this blog. This would be your own responsibility. 😉

Cet article Applying archive log on a standby is failing with error ORA-00756: recovery detected a lost write of a data block est apparu en premier sur Blog dbi services.

Dbvisit – Switchover failing with ORA-00600 error due to Unified Auditing been enabled


I have recently been deploying dbvisit version 9.0.10 on Standard Edition SE2 18.7. A graceful switchover failed with the error: ORA-00600: internal error code, arguments: [17090], [], [], [], [], [], [], [], [], [], [], [] (DBD ERROR: OCIStmtExecute)

Problem description

Running a graceful switchover fails with ORA-00600 when converting the standby database if Unified Auditing is enabled. The output is the following:

oracle@ODA01:/u01/app/dbvisit/standby/ [TESTDB] ./dbvctl -d TESTDB -o switchover
=============================================================
Dbvisit Standby Database Technology (9.0.10_0_g064b53e) (pid 15927)
dbvctl started on ODA01-replica: Thu May 7 16:22:28 2020
=============================================================
 
>>> Starting Switchover between ODA01-replica and SEERP1SOP010-replica
 
Running pre-checks ... done
Pre processing ... done
Processing primary ... done
Processing standby ... done
Converting standby ... failed
Performing rollback ... done
 
>>> Database on server ODA01-replica is still a Primary Database
>>> Database on server SEERP1SOP010-replica is still a Standby Database
 
 
<<<>>>
PID:15927
TRACEFILE:15927_dbvctl_switchover_TESTDB_202005071622.trc
SERVER:ODA01-replica
ERROR_CODE:1
Remote execution error on SEERP1SOP010-replica.
 
==============Remote Output start: SEERP1SOP010-replica===============
Standby file
/u02/app/oracle/oradata/TESTDB_DC13/TESTDB_DC13/datafile/o1_mf_system_h5bcgo03_.dbf renamed to /u02/app/oracle/oradata/TESTDB_DC41/TESTDB_DC41/datafile/o1_mf_system_hbzpn4l6_.dbf in
database TESTDB.
Standby file
...
...
...
/u02/app/oracle/oradata/TESTDB_DC41/TESTDB_DC41/datafile/o1_mf_ts_edc_d_hbzpp9yw_.dbf in database TESTDB.
Standby file /u04/app/oracle/redo/TESTDB/TESTDB_DC13/onlinelog/o1_mf_2_h5bfq0xq_.log
renamed to /u01/app/dbvisit/standby/gs/TESTDB/X.DBVISIT.REDO_2.LOG in database TESTDB.
<<<>>>
PID:36409
TRACEFILE:36409_dbvctl_f_gs_convert_standby_TESTDB_202005071626.trc
SERVER:SEERP1SOP010-replica
ERROR_CODE:600
ORA-00600: internal error code, arguments: [17090], [], [], [], [], [], [], [], [], [], [], [] (DBD ERROR: OCIStmtExecute)
>>>> Dbvisit Standby terminated <<<<
>>>> Dbvisit Standby terminated <<<<

Troubleshooting

The problem is known by Dbvisit. Their engineering team is already working on a fix that is planned to be released in version 9.1.XX of Dbvisit.

The workaround is to disable Unified Auditing.

Disable Unified Auditing

In this part I will describe how to disable Unified Auditing. Since this is an ODA, my environment uses Oracle Restart.

1- Shutdown the database

The database shutdown is executed as the oracle user, using srvctl:
oracle@ODA01:/u01/app/dbvisit/standby/ [rdbms18000_1] srvctl stop database -d TESTDB_DC13

2- Shutdown listener

The listener is owned by the grid user, and stopping it is also done using srvctl:
grid@ODA01:/home/grid/ [+ASM1] srvctl stop listener -listener LISTENER

3- Relink oracle executable

For the current database home, the oracle executable needs to be relinked :
oracle@ODA01:/u01/app/dbvisit/standby/ [rdbms18000_1] cd $ORACLE_HOME/rdbms/lib
 
oracle@ODA01:/u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/ [rdbms18000_1] make -f ins_rdbms.mk uniaud_off ioracle ORACLE_HOME=$ORACLE_HOME
/usr/bin/ar d /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/libknlopt.a kzaiang.o
/usr/bin/ar cr /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/libknlopt.a /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/kzanang.o
chmod 755 /u01/app/oracle/product/18.0.0.0/dbhome_1/bin
- Linking Oracle
rm -f /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/oracle
/u01/app/oracle/product/18.0.0.0/dbhome_1/bin/orald -o /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/oracle -m64 -z noexecstack -Wl,--disable-new-dtags -L/u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/ -L/u01/app/oracle/product/18.0.0.0/dbhome_1/lib/ -L/u01/app/oracle/product/18.0.0.0/dbhome_1/lib/stubs/ -Wl,-E /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/opimai.o /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/ssoraed.o /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/ttcsoi.o -Wl,--whole-archive -lperfsrv18 -Wl,--no-whole-archive /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/nautab.o /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/naeet.o /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/naect.o /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/naedhs.o /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/config.o -ldmext -lserver18 -lodm18 -lofs -lcell18 -lnnet18 -lskgxp18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lxml18 -lcore18 -lunls18 -lsnls18 -lnls18 -lcore18 -lnls18 -lclient18 -lvsnst18 -lcommon18 -lgeneric18 -lknlopt -loraolap18 -lskjcx18 -lslax18 -lpls18 -lrt -lplp18 -ldmext -lserver18 -lclient18 -lvsnst18 -lcommon18 -lgeneric18 `if [ -f /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/libavserver18.a ] ; then echo "-lavserver18" ; else echo "-lavstub18"; fi` `if [ -f /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/libavclient18.a ] ; then echo "-lavclient18" ; fi` -lknlopt -lslax18 -lpls18 -lrt -lplp18 -ljavavm18 -lserver18 -lwwg `cat /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/ldflags` -lncrypt18 -lnsgr18 -lnzjs18 -ln18 -lnl18 -lngsmshd18 -lnro18 `cat /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/ldflags` -lncrypt18 -lnsgr18 -lnzjs18 -ln18 -lnl18 -lngsmshd18 -lnnzst18 -lzt18 -lztkg18 -lmm -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lxml18 -lcore18 -lunls18 -lsnls18 -lnls18 -lcore18 -lnls18 -lztkg18 `cat /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/ldflags` -lncrypt18 -lnsgr18 -lnzjs18 -ln18 -lnl18 -lngsmshd18 -lnro18 `cat /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/ldflags` -lncrypt18 -lnsgr18 -lnzjs18 -ln18 -lnl18 -lngsmshd18 -lnnzst18 -lzt18 -lztkg18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lxml18 -lcore18 -lunls18 -lsnls18 -lnls18 -lcore18 -lnls18 `if /usr/bin/ar tv /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/libknlopt.a | grep "kxmnsd.o" > /dev/null 2>&1 ; then echo " " ; else echo "-lordsdo18 -lserver18"; fi` -L/u01/app/oracle/product/18.0.0.0/dbhome_1/ctx/lib/ -lctxc18 -lctx18 -lzx18 -lgx18 -lctx18 -lzx18 -lgx18 -lordimt -lclscest18 -loevm -lclsra18 -ldbcfg18 -lhasgen18 -lskgxn2 -lnnzst18 -lzt18 -lxml18 -lgeneric18 -locr18 -locrb18 -locrutl18 -lhasgen18 -lskgxn2 -lnnzst18 -lzt18 -lxml18 -lgeneric18 -lgeneric18 -lorazip -loraz -llzopro5 -lorabz2 -lipp_z -lipp_bz2 -lippdcemerged -lippsemerged -lippdcmerged -lippsmerged -lippcore -lippcpemerged -lippcpmerged -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lxml18 -lcore18 -lunls18 -lsnls18 -lnls18 -lcore18 -lnls18 -lsnls18 -lunls18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lxml18 -lcore18 -lunls18 -lsnls18 -lnls18 -lcore18 -lnls18 -lasmclnt18 -lcommon18 -lcore18 -ledtn18 -laio -lons -lfthread18 `cat /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/sysliblist` -Wl,-rpath,/u01/app/oracle/product/18.0.0.0/dbhome_1/lib -lm `cat /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/sysliblist` -ldl -lm -L/u01/app/oracle/product/18.0.0.0/dbhome_1/lib `test -x /usr/bin/hugeedit -a -r 
/usr/lib64/libhugetlbfs.so && test -r /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/shugetlbfs.o && echo -Wl,-zcommon-page-size=2097152 -Wl,-zmax-page-size=2097152 -lhugetlbfs`
rm -f /u01/app/oracle/product/18.0.0.0/dbhome_1/bin/oracle
mv /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/oracle /u01/app/oracle/product/18.0.0.0/dbhome_1/bin/oracle
chmod 6751 /u01/app/oracle/product/18.0.0.0/dbhome_1/bin/oracle
(if [ ! -f /u01/app/oracle/product/18.0.0.0/dbhome_1/bin/crsd.bin ]; then \
getcrshome="/u01/app/oracle/product/18.0.0.0/dbhome_1/srvm/admin/getcrshome" ; \
if [ -f "$getcrshome" ]; then \
crshome="`$getcrshome`"; \
if [ -n "$crshome" ]; then \
if [ $crshome != /u01/app/oracle/product/18.0.0.0/dbhome_1 ]; then \
oracle="/u01/app/oracle/product/18.0.0.0/dbhome_1/bin/oracle"; \
$crshome/bin/setasmgidwrap oracle_binary_path=$oracle; \
fi \
fi \
fi \
fi\
);

4- Start listener

As the grid user, execute:
grid@ODA01:/home/grid/ [+ASM1] srvctl start listener -listener LISTENER

5- Start database

As the oracle user, execute:
oracle@ODA01:/home/oracle/ [rdbms18000_2] srvctl start database -d TESTDB_DC13
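
Before touching the policies, it is worth verifying that the relink took effect; the option should now report FALSE (a sketch, output illustrative):

SQL> select value from v$option where parameter = 'Unified Auditing';

VALUE
------
FALSE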

6- Deactivate any existing Unified Auditing policies

oracle@ODA01:/home/oracle/ [SALESPRD] sqh
 
SQL> set line 300
SQL> column user_name format a20
SQL> column policy_name format a50
SQL> column entity_name format a50
SQL> select * from audit_unified_enabled_policies;
 
USER_NAME POLICY_NAME ENABLED ENABLED_OPTION ENTITY_NAME ENTITY_ SUC FAI
-------------------- -------------------------------------------------- ------- --------------- -------------------------------------------------- ------- --- ---
ALL USERS ORA_SECURECONFIG BY BY USER ALL USERS USER YES YES
ALL USERS ORA_LOGON_FAILURES BY BY USER ALL USERS USER NO YES
 
SQL> noaudit policy ORA_SECURECONFIG;
 
Noaudit succeeded.
 
SQL> noaudit policy ORA_LOGON_FAILURES;
 
Noaudit succeeded.
 
SQL> select * from audit_unified_enabled_policies;
 
no rows selected

Cet article Dbvisit – Switchover failing with ORA-00600 error due to Unified Auditing been enabled est apparu en premier sur Blog dbi services.

Patching Oracle Database Appliance From 18.8 to 19.6


The ODA software 19.6 has been released and people are starting to patch.
A direct patch to version 19.6 is possible from version 18.8.
Before patching your deployment to Oracle Database Appliance release 19.6, you must upgrade the operating system to Oracle Linux 7.
In this blog I describe the steps I follow when patching an ODA from 18.8 to 19.6. I am using an ODA X7-2 (one node).
The first step is of course to download the required patches. You will need to copy the following patches to your ODA:
Patch 31010832 : ORACLE DATABASE APPLIANCE 19.6.0.0.0 SERVER PATCH FOR ALL ODACLI/DCS STACK
Patch 30403662 : ORACLE DATABASE APPLIANCE RDBMS CLONE FOR ODACLI/DCS STACK

Patch 31010832 will be used to upgrade the OS to Linux 7 and to patch your server to appliance release 19.6.
Patch 30403662 will install the Oracle 19c database clone files.

The first patch contains 4 files

p31010832_196000_Linux-x86-64_1of4.zip
p31010832_196000_Linux-x86-64_2of4.zip
p31010832_196000_Linux-x86-64_3of4.zip
p31010832_196000_Linux-x86-64_4of4.zip

Just use the unzip command to decompress

oracle@server-oda:/u01/oda_patch_mdi/19.6/ [ORCL] unzip p31010832_196000_Linux-x86-64_1of4.zip
oracle@server-oda:/u01/oda_patch_mdi/19.6/ [ORCL] unzip p31010832_196000_Linux-x86-64_2of4.zip
oracle@server-oda:/u01/oda_patch_mdi/19.6/ [ORCL] unzip p31010832_196000_Linux-x86-64_3of4.zip
oracle@server-oda:/u01/oda_patch_mdi/19.6/ [ORCL] unzip p31010832_196000_Linux-x86-64_4of4.zip

Once done, we have the following files, which we will use to update the repository:

oda-sm-19.6.0.0.0-200420-server1of4.zip
oda-sm-19.6.0.0.0-200420-server2of4.zip
oda-sm-19.6.0.0.0-200420-server3of4.zip
oda-sm-19.6.0.0.0-200420-server4of4.zip

Next, list your scheduled jobs with the list-schedules option:

[root@oda-server u01]# odacli list-schedules

ID                                       Name                      Description                                        CronExpression                 Disabled
---------------------------------------- ------------------------- -------------------------------------------------- ------------------------------ --------
6d9cd445-8890-4bd6-a713-f6eb8fce35d0     metastore maintenance     internal metastore maintenance                     0 0 0 1/1 * ? *                true   
113ea801-636c-45d9-ad70-448054d825d5     AgentState metastore cleanup internal agentstateentry metastore maintenance     0 0 0 1/1 * ? *                false
e1780bf3-2467-4a89-9298-857fed7fa101     bom maintenance           bom reports generation                             0 0 1 ? * SUN *                false  
2f39ed40-6fc9-4547-b9f1-11889c5c7df9     Big File Upload Cleanup   clean up expired big file uploads.                 0 0 1 ? * SUN *                false  
a624a5c4-4801-444b-9b48-8be99f9f9e48     feature_tracking_job      Feature tracking job                               0 0 20 ? * WED *               false  

And disable all enabled jobs (those whose Disabled column shows false). This is done with the command update-schedule, with -d to disable and -i for the job ID:

[root@oda-server~]# odacli update-schedule -d -i  113ea801-636c-45d9-ad70-448054d825d5
Update job schedule success
[root@oda-server~]# odacli  update-schedule -d -i  e1780bf3-2467-4a89-9298-857fed7fa101
Update job schedule success
[root@oda-server~]# odacli  update-schedule -d -i  2f39ed40-6fc9-4547-b9f1-11889c5c7df9
Update job schedule success
[root@oda-server~]# odacli  update-schedule -d -i  a624a5c4-4801-444b-9b48-8be99f9f9e48
Update job schedule success
[root@oda-server~]#
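
If there are many schedules, a small loop can disable all the enabled ones in one go (a sketch; it assumes the ID is the first column and the Disabled flag the last one, as in the listing above):

# disable every schedule whose Disabled flag is still "false"
odacli list-schedules | awk '$NF == "false" {print $1}' | while read -r id; do
  odacli update-schedule -d -i "$id"
done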

When listing the jobs again, the Disabled column must show true for all jobs.

odabr is required for the patching to 19.6. odabr is a tool to back up and restore an ODA; when running the prechecks, the result will be Failed if odabr is not installed. For downloading and using odabr, please consult the following document: ODA (Oracle Database Appliance): ODABR a System Backup/Restore Utility (Doc ID 2466177.1).
By default odabr requires 190GB of free space in the LVM, which is sometimes not available. In that case odabr should be run with specific size options. In my case I used the command below to take a backup of the ODA.
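
Before that, you can quickly check how much free space is left in the system volume group (vgs/vgdisplay are standard LVM tools; VolGroupSys is the volume group name visible in the df output later in this post):

# show the volume group and its free space
vgs VolGroupSys
vgdisplay VolGroupSys | grep -i free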

[root@oda-server~]# /opt/odabr/odabr backup -snap -osize 40 -rsize 20 -usize 80
INFO: 2020-05-29 09:31:15: Please check the logfile '/opt/odabr/out/log/odabr_81097.log' for more details


│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
 odabr - ODA node Backup Restore - Version: 2.0.1-55
 Copyright Oracle, Inc. 2013, 2020
 --------------------------------------------------------
 Author: Ruggero Citton 
 RAC Pack, Cloud Innovation and Solution Engineering Team
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│

INFO: 2020-05-29 09:31:15: Checking superuser
INFO: 2020-05-29 09:31:15: Checking Bare Metal
INFO: 2020-05-29 09:31:15: Removing existing LVM snapshots
WARNING: 2020-05-29 09:31:15: LVM snapshot for 'opt' does not exist
WARNING: 2020-05-29 09:31:15: LVM snapshot for 'u01' does not exist
WARNING: 2020-05-29 09:31:16: LVM snapshot for 'root' does not exist
INFO: 2020-05-29 09:31:16: Checking LVM size
INFO: 2020-05-29 09:31:16: Boot device backup
INFO: 2020-05-29 09:31:16: ...getting boot device
INFO: 2020-05-29 09:31:16: ...making boot device backup
INFO: 2020-05-29 09:31:20: ...boot device backup saved as '/opt/odabr/out/hbi/boot.img'
INFO: 2020-05-29 09:31:21: ...boot device backup check passed
INFO: 2020-05-29 09:31:21: Getting EFI device
INFO: 2020-05-29 09:31:21: ...making efi device backup
INFO: 2020-05-29 09:31:24: ...EFI device backup saved as '/opt/odabr/out/hbi/efi.img'
INFO: 2020-05-29 09:31:24: ...EFI device backup check passed
INFO: 2020-05-29 09:31:24: OCR backup
INFO: 2020-05-29 09:31:26: ...ocr backup saved as '/opt/odabr/out/hbi/ocrbackup_81097.bck'
INFO: 2020-05-29 09:31:26: Making LVM snapshot backup
SUCCESS: 2020-05-29 09:31:28: ...snapshot backup for 'opt' created successfully
SUCCESS: 2020-05-29 09:31:29: ...snapshot backup for 'u01' created successfully
SUCCESS: 2020-05-29 09:31:29: ...snapshot backup for 'root' created successfully
SUCCESS: 2020-05-29 09:31:29: LVM snapshots backup done successfully
[root@oda-server~]#

Snapshots can be verified using the infosnap option:

[root@oda-server~]# /opt/odabr/odabr infosnap

│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
 odabr - ODA node Backup Restore - Version: 2.0.1-55
 Copyright Oracle, Inc. 2013, 2020
 --------------------------------------------------------
 Author: Ruggero Citton 
 RAC Pack, Cloud Innovation and Solution Engineering Team
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│


LVM snap name         Status                COW Size              Data%
-------------         ----------            ----------            ------
root_snap             active                20.00 GiB             0.02%
opt_snap              active                40.00 GiB             0.21%
u01_snap              active                80.00 GiB             0.02%


You have new mail in /var/spool/mail/root
[root@oda-server~]#

We can now update the repository with the server software using the patch 31010832

For the first file

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli update-repository -f /u01/oda_patch_mdi/19.6/oda-sm-19.6.0.0.0-200420-server1of4.zip

{
  "jobId" : "e880e05e-e65a-44ea-897e-2ff376b28066",
  "status" : "Created",
  "message" : "/u01/oda_patch_mdi/19.6/oda-sm-19.6.0.0.0-200420-server1of4.zip",
  "reports" : [ ],
  "createTimestamp" : "May 29, 2020 10:11:11 AM CEST",
  "resourceList" : [ ],
  "description" : "Repository Update",
  "updatedTime" : "May 29, 2020 10:11:11 AM CEST"
}
[root@oda-server~]#

Check the job status with describe-job

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli describe-job -i "e880e05e-e65a-44ea-897e-2ff376b28066"

Job details
----------------------------------------------------------------
                     ID:  e880e05e-e65a-44ea-897e-2ff376b28066
            Description:  Repository Update
                 Status:  Success
                Created:  May 29, 2020 10:11:11 AM CEST
                Message:  /u01/oda_patch_mdi/19.6/oda-sm-19.6.0.0.0-200420-server1of4.zip

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------

[root@oda-server19.6]#

Do this for all 3 other files. All jobs must return success.

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli update-repository -f /u01/oda_patch_mdi/19.6/oda-sm-19.6.0.0.0-200420-server2of4.zip
[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli update-repository -f /u01/oda_patch_mdi/19.6/oda-sm-19.6.0.0.0-200420-server3of4.zip
[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli update-repository -f /u01/oda_patch_mdi/19.6/oda-sm-19.6.0.0.0-200420-server4of4.zip
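
A small loop does the same for the three remaining files (a sketch; check each resulting job afterwards with odacli list-jobs or describe-job):

for n in 2 3 4; do
  /opt/oracle/dcs/bin/odacli update-repository \
    -f "/u01/oda_patch_mdi/19.6/oda-sm-19.6.0.0.0-200420-server${n}of4.zip"
done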

Now let’s update the DCS agent

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli update-dcsagent -v 19.6.0.0.0
{
  "jobId" : "2a2de2bc-d0c4-466a-91a8-818089d18867",
  "status" : "Created",
  "message" : "Dcs agent will be restarted after the update. Please wait for 2-3 mins before executing the other commands",
  "reports" : [ ],
  "createTimestamp" : "May 29, 2020 10:23:38 AM CEST",
  "resourceList" : [ ],
  "description" : "DcsAgent patching",
  "updatedTime" : "May 29, 2020 10:23:38 AM CEST"
}
[root@oda-server19.6]#

Confirm that the job returns success

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli describe-job -i "2a2de2bc-d0c4-466a-91a8-818089d18867"

Job details
----------------------------------------------------------------
                     ID:  2a2de2bc-d0c4-466a-91a8-818089d18867
            Description:  DcsAgent patching
                 Status:  Success
                Created:  May 29, 2020 10:23:38 AM CEST
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
dcs-agent upgrade  to version 19.6.0.0.0 May 29, 2020 10:23:38 AM CEST       May 29, 2020 10:25:17 AM CEST       Success
Update System version                    May 29, 2020 10:25:18 AM CEST       May 29, 2020 10:25:18 AM CEST       Success

[root@oda-server19.6]#

Before upgrading the OS, we must generate a prepatch report. This will point out any issues that may cause the patching to fail.

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli create-prepatchreport -v 19.6.0.0.0 -os

Job details
----------------------------------------------------------------
                     ID:  d0a038e3-651d-478f-87a8-6cb1e125e437
            Description:  Patch pre-checks for [OS]
                 Status:  Created
                Created:  May 29, 2020 10:29:55 AM CEST
                Message:  Use 'odacli describe-prepatchreport -i d0a038e3-651d-478f-87a8-6cb1e125e437' to check details of results

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------

[root@oda-server19.6]#

We can monitor the status of the job with the describe-job command. As we can see, we have a failure:

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli describe-job -i d0a038e3-651d-478f-87a8-6cb1e125e437

Job details
----------------------------------------------------------------
                     ID:  d0a038e3-651d-478f-87a8-6cb1e125e437
            Description:  Patch pre-checks for [OS]
                 Status:  Failure
                Created:  May 29, 2020 10:29:55 AM CEST
                Message:  DCS-10001:Internal error encountered: One or more pre-checks failed. Run describe-prepatchreport for more details.

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
task:TaskZLockWrapper_132                May 29, 2020 10:30:01 AM CEST       May 29, 2020 10:40:02 AM CEST       Failure
Run patching pre-checks                  May 29, 2020 10:30:01 AM CEST       May 29, 2020 10:40:02 AM CEST       Success
Check pre-check status                   May 29, 2020 10:40:02 AM CEST       May 29, 2020 10:40:02 AM CEST       Failure

[root@oda-server19.6]#

To get more information about these errors, we use describe-prepatchreport:

[root@server-oda 19.6]# /opt/oracle/dcs/bin/odacli describe-prepatchreport  -i d0a038e3-651d-478f-87a8-6cb1e125e437
 
Patch pre-check report
------------------------------------------------------------------------
                 Job ID:  d0a038e3-651d-478f-87a8-6cb1e125e437
            Description:  Patch pre-checks for [OS]
                 Status:  FAILED
                Created:  May 29, 2020 10:29:55 AM CEST
                 Result:  One or more pre-checks failed for [OS]
 
Node Name
---------------
server-oda
 
Pre-Check                      Status   Comments
------------------------------ -------- --------------------------------------
__OS__
Validate supported versions     Success   Validated minimum supported versions.
Validate patching tag           Success   Validated patching tag: 19.6.0.0.0.
Is patch location available     Success   Patch location is available.
Validate if ODABR is installed  Success   Validated ODABR is installed
Validate if ODABR snapshots     Failed    ODABR snapshots are seen on node:
exist                                     server-oda.
Validate LVM free space         Failed    Insufficient space to create LVM
                                          snapshots on node: server-oda. Expected
                                          free space (GB): 190, available space
                                        (GB): 22.
Space checks for OS upgrade     Success   Validated space checks.
Install OS upgrade software     Success   Extracted OS upgrade patches into
                                          /root/oda-upgrade. Do not remove this
                                          directory untill OS upgrade completes.
Verify OS upgrade by running    Success   Results stored in:
preupgrade checks                         '/root/preupgrade-results/
                                          preupg_results-200529103957.tar.gz' .
                                          Read complete report file
                                          '/root/preupgrade/result.html' before
                                          attempting OS upgrade.
Validate custom rpms installed  Failed    Custom RPMs installed. Please check
                                          files
                                          /root/oda-upgrade/
                                          rpms-added-from-ThirdParty and/or
                                          /root/oda-upgrade/
                                          rpms-added-from-Oracle.
Scheduled jobs check            Success    Scheduled jobs found. Disable
                                          scheduled jobs before attempting OS
                                          upgrade.
 
 
[root@server-oda 19.6]#

As we can see, the prepatch report shows two kinds of errors:
-One related to odabr, because I had already taken a backup of my ODA (the snapshots exist, which also explains the missing LVM free space to create new ones)
-A second one related to custom rpms installed.

For the error related to odabr, here is what the documentation mentions:

If snapshots are already present on the system when odacli create-prepatchreport is run, this precheck fails, because ODACLI expects to create these snapshots itself. If the user created snapshots or the operating system upgrade was retried (due to a failure) after it had already created the snapshots, this precheck will fail. Note that if snapshots already exist, odacli update-server -c OS still continues with the upgrade.

I decided to ignore the error and to run the patch later with the --force option.

But before continuing, be sure to check the two files below and to remove all the rpm packages they mention.

/root/oda-upgrade/rpms-added-from-Oracle
/root/oda-upgrade/rpms-added-from-ThirdParty
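
A hedged sketch to review and remove them (I am assuming the files contain one package name per line; double-check every package before removing it):

# review the packages flagged by the precheck
cat /root/oda-upgrade/rpms-added-from-ThirdParty /root/oda-upgrade/rpms-added-from-Oracle
# then remove them one by one
while read -r pkg; do
  yum remove -y "$pkg"
done < /root/oda-upgrade/rpms-added-from-ThirdParty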

One more thing is to verify that you have enough free space. If needed, you can clean up your old repository:

[root@server-oda 19.6]# odacli cleanup-patchrepo -comp GI,DB -v 18.8.0.0.0

In my case, below is the status of my /, /u01 and /opt filesystems:

[root@oda-server~]# df -h / /u01 /opt
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolRoot
                       30G   11G   18G  39% /
/dev/mapper/VolGroupSys-LogVolU01
                      148G   96G   45G  69% /u01
/dev/mapper/VolGroupSys-LogVolOpt
                       63G   40G   21G  66% /opt
[root@oda-server~]#
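
If /opt or /u01 turns out to be short on space, both can be extended online. A sketch, assuming free extents are available in VolGroupSys and an ext4 filesystem (resize2fs also handles ext3):

# extend /opt by 20GB and grow the filesystem online
lvextend -L +20G /dev/mapper/VolGroupSys-LogVolOpt
resize2fs /dev/mapper/VolGroupSys-LogVolOpt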

It’s time now to start the patching process by upgrading the server to Linux 7. I ran the following command from the ILOM console, as described in the documentation.

With PuTTY, connect to your ILOM IP as root. Once connected to the ILOM, just type start /SP/console and log in as root on your server.

-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y

Serial console started.  To stop, type ESC (


Oracle Linux Server release 6.10
Kernel 4.1.12-124.33.4.el6uek.x86_64 on an x86_64

oda-server login: root
Password:
Last login: Fri May 29 08:28:34 from 10.14.211.131
[root@oda-server~]#

And as root from the ILOM console, run the OS upgrade (remember, in my case I decided to use --force):

[root@oda-server~]# odacli update-server -v 19.6.0.0.0 -c os --local --force
Verifying OS upgrade
Current OS base version: 6 is lessthan target OS base version: 7
OS needs to upgrade to 7.7
****************************************************************************
*  Depending on the hardware platform, the upgrade operation, including    *
*  node reboot, may take 30-60 minutes to complete. Individual steps in    *
*  the operation may not show progress messages for a while. Do not abort  *
*  upgrade using ctrl-c or by rebooting the system.                        *
****************************************************************************

run: cmd= '[/usr/bin/expect, /root/upgradeos.exp]'
output redirected to /root/odaUpgrade2020-05-29_13-50-23.0388.log

Running pre-upgrade checks.

Running pre-upgrade checks.
........
Running assessment of the system
.........

This will take some time, just wait. You can follow the progress from the ILOM console:


…
[  112.602728] upgrade[6905]: [505/779] (51%) installing libsmbclient-4.9.1-10.el7_7...
[  112.647750] upgrade[6905]: [506/779] (51%) installing mesa-libgbm-18.3.4-6.el7_7...
[  112.691582] upgrade[6905]: [507/779] (51%) installing device-mapper-event-1.02.158-2.0.1.el7_7.2...
[  112.719106] upgrade[6905]: Created symlink /etc/systemd/system/sockets.target.wants/dm-event.socket, pointing to /usr/lib/systemd/system/dm-event.socket.
[  112.736417] upgrade[6905]: Running in chroot, ignoring request.
[  112.745205] upgrade[6905]: [508/779] (51%) installing usbredir-0.7.1-3.el7...
[  112.780151] upgrade[6905]: [509/779] (51%) installing uptrack-offline-1.2.62.offline-0.el7...
[  113.001966] upgrade[6905]: [510/779] (51%) installing dconf-0.28.0-4.el7...
[  113.129938] upgrade[6905]: [511/779] (51%) installing iscsi-initiator-utils-iscsiuio-6.2.0.874-11.0.1.el7...
[  113.178038] upgrade[6905]: [512/779] (51%) installing iscsi-initiator-utils-6.2.0.874-11.0.1.el7...
[  113.362242] upgrade[6905]: [513/779] (52%) installing unbound-libs-1.6.6-1.el7...

At the end of the process the server reboots. If everything is fine, you should now have Oracle Linux 7.7:

[root@oda-server~]# cat /etc/oracle-release
Oracle Linux Server release 7.7
[root@oda-server~]#

After the operating system upgrade is completed successfully, run the post upgrade checks:

[root@oda-server~]# /opt/oracle/dcs/bin/odacli update-server-postcheck -v 19.6.0.0.0

Upgrade post-check report
-------------------------

Node Name
---------------
server-oda

Comp  Pre-Check                      Status   Comments
----- ------------------------------ -------- --------------------------------------
OS    OS upgrade check               SUCCESS  OS has been upgraded to OL7
GI    GI upgrade check               INFO     GI home needs to update to 19.6.0.0.200114
GI    GI status check                SUCCESS  Clusterware is running on the node
OS    ODABR snapshot                 WARNING  ODABR snapshot found. Run 'odabr delsnap' to delete.
RPM   Extra RPM check                SUCCESS  No extra RPMs found when OS was at OL6
[root@oda-server~]#

As indicated, let’s remove the snapshots we took with odabr:

[root@oda-server~]# /opt/odabr/odabr delsnap
INFO: 2020-05-29 14:22:31: Please check the logfile '/opt/odabr/out/log/odabr_60717.log' for more details

INFO: 2020-05-29 14:22:31: Removing LVM snapshots
INFO: 2020-05-29 14:22:31: ...removing LVM snapshot for 'opt'
SUCCESS: 2020-05-29 14:22:32: ...snapshot for 'opt' removed successfully
INFO: 2020-05-29 14:22:32: ...removing LVM snapshot for 'u01'
SUCCESS: 2020-05-29 14:22:32: ...snapshot for 'u01' removed successfully
INFO: 2020-05-29 14:22:32: ...removing LVM snapshot for 'root'
SUCCESS: 2020-05-29 14:22:32: ...snapshot for 'root' removed successfully
SUCCESS: 2020-05-29 14:22:32: Remove LVM snapshots done successfully

Running the postchecks again:

[root@oda-server~]# /opt/oracle/dcs/bin/odacli update-server-postcheck -v 19.6.0.0.0

Upgrade post-check report
-------------------------

Node Name
---------------
server-oda

Comp  Pre-Check                      Status   Comments
----- ------------------------------ -------- --------------------------------------
OS    OS upgrade check               SUCCESS  OS has been upgraded to OL7
GI    GI upgrade check               INFO     GI home needs to update to 19.6.0.0.200114
GI    GI status check                SUCCESS  Clusterware is running on the node
OS    ODABR snapshot                 SUCCESS  No ODABR snapshots found
RPM   Extra RPM check                SUCCESS  No extra RPMs found when OS was at OL6
[root@oda-server~]#

After the OS upgrade, let’s now upgrade the remaining components.
We start by updating the DCS admin:

[root@oda-server~]# /opt/oracle/dcs/bin/odacli update-dcsadmin -v 19.6.0.0.0
{
  "jobId" : "5bdb8d99-d3ae-421c-a05d-60b9ece94021",
  "status" : "Created",
  "message" : null,
  "reports" : [ ],
  "createTimestamp" : "May 29, 2020 14:26:32 PM CEST",
  "resourceList" : [ ],
  "description" : "DcsAdmin patching",
  "updatedTime" : "May 29, 2020 14:26:32 PM CEST"
}
[root@oda-server~]#

We validate that the job returns success:

[root@oda-server~]# /opt/oracle/dcs/bin/odacli describe-job -i "5bdb8d99-d3ae-421c-a05d-60b9ece94021"

Job details
----------------------------------------------------------------
                     ID:  5bdb8d99-d3ae-421c-a05d-60b9ece94021
            Description:  DcsAdmin patching
                 Status:  Success
                Created:  May 29, 2020 2:26:32 PM CEST
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Patch location validation                May 29, 2020 2:26:32 PM CEST        May 29, 2020 2:26:32 PM CEST        Success
dcsadmin upgrade                         May 29, 2020 2:26:32 PM CEST        May 29, 2020 2:26:32 PM CEST        Success
Update System version                    May 29, 2020 2:26:32 PM CEST        May 29, 2020 2:26:32 PM CEST        Success

[root@oda-server~]#

We update the DCS components

[root@oda-server~]# /opt/oracle/dcs/bin/odacli update-dcscomponents -v 19.6.0.0.0
{
  "jobId" : "4af0626e-d260-4f93-8948-2f241f7b2c48",
  "status" : "Success",
  "message" : null,
  "reports" : null,
  "createTimestamp" : "May 29, 2020 14:27:52 PM CEST",
  "description" : "Job completed and is not part of Agent job list",
  "updatedTime" : "May 29, 2020 14:27:52 PM CEST"
}
[root@oda-server~]#

And then we update the server

[root@oda-server~]# /opt/oracle/dcs/bin/odacli update-server -v 19.6.0.0.0
{
  "jobId" : "91f7b206-b097-4ba2-bae7-42a6b3bb0ba6",
  "status" : "Created",
  "message" : "Success of server update will trigger reboot of the node after 4-5 minutes. Please wait until the node reboots.",
  "reports" : [ ],
  "createTimestamp" : "May 29, 2020 14:31:25 PM CEST",
  "resourceList" : [ ],
  "description" : "Server Patching",
  "updatedTime" : "May 29, 2020 14:31:25 PM CEST"
}
[root@oda-server~]#

This lasts about 45 minutes. Validate that all is fine:

[root@oda-server~]# /opt/oracle/dcs/bin/odacli describe-job -i "91f7b206-b097-4ba2-bae7-42a6b3bb0ba6"

Job details
----------------------------------------------------------------
                     ID:  91f7b206-b097-4ba2-bae7-42a6b3bb0ba6
            Description:  Server Patching
                 Status:  Success
                Created:  May 29, 2020 2:31:25 PM CEST
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Patch location validation                May 29, 2020 2:31:33 PM CEST        May 29, 2020 2:31:33 PM CEST        Success
dcs-controller upgrade                   May 29, 2020 2:31:33 PM CEST        May 29, 2020 2:31:33 PM CEST        Success
Creating repositories using yum          May 29, 2020 2:31:33 PM CEST        May 29, 2020 2:31:34 PM CEST        Success
Applying HMP Patches                     May 29, 2020 2:31:34 PM CEST        May 29, 2020 2:31:34 PM CEST        Success
Patch location validation                May 29, 2020 2:31:34 PM CEST        May 29, 2020 2:31:34 PM CEST        Success
oda-hw-mgmt upgrade                      May 29, 2020 2:31:35 PM CEST        May 29, 2020 2:31:35 PM CEST        Success
OSS Patching                             May 29, 2020 2:31:35 PM CEST        May 29, 2020 2:31:35 PM CEST        Success
Applying Firmware Disk Patches           May 29, 2020 2:31:35 PM CEST        May 29, 2020 2:31:40 PM CEST        Success
Applying Firmware Expander Patches       May 29, 2020 2:31:40 PM CEST        May 29, 2020 2:31:44 PM CEST        Success
Applying Firmware Controller Patches     May 29, 2020 2:31:44 PM CEST        May 29, 2020 2:31:48 PM CEST        Success
Checking Ilom patch Version              May 29, 2020 2:31:50 PM CEST        May 29, 2020 2:31:52 PM CEST        Success
Patch location validation                May 29, 2020 2:31:52 PM CEST        May 29, 2020 2:31:53 PM CEST        Success
Save password in Wallet                  May 29, 2020 2:31:54 PM CEST        May 29, 2020 2:31:54 PM CEST        Success
Apply Ilom patch                         May 29, 2020 2:31:54 PM CEST        May 29, 2020 2:39:55 PM CEST        Success
Copying Flash Bios to Temp location      May 29, 2020 2:39:55 PM CEST        May 29, 2020 2:39:55 PM CEST        Success
Starting the clusterware                 May 29, 2020 2:39:55 PM CEST        May 29, 2020 2:41:52 PM CEST        Success
Creating GI home directories             May 29, 2020 2:41:53 PM CEST        May 29, 2020 2:41:53 PM CEST        Success
Cloning Gi home                          May 29, 2020 2:41:53 PM CEST        May 29, 2020 2:44:16 PM CEST        Success
Configuring GI                           May 29, 2020 2:44:16 PM CEST        May 29, 2020 2:46:25 PM CEST        Success
Running GI upgrade root scripts          May 29, 2020 2:46:25 PM CEST        May 29, 2020 3:03:15 PM CEST        Success
Resetting DG compatibility               May 29, 2020 3:03:15 PM CEST        May 29, 2020 3:03:33 PM CEST        Success
Running GI config assistants             May 29, 2020 3:03:33 PM CEST        May 29, 2020 3:04:18 PM CEST        Success
restart oakd                             May 29, 2020 3:04:28 PM CEST        May 29, 2020 3:04:39 PM CEST        Success
Updating GiHome version                  May 29, 2020 3:04:39 PM CEST        May 29, 2020 3:04:45 PM CEST        Success
Update System version                    May 29, 2020 3:04:54 PM CEST        May 29, 2020 3:04:54 PM CEST        Success
preRebootNode Actions                    May 29, 2020 3:04:54 PM CEST        May 29, 2020 3:05:40 PM CEST        Success
Reboot Ilom                              May 29, 2020 3:05:40 PM CEST        May 29, 2020 3:05:40 PM CEST        Success

[root@oda-server~]#

And we can verify that the components were upgraded

[root@oda-server~]# /opt/oracle/dcs/bin/odacli describe-component
System Version
---------------
19.6.0.0.0

Component                                Installed Version    Available Version
---------------------------------------- -------------------- --------------------
OAK                                       19.6.0.0.0            up-to-date
GI                                        19.6.0.0.200114       up-to-date
DB                                        11.2.0.4.190115       11.2.0.4.200114
DCSAGENT                                  19.6.0.0.0            up-to-date
ILOM                                      4.0.4.52.r133103      up-to-date
BIOS                                      41060700              up-to-date
OS                                        7.7                   up-to-date
FIRMWARECONTROLLER                        QDV1RF30              up-to-date
FIRMWAREDISK                              0121                  up-to-date
HMP                                       2.4.5.0.1             up-to-date

[root@oda-server~]#

After the server, we have to update the storage:

[root@oda-server~]# /opt/oracle/dcs/bin/odacli update-storage -v 19.6.0.0.0
{
  "jobId" : "c4b365ff-bd8c-4bca-b53d-d9d9bda90548",
  "status" : "Created",
  "message" : "Success of Storage Update may trigger reboot of node after 4-5 minutes. Please wait till node restart",
  "reports" : [ ],
  "createTimestamp" : "May 29, 2020 15:16:17 PM CEST",
  "resourceList" : [ ],
  "description" : "Storage Firmware Patching",
  "updatedTime" : "May 29, 2020 15:16:17 PM CEST"
}
[root@oda-server~]#

We can verify that the job was successful

[root@oda-server~]# /opt/oracle/dcs/bin/odacli describe-job -i "c4b365ff-bd8c-4bca-b53d-d9d9bda90548"

Job details
----------------------------------------------------------------
                     ID:  c4b365ff-bd8c-4bca-b53d-d9d9bda90548
            Description:  Storage Firmware Patching
                 Status:  Success
                Created:  May 29, 2020 3:16:17 PM CEST
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Applying Firmware Disk Patches           May 29, 2020 3:16:18 PM CEST        May 29, 2020 3:16:21 PM CEST        Success
Applying Firmware Controller Patches     May 29, 2020 3:16:21 PM CEST        May 29, 2020 3:16:26 PM CEST        Success
preRebootNode Actions                    May 29, 2020 3:16:26 PM CEST        May 29, 2020 3:16:26 PM CEST        Success
Reboot Ilom                              May 29, 2020 3:16:26 PM CEST        May 29, 2020 3:16:26 PM CEST        Success

[root@oda-server~]#

I did not patch the existing database home, because I wanted to keep the current version for my existing databases.

To be able to create new 19c databases, we have to update the repository with the corresponding RDBMS clone.
After unzipping the archive:

oracle@server-oda:/u01/oda_patch_mdi/19.6/ [ORCL] unzip p30403662_196000_Linux-x86-64.zip
Archive:  p30403662_196000_Linux-x86-64.zip
 extracting: odacli-dcs-19.6.0.0.0-200326-DB-19.6.0.0.zip

I just have to update the repository with the db clone

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli update-repository -f /u01/oda_patch_mdi/19.6/odacli-dcs-19.6.0.0.0-200326-DB-19.6.0.0.zip
{
  "jobId" : "e8e58467-460f-4b07-86db-d4b71f1e5884",
  "status" : "Created",
  "message" : "/u01/oda_patch_mdi/19.6/odacli-dcs-19.6.0.0.0-200326-DB-19.6.0.0.zip",
  "reports" : [ ],
  "createTimestamp" : "May 29, 2020 15:23:37 PM CEST",
  "resourceList" : [ ],
  "description" : "Repository Update",
  "updatedTime" : "May 29, 2020 15:23:37 PM CEST"
}
[root@oda-server19.6]#

Then I check that the job was fine

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli describe-job -i "e8e58467-460f-4b07-86db-d4b71f1e5884"

Job details
----------------------------------------------------------------
                     ID:  e8e58467-460f-4b07-86db-d4b71f1e5884
            Description:  Repository Update
                 Status:  Success
                Created:  May 29, 2020 3:23:37 PM CEST
                Message:  /u01/oda_patch_mdi/19.6/odacli-dcs-19.6.0.0.0-200326-DB-19.6.0.0.zip

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------

[root@oda-server19.6]#

Encountered ACFS Issue
During the patch I did not have any issue; all was fine. But at the end of the patch, my existing databases did not come up because of an ACFS issue.
Indeed, the ASM instance was up, but the ASM proxy instance (APX) was unstable:

[grid@oda-server~]$ srvctl start  asm -proxy -node oda-server
PRCR-1013 : Failed to start resource ora.proxy_advm
PRCR-1064 : Failed to start resource ora.proxy_advm on node oda-server
CRS-5017: The resource action "ora.proxy_advm start" encountered the following error:
ORA-01092: ORACLE instance terminated. Disconnection forced
Process ID: 0
Session ID: 0 Serial number: 0
. For details refer to "(:CLSN00107:)" in "/u01/app/grid/diag/crs/oda-server/crs/trace/crsd_oraagent_grid.trc".

Looking at the trace file:

[root@oda-servertrace]# grep -i ORA- /u01/app/grid/diag/crs/oda-server/crs/trace/crsd_oraagent_grid.trc
2020-05-29 16:51:18.216 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] ORA-03113: end-of-file on communication channel
2020-05-29 16:51:18.216 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] InstAgent::startInstance 250 ORA-3113retryCount:0 m_instanceType:2 m_lastOCIError:3113
2020-05-29 16:51:39.641 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] ORA-03113: end-of-file on communication channel
2020-05-29 16:51:39.641 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] InstAgent::startInstance 250 ORA-3113retryCount:1 m_instanceType:2 m_lastOCIError:3113
2020-05-29 16:52:01.078 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] ORA-03113: end-of-file on communication channel
2020-05-29 16:52:01.078 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] InstAgent::startInstance 250 ORA-3113retryCount:2 m_instanceType:2 m_lastOCIError:3113
2020-05-29 16:52:01.098 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] ORA-01092: ORACLE instance terminated. Disconnection forced
2020-05-29 16:52:01.099 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] InstAgent::startInstance 380 throw excp what:ORA-01092: ORACLE instance terminated. Disconnection forced
2020-05-29 16:52:01.099 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-01092: ORACLE instance terminated. Disconnection forced
[root@oda-servertrace]#

For some reason the ACFS modules were not loaded (only oracleoks and oracleafd show up below, not oracleadvm/oracleacfs). My databases were Oracle 11.2.0.4 and were using ACFS.

[root@oda-server~]# /sbin/lsmod | grep oracle
oracleoks             724992  0
oracleafd             229376  0
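
As a quick cross-check, the driver state can also be queried with the acfsdriverstate utility shipped in the GI home (same home as used below):

/u01/app/19.0.0.0/grid/bin/acfsdriverstate installed
/u01/app/19.0.0.0/grid/bin/acfsdriverstate loaded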

So I first stopped the CRS:

[root@oda-server~]#  crsctl stop crs

And then I reinstalled the ACFS modules with the following acfsroot command:

[root@oda-server~]# which acfsroot
/u01/app/19.0.0.0/grid/bin/acfsroot

[root@oda-server~]# /u01/app/19.0.0.0/grid/bin/acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9314: Removing previous ADVM/ACFS installation.
ACFS-9315: Previous ADVM/ACFS components successfully removed.
ACFS-9294: updating file /etc/sysconfig/oracledrivers.conf
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9294: updating file /etc/sysconfig/oracledrivers.conf
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.
[root@oda-server~]#

Afterwards I rebooted my server, and all databases as well as the ASM proxy instance came back up and running. I just wanted to mention this issue I had, hoping this will help.

Conclusion

Some recommendations:
- Remove all rpm packages that were manually installed.
- Verify that you have enough space on the /, /u01 and /opt filesystems; /opt and /u01 can be increased online (see the lvextend sketch earlier in this post).
- Launch the OS upgrade from the ILOM console.

I hope this blog will help.

This article Patching Oracle Database Appliance From 18.8 to 19.6 appeared first on the dbi services Blog.

Oracle 20c: Create a Far Sync Instance Is Now Easy


A far sync instance is like a standby instance in that it can receive redo from the primary database and ship that redo to other members of the Data Guard configuration. But unlike a physical standby, a far sync instance does not contain any datafiles and therefore cannot be opened for access. A far sync instance just manages a controlfile, and it cannot be converted to a primary or to any other type of standby.
Far sync instances are part of the Oracle Active Data Guard Far Sync feature, which requires an Oracle Active Data Guard license.
Until Oracle 20c, the creation of a far sync instance was manual, and the instance had to be added to the broker configuration manually as well.

Starting with Oracle 20c, Oracle can create the far sync instance for us and also add it to the broker configuration.

In this blog I am showing how to use this functionality. Below is the current configuration I am using:

DGMGRL> show configuration

Configuration - prod20

  Protection Mode: MaxAvailability
  Members:
  prod20_site1 - Primary database
    prod20_site2 - Physical standby database
    prod20_site4 - Physical standby database

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 55 seconds ago)

DGMGRL>

And I am going to create a far sync instance fs_site3 to receive changes from the primary database prod20_site1 and to ship these changes to prod20_site4.

With Oracle 20c there is a new CREATE FAR_SYNC command which will create the far sync instance for us. But before using this command, a few steps are needed.
First we have to configure a Secure External Password Store for the TNS aliases we use.

In our case we are using the following aliases:
prod20_site1
prod20_site2
prod20_site3
prod20_site4

oracle@oraadserver:/home/oracle/ [prod20 (CDB$ROOT)] tnsping prod20_site1
…
…
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = oraadserver)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = prod20_site1_dgmgrl)))
OK (0 msec)
oracle@oraadserver:/home/oracle/ [prod20 (CDB$ROOT)] tnsping prod20_site2
…
…
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = oraadserver2)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = prod20_site2_dgmgrl)))
oracle@oraadserver:/home/oracle/ [prod20 (CDB$ROOT)] tnsping prod20_site3
…
…
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = oraadserver3)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = fs_site3_dgmgrl)))
OK (0 msec)
oracle@oraadserver:/home/oracle/ [prod20 (CDB$ROOT)] tnsping prod20_site4
…
…
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = oraadserver4)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = prod20_site4_dgmgrl)))
OK (0 msec)
oracle@oraadserver:/home/oracle/ [prod20 (CDB$ROOT)]

Basically, to configure the Secure External Password Store:

mkstore -wrl wallet_location -create
mkstore -wrl wallet_location -createCredential prod20_site1 sys 
mkstore -wrl wallet_location -createCredential prod20_site2 sys 
mkstore -wrl wallet_location -createCredential prod20_site3  sys 
…

Afterwards, you will have to update your sqlnet.ora file with the location of the wallet.
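
For reference, the wallet-related entries in sqlnet.ora typically look like this (a sketch; the wallet directory is an assumption, adjust it to your own location):

WALLET_LOCATION =
  (SOURCE = (METHOD = FILE)
    (METHOD_DATA = (DIRECTORY = /u01/app/oracle/admin/wallet)))
SQLNET.WALLET_OVERRIDE = TRUE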

If everything is OK, you should normally be able to connect using your TNS alias:

oracle@oraadserver:/home/oracle/ [prod20 (CDB$ROOT)] sqlplus /@prod20_site1 as sysdba

SQL*Plus: Release 20.0.0.0.0 - Production on Sat May 30 19:20:21 2020
Version 20.2.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Last Successful login time: Sat May 30 2020 18:36:15 +02:00

Connected to:
Oracle Database 20c Enterprise Edition Release 20.0.0.0.0 - Production
Version 20.2.0.0.0

SQL> show parameter db_unique_name

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
db_unique_name			     string	 prod20_site1
SQL>

Then I started the fs_site3 instance in NOMOUNT mode:

SQL> startup nomount
ORACLE instance started.

Total System Global Area  314570960 bytes
Fixed Size		    9566416 bytes
Variable Size		  188743680 bytes
Database Buffers	  113246208 bytes
Redo Buffers		    3014656 bytes
SQL> show parameter db_unique_name

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
db_unique_name			     string	 FS_SITE3
SQL>
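
For reference, such an auxiliary instance only needs a minimal parameter file to start in NOMOUNT; a sketch (the values are assumptions derived from the names used in this configuration):

# minimal init.ora for the far sync instance (sketch)
db_name=prod20
db_unique_name=fs_site3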

And then, connecting to the broker, I can use the CREATE FAR_SYNC command:

oracle@oraadserver:/u01/ [prod20 (CDB$ROOT)] dgmgrl
DGMGRL for Linux: Release 20.0.0.0.0 - Production on Sat May 30 19:25:31 2020
Version 20.2.0.0.0

Copyright (c) 1982, 2020, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect /
Connected to "prod20_site1"
Connected as SYSDG.
DGMGRL> CREATE FAR_SYNC fs_site3 AS CONNECT IDENTIFIER IS "prod20_site3";
Creating far sync instance "fs_site3".
Connected to "prod20_site1"
Connected to "FS_SITE3"
far sync instance "fs_site3" created
far sync instance "fs_site3" added
DGMGRL>

The far sync instance is created and added in the configuration as we can verify

DGMGRL> show configuration

Configuration - prod20

  Protection Mode: MaxAvailability
  Members:
  prod20_site1 - Primary database
    prod20_site2 - Physical standby database
    prod20_site4 - Physical standby database
    fs_site3     - Far sync instance (disabled)
      ORA-16905: The member was not enabled yet.

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 31 seconds ago)

DGMGRL>

Let’s enable the far sync instance:

DGMGRL> enable far_sync fs_site3;
Enabled.
DGMGRL> show configuration

Configuration - prod20

  Protection Mode: MaxAvailability
  Members:
  prod20_site1 - Primary database
    prod20_site2 - Physical standby database
    prod20_site4 - Physical standby database
    fs_site3     - Far sync instance

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 38 seconds ago)

DGMGRL>

Now that the far sync instance is created, we can configure the redo routes for the databases.
The following configuration means:
-If prod20_site1 is the primary database, it sends its changes to prod20_site2 and to fs_site3
-And fs_site3 ships these changes to prod20_site4, as long as prod20_site1 is the primary database

DGMGRL> edit database prod20_site1 set property redoroutes='(local:prod20_site2,fs_site3)';
Property "redoroutes" updated

DGMGRL> edit far_sync fs_site3 set property redoroutes='(prod20_site1:prod20_site4 ASYNC)';
Property "redoroutes" updated
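
To double-check, the resulting values can be displayed like any other broker property (a quick verification, using the member names from above):

DGMGRL> show database prod20_site1 RedoRoutes
DGMGRL> show far_sync fs_site3 RedoRoutes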

We will go deeper into redoroutes configuration in coming blogs.

This article Oracle 20c: Create a Far Sync Instance Is Now Easy appeared first on the dbi services Blog.


Functions in SQL with the Multitenant Containers Clause


By Clemens Bleile

To prepare a presentation about Multitenant Tuning, I wanted to see the METHOD_OPT dbms_stats global preference of all my pluggable DBs. In this specific case I had 3 PDBs called pdb1, pdb2 and pdb3 in my CDB. For testing purposes I changed the global preference in pdb1 from its default ‘FOR ALL COLUMNS SIZE AUTO’ to ‘FOR ALL INDEXED COLUMNS SIZE AUTO’:

c##cbleile_adm@orclcdb@PDB1> exec dbms_stats.set_global_prefs('METHOD_OPT','FOR ALL INDEXED COLUMNS SIZE AUTO');
c##cbleile_adm@orclcdb@PDB1> select dbms_stats.get_prefs('METHOD_OPT') from dual;

DBMS_STATS.GET_PREFS('METHOD_OPT')
------------------------------------
FOR ALL INDEXED COLUMNS SIZE AUTO

Afterwards I ran my SQL with the containers clause from the root container:


c##cbleile_adm@orclcdb@CDB$ROOT> select con_id, dbms_stats.get_prefs('METHOD_OPT') method_opt from containers(dual);

    CON_ID METHOD_OPT
---------- --------------------------------
         1 FOR ALL COLUMNS SIZE AUTO
         3 FOR ALL COLUMNS SIZE AUTO
         4 FOR ALL COLUMNS SIZE AUTO
         5 FOR ALL COLUMNS SIZE AUTO

4 rows selected.

For CON_ID 3 I expected to see “FOR ALL INDEXED COLUMNS SIZE AUTO”. What is wrong here?

I actually got it to work with the following query:


c##cbleile_adm@orclcdb@CDB$ROOT> select con_id, method_opt from containers(select dbms_stats.get_prefs('METHOD_OPT') method_opt from dual);

    CON_ID METHOD_OPT
---------- ----------------------------------
         1 FOR ALL COLUMNS SIZE AUTO
         4 FOR ALL COLUMNS SIZE AUTO
         5 FOR ALL COLUMNS SIZE AUTO
         3 FOR ALL INDEXED COLUMNS SIZE AUTO

4 rows selected.

That is interesting. First of all, I didn’t know that you can actually use SELECT statements in the containers clause (according to the syntax diagram, it has to be a table or view name only). And secondly, the function dbms_stats.get_prefs in the first example has obviously been called in the root container after getting the data.

I verified that last statement with a simple test, by creating a function in all containers which just returns the container ID of the current container:


create or replace function c##cbleile_adm.show_con_id return number
as
  conid number;
begin
  select to_number(sys_context('USERENV', 'CON_ID')) into conid from sys.dual;
  return conid;
end;
/

And then the test:


c##cbleile_adm@orclcdb@CDB$ROOT> select con_id, show_con_id from containers(dual);

    CON_ID SHOW_CON_ID
---------- -----------
         1           1
         3           1
         4           1
         5           1

4 rows selected.

c##cbleile_adm@orclcdb@CDB$ROOT> select con_id, show_con_id from containers(select show_con_id from dual);

    CON_ID SHOW_CON_ID
---------- -----------
         4           4
         1           1
         3           3
         5           5

4 rows selected.

That proved that the function in the select-list of the first statement is actually called in the root container after getting the data from the PDBs.

Summary:
– be careful when running the containers clause in a SELECT statement with a function in the select-list: you may get unexpected results.
– the syntax with a select-statement in the containers clause is interesting.

REMARK: Above tests have been performed with Oracle 19.6.

This article Functions in SQL with the Multitenant Containers Clause appeared first on the dbi services Blog.

Oracle 12c – pre-built join index


By Franck Pachot

This post is part of a series of small examples of recent features. I’m running this in the Oracle 20c preview in the Oracle Cloud. I have created a few tables in the previous post with a mini-snowflake scheme: a fact table CASES with the covid-19 cases per country and day. And a dimension hierarchy for the country with COUNTRIES and CONTINENTS tables.

This title may look strange for people used to Oracle. I am showing the REFRESH FAST ON STATEMENT Materialized View clause here, also known as “Synchronous Refresh for Materialized Views”. This name makes sense only when you already know materialized views, complete and fast refreshes, on commit and on-demand refreshes… But that’s not what people will look for. Indexes are also refreshed by the statements, synchronously. Imagine that they were called “Synchronous Refresh for B*Trees”: do you think they would have been so popular?

A materialized view, like an index, is a redundant structure where data is stored in a different physical layout in order to be optimal for alternative queries. For example, you ingest data per date (which is the case in my covid-19 table – each day a new row with the covid-19 cases per country). But if I want to query all points for a specific country, those are scattered throughout the physical segment that is behind the table (or the partition). With an index on the country_code, I can easily identify one country, because the index is sorted on the country. I may need to go to the table to get the rows, and that is expensive, but I can avoid it by adding all the attributes in the index. With Oracle, as with many databases, we can build covering indexes for real index-only access, even if those names are not used.

But with my snowflake schema, I’ll not have the country_code in the fact table and I have to join to a dimension. This is more expensive because the index on the country_name will get the country_id and then I have to go to an index on the fact table to get the rows for this country_id. When it comes to joins, I cannot index the result of the join (I’m skipping bitmap join indexes here because I’m talking about covering indexes). What I would like is an index with values from multiple tables.

A materialized view can achieve much more than an index. We can build the result of the join in one table. And no need for event sourcing or streaming here to keep it up to date. No need to denormalize and risk inconsistency. When NoSQL pioneers tell you that storage is cheap and redundancy is the way to scale, just keep your relational database for integrity and build materialized views on top. When they tell you that joins are expensive, just materialize them upfront. Before 12c, keeping those materialized views consistent with the source required either:

  1. materialized view logs, which are similar to event sourcing, except that ON COMMIT refresh is strongly consistent
  2. partition change tracking, which is OK for bulk changes, when scaling big data

This is different from indexes which are maintained immediately: when you update the row, the index is synchronized because your session has the values and the rowid and can go directly to update the index entry.

refresh fast on statement

In 12c you have the benefit from both: index-like fast maintenance with rowid access, and the MView possibility of querying pre-build joins. Here is an example on the tables created in the previous post.


SQL> create materialized view flatview refresh fast on statement as
  2  select to_date(daterep,'dd/mm/yy') daterep,continent_name,country_name,cases from cases join countries using(country_id) join continents using(continent_id) where cases>0;

select to_date(daterep,'dd/mm/yy') daterep,continent_name,country_name,cases from cases join countries using(country_id) join continents using(continent_id) where cases>0
                                                                                                                                             *
ERROR at line 2:
ORA-12015: cannot create a fast refresh materialized view from a complex query

There are some limitations when we want fast refresh, and there is a utility to help us understand what we have to change or add in our SELECT clause.

explain_mview

I need to create the table where this utility writes its messages:


@ ?/rdbms/admin/utlxmv

SQL> set sqlformat ansiconsole
SQL> set pagesize 10000

This has created mv_capabilities_table and I can run dbms_mview.explain_mview() now.

Here is the call, with the select part of the materialized view:


SQL> exec dbms_mview.explain_mview('-
  2  select daterep,continent_name,country_name,cases from cases join countries using(country_id) join continents using(continent_id) where cases>0-
  3  ');

PL/SQL procedure successfully completed.

SQL> select possible "?",capability_name,related_text,msgtxt from mv_capabilities_table where capability_name like 'REFRESH_FAST%' order by seq;

   ?                  CAPABILITY_NAME    RELATED_TEXT                                                                 MSGTXT
____ ________________________________ _______________ ______________________________________________________________________
N    REFRESH_FAST
N    REFRESH_FAST_AFTER_INSERT                        inline view or subquery in FROM list not supported for this type MV
N    REFRESH_FAST_AFTER_INSERT                        inline view or subquery in FROM list not supported for this type MV
N    REFRESH_FAST_AFTER_INSERT                        view or subquery in from list
N    REFRESH_FAST_AFTER_ONETAB_DML                    see the reason why REFRESH_FAST_AFTER_INSERT is disabled
N    REFRESH_FAST_AFTER_ANY_DML                       see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled
N    REFRESH_FAST_PCT                                 PCT FAST REFRESH is not possible if query contains an inline view

SQL> rollback;

Rollback complete.

“inline view or subquery in FROM list not supported for this type MV” is actually very misleading: I use ANSI joins, and they are internally translated to query blocks, which is not supported.

No ANSI joins

I rewrite it with the old join syntax:


SQL> exec dbms_mview.explain_mview('-
  2  select daterep,continent_name,country_name,cases from cases , countries , continents where cases.country_id=countries.country_id and countries.continent_id=continents.continent_id and cases>0-
  3  ');

PL/SQL procedure successfully completed.

SQL> select possible "?",capability_name,related_text,msgtxt from mv_capabilities_table where capability_name like 'REFRESH_FAST%' order by seq;
   ?                  CAPABILITY_NAME       RELATED_TEXT                                                                      MSGTXT
____ ________________________________ __________________ ___________________________________________________________________________
N    REFRESH_FAST
N    REFRESH_FAST_AFTER_INSERT        CONTINENTS         the SELECT list does not have the rowids of all the detail tables
N    REFRESH_FAST_AFTER_INSERT        DEMO.CASES         the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_INSERT        DEMO.COUNTRIES     the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_INSERT        DEMO.CONTINENTS    the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_ONETAB_DML                       see the reason why REFRESH_FAST_AFTER_INSERT is disabled
N    REFRESH_FAST_AFTER_ANY_DML                          see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled
N    REFRESH_FAST_PCT                                    PCT is not possible on any of the detail tables in the materialized view

SQL> rollback;

Rollback complete.

Now I need to add the ROWID of the table CONTINENTS in the materialized view.

ROWID for all tables

Yes, as I mentioned, the gap between indexes and materialized views is shorter. The REFRESH FAST ON STATEMENT requires access by rowid to update the materialized view, like when a statement updates an index.


SQL> exec dbms_mview.explain_mview('-
  2  select continents.rowid continent_rowid,daterep,continent_name,country_name,cases from cases , countries , continents where cases.country_id=countries.country_id and countries.continent_id=continents.continent_id and cases>0-
  3  ');

PL/SQL procedure successfully completed.

SQL> select possible "?",capability_name,related_text,msgtxt from mv_capabilities_table where capability_name like 'REFRESH_FAST%' order by seq;
   ?                  CAPABILITY_NAME       RELATED_TEXT                                                                      MSGTXT
____ ________________________________ __________________ ___________________________________________________________________________
N    REFRESH_FAST
N    REFRESH_FAST_AFTER_INSERT        COUNTRIES          the SELECT list does not have the rowids of all the detail tables
N    REFRESH_FAST_AFTER_INSERT        DEMO.CASES         the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_INSERT        DEMO.COUNTRIES     the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_INSERT        DEMO.CONTINENTS    the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_ONETAB_DML                       see the reason why REFRESH_FAST_AFTER_INSERT is disabled
N    REFRESH_FAST_AFTER_ANY_DML                          see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled
N    REFRESH_FAST_PCT                                    PCT is not possible on any of the detail tables in the materialized view
SQL> rollback;

Rollback complete.

Now I need to add the ROWID for COUNTRIES as well. I continued like this, and finally added the ROWID of all the tables involved:


SQL> exec dbms_mview.explain_mview('-
  2  select cases.rowid case_rowid,countries.rowid country_rowid,continents.rowid continent_rowid,to_date(daterep,''dd/mm/yy'') daterep,continent_name,country_name,cases from cases , countries , continents where cases.country_id=countries.country_id and countries.continent_id=continents.continent_id and cases>0-
  3  ');

PL/SQL procedure successfully completed.

SQL> select possible "?",capability_name,related_text,msgtxt from mv_capabilities_table where capability_name like 'REFRESH_FAST%' order by seq;
   ?                  CAPABILITY_NAME       RELATED_TEXT                                                                      MSGTXT
____ ________________________________ __________________ ___________________________________________________________________________
N    REFRESH_FAST
N    REFRESH_FAST_AFTER_INSERT        DEMO.CASES         the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_INSERT        DEMO.COUNTRIES     the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_INSERT        DEMO.CONTINENTS    the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_ONETAB_DML                       see the reason why REFRESH_FAST_AFTER_INSERT is disabled
N    REFRESH_FAST_AFTER_ANY_DML                          see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled
N    REFRESH_FAST_PCT                                    PCT is not possible on any of the detail tables in the materialized view

SQL> rollback;

Rollback complete.

Ok, now another message: “the detail table does not have a materialized view log”. But that’s exactly the purpose of statement-level refresh: being able to fast refresh without creating and maintaining materialized view logs, and without full-refreshing a table or a partition.
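
For comparison, a classic fast-refresh materialized view would require a materialized view log on each detail table, created with statements like these (shown only for contrast — they are precisely what ON STATEMENT refresh lets us avoid):


create materialized view log on cases with rowid;
create materialized view log on countries with rowid;
create materialized view log on continents with rowid;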

This’t the limit of DBMS_MVIEW.EXPLAIN_MVIEW. Let’s try to create the materialized view now:


SQL> create materialized view flatview refresh fast on statement as
  2  select cases.rowid case_rowid,countries.rowid country_rowid,continents.rowid continent_rowid,to_date(daterep,'dd/mm/yy') daterep,continent_name,country_name,cases from cases , countries , continents where cases.country_id=countries.country_id and countries.continent_id=continents.continent_id and cases>0;

select cases.rowid case_rowid,countries.rowid country_rowid,continents.rowid continent_rowid,to_date(daterep,'dd/mm/yy') daterep,continent_name,country_name,cases from cases , countries , continents where cases.country_id=countries.country_id and countries.continent_id=continents.continent_id and cases>0
                                                                                                                                                                                                                                                                                    *
ERROR at line 2:
ORA-32428: on-statement materialized join view error: Shape of MV is not
supported(composite PK)

SQL>

That’s clear. I had created the fact primary key on the compound foreign keys.

Surrogate key on fact table

This is not allowed by statement-level refresh, so let’s change that:


SQL> alter table cases add (case_id number);

Table altered.

SQL> update cases set case_id=rownum;

21274 rows updated.

SQL> alter table cases drop primary key;

Table altered.

SQL> alter table cases add primary key(case_id);

Table altered.

SQL> alter table cases add unique(daterep,country_id);

Table altered.

I have added a surrogate key and defined a unique key for the composite one.

Now the creation is successful:


SQL> create materialized view flatview refresh fast on statement as
  2  select cases.rowid case_rowid,countries.rowid country_rowid,continents.rowid continent_rowid,to_date(daterep,'dd/mm/yy') daterep,continent_name,country_name,cases from cases , countries , continents where cases.country_id=countries.country_id and countries.continent_id=continents.continent_id and cases>0;

Materialized view created.

Note that I tested later and I was able to create it with the ROWID from the fact table CASES only. But that’s not a good idea: in order to propagate any change to the underlying tables, the materialized view must have their ROWIDs, like an index does. I consider the fact that this is even possible to be a bug.

Here are the columns stored in my materialized view:


SQL> desc flatview

              Name    Null?            Type
__________________ ________ _______________
CASE_ROWID                  ROWID
COUNTRY_ROWID               ROWID
CONTINENT_ROWID             ROWID
DATEREP                     DATE
CONTINENT_NAME              VARCHAR2(30)
COUNTRY_NAME                VARCHAR2(60)
CASES                       NUMBER

Storing the ROWID is not something we would usually recommend, as some maintenance operations may change the physical location of rows. You will need a complete refresh of the materialized view after an online table move, for example.
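
Here is a minimal sketch of that scenario (not run in this demo): after an online move, the stored rowids are stale and a complete refresh rebuilds them:


-- relocate the rows: the rowids stored in FLATVIEW become stale
alter table cases move online;
-- rebuild the materialized view; method 'C' requests a complete refresh
exec dbms_mview.refresh('FLATVIEW','C');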

No-join query

I’ll show query rewrite in another blog post. For the moment, I’ll query this materialized view directly.

Here is a query similar to the one in the previous post:


SQL> select continent_name,country_name,top_date,top_cases from (
  2   select continent_name,country_name,daterep,cases
  3    ,first_value(daterep)over(partition by continent_name order by cases desc) top_date
  4    ,first_value(cases)over(partition by continent_name order by cases desc)top_cases
  5    ,row_number()over(partition by continent_name order by cases desc) r
  6    from flatview
  7   )
  8   where r=1 order by top_cases
  9  ;

   CONTINENT_NAME                COUNTRY_NAME      TOP_DATE    TOP_CASES
_________________ ___________________________ _____________ ____________
Oceania           Australia                   23/03/2020             611
Africa            South_Africa                30/05/2020            1837
Asia              China                       13/02/2020           15141
Europe            Russia                      02/06/2020           17898
America           United_States_of_America    26/04/2020           48529

I have replaced the country_id and continent_id by their names as I didn’t put the identifiers in my materialized view. And I repeated the window function everywhere so that you can run the same in versions lower than 20c.

This materialized view is a table. I can partition it by hash to scatter the data. I can cluster on another column. I can add indexes. I have the full power of a SQL database on it, without the need to join if you think that joins are slow. If you come from NoSQL you can see it like a DynamoDB global index. You can query it without joining, fetching all attributes with one call, and filtering on another key than the primary key. But here we always have strong consistency: the changes are propagated immediately, fully ACID. They will be committed or rolled back by the same transaction that did the change. And they will be replicated synchronously or asynchronously to read-only replicas.
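
For example, since the materialized view is a plain table, nothing prevents adding a secondary index on it (just a sketch with a hypothetical index name, not part of the demo above):


create index flatview_continent_cases on flatview(continent_name, cases);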

DML on base tables

Let’s do some changes here, lowering the covid-19 cases of CHN to 42%:


SQL> alter session set sql_trace=true;

Session altered.

SQL> update cases set cases=cases*0.42 where country_id=(select country_id from countries where country_code='CHN');

157 rows updated.

SQL> alter session set sql_trace=false;

Session altered.

I have set sql_trace because I want to have a look at the magic behind it.

Now running my query on the materialized view:



SQL> select continent_name,country_name,top_date,top_cases from (
  2   select continent_name,country_name,daterep,cases
  3    ,first_value(daterep)over(partition by continent_name order by cases desc) top_date
  4    ,first_value(cases)over(partition by continent_name order by cases desc)top_cases
  5    ,row_number()over(partition by continent_name order by cases desc) r
  6    from flatview
  7   )
  8*  where r=1 order by top_cases;

   CONTINENT_NAME                COUNTRY_NAME      TOP_DATE    TOP_CASES
_________________ ___________________________ _____________ ____________
Oceania           Australia                   23/03/2020             611
Africa            South_Africa                30/05/2020            1837
Asia              India                       04/06/2020            9304
Europe            Russia                      02/06/2020           17898
America           United_States_of_America    26/04/2020           48529

CHN is not the top one in Asia anymore with the 42% correction.

The changes were immediately propagated to the materialized view like when indexes are updated, and we can see that in the trace:


SQL> column value new_value tracefile
SQL> select value from v$diag_info where name='Default Trace File';
                                                                     VALUE
__________________________________________________________________________
/u01/app/oracle/diag/rdbms/cdb1a_iad154/CDB1A/trace/CDB1A_ora_49139.trc


SQL> column value clear
SQL> host tkprof &tracefile trace.txt

TKPROF: Release 20.0.0.0.0 - Development on Thu Jun 4 15:43:13 2020

Copyright (c) 1982, 2020, Oracle and/or its affiliates.  All rights reserved.

sql_trace instruments all executions with time and number of rows. tkprof aggregates those for analysis.

The trace shows two statements on my materialized view: DELETE and INSERT.

The first one is about removing the modified rows.


DELETE FROM "DEMO"."FLATVIEW"
WHERE
 "CASE_ROWID" = :1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      157      0.00       0.00          0          0          0           0
Execute    157      0.01       0.04         42        314        433         141
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      314      0.01       0.04         42        314        433         141

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 634     (recursive depth: 1)
Number of plan statistics captured: 3

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  DELETE  FLATVIEW (cr=2 pr=1 pw=0 time=395 us starts=1)
         1          1          1   INDEX UNIQUE SCAN I_OS$_FLATVIEW (cr=2 pr=1 pw=0 time=341 us starts=1 cost=1 size=10 card=1)(object id 78628)

This has been done row-by-row but is optimized with an index on ROWID that has been created automatically with my materialized view.
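
That system-generated index (I_OS$_FLATVIEW in the row source above) can be checked in the dictionary with a query like this (a sketch):


select index_name, column_name
from user_ind_columns
where table_name = 'FLATVIEW';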

The second one is inserting the modified rows:


INSERT INTO  "DEMO"."FLATVIEW" SELECT "CASES".ROWID "CASE_ROWID",
  "COUNTRIES".ROWID "COUNTRY_ROWID","CONTINENTS".ROWID "CONTINENT_ROWID",
  "CASES"."DATEREP" "DATEREP","CONTINENTS"."CONTINENT_NAME" "CONTINENT_NAME",
  "COUNTRIES"."COUNTRY_NAME" "COUNTRY_NAME","CASES"."CASES" "CASES" FROM
  "CONTINENTS" "CONTINENTS","COUNTRIES" "COUNTRIES", (SELECT "CASES".ROWID
  "ROWID","CASES"."DATEREP" "DATEREP","CASES"."CASES" "CASES",
  "CASES"."COUNTRY_ID" "COUNTRY_ID" FROM "DEMO"."CASES" "CASES" WHERE
  "CASES".ROWID=(:Z)) "CASES" WHERE "CASES"."COUNTRY_ID"=
  "COUNTRIES"."COUNTRY_ID" AND "COUNTRIES"."CONTINENT_ID"=
  "CONTINENTS"."CONTINENT_ID" AND "CASES"."CASES">0


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      157      0.00       0.01          0          0          0           0
Execute    157      0.01       0.02          0        755        606         141
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      314      0.02       0.03          0        755        606         141

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 634     (recursive depth: 1)
Number of plan statistics captured: 3

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  LOAD TABLE CONVENTIONAL  FLATVIEW (cr=8 pr=0 pw=0 time=227 us starts=1)
         1          1          1   NESTED LOOPS  (cr=5 pr=0 pw=0 time=29 us starts=1 cost=3 size=47 card=1)
         1          1          1    NESTED LOOPS  (cr=3 pr=0 pw=0 time=20 us starts=1 cost=2 size=37 card=1)
         1          1          1     TABLE ACCESS BY USER ROWID CASES (cr=1 pr=0 pw=0 time=11 us starts=1 cost=1 size=17 card=1)
         1          1          1     TABLE ACCESS BY INDEX ROWID COUNTRIES (cr=2 pr=0 pw=0 time=9 us starts=1 cost=1 size=20 card=1)
         1          1          1      INDEX UNIQUE SCAN SYS_C009401 (cr=1 pr=0 pw=0 time=4 us starts=1 cost=0 size=0 card=1)(object id 78620)
         1          1          1    TABLE ACCESS BY INDEX ROWID CONTINENTS (cr=2 pr=0 pw=0 time=5 us starts=1 cost=1 size=10 card=1)
         1          1          1     INDEX UNIQUE SCAN SYS_C009399 (cr=1 pr=0 pw=0 time=2 us starts=1 cost=0 size=0 card=1)(object id 78619)

Again, this is apparently done row-by-row, as the execute count is nearly the same as the rows count. 157 is the number of rows I have updated.

You may think that this is a huge overhead, but those operations have been optimized for a long time. The materialized view is refreshed and ready for optimal queries: no need to queue, stream, reorg, vacuum,… And I can imagine that if this feature gets widely used, it will be optimized with bulk operations, which would allow compression.

Truncate

This looks all good. But… what happens if I truncate the table?


SQL> truncate table cases;

Table truncated.

SQL> select continent_name,country_name,top_date,top_cases from (
  2   select continent_name,country_name,daterep,cases
  3    ,first_value(daterep)over(partition by continent_name order by cases desc) top_date
  4    ,first_value(cases)over(partition by continent_name order by cases desc)top_cases
  5    ,row_number()over(partition by continent_name order by cases desc) r
  6    from flatview
  7   )
  8*  where r=1 order by top_cases;
   CONTINENT_NAME                COUNTRY_NAME      TOP_DATE    TOP_CASES
_________________ ___________________________ _____________ ____________
Oceania           Australia                   23/03/2020             611
Africa            South_Africa                30/05/2020            1837
Asia              India                       04/06/2020            9304
Europe            Russia                      02/06/2020           17898
America           United_States_of_America    26/04/2020           48529

Nothing changed. This is dangerous: you need to refresh the materialized view yourself, and this may be a bug. What will happen when you insert data back? Note that, like with triggers, direct-path inserts will transparently be run as conventional inserts.

Joins are not expensive

This feature is really good to pre-build the joins in a composition of tables, like a hierarchical key-value store or a snowflake-dimension fact table. You can partition, compress, order, filter, index,… as with any relational table. There is no risk with this denormalization, as it is transparently maintained when you update the underlying tables.

If you develop on a NoSQL database because you have heard that normalization was invented to reduce storage, which is not expensive anymore, that’s a myth (you can read this long thread to understand the origin of this myth). Normalization is about database integrity and the separation of logical and physical layers. And that’s what Oracle Database implements with this feature: you update the logical view, tables are normalized for integrity, and the physical layer transparently maintains additional structures like indexes and materialized views to keep queries under single-digit milliseconds. Today you still need to think about which indexes and materialized views to build. Some advisors may help. All those are the bricks for the future: an autonomous database where you define only the logical layer for your application and all those optimisations will be done in the background.

The article Oracle 12c – pre-built join index appeared first on the dbi services Blog.

Oracle 18c – select from a flat file


By Franck Pachot

This post is the first in a series of small examples on recent Oracle features. My goal is to present them to people outside of Oracle and relational database usage, maybe some NoSQL players. And this is why the title is “select from a flat-file” rather than “Inline External Tables”. In my opinion, the names of the features of Oracle Database are invented by the architects and developers, sometimes renamed by Marketing or the CTO, and all that is very far from what the users are looking for. In order to understand “Inline External Table” you need to know all the history behind it: there were tables, then external tables, and there were queries, and inlined queries, and… But imagine a junior who just wants to query a file: he will never find this feature. He has a file; it is not a table, it is not external, and it is not inline. What is external to him is this SQL language, and what we want to show him is that this language can query his file.

I’m running this in the Oracle 20c preview in the Oracle Cloud.

In this post, my goal is to load a small fact and dimension table for the next posts about some recent features that are interesting in data warehouses. It is the occasion to show that with Oracle we can easily select from a .csv file, without the need to run SQL*Loader or create an external table.
I’m running everything from SQLcl and then I use the host command to call curl:


host curl -L http://opendata.ecdc.europa.eu/covid19/casedistribution/csv/ | dos2unix | sort -r > /tmp/covid-19.csv

This gets the latest number of COVID-19 cases per day and per country.

It looks like this:


SQL> host head  /tmp/covid-19.csv
dateRep,day,month,year,cases,deaths,countriesAndTerritories,geoId,countryterritoryCode,popData2018,continentExp
31/12/2019,31,12,2019,27,0,China,CN,CHN,1392730000,Asia
31/12/2019,31,12,2019,0,0,Vietnam,VN,VNM,95540395,Asia
31/12/2019,31,12,2019,0,0,United_States_of_America,US,USA,327167434,America
31/12/2019,31,12,2019,0,0,United_Kingdom,UK,GBR,66488991,Europe
31/12/2019,31,12,2019,0,0,United_Arab_Emirates,AE,ARE,9630959,Asia
31/12/2019,31,12,2019,0,0,Thailand,TH,THA,69428524,Asia
31/12/2019,31,12,2019,0,0,Taiwan,TW,TWN,23780452,Asia
31/12/2019,31,12,2019,0,0,Switzerland,CH,CHE,8516543,Europe
31/12/2019,31,12,2019,0,0,Sweden,SE,SWE,10183175,Europe

I sorted them on date on purpose (next posts may talk about data clustering).

I need a directory object to access the file:


SQL> create or replace directory "/tmp" as '/tmp';

Directory created.

You don’t have to use quoted identifiers if you don’t like it. I find it convenient here.

I can directly select from the file, with the EXTERNAL clause containing what we had to put in the external table definition before 18c:


SQL> select *
   from external (
    (
     dateRep                    varchar2(10)
     ,day                       number
     ,month                     number
     ,year                      number
     ,cases                     number
     ,deaths                    number
     ,countriesAndTerritories   varchar2(60)
     ,geoId                     varchar2(30)
     ,countryterritoryCode      varchar2(3)
     ,popData2018               number
     ,continentExp              varchar2(30)
    )
    default directory "/tmp"
    access parameters (
     records delimited by newline skip 1 -- skip header
     logfile 'covid-19.log'
     badfile 'covid-19.bad'
     fields terminated by "," optionally enclosed by '"'
    )
    location ('covid-19.csv')
    reject limit 0
   )
 .

SQL> /
      DATEREP    DAY    MONTH    YEAR    CASES    DEATHS                       COUNTRIESANDTERRITORIES       GEOID    COUNTRYTERRITORYCODE    POPDATA2018    CONTINENTEXP
_____________ ______ ________ _______ ________ _________ _____________________________________________ ___________ _______________________ ______________ _______________
01/01/2020         1        1    2020        0         0 Algeria                                       DZ          DZA                           42228429 Africa
01/01/2020         1        1    2020        0         0 Armenia                                       AM          ARM                            2951776 Europe
01/01/2020         1        1    2020        0         0 Australia                                     AU          AUS                           24992369 Oceania
01/01/2020         1        1    2020        0         0 Austria                                       AT          AUT                            8847037 Europe
01/01/2020         1        1    2020        0         0 Azerbaijan                                    AZ          AZE                            9942334 Europe
01/01/2020         1        1    2020        0         0 Bahrain                                       BH          BHR                            1569439 Asia
ORA-01013: user requested cancel of current operation

SQL>

I cancelled it as that’s too long to display here.

As the query is still in the buffer, I just add a CREATE TABLE in front of it:


SQL> 1
  1* select *
SQL> c/select/create table covid as select/
   create table covid as select *
  2   from external (
  3    (
  4     dateRep                    varchar2(10)
  5     ,day                       number
...

SQL> /

Table created.

SQL>

While I’m there I’ll quickly create a fact table and a dimension hierarchy:


SQL> create table continents as select rownum continent_id, continentexp continent_name from (select distinct continentexp from covid where continentexp!='Other');

Table created.

SQL> create table countries as select country_id,country_code,country_name,continent_id from (select distinct geoid country_id,countryterritorycode country_code,countriesandterritories country_name,continentexp continent_name from covid where continentexp!='Other') left join continents using(continent_name);

Table created.

SQL> create table cases as select daterep, geoid country_id,cases from covid where continentexp!='Other';

Table created.

SQL> alter table continents add primary key (continent_id);

Table altered.

SQL> alter table countries add foreign key (continent_id) references continents;

Table altered.

SQL> alter table countries add primary key (country_id);

Table altered.

SQL> alter table cases add foreign key (country_id) references countries;

Table altered.

SQL> alter table cases add primary key (country_id,daterep);

Table altered.

SQL>

This creates a CASES fact table with only one measure (covid-19 cases) and two dimensions. To keep it simple, the date dimension here is just a date column (you usually have a foreign key to a calendar dimension). The geographical dimension is a foreign key to the COUNTRIES table, which itself has a foreign key referencing the CONTINENTS table.

12c Top-N queries

In 12c we have a nice syntax for Top-N queries with the FETCH FIRST clause of the ORDER BY:


SQL> select continent_name,country_code,max(cases) from cases join countries using(country_id) join continents using(continent_id) group by continent_name,country_code order by max(cases) desc fetch first 10 rows only;

CONTINENT_NAME                 COU MAX(CASES)
------------------------------ --- ----------
America                        USA      48529
America                        BRA      33274
Europe                         RUS      17898
Asia                           CHN      15141
America                        ECU      11536
Asia                           IND       9304
Europe                         ESP       9181
America                        PER       8875
Europe                         GBR       8719
Europe                         FRA       7578

10 rows selected.

This returns the 10 countries which had the highest number of covid-19 cases in a single day.
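
Before 12c, the usual workaround was an inline view with ROWNUM; here is a sketch of the equivalent query:


select * from (
 select continent_name,country_code,max(cases) max_cases
 from cases join countries using(country_id) join continents using(continent_id)
 group by continent_name,country_code
 order by max(cases) desc
) where rownum <= 10;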

20c WINDOW clauses

If I want to show the date with the maximum value, I can use analytic functions and in 20c I don’t have to repeat the window several times:


SQL> select continent_name,country_code,top_date,top_cases from (
  2   select continent_name,country_code,daterep,cases
  3    ,first_value(daterep)over(w) top_date
  4    ,first_value(cases)over(w) top_cases
  5    ,row_number()over(w) r
  6    from cases join countries using(country_id) join continents using(continent_id)
  7    window w as (partition by continent_id order by cases desc)
  8   )
  9   where r=1 -- this to get the rows with the highest value only
 10   order by top_cases desc fetch first 10 rows only;

CONTINENT_NAME                 COU TOP_DATE    TOP_CASES
------------------------------ --- ---------- ----------
America                        USA 26/04/2020      48529
Europe                         RUS 02/06/2020      17898
Asia                           CHN 13/02/2020      15141
Africa                         ZAF 30/05/2020       1837
Oceania                        AUS 23/03/2020        611

The same can be done before 20c but you have to write the (partition by continent_id order by cases desc) for each projection.

In the next post I’ll show a very nice feature: keeping the 3-table normalized data model but, because storage is cheap, materializing some pre-computed joins. If you are a fan of NoSQL because “storage is cheap” and “joins are expensive”, then you will see what we can do with SQL in this area…

The article Oracle 18c – select from a flat file appeared first on the dbi services Blog.

DB-Upgrade hangs in SE2 waiting on Streams AQ while gathering statistics on LOGMNR-Tables


A couple of weeks ago I upgraded an Oracle Standard Edition 2 test database from 12.1.0.2 to 12.2.0.1 (with the April 2020 Patch Bundle) on Windows. Recently I upgraded the production database. Both upgrades were done with the Database Upgrade Assistant DBUA. I didn’t use AUTOUPGRADE because I had to upgrade only 1 database and DBUA handles everything for me (including changing the necessary Windows services and updating the timezone file).

Both upgrades hung at the finalizing phase of the components upgrade.

So I checked what the upgrade process was waiting for in the DB:


SQL> select sid, sql_id, event,p1,p2,p3 from v$session 
   2 where status='ACTIVE' and type='USER' and sid not in 
   3 (select sid from v$mystat);

       SID SQL_ID        EVENT                                                 P1         P2         P3
---------- ------------- -------------------------------------------------- ----- ---------- ----------
      1142 fgus25bx1md8q Streams AQ: waiting for messages in the queue      17409 1.4072E+14 2147483647

SQL> set long 400000 longchunksize 200
SQL> select sql_fulltext from v$sqlarea where sql_id='fgus25bx1md8q';

SQL_FULLTEXT
---------------------------------------------------------------------------------
DECLARE
        cursor table_name_cursor  is
                select  x.name table_name
                from sys.x$krvxdta x
                where bitand(x.flags, 12) != 0;
        filter_lst DBMS_STATS.OBJECTTAB := DBMS_STATS.OBJECTTAB();
        obj_lst    DBMS_STATS.OBJECTTAB := DBMS_STATS.OBJECTTAB();
        ind number := 1;
BEGIN
   for rec in table_name_cursor loop
      begin
        filter_lst.extend(1);
        filter_lst(ind).ownname := 'SYSTEM';
        filter_lst(ind).objname := 'LOGMNR_'|| rec.table_name||'';
        ind := ind + 1;
      end;
   end loop;
   DBMS_STATS.GATHER_SCHEMA_STATS(OWNNAME=>'SYSTEM', objlist=>obj_lst, obj_filter_list=>filter_lst);
END;

So obviously the upgrade process tried to gather stats on the LOGMNR tables owned by SYSTEM and was waiting for messages in the scheduler queue SCHEDULER$_EVENT_QUEUE (object ID 17409), i.e. something similar to what is documented in MOS Note 1559487.1.
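
The wait event’s P1 parameter is the object ID of the queue, so it can be mapped to the object name with a dictionary query like this (a sketch):


select owner, object_name, object_type
from dba_objects
where object_id = 17409;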

The upgrade was stuck at this point. So what to do?

Fortunately I remembered a blog post from Mike Dietrich about DBUA being restartable in 12.2:

Restarting a failed Database Upgrade with DBUA 12.2

So I killed the waiting session:


SQL> select serial# from v$session where sid=1142;

   SERIAL#
----------
     59722

SQL> alter system kill session '1142,59722';

System altered.

Then I let DBUA run into tons of errors and let it finish its work. To restart it, I just clicked on “Retry” in the GUI. After some time DBUA ran into an error again. I quickly checked the log files and clicked “Retry” again. That time it went through without issues. Checking the log files and the result of the upgrade showed all components migrated correctly.

So in summary: a failed upgrade (crashed or hanging) with DBUA is not such a bad thing anymore as it was before 12.2. You can just let DBUA (or AUTOUPGRADE) retry its work. Of course, you usually have to fix the reason for the failure before restarting/retrying.

REMARK: See Mike Dietrich’s Blog about resumability and restartability of Autoupgrade here:

Troubleshooting, Restoring and Restarting AutoUpgrade

The article DB-Upgrade hangs in SE2 waiting on Streams AQ while gathering statistics on LOGMNR-Tables appeared first on the dbi services Blog.

What is a serverless database?


By Franck Pachot

After reading the https://cloudwars.co/oracle/oracle-deal-8×8-larry-ellison-picks-amazons-pocket-again/ article, I am writing some thoughts about how a database can be serverless and elastic. Of course, a database needs a server to process its data: serverless doesn’t mean that there are no servers.

Serverless as not waiting for server provisioning

The first idea of “serverless” is about provisioning. In the past, when a developer required a new database to start a new project, she had to wait for a server to be installed. In 1996 my first development on Oracle Database started like this: we asked Sun for a server and OS and asked Oracle for the database software, all for free for a few months, in order to start our prototype. Today this would be a Cloud Free Tier access. At that time we had to wait to receive, unbox, and install all this. I learned a lot there about installing an OS, configuring the network, setting up disk mirroring… This was an awesome experience for a junior starting in IT. Interestingly, I think that today a junior can learn the same concepts with a Cloud Foundation training and certification. This has not really changed, except the unboxing and cabling. The big difference is that today we do not have to wait weeks for it and can set up the same infrastructure in 10 minutes.

That was my first DevOps experience: we wanted to develop our application without waiting for the IT department. But it was not serverless at all.

A few years later I was starting a new datawarehouse for a mobile telco in Africa. Again, weeks to months were required to order and install a server for it. And we didn’t wait. We started the first version of the datawarehouse on a spare PC we had. This was maybe my first serverless experience: the server provisioning is out of the critical path in the project planning. Of course, a PC is not a server and reliability and performance were not there. But we were lucky and when the server arrived we already had good feedback from this first version.

We need serverless, but we need real servers behind it. Today, this is possible: you don’t need to wait, and you can provision a new database in the public or private cloud, or simply on a VM, without waiting. And all the security, reliability and performance are there. With Oracle, it is a bit more difficult if you can’t do it in their public cloud, because licensing does not count vCPUs and you often need specific hardware for it, like in the old days. Appliances like ODA can help. Public Cloud or Cloud@Customer definitely helps.

Serverless as not taking responsibility for server administration

Serverless is not only about running on virtual servers with easy provisioning. If you are serverless, you don’t want to manage those virtual machines. You start and connect to a compute instance. You define its shape (CPU, RAM) but you don’t want to know where it runs physically. Of course, you want to define the region for legal, performance or cost reasons, but not which data center, which rack,… That’s the second step of serverless: you don’t manage the physical servers. In Oracle Cloud, you run a Compute Instance where you can install a database. In AWS this is an EC2 instance where you can install a database.

But, even if you don’t own the responsibility for the servers, this is not yet “serverless”, because you pay for them. If your CFO still sees a bill for compute instances, you are not serverless.

Serverless as not paying for the server

AWS has a true serverless and elastic database offer: Amazon Aurora Serverless. You don’t have to start or stop the servers: this is done automatically when you connect. More activity adds more servers; no connection stops them. And you pay only for what the application is using, not for database servers that are merely running.

Azure has also a Serverless SQL Server: https://docs.microsoft.com/en-us/azure/sql-database/sql-database-serverless

Those are, as far as I know, the only true serverless databases yet. If we need to stop and start the compute services ourselves, even with some level of auto-scaling, we can call that on-demand but not serverless.

All AWS RDS services including Aurora can be started and stopped on demand. They can scale up or down with minimal downtime, especially in Multi-AZ because the standby can be scaled and activated. Redshift cannot be stopped because it uses local storage. But you can take a snapshot and terminate the instance, and restore it later.

On the Oracle side, the Autonomous Database can be stopped and started. Then again, we can say that we don’t pay when we don’t use the database, but we cannot say that we don’t pay when we don’t use the application, because the database is up even if the application is not used. However, you can scale without the need to stop and start. And there’s also some level of auto-scaling where the additional application usage is really billed on CPU usage metrics: you pay for n OCPUs when the ATP or ADB is up and you can use up to n*3 sessions on CPU, with true serverless billing for what is above the provisioned OCPUs. Maybe the future will go further. The technology allows it: multitenant allows PDB-level CPU caging where the capacity can be changed online (setting CPU_COUNT), and AWR gathers the CPU load with many metrics that could be used for billing.
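
That PDB-level CPU caging is just a dynamic parameter change, applied online; a sketch, to be run inside the PDB:


-- cage this PDB to 2 CPUs, online, without restart
alter system set cpu_count = 2;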

Serverless

The name is funny because serverless programs run on servers. And the rush for running without servers is paradoxical. When I started programming, it was on very small computers (ZX-81, Apple //e, IBM PC-XT) and I was really proud when I started to do real stuff running on real servers, with a schema on *the* company database. Actually, what is called serverless today is, in my opinion, showing the full power of servers: no need to buy a computer for a project, just use some shared compute power.

The cloud wars use strange marketing terms, but really good technology and concepts are coming.

The article What is a serverless database? appeared first on the dbi services Blog.
