GoldenGate for Big Data and Kafka Handlers hands-on – part 2



Preamble

The GoldenGate for Big Data integration with Kafka is possible through three different Kafka Handlers also called connectors:

  • Kafka Generic Handler (Pub/Sub)
  • Kafka Connect Handler
  • Kafka REST Proxy Handler

Only the first two are available under the open-source Apache-licensed version, so we will review only those two. Oracle has written a few articles on the differences (see the references section), but these short statements sum it up well:

Kafka Handler

Can send raw bytes messages in four formats: JSON, Avro, XML, delimited text

Kafka Connect Handler

Generates in-memory Kafka Connect schemas and messages. Passes the messages to Kafka Connect converter to convert to bytes to send to Kafka.
There are currently only two converters: JSON and Avro. Only Confluent currently has Avro. But using the Kafka Connect interface allows the user to integrate with the open source Kafka Connect connectors.

This picture from Oracle Corporation (see the references section for the complete article) summarizes it well:

kafka04

I initially thought the Kafka Connect Handler was provided as a plugin by Confluent (https://www.confluent.io/product/connectors/) but it is included by default in plain Apache Kafka:

The Kafka Connect framework is also included in the Apache versions as well as Confluent version.

So it is possible to run the OGG Kafka Connect Handler with Apache Kafka. And it is possible to run open source Kafka Connect connectors with Apache Kafka.

One thing Confluent Kafka has that Apache Kafka does not is the Avro schema registry and the Avro Converter.

Oracle test case creation

My simple test case, created in my pdb1 pluggable database, will be as follows:

SQL> create user appuser identified by secure_password;

User created.

SQL> grant connect, resource to appuser;

Grant succeeded.

SQL> alter user appuser quota unlimited on users;

User altered.

SQL> connect appuser/secure_password@pdb1
Connected.

SQL> CREATE TABLE test01(id NUMBER, descr VARCHAR2(50), CONSTRAINT TEST01_PK PRIMARY KEY (id) ENABLE);

Table created.

SQL> desc test01
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                        NOT NULL NUMBER
 DESCR                                              VARCHAR2(50)

GoldenGate extract configuration

In this chapter I create an extract (capture) process to extract figures from my appuser.test01 test table:

GGSCI (server01) 1> add credentialstore

Credential store created.
 
GGSCI (server01) 2> alter credentialstore add user c##ggadmin@orcl alias c##ggadmin
Password:

Credential store altered.

GGSCI (server01) 3> info credentialstore

Reading from credential store:

Default domain: OracleGoldenGate

  Alias: ggadmin
  Userid: c##ggadmin@orcl

Use ‘alter credentialstore delete user’ to remove an alias…

GGSCI (server01) 11> dblogin useridalias c##ggadmin
Successfully logged into database CDB$ROOT.

GGSCI (server01 as c##ggadmin@orcl/CDB$ROOT) 10> add trandata pdb1.appuser.test01

2021-01-22 10:55:21  INFO    OGG-15131  Logging of supplemental redo log data is already enabled for table PDB1.APPUSER.TEST01.

2021-01-22 10:55:21  INFO    OGG-15135  TRANDATA for instantiation CSN has been added on table PDB1.APPUSER.TEST01.

2021-01-22 10:55:21  INFO    OGG-10471  ***** Oracle Goldengate support information on table APPUSER.TEST01 *****
Oracle Goldengate support native capture on table APPUSER.TEST01.
Oracle Goldengate marked following column as key columns on table APPUSER.TEST01: ID.

Configure the extract process (the group name must be at most 8 characters, or you get: ERROR: Invalid group name (must be at most 8 characters).):

GGSCI (server01) 10> dblogin useridalias c##ggadmin
Successfully logged into database CDB$ROOT.

GGSCI (server01 as c##ggadmin@orcl/CDB$ROOT) 11> edit params ext01



GGSCI (server01 as c##ggadmin@orcl/CDB$ROOT) 12> view params ext01

extract ext01
useridalias c##ggadmin
ddl include mapped
exttrail ./dirdat/ex
sourcecatalog pdb1
table appuser.test01

Add and register the extract, then add the EXTTRAIL (the trail name must be 2 characters or less!):

GGSCI (server01 as c##ggadmin@orcl/CDB$ROOT) 19> add extract ext01, integrated tranlog, begin now
EXTRACT (Integrated) added.

GGSCI (server01 as c##ggadmin@orcl/CDB$ROOT) 20> register extract ext01 database container (pdb1)

2021-01-22 11:06:30  INFO    OGG-02003  Extract EXT01 successfully registered with database at SCN 7103554.


GGSCI (server01 as c##ggadmin@orcl/CDB$ROOT) 21> add exttrail ./dirdat/ex, extract ext01
EXTTRAIL added.

Finally start it with (you can also use ‘view report ext01’ to get more detailed information):

GGSCI (server01 as c##ggadmin@orcl/CDB$ROOT) 22> start ext01

Sending START request to MANAGER ...
EXTRACT EXT01 starting


GGSCI (server01 as c##ggadmin@orcl/CDB$ROOT) 23> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EXT01       00:00:00      19:13:08

Test it works by inserting a row in your test table:

SQL> insert into test01 values(10,'Ten');

1 row created.

SQL> commit;

Commit complete.

And check that your trail files get created in the chosen directory:

[oracle@server01 oggcore_1]$ ll dirdat
total 2
-rw-r----- 1 oracle dba 1294 Jan 22 11:24 ex000000000

GoldenGate for Big Data and Kafka Handler configuration

One cool directory to look at is the AdapterExamples directory located inside your GoldenGate for Big Data installation, in my case the AdapterExamples/big-data/kafka* sub-directories:

[oracle@server01 big-data]$ pwd
/u01/app/oracle/product/19.1.0/oggbigdata_1/AdapterExamples/big-data
[oracle@server01 big-data]$ ll -d kafka*
drwxr-x--- 2 oracle dba 96 Sep 25  2019 kafka
drwxr-x--- 2 oracle dba 96 Sep 25  2019 kafka_connect
drwxr-x--- 2 oracle dba 96 Sep 25  2019 kafka_REST_proxy
[oracle@server01 big-data]$ ll kafka
total 4
-rw-r----- 1 oracle dba  261 Sep  3  2019 custom_kafka_producer.properties
-rw-r----- 1 oracle dba 1082 Sep 25  2019 kafka.props
-rw-r----- 1 oracle dba  332 Sep  3  2019 rkafka.prm

So in the kafka directory we have three files that you can copy to the dirprm directory of your GoldenGate for Big Data installation. Now you must customize them to match your configuration.

In custom_kafka_producer.properties I have just changed bootstrap.servers variable to match my Kafka server:

bootstrap.servers=localhost:9092
acks=1
reconnect.backoff.ms=1000

value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
# 100KB per partition
batch.size=16384
linger.ms=0

In kafka.props I have changed gg.classpath to point to the libs directory of a Kafka installation (this directory must be owned or readable by the oracle account). It means that if Kafka is installed on another server (the normal configuration) you must copy the libs to your GoldenGate for Big Data server. The chosen example payload format is avro_op (operation-modeled Avro, the more verbose format). It can be one of: xml, delimitedtext, json, json_row, avro_row, avro_op:

gg.handlerlist = kafkahandler
gg.handler.kafkahandler.type=kafka
gg.handler.kafkahandler.KafkaProducerConfigFile=custom_kafka_producer.properties
#The following resolves the topic name using the short table name
gg.handler.kafkahandler.topicMappingTemplate=${tableName}
#The following selects the message key using the concatenated primary keys
gg.handler.kafkahandler.keyMappingTemplate=${primaryKeys}
gg.handler.kafkahandler.format=avro_op
gg.handler.kafkahandler.SchemaTopicName=mySchemaTopic
gg.handler.kafkahandler.BlockingSend =false
gg.handler.kafkahandler.includeTokens=false
gg.handler.kafkahandler.mode=op
gg.handler.kafkahandler.MetaHeaderTemplate=${alltokens}


goldengate.userexit.writers=javawriter
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE

gg.log=log4j
gg.log.level=INFO

gg.report.time=30sec

#Sample gg.classpath for Apache Kafka
gg.classpath=dirprm/:/u01/kafka_2.13-2.7.0/libs/*
#Sample gg.classpath for HDP
#gg.classpath=/etc/kafka/conf:/usr/hdp/current/kafka-broker/libs/*

javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar

In rkafka.prm I have changed the MAP | TARGET parameter:

REPLICAT rkafka
-- Trail file for this example is located in "AdapterExamples/trail" directory
-- Command to add REPLICAT
-- add replicat rkafka, exttrail AdapterExamples/trail/tr
TARGETDB LIBFILE libggjava.so SET property=dirprm/kafka.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP pdb1.appuser.*, TARGET appuser.*;

As explained in the rkafka.prm file, I add the replicat process (dump) and the trail directory (it must be the one of your legacy GoldenGate installation):

GGSCI (server01) 1> add replicat rkafka, exttrail /u01/app/oracle/product/19.1.0/oggcore_1/dirdat/ex
REPLICAT added.


GGSCI (server01) 2> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    STOPPED     RKAFKA      00:00:00      00:00:03


GGSCI (server01) 3> start rkafka

Sending START request to MANAGER ...
REPLICAT RKAFKA starting


GGSCI (server01) 4> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    RUNNING     RKAFKA      00:00:00      00:12:08

As a test I insert a row in my test table:

SQL> insert into test01 values(1,'One');

1 row created.

SQL> commit;

Commit complete.

I can read the events on the topic that has my table name (the first event is from the test I did when I configured GoldenGate):

[kafka@server01 kafka_2.13-2.7.0]$ bin/kafka-topics.sh --list --zookeeper localhost:2181
TEST01
__consumer_offsets
mySchemaTopic
quickstart-events
[kafka@server01 kafka_2.13-2.7.0]$ bin/kafka-console-consumer.sh --topic TEST01 --from-beginning --bootstrap-server localhost:9092
APPUSER.TEST01I42021-01-22 11:32:20.00000042021-01-22T15:20:51.081000(00000000000000001729ID$@Ten
APPUSER.TEST01I42021-01-22 15:24:55.00000042021-01-22T15:25:00.665000(00000000000000001872ID▒?One
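The avro_op payload is binary, which is why the console output above is barely readable. If you prefer to check the topic from code, here is a minimal Python sketch using the kafka-python package (an assumption, this package is not part of the setup described in this post) that just dumps keys and record sizes; fully decoding the values would additionally require the Avro schema the handler publishes on mySchemaTopic:

from kafka import KafkaConsumer

# Read the raw avro_op records produced by the Kafka Handler on topic TEST01.
consumer = KafkaConsumer(
    'TEST01',
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',   # same spirit as --from-beginning
    consumer_timeout_ms=10000,      # stop iterating after 10s without new records
)
for message in consumer:
    # message.value is raw Avro bytes; we only print the key and the record size here.
    print(message.key, len(message.value), 'bytes')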

You can clean the installation with:

stop rkafka
delete replicat rkafka

GoldenGate for Big Data and Kafka Connect Handler configuration

As in the previous chapter, I copy the demo configuration files with:

[oracle@server01 kafka_connect]$ pwd
/u01/app/oracle/product/19.1.0/oggbigdata_1/AdapterExamples/big-data/kafka_connect
[oracle@server01 kafka_connect]$ ll
total 4
-rw-r----- 1 oracle dba  592 Sep  3  2019 kafkaconnect.properties
-rw-r----- 1 oracle dba  337 Sep  3  2019 kc.prm
-rw-r----- 1 oracle dba 1733 Sep 25  2019 kc.props
[oracle@server01 kafka_connect]$ cp * ../../../dirprm/

As I’m using the same server for all components, the kafkaconnect.properties file is already correct. I have just added converter.type to correct a bug. We see that here the example is configured to use JSON for the payload:

bootstrap.servers=localhost:9092
acks=1

#JSON Converter Settings
key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true

#Avro Converter Settings
#key.converter=io.confluent.connect.avro.AvroConverter
#value.converter=io.confluent.connect.avro.AvroConverter
#key.converter.schema.registry.url=http://localhost:8081
#value.converter.schema.registry.url=http://localhost:8081


#Adjust for performance
buffer.memory=33554432
batch.size=16384
linger.ms=0

converter.type=key
converter.type=value
converter.type=header

In the kc.props file I only change the gg.classpath parameter, with exactly the same comment as for the Kafka Handler configuration:

gg.handlerlist=kafkaconnect

#The handler properties
gg.handler.kafkaconnect.type=kafkaconnect
gg.handler.kafkaconnect.kafkaProducerConfigFile=kafkaconnect.properties
gg.handler.kafkaconnect.mode=op
#The following selects the topic name based on the fully qualified table name
gg.handler.kafkaconnect.topicMappingTemplate=${fullyQualifiedTableName}
#The following selects the message key using the concatenated primary keys
gg.handler.kafkaconnect.keyMappingTemplate=${primaryKeys}
gg.handler.kafkahandler.MetaHeaderTemplate=${alltokens}

#The formatter properties
gg.handler.kafkaconnect.messageFormatting=row
gg.handler.kafkaconnect.insertOpKey=I
gg.handler.kafkaconnect.updateOpKey=U
gg.handler.kafkaconnect.deleteOpKey=D
gg.handler.kafkaconnect.truncateOpKey=T
gg.handler.kafkaconnect.treatAllColumnsAsStrings=false
gg.handler.kafkaconnect.iso8601Format=false
gg.handler.kafkaconnect.pkUpdateHandling=abend
gg.handler.kafkaconnect.includeTableName=true
gg.handler.kafkaconnect.includeOpType=true
gg.handler.kafkaconnect.includeOpTimestamp=true
gg.handler.kafkaconnect.includeCurrentTimestamp=true
gg.handler.kafkaconnect.includePosition=true
gg.handler.kafkaconnect.includePrimaryKeys=false
gg.handler.kafkaconnect.includeTokens=false

goldengate.userexit.writers=javawriter
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE

gg.log=log4j
gg.log.level=INFO

gg.report.time=30sec

#Apache Kafka Classpath
gg.classpath=/u01/kafka_2.13-2.7.0/libs/*
#Confluent IO classpath
#gg.classpath={Confluent install dir}/share/java/kafka-serde-tools/*:{Confluent install dir}/share/java/kafka/*:{Confluent install dir}/share/java/confluent-common/*

javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=.:ggjava/ggjava.jar:./dirprm

In the kc.prm file I only change the MAP | TARGET configuration as follows:

REPLICAT kc
-- Trail file for this example is located in "AdapterExamples/trail" directory
-- Command to add REPLICAT
-- add replicat conf, exttrail AdapterExamples/trail/tr NODBCHECKPOINT
TARGETDB LIBFILE libggjava.so SET property=dirprm/kc.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 1000
MAP pdb1.appuser.*, TARGET appuser.*;

Add the Kafka Connect Handler replicat with:

GGSCI (server01) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING


GGSCI (server01) 2> add replicat kc, exttrail /u01/app/oracle/product/19.1.0/oggcore_1/dirdat/ex
REPLICAT added.


GGSCI (server01) 3> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    STOPPED     KC          00:00:00      00:00:05


GGSCI (server01) 3> start kc

Sending START request to MANAGER ...
REPLICAT KC starting


GGSCI (server01) 4> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    RUNNING     KC          00:00:00      00:00:04

Add a new row in the test table (and commit):

SQL> connect appuser/secure_password@pdb1
Connected.
SQL> select * from test01;

        ID DESCR
---------- --------------------------------------------------
        10 Ten
         1 One

SQL> insert into test01 values(2,'Two');

1 row created.

SQL> commit;

Commit complete.

Reading the new topic, you should see the new lines coming in:

[kafka@server01 kafka_2.13-2.7.0]$ bin/kafka-topics.sh --list --zookeeper localhost:2181
APPUSER.TEST01
TEST01
__consumer_offsets
mySchemaTopic
quickstart-events
[kafka@server01 kafka_2.13-2.7.0]$ bin/kafka-console-consumer.sh --topic APPUSER.TEST01 --from-beginning --bootstrap-server localhost:9092
{"schema":{"type":"struct","fields":[{"type":"string","optional":true,"field":"table"},{"type":"string","optional":true,"field":"op_type"},{"type":"string","optional":true,"field":"op_ts"},{"type":"string","optional":true,"field":"current_ts"},{"type":"string","optional":true,"field":"pos"},{"type":"double","optional":true,"field":"ID"},{"type":"string","optional":true,"field":"DESCR"}],"optional":false,"name":"APPUSER.TEST01"},"payload":{"table":"APPUSER.TEST01","op_type":"I","op_ts":"2021-01-22 11:32:20.000000","current_ts":"2021-01-22 17:36:00.285000","pos":"00000000000000001729","ID":10.0,"DESCR":"Ten"}}
{"schema":{"type":"struct","fields":[{"type":"string","optional":true,"field":"table"},{"type":"string","optional":true,"field":"op_type"},{"type":"string","optional":true,"field":"op_ts"},{"type":"string","optional":true,"field":"current_ts"},{"type":"string","optional":true,"field":"pos"},{"type":"double","optional":true,"field":"ID"},{"type":"string","optional":true,"field":"DESCR"}],"optional":false,"name":"APPUSER.TEST01"},"payload":{"table":"APPUSER.TEST01","op_type":"I","op_ts":"2021-01-22 15:24:55.000000","current_ts":"2021-01-22 17:36:00.727000","pos":"00000000000000001872","ID":1.0,"DESCR":"One"}}
{"schema":{"type":"struct","fields":[{"type":"string","optional":true,"field":"table"},{"type":"string","optional":true,"field":"op_type"},{"type":"string","optional":true,"field":"op_ts"},{"type":"string","optional":true,"field":"current_ts"},{"type":"string","optional":true,"field":"pos"},{"type":"double","optional":true,"field":"ID"},{"type":"string","optional":true,"field":"DESCR"}],"optional":false,"name":"APPUSER.TEST01"},"payload":{"table":"APPUSER.TEST01","op_type":"I","op_ts":"2021-01-22 17:38:23.000000","current_ts":"2021-01-22 17:38:27.800000","pos":"00000000000000002013","ID":2.0,"DESCR":"Two"}}
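As the Kafka Connect Handler with the JSON converter produces a plain schema/payload envelope, it is easy to consume from code. A small Python sketch, again assuming the kafka-python package, extracting a few payload fields:

import json
from kafka import KafkaConsumer

# Consume the JSON records produced by the Kafka Connect Handler.
consumer = KafkaConsumer(
    'APPUSER.TEST01',
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',
    consumer_timeout_ms=10000,
    value_deserializer=lambda v: json.loads(v.decode('utf-8')),
)
for message in consumer:
    # Each record is an envelope with a 'schema' and a 'payload' part.
    payload = message.value['payload']
    print(payload['op_type'], payload['op_ts'], payload['ID'], payload['DESCR'])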

References

GoldenGate for Big Data and Kafka Handlers hands-on – part 1



Preamble

With the rise of our Cloud migration, our hybrid way of working and our SAP Hana migration, one of my colleagues asked me how to transfer on-premise Oracle database information to the Cloud. His high-level idea is to have the databases hit a Kafka installation we are trying to implement, to duplicate events from on-premise to the cloud and open the door to more heterogeneous scenarios. The direct answer to this problem is the GoldenGate Kafka Handlers!

To try to answer his questions and clear my mind I have decided to implement a simple GoldenGate setup and a dummy Kafka setup, as well as configuring different Kafka Handlers. I initially thought only GoldenGate for Big Data was required, but I have understood that GoldenGate for Big Data requires a traditional GoldenGate installation and reads files directly from this legacy GoldenGate installation.

You cannot extract (capture) Oracle figures with GoldenGate for Big Data (OGG-01115 Function dbLogin not implemented). GoldenGate for Big Data reads trail files extracted with GoldenGate (for Oracle database).

Even if on paper I have no issue with this, I would say that from a license standpoint it is a completely different story. The legacy GoldenGate for Oracle Database public license price is $17,500 for two x86 cores and the GoldenGate for Big Data public license price is $20,000 for two x86 cores (on top of this you pay 22% of maintenance each year).

This blog post will be done in two parts. The first part will be binaries installation and basic components configuration. The second part will be the simple test case configuration as well as trying to make it work…

Proof Of Concept components version:

  • Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 – Version 19.3.0.0.0 (RU)
  • Oracle GoldenGate 19.1.0.0.4 for Oracle on Linux x86-64, 530.5 MB (V983658-01.zip)
  • Oracle GoldenGate for Big Data 19.1.0.0.1 on Linux x86-64, 88.7 MB (V983760-01.zip)
  • Scala 2.13 – kafka_2.13-2.7.0.tgz (65 MB)
  • OpenJDK 1.8.0 (chosen 8 even if 11 was available…)

The exact OpenJDK version is (I have chosen OpenJDK just to try it, following the new licensing model of Oracle JDK):

[oracle@server01 ~]$ java -version
openjdk version "1.8.0_275"
OpenJDK Runtime Environment (build 1.8.0_275-b01)
OpenJDK 64-Bit Server VM (build 25.275-b01, mixed mode)

My test server is a dual-socket Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 6 cores per socket (12 cores total, 24 threads) physical server with 64GB of RAM. I have installed all components on this single, quite powerful server.

19c pluggable database configuration

The source database I plan to use is a pluggable database called PDB1:

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO

I have a TNS entry called pdb1 for it:

[oracle@server01 ~]$ tnsping pdb1

TNS Ping Utility for Linux: Version 19.0.0.0.0 - Production on 21-JAN-2021 11:52:02

Copyright (c) 1997, 2019, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/19.0.0/dbhome_1/network/admin/sqlnet.ora


Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = server01.domain.com)(PORT = 1531)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = pdb1)))
OK (0 msec)

I can connect to my root container with an sql alias set to ‘rlwrap sqlplus / as sysdba’.

The first thing to do is to change the log mode of my instance:

SQL> SELECT log_mode,supplemental_log_data_min, force_logging FROM v$database;

LOG_MODE     SUPPLEME FORCE_LOGGING
------------ -------- ---------------------------------------
NOARCHIVELOG NO       NO

SQL> show parameter db_recovery_file_dest

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest                string      /u01/app/oracle/product/19.0.0
                                                 /fast_recovery_area
db_recovery_file_dest_size           big integer 1G

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.

Total System Global Area 1073738400 bytes
Fixed Size                  9142944 bytes
Variable Size             528482304 bytes
Database Buffers          528482304 bytes
Redo Buffers                7630848 bytes
Database mounted.
SQL> ALTER DATABASE archivelog;

Database altered.

SQL> alter database open;

Database altered.

SQL> ALTER SYSTEM SET enable_goldengate_replication=TRUE scope=both;

System altered.

SQL> SELECT log_mode,supplemental_log_data_min, force_logging FROM v$database;

LOG_MODE     SUPPLEME FORCE_LOGGING
------------ -------- ---------------------------------------
ARCHIVELOG   NO       NO

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           MOUNTED
SQL> alter pluggable database pdb1 open;

Pluggable database altered.

SQL> alter pluggable database pdb1 save state;

Pluggable database altered.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO

I could have added supplemental logging and force logging at container level with the below commands, but decided to try to do it at pluggable database level:

SQL> ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

Database altered.

SQL> ALTER DATABASE FORCE LOGGING;

Database altered.

My first try to change the logging mode and supplemental logging at pluggable database level was a complete failure:

SQL> alter pluggable database pdb1 enable force logging;
alter pluggable database pdb1 enable force logging
*
ERROR at line 1:
ORA-65046: operation not allowed from outside a pluggable database


SQL> alter pluggable database pdb1 add supplemental log data;
alter pluggable database pdb1 add supplemental log data
*
ERROR at line 1:
ORA-65046: operation not allowed from outside a pluggable database


SQL> alter session set container=pdb1;

Session altered.

SQL> alter pluggable database pdb1 enable force logging;
alter pluggable database pdb1 enable force logging
*
ERROR at line 1:
ORA-65045: pluggable database not in a restricted mode

SQL> alter pluggable database pdb1 add supplemental log data;
alter pluggable database pdb1 add supplemental log data
*
ERROR at line 1:
ORA-31541: Supplemental logging is not enabled in CDB$ROOT.


And to put a pluggable database in restricted mode you have to close it first:

SQL> alter pluggable database pdb1 open restricted;
alter pluggable database pdb1 open restricted
*
ERROR at line 1:
ORA-65019: pluggable database PDB1 already open

Activate minimal supplemental logging at container level:

SQL> alter session set container=cdb$root;

Session altered.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE YES
SQL> ALTER DATABASE add SUPPLEMENTAL LOG DATA;

Database altered.

SQL> select * from cdb_supplemental_logging;

MIN PRI UNI FOR ALL PRO SUB     CON_ID
--- --- --- --- --- --- --- ----------
YES NO  NO  NO  NO  NO  NO           1

Once minimal supplemental logging has been activated at container level, all pdbs have it immediately (but there is no harm in issuing the command again):

SQL> alter session set container=pdb1;

Session altered.

SQL> select * from cdb_supplemental_logging;

MIN PRI UNI FOR ALL PRO SUB     CON_ID
--- --- --- --- --- --- --- ----------
YES NO  NO  NO  NO  NO  NO           3

SQL> set lines 200
SQL> col pdb_name for a10
SQL> select pdb_name, logging, force_logging from cdb_pdbs;

PDB_NAME   LOGGING   FORCE_LOGGING
---------- --------- ---------------------------------------
PDB1       LOGGING   YES
PDB$SEED   LOGGING   NO

SQL> alter session set container=pdb1;

Session altered.

SQL> select * from cdb_supplemental_logging;

MIN PRI UNI FOR ALL PRO SUB     CON_ID
--- --- --- --- --- --- --- ----------
YES NO  NO  NO  NO  NO  NO           3

SQL> alter pluggable database pdb1 add supplemental log data;

Pluggable database altered.

SQL> alter pluggable database pdb1 enable force logging;

Pluggable database altered.

SQL> select pdb_name, logging, force_logging from cdb_pdbs;

PDB_NAME   LOGGING   FORCE_LOGGING
---------- --------- ---------------------------------------
PDB1       LOGGING   YES

SQL> SELECT log_mode,supplemental_log_data_min, force_logging FROM v$database;

LOG_MODE     SUPPLEME FORCE_LOGGING
------------ -------- ---------------------------------------
ARCHIVELOG   YES      NO

Complete the configuration by switching the log file and putting the pluggable database back in non-restricted mode:

SQL> alter session set container=cdb$root;

Session altered.

SQL> ALTER SYSTEM SWITCH LOGFILE;

System altered.

SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE YES
SQL> alter pluggable database pdb1 close immediate;

Pluggable database altered.

SQL> alter pluggable database pdb1 open read write;

Pluggable database altered.

SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO

Create the global ggadmin GoldenGate administrative user on your container database as specified in the documentation. What is not clear to me in the documentation is that this global user should be able to connect to all containers of your multitenant database:

SQL> CREATE USER c##ggadmin IDENTIFIED BY secure_password;

User created.

SQL> GRANT CREATE SESSION, CONNECT, RESOURCE, ALTER SYSTEM, SELECT ANY DICTIONARY, UNLIMITED TABLESPACE TO c##ggadmin CONTAINER=all;

Grant succeeded.

SQL> EXEC DBMS_GOLDENGATE_AUTH.GRANT_ADMIN_PRIVILEGE(grantee=>'c##ggadmin', privilege_type=>'CAPTURE', grant_optional_privileges=>'*', container=>'ALL');

PL/SQL procedure successfully completed.

GoldenGate 19.1 installation

Installation is pretty straightforward and I have already done it with GoldenGate 12c (https://blog.yannickjaquier.com/oracle/goldengate-12c-tutorial.html). Just locate the runInstaller file in the folder where you have unzipped the downloaded file. Choose your database version:

kafka01

Choose the target installation directory:

kafka02

Then the installation is already over, with the GoldenGate manager already configured (port 7809) and running:

kafka03

You can immediately test it with:

[oracle@server01 ~]$ /u01/app/oracle/product/19.1.0/oggcore_1/ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO
Linux, x64, 64bit (optimized), Oracle 19c on Oct 17 2019 21:16:29
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved.



GGSCI (server01) 1> info mgr

Manager is running (IP port TCP:server01.7809, Process ID 13243).

You can add the directory to your PATH or, to not mix it up with GoldenGate for Big Data, create an alias so you are 100% sure of the one you are launching (this is also the opportunity to combine it with rlwrap). So I created a ggsci_gg alias in my profile for this traditional GoldenGate installation:

alias ggsci_gg='rlwrap /u01/app/oracle/product/19.1.0/oggcore_1/ggsci'

GoldenGate 19.1 for Big Data Installation

Install Open JDK 1.8 with:

[root@server01 ~]# yum install java-1.8.0-openjdk.x86_64

Add to your profile:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.275.b01-0.el7_9.x86_64/jre
export PATH=$JAVA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$JAVA_HOME/lib/amd64/server:$LD_LIBRARY_PATH

Install GoldenGate for Big Data with a simple unzip/untar:

[oracle@server01 ~]$ mkdir -p /u01/app/oracle/product/19.1.0/oggbigdata_1
[oracle@server01 19.1.0]$ cd /u01/app/oracle/product/19.1.0/oggbigdata_1
[oracle@server01 oggbigdata_1]$ cp /u01/V983760-01.zip .
[oracle@server01 oggbigdata_1]$ unzip V983760-01.zip
Archive:  V983760-01.zip
  inflating: OGGBD-19.1.0.0-README.txt
  inflating: OGG_BigData_19.1.0.0.1_Release_Notes.pdf
  inflating: OGG_BigData_Linux_x64_19.1.0.0.1.tar
[oracle@server01 oggbigdata_1]$ tar xvf OGG_BigData_Linux_x64_19.1.0.0.1.tar
.
.
[oracle@server01 oggbigdata_1]$ rm OGG_BigData_Linux_x64_19.1.0.0.1.tar V983760-01.zip

I also added this alias in my profile:

alias ggsci_bd='rlwrap /u01/app/oracle/product/19.1.0/oggbigdata_1/ggsci'

Create GoldenGate for Big Data subdirectory and configure Manager process:

[oracle@server01 ~]$ ggsci_bd

Oracle GoldenGate for Big Data
Version 19.1.0.0.1 (Build 003)

Oracle GoldenGate Command Interpreter
Version 19.1.0.0.2 OGGCORE_OGGADP.19.1.0.0.2_PLATFORMS_190916.0039
Linux, x64, 64bit (optimized), Generic on Sep 16 2019 02:12:32
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved.


GGSCI (server01) 1> create subdirs

Creating subdirectories under current directory /u01/app/oracle/product/19.1.0/oggbigdata_1

Parameter file                 /u01/app/oracle/product/19.1.0/oggbigdata_1/dirprm: created.
Report file                    /u01/app/oracle/product/19.1.0/oggbigdata_1/dirrpt: created.
Checkpoint file                /u01/app/oracle/product/19.1.0/oggbigdata_1/dirchk: created.
Process status files           /u01/app/oracle/product/19.1.0/oggbigdata_1/dirpcs: created.
SQL script files               /u01/app/oracle/product/19.1.0/oggbigdata_1/dirsql: created.
Database definitions files     /u01/app/oracle/product/19.1.0/oggbigdata_1/dirdef: created.
Extract data files             /u01/app/oracle/product/19.1.0/oggbigdata_1/dirdat: created.
Temporary files                /u01/app/oracle/product/19.1.0/oggbigdata_1/dirtmp: created.
Credential store files         /u01/app/oracle/product/19.1.0/oggbigdata_1/dircrd: created.
Masterkey wallet files         /u01/app/oracle/product/19.1.0/oggbigdata_1/dirwlt: created.
Dump files                     /u01/app/oracle/product/19.1.0/oggbigdata_1/dirdmp: created.


GGSCI (server01) 2> edit params mgr

Insert ‘port 7801’ in the manager parameter file. Then start the manager with:

GGSCI (server01) 1> view params mgr

port 7801


GGSCI (server01) 3> start mgr
Manager started.


GGSCI (server01) 4> info mgr

Manager is running (IP port TCP:server01.7801, Process ID 11186).

Kafka configuration

For Kafka I’m just following the official quick start documentation. I have just created a dedicated (original) kafka account to run the Kafka processes. I have also used the nohup command so as not to lock too many shells. As the size is small I have installed Kafka in the home directory of my kafka account (never ever do this in production):

[kafka@server01 ~]$ pwd
/home/kafka
[kafka@server01 ~]$ tar -xzf /tmp/kafka_2.13-2.7.0.tgz
[kafka@server01 ~]$ ll
total 0
drwxr-x--- 6 kafka users 89 Dec 16 15:03 kafka_2.13-2.7.0
[kafka@server01 ~]$ cd kafka_2.13-2.7.0/
[kafka@server01 ~]$ nohup /home/kafka/kafka_2.13-2.7.0/bin/zookeeper-server-start.sh config/zookeeper.properties > zookeeper.log &
[kafka@server01 ~]$ nohup /home/kafka/kafka_2.13-2.7.0/bin/kafka-server-start.sh config/server.properties > broker_service.log &

Then I have done the topic creation and dummy events creation and, obviously, all went well…
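For completeness, a hedged Python sketch of the same smoke test using the kafka-python package (an assumption; the Kafka quick start only uses the shipped shell scripts) against the quickstart-events topic created during the quick start:

from kafka import KafkaProducer, KafkaConsumer

# Produce a dummy event on the quick start topic...
producer = KafkaProducer(bootstrap_servers='localhost:9092')
producer.send('quickstart-events', b'hello from kafka-python')
producer.flush()

# ...and read it back from the beginning of the topic.
consumer = KafkaConsumer(
    'quickstart-events',
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',
    consumer_timeout_ms=5000,   # give up after 5s without new records
)
for message in consumer:
    print(message.value.decode('utf-8'))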

References

Network encryption hands-on with Java, Python and SQL*Plus



Preamble

A few years after my blog post on Oracle network encryption we have finally decided to implement it, whenever possible, at least on our Sarbanes-Oxley (SOX) perimeter. For those databases, connecting using network encryption will not be an option, so it simplifies the database-side configuration: reject any connection that is not secure.

This blog post is about the amount of burden you might face when moving to network encryption to encrypt communication between your databases and your clients (users and/or applications).

Testing has been done using:

  • A 19c (19.10) pluggable database (pdb1) running on a RedHat 7.8 physical server.
  • A 19c (19.3) 64 bits Windows client installed on my Windows 10 laptop.
  • OpenJDK version “1.8.0_282”. The Windows binaries have been found on the RedHat web site, as the Microsoft builds start at Java 11.
  • Python 3.7.9 on Windows 64 bits and cx-Oracle 8.1.0.

Database server configuration for network encryption

Upfront nothing is configured and the connection to your database server can be unsecured if requested. It is always a good idea to test everything from a simple Oracle client, even if in real life your application will not use SQL*Plus (Java or something else instead):

PS C:\> tnsping //server01.domain.com:1531/pdb1

TNS Ping Utility for 64-bit Windows: Version 19.0.0.0.0 - Production on 15-APR-2021 10:38:01

Copyright (c) 1997, 2019, Oracle.  All rights reserved.

Used parameter files:
C:\app\client\product\19.0.0\client_1\network\admin\sqlnet.ora

Used EZCONNECT adapter to resolve the alias
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=pdb1))(ADDRESS=(PROTOCOL=tcp)(HOST=10.75.43.64)(PORT=1531)))
OK (110 msec)
PS C:\> sqlplus yjaquier@//server01.domain.com:1531/pdb1

SQL*Plus: Release 19.0.0.0.0 - Production on Thu Apr 15 10:38:12 2021
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Enter password:

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.10.0.0.0

SQL> select network_service_banner from v$session_connect_info where sid in (select distinct sid from v$mystat);

NETWORK_SERVICE_BANNER
--------------------------------------------------------------------------------
TCP/IP NT Protocol Adapter for Linux: Version 19.0.0.0.0 - Production
Encryption service for Linux: Version 19.0.0.0.0 - Production
Crypto-checksumming service for Linux: Version 19.0.0.0.0 - Production

Remark:
The query is coming from the Administering Oracle Database Classic Cloud Service official Oracle documentation. Here no encryption or crypto-checksumming is activated; the displayed text is simply saying that everything is ready to be used if required…

It is now time to play with sqlnet.ora parameters to activate network encryption:

  • SQLNET.ENCRYPTION_SERVER
  • SQLNET.ENCRYPTION_CLIENT

Possible values of both parameters are:

  • accepted to enable the security service if required or requested by the other side.
  • rejected to disable the security service, even if required by the other side.
  • requested to enable the security service if the other side allows it.
  • required to enable the security service and disallow the connection if the other side is not enabled for the security service.

For my requirement the only acceptable value for SQLNET.ENCRYPTION_SERVER is required… On a side note, except for testing purposes, I am wondering about the added value of the rejected value. Why would you intentionally reject a secure connection if it is possible??!!

So in sqlnet.ora of my database server I set:

SQLNET.ENCRYPTION_SERVER=required

Then from my SQL*Plus client if I set nothing I get:

SQL> select network_service_banner from v$session_connect_info where sid in (select distinct sid from v$mystat);

NETWORK_SERVICE_BANNER
--------------------------------------------------------------------------------
TCP/IP NT Protocol Adapter for Linux: Version 19.0.0.0.0 - Production
Encryption service for Linux: Version 19.0.0.0.0 - Production
AES256 Encryption service adapter for Linux: Version 19.0.0.0.0 - Production
Crypto-checksumming service for Linux: Version 19.0.0.0.0 - Production

This is because the SQLNET.ENCRYPTION_CLIENT default value is accepted. We see that the encryption algorithm is AES256; this is because SQLNET.ENCRYPTION_TYPES_CLIENT and SQLNET.ENCRYPTION_TYPES_SERVER contain by default all encryption algorithms, i.e.:

  • 3des112 for triple DES with a two-key (112-bit) option
  • 3des168 for triple DES with a three-key (168-bit) option
  • aes128 for AES (128-bit key size)
  • aes192 for AES (192-bit key size)
  • aes256 for AES (256-bit key size)
  • des for standard DES (56-bit key size)
  • des40 for DES (40-bit key size)
  • rc4_40 for RSA RC4 (40-bit key size)
  • rc4_56 for RSA RC4 (56-bit key size)
  • rc4_128 for RSA RC4 (128-bit key size)
  • rc4_256 for RSA RC4 (256-bit key size)

If I explicitly set SQLNET.ENCRYPTION_CLIENT=rejected I get:

ERROR:
ORA-12660: Encryption or crypto-checksumming parameters incompatible

If you want your client to connect to your database server using a chosen algorithm you can set in your database server sqlnet.ora file:

SQLNET.ENCRYPTION_TYPES_SERVER=3des168

And you get when connecting:

SQL> select network_service_banner from v$session_connect_info where sid in (select distinct sid from v$mystat);

NETWORK_SERVICE_BANNER
--------------------------------------------------------------------------------
TCP/IP NT Protocol Adapter for Linux: Version 19.0.0.0.0 - Production
Encryption service for Linux: Version 19.0.0.0.0 - Production
3DES168 Encryption service adapter for Linux: Version 19.0.0.0.0 - Production
Crypto-checksumming service for Linux: Version 19.0.0.0.0 - Production

Activating network encryption with Java

As written above, your application will surely not connect to your database server using SQL*Plus. SQL*Plus is quite handy when you need to test that network encryption is working, but most probably your application is using Java. So how does it work in Java? Let's try it…

As in plenty of blog posts I have written on this web site I will be using Eclipse, which is free and is a nice Java editor with syntax completion to help you.

First download the JDBC driver that suits your environment. My database and client are in 19c so I have taken the 19c JDBC driver, and as I'm still using OpenJDK 8 (one day I will have to upgrade myself!!) I have finally chosen to use ojdbc8.jar, which is certified with JDK 8.

Choose the JDBC driver that is the exact version of your Oracle client or you will get the below error message when using the JDBC OCI driver:

Exception in thread "main" java.lang.Error: Incompatible version of libocijdbc[Jdbc:1910000, Jdbc-OCI:193000
	at oracle.jdbc.driver.T2CConnection$1.run(T2CConnection.java:4309)
	at java.security.AccessController.doPrivileged(Native Method)
	at oracle.jdbc.driver.T2CConnection.loadNativeLibrary(T2CConnection.java:4302)
	at oracle.jdbc.driver.T2CConnection.logon(T2CConnection.java:487)
	at oracle.jdbc.driver.PhysicalConnection.connect(PhysicalConnection.java:807)
	at oracle.jdbc.driver.T2CDriverExtension.getConnection(T2CDriverExtension.java:66)
	at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:770)
	at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:572)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:208)
	at network_encryption.network_encryption.main(network_encryption.java:30)

Add the JDBC jar file (ojdbc8.jar) to your Eclipse project with “Add External JARs”:

network_encryption01

From client perspective Oracle JDBC is made of two different drivers:

  • Thin driver: The JDBC Thin driver is a pure Java, Type IV driver that can be used in applications
  • Oracle Call Interface (OCI) driver: It is used on the client-side with an Oracle client installation. It can be used only with applications.

Statement of Oracle is pretty clear:

In general, unless you need OCI-specific features, such as support for non-TCP/IP networks, use the JDBC Thin driver.

Oracle documentation is providing this clear table and I well recall to have used JDBC OCI driver when testing Transparent Application Failover (TAF):

network_encryption02

Oracle JDBC Thin Driver

The small Java code I have written is:

  /**
  * 
  */
 /**
  * @author Yannick Jaquier
  *
  */
 package network_encryption;
 
 import java.sql.Connection;
 import java.sql.DriverManager;
 import java.sql.ResultSet;
 import java.sql.SQLException;
 import java.util.Properties;
 import oracle.jdbc.OracleConnection;
 //import oracle.jdbc.pool.OracleDataSource;
 
 public class network_encryption {
   public static void main(String[] args) throws Exception {
     Connection connection1 = null;
     String query1 = "select network_service_banner from v$session_connect_info where sid in (select distinct sid from v$mystat)";
     String connect_string = "//server01.domain.com:1531/pdb1";
     ResultSet resultset1 = null;
     Properties props = new Properties();
     //OracleDataSource ods = new OracleDataSource();
     OracleConnection oracleconnection1 = null;
    
     try {
       props.setProperty("user","yjaquier");
       props.setProperty("password","secure_password");
       props.setProperty(OracleConnection.CONNECTION_PROPERTY_THIN_NET_ENCRYPTION_LEVEL, "ACCEPTED");
       props.setProperty(OracleConnection.CONNECTION_PROPERTY_THIN_NET_ENCRYPTION_TYPES, "3des168");
       connection1 = DriverManager.getConnection("jdbc:oracle:thin:@" + connect_string, props);
       oracleconnection1 = (OracleConnection)connection1;
     }
     catch (SQLException e) {
       System.out.println("Connection Failed! Check output console");
       e.printStackTrace();
       System.exit(1);
     }
     System.out.println("Connected to Oracle database...");
    
     if (oracleconnection1!=null) {
       try {
         resultset1 = oracleconnection1.createStatement().executeQuery(query1);
         while (resultset1.next()) {
           System.out.println("Banner: "+resultset1.getString(1));
         }
         System.out.println("Used Encryption Algorithm: "+oracleconnection1.getEncryptionAlgorithmName());
       }
       catch (SQLException e) {
         System.out.println("Query has failed...");
         e.printStackTrace();
         System.exit(1);
       }
     }
     resultset1.close();
     connection1.close(); 
   }
 }

The console output is clear:

network_encryption03

If for example I set OracleConnection.CONNECTION_PROPERTY_THIN_NET_ENCRYPTION_LEVEL to REJECTED I get the below expected feedback (ORA-12660):

network_encryption04

Oracle JDBC OCI Driver

The Java code for the JDBC OCI driver is almost the same, except that you have far fewer available parameters (no CONNECTION_PROPERTY_THIN_NET_ENCRYPTION_TYPES) and functions (no getEncryptionAlgorithmName). So the idea is to link your applicative code with an instant client (or thick client if you like) and set the oracle.net.tns_admin system property to be able to play with your local sqlnet.ora.

The Java code is almost the same:

  /**
  * 
  */
 /**
  * @author Yannick Jaquier
  *
  */
 package network_encryption;
 
 import java.sql.Connection;
 import java.sql.DriverManager;
 import java.sql.ResultSet;
 import java.sql.SQLException;
 import java.util.Properties;
 import oracle.jdbc.OracleConnection;
 //import oracle.jdbc.pool.OracleDataSource;
 
 public class network_encryption {
   public static void main(String[] args) throws Exception {
     Connection connection1 = null;
     String query1 = "select network_service_banner from v$session_connect_info where sid in (select distinct sid from v$mystat)";
     String connect_string = "//server01.domain.com:1531/pdb1";
     ResultSet resultset1 = null;
     Properties props = new Properties();
     //OracleDataSource ods = new OracleDataSource();
     OracleConnection oracleconnection1 = null;
    
     try {
       props.setProperty("user","yjaquier");
       props.setProperty("password","secure_password");
       System.setProperty("oracle.net.tns_admin","C:\\app\\client\\product\\19.0.0\\client_1\\network\\admin");
       connection1 = DriverManager.getConnection("jdbc:oracle:oci:@" + connect_string, props);
       oracleconnection1 = (OracleConnection)connection1;
     }
     catch (SQLException e) {
       System.out.println("Connection Failed! Check output console");
       e.printStackTrace();
       System.exit(1);
     }
     System.out.println("Connected to Oracle database...");
    
     if (oracleconnection1!=null) {
       try {
         resultset1 = oracleconnection1.createStatement().executeQuery(query1);
         while (resultset1.next()) {
           System.out.println("Banner: "+resultset1.getString(1));
         }
       }
       catch (SQLException e) {
         System.out.println("Query has failed...");
         e.printStackTrace();
         System.exit(1);
       }
     }
     resultset1.close();
     connection1.close(); 
   }
 }

If in my sqlnet.ora I set:

SQLNET.ENCRYPTION_CLIENT=accepted
SQLNET.ENCRYPTION_TYPES_CLIENT=(3des168)

I get:

network_encryption05

But if I set:

SQLNET.ENCRYPTION_CLIENT=rejected
SQLNET.ENCRYPTION_TYPES_CLIENT=(3des168)

I get:

network_encryption06

Activating network encryption with Python

The de-facto package to connect to an Oracle database in Python is cx_Oracle! I am not detailing how to configure this in a Python virtual environment as the Internet is already full of tutorials on this…

The cx_Oracle Python package relies on the local client installation, so you end up using the sqlnet.ora file that we have seen with the SQL*Plus client.

The small Python code (network_encryption.py) I have written is:

import cx_Oracle
import config

connection = None
query1 = "select network_service_banner from v$session_connect_info where sid in (select distinct sid from v$mystat)"
try:
  connection = cx_Oracle.connect(
    config.username,
    config.password,
    config.dsn,
    encoding=config.encoding)

  # show the version of the Oracle Database
  print(connection.version)

  # Fetch and display rows of banner query
  with connection.cursor() as cursor:
    cursor.execute(query1)
    rows = cursor.fetchall()
    if rows:
      for row in rows:
        print(row)

except cx_Oracle.Error as error:
  print(error)
finally:
  # release the connection
  if connection:
    connection.close()

You also need to put the below config.py file in the same directory:

username = 'yjaquier'
password = 'secure_password'
dsn = 'server01.domain.com:1531/pdb1'
encoding = 'UTF-8'

If in my sqlnet.ora file I set SQLNET.ENCRYPTION_CLIENT=rejected I (obviously) get:

PS C:\Yannick\Python> python .\network_encryption.py
ORA-12660: Encryption or crypto-checksumming parameters incompatible

If I set nothing in sqlnet.ora file I get:

PS C:\Yannick\Python> python .\network_encryption.py
19.10.0.0.0
('TCP/IP NT Protocol Adapter for Linux: Version 19.0.0.0.0 - Production',)
('Encryption service for Linux: Version 19.0.0.0.0 - Production',)
('AES256 Encryption service adapter for Linux: Version 19.0.0.0.0 - Production',)
('Crypto-checksumming service for Linux: Version 19.0.0.0.0 - Production',)

I can force the encryption algorithm with SQLNET.ENCRYPTION_TYPES_CLIENT=(3des168) and get:

PS C:\Yannick\Python> python .\network_encryption.py
19.10.0.0.0
('TCP/IP NT Protocol Adapter for Linux: Version 19.0.0.0.0 - Production',)
('Encryption service for Linux: Version 19.0.0.0.0 - Production',)
('3DES168 Encryption service adapter for Linux: Version 19.0.0.0.0 - Production',)
('Crypto-checksumming service for Linux: Version 19.0.0.0.0 - Production',)

Conclusion

All in all, as the default value for SQLNET.ENCRYPTION_CLIENT is accepted, if you configure your database server to only accept encrypted connections then it should be transparent from the application side. At least it is for Java, Python and traditional SQL scripts…

If you really don’t want to touch your application code and still want to choose your preferred encryption algorithm (in case the default one, AES256, does not suit you) you can even imagine limiting the available encryption algorithms from the database server side with SQLNET.ENCRYPTION_TYPES_SERVER.

References

Automate your cluster with Ambari API and Ambari Metrics System



Preamble

Even if we upgraded our HortonWorks Hadoop Platform (HDP) to 3.1.4 to solve many HDP 2 issues, we still encounter a few components that fail for unexpected reasons and/or that keep increasing their memory usage, never decreasing, for no particular reason (at least not one we have found so far), like the Hive Metastore.

Of course we are using Ambari (2.7.4) and the Hive Dashboard (Grafana), but this is only a reactive approach: stop/start components and proactively monitor memory usage and so on.

To try to automate this we decided to investigate the Ambari API and the Ambari Metrics System (AMS). The documentation of those two components is not the best I have seen and it might be difficult to get into their usage. But once the basic understanding is there, the limit on what's possible to do is infinite…

AMS will be used to programmatically get what you would get from Grafana, and the Ambari API will then be used to trigger an action to restart a component.

All the scripts below will be in Python 3.8.6, using the HTTP-speaking Python package requests for convenience. I guess it can be transposed to any language of your choice, but around Big Data platforms (and in general) Python is quite popular…

Ambari Metrics System (AMS)

The idea behind the usage of this REST API is to get the latest memory usage and the configured maximum memory usage value of the Hive Metastore.

First identify your AMS server by clicking on the Ambari Metrics service and getting the details of the Metric Collector component:

ams01

Once you have identified your AMS server (that will be called AMS_SERVER below), check that you can access the REST API by using the http://AMS_SERVER:6188/ws/v1/timeline/metrics/hosts or http://AMS_SERVER:6188/ws/v1/timeline/metrics/metadata urls in a web browser. The first url gives a picture of your servers with their associated components, which I cannot share here, and the second url gives a list of all the available metrics (the Firefox 84.0.2 display is much better than the Chrome one as the JSON output is formatted; you can even filter it):

ams02
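If you prefer to explore this metadata from code rather than in the browser, a quick sketch with requests can do it. The JSON layout is an assumption based on what the metadata endpoint shows in the browser (an object keyed by appId, each entry being a list of metric descriptors with a metricname field); adapt if your output differs:

import requests

AMS_URL = 'http://amshost01.domain.com:6188/ws/v1/timeline/'

# Print all metric names exposed for the Hive Metastore appId.
metadata = requests.get(AMS_URL + 'metrics/metadata').json()
for metric in metadata.get('hivemetastore', []):
    print(metric['metricname'])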

To get metric values you must issue an HTTP GET on a URL of the form:

http://AMS_HOST:6188/ws/v1/timeline/metrics?metricNames=<>&hostname=<>&appId=<>&startTime=<>&endTime=<>&precision=<>

If you want only the latest raw value of one or many metric names you can omit appId, startTime, endTime and precision parameters of the url…

Then, to know which metric names you need to get, you can use your Ambari Grafana component: after authentication you can edit any chart and get the metric name (in my example I have my Hive Metastore running on two different hosts):

ams03

So I need to get the default.General.heap.max and default.General.heap.used metric names from the hostname(s) where the hivemetastore appId is running. In other words an url with these parameters: metricNames=default.General.heap.max,default.General.heap.used&appId=hivemetastore&hostname=hiveserver01.domain.com

In Python it gives something like:

import requests
import json

# -----------------------------------------------------------------------------
#                       Functions
# -----------------------------------------------------------------------------
def human_readable(num):
  """
  this function will convert bytes to MB.... GB... etc
  """
  step_unit = 1024.0
  for x in ['bytes', 'KB', 'MB', 'GB', 'TB']:
    if num < step_unit:
      return "%3.1f %s" % (num, x)
    num /= step_unit

# -----------------------------------------------------------------------------
#         Variables
# -----------------------------------------------------------------------------
AMS_SERVER = 'amshost01.domain.com'
AMS_PORT = '6188'
AMS_URL = 'http://' + AMS_SERVER + ':' + AMS_PORT + '/ws/v1/timeline/'

# -----------------------------------------------------------------------------
#                Main
# -----------------------------------------------------------------------------

try:
  request01 = requests.get(AMS_URL + "metrics?metricNames=default.General.heap.max,default.General.heap.used&appId=hivemetastore&hostname=hiveserver01.domain.com")
  request01_dict = json.loads(request01.text)
  output = {}
  for row in request01_dict['metrics']:
    for key01, value01 in row.items():
      if key01 == 'metricname':
        metricname = value01
      if key01 == 'metrics':
        for key02, value02 in value01.items():
          metricvalue = value02
    output[metricname] = metricvalue
  print('Hive Metastore Heap Max: ' + human_readable(output['default.General.heap.max']))
  print('Hive Metastore Heap Used: ' + human_readable(output['default.General.heap.used']))
  print(("Hive Metastore percentage memory used: {:.0f}").format(output['default.General.heap.used']*100/output['default.General.heap.max']))
except:
  print("Cannot contact AMS server")
  exit(1)

exit(0)

For my live Hive Metastore it gives:

# ./ams_restapi.py
Hive Metastore Heap Max: 24.0 GB
Hive Metastore Heap Used: 5.3 GB
Hive Metastore percentage memory used: 22

You can obviously double-check in Grafana to be 100% sure you are returning what you expect...
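
The script above only fetches the latest raw value. If you ever need a small history instead (for example to confirm that memory keeps growing), a sketch of a time-window query could look like the one below. It reuses the same metric name and hostname and simply adds the optional startTime, endTime and precision parameters; I am assuming here that epoch milliseconds are accepted for the timestamps, so double-check against your AMS release:

#!/usr/bin/env python3
# Sketch: last hour of Hive Metastore heap usage, one point per minute
import time
import requests

AMS_URL = 'http://amshost01.domain.com:6188/ws/v1/timeline/'

end_time = int(time.time() * 1000)
start_time = end_time - 3600 * 1000
reply = requests.get(AMS_URL + 'metrics', params={
  'metricNames': 'default.General.heap.used',
  'appId': 'hivemetastore',
  'hostname': 'hiveserver01.domain.com',
  'startTime': start_time,
  'endTime': end_time,
  'precision': 'MINUTES'}, timeout=30)
for metric in reply.json()['metrics']:
  # 'metrics' is a dictionary of timestamp -> value pairs
  print(metric['metricname'], metric['metrics'])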

Ambari API

In this first chapter we have seen how to get metrics information for your components, which can be used to decide whether to trigger an action on a component. To trigger this action by script you need to use the Ambari API.

As you are using Ambari you need an Ambari account and password to call this REST API. This is simple to implement with the Python requests package.

Cluster name

The first information you need to get is the cluster name you have chosen when installing HDP. In Python this can simply be done with the snippet below, where AMBARI_URL is of the form 'http://' + AMBARI_SERVER + ':' + AMBARI_PORT + '/api/v1/clusters/', for example 'http://ambariserver01.domain.com:8080/api/v1/clusters/':

try:
  request01 = requests.get(AMBARI_URL, auth=('ambari_account', 'ambari_password'))
  request01_dict = json.loads(request01.text)
  cluster_name = request01_dict['items'][0]['Clusters']['cluster_name']
except:
  logging.error("Cannot contact Ambari server")
  print("Cannot contact Ambari server")
  exit(1)

This cluster_name variable will be used in the sub-chapters below…

List services and components

You get the service list with:

try:
  request01 = requests.get(AMBARI_URL + cluster_name + '/services', auth=('ambari_account', 'ambari_password'))
  request01_dict = json.loads(request01.text)
  for row in request01_dict['items']:
    print(row['ServiceInfo']['service_name'])
except:
  logging.error("Cannot contact Ambari server")
  print("Cannot contact Ambari server")
  exit(1)

And you can get the component list per service with:

try:
  request01 = requests.get(AMBARI_URL + cluster_name + '/services', auth=('ambari_account', 'ambari_password'))
  request01_dict = json.loads(request01.text)
  # print(request01_dict)
  for row01 in request01_dict['items']:
    print('Service: ' + row01['ServiceInfo']['service_name'])
    request02 = requests.get(AMBARI_URL + cluster_name + '/services/' + row01['ServiceInfo']['service_name'] + '/components', auth=('ambari_account', 'ambari_password'))
    request02_dict = json.loads(request02.text)
    for row02 in request02_dict['items']:
      print('Component: ' + row02['ServiceComponentInfo']['component_name'])
except:
  logging.error("Cannot contact Ambari server")
  print("Cannot contact Ambari server")
  exit(1)

Status of services and components

To get a service status (SERVICE is the variable containing your service name), use the query below. I replace INSTALLED by STOPPED because a service/component is in the INSTALLED state when it is stopped:

try:
  request01 = requests.get(AMBARI_URL + cluster_name + '/services/' + SERVICE + '?fields=ServiceInfo/state', auth=('ambari_account', 'ambari_password'))
  request01_dict = json.loads(request01.text)
  print('Service ' + SERVICE + ' status: '+ request01_dict['ServiceInfo']['state'].replace("INSTALLED","STOPPED"))
except:
  logging.error("Cannot contact Ambari server")
  print("Cannot contact Ambari server")
  exit(1)

To get a component status (the SERVICE and COMPONENT variables respectively contain the service and component names):

try:
  request01 = requests.get(AMBARI_URL + cluster_name + '/services/' + SERVICE + '/components/' + COMPONENT, auth=('ambari_account', 'ambari_password'))
  request01_dict = json.loads(request01.text)
  # If a component is running on multiple hosts
  for row in request01_dict["host_components"]:
    print('Component ' + COMPONENT + ' of service ' + SERVICE + ' running on ' + row["HostRoles"]["host_name"] + ' status: '+ request01_dict['ServiceComponentInfo']['state'].replace("INSTALLED","STOPPED"))
except:
  logging.error("Cannot contact Ambari server")
  print("Cannot contact Ambari server")
  exit(1)

Stop services and components

To stop a service you need an HTTP PUT request with a body that simply overwrites the state of the service/component with INSTALLED. The context can be customized to get a self-explanatory display in the Ambari web application like this:

[screenshot: ambari_api01]

You must also set the header of your PUT request. The PUT request returns a message telling you whether your request has been accepted or not.

To stop a service and so all its components:

try:
  data={"RequestInfo":{"context":"Stop service "+SERVICE},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}
  request01 = requests.put(AMBARI_URL + cluster_name + '/services/' + SERVICE, json=data, headers={'X-Requested-By': 'ambari'}, auth=('ambari_account', 'ambari_password'))
  print(request01.text)
except:
  print("Cannot stop service")
  logging.error("Cannot stop service")
  print(request01.status_code)
  print(request01.text)
  exit(1)

Stopping a component is a bit more complex because you first need to get the hostname(s) on which your component is running:

try:
  request01 = requests.get(AMBARI_URL + cluster_name + '/services/' + SERVICE + '/components/' + COMPONENT, auth=('ambari_account', 'ambari_password'))
  request01_dict = json.loads(request01.text)
  hosts_put_url = request01_dict['host_components']
except:
  logging.error("Cannot contact Ambari server")
  print("Cannot contact Ambari server")
  exit(1)
try:
  for row in hosts_put_url:
    data={"RequestInfo":{"context":"Stop component " + COMPONENT + " on " + row["HostRoles"]["host_name"]},"Body":{"HostRoles":{"state":"INSTALLED"}}}
    host_put_url = row["href"]
    request02 = requests.put(host_put_url, json=data, headers={'X-Requested-By': 'ambari'}, auth=('ambari_account', 'ambari_password'))
    print(request02.text)
except:
  print("Cannot stop component")
  logging.error("Cannot stop component")
  print(request02.status_code)
  print(request02.text)
  exit(1)

Start services and components

Starting services and components follows exactly the same principle as stopping them, except that the desired state is STARTED.

To start a service and all its components:

try:
  data={"RequestInfo":{"context":"Start service "+SERVICE},"Body":{"ServiceInfo":{"state":"STARTED"}}}
  request01 = requests.put(AMBARI_URL + cluster_name + '/services/' + SERVICE, json=data, headers={'X-Requested-By': 'ambari'}, auth=('ambari_account', 'ambari_password'))
  print(request01.text)
except:
  print("Cannot start service")
  logging.error("Cannot start service")
  print(request01.status_code)
  print(request01.text)
  exit(1)

To start a component on all the hostnames where it is running:

try:
  request01 = requests.get(AMBARI_URL + cluster_name + '/services/' + SERVICE + '/components/' + COMPONENT, auth=('ambari_account', 'ambari_password'))
  request01_dict = json.loads(request01.text)
  hosts_put_url = request01_dict['host_components']
except:
  logging.error("Cannot contact Ambari server")
  print("Cannot contact Ambari server")
  exit(1)
try:
  for row in hosts_put_url:
    data={"RequestInfo":{"context":"Start component " + COMPONENT + " on " + row["HostRoles"]["host_name"]},"Body":{"HostRoles":{"state":"STARTED"}}}
    host_put_url = row["href"]
    request02 = requests.put(host_put_url, json=data, headers={'X-Requested-By': 'ambari'}, auth=('ambari_account', 'ambari_password'))
    print(request02.text)
except:
  print("Cannot start component")
  logging.error("Cannot start component")
  print(request02.status_code)
  print(request02.text)
  exit(1)

Conclusion

What I have finally done is an executable script with parameters (using the Python argparse module), and I now have a usable tool to interact with components. This script can be given to level 1 operations to perform routine actions while on duty, like:

# ./ambari_restapi.py --help
usage: ambari_restapi.py [-h] [--list_services] [--list_components] [--status] [--service SERVICE] [--component COMPONENT] [--stop] [--start]

optional arguments:
  -h, --help            show this help message and exit
  --list_services       List all services
  --list_components     List all components
  --status              Status of service or component, works in conjunction with --service or --component
  --service SERVICE     Service name
  --component COMPONENT
                        Component name
  --stop                Stop service or component, works in conjunction with --service or --component
  --start               Start service or component, works in conjunction with --service or --component

# ./ambari_restapi.py --list_services
AMBARI_INFRA_SOLR
AMBARI_METRICS
HBASE
HDFS
HIVE
MAPREDUCE2
OOZIE
PIG
SMARTSENSE
SPARK2
TEZ
YARN
ZEPPELIN
ZOOKEEPER
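
The full script is not published here, but a minimal argparse skeleton matching the --help output above could look like this. It is a reconstruction, not the original file; the dispatch part simply has to call the snippets shown in the previous chapters:

#!/usr/bin/env python3
# Reconstructed skeleton of an Ambari REST API wrapper (not the original script)
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--list_services', action='store_true', help='List all services')
parser.add_argument('--list_components', action='store_true', help='List all components')
parser.add_argument('--status', action='store_true', help='Status of service or component, works in conjunction with --service or --component')
parser.add_argument('--service', help='Service name')
parser.add_argument('--component', help='Component name')
parser.add_argument('--stop', action='store_true', help='Stop service or component, works in conjunction with --service or --component')
parser.add_argument('--start', action='store_true', help='Start service or component, works in conjunction with --service or --component')
args = parser.parse_args()

if args.list_services:
  print('call the "list services" snippet shown above')
elif args.status and args.service:
  print('call the "service status" snippet shown above')
# ... and so on for the other option combinations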

Then, with a monitoring tool, if the AMS result is above a defined threshold we can trigger an action using the Ambari API script…
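
As an illustration, here is a minimal sketch of such a glue check. It assumes the argparse wrapper above is available as ./ambari_restapi.py in the current directory and that the Hive Metastore component is named HIVE_METASTORE in your Ambari installation (check it with --list_components first), and it reuses the AMS query from the first chapter:

#!/usr/bin/env python3
# Hypothetical watchdog: restart the Hive Metastore when its heap usage,
# read from AMS, crosses a threshold. Meant to be run from cron or a monitoring tool.
import subprocess
import requests

AMS_URL = 'http://amshost01.domain.com:6188/ws/v1/timeline/'
THRESHOLD = 90  # percent of maximum heap

reply = requests.get(AMS_URL + 'metrics', params={
  'metricNames': 'default.General.heap.max,default.General.heap.used',
  'appId': 'hivemetastore',
  'hostname': 'hiveserver01.domain.com'}, timeout=30)
values = {m['metricname']: list(m['metrics'].values())[0] for m in reply.json()['metrics']}
used_pct = values['default.General.heap.used'] * 100 / values['default.General.heap.max']

if used_pct > THRESHOLD:
  # Service and component names are assumptions, adapt to your cluster
  for action in ('--stop', '--start'):
    subprocess.run(['./ambari_restapi.py', action, '--service', 'HIVE', '--component', 'HIVE_METASTORE'], check=True)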

References

Privilege Analysis hands-on for least privileges principle https://blog.yannickjaquier.com/oracle/privilege-analysis-for-least-privileges-principle.html Thu, 25 Feb 2021 09:38:06 +0000


Table of contents

Preamble

If there is one subject rising quickly where you work it is, without a doubt, security! The subject is complex by essence since there is obviously no single clear procedure to follow to be secure, that would be too simple. When it comes to Oracle database privileges there is one principle that you MUST apply everywhere, called the least privileges principle.

Where I work it is always a challenge to apply this principle because in many cases, for legacy (bad) reasons, many applicative accounts have been granted xxx ANY yyy privileges to avoid granting the really required objects one by one. Not to mention that some applicative accounts have even been granted the DBA role… It then becomes really difficult to guess what the users really need and use in order to remove the high privileges and grant many smaller ones instead. You also often do not get any support from users in such a scenario…

With Oracle 12cR1 (12.1.0.1) a very interesting feature has been released to achieve this goal of least privileges, called Privilege Analysis (PA). The process is very simple and the Oracle official documentation states it very well:

Privilege analysis enables customers to create a profile for a database user and capture the list of system and object privileges that are being used by this user. The customer can then compare the user’s list of used privileges with the list of granted privileges and reduce the list of granted privileges to match the used privileges.

Unfortunately this feature was initially made available only within the Database Vault paid option. The excellent news that Oracle published on December 7th, 2018 is that they made this feature free in all Oracle Enterprise Edition versions!

My test database is running on a RedHat 7.8 server and is release 19c with the October 2020 Release Update (RU) 31771877, so the exact database release is 19.9.0.0.201020. It is in fact a pluggable database inside this container database.

Privilege analysis test case

When creating my test database I selected the sample schemas and so already have some data inside the database:

SQL> set lines 200
SQL> select * from hr.employees fetch next 5 rows only;

EMPLOYEE_ID FIRST_NAME           LAST_NAME                 EMAIL                     PHONE_NUMBER         HIRE_DATE JOB_ID         SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID
----------- -------------------- ------------------------- ------------------------- -------------------- --------- ---------- ---------- -------------- ---------- -------------
        100 Steven               King                      SKING                     515.123.4567         17-JUN-03 AD_PRES         24000                                      90
        101 Neena                Kochhar                   NKOCHHAR                  515.123.4568         21-SEP-05 AD_VP           17000                       100            90
        102 Lex                  De Haan                   LDEHAAN                   515.123.4569         13-JAN-01 AD_VP           17000                       100            90
        103 Alexander            Hunold                    AHUNOLD                   590.423.4567         03-JAN-06 IT_PROG          9000                       102            60
        104 Bruce                Ernst                     BERNST                    590.423.4568         21-MAY-07 IT_PROG          6000                       103            60

The EMPLOYEES table is part of the well known HR schema.

Now let's imagine I have a HRUSER account used inside an application and, out of laziness, I grant the SELECT ANY TABLE and UPDATE ANY TABLE privileges to this account to be able to (simply) access and update the HR schema tables.

This also raises some questions when you have multiple schema owners inside the same database or pluggable database, as the powerful SELECT ANY TABLE privilege will allow the account to select data from the objects of all database schemas and NOT ONLY the desired one linked to the application this account is supporting… Not even mentioning the extra threat of UPDATE ANY TABLE, DELETE ANY TABLE and so on…

Simple creation:

SQL> create user hruser identified by "secure_password";

User created.

SQL> grant connect, select any table, update any table to hruser;

Grant succeeded.

Privilege analysis testing

Now that our fictitious application is running with the HRUSER account we would like to analyze which privileges the HRUSER account really uses and see how we can reduce its rights to apply the least privileges principle.

Everything is done through the DBMS_PRIVILEGE_CAPTURE supplied PL/SQL package and you need the CAPTURE_ADMIN role to use it; for convenience I will use the SYS account.
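
If you prefer not to work as SYS, a sketch of a dedicated administration account would be the following (SECADMIN is a hypothetical account name):

SQL> create user secadmin identified by "secure_password";

User created.

SQL> grant create session, capture_admin to secadmin;

Grant succeeded.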

I start by creating a privilege analysis policy for my HRUSER account. I have chosen a G_CONTEXT analysis policy to focus on this account; the available types, as explained in the official documentation, are:

  • G_DATABASE: Captures all privilege use in the database, except privileges used by the SYS user.
  • G_ROLE: Captures the use of a privilege if the privilege is part of a specified role or list of roles.
  • G_CONTEXT: Captures the use of a privilege if the context specified by the condition parameter evaluates to true.
  • G_ROLE_AND_CONTEXT: Captures the use of a privilege if the privilege is part of the specified list of roles and when the condition specified by the condition parameter is true.

SQL> exec dbms_privilege_capture.create_capture(name=>'hruser_prileges_analysis', description=>'Analyze HRUSER privileges usage',-
> type=>DBMS_PRIVILEGE_CAPTURE.G_CONTEXT,-
> condition=>'SYS_CONTEXT(''USERENV'', ''SESSION_USER'')=''HRUSER''');

PL/SQL procedure successfully completed.

SQL> col description for a40
SQL> col context for a50
SQL> select description,type,context from dba_priv_captures where name='hruser_prileges_analysis';

DESCRIPTION                              TYPE             CONTEXT
---------------------------------------- ---------------- --------------------------------------------------
Analyze HRUSER privileges usage          CONTEXT          SYS_CONTEXT('USERENV', 'SESSION_USER')='HRUSER'

You must then enable the privilege analysis policy with the command below. Specifying a run_name is useful to generate reports and compare privilege analysis results over multiple capture periods:

SQL> exec dbms_privilege_capture.enable_capture(name=>'hruser_prileges_analysis', run_name=>'hruser_18_dec_2020');

PL/SQL procedure successfully completed.

SQL> col run_name for a20
SQL> select description,type,context,run_name from dba_priv_captures where name='hruser_prileges_analysis';

DESCRIPTION                              TYPE             CONTEXT                                            RUN_NAME
---------------------------------------- ---------------- -------------------------------------------------- --------------------
Analyze HRUSER privileges usage          CONTEXT          SYS_CONTEXT('USERENV', 'SESSION_USER')='HRUSER'    HRUSER_18_DEC_2020

Then, with my previously created HRUSER, I simulate what would be done through a classical application. I select the salary and commission percentage of a sales employee and, because he has performed very well over the past year, I increase his commission:

SQL> select first_name,last_name,salary,commission_pct from hr.employees where employee_id=165;

FIRST_NAME           LAST_NAME                     SALARY COMMISSION_PCT
-------------------- ------------------------- ---------- --------------
David                Lee                             6800             .1

SQL> update hr.employees set commission_pct=0.2 where employee_id=165;

1 row updated.

SQL> commit;

Commit complete.

SQL> select first_name,last_name,salary,commission_pct from hr.employees where employee_id=165;

FIRST_NAME           LAST_NAME                     SALARY COMMISSION_PCT
-------------------- ------------------------- ---------- --------------
David                Lee                             6800             .2

The duration of the capture activity might be quite complex to determine. What if you have weekly jobs, monthly jobs or even yearly jobs? I tend to say that it must run for at least a week but the right duration is totally up to you and your environment…
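
For example, once a first run is finished you could open a second capture window on the same policy under a different run name and compare the two periods later (the run name below is just an example):

SQL> exec dbms_privilege_capture.disable_capture(name=>'hruser_prileges_analysis');

PL/SQL procedure successfully completed.

SQL> exec dbms_privilege_capture.enable_capture(name=>'hruser_prileges_analysis', run_name=>'hruser_25_dec_2020');

PL/SQL procedure successfully completed.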

Privilege analysis reports and conclusion

Before generating a privilege analysis policy report you must disable it:

SQL> exec dbms_privilege_capture.disable_capture(name=> 'hruser_prileges_analysis');

PL/SQL procedure successfully completed.

I generate a privilege analysis report using the run name I have specified when enabling it:

SQL> exec dbms_privilege_capture.generate_result(name=>'hruser_prileges_analysis', run_name=>'HRUSER_18_DEC_2020', dependency=>true);

PL/SQL procedure successfully completed.

To be honest I am a little bit disappointed here as I expected the procedure, like the AWR one, to generate a nice and usable HTML report. In fact the GENERATE_RESULT procedure simply fills the DBA_USED_* data dictionary privilege analysis views and you then have to query them to get your result. The complete list of these views is available in the official documentation.

SQL> col username format a10
SQL> col sys_priv format a16
SQL> col object_owner format a13
SQL> col object_name format a23
SQL> select username,sys_priv, object_owner, object_name from dba_used_privs where capture='hruser_prileges_analysis' and run_name='HRUSER_18_DEC_2020';

USERNAME   SYS_PRIV         OBJECT_OWNER  OBJECT_NAME
---------- ---------------- ------------- -----------------------
HRUSER                      SYS           DBMS_APPLICATION_INFO
HRUSER     UPDATE ANY TABLE HR            EMPLOYEES
HRUSER     CREATE SESSION
HRUSER     SELECT ANY TABLE HR            EMPLOYEES
HRUSER                      SYS           DUAL

In the output above we see that HRUSER has used UPDATE ANY TABLE and SELECT ANY TABLE to respectively update and read the HR.EMPLOYEES table. So direct SELECT and UPDATE grants on this table would replace the two powerful ANY privileges.
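
A sketch of the corresponding remediation, to be applied only once you are confident the capture window covered all the application activity, would be:

SQL> grant select, update on hr.employees to hruser;

Grant succeeded.

SQL> revoke select any table, update any table from hruser;

Revoke succeeded.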

Let’s now generate system privileges HRUSER has used:

SQL> select username, sys_priv from dba_used_sysprivs where capture='hruser_prileges_analysis' and run_name='HRUSER_18_DEC_2020';

USERNAME   SYS_PRIV
---------- ----------------
HRUSER     CREATE SESSION
HRUSER     SELECT ANY TABLE
HRUSER     UPDATE ANY TABLE

In my example I have nothing related to object privileges but if you need them you can use the DBA_USED_OBJPRIVS(_PATH) and DBA_UNUSED_OBJPRIVS(_PATH) views.

In this final example we can also see that, from the CONNECT role, HRUSER has not used the SET CONTAINER privilege, so we could also replace the CONNECT role by the CREATE SESSION privilege.

SQL> col path for a40
SQL> select sys_priv, path from dba_used_sysprivs_path where capture='hruser_prileges_analysis'  and run_name='HRUSER_18_DEC_2020';

SYS_PRIV         PATH
---------------- ----------------------------------------
UPDATE ANY TABLE GRANT_PATH('HRUSER')
CREATE SESSION   GRANT_PATH('HRUSER', 'CONNECT')
SELECT ANY TABLE GRANT_PATH('HRUSER')

SQL> select sys_priv, path from dba_unused_sysprivs_path where capture='hruser_prileges_analysis' and run_name='HRUSER_18_DEC_2020' and username='HRUSER';

SYS_PRIV         PATH
---------------- ----------------------------------------
SET CONTAINER    GRANT_PATH('HRUSER', 'CONNECT')
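
Here again, a possible remediation sketch for this last finding would be:

SQL> grant create session to hruser;

Grant succeeded.

SQL> revoke connect from hruser;

Revoke succeeded.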

All those SQL*Plus queries are cool but light-years away from what you can get with Cloud Control (Security / Privilege Analysis menu). The version I have is Cloud Control 13c Release 2 (13.2):

Global view (you even get the capture start and end times, which are not easily visible in the PA views):

[screenshot: privilege_analysis01]

Used privileges:

[screenshot: privilege_analysis02]

Unused privileges (I had to filter on my HRUSER account in Cloud Control):

[screenshot: privilege_analysis03]

References

MariaDB ColumnStore installation and testing – part 2 https://blog.yannickjaquier.com/mysql/mariadb-columnstore-installation-and-testing-part-2.html Sun, 24 Jan 2021 08:53:27 +0000


Table of contents

Preamble

After a first blog post using the container edition of MariaDB ColumnStore I wanted to deploy it on an existing custom MariaDB server installation, because where I work we prefer to put files where we like using the MOCA architecture.

I have given up on this part for now, as the MariaDB documentation is really too poor, and I might come back to this article to update it if things evolve positively…

MariaDB Community Server installation and configuration

I have updated the MOCA layout for MariaDB that we saw a long time ago. MOCA stands for MariaDB Optimal Configuration Architecture. Below is the MariaDB directory naming convention, where mariadb01 is the name of the instance:

  • /mariadb/data01/mariadb01: stores MyISAM and InnoDB files; dataxx directories can also be created to spread I/O
  • /mariadb/dump/mariadb01: all log files (slow log, error log, general log, …)
  • /mariadb/logs/mariadb01: all binary logs (log-bin, relay_log)
  • /mariadb/software/mariadb01: MariaDB binaries. You might also want to use /mariadb/software/10.5.4 and share the binaries between multiple MariaDB instances; I personally believe that the extra 1GB for binaries is worth the flexibility it gives, in other words you can upgrade one instance without touching the others. The my.cnf file is then stored in a conf sub-directory, as well as the socket and pid files.

I have created a mariadb Linux account in the dba group and a /mariadb mount point of 5GB (xfs).

The archive I downloaded is mariadb-10.5.4-linux-systemd-x86_64.tar.gz (for systems with systemd) as I have a recent Linux… The tar.gz choice is obviously deliberate as I want to be able to put it in the directory of my choice:

[screenshot: columnstore07]

If you take the RPM you can only have one MariaDB installation per server, which can be really limiting (and really hard to manage with your customers) on modern powerful servers…
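
For completeness, a sketch of the directory preparation and archive extraction following the layout above could be the following (this is not taken from the original installation log, adapt paths and archive location to your environment):

# As root: create the MOCA directories and give them to the mariadb account
mkdir -p /mariadb/data01/mariadb01 /mariadb/dump/mariadb01 /mariadb/logs/mariadb01 /mariadb/software/mariadb01/conf
chown -R mariadb:dba /mariadb

# As mariadb: unpack the archive into the software directory, dropping the top-level folder
tar zxf mariadb-10.5.4-linux-systemd-x86_64.tar.gz --strip-components=1 -C /mariadb/software/mariadb01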

I created the /mariadb/software/mariadb01/conf/my.cnf file with the content below (this is just a starting point, tuning it for your own workload is mandatory):

[server]
# Primary variables
basedir                         = /mariadb/software/mariadb01
datadir                         = /mariadb/data01/mariadb01
max_allowed_packet              = 256M
max_connect_errors              = 1000000
pid_file                        = /mariadb/software/mariadb01/conf/mariadb01.pid
skip_external_locking
skip_name_resolve

# Logging
log_error                       = /mariadb/dump/mariadb01/mariadb01.err
log_queries_not_using_indexes   = ON
long_query_time                 = 5
slow_query_log                  = ON     # Disable in production
slow_query_log_file             = /mariadb/dump/mariadb01/mariadb01-slow.log

tmpdir                          = /tmp
user                            = mariadb

# InnoDB Settings
default_storage_engine          = InnoDB
innodb_buffer_pool_size         = 1G    # Use up to 70-80% of RAM
innodb_file_per_table           = ON
innodb_flush_log_at_trx_commit  = 0
innodb_flush_method             = O_DIRECT
innodb_log_buffer_size          = 16M
innodb_log_file_size            = 512M
innodb_stats_on_metadata        = ON
innodb_read_io_threads          = 64
innodb_write_io_threads         = 64

# New plugin directory for Columnstore
plugin_dir                      = /usr/lib64/mysql/plugin
plugin_maturity                 = beta

[client-server]
port                            = 3316
socket                          = /mariadb/software/mariadb01/conf/mariadb01.sock

As root account I executed:

[root@server4 ~]# /mariadb/software/mariadb01/scripts/mariadb-install-db --defaults-file=/mariadb/software/mariadb01/conf/my.cnf --user=mariadb
Installing MariaDB/MySQL system tables in '/mariadb/data01/mariadb01' ...
OK

To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system


Two all-privilege accounts were created.
One is root@localhost, it has no password, but you need to
be system 'root' user to connect. Use, for example, sudo mysql
The second is mariadb@localhost, it has no password either, but
you need to be the system 'mariadb' user to connect.
After connecting you can set the password, if you would need to be
able to connect as any of these users with a password and without sudo

See the MariaDB Knowledgebase at https://mariadb.com/kb or the
MySQL manual for more instructions.

You can start the MariaDB daemon with:
cd '/mariadb/software/mariadb01' ; /mariadb/software/mariadb01/bin/mysqld_safe --datadir='/mariadb/data01/mariadb01'

You can test the MariaDB daemon with mysql-test-run.pl
cd '/mariadb/software/mariadb01/mysql-test' ; perl mysql-test-run.pl

Please report any problems at https://mariadb.org/jira

The latest information about MariaDB is available at https://mariadb.org/.
You can find additional information about the MySQL part at:
https://dev.mysql.com
Consider joining MariaDB's strong and vibrant community:
Get Involved

It is new (at least to me) that from now on you can connect with the mariadb or root account without any password. In my mariadb Linux account I created the three aliases below:

alias mariadb01='/mariadb/software/mariadb01/bin/mariadb --defaults-file=/mariadb/software/mariadb01/conf/my.cnf --user=mariadb'
alias start_mariadb01='cd /mariadb/software/mariadb01/; ./bin/mariadbd-safe --defaults-file=/mariadb/software/mariadb01/conf/my.cnf &'
alias stop_mariadb01='/mariadb/software/mariadb01/bin/mariadb-admin --defaults-file=/mariadb/software/mariadb01/conf/my.cnf --user=mariadb shutdown' 

The start and stop commands are working fine but the client connection (mariadb01 alias) failed with:

/mariadb/software/mariadb01/bin/mariadb: error while loading shared libraries: libncurses.so.5: cannot open shared object file: No such file or directory

I resolved it with:

dnf -y install ncurses-compat-libs-6.1-7.20180224.el8.x86_64

You can also connect with the root Linux account (MariaDB accounts cannot be faked) using:

[root@server4 ~]# /mariadb/software/mariadb01/bin/mariadb --defaults-file=/mariadb/software/mariadb01/conf/my.cnf --user=root
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 5
Server version: 10.5.4-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]>

MariaDB ColumnStore installation and configuration

I expected, as it is written everywhere, to have ColumnStore available as a storage engine, but found nothing installed by default:

MariaDB [(none)]> show engines;
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                                         | Transactions | XA   | Savepoints |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| MRG_MyISAM         | YES     | Collection of identical MyISAM tables                                                           | NO           | NO   | NO         |
| CSV                | YES     | Stores tables as CSV files                                                                      | NO           | NO   | NO         |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                                       | NO           | NO   | NO         |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                                  | YES          | NO   | YES        |
| Aria               | YES     | Crash-safe tables with MyISAM heritage. Used for internal temporary tables and privilege tables | NO           | NO   | NO         |
| MyISAM             | YES     | Non-transactional engine with good performance and small data footprint                         | NO           | NO   | NO         |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                                              | NO           | NO   | NO         |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, foreign keys and encryption for tables                | YES          | YES  | YES        |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
8 rows in set (0.000 sec)

MariaDB [(none)]> show plugins;
+-------------------------------+----------+--------------------+---------+---------+
| Name                          | Status   | Type               | Library | License |
+-------------------------------+----------+--------------------+---------+---------+
| binlog                        | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| mysql_native_password         | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| mysql_old_password            | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| MRG_MyISAM                    | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| MEMORY                        | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| CSV                           | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| Aria                          | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| MyISAM                        | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| SPATIAL_REF_SYS               | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| GEOMETRY_COLUMNS              | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| inet6                         | ACTIVE   | DATA TYPE          | NULL    | GPL     |
| inet_aton                     | ACTIVE   | FUNCTION           | NULL    | GPL     |
| inet_ntoa                     | ACTIVE   | FUNCTION           | NULL    | GPL     |
| inet6_aton                    | ACTIVE   | FUNCTION           | NULL    | GPL     |
| inet6_ntoa                    | ACTIVE   | FUNCTION           | NULL    | GPL     |
| is_ipv4                       | ACTIVE   | FUNCTION           | NULL    | GPL     |
| is_ipv6                       | ACTIVE   | FUNCTION           | NULL    | GPL     |
| is_ipv4_compat                | ACTIVE   | FUNCTION           | NULL    | GPL     |
| is_ipv4_mapped                | ACTIVE   | FUNCTION           | NULL    | GPL     |
| CLIENT_STATISTICS             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INDEX_STATISTICS              | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| TABLE_STATISTICS              | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| USER_STATISTICS               | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| wsrep                         | ACTIVE   | REPLICATION        | NULL    | GPL     |
| SQL_SEQUENCE                  | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| PERFORMANCE_SCHEMA            | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| InnoDB                        | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| INNODB_TRX                    | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_LOCKS                  | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_LOCK_WAITS             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMP                    | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMP_RESET              | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMPMEM                 | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMPMEM_RESET           | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMP_PER_INDEX          | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMP_PER_INDEX_RESET    | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_BUFFER_PAGE            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_BUFFER_PAGE_LRU        | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_BUFFER_POOL_STATS      | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_METRICS                | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_DEFAULT_STOPWORD    | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_DELETED             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_BEING_DELETED       | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_CONFIG              | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_INDEX_CACHE         | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_INDEX_TABLE         | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_TABLES             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_TABLESTATS         | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_INDEXES            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_COLUMNS            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_FIELDS             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_FOREIGN            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_FOREIGN_COLS       | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_TABLESPACES        | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_DATAFILES          | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_VIRTUAL            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_MUTEXES                | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_SEMAPHORE_WAITS    | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_TABLESPACES_ENCRYPTION | ACTIVE   | INFORMATION SCHEMA | NULL    | BSD     |
| SEQUENCE                      | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| user_variables                | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| unix_socket                   | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| FEEDBACK                      | DISABLED | INFORMATION SCHEMA | NULL    | GPL     |
| THREAD_POOL_GROUPS            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| THREAD_POOL_QUEUES            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| THREAD_POOL_STATS             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| THREAD_POOL_WAITS             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| partition                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
+-------------------------------+----------+--------------------+---------+---------+
68 rows in set (0.002 sec)

You also have this query, that I found on the MariaDB web site:

SELECT plugin_name, plugin_version, plugin_maturity FROM information_schema.plugins ORDER BY plugin_name;

I had to configure the official MariaDB repository as explained in the documentation:

[root@server4 ~]# cat /etc/yum.repos.d/mariadb.repo
# MariaDB 10.5 RedHat repository list - created 2020-07-09 15:06 UTC
# http://downloads.mariadb.org/mariadb/repositories/
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.5/rhel8-amd64
module_hotfixes=1
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1

Once the repository is configured you can see what's available with:

dnf list mariadb*

I can see a MariaDB-columnstore-engine.x86_64 package but its installation will also install MariaDB-server.x86_64, which I do not want… So far I have not found a way to get just the .so file to inject this ColumnStore storage engine into my custom MariaDB Server installation…
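
An untested idea to pull only the plugin library out of the RPM, without installing it, would be rpm2cpio (the path inside the package should match the plugin_dir used later in this post, and keep in mind that the storage engine still needs the rest of the ColumnStore platform processes to actually work, so this is probably not enough on its own):

# Download the RPM without installing it (requires the dnf download plugin)
dnf download MariaDB-columnstore-engine
# Extract only the storage engine shared library from the package
rpm2cpio MariaDB-columnstore-engine-*.rpm | cpio -idmv './usr/lib64/mysql/plugin/ha_columnstore.so'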

[root@server4 ~]# mcsSetConfig CrossEngineSupport Host 127.0.0.1
[root@server4 ~]# mcsSetConfig CrossEngineSupport Port 3316
[root@server4 ~]# mcsSetConfig CrossEngineSupport User cross_engine
[root@server4 ~]# mcsSetConfig CrossEngineSupport Password cross_engine_passwd
MariaDB [(none)]> CREATE USER 'cross_engine'@'127.0.0.1' IDENTIFIED BY "cross_engine_passwd";
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> GRANT SELECT ON *.* TO 'cross_engine'@'127.0.0.1';
Query OK, 0 rows affected (0.001 sec)
[root@server4 ~]# systemctl status mariadb-columnstore
● mariadb-columnstore.service - mariadb-columnstore
   Loaded: loaded (/usr/lib/systemd/system/mariadb-columnstore.service; enabled; vendor preset: disabled)
   Active: active (exited) since Mon 2020-07-13 15:42:19 CEST; 3min 15s ago
  Process: 27960 ExecStop=/usr/bin/mariadb-columnstore-stop.sh (code=exited, status=0/SUCCESS)
  Process: 27998 ExecStart=/usr/bin/mariadb-columnstore-start.sh (code=exited, status=0/SUCCESS)
 Main PID: 27998 (code=exited, status=0/SUCCESS)

Jul 13 15:42:11 server4.domain.com systemd[1]: Stopped mariadb-columnstore.
Jul 13 15:42:11 server4.domain.com systemd[1]: Starting mariadb-columnstore...
Jul 13 15:42:12 server4.domain.com mariadb-columnstore-start.sh[27998]: Job for mcs-storagemanager.service failed because the control process exited with error code.
Jul 13 15:42:12 server4.domain.com mariadb-columnstore-start.sh[27998]: See "systemctl status mcs-storagemanager.service" and "journalctl -xe" for details.
Jul 13 15:42:19 server4.domain.com systemd[1]: Started mariadb-columnstore.
[root@server4 ~]# systemctl status mcs-storagemanager.service
● mcs-storagemanager.service - storagemanager
   Loaded: loaded (/usr/lib/systemd/system/mcs-storagemanager.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Mon 2020-07-13 15:42:14 CEST; 8min ago
  Process: 28010 ExecStartPre=/usr/bin/mcs-start-storagemanager.py (code=exited, status=1/FAILURE)

Jul 13 15:42:14 server4.domain.com systemd[1]: Starting storagemanager...
Jul 13 15:42:14 server4.domain.com mcs-start-storagemanager.py[28010]: S3 storage has not been set up for MariaDB ColumnStore. StorageManager service fails to start.
Jul 13 15:42:14 server4.domain.com systemd[1]: mcs-storagemanager.service: Control process exited, code=exited status=1
Jul 13 15:42:14 server4.domain.com systemd[1]: mcs-storagemanager.service: Failed with result 'exit-code'.
Jul 13 15:42:14 server4.domain.com systemd[1]: Failed to start storagemanager.
[root@server4 columnstore]# cat /var/log/mariadb/columnstore/debug.log
Jul 13 15:05:42 server4 IDBFile[26302]: 42.238256 |0|0|0| D 35 CAL0002: Failed to open file: /var/lib/columnstore/data1/systemFiles/dbrm/tablelocks, exception: unable to open Buffered file
Jul 13 15:05:42 server4 controllernode[26302]: 42.238358 |0|0|0| D 29 CAL0000: TableLockServer::load(): could not open the save file/var/lib/columnstore/data1/systemFiles/dbrm/tablelocks
Jul 13 15:42:17 server4 IDBFile[28020]: 17.117913 |0|0|0| D 35 CAL0002: Failed to open file: /var/lib/columnstore/data1/systemFiles/dbrm/tablelocks, exception: unable to open Buffered file
Jul 13 15:42:17 server4 controllernode[28020]: 17.118009 |0|0|0| D 29 CAL0000: TableLockServer::load(): could not open the save file/var/lib/columnstore/data1/systemFiles/dbrm/tablelocks
[root@server4 columnstore]# grep -v ^# /etc/columnstore/storagemanager.cnf | grep -v -e '^$'
[ObjectStorage]
service = LocalStorage
object_size = 5M
metadata_path = /mariadb/columnstore/storagemanager/metadata
journal_path = /mariadb/columnstore/storagemanager/journal
max_concurrent_downloads = 21

max_concurrent_uploads = 21
common_prefix_depth = 3
[S3]
region = some_region
bucket = some_bucket
[LocalStorage]
path = /mariadb/columnstore/storagemanager/fake-cloud
fake_latency = n
max_latency = 50000
[Cache]
cache_size = 2g
path = /mariadb/columnstore/storagemanager/cache
[mariadb@server4 mariadb]$ mkdir -p /mariadb/columnstore/storagemanager/fake-cloud
[mariadb@server4 mariadb]$ mkdir -p /mariadb/columnstore/storagemanager/cache
[mariadb@server4 mariadb]$ mkdir -p /mariadb/columnstore/storagemanager/metadata
[mariadb@server4 mariadb]$ mkdir -p /mariadb/columnstore/storagemanager/journal
[root@server4 ~]# mcsGetConfig -a | grep /var/lib
SystemConfig.DBRoot1 = /var/lib/columnstore/data1
SystemConfig.DBRMRoot = /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves
SystemConfig.TableLockSaveFile = /var/lib/columnstore/data1/systemFiles/dbrm/tablelocks
SessionManager.TxnIDFile = /var/lib/columnstore/data1/systemFiles/dbrm/SMTxnID
OIDManager.OIDBitmapFile = /var/lib/columnstore/data1/systemFiles/dbrm/oidbitmap
WriteEngine.BulkRoot = /var/lib/columnstore/data/bulk
WriteEngine.BulkRollbackDir = /var/lib/columnstore/data1/systemFiles/bulkRollback
[root@server4 ~]# mcsSetConfig SystemConfig DBRoot1 /mariadb/columnstore/data1
[root@server4 ~]# mcsGetConfig SystemConfig DBRoot1
/mariadb/columnstore/data1
[root@server4 ~]# mcsSetConfig SystemConfig DBRMRoot /mariadb/columnstore/data1/systemFiles/dbrm/BRM_saves
[root@server4 ~]# mcsSetConfig SystemConfig TableLockSaveFile /mariadb/columnstore/data1/systemFiles/dbrm/tablelocks
[root@server4 ~]# mcsSetConfig SessionManager TxnIDFile /mariadb/columnstore/data1/systemFiles/dbrm/SMTxnID
[root@server4 ~]# mcsSetConfig OIDManager OIDBitmapFile /mariadb/columnstore/data1/systemFiles/dbrm/oidbitmap
[root@server4 ~]# mcsSetConfig WriteEngine BulkRoot /mariadb/columnstore/data/bulk
[root@server4 ~]# mcsSetConfig WriteEngine BulkRollbackDir /mariadb/columnstore/data1/systemFiles/bulkRollback
[root@server4 ~]# mkdir -p /mariadb/columnstore/data1/systemFiles/dbrm/BRM_saves /mariadb/columnstore/data1/systemFiles/dbrm/tablelocks
[root@server4 ~]# mkdir -p /mariadb/columnstore/data1/systemFiles/dbrm/SMTxnID /mariadb/columnstore/data1/systemFiles/dbrm/SMTxnID
[root@server4 ~]# mkdir -p /mariadb/columnstore/data/bulk /mariadb/columnstore/data1/systemFiles/bulkRollback

Taking inspiration from the container version I have changed the plugin_dir variable and the plugin maturity allowance to:

# New plugin directory for Columnstore
plugin_dir                      = /usr/lib64/mysql/plugin
plugin_maturity                 = beta

The plugin_maturity parameter is there to avoid:

MariaDB [(none)]> INSTALL PLUGIN IF NOT EXISTS Columnstore SONAME 'ha_columnstore.so';
ERROR 1126 (HY000): Can't open shared library 'ha_columnstore.so' (errno: 1, Loading of beta plugin Columnstore is prohibited by --plugin-maturity=gamma)

And I tried to load the plugin with:

MariaDB [(none)]> show engines;
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                                         | Transactions | XA   | Savepoints |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| CSV                | YES     | Stores tables as CSV files                                                                      | NO           | NO   | NO         |
| MRG_MyISAM         | YES     | Collection of identical MyISAM tables                                                           | NO           | NO   | NO         |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                                       | NO           | NO   | NO         |
| Aria               | YES     | Crash-safe tables with MyISAM heritage. Used for internal temporary tables and privilege tables | NO           | NO   | NO         |
| MyISAM             | YES     | Non-transactional engine with good performance and small data footprint                         | NO           | NO   | NO         |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                                  | YES          | NO   | YES        |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, foreign keys and encryption for tables                | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                                              | NO           | NO   | NO         |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
8 rows in set (0.001 sec)

MariaDB [(none)]> INSTALL PLUGIN IF NOT EXISTS Columnstore SONAME 'ha_columnstore.so';
Query OK, 0 rows affected (0.111 sec)

MariaDB [(none)]> show engines;
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                                         | Transactions | XA   | Savepoints |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| Columnstore        | YES     | ColumnStore storage engine                                                                      | YES          | NO   | NO         |
| MRG_MyISAM         | YES     | Collection of identical MyISAM tables                                                           | NO           | NO   | NO         |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                                       | NO           | NO   | NO         |
| Aria               | YES     | Crash-safe tables with MyISAM heritage. Used for internal temporary tables and privilege tables | NO           | NO   | NO         |
| MyISAM             | YES     | Non-transactional engine with good performance and small data footprint                         | NO           | NO   | NO         |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                                  | YES          | NO   | YES        |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, foreign keys and encryption for tables                | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                                              | NO           | NO   | NO         |
| CSV                | YES     | Stores tables as CSV files                                                                      | NO           | NO   | NO         |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
9 rows in set (0.001 sec)

On paper it works, but the connection is lost as soon as you try to create a table with the ColumnStore storage engine…
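
For reference, the kind of statement that drops the client connection in this setup is a plain ColumnStore table creation; database and table names below are just examples, the second statement is the one that never completes here:

MariaDB [(none)]> CREATE DATABASE IF NOT EXISTS cstest;
MariaDB [(none)]> CREATE TABLE cstest.t1 (id INT, descr VARCHAR(50)) ENGINE=Columnstore;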

References
