Further tweaks to documentation and samples.

Anthony Tuininga 2020-12-08 11:47:53 -07:00
parent 30979c6a57
commit 3a23957f6d
10 changed files with 221 additions and 76 deletions

@ -84,7 +84,7 @@ Connection Object
.. method:: Connection.cancel()
Cancel a long-running transaction.
Break a long-running transaction.
.. note::

@ -26,11 +26,12 @@ Cursor Object
.. attribute:: Cursor.arraysize
This read-write attribute can be used to tune the number of rows internally
fetched and buffered by internal calls to the database. The value can
drastically affect the performance of a query since it directly affects the
number of network round trips between Python and the database. For methods
like :meth:`~Cursor.fetchone()` and :meth:`~Cursor.fetchall()` it does not
change how many rows are returned to the application. For
fetched and buffered by internal calls to the database when fetching rows
from SELECT statements and REF CURSORS. The value can drastically affect
the performance of a query since it directly affects the number of network
round trips between Python and the database. For methods like
:meth:`~Cursor.fetchone()` and :meth:`~Cursor.fetchall()` it does not change
how many rows are returned to the application. For
:meth:`~Cursor.fetchmany()` it is the default number of rows to fetch.
Due to the performance benefits, the default ``Cursor.arraysize`` is 100
@ -445,9 +446,9 @@ Cursor Object
.. attribute:: Cursor.prefetchrows
This read-write attribute can be used to tune the number of rows that the
Oracle Client library fetches when a query is executed. This value can
reduce the number of round-trips to the database that are required to
fetch rows but at the cost of additional memory. Setting this value to 0
Oracle Client library fetches when a SELECT statement is executed. This
value can reduce the number of round-trips to the database that are required
to fetch rows but at the cost of additional memory. Setting this value to 0
can be useful when the timing of fetches must be explicitly controlled.
See :ref:`Tuning Fetch Performance <tuningfetch>` for more information.
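As a rough illustration of how ``prefetchrows`` and ``arraysize`` interact, the sketch below estimates the number of round-trips needed to fetch all rows of a query. It is a simplified model rather than cx_Oracle code: the function name ``estimate_round_trips`` is hypothetical, and it assumes that the execute call returns up to ``prefetchrows`` rows, that each later fetch returns up to ``arraysize`` rows, and that end-of-fetch is only known once a batch comes back short.

```python
def estimate_round_trips(total_rows, arraysize, prefetchrows):
    """Estimate round-trips under a simplified model (an assumption,
    not measured cx_Oracle behavior)."""
    if total_rows < 0 or prefetchrows < 0 or arraysize < 1:
        raise ValueError("invalid row count or buffer size")

    # Round-trip 1: execute, which also returns up to prefetchrows rows.
    round_trips = 1
    prefetched = min(total_rows, prefetchrows)
    if prefetched < prefetchrows:
        # A short batch tells the client there are no more rows.
        return round_trips

    remaining = total_rows - prefetched
    while True:
        # Each further round-trip fetches up to arraysize rows.
        round_trips += 1
        batch = min(remaining, arraysize)
        remaining -= batch
        if batch < arraysize:
            return round_trips


if __name__ == "__main__":
    # Compare a few settings for a query that returns 20 rows.
    for arraysize, prefetchrows in [(20, 20), (20, 21), (10, 0)]:
        trips = estimate_round_trips(20, arraysize, prefetchrows)
        print(f"arraysize={arraysize} prefetchrows={prefetchrows}: {trips}")
```

Under this model, a 20-row query with ``arraysize`` 20 and ``prefetchrows`` 21 completes in one round-trip, while ``prefetchrows`` 20 needs a second round-trip just to discover that no rows remain.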

@ -7,12 +7,27 @@ SODA
`Oracle Database Simple Oracle Document Access (SODA)
<https://docs.oracle.com/en/database/oracle/simple-oracle-document-access>`__
allows documents to be inserted, queried, and retrieved from Oracle Database
using a set of NoSQL-style cx_Oracle methods.
using a set of NoSQL-style cx_Oracle methods. By default, documents are JSON
strings. See the :ref:`user manual <sodausermanual>` for examples.
See :ref:`user manual <sodausermanual>` for an example.
.. _sodarequirements:
-----------------
SODA Requirements
-----------------
To use SODA, the role SODA_APP must be granted to the user. To create
collections, users need the CREATE TABLE privilege. These can be granted by a
DBA:
.. code-block:: sql
SQL> grant soda_app, create table to myuser;
Advanced users who are using Oracle sequences for keys will also need the CREATE
SEQUENCE privilege.
SODA requires Oracle Client 18.3 or higher and Oracle Database 18.1 or higher.
The role SODA_APP must be granted to the user.
.. note::
@ -48,7 +63,8 @@ The role SODA_APP must be granted to the user.
<https://www.oracle.com/pls/topic/lookup?ctx=dblatest&
id=GUID-A2E90F08-BC9F-4688-A9D0-4A948DD3F7A9>`__ to 19 or lower.
Otherwise you may get errors such as "ORA-40659: Data type does not match
Otherwise you may get errors such as "ORA-40842: unsupported value JSON in
the metadata for the field sqlType" or "ORA-40659: Data type does not match
the specification in the collection metadata".
.. _sodadb:

@ -131,6 +131,8 @@ Linux, you might use::
$ python myapp.py 2> log.txt
.. _usinginitoracleclient:
Using cx_Oracle.init_oracle_client() to set the Oracle Client directory
-----------------------------------------------------------------------
@ -138,22 +140,31 @@ Applications can call the function :meth:`cx_Oracle.init_oracle_client()` to
specify the directory containing Oracle Instant Client libraries. The Oracle
Client Libraries are loaded when ``init_oracle_client()`` is called. For
example, if the Oracle Instant Client Libraries are in
``C:\oracle\instantclient_19_6`` on Windows, then you can use:
``C:\oracle\instantclient_19_9`` on Windows or
``$HOME/Downloads/instantclient_19_8`` on macOS, then you can use:
.. code-block:: python
import cx_Oracle
import sys
import os
try:
cx_Oracle.init_oracle_client(lib_dir=r"C:\oracle\instantclient_19_6")
if sys.platform.startswith("darwin"):
lib_dir = os.path.join(os.environ.get("HOME"), "Downloads",
"instantclient_19_8")
cx_Oracle.init_oracle_client(lib_dir=lib_dir)
elif sys.platform.startswith("win32"):
cx_Oracle.init_oracle_client(lib_dir=r"C:\oracle\instantclient_19_9")
except Exception as err:
print("Whoops!")
print(err)
sys.exit(1)
The :meth:`~cx_Oracle.init_oracle_client()` function should only be called
once.
Note the use of a 'raw' string ``r"..."`` on Windows so that backslashes are
treated as directory separators.
The :meth:`~cx_Oracle.init_oracle_client()` function can only be called once.
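One way to keep the platform check separate from the single ``init_oracle_client()`` call is sketched below. The helper names ``pick_lib_dir`` and ``init_client`` are hypothetical, the directory names are the example paths used above, and returning ``None`` on other platforms is an assumption that the system library search path is already configured there.

```python
import os
import sys


def pick_lib_dir(platform=sys.platform, home=None):
    """Return the example Instant Client directory for a platform, or None."""
    if platform.startswith("darwin"):
        # Fall back to an empty string so os.path.join() never receives None.
        home = home or os.environ.get("HOME", "")
        return os.path.join(home, "Downloads", "instantclient_19_8")
    if platform.startswith("win32"):
        # Raw string so the backslashes are kept as directory separators.
        return r"C:\oracle\instantclient_19_9"
    # Other platforms: assume the system library search path is used.
    return None


def init_client():
    """Call init_oracle_client() once, only when a directory is known."""
    import cx_Oracle  # imported here so pick_lib_dir() works without it

    lib_dir = pick_lib_dir()
    if lib_dir is not None:
        cx_Oracle.init_oracle_client(lib_dir=lib_dir)
```

An application would call ``init_client()`` once at startup, before creating any connection.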
If you set ``lib_dir`` on Linux and related platforms, you must still have
configured the system library search path to include that directory before

@ -487,8 +487,8 @@ To use cx_Oracle with Oracle Instant Client zip files:
2. Unzip the package into a directory that is accessible to your
application. For example unzip
``instantclient-basic-windows.x64-19.8.0.0.0dbru.zip`` to
``C:\oracle\instantclient_19_8``.
``instantclient-basic-windows.x64-19.9.0.0.0dbru.zip`` to
``C:\oracle\instantclient_19_9``.
3. Oracle Instant Client libraries require a Visual Studio redistributable with
a 64-bit or 32-bit architecture to match Instant Client's architecture.
@ -511,7 +511,7 @@ Configure Oracle Instant Client
.. code-block:: python
import cx_Oracle
cx_Oracle.init_oracle_client(lib_dir=r"C:\oracle\instantclient_19_8")
cx_Oracle.init_oracle_client(lib_dir=r"C:\oracle\instantclient_19_9")
Note a 'raw' string is used because backslashes occur in the path.
@ -523,7 +523,7 @@ Configure Oracle Instant Client
is executed, for example::
REM mypy.bat
SET PATH=C:\oracle\instantclient_19_8;%PATH%
SET PATH=C:\oracle\instantclient_19_9;%PATH%
python %*
Invoke this batch file every time you want to run Python.
@ -536,14 +536,14 @@ Configure Oracle Instant Client
.. code-block:: python
import cx_Oracle
cx_Oracle.init_oracle_client(lib_dir=r"C:\oracle\instantclient_19_8",
cx_Oracle.init_oracle_client(lib_dir=r"C:\oracle\instantclient_19_9",
config_dir=r"C:\oracle\your_config_dir")
Or set the environment variable ``TNS_ADMIN`` to that directory name.
Alternatively, put the files in a ``network\admin`` subdirectory of
Instant Client, for example in
``C:\oracle\instantclient_19_8\network\admin``. This is the default
``C:\oracle\instantclient_19_9\network\admin``. This is the default
Oracle configuration directory for executables linked with this
Instant Client.
@ -837,28 +837,48 @@ If using cx_Oracle fails:
- Do you get the error "``DPI-1047: Oracle Client library cannot be
loaded``"?
- Check that Python, cx_Oracle and your Oracle Client libraries
are all 64-bit or all 32-bit. The ``DPI-1047`` message will
tell you whether the 64-bit or 32-bit Oracle Client is needed
for your Python.
- On Windows and macOS, try using :meth:`~cx_Oracle.init_oracle_client()`.
See :ref:`usinginitoracleclient`.
- Check that Python and your Oracle Client libraries are both 64-bit, or
both 32-bit. The ``DPI-1047`` message will tell you whether the 64-bit
or 32-bit Oracle Client is needed for your Python.
- Set the environment variable ``DPI_DEBUG_LEVEL`` to 64 and restart
cx_Oracle. The trace messages will show how and where cx_Oracle is
looking for the Oracle Client libraries.
At a Windows command prompt, this could be done with::
set DPI_DEBUG_LEVEL=64
On Linux and macOS, you might use::
export DPI_DEBUG_LEVEL=64
- On Windows, if you used :meth:`~cx_Oracle.init_oracle_client()` and have
a full database installation, make sure this database is the `currently
configured database
<https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=GUID-33D575DD-47FF-42B1-A82F-049D3F2A8791>`__.
- On Windows, if you are not using
:meth:`~cx_Oracle.init_oracle_client()`, then restart your command prompt
and use ``set PATH`` to check the environment variable has the correct
Oracle Client listed before any other Oracle directories.
- On Windows, use the ``DIR`` command to verify that ``OCI.DLL`` exists in
the directory passed to ``init_oracle_client()`` or set in ``PATH``.
- On Windows, check that the correct `Windows Redistributables
<https://oracle.github.io/odpi/doc/installation.html#windows>`__ have
been installed.
- On Linux, check the ``LD_LIBRARY_PATH`` environment variable contains
the Oracle Client library directory. If you are using Oracle Instant
Client, a preferred alternative is to ensure a file in the
``/etc/ld.so.conf.d`` directory contains the path to the Instant Client
directory, and then run ``ldconfig``.
- On macOS, make sure you are not using the bundled Python (use `Homebrew
<https://brew.sh>`__ or `Python.org
<https://www.python.org/downloads>`__ instead). If you are not using

@ -18,10 +18,21 @@ SODA uses a SQL schema to store documents but you do not need to know SQL or
how the documents are stored. However, access via SQL does allow use of
advanced Oracle Database functionality such as analytics for reporting.
Oracle SODA implementations are also available in `Node.js
<https://oracle.github.io/node-oracledb/doc/api.html#sodaoverview>`__, `Java
<https://docs.oracle.com/en/database/oracle/simple-oracle-document-access/java/adsda/index.html>`__,
`PL/SQL <https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=ADSDP>`__,
`Oracle Call Interface
<https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=GUID-23206C89-891E-43D7-827C-5C6367AD62FD>`__
and via `REST
<https://docs.oracle.com/en/database/oracle/simple-oracle-document-access/rest/index.html>`__.
For general information on SODA, see the `SODA home page
<https://docs.oracle.com/en/database/oracle/simple-oracle-document-access/index.html>`__
and `Oracle Database Introduction to SODA
<https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=ADSDI>`__.
and the Oracle Database `Introduction to Simple Oracle Document Access (SODA)
<https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=ADSDI>`__ manual.
For specific requirements, see the cx_Oracle :ref:`SODA requirements <sodarequirements>`.
cx_Oracle uses the following objects for SODA:
@ -62,8 +73,8 @@ cx_Oracle uses the following objects for SODA:
then used by a terminal method to find, count, replace, or remove documents.
This is an internal object that should not be directly accessed.
SODA Example
============
SODA Examples
=============
Creating and adding documents to a collection can be done as follows:
@ -106,3 +117,39 @@ You can also search for documents using query-by-example syntax:
See the `samples directory
<https://github.com/oracle/python-cx_Oracle/tree/master/samples>`__
for runnable SODA examples.
--------------------
Committing SODA Work
--------------------
The general recommendation for SODA applications is to turn on
:attr:`~Connection.autocommit` globally:
.. code-block:: python
connection.autocommit = True
If your SODA document write operations are mostly independent of each other,
this removes the overhead of application transaction management and the need for
explicit :meth:`Connection.commit()` calls.
When deciding how to commit transactions, be aware of transactional consistency
and performance requirements. If you are using individual SODA calls to insert
or update a large number of documents, you should turn
:attr:`~Connection.autocommit` off and issue a single, explicit
:meth:`~Connection.commit()` after all documents have been processed. Also
consider using :meth:`SodaCollection.insertMany()` or
:meth:`SodaCollection.insertManyAndGet()` which have performance benefits.
If you are not autocommitting, and one of the SODA operations in your
transaction fails, then previous uncommitted operations will not be rolled back.
Your application should explicitly roll back the transaction with
:meth:`Connection.rollback()` to prevent any later commits from committing a
partial transaction.
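The explicit commit and rollback pattern described above could look roughly like the following sketch. The function name ``insert_all_or_nothing`` is hypothetical; it assumes ``connection`` is an open cx_Oracle connection and ``collection`` is a SODA collection that has already been opened or created.

```python
def insert_all_or_nothing(connection, collection, documents):
    """Insert every document in one transaction; roll back on any failure.

    Returns the number of documents inserted.
    """
    previous_autocommit = connection.autocommit
    connection.autocommit = False
    count = 0
    try:
        for doc in documents:
            collection.insertOne(doc)
            count += 1
        # A single explicit commit after all documents have been processed.
        connection.commit()
    except Exception:
        # Earlier inserts are not rolled back automatically, so do it here
        # to keep a later commit from saving a partial transaction.
        connection.rollback()
        raise
    finally:
        # Restore the caller's autocommit setting.
        connection.autocommit = previous_autocommit
    return count
```

If any ``insertOne()`` call raises, the exception still reaches the caller, but only after the partial work has been rolled back.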
Note:
- SODA DDL operations do not commit an open transaction the way that SQL always does for DDL statements.
- When :attr:`~Connection.autocommit` is ``True``, most SODA methods will issue a commit before successful return.
- SODA provides optimistic locking, see :meth:`SodaOperation.version()`.
- When mixing SODA and relational access, any commit or rollback on the connection will affect all work.

@ -13,22 +13,63 @@ import sample_env
connection = cx_Oracle.connect(sample_env.get_main_connect_string())
rows = [ (1, "First" ),
(2, "Second" ),
(3, "Third" ),
(4, "Fourth" ),
(5, "Fifth" ),
(6, "Sixth" ),
(7, "Seventh" ) ]
#------------------------------------------------------------------------------
# "Bind by position"
#------------------------------------------------------------------------------
rows = [
(1, "First"),
(2, "Second"),
(3, "Third"),
(4, "Fourth"),
(5, None), # Insert a NULL value
(6, "Sixth"),
(7, "Seventh")
]
cursor = connection.cursor()
# predefine maximum string size to avoid data scans and memory reallocations;
# the None value indicates that the default processing can take place
cursor.setinputsizes(None, 20)
cursor.executemany("insert into mytab(id, data) values (:1, :2)", rows)
# Don't commit - this lets us run the demo multiple times
#connection.commit()
#------------------------------------------------------------------------------
# "Bind by name"
#------------------------------------------------------------------------------
rows = [
{"d": "Eighth", "i": 8},
{"d": "Ninth", "i": 9},
{"d": "Tenth", "i": 10}
]
cursor = connection.cursor()
# Predefine maximum string size to avoid data scans and memory reallocations
cursor.setinputsizes(d=20)
cursor.executemany("insert into mytab(id, data) values (:i, :d)", rows)
#------------------------------------------------------------------------------
# Inserting a single bind still needs tuples
#------------------------------------------------------------------------------
rows = [
("Eleventh",),
("Twelfth",)
]
cursor = connection.cursor()
cursor.executemany("insert into mytab(id, data) values (11, :1)", rows)
#------------------------------------------------------------------------------
# Now query the results back
#------------------------------------------------------------------------------
# Don't commit - this lets the demo be run multiple times
#connection.commit()
for row in cursor.execute('select * from mytab'):
print(row)

@ -5,9 +5,9 @@
#------------------------------------------------------------------------------
# QueryArraysize.py
#
# Demonstrate how to alter the array size on a cursor in order to reduce the
# number of network round trips and overhead required to fetch all of the rows
# from a large table.
# Demonstrate how to alter the array size and prefetch rows value on a cursor
# in order to reduce the number of network round trips and overhead required to
# fetch all of the rows from a large table.
#------------------------------------------------------------------------------
import time
@ -19,6 +19,7 @@ connection = cx_Oracle.connect(sample_env.get_main_connect_string())
start = time.time()
cursor = connection.cursor()
cursor.prefetchrows = 1000
cursor.arraysize = 1000
cursor.execute('select * from bigtab')
res = cursor.fetchall()

@ -46,7 +46,7 @@
<li>3.2 Using fetchone()</li>
<li>3.3 Using fetchmany()</li>
<li>3.4 Scrollable cursors</li>
<li>3.5 Tuning with arraysize</li>
<li>3.5 Tuning with arraysize and prefetchrows</li>
</ul>
</li>
<li><a href="#binding">4. Binding Data</a>
@ -892,11 +892,15 @@ print(cur.fetchone())
</li>
<li><h4>3.5 Tuning with arraysize</h4>
<li><h4>3.5 Tuning with arraysize and prefetchrows</h4>
<p>This section demonstrates a way to improve query performance by
increasing the number of rows returned in each batch from Oracle to
the Python program.</p>
<p>This section demonstrates a way to improve query performance by increasing
the number of rows returned in each batch from Oracle to the Python
program.</p>
<p>Row prefetching and array fetching are both internal buffering techniques
to reduce round-trips to the database. The difference is the code layer that
is doing the buffering, and when the buffering occurs.</p>
<p>First, create a table with a large number of rows.
Review <code>query_arraysize.sql</code>:</p>
@ -919,7 +923,6 @@ commit;
<pre><strong>sqlplus /nolog @query_arraysize.sql</strong></pre>
<p>Review the code contained in <code>query_arraysize.py</code>:</p>
<pre>
@ -932,7 +935,8 @@ con = cx_Oracle.connect(db_config.user, db_config.pw, db_config.dsn)
start = time.time()
cur = con.cursor()
cur.arraysize = 10
cur.prefetchrows = 100
cur.arraysize = 100
cur.execute("select * from bigtab")
res = cur.fetchall()
# print(res) # uncomment to display the query results
@ -941,15 +945,14 @@ elapsed = (time.time() - start)
print(elapsed, "seconds")
</pre>
<p>This uses the 'time' module to measure elapsed time of the
query. The arraysize is set to 10. This causes batches of 10
records at a time to be returned from the database to a cache in
Python. This reduces the number of &quot;roundtrips&quot; made to
the database, often reducing network load and reducing the number
of context switches on the database server. The
<code>fetchone()</code>, <code>fetchmany()</code> and
<code>fetchall()</code> methods will read from the cache before
requesting more data from the database.</p>
<p>This uses the 'time' module to measure elapsed time of the query. The
prefetchrows and arraysize values are 100. This causes batches of 100
records at a time to be returned from the database to a cache in Python.
These values can be tuned to reduce the number of &quot;round-trips&quot;
made to the database, often reducing network load and reducing the number of
context switches on the database server. The <code>fetchone()</code>,
<code>fetchmany()</code> and <code>fetchall()</code> methods will read from
the cache before requesting more data from the database.</p>
<p>In a terminal window, run:</p>
@ -957,23 +960,27 @@ print(elapsed, "seconds")
<p>Rerun a few times to see the average times.</p>
<p>Experiment with different arraysize values. For example, edit
<code>query_arraysize.py</code> and change the arraysize to:</p>
<p>Experiment with different prefetchrows and arraysize values. For
example, edit <code>query_arraysize.py</code> and change the arraysize
to:</p>
<pre>cur.arraysize = <strong>2000</strong></pre>
<p>Rerun the script to compare the performance of different
arraysize settings.</p>
<p>In general, larger array sizes improve
performance. Depending on how fast your system is, you may need
to use different arraysizes than those given here to see a
meaningful time difference.</p>
<p>In general, larger array sizes improve performance. Depending on how
fast your system is, you may need to use different values than those
given here to see a meaningful time difference.</p>
<p>The default arraysize used by cx_Oracle is 100. There is a
time/space tradeoff for increasing the arraysize. Larger
arraysizes will require more memory in Python for buffering the
records.</p>
<p>There is a time/space tradeoff for increasing the values. Larger values
will require more memory in Python for buffering the records.</p>
<p>If you know the query returns a fixed number of rows, for example 20
rows, then set arraysize to 20 and prefetchrows to 21. The addition of one
for prefetchrows prevents a round-trip to check for end-of-fetch. The
statement execution and fetch will take a total of one round-trip. This
minimizes load on the database.</p>
<p>If you know a query only returns a few records,
decrease the arraysize from the default to reduce memory
@ -2426,12 +2433,12 @@ used to add a value to the list.</p>
like:</p>
<pre>
if sal &gt; 900000:
print('Salary is way too big')
elif sal &gt; 500000:
print('Salary is huge')
if v == 2 or v == 4:
print('Even')
elif v == 1 or v == 3:
print('Odd')
else:
print('Salary might be OK')
print('Unknown number')
</pre>
<p>This also shows how the clauses are delimited with colons, and each

@ -15,7 +15,8 @@ con = cx_Oracle.connect(db_config.user, db_config.pw, db_config.dsn)
start = time.time()
cur = con.cursor()
cur.arraysize = 10
cur.prefetchrows = 100
cur.arraysize = 100
cur.execute("select * from bigtab")
res = cur.fetchall()
# print(res) # uncomment to display the query results