Further tweaks to documentation and samples.

2020-12-08 11:47:53 -07:00 · 2020-12-08 11:47:53 -07:00 · 3a23957f6d
parent 30979c6a57
commit 3a23957f6d
10 changed files with 221 additions and 76 deletions
--- a/doc/src/api_manual/connection.rst
+++ b/doc/src/api_manual/connection.rst
@ -84,7 +84,7 @@ Connection Object

 .. method:: Connection.cancel()

-    Cancel a long-running transaction.
+    Break a long-running transaction.

    .. note::

--- a/doc/src/api_manual/cursor.rst
+++ b/doc/src/api_manual/cursor.rst
@ -26,11 +26,12 @@ Cursor Object
 .. attribute:: Cursor.arraysize

    This read-write attribute can be used to tune the number of rows internally
-    fetched and buffered by internal calls to the database.  The value can
-    drastically affect the performance of a query since it directly affects the
-    number of network round trips between Python and the database.  For methods
-    like :meth:`~Cursor.fetchone()` and :meth:`~Cursor.fetchall()` it does not
-    change how many rows are returned to the application. For
+    fetched and buffered by internal calls to the database when fetching rows
+    from SELECT statements and REF CURSORS.  The value can drastically affect
+    the performance of a query since it directly affects the number of network
+    round trips between Python and the database.  For methods like
+    :meth:`~Cursor.fetchone()` and :meth:`~Cursor.fetchall()` it does not change
+    how many rows are returned to the application. For
    :meth:`~Cursor.fetchmany()` it is the default number of rows to fetch.

    Due to the performance benefits, the default ``Cursor.arraysize`` is 100
@ -445,9 +446,9 @@ Cursor Object
 .. attribute:: Cursor.prefetchrows

    This read-write attribute can be used to tune the number of rows that the
-    Oracle Client library fetches when a query is executed. This value can
-    reduce the number of round-trips to the database that are required to
-    fetch rows but at the cost of additional memory. Setting this value to 0
+    Oracle Client library fetches when a SELECT statement is executed. This
+    value can reduce the number of round-trips to the database that are required
+    to fetch rows but at the cost of additional memory. Setting this value to 0
    can be useful when the timing of fetches must be explicitly controlled.

    See :ref:`Tuning Fetch Performance <tuningfetch>` for more information.
--- a/doc/src/api_manual/soda.rst
+++ b/doc/src/api_manual/soda.rst
@ -7,12 +7,27 @@ SODA
 `Oracle Database Simple Oracle Document Access (SODA)
 <https://docs.oracle.com/en/database/oracle/simple-oracle-document-access>`__
 allows documents to be inserted, queried, and retrieved from Oracle Database
-using a set of NoSQL-style cx_Oracle methods.
+using a set of NoSQL-style cx_Oracle methods. By default, documents are JSON
+strings. See the :ref:`user manual <sodausermanual>` for examples.

-See :ref:`user manual <sodausermanual>` for an example.
+.. _sodarequirements:
+
+-----------------
+SODA Requirements
+-----------------
+
+To use SODA, the role SODA_APP must be granted to the user.  To create
+collections, users need the CREATE TABLE privilege.  These can be granted by a
+DBA:
+
+.. code-block:: sql
+
+    SQL> grant soda_app, create table to myuser;
+
+Advanced users who are using Oracle sequences for keys will also need the CREATE
+SEQUENCE privilege.

 SODA requires Oracle Client 18.3 or higher and Oracle Database 18.1 and higher.
-The role SODA_APP must be granted to the user.

 .. note::

@ -48,7 +63,8 @@ The role SODA_APP must be granted to the user.
      <https://www.oracle.com/pls/topic/lookup?ctx=dblatest&
      id=GUID-A2E90F08-BC9F-4688-A9D0-4A948DD3F7A9>`__ to 19 or lower.

-    Otherwise you may get errors such as "ORA-40659: Data type does not match
+    Otherwise you may get errors such as "ORA-40842: unsupported value JSON in
+    the metadata for the field sqlType" or "ORA-40659: Data type does not match
    the specification in the collection metadata".

 .. _sodadb:
--- a/doc/src/user_guide/initialization.rst
+++ b/doc/src/user_guide/initialization.rst
@ -131,6 +131,8 @@ Linux, you might use::
    $ python myapp.py 2> log.txt


+.. _usinginitoracleclient:
+
 Using cx_Oracle.init_oracle_client() to set the Oracle Client directory
 -----------------------------------------------------------------------

@ -138,22 +140,31 @@ Applications can call the function :meth:`cx_Oracle.init_oracle_client()` to
 specify the directory containing Oracle Instant Client libraries.  The Oracle
 Client Libraries are loaded when ``init_oracle_client()`` is called.  For
 example, if the Oracle Instant Client Libraries are in
-``C:\oracle\instantclient_19_6`` on Windows, then you can use:
+``C:\oracle\instantclient_19_9`` on Windows or
+``$HOME/Downloads/instantclient_19_8`` on macOS, then you can use:

 .. code-block:: python

    import cx_Oracle
    import sys
+    import os

    try:
-        cx_Oracle.init_oracle_client(lib_dir=r"C:\oracle\instantclient_19_6")
+        if sys.platform.startswith("darwin"):
+            lib_dir = os.path.join(os.environ.get("HOME"), "Downloads",
+                                   "instantclient_19_8")
+            cx_Oracle.init_oracle_client(lib_dir=lib_dir)
+        elif sys.platform.startswith("win32"):
+            cx_Oracle.init_oracle_client(lib_dir=r"C:\oracle\instantclient_19_9")
    except Exception as err:
        print("Whoops!")
        print(err);
        sys.exit(1);

-The :meth:`~cx_Oracle.init_oracle_client()` function should only be called
-once.
+Note the use of a 'raw' string ``r"..."`` on Windows so that backslashes are
+treated as literal directory separators rather than escape characters.
+
+The :meth:`~cx_Oracle.init_oracle_client()` function can only be called once.

 If you set ``lib_dir`` on Linux and related platforms, you must still have
 configured the system library search path to include that directory before
--- a/doc/src/user_guide/installation.rst
+++ b/doc/src/user_guide/installation.rst
@ -487,8 +487,8 @@ To use cx_Oracle with Oracle Instant Client zip files:

 2. Unzip the package into a directory that is accessible to your
   application. For example unzip
-   ``instantclient-basic-windows.x64-19.8.0.0.0dbru.zip`` to
-   ``C:\oracle\instantclient_19_8``.
+   ``instantclient-basic-windows.x64-19.9.0.0.0dbru.zip`` to
+   ``C:\oracle\instantclient_19_9``.

 3. Oracle Instant Client libraries require a Visual Studio redistributable with
   a 64-bit or 32-bit architecture to match Instant Client's architecture.
@ -511,7 +511,7 @@ Configure Oracle Instant Client
     .. code-block:: python

         import cx_Oracle
-         cx_Oracle.init_oracle_client(lib_dir=r"C:\oracle\instantclient_19_8")
+         cx_Oracle.init_oracle_client(lib_dir=r"C:\oracle\instantclient_19_9")

     Note a 'raw' string is used because backslashes occur in the path.

@ -523,7 +523,7 @@ Configure Oracle Instant Client
     is executed, for example::

         REM mypy.bat
-         SET PATH=C:\oracle\instantclient_19_8;%PATH%
+         SET PATH=C:\oracle\instantclient_19_9;%PATH%
         python %*

     Invoke this batch file every time you want to run Python.
@ -536,14 +536,14 @@ Configure Oracle Instant Client
   .. code-block:: python

       import cx_Oracle
-       cx_Oracle.init_oracle_client(lib_dir=r"C:\oracle\instantclient_19_8",
+       cx_Oracle.init_oracle_client(lib_dir=r"C:\oracle\instantclient_19_9",
                                    config_dir=r"C:\oracle\your_config_dir")

   Or set the environment variable ``TNS_ADMIN`` to that directory name.

   Alternatively, put the files in a ``network\admin`` subdirectory of
   Instant Client, for example in
-   ``C:\oracle\instantclient_19_8\network\admin``.  This is the default
+   ``C:\oracle\instantclient_19_9\network\admin``.  This is the default
   Oracle configuration directory for executables linked with this
   Instant Client.

@ -837,28 +837,48 @@ If using cx_Oracle fails:
    - Do you get the error "``DPI-1047: Oracle Client library cannot be
      loaded``"?

-      - Check that Python, cx_Oracle and your Oracle Client libraries
-        are all 64-bit or all 32-bit.  The ``DPI-1047`` message will
-        tell you whether the 64-bit or 32-bit Oracle Client is needed
-        for your Python.
+      - On Windows and macOS, try using :meth:`~cx_Oracle.init_oracle_client()`.
+        See :ref:`usinginitoracleclient`.
+
+      - Check that Python and your Oracle Client libraries are both 64-bit, or
+        both 32-bit.  The ``DPI-1047`` message will tell you whether the 64-bit
+        or 32-bit Oracle Client is needed for your Python.
+
+      - Set the environment variable ``DPI_DEBUG_LEVEL`` to 64 and restart
+        cx_Oracle.  The trace messages will show how and where cx_Oracle is
+        looking for the Oracle Client libraries.
+
+        At a Windows command prompt, this could be done with::
+
+            set DPI_DEBUG_LEVEL=64
+
+        On Linux and macOS, you might use::
+
+              export DPI_DEBUG_LEVEL=64
+
      - On Windows, if you used :meth:`~cx_Oracle.init_oracle_client()` and have
        a full database installation, make sure this database is the `currently
        configured database
        <https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=GUID-33D575DD-47FF-42B1-A82F-049D3F2A8791>`__.
+
      - On Windows, if you are not using
        :meth:`~cx_Oracle.init_oracle_client()`, then restart your command prompt
        and use ``set PATH`` to check the environment variable has the correct
        Oracle Client listed before any other Oracle directories.
+
      - On Windows, use the ``DIR`` command to verify that ``OCI.DLL`` exists in
        the directory passed to ``init_oracle_client()`` or set in ``PATH``.
+
      - On Windows, check that the correct `Windows Redistributables
        <https://oracle.github.io/odpi/doc/installation.html#windows>`__ have
        been installed.
+
      - On Linux, check the ``LD_LIBRARY_PATH`` environment variable contains
        the Oracle Client library directory. If you are using Oracle Instant
        Client, a preferred alternative is to ensure a file in the
        ``/etc/ld.so.conf.d`` directory contains the path to the Instant Client
        directory, and then run ``ldconfig``.
+
      - On macOS, make sure you are not using the bundled Python (use `Homebrew
        <https://brew.sh>`__ or `Python.org
        <https://www.python.org/downloads>`__ instead).  If you are not using
--- a/doc/src/user_guide/soda.rst
+++ b/doc/src/user_guide/soda.rst
@ -18,10 +18,21 @@ SODA uses a SQL schema to store documents but you do not need to know SQL or
 how the documents are stored. However, access via SQL does allow use of
 advanced Oracle Database functionality such as analytics for reporting.

+Oracle SODA implementations are also available in `Node.js
+<https://oracle.github.io/node-oracledb/doc/api.html#sodaoverview>`__, `Java
+<https://docs.oracle.com/en/database/oracle/simple-oracle-document-access/java/adsda/index.html>`__,
+`PL/SQL <https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=ADSDP>`__,
+`Oracle Call Interface
+<https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=GUID-23206C89-891E-43D7-827C-5C6367AD62FD>`__
+and via `REST
+<https://docs.oracle.com/en/database/oracle/simple-oracle-document-access/rest/index.html>`__.
+
 For general information on SODA, see the `SODA home page
 <https://docs.oracle.com/en/database/oracle/simple-oracle-document-access/index.html>`__
-and `Oracle Database Introduction to SODA
-<https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=ADSDI>`__.
+and the Oracle Database `Introduction to Simple Oracle Document Access (SODA)
+<https://www.oracle.com/pls/topic/lookup?ctx=dblatest&id=ADSDI>`__ manual.
+
+For specific requirements, see the cx_Oracle :ref:`SODA requirements <sodarequirements>`.

 cx_Oracle uses the following objects for SODA:

@ -62,8 +73,8 @@ cx_Oracle uses the following objects for SODA:
  then used by a terminal method to find, count, replace, or remove documents.
  This is an internal object that should not be directly accessed.

-SODA Example
-============
+SODA Examples
+=============

 Creating and adding documents to a collection can be done as follows:

@ -106,3 +117,39 @@ You can also search for documents using query-by-example syntax:
 See the `samples directory
 <https://github.com/oracle/python-cx_Oracle/tree/master/samples>`__
 for runnable SODA examples.
+
+--------------------
+Committing SODA Work
+--------------------
+
+The general recommendation for SODA applications is to turn on
+:attr:`~Connection.autocommit` globally:
+
+.. code-block:: python
+
+    connection.autocommit = True
+
+If your SODA document write operations are mostly independent of each other,
+this removes the overhead of application transaction management and the need for
+explicit :meth:`Connection.commit()` calls.
+
+When deciding how to commit transactions, be aware of transactional consistency
+and performance requirements.  If you are using individual SODA calls to insert
+or update a large number of documents, you should turn
+:attr:`~Connection.autocommit` off and issue a single, explicit
+:meth:`~Connection.commit()` after all documents have been processed.  Also
+consider using :meth:`SodaCollection.insertMany()` or
+:meth:`SodaCollection.insertManyAndGet()` which have performance benefits.
+
+If you are not autocommitting, and one of the SODA operations in your
+transaction fails, then previous uncommitted operations will not be rolled back.
+Your application should explicitly roll back the transaction with
+:meth:`Connection.rollback()` to prevent any later commits from committing a
+partial transaction.
+
+Note:
+
+- SODA DDL operations do not commit an open transaction the way that SQL always does for DDL statements.
+- When :attr:`~Connection.autocommit` is ``True``, most SODA methods will issue a commit before successful return.
+- SODA provides optimistic locking; see :meth:`SodaOperation.version()`.
+- When mixing SODA and relational access, any commit or rollback on the connection will affect all work.
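
The commit-and-rollback guidance added above can be sketched as a small helper. This is a minimal illustration, not part of the commit: the function name ``insert_documents`` is hypothetical, and it assumes the caller passes in an open cx_Oracle connection and a SODA collection that expose ``autocommit``, ``commit()``, ``rollback()`` and ``insertOne()`` as described in the documentation.

```python
def insert_documents(connection, collection, documents):
    """Insert every document in one transaction; roll back if any insert fails.

    Hypothetical helper: `connection` is assumed to behave like a cx_Oracle
    Connection and `collection` like a SodaCollection.
    """
    # Turn autocommit off so the inserts form a single transaction.
    connection.autocommit = False
    inserted = 0
    try:
        for doc in documents:
            collection.insertOne(doc)
            inserted += 1
        # One explicit commit after all documents have been processed.
        connection.commit()
    except Exception:
        # Earlier inserts are not undone automatically, so roll back
        # explicitly to keep a later commit from saving partial work.
        connection.rollback()
        raise
    return inserted
```

For large batches, the ``SodaCollection.insertMany()`` method mentioned above may be a better fit than a loop of single inserts.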
--- a/samples/BindInsert.py
+++ b/samples/BindInsert.py
@ -13,22 +13,63 @@ import sample_env

 connection = cx_Oracle.connect(sample_env.get_main_connect_string())

-rows = [ (1, "First" ),
-         (2, "Second" ),
-         (3, "Third" ),
-         (4, "Fourth" ),
-         (5, "Fifth" ),
-         (6, "Sixth" ),
-         (7, "Seventh" ) ]
+#------------------------------------------------------------------------------
+# "Bind by position"
+#------------------------------------------------------------------------------
+
+rows = [
+    (1, "First"),
+    (2, "Second"),
+    (3, "Third"),
+    (4, "Fourth"),
+    (5, None),     # Insert a NULL value
+    (6, "Sixth"),
+    (7, "Seventh")
+]

 cursor = connection.cursor()
+
+# Predefine maximum string size to avoid data scans and memory reallocations;
+# the None value indicates that the default processing can take place
+cursor.setinputsizes(None, 20)
+
 cursor.executemany("insert into mytab(id, data) values (:1, :2)", rows)

-# Don't commit - this lets us run the demo multiple times
-#connection.commit()
+#------------------------------------------------------------------------------
+# "Bind by name"
+#------------------------------------------------------------------------------

+rows = [
+    {"d": "Eighth", "i": 8},
+    {"d": "Ninth",  "i": 9},
+    {"d": "Tenth",  "i": 10}
+]
+
+cursor = connection.cursor()
+
+# Predefine maximum string size to avoid data scans and memory reallocations
+cursor.setinputsizes(d=20)
+
+cursor.executemany("insert into mytab(id, data) values (:i, :d)", rows)
+
+#------------------------------------------------------------------------------
+# Inserting a single bind still needs tuples
+#------------------------------------------------------------------------------
+
+rows = [
+    ("Eleventh",),
+    ("Twelfth",)
+]
+
+cursor = connection.cursor()
+cursor.executemany("insert into mytab(id, data) values (11, :1)", rows)
+
+#------------------------------------------------------------------------------
 # Now query the results back
+#------------------------------------------------------------------------------
+
+# Don't commit - this lets the demo be run multiple times
+#connection.commit()

 for row in cursor.execute('select * from mytab'):
    print(row)
-
--- a/samples/QueryArraysize.py
+++ b/samples/QueryArraysize.py
@ -5,9 +5,9 @@
 #------------------------------------------------------------------------------
 # QueryArraysize.py
 #
-# Demonstrate how to alter the array size on a cursor in order to reduce the
-# number of network round trips and overhead required to fetch all of the rows
-# from a large table.
+# Demonstrate how to alter the array size and prefetch rows value on a cursor
+# in order to reduce the number of network round trips and overhead required to
+# fetch all of the rows from a large table.
 #------------------------------------------------------------------------------

 import time
@ -19,6 +19,7 @@ connection = cx_Oracle.connect(sample_env.get_main_connect_string())
 start = time.time()

 cursor = connection.cursor()
+cursor.prefetchrows = 1000
 cursor.arraysize = 1000
 cursor.execute('select * from bigtab')
 res = cursor.fetchall()
--- a/samples/tutorial/Python-and-Oracle-Database-Scripting-for-the-Future.html
+++ b/samples/tutorial/Python-and-Oracle-Database-Scripting-for-the-Future.html
@ -46,7 +46,7 @@
          <li>3.2 Using fetchone()</li>
          <li>3.3 Using fetchmany()</li>
          <li>3.4 Scrollable cursors</li>
-          <li>3.5 Tuning with arraysize</li>
+          <li>3.5 Tuning with arraysize and prefetchrows</li>
        </ul>
      </li>
      <li><a href="#binding">4. Binding Data</a>
@ -892,11 +892,15 @@ print(cur.fetchone())

  </li>

-  <li><h4>3.5 Tuning with arraysize</h4>
+  <li><h4>3.5 Tuning with arraysize and prefetchrows</h4>

-  <p>This section demonstrates a way to improve query performance by
-  increasing the number of rows returned in each batch from Oracle to
-  the Python program.</p>
+  <p>This section demonstrates a way to improve query performance by increasing
+  the number of rows returned in each batch from Oracle to the Python
+  program.</p>
+
+  <p>Row prefetching and array fetching are both internal buffering techniques
+  to reduce round-trips to the database. The difference is the code layer that
+  is doing the buffering, and when the buffering occurs.</p>

  <p>First, create a table with a large number of rows.
  Review <code>query_arraysize.sql</code>:</p>
@ -919,7 +923,6 @@ commit;

    <pre><strong>sqlplus /nolog @query_arraysize.sql</strong></pre>

-
    <p>Review the code contained in <code>query_arraysize.py</code>:</p>

 <pre>
@ -932,7 +935,8 @@ con = cx_Oracle.connect(db_config.user, db_config.pw, db_config.dsn)
 start = time.time()

 cur = con.cursor()
-cur.arraysize = 10
+cur.prefetchrows = 100
+cur.arraysize = 100
 cur.execute("select * from bigtab")
 res = cur.fetchall()
 # print(res)  # uncomment to display the query results
@ -941,15 +945,14 @@ elapsed = (time.time() - start)
 print(elapsed, "seconds")
 </pre>

-    <p>This uses the 'time' module to measure elapsed time of the
-    query. The arraysize is set to 10. This causes batches of 10
-    records at a time to be returned from the database to a cache in
-    Python. This reduces the number of &quot;roundtrips&quot; made to
-    the database, often reducing network load and reducing the number
-    of context switches on the database server. The
-    <code>fetchone()</code>, <code>fetchmany()</code> and
-    <code>fetchall()</code> methods will read from the cache before
-    requesting more data from the database.</p>
+    <p>This uses the 'time' module to measure elapsed time of the query. The
+    prefetchrows and arraysize values are 100.  This causes batches of 100
+    records at a time to be returned from the database to a cache in Python.
+    These values can be tuned to reduce the number of &quot;round-trips&quot;
+    made to the database, often reducing network load and reducing the number of
+    context switches on the database server. The <code>fetchone()</code>,
+    <code>fetchmany()</code> and <code>fetchall()</code> methods will read from
+    the cache before requesting more data from the database.</p>

    <p>In a terminal window, run:</p>

@ -957,23 +960,27 @@ print(elapsed, "seconds")

    <p>Rerun a few times to see the average times.</p>

-    <p>Experiment with different arraysize values.  For example, edit
-    <code>query_arraysize.py</code> and change the arraysize to:</p>
+    <p>Experiment with different prefetchrows and arraysize values.  For
+    example, edit <code>query_arraysize.py</code> and change the arraysize
+    to:</p>

    <pre>cur.arraysize = <strong>2000</strong></pre>

    <p>Rerun the script to compare the performance of different
    arraysize settings.</p>

-    <p>In general, larger array sizes improve
-    performance.  Depending on how fast your system is, you may need
-    to use different arraysizes than those given here to see a
-    meaningful time difference.</p>
+    <p>In general, larger array sizes improve performance.  Depending on how
+    fast your system is, you may need to use different values than those
+    given here to see a meaningful time difference.</p>

-    <p>The default arraysize used by cx_Oracle is 100. There is a
-    time/space tradeoff for increasing the arraysize. Larger
-    arraysizes will require more memory in Python for buffering the
-    records.</p>
+    <p>There is a time/space tradeoff for increasing the values. Larger values
+    will require more memory in Python for buffering the records.</p>
+
+    <p>If you know the query returns a fixed number of rows, for example 20
+    rows, then set arraysize to 20 and prefetchrows to 21.  The addition of one
+    for prefetchrows prevents a round-trip to check for end-of-fetch.  The
+    statement execution and fetch will take a total of one round-trip.  This
+    minimizes load on the database.</p>

    <p>If you know a query only returns a few records,
    decrease the arraysize from the default to reduce memory
@ -2426,12 +2433,12 @@ used to add a value to the list.</p>
 like:</p>

 <pre>
-if sal &gt; 900000:
-    print('Salary is way too big')
-elif sal &gt; 500000:
-    print('Salary is huge')
+if v == 2 or v == 4:
+    print('Even')
+elif v == 1 or v == 3:
+    print('Odd')
 else:
-    print('Salary might be OK')
+    print('Unknown number')
 </pre>

 <p>This also shows how the clauses are delimited with colons, and each
--- a/samples/tutorial/query_arraysize.py
+++ b/samples/tutorial/query_arraysize.py
@ -15,7 +15,8 @@ con = cx_Oracle.connect(db_config.user, db_config.pw, db_config.dsn)
 start = time.time()

 cur = con.cursor()
-cur.arraysize = 10
+cur.prefetchrows = 100
+cur.arraysize = 100
 cur.execute("select * from bigtab")
 res = cur.fetchall()
 # print(res)  # uncomment to display the query results
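
The tutorial text in this commit says that a 20-row query with ``arraysize`` 20 and ``prefetchrows`` 21 completes in one round-trip. The sketch below is a rough counting model of that interaction, not a measurement and not code from cx_Oracle. The function name ``estimated_round_trips`` is hypothetical, and the counting rule is an assumption: the execute call returns up to ``prefetchrows`` rows, each later fetch returns up to ``arraysize`` rows, and fetching stops when a batch comes back short.

```python
def estimated_round_trips(num_rows, prefetchrows, arraysize):
    """Estimate round-trips for a query under a simplified, assumed model."""
    if arraysize < 1 or prefetchrows < 0 or num_rows < 0:
        raise ValueError("num_rows and prefetchrows must be >= 0, arraysize >= 1")
    # The execute call is one round-trip. If it returns fewer rows than
    # prefetchrows, the end of the result set is already known.
    if num_rows < prefetchrows:
        return 1
    # Otherwise each further fetch asks for arraysize rows, and one final
    # short (possibly empty) batch is needed to detect end-of-fetch.
    remaining = num_rows - prefetchrows
    return 1 + remaining // arraysize + 1


# Compare the two 20-row settings discussed in the tutorial text.
for prefetch in (20, 21):
    trips = estimated_round_trips(20, prefetch, 20)
    print("prefetchrows =", prefetch, "arraysize = 20 ->", trips, "round-trip(s)")
```

Under this model the extra prefetched row is what removes the second round-trip. Real timings depend on row width, network latency, and client and server versions, so measuring with ``query_arraysize.py`` remains the reliable check.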