.. _bulk-import:

Bulk Import
===========

OpenERP has included a bulk import facility for CSV-ish files for a
long time. With 7.0, both the interface and internal implementation
have been redone, resulting in
:meth:`~openerp.osv.orm.BaseModel.load`.

.. note::

    The previous bulk-loading method,
    :meth:`~openerp.osv.orm.BaseModel.import_data`, remains for
    backwards compatibility but was re-implemented on top of
    :meth:`~openerp.osv.orm.BaseModel.load`. While its interface is
    unchanged, its precise behavior has likely been altered in some
    cases (it should no longer throw exceptions in many cases where
    it previously did).

This document attempts to explain the behavior and limitations of
:meth:`~openerp.osv.orm.BaseModel.load`.

Data
----

The input ``data`` is a regular row-major matrix of strings (in Python
datatype terms, a ``list`` of rows, each row being a ``list`` of
``str``). All rows must be of equal length: each row must be the same
length as the ``fields`` list preceding it in the argument list.

Each field of ``fields`` maps to a (potentially relational and nested)
field of the model under import, and the corresponding column of the
``data`` matrix provides a value for the field for each record.

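
As a minimal sketch of the shape described above, the following builds a
``fields`` list and a ``data`` matrix for a hypothetical model with a
``name`` and a ``count`` field. The helpers ``check_matrix`` and ``column``
are illustrative only and are not part of OpenERP.

.. code-block:: python

    # Illustrative input for a hypothetical model (field names are made up).
    fields = ["name", "count"]
    data = [
        ["first record", "1"],
        ["second record", "42"],
    ]

    def check_matrix(fields, data):
        """Return the indexes of rows whose length differs from ``fields``."""
        return [i for i, row in enumerate(data) if len(row) != len(fields)]

    def column(fields, data, field):
        """Return the column of ``data`` providing the values for ``field``."""
        index = fields.index(field)
        return [row[index] for row in data]

Every cell is a string, including numeric ones: turning ``"42"`` into an
integer is the job of the conversion phase, not of the caller.
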

Generally speaking, each row of the input yields a record of output,
and each cell of a row yields a value for the corresponding field of
the row's record. There is currently one exception to this rule:

One to Many fields
++++++++++++++++++

Because O2M fields contain multiple records "embedded" in the main
one, and these sub-records are fully dependent on the main record
(there are no other references to the sub-records in the system),
they have to be spliced into the matrix somehow. This is done by
adding lines composed *only* of o2m record fields below the main
record:

.. literalinclude:: 06_misc_import_o2m.txt

The sections in double lines represent the span of two o2m
fields. During parsing, they are extracted into their own ``data``
matrix for the o2m field they correspond to.

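
The splicing can be sketched with a small hypothetical matrix. The field
names, the slash-separated ``lines/label`` notation (modelled on the
``m2o/id`` notation used in this document) and both helpers are
assumptions made for illustration; the real parser may differ in its
details.

.. code-block:: python

    # Hypothetical fields: "name" on the main record and two sub-fields
    # of an o2m field called "lines".
    fields = ["name", "lines/label", "lines/qty"]
    data = [
        ["order A", "apples", "3"],   # main record and its first sub-record
        ["", "pears", "5"],           # only o2m cells: continues "order A"
        ["order B", "plums", "1"],    # next main record
    ]

    def is_continuation(fields, row):
        """True if every non-empty cell of ``row`` belongs to a sub-field."""
        return all(not cell or "/" in field
                   for field, cell in zip(fields, row))

    def sub_matrix(fields, rows, o2m):
        """Extract the sub-field names and ``data`` matrix of field ``o2m``."""
        prefix = o2m + "/"
        indexes = [i for i, f in enumerate(fields) if f.startswith(prefix)]
        subfields = [fields[i][len(prefix):] for i in indexes]
        return subfields, [[row[i] for i in indexes] for row in rows]

Here the first two rows form the span of ``order A``, and
``sub_matrix(fields, data[:2], "lines")`` yields the nested ``fields`` and
``data`` for its two sub-records.
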

Import process
--------------

Here are the phases of import. Note that the concept of "phases" is
fuzzy, as it is currently more of a pipeline: each record moves through
the entire pipeline before the next one is processed.

Extraction
++++++++++

The first phase of the import is the extraction of the current row
(and potentially a section of rows following it if it has One to Many
fields) into a record dictionary. The keys are the ``fields``
originally passed to :meth:`~openerp.osv.orm.BaseModel.load`, and the
values are either the string value at the corresponding cell (for
non-relational fields) or a list of sub-records (for all relational
fields).

This phase also generates the ``rows`` indexes for any
:ref:`import-message` produced thereafter.

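
A simplified sketch of this phase follows. It assumes that relational
sub-fields are written ``field/subfield``, that a row continues the
previous record when all of its non-relational cells are empty, and that
the ``rows`` bounds are inclusive. None of this is taken from the actual
implementation, and ``extract`` is a hypothetical name.

.. code-block:: python

    def extract(fields, data):
        """Yield ``(record, rows)`` pairs, one per main record."""
        plain = [i for i, f in enumerate(fields) if "/" not in f]
        index = 0
        while index < len(data):
            # Find the end of the span: following rows with no plain cell.
            end = index + 1
            while end < len(data) and not any(data[end][i] for i in plain):
                end += 1
            record = {fields[i]: data[index][i] for i in plain}
            for i, field in enumerate(fields):
                if "/" in field:
                    name, sub = field.split("/", 1)
                    subrecords = record.setdefault(
                        name, [{} for _ in range(index, end)])
                    for offset, subrecord in enumerate(subrecords):
                        subrecord[sub] = data[index + offset][i]
            yield record, {"from": index, "to": end - 1}
            index = end

    fields = ["name", "lines/label", "lines/qty"]
    data = [
        ["order A", "apples", "3"],
        ["", "pears", "5"],
        ["order B", "plums", "1"],
    ]
    records = list(extract(fields, data))

The first extracted record spans rows 0 to 1 and carries two sub-records
under ``lines``; the second spans row 2 only.
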

Conversion
++++++++++

This second phase takes the record dicts, extracts the :term:`database
ID` and :term:`external ID` if present and attempts to convert each
field to a type matching what OpenERP expects to write.

* Empty fields (empty strings) are replaced with the ``False`` value

* Non-empty fields are converted through
  :class:`~openerp.addons.base.ir.ir_fields.ir_fields_converter`

.. note:: if a field is specified in the import, its default will *never* be
          used. If some records need to have a value and others need to use
          the model's default, either specify that default explicitly or do
          the import in two phases.

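
The two rules above can be sketched as follows. ``convert_record`` and its
``converters`` mapping are hypothetical stand-ins for the real converter
class, used only to show that an empty cell becomes ``False`` rather than
the model's default.

.. code-block:: python

    def convert_record(record, converters):
        """Convert the string cells of ``record`` with per-field callables."""
        converted = {}
        for field, value in record.items():
            if value == "":
                # An empty cell becomes False; it never falls back to the
                # model's default value.
                converted[field] = False
            else:
                converted[field] = converters[field](value)
        return converted

For instance, ``convert_record({"name": "x", "count": ""}, {"name": str,
"count": int})`` sets ``count`` to ``False`` even if the model declares a
default for it.
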

Char, text and binary fields
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

These are returned as-is, without any alteration.

Boolean fields
~~~~~~~~~~~~~~

The string value is compared (in a case-insensitive manner) to ``0``,
``false`` and ``no`` as well as any translation thereof loaded in the
database. If the value matches one of these, the field is set to
``False``.

Otherwise the field is compared to ``1``, ``true`` and ``yes`` (and
any translation of these in the database). The field is always set to
``True``, but if the value does not match one of these a warning will
also be output.

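
The boolean rules can be sketched as below. Translations are left out
(only the untranslated literals are checked), and the warning text is made
up; ``to_boolean`` is a hypothetical helper, not the converter's actual
method.

.. code-block:: python

    FALSE_VALUES = {"0", "false", "no"}
    TRUE_VALUES = {"1", "true", "yes"}

    def to_boolean(value, false_values=FALSE_VALUES, true_values=TRUE_VALUES):
        """Return ``(converted, warnings)`` for a non-empty boolean cell."""
        lowered = value.lower()
        if lowered in false_values:
            return False, []
        if lowered in true_values:
            return True, []
        # Anything else is still True, but the caller is warned about it.
        return True, ["Unknown value %r for boolean field, assuming True" % value]

Note the asymmetry: an unrecognized value such as ``"maybe"`` does not
fail the import, it yields ``True`` together with a warning.
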

Integer and float fields
~~~~~~~~~~~~~~~~~~~~~~~~

The field is parsed with Python's built-in conversion routines
(``int`` and ``float`` respectively). If the conversion fails, an error
is generated.

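
A minimal sketch of that behavior, where ``to_number`` and its error text
are hypothetical and the ``ValueError`` raised by the built-ins is turned
into an error string:

.. code-block:: python

    def to_number(value, kind):
        """Return ``(converted, error)``; ``kind`` is ``int`` or ``float``."""
        try:
            return kind(value), None
        except ValueError:
            return None, "%r does not seem to be a valid %s" % (
                value, kind.__name__)

Since the built-ins are used directly, ``"1.5"`` is a valid float cell but
an invalid integer cell.
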

Selection fields
~~~~~~~~~~~~~~~~

The field is compared to 1. the values of the selection (first part of
each selection tuple) and 2. all translations of the selection label
found in the database.

If one of these is matched, the corresponding value is set on the
field.

Otherwise an error is generated.

The same process applies to both list-type and function-type selection
fields.

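
The matching can be sketched as follows, assuming the translations are
supplied as a plain mapping. ``to_selection`` and ``translated_labels``
are hypothetical and stand in for the database lookup.

.. code-block:: python

    def to_selection(value, selection, translated_labels=None):
        """Return the selection value matching ``value`` or raise ValueError.

        ``selection`` is a list of ``(value, label)`` tuples and
        ``translated_labels`` maps a selection value to its translations.
        """
        translated_labels = translated_labels or {}
        for item, label in selection:
            if value == str(item) or value in translated_labels.get(item, ()):
                return item
        raise ValueError("Value %r not found in selection" % value)

Whichever form matched, the value written to the field is always the
first part of the selection tuple, never the label.
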

Many to One field
~~~~~~~~~~~~~~~~~

If the specified field is the relational field itself (``m2o``), the
value is used in a ``name_search``. The first record returned by
``name_search`` is used as the field's value.

If ``name_search`` finds no value, an error is generated. If
``name_search`` finds multiple values, a warning is generated to warn
the user of ``name_search`` collisions.

If the specified field is an :term:`external ID` (``m2o/id``), the
corresponding record is looked up in the database and used as the
field's value. If no record is found matching the provided external
ID, an error is generated.

If the specified field is a :term:`database ID` (``m2o/.id``), the
process is the same as for external ids (on database identifiers
instead of external ones).

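
The three lookups can be sketched with in-memory dictionaries standing in
for the database. ``resolve_m2o`` is a hypothetical helper, the substring
match is only a crude approximation of ``name_search``, and all record
names and ids are made up.

.. code-block:: python

    def resolve_m2o(subfield, value, records, external_ids):
        """Return ``(database_id, warnings)`` or raise ValueError.

        ``subfield`` is None for ``m2o``, "id" for ``m2o/id`` and ".id"
        for ``m2o/.id``. ``records`` maps database ids to names and
        ``external_ids`` maps external ids to database ids.
        """
        warnings = []
        if subfield is None:
            # Crude stand-in for name_search: case-insensitive substring.
            matches = [i for i, name in sorted(records.items())
                       if value.lower() in name.lower()]
            if not matches:
                raise ValueError("No record found for name %r" % value)
            if len(matches) > 1:
                warnings.append("Found multiple matches for %r" % value)
            return matches[0], warnings
        if subfield == "id":
            if value not in external_ids:
                raise ValueError("No record found for external id %r" % value)
            return external_ids[value], warnings
        if subfield == ".id":
            database_id = int(value)
            if database_id not in records:
                raise ValueError("No record found for database id %r" % value)
            return database_id, warnings
        raise ValueError("Unknown sub-field %r" % subfield)

A name collision only produces a warning and the first match is used,
whereas a missing record is an error in all three cases.
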

Many to Many field
~~~~~~~~~~~~~~~~~~

The field's value is interpreted as a comma-separated list of names,
external ids or database ids. For each one, the process previously
used for the many to one field is applied.

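
A short sketch of the splitting step. ``resolve_m2m`` is hypothetical,
``resolve_one`` stands in for the many to one lookup, and stripping the
whitespace around each item is an assumption of this sketch.

.. code-block:: python

    def resolve_m2m(value, resolve_one):
        """Split a comma-separated cell and resolve each item in turn."""
        items = [item.strip() for item in value.split(",") if item.strip()]
        return [resolve_one(item) for item in items]

For example, with a plain dictionary of names as the lookup,
``resolve_m2m("red, blue", {"red": 1, "blue": 2}.__getitem__)`` resolves
each name separately.
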

One to Many field
~~~~~~~~~~~~~~~~~

For each o2m record extracted, if the record has a ``name``,
:term:`external ID` or :term:`database ID`, the :term:`database ID` is
looked up and checked through the same process as for m2o fields.

If a :term:`database ID` was found, a LINK_TO command is emitted,
followed by an UPDATE with the non-db values for the relational field.

Otherwise a CREATE command is emitted.

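
The command generation can be sketched as below. The tuple layouts follow
the usual OpenERP ``write`` command convention (0 for CREATE, 1 for
UPDATE, 4 for LINK_TO); ``o2m_commands`` and its input format are
hypothetical.

.. code-block:: python

    def CREATE(values):
        return (0, False, values)

    def UPDATE(database_id, values):
        return (1, database_id, values)

    def LINK_TO(database_id):
        return (4, database_id, False)

    def o2m_commands(subrecords):
        """Build the command list for extracted o2m sub-records.

        Each sub-record is a ``(database_id, values)`` pair where
        ``database_id`` is None when no existing record was identified.
        """
        commands = []
        for database_id, values in subrecords:
            if database_id is not None:
                commands.append(LINK_TO(database_id))
                commands.append(UPDATE(database_id, values))
            else:
                commands.append(CREATE(values))
        return commands

An identified sub-record therefore produces two commands, an unidentified
one a single CREATE.
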

Date fields
~~~~~~~~~~~

The value's format is checked against
:data:`~openerp.tools.misc.DEFAULT_SERVER_DATE_FORMAT`. An error is
generated if it does not match the specified format.

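
A minimal sketch of the check, assuming the server date format is the
ISO-style ``%Y-%m-%d``. ``check_date`` is a hypothetical helper.

.. code-block:: python

    from datetime import datetime

    # Assumed value of DEFAULT_SERVER_DATE_FORMAT.
    DATE_FORMAT = "%Y-%m-%d"

    def check_date(value, fmt=DATE_FORMAT):
        """Return ``value`` unchanged if it matches ``fmt``.

        Raises ValueError otherwise.
        """
        datetime.strptime(value, fmt)  # raises ValueError on mismatch
        return value

The value is only validated, not reformatted: a date written in another
layout (for example day first) is rejected rather than guessed at.
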

Datetime fields
~~~~~~~~~~~~~~~

The value's format is checked against
:data:`~openerp.tools.misc.DEFAULT_SERVER_DATETIME_FORMAT`. An error
is generated if it does not match.

The value is then interpreted as a datetime in the user's
timezone. The timezone is determined as follows:

* If the import ``context`` contains a ``tz`` key with a valid
  timezone name, this is the timezone of the datetime.

* Otherwise if the user performing the import has a ``tz`` attribute
  set to a valid timezone name, this is the timezone of the datetime.

* Otherwise interpret the datetime as being in the ``UTC`` timezone.

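
The resolution order above can be sketched as follows.
``resolve_timezone`` is hypothetical, and ``is_valid`` is an injected
callable standing in for a lookup in a timezone database.

.. code-block:: python

    def resolve_timezone(context, user_tz, is_valid):
        """Return the name of the timezone used to interpret datetimes."""
        context_tz = context.get("tz")
        if context_tz and is_valid(context_tz):
            return context_tz
        if user_tz and is_valid(user_tz):
            return user_tz
        return "UTC"

An invalid name in the context does not short-circuit to ``UTC``: the
user's own ``tz`` is still tried first.
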

Create/Write
++++++++++++

If the conversion was successful, the converted record is then saved
to the database via ``(ir.model.data)._update``.

Error handling
++++++++++++++

The import process will only catch 2 types of exceptions to convert
them to error messages: ``ValueError`` during the conversion process,
and sub-exceptions of ``psycopg2.Error`` during the create/write
process.

The import process uses savepoints to:

* protect the overall transaction from the failure of each ``_update``
  call. If an ``_update`` call fails, the savepoint is rolled back and
  the import process keeps going in order to obtain as many error
  messages as possible during each run.

* protect the import as a whole. A savepoint is created before
  starting and if any error is generated that savepoint is rolled
  back. The rest of the transaction (anything not within the import
  process) will be left untouched.

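
The two levels of savepoints can be illustrated with the standard
library's ``sqlite3`` module, used here instead of PostgreSQL and
``psycopg2`` so that the sketch runs on its own. The table, the savepoint
names and ``load_rows`` are all hypothetical.

.. code-block:: python

    import sqlite3

    def load_rows(connection, names):
        """Insert ``names`` into table ``item``; return error strings."""
        errors = []
        connection.execute("SAVEPOINT import_all")
        for index, name in enumerate(names):
            connection.execute("SAVEPOINT import_row")
            try:
                connection.execute(
                    "INSERT INTO item (name) VALUES (?)", (name,))
            except sqlite3.Error as e:
                # Undo only this row, then keep going to collect errors.
                connection.execute("ROLLBACK TO SAVEPOINT import_row")
                errors.append("row %d: %s" % (index, e))
            connection.execute("RELEASE SAVEPOINT import_row")
        if errors:
            # Any error cancels the whole import, but nothing outside it.
            connection.execute("ROLLBACK TO SAVEPOINT import_all")
        connection.execute("RELEASE SAVEPOINT import_all")
        return errors

    # isolation_level=None lets the sketch manage transactions explicitly.
    connection = sqlite3.connect(":memory:", isolation_level=None)
    connection.execute("CREATE TABLE item (name TEXT NOT NULL UNIQUE)")
    connection.execute("BEGIN")
    connection.execute("INSERT INTO item (name) VALUES ('before import')")
    errors = load_rows(connection, ["a", "b", "a", None])
    connection.execute("COMMIT")
    remaining = [row[0] for row in connection.execute("SELECT name FROM item")]

Both faulty rows are reported in a single run, the valid rows of the
import are discarded along with them, and the row inserted before the
import started survives the rollback.
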

.. _import-message:

.. _import-messages:

Messages
--------

A message is a dictionary with 5 mandatory keys and one optional key:

``type``
    the type of message, either ``warning`` or ``error``. Any
    ``error`` message indicates the import failed and was rolled back.

``message``
    the message's actual text, which should be translated and can be
    shown to the user directly

``rows``
    a dict with 2 keys ``from`` and ``to``, indicating the range of
    rows in ``data`` which generated the message

``record``
    a single integer, for warnings the index of the record which
    generated the message (can be obtained from a non-false ``ids``
    result)

``field``
    the name of the (logical) OpenERP field for which the error or
    warning was generated

``moreinfo`` (optional)
    A string, a list or a dict, leading to more information about the
    warning.

    * If ``moreinfo`` is a string, it is a supplementary warning
      message which should be hidden by default
    * If ``moreinfo`` is a list, it provides a number of possible or
      alternative values for the string
    * If ``moreinfo`` is a dict, it is an OpenERP action descriptor
      which can be executed to get more information about the issues
      with the field. If present, the ``help`` key serves as a label
      for the action (e.g. the text of the link).

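
As an illustration, a warning message could look like the dictionary
below. Every value in it is made up, and the two helpers are hypothetical
conveniences rather than part of the import API.

.. code-block:: python

    # A hypothetical warning message; all values are for illustration.
    message = {
        "type": "warning",
        "message": "Found multiple matches for field 'Partner'",
        "rows": {"from": 3, "to": 3},
        "record": 2,
        "field": "partner_id",
        "moreinfo": ["Acme", "Acme Supplies"],
    }

    MANDATORY_KEYS = {"type", "message", "rows", "record", "field"}

    def missing_keys(message):
        """Return the mandatory keys absent from ``message``."""
        return MANDATORY_KEYS - set(message)

    def import_failed(messages):
        """True if any message is an error (the import was rolled back)."""
        return any(m["type"] == "error" for m in messages)

A caller would typically check ``import_failed`` first, since a single
``error`` message means nothing was imported, and only then display the
remaining warnings.
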