[FIX] fields.function: make sure all binary values can always be serialized as valid XML

Normally, binary fields should be 7-bit ASCII base64-encoded data, but sometimes
this is not the case, so we do additional sanity checks to make sure the binary values
can pass safely via xmlrpc as strings.

As a last resort we coerce the binary values to unicode to make sure they can
be safely serialized as utf-8-encoded values, which are always valid XML characters.
When this happens, decoding on the other endpoint is not likely to produce
the expected output, but this is just a safety mechanism (in these cases base64
data or xmlrpc.Binary values should be returned instead by the function field).
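
The last-resort coercion can be sketched as follows. This is a minimal Python 3 illustration, not the committed code: `to_text` is a hypothetical stand-in for `tools.ustr()`, and it assumes the fallback order is UTF-8 first, then Latin-1.

```python
def to_text(value):
    """Hypothetical stand-in for tools.ustr(): turn bytes into text without raising."""
    if isinstance(value, str):
        return value
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte 0x00-0xFF to a code point, so this cannot fail.
        return value.decode("latin-1")


raw = b"\xe1"                # a lone 0xE1 byte is not valid UTF-8
text = to_text(raw)          # falls back to Latin-1
wire = text.encode("utf-8")  # the bytes that would travel inside the XML payload
```

Here `wire` is two bytes long while `raw` was one, which is why the receiving endpoint is unlikely to recover the original content. Data that is already base64 (plain ASCII) passes through unchanged.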

In a future version we should probably switch to using XMLRPC Binary types always for
passing fields.binary values, but this requires more refactoring.
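
For comparison, here is a rough sketch of what passing a value as an XML-RPC Binary looks like. It uses Python 3's `xmlrpc.client` (the successor of the `xmlrpclib` module imported in this commit), and the `payload` value is made up for illustration.

```python
import xmlrpc.client

# Arbitrary 8-bit content, including bytes that cannot appear raw in XML.
payload = bytes(range(256))

# Binary wraps the bytes so the marshaller emits a <base64> element.
wrapped = xmlrpc.client.Binary(payload)
xml = xmlrpc.client.dumps((wrapped,), methodresponse=True)

# On the receiving side the value comes back as a Binary; .data holds the bytes.
params, method = xmlrpc.client.loads(xml)
restored = params[0].data
```

Unlike the unicode coercion, this round trip preserves the original bytes exactly, at the cost of requiring both endpoints to handle the Binary type.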

lp bug: https://launchpad.net/bugs/670778 fixed

bzr revid: odo@openerp.com-20101209230742-gwf8e4zvmk43k6ln
Olivier Dony 2010-12-10 00:07:42 +01:00
parent eb25611f00
commit bb82904ba3
1 changed files with 40 additions and 5 deletions

@@ -33,12 +33,12 @@
 #
 import datetime as DT
 import string
-import netsvc
 import sys
 import warnings
+import xmlrpclib
 from psycopg2 import Binary
+import netsvc
 import tools
 from tools.translate import _
@@ -673,6 +673,37 @@ def get_nice_size(a):
         size = 0
     return (x, tools.human_size(size))
+def sanitize_binary_value(dict_item):
+    # binary fields should be 7-bit ASCII base64-encoded data,
+    # but we do additional sanity checks to make sure the values
+    # are not something else that won't pass via xmlrpc
+    index, value = dict_item
+    if isinstance(value, (xmlrpclib.Binary, tuple, list, dict)):
+        # these builtin types are meant to pass untouched
+        return index, value
+    # For all other cases, handle the value as a binary string:
+    # it could be a 7-bit ASCII string (e.g. base64 data), but also
+    # any 8-bit content from files, with byte values that cannot
+    # be passed inside XML!
+    # See for more info:
+    #  - http://bugs.python.org/issue10066
+    #  - http://www.w3.org/TR/2000/REC-xml-20001006#NT-Char
+    #
+    # One solution is to convert the byte-string to unicode,
+    # so it gets serialized as utf-8 encoded data (always valid XML)
+    # If invalid XML byte values were present, tools.ustr() uses
+    # the Latin-1 codec as fallback, which converts any 8-bit
+    # byte value, resulting in valid utf-8-encoded bytes
+    # in the end:
+    #  >>> unicode('\xe1','latin1').encode('utf8') == '\xc3\xa1'
+    # Note: when this happens, decoding on the other endpoint
+    # is not likely to produce the expected output, but this is
+    # just a safety mechanism (in these cases base64 data or
+    # xmlrpc.Binary values should be used instead)
+    return index, tools.ustr(value)
 # ---------------------------------------------------------
 # Function fields
 # ---------------------------------------------------------
@@ -763,9 +794,13 @@ class function(_column):
                     if res[r] and res[r] in dict_names:
                         res[r] = (res[r], dict_names[res[r]])
-        if self._type == 'binary' and context.get('bin_size', False):
-            # convert the data returned by the function with the size of that data...
-            res = dict(map( get_nice_size, res.items()))
+        if self._type == 'binary':
+            if context.get('bin_size', False):
+                # client requests only the size of binary fields
+                res = dict(map(get_nice_size, res.items()))
+            else:
+                res = dict(map(sanitize_binary_value, res.items()))
         if self._type == "integer":
             for r in res.keys():
                 # Converting value into string so that it does not affect XML-RPC Limits