The default value for errors, although specified as None in the
function signature, is surrogate_then_replace.
The most common and recommended values for compatibility between Python 2 and Python 3 are:
- surrogate_then_replace
- surrogate_or_strict
When to use which?
surrogate_then_replace should be used when the data is informational only,
that is, when it is ultimately just headed to a log or displayed to the
user.
surrogate_or_strict should be used when the data makes a difference to the
computer's understanding of the world, such as with file paths or database
keys.
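The difference between the two strategies can be sketched with nothing but the Python 3 standard library. This is only an illustration of the idea, not Ansible's implementation: encode_informational and encode_operational are hypothetical helper names, and the sketch assumes that both strategies first try the surrogateescape handler and differ only in what happens when that fails.

```python
def encode_informational(text, encoding="utf-8"):
    """Hypothetical helper: never raises, so it is safe for logs and display."""
    try:
        # Escaped bytes (U+DC80..U+DCFF) are turned back into the original bytes.
        return text.encode(encoding, "surrogateescape")
    except UnicodeEncodeError:
        # Anything still unencodable is replaced with '?'.
        return text.encode(encoding, "replace")


def encode_operational(text, encoding="utf-8"):
    """Hypothetical helper: raises rather than silently altering the data."""
    return text.encode(encoding, "surrogateescape")


# A file name containing a byte that is not valid UTF-8, decoded losslessly.
name = b"report-\xff.txt".decode("utf-8", "surrogateescape")
assert encode_operational(name) == b"report-\xff.txt"

# A lone surrogate that surrogateescape cannot restore.
bad = "key-\ud800"
print(encode_informational(bad))  # b'key-?'
try:
    encode_operational(bad)
except UnicodeEncodeError:
    print("refusing to guess at an operational value")
```

A replaced character is harmless in a log line, but for a file path or database key it would silently point at a different object, which is why the operational variant raises instead.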
The nonstring argument specifies the strategy to use if a nonstring is
passed. The default is simplerepr, which returns a string representation
using either str(obj) or repr(obj), preferring str().
Other values are empty, which returns an empty string; passthru, which
returns the original object; and strict, which raises a TypeError
exception.
An example of using passthru would be when passing either a string or a
file-like object for use in an HTTP POST request with to_bytes.
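The four strategies can be sketched as a small dispatcher. The function below, to_bytes_sketch, is a hypothetical stand-in written for illustration; it mirrors the behaviour described above but is not the real to_bytes and ignores the errors argument entirely.

```python
import io


def to_bytes_sketch(obj, encoding="utf-8", nonstring="simplerepr"):
    """Hypothetical stand-in illustrating the nonstring strategies only."""
    if isinstance(obj, bytes):
        return obj
    if isinstance(obj, str):
        return obj.encode(encoding)

    # From here on, obj is a nonstring.
    if nonstring == "simplerepr":
        try:
            value = str(obj)
        except Exception:
            value = repr(obj)
        return value.encode(encoding)
    if nonstring == "empty":
        return b""
    if nonstring == "passthru":
        return obj
    if nonstring == "strict":
        raise TypeError("obj must be a string type")
    raise TypeError("invalid nonstring value: %r" % (nonstring,))


# A request body may be a string or a file-like object; passthru leaves
# the file-like object untouched so the HTTP layer can stream it.
body = io.BytesIO(b"payload")
assert to_bytes_sketch(body, nonstring="passthru") is body
assert to_bytes_sketch("payload", nonstring="passthru") == b"payload"
```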
"native" in this context is meant to indicate the default string type on
Python 2 and 3 as produced by str.
On the controller, to_native is used for a small set of functionality:
- When converting information for use in exceptions
- When the underlying python API expects a native string type
Typically speaking, native values should not be long lived, and should be
converted at the borders to native where they are needed. If a variable
must be assigned to a native value, the variable should be prefixed with
n_ such as n_output.
- Typically, most strings on the target should use the native string
type for the easiest integration with the underlying Python APIs. However,
be careful to note the information from the errors section, which dictates
which errors value to use for informational vs operational values.
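As a rough illustration of converting to native only at the border, and of the n_ prefix, consider building an exception message. The sketch assumes Python 3, where the native type is str; to_native_sketch and read_config are hypothetical names, not part of Ansible.

```python
def to_native_sketch(value, encoding="utf-8", errors="surrogateescape"):
    """Hypothetical Python 3 stand-in: the native string type is str."""
    if isinstance(value, bytes):
        return value.decode(encoding, errors)
    return str(value)


def read_config(b_path):
    """Read a file given a bytes path, reporting failures with native strings."""
    try:
        with open(b_path, "rb") as handle:
            return handle.read()
    except OSError as exc:
        # Convert only here, at the border where the message is built.
        n_path = to_native_sketch(b_path)
        n_reason = to_native_sketch(exc)
        raise RuntimeError("could not read %s: %s" % (n_path, n_reason))
```

The native values exist only for the duration of the except block, which keeps them short lived as recommended above.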
"bytes" in this context refers to the data type produced by bytes on Python 2
and Python 3.
On Python 2 this is str and on Python 3 this is bytes.
Values converted to bytes should not be long lived. Typically values should
be converted at the borders to bytes where they are needed. If a variable must
be assigned to a bytes value, the variable should be prefixed with b_ such
as b_path. This includes params in the function signature, if a function
accepts a bytes value.
Use to_bytes when dealing with byte-oriented APIs. This is common when dealing with file paths, or with data being passed through HTTP requests.
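The b_ naming convention and the idea of converting at the border can be sketched for a file path as follows. This is a standard-library illustration under the assumption of Python 3 and UTF-8 paths; to_bytes_path and file_size are hypothetical helpers, not Ansible functions.

```python
import os
import tempfile


def to_bytes_path(path, encoding="utf-8"):
    """Hypothetical helper: encode a text path at the filesystem border."""
    if isinstance(path, bytes):
        return path
    # surrogateescape restores undecodable bytes and raises on anything else.
    return path.encode(encoding, "surrogateescape")


def file_size(path):
    """Accept a text path and return the size of the file in bytes."""
    # The bytes value lives only inside this function, hence the b_ prefix.
    b_path = to_bytes_path(path)
    return os.path.getsize(b_path)


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.txt")
    with open(path, "wb") as handle:
        handle.write(b"hello")
    print(file_size(path))  # 5
```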
"text" in this context is meant to indicate the type produced by the unicode
function on Python 2, and str on Python 3.
- When data is ingested into Ansible, values should typically be cast to text for the lifetime of that data.
- All information sent to the Display class, such as display.display or display.vvv, should be cast to text.
NOTE: Only on the borders where the data leaves Ansible should it be converted to bytes or native.
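The pattern described here (decode on the way in, work with text throughout, encode only on the way out) can be sketched as below. The names ingest and emit are hypothetical and the example uses only the standard library, so treat it as an illustration of the pattern rather than of Ansible's own code.

```python
import io


def ingest(b_raw, encoding="utf-8"):
    """Inbound border: decode once and keep text for the data's lifetime."""
    return b_raw.decode(encoding, "surrogateescape")


def emit(text, stream, encoding="utf-8"):
    """Outbound border: encode only when the data leaves the program."""
    stream.write(text.encode(encoding, "surrogateescape"))


# Bytes arrive from a subprocess, socket, or file.
b_stdout = "caf\u00e9-host\n".encode("utf-8")

# Everything in between operates on text only.
hostname = ingest(b_stdout).strip()
message = "Connected to " + hostname

# Bytes are produced again only when writing to the sink.
sink = io.BytesIO()
emit(message, sink)
assert sink.getvalue() == "Connected to caf\u00e9-host".encode("utf-8")
```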
You are unlikely to need to_text in many scenarios on the target. It is
needed only when the API you are dealing with specifically requires text
types, such as in some MySQL libraries.
Sure. I haven't documented anything around encoding yet, and after today, I at least have some useful things to add for when you would want to use it.