dict_validator package

The package aims to simplify payload validation using schemas represented by regular Python classes.

At the heart of the library are just a few top-level concepts.

To begin with, a Schema is nothing more than a collection of fields, which boils down to a definition of the following shape:

class Schema:
    field1 = Sample()  # this is a field
    field2 = Other()  # and this is a field

Each field is an instance of a Field subclass, constructed with zero or more parameters.

Note: the schema class may extend object explicitly, but doing so is entirely optional.

Once a schema is defined, one of the functions documented below can be used to process the payload and/or the schema.

Most common Field subclasses can be found in dict_validator.fields.

dict_validator.validate(schema, value)[source]

Validate a value against a given schema.

Parameters:
  • schema – a class representing the structure to be used for validation
  • value – a dict
Yield:

(path, error_msg), e.g. (["parent", "child"], "Error message")

>>> from dict_validator import Field, List, Dict, validate

To report a single error, the _validate method of a field subclass must return a string describing the problem.

>>> class SampleOnly(Field):
...
...     def _validate(self, value):
...         if value != "sample":
...             return "Not a sample"
>>> class Schema:
...     field = SampleOnly()

If there are no problems, nothing is yielded. Note: validate is a generator function, so its result has to be converted to a list explicitly.

>>> list(validate(Schema, {"field": "sample"}))
[]

Payload must be a dict.

>>> list(validate(Schema, "NOT A DICT"))
[([], 'Not a dict')]

If there are problems - a collection of tuples is yielded. The first element of a tuple is a list representing an absolute path to the field. The second element is a string with a description of the problem.

>>> list(validate(Schema, {"field": "not sample"}))
[(['field'], 'Not a sample')]

By default all fields are required.

>>> list(validate(Schema, {}))
[([], 'Key "field" is missing')]

By design no extra fields are allowed - payload must be strictly specified.

>>> list(validate(Schema, {"field": "sample", "unknown_field": "sample"}))
[([], 'Unkown fields: unknown_field')]
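The strictness rules above (payload must be a dict, required keys must be present, unknown keys are rejected) can be illustrated with a small stand-alone sketch. It is not the library's implementation: check_keys and its parameters are hypothetical names, and the message wording is only an approximation of what validate reports.

```python
def check_keys(required, optional, payload):
    """Yield (path, message) tuples for top-level key problems.

    Hypothetical helper for illustration; not part of dict_validator.
    """
    if not isinstance(payload, dict):
        yield ([], "Not a dict")
        return
    # Every required key has to be present in the payload.
    for key in sorted(set(required) - set(payload), key=str):
        yield ([], 'Key "{}" is missing'.format(key))
    # Anything outside required + optional counts as unknown.
    allowed = set(required) | set(optional)
    unknown = sorted(set(payload) - allowed, key=str)
    if unknown:
        yield ([], "Unknown fields: {}".format(
            ", ".join(str(key) for key in unknown)))
```

Like validate, the helper is a generator, so list(check_keys(["field"], [], {})) is needed to see the reported problems.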

Optional fields are marked via the required=False parameter. This parameter is available for any field and its behaviour is uniform.

>>> class Schema:
...     field = SampleOnly(required=False)
>>> list(validate(Schema, {"field": "sample"}))
[]
>>> list(validate(Schema, {"field": "not sample"}))
[(['field'], 'Not a sample')]

If an optional field is missing, no error is reported.

>>> list(validate(Schema, {}))
[]

An alternative way to report errors is to yield them. This is a good idea when multiple aspects of the input value need to be validated.

>>> class WithYieldedError(Field):
...
...     def _validate(self, value):
...         yield "Error 1"
...         yield "Error 2"
>>> class Schema:
...     field = WithYieldedError()

Each error has its own unique error tuple even if the errors originate from the same field.

>>> list(validate(Schema, {"field": "sample"}))
[(['field'], 'Error 1'), (['field'], 'Error 2')]
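Supporting both reporting styles means that a returned string, a generator of strings and an implicit None all have to be treated uniformly. The sketch below shows one way this could be done. collect_errors is a hypothetical helper, not part of dict_validator, and the library's internals may differ.

```python
def collect_errors(result):
    """Normalize the outcome of a _validate-style call to a list of strings."""
    if result is None:
        # Nothing was returned: the value is considered valid.
        return []
    if isinstance(result, str):
        # A single error was reported via "return".
        return [result]
    # Otherwise assume an iterable (e.g. a generator) of error messages.
    return list(result)


def returns_one(value):
    if value != "sample":
        return "Not a sample"


def yields_many(value):
    yield "Error 1"
    yield "Error 2"
```

With these definitions collect_errors(returns_one("sample")) is an empty list, while collect_errors(yields_many("sample")) contains both yielded messages in order.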

Nested structures can be described using a Dict field. This field requires a reference to the schema specifying a nested structure.

>>> class Child:
...     other_field = SampleOnly()
>>> class Parent:
...     child = Dict(Child)
>>> list(validate(Parent, {"child": {"other_field": "sample"}}))
[]

The absolute path to the error includes all the nodes of the structure’s tree.

>>> list(validate(Parent, {"child": {"other_field": "not sample"}}))
[(['child', 'other_field'], 'Not a sample')]

To represent collections of data (aka lists) a List field should be used. The field requires an instance of some other field as its first argument.

>>> class Schema:
...     field = List(SampleOnly())
>>> list(validate(Schema, {"field": "NOT A LIST"}))
[(['field'], 'Not a list')]
>>> list(validate(Schema, {"field": ["sample"]}))
[]

If the problem is in an individual item, the path to the node includes the integer index of that item.

>>> list(validate(Schema, {"field": ["not sample", "sample",
...                                  "not sample"]}))
[(['field', 0], 'Not a sample'), (['field', 2], 'Not a sample')]

Sophisticated nesting is possible using a combination of List and Dict fields.

>>> class Child:
...     other_field = SampleOnly()
>>> class Parent:
...     child = List(Dict(Child))
>>> list(validate(Parent, {"child": [{"other_field": "sample"}]}))
[]
>>> list(validate(Parent, {"child": [{"other_field": "not sample"}]}))
[(['child', 0, 'other_field'], 'Not a sample')]
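Because error paths mix string keys with integer indexes, it can be convenient to turn them into a readable string before showing them to a user. The helper below is a hypothetical sketch and not something the package provides. The bracket notation for indexes is an arbitrary choice.

```python
def format_path(path):
    """Render a path such as ['child', 0, 'other_field'] as text."""
    parts = []
    for node in path:
        if isinstance(node, int):
            # List positions are rendered as [index].
            parts.append("[{}]".format(node))
        elif parts:
            # Nested keys are separated from their parent by a dot.
            parts.append("." + str(node))
        else:
            parts.append(str(node))
    # An empty path refers to the payload as a whole.
    return "".join(parts) or "<root>"
```

For the nested example above, format_path(['child', 0, 'other_field']) gives 'child[0].other_field'.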
dict_validator.describe(schema)[source]

Describe a given schema.

Understands primitive, list and dict fields.

Parameters:schema – a class representing the structure to be documented
Yield:(path, {…description…}), e.g ([“parent”, “child”], {“required”: False})
>>> from dict_validator import Field, List, Dict, describe

Each custom field must be a Field subclass with _validate method implemented.

See validate function for details.

>>> class AnyValue(Field):
...
...     def _validate(self, value):
...         pass

To document a field - pass a “description” parameter to the constructor. The value must be a string. The “description” can be added to any field.

>>> class Child:
...     items = List(AnyValue("AnyValue item"),
...                       description="A collection of important items")
>>> class Parent:
...     '''Schema docstring'''
...
...     ignored_field = "Nothing"
...     child = Dict(Child, description="Dict child")
...     plain_field = AnyValue(description="Pure string", required=False)

Since the return value is a generator, it has to be explicitly converted to a list.

Note: when documenting the items of a list, the "{N}" string is used in the path to denote an item.

Also note that the docstring of the schema class becomes the description of the top-level schema.

>>> from pprint import pprint
>>> pprint(sorted(describe(Parent)), width=60)
[([], {'description': 'Schema docstring', 'type': 'Dict'}),
 (['child'], {'description': 'Dict child', 'type': 'Dict'}),
 (['child', 'items'],
  {'description': 'A collection of important items',
   'type': 'List'}),
 (['child', 'items', '{N}'],
  {'description': 'AnyValue item', 'type': 'AnyValue'}),
 (['plain_field'],
  {'description': 'Pure string',
   'required': False,
   'type': 'AnyValue'})]
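The (path, description) tuples lend themselves to generating human-readable documentation. The sketch below renders them as one line per field. render_description is a hypothetical helper, the sample entries are copied from the output above, and treating a missing "required" key as "required" is an assumption based on that output.

```python
def render_description(entries):
    """Turn describe-style (path, info) tuples into text lines."""
    lines = []
    for path, info in sorted(entries, key=lambda entry: entry[0]):
        name = ".".join(path) or "<root>"
        # Assumption: fields without a "required" key are required.
        flag = "required" if info.get("required", True) else "optional"
        lines.append("{} ({}, {}): {}".format(
            name, info.get("type", "?"), flag, info.get("description", "")))
    return lines


entries = [
    ([], {"description": "Schema docstring", "type": "Dict"}),
    (["child"], {"description": "Dict child", "type": "Dict"}),
    (["child", "items"],
     {"description": "A collection of important items", "type": "List"}),
    (["child", "items", "{N}"],
     {"description": "AnyValue item", "type": "AnyValue"}),
    (["plain_field"],
     {"description": "Pure string", "required": False, "type": "AnyValue"}),
]
lines = render_description(entries)
```

Printing "\n".join(lines) yields a compact field reference that could be embedded in API documentation.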
dict_validator.serialize(schema, value)[source]

Serialize a value before sending it over the wire.

Understands primitive, list and dict fields.

Parameters:
  • schema – a class representing the structure to be used for serialization
  • value – a pythonic object
Returns:

a dict ready to be sent over the wire

>>> from dict_validator import Field, List, Dict, serialize

Each custom field must be a Field subclass. It may implement a serialize method to enable value transformations; by default the value is returned as is.

See the Field docs for details.

>>> class AnyValue(Field):
...
...     def _validate(self, value):
...         pass
...
...     def serialize(self, value):
...         return "SERIALIZED {}".format(super(AnyValue, self)
...             .serialize(value))
>>> class Child:
...     items = List(AnyValue("String item"),
...                       description="A collection of important items")
>>> class Parent:
...     child = Dict(Child, description="Dict child")
...     plain_field = AnyValue(description="Pure string",
...                                 required=False)

To construct a tree of Python objects for later serialization, one can simply use the Namespace class from the standard library.

>>> from argparse import Namespace
>>> payload = Namespace(
...     plain_field="OUTGOING",
...     child=Namespace(
...         items=["OUTGOING"]
...     )
... )
>>> from pprint import pprint
>>> pprint(serialize(Parent, payload))
{'child': {'items': ['SERIALIZED OUTGOING']},
 'plain_field': 'SERIALIZED OUTGOING'}
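When the outgoing data already exists as nested dicts and lists, building the Namespace tree by hand becomes tedious. A small recursive converter, sketched below, can do it instead. to_namespace is a hypothetical helper that is not part of dict_validator, and it assumes all dict keys are strings usable as attribute names.

```python
from argparse import Namespace


def to_namespace(data):
    """Recursively convert dicts into Namespace objects."""
    if isinstance(data, dict):
        return Namespace(
            **{key: to_namespace(value) for key, value in data.items()})
    if isinstance(data, list):
        # Lists stay lists, but their items are converted too.
        return [to_namespace(item) for item in data]
    # Leaf values are returned unchanged.
    return data


payload = to_namespace({
    "plain_field": "OUTGOING",
    "child": {"items": ["OUTGOING"]},
})
```

The resulting payload compares equal to the hand-written Namespace tree from the example above.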

When serializing optional fields the missing values are set to None.

>>> class Schema:
...     field = AnyValue(required=False)
>>> serialize(Schema, {})
{'field': None}
dict_validator.deserialize(schema, value)[source]

Deserialize a value after sending it over the wire into a pythonic object.

Understands primitive, list and dict fields.

Parameters:
  • schema – a class representing the structure to be used for deserialization
  • value – a dict sent over the wire
Returns:

a pythonic object

>>> from dict_validator import Field, List, Dict, deserialize

Each custom field must be a Field subclass. It may implement a deserialize method to enable value transformations; by default the value is returned as is.

See the Field docs for details.

>>> class AnyValue(Field):
...
...     def _validate(self, value):
...         pass
...
...     def deserialize(self, value):
...         return "DESERIALIZED {}".format(super(AnyValue, self)
...             .deserialize(value))
>>> class Child:
...     items = List(AnyValue("String item"),
...                       description="A collection of important items")
>>> class Parent:
...     child = Dict(Child, description="Dict child")
...     plain_field = AnyValue(description="Pure string",
...                                 required=False)
>>> parent = deserialize(Parent, {
...     "child": {
...         "items": ["INCOMING"]
...     },
...     "plain_field": "INCOMING"
... })
>>> parent.plain_field
'DESERIALIZED INCOMING'
>>> parent.child.items[0]
'DESERIALIZED INCOMING'

When deserializing optional fields the missing values are set to None.

>>> class Schema:
...     field = AnyValue(required=False)
>>> deserialize(Schema, {}).field
class dict_validator.Field(description=None, required=True)[source]

Bases: object

The “leaf” primitive data type in a schema. E.g. string, integer, float, etc.

Parameters:
  • description – textual explanation of what the field represents
  • required – True by default. If False, the field is optional

Each field subclass must implement _validate() abstract method.

Each field may also implement _describe() method.

In addition, if custom serialization mechanisms are needed, the serialize and deserialize methods can be overridden to provide non-default behaviour.

See helper functions for reference implementations of the class.
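To illustrate how the three extension points fit together, here is a sketch of a date field that validates ISO 8601 strings, deserializes them into date objects and serializes them back. To keep it runnable without the package, it uses StandInField, a hypothetical stand-in that only mimics the documented default of returning the value as is. In real code the class would derive from Field instead. That _validate receives the wire-format value is an assumption.

```python
from datetime import date


class StandInField:
    """Hypothetical stand-in for Field, so the sketch runs on its own."""

    def serialize(self, value):
        return value  # default: value returned as is

    def deserialize(self, value):
        return value  # default: value returned as is


class IsoDate(StandInField):
    """A date transferred over the wire as 'YYYY-MM-DD'."""

    def _validate(self, value):
        # Assumption: the incoming value is still in wire format.
        try:
            date.fromisoformat(value)
        except (TypeError, ValueError):
            return "Not an ISO 8601 date"

    def serialize(self, value):
        return value.isoformat()

    def deserialize(self, value):
        return date.fromisoformat(value)
```

Keeping the conversion inside the field means the rest of the application works with date objects only, while the wire format stays a plain string.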

_describe()[source]

Implement to supply extra info for field’s public description.

Returns:str:%JSON-SERIALIZABLE% key:value pairs
Return type:dict
_validate(value)[source]

Validate the incoming value. If there are errors, either return a single error message or yield several error messages.

describe()[source]

Do not override.

Yield:(path, {…description…}), e.g ([“parent”, “child”], {“required”: False})
deserialize(value)[source]
Parameters:value – a payload sent over the wire
Returns:a payload with Python specific data types
required

Do not override.

Returns:True if the field has to be present in the incoming dict
serialize(value)[source]
Returns:a payload ready to be sent over the wire
validate(value)[source]

Do not override.

Parameters:value – a payload
Yield:(path, error_msg), e.g. (["parent", "child"], "Error message")
class dict_validator.Dict(schema, *args, **kwargs)[source]

Bases: dict_validator.field.Field

A dict of values.

Parameters:schema (dict_validator.Schema) – class to be used for validation/serialization of the incoming values
describe()[source]

Do not override.

Yield:(path, {…description…}), e.g ([“parent”, “child”], {“required”: False})
deserialize(value)[source]
Parameters:value – a payload sent over the wire
Returns:a payload with Python specific data types
serialize(value)[source]
Returns:a payload ready to be sent over the wire
class dict_validator.List(schema, *args, **kwargs)[source]

Bases: dict_validator.field.Field

A collection of arbitrary items.

Parameters:schema – instance of a Field subclass to be used to validate/serialize/deserialize individual items of the collection
describe()[source]

Do not override.

Yield:(path, {…description…}), e.g ([“parent”, “child”], {“required”: False})
deserialize(value)[source]
Parameters:value – a payload sent over the wire
Returns:a payload with Python specific data types
serialize(value)[source]
Returns:a payload ready to be sent over the wire
dict_validator.serialize_errors(validation_results)[source]

Transform a denormalized generator over the collection of errors into a serializable normalized dict.

Parameters:validation_results – a collection of errors returned by validate()
Returns:dict with field paths as keys and lists of errors corresponding to those paths as values
>>> from dict_validator import serialize_errors
>>> error_collection = iter([
...     (['field', 0], 'Error #1'),
...     (['field', 2], 'Error #1'),
...     (['field', 2], 'Error #2')
... ])
>>> from pprint import pprint
>>> pprint(serialize_errors(error_collection))
{'field.0': ['Error #1'], 'field.2': ['Error #1', 'Error #2']}
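The normalization amounts to grouping messages under a dotted version of their path. The stand-alone sketch below reproduces the output shown above under that assumption. group_errors is a hypothetical name and the library's actual implementation may differ.

```python
def group_errors(errors):
    """Group (path, message) tuples into {dotted_path: [messages]}."""
    grouped = {}
    for path, message in errors:
        # Integer indexes are converted to strings before joining.
        key = ".".join(str(node) for node in path)
        grouped.setdefault(key, []).append(message)
    return grouped


result = group_errors(iter([
    (["field", 0], "Error #1"),
    (["field", 2], "Error #1"),
    (["field", 2], "Error #2"),
]))
```

Because the keys are plain strings and the values are lists of strings, the result can be passed directly to a JSON encoder.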

Submodules

class dict_validator.dict_field.Dict(schema, *args, **kwargs)[source]

Bases: dict_validator.field.Field

A dict of values.

Parameters:schema (dict_validator.Schema) – class to be used for validation/serialization of the incoming values
describe()[source]

Do not override.

Yield:(path, {…description…}), e.g ([“parent”, “child”], {“required”: False})
deserialize(value)[source]
Parameters:value – a payload sent over the wire
Returns:a payload with Python specific data types
serialize(value)[source]
Returns:a payload ready to be sent over the wire
class dict_validator.field.Field(description=None, required=True)[source]

Bases: object

The “leaf” primitive data type in a schema. E.g. string, integer, float, etc.

Parameters:
  • description – textual explanation of what the field represents
  • required – True by default. If False, the field is optional

Each field subclass must implement _validate() abstract method.

Each field may also implement _describe() method.

In addition, if custom serialization mechanisms are needed, the serialize and deserialize methods can be overridden to provide non-default behaviour.

See helper functions for reference implementations of the class.

_describe()[source]

Implement to supply extra info for field’s public description.

Returns:str:%JSON-SERIALIZABLE% key:value pairs
Return type:dict
_validate(value)[source]

Validate the incoming value. If there are errors, either return a single error message or yield several error messages.

describe()[source]

Do not override.

Yield:(path, {…description…}), e.g ([“parent”, “child”], {“required”: False})
deserialize(value)[source]
Parameters:value – a payload sent over the wire
Returns:a payload with Python specific data types
required

Do not override.

Returns:True if the field has to be present in the incoming dict
serialize(value)[source]
Returns:a payload ready to be sent over the wire
validate(value)[source]

Do not override.

Parameters:value – a payload
Yield:(path, error_msg), e.g. (["parent", "child"], "Error message")
dict_validator.helpers.describe(schema)[source]

Describe a given schema.

Understands primitive, list and dict fields.

Parameters:schema – a class representing the structure to be documented
Yield:(path, {…description…}), e.g ([“parent”, “child”], {“required”: False})
>>> from dict_validator import Field, List, Dict, describe

Each custom field must be a Field subclass with _validate method implemented.

See validate function for details.

>>> class AnyValue(Field):
...
...     def _validate(self, value):
...         pass

To document a field - pass a “description” parameter to the constructor. The value must be a string. The “description” can be added to any field.

>>> class Child:
...     items = List(AnyValue("AnyValue item"),
...                       description="A collection of important items")
>>> class Parent:
...     '''Schema docstring'''
...
...     ignored_field = "Nothing"
...     child = Dict(Child, description="Dict child")
...     plain_field = AnyValue(description="Pure string", required=False)

Since the return value is a generator, it has to be explicitly converted to a list.

Note: when documenting the items of a list, the "{N}" string is used in the path to denote an item.

Also note that the docstring of the schema class becomes the description of the top-level schema.

>>> from pprint import pprint
>>> pprint(sorted(describe(Parent)), width=60)
[([], {'description': 'Schema docstring', 'type': 'Dict'}),
 (['child'], {'description': 'Dict child', 'type': 'Dict'}),
 (['child', 'items'],
  {'description': 'A collection of important items',
   'type': 'List'}),
 (['child', 'items', '{N}'],
  {'description': 'AnyValue item', 'type': 'AnyValue'}),
 (['plain_field'],
  {'description': 'Pure string',
   'required': False,
   'type': 'AnyValue'})]
dict_validator.helpers.deserialize(schema, value)[source]

Deserialize a value after sending it over the wire into a pythonic object.

Understands primitive, list and dict fields.

Parameters:
  • schema – a class representing the structure to be used for deserialization
  • value – a dict sent over the wire
Returns:

a pythonic object

>>> from dict_validator import Field, List, Dict, deserialize

Each custom field must be a Field subclass. It may implement a deserialize method to enable value transformations; by default the value is returned as is.

See the Field docs for details.

>>> class AnyValue(Field):
...
...     def _validate(self, value):
...         pass
...
...     def deserialize(self, value):
...         return "DESERIALIZED {}".format(super(AnyValue, self)
...             .deserialize(value))
>>> class Child:
...     items = List(AnyValue("String item"),
...                       description="A collection of important items")
>>> class Parent:
...     child = Dict(Child, description="Dict child")
...     plain_field = AnyValue(description="Pure string",
...                                 required=False)
>>> parent = deserialize(Parent, {
...     "child": {
...         "items": ["INCOMING"]
...     },
...     "plain_field": "INCOMING"
... })
>>> parent.plain_field
'DESERIALIZED INCOMING'
>>> parent.child.items[0]
'DESERIALIZED INCOMING'

When deserializing optional fields the missing values are set to None.

>>> class Schema:
...     field = AnyValue(required=False)
>>> deserialize(Schema, {}).field
dict_validator.helpers.serialize(schema, value)[source]

Serialize a value before sending it over the wire.

Understands primitive, list and dict fields.

Parameters:
  • schema – a class representing the structure to be used for serialization
  • value – a pythonic object
Returns:

a dict ready to be sent over the wire

>>> from dict_validator import Field, List, Dict, serialize

Each custom field must be a Field subclass. It may implement a serialize method to enable value transformations; by default the value is returned as is.

See the Field docs for details.

>>> class AnyValue(Field):
...
...     def _validate(self, value):
...         pass
...
...     def serialize(self, value):
...         return "SERIALIZED {}".format(super(AnyValue, self)
...             .serialize(value))
>>> class Child:
...     items = List(AnyValue("String item"),
...                       description="A collection of important items")
>>> class Parent:
...     child = Dict(Child, description="Dict child")
...     plain_field = AnyValue(description="Pure string",
...                                 required=False)

To construct a tree of Python objects for later serialization, one can simply use the Namespace class from the standard library.

>>> from argparse import Namespace
>>> payload = Namespace(
...     plain_field="OUTGOING",
...     child=Namespace(
...         items=["OUTGOING"]
...     )
... )
>>> from pprint import pprint
>>> pprint(serialize(Parent, payload))
{'child': {'items': ['SERIALIZED OUTGOING']},
 'plain_field': 'SERIALIZED OUTGOING'}

When serializing optional fields the missing values are set to None.

>>> class Schema:
...     field = AnyValue(required=False)
>>> serialize(Schema, {})
{'field': None}
dict_validator.helpers.serialize_errors(validation_results)[source]

Transform a denormalized generator over the collection of errors into a serializable normalized dict.

Parameters:validation_results – a collection of errors returned by validate()
Returns:dict with field paths as keys and lists of errors corresponding to those paths as values
>>> from dict_validator import serialize_errors
>>> error_collection = iter([
...     (['field', 0], 'Error #1'),
...     (['field', 2], 'Error #1'),
...     (['field', 2], 'Error #2')
... ])
>>> from pprint import pprint
>>> pprint(serialize_errors(error_collection))
{'field.0': ['Error #1'], 'field.2': ['Error #1', 'Error #2']}
dict_validator.helpers.validate(schema, value)[source]

Validate a value against a given schema.

Parameters:
  • schema – a class representing the structure to be used for validation
  • value – a dict
Yield:

(path, error_msg), e.g. (["parent", "child"], "Error message")

>>> from dict_validator import Field, List, Dict, validate

To report a single error, the _validate method of a field subclass must return a string describing the problem.

>>> class SampleOnly(Field):
...
...     def _validate(self, value):
...         if value != "sample":
...             return "Not a sample"
>>> class Schema:
...     field = SampleOnly()

If there are no problems, nothing is yielded. Note: validate is a generator function, so its result has to be converted to a list explicitly.

>>> list(validate(Schema, {"field": "sample"}))
[]

Payload must be a dict.

>>> list(validate(Schema, "NOT A DICT"))
[([], 'Not a dict')]

If there are problems - a collection of tuples is yielded. The first element of a tuple is a list representing an absolute path to the field. The second element is a string with a description of the problem.

>>> list(validate(Schema, {"field": "not sample"}))
[(['field'], 'Not a sample')]

By default all fields are required.

>>> list(validate(Schema, {}))
[([], 'Key "field" is missing')]

By design no extra fields are allowed - payload must be strictly specified.

>>> list(validate(Schema, {"field": "sample", "unknown_field": "sample"}))
[([], 'Unkown fields: unknown_field')]

Optional fields are marked via the required=False parameter. This parameter is available for any field and its behaviour is uniform.

>>> class Schema:
...     field = SampleOnly(required=False)
>>> list(validate(Schema, {"field": "sample"}))
[]
>>> list(validate(Schema, {"field": "not sample"}))
[(['field'], 'Not a sample')]

If an optional field is missing, no error is reported.

>>> list(validate(Schema, {}))
[]

An alternative way to report errors is to yield them. This is a good idea when multiple aspects of the input value need to be validated.

>>> class WithYieldedError(Field):
...
...     def _validate(self, value):
...         yield "Error 1"
...         yield "Error 2"
>>> class Schema:
...     field = WithYieldedError()

Each error has its own unique error tuple even if the errors originate from the same field.

>>> list(validate(Schema, {"field": "sample"}))
[(['field'], 'Error 1'), (['field'], 'Error 2')]

Nested structures can be described using a Dict field. This field requires a reference to the schema specifying a nested structure.

>>> class Child:
...     other_field = SampleOnly()
>>> class Parent:
...     child = Dict(Child)
>>> list(validate(Parent, {"child": {"other_field": "sample"}}))
[]

The absolute path to the error includes all the nodes of the structure’s tree.

>>> list(validate(Parent, {"child": {"other_field": "not sample"}}))
[(['child', 'other_field'], 'Not a sample')]

To represent collections of data (aka lists) a List field should be used. The field requires an instance of some other field as its first argument.

>>> class Schema:
...     field = List(SampleOnly())
>>> list(validate(Schema, {"field": "NOT A LIST"}))
[(['field'], 'Not a list')]
>>> list(validate(Schema, {"field": ["sample"]}))
[]

If the problem is in an individual item, the path to the node includes the integer index of that item.

>>> list(validate(Schema, {"field": ["not sample", "sample",
...                                  "not sample"]}))
[(['field', 0], 'Not a sample'), (['field', 2], 'Not a sample')]

Sophisticated nesting is possible using a combination of List and Dict fields.

>>> class Child:
...     other_field = SampleOnly()
>>> class Parent:
...     child = List(Dict(Child))
>>> list(validate(Parent, {"child": [{"other_field": "sample"}]}))
[]
>>> list(validate(Parent, {"child": [{"other_field": "not sample"}]}))
[(['child', 0, 'other_field'], 'Not a sample')]
class dict_validator.list_field.List(schema, *args, **kwargs)[source]

Bases: dict_validator.field.Field

A collection of arbitrary items.

Parameters:schema – instance of a Field subclass to be used to validate/serialize/deserialize individual items of the collection
describe()[source]

Do not override.

Yield:(path, {…description…}), e.g ([“parent”, “child”], {“required”: False})
deserialize(value)[source]
Parameters:value – a payload sent over the wire
Returns:a payload with Python specific data types
serialize(value)[source]
Returns:a payload ready to be sent over the wire