dict_validator package
The package aims to simplify payload validation using schemas represented by regular Python classes.
At the heart of the library there are just a few top-level concepts.
To begin with, a Schema is nothing more than a collection of fields, which boils down to a definition of the following shape:
class Schema:
    field1 = Sample()  # this is a field
    field2 = Other()   # and this is a field
Each field is an instance of a Field subclass, created with zero or more constructor parameters.
Note: the schema class may extend object, but this is entirely optional.
Once a schema is defined, one of the following functions can be used to process the payload and/or the schema:
- validate() - to check the payload
- serialize_errors() - to transform validate() results into a flat dict that could be sent over the wire
- describe() - to present the schema in a serializable format
- serialize() - to transform a payload with Python-specific types into something that could be sent over the wire
- deserialize() - the reverse of serialize()
Most common Field subclasses can be found in dict_validator.fields.
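The idea of a schema as a plain class whose attributes are field instances can be sketched without the library. The snippet below is only an illustration under stated assumptions: the Field, Sample and Other classes and the collect_fields helper are hypothetical stand-ins written for this example, not the dict_validator implementation.

```python
class Field(object):
    """Hypothetical stand-in for a field base class (not dict_validator.Field)."""

    def __init__(self, description=None, required=True):
        self.description = description
        self.required = required


class Sample(Field):
    pass


class Other(Field):
    pass


class Schema:
    field1 = Sample()
    field2 = Other(required=False)
    not_a_field = "ignored"  # plain attributes are not treated as fields


def collect_fields(schema):
    """Return {name: field} for every class attribute that is a Field instance."""
    return {
        name: value
        for name, value in vars(schema).items()
        if isinstance(value, Field)
    }


fields = collect_fields(Schema)
print(sorted(fields))  # ['field1', 'field2']
```

Looking fields up with vars() keeps the schema class free of any required base class, which matches the note that extending object is optional.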
dict_validator.validate(schema, value)
Validate value against a given schema.
Parameters:
- schema – a class representing the structure to be used for validation
- value – a dict
Yield: (path, error_msg), e.g. (["parent", "child"], "Error message")
>>> from dict_validator import Field, List, Dict, validate
To report a single error, the _validate method of a field subclass must return a string describing the problem.
>>> class SampleOnly(Field):
...
...     def _validate(self, value):
...         if value != "sample":
...             return "Not a sample"
>>> class Schema:
...     field = SampleOnly()
If there are no problems, nothing is yielded. Note: validate is a generator function, so its result has to be converted to a list explicitly.
>>> list(validate(Schema, {"field": "sample"}))
[]
The payload must be a dict.
>>> list(validate(Schema, "NOT A DICT"))
[([], 'Not a dict')]
If there are problems, a collection of tuples is yielded. The first element of each tuple is a list representing the absolute path to the field. The second element is a string describing the problem.
>>> list(validate(Schema, {"field": "not sample"}))
[(['field'], 'Not a sample')]
By default, all fields are required.
>>> list(validate(Schema, {}))
[([], 'Key "field" is missing')]
By design, no extra fields are allowed: the payload must match the schema exactly.
>>> list(validate(Schema, {"field": "sample", "unknown_field": "sample"}))
[([], 'Unkown fields: unknown_field')]
Optional fields are marked with the required=False parameter. The parameter is available for every field and behaves uniformly.
>>> class Schema:
...     field = SampleOnly(required=False)
>>> list(validate(Schema, {"field": "sample"}))
[]
>>> list(validate(Schema, {"field": "not sample"}))
[(['field'], 'Not a sample')]
If an optional field is missing, no error is reported.
>>> list(validate(Schema, {}))
[]
An alternative way to report errors is to yield them. This is useful when several aspects of the input value need to be validated.
>>> class WithYieldedError(Field):
...
...     def _validate(self, value):
...         yield "Error 1"
...         yield "Error 2"
>>> class Schema:
...     field = WithYieldedError()
Each error gets its own tuple, even if the errors originate from the same field.
>>> list(validate(Schema, {"field": "sample"}))
[(['field'], 'Error 1'), (['field'], 'Error 2')]
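Supporting both a returned string and yielded strings means the caller has to normalize the two styles. The sketch below shows one possible way to do that; iter_errors is a hypothetical helper written for this example, and the library may handle it differently.

```python
def iter_errors(result):
    """Normalize the outcome of a validation hook into an iterator of messages.

    Assumed convention: None means no error, a string is a single error,
    and any other iterable (e.g. a generator) supplies several errors.
    """
    if result is None:
        return iter(())
    if isinstance(result, str):
        return iter([result])
    return iter(result)


def returns_one(value):
    # "return" style: a single message or nothing
    if value != "sample":
        return "Not a sample"


def yields_many(value):
    # "yield" style: any number of messages
    yield "Error 1"
    yield "Error 2"


print(list(iter_errors(returns_one("sample"))))      # []
print(list(iter_errors(returns_one("not sample"))))  # ['Not a sample']
print(list(iter_errors(yields_many("anything"))))    # ['Error 1', 'Error 2']
```

The string check has to come before the generic iteration, otherwise a single message would be split into individual characters.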
Nested structures can be described using a Dict field. This field requires a reference to the schema specifying the nested structure.
>>> class Child:
...     other_field = SampleOnly()
>>> class Parent:
...     child = Dict(Child)
>>> list(validate(Parent, {"child": {"other_field": "sample"}}))
[]
The absolute path to the error includes all the nodes of the structure’s tree.
>>> list(validate(Parent, {"child": {"other_field": "not sample"}}))
[(['child', 'other_field'], 'Not a sample')]
To represent collections of data (lists), use a List field. The field requires an instance of some other field as its first argument.
>>> class Schema:
...     field = List(SampleOnly())
>>> list(validate(Schema, {"field": "NOT A LIST"}))
[(['field'], 'Not a list')]
>>> list(validate(Schema, {"field": ["sample"]}))
[]
If the problem is in an individual item, the path to the node includes the integer index of that item.
>>> list(validate(Schema, {"field": ["not sample", "sample",
...                                  "not sample"]}))
[(['field', 0], 'Not a sample'), (['field', 2], 'Not a sample')]
More sophisticated nesting is possible by combining list and dict fields.
>>> class Child:
...     other_field = SampleOnly()
>>> class Parent:
...     child = List(Dict(Child))
>>> list(validate(Parent, {"child": [{"other_field": "sample"}]}))
[]
>>> list(validate(Parent, {"child": [{"other_field": "not sample"}]}))
[(['child', 0, 'other_field'], 'Not a sample')]
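To make the path-building behaviour above concrete, here is a minimal self-contained sketch of a recursive validator. It is not the dict_validator implementation: the Field, List, Dict and validate names below are simplified stand-ins written for illustration, the check() method is hypothetical, and the error messages are only modelled on the outputs shown above.

```python
class Field(object):
    """Simplified stand-in for a leaf field."""

    def __init__(self, required=True):
        self.required = required

    def check(self, value, path):
        # Hypothetical hook: yield (path, message) tuples for this node.
        message = self._validate(value)
        if message:
            yield (path, message)

    def _validate(self, value):
        return None


class SampleOnly(Field):
    def _validate(self, value):
        if value != "sample":
            return "Not a sample"


class List(Field):
    def __init__(self, item, **kwargs):
        super(List, self).__init__(**kwargs)
        self.item = item

    def check(self, value, path):
        if not isinstance(value, list):
            yield (path, "Not a list")
            return
        for index, element in enumerate(value):
            # The integer index becomes part of the path.
            for error in self.item.check(element, path + [index]):
                yield error


class Dict(Field):
    def __init__(self, schema, **kwargs):
        super(Dict, self).__init__(**kwargs)
        self.schema = schema

    def check(self, value, path):
        for error in validate(self.schema, value, path):
            yield error


def validate(schema, value, path=None):
    """Yield (path, message) tuples for every problem found in value."""
    path = [] if path is None else path
    if not isinstance(value, dict):
        yield (path, "Not a dict")
        return
    fields = {k: v for k, v in vars(schema).items() if isinstance(v, Field)}
    unknown = sorted(set(value) - set(fields))
    if unknown:
        yield (path, "Unknown fields: " + ", ".join(unknown))
    for name in sorted(fields):
        field = fields[name]
        if name not in value:
            if field.required:
                yield (path, 'Key "{}" is missing'.format(name))
            continue
        for error in field.check(value[name], path + [name]):
            yield error


class Child:
    other_field = SampleOnly()


class Parent:
    child = List(Dict(Child))


errors = list(validate(Parent, {"child": [{"other_field": "not sample"}]}))
print(errors)  # [(['child', 0, 'other_field'], 'Not a sample')]
```

Each level appends its own key or index to a copy of the path, so sibling branches never share or mutate the same list.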
dict_validator.describe(schema)
Describe a given schema.
Understands primitive, list and dict fields.
Parameters: schema – a class representing the structure to be documented
Yield: (path, {…description…}), e.g. (["parent", "child"], {"required": False})
>>> from dict_validator import Field, List, Dict, describe
Each custom field must be a Field subclass with the _validate method implemented.
See the validate function for details.
>>> class AnyValue(Field):
...
...     def _validate(self, value):
...         pass
To document a field, pass a description parameter to the constructor. The value must be a string. A description can be added to any field.
>>> class Child:
...     items = List(AnyValue("AnyValue item"),
...                  description="A collection of important items")
>>> class Parent:
...     '''Schema docstring'''
...
...     ignored_field = "Nothing"
...     child = Dict(Child, description="Dict child")
...     plain_field = AnyValue(description="Pure string", required=False)
Since the return value is a generator, it has to be explicitly converted to a list.
Note that the items of a list are denoted by the "{N}" string in the path.
Also note that the docstring of the schema class becomes the description of the top-level schema.
>>> from pprint import pprint
>>> pprint(sorted(describe(Parent)), width=60)
[([], {'description': 'Schema docstring', 'type': 'Dict'}),
 (['child'], {'description': 'Dict child', 'type': 'Dict'}),
 (['child', 'items'],
  {'description': 'A collection of important items',
   'type': 'List'}),
 (['child', 'items', '{N}'],
  {'description': 'AnyValue item', 'type': 'AnyValue'}),
 (['plain_field'],
  {'description': 'Pure string',
   'required': False,
   'type': 'AnyValue'})]
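The "{N}" placeholder can be illustrated with a small stand-alone sketch. The Field, List and Text classes below are hypothetical simplifications written for this example; in particular, this describe(path) signature is an assumption and differs from the library's own describe().

```python
class Field(object):
    """Simplified stand-in that can describe itself."""

    def __init__(self, description=None, required=True):
        self.description = description
        self.required = required

    def describe(self, path):
        info = {"type": type(self).__name__}
        if self.description:
            info["description"] = self.description
        if not self.required:
            info["required"] = False
        yield (path, info)


class List(Field):
    def __init__(self, item, **kwargs):
        super(List, self).__init__(**kwargs)
        self.item = item

    def describe(self, path):
        # First the list itself, then its items under the "{N}" placeholder.
        for entry in super(List, self).describe(path):
            yield entry
        for entry in self.item.describe(path + ["{N}"]):
            yield entry


class Text(Field):
    pass


tags = List(Text(description="One tag"), description="All tags")
result = list(tags.describe(["tags"]))
for path, info in result:
    print(path, info)
```

Because list items have no fixed index at description time, a symbolic path element stands in for every position.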
dict_validator.serialize(schema, value)
Serialize a value before sending it over the wire.
Understands primitive, list and dict fields.
Parameters:
- schema – a class representing the structure to be used for serialization
- value – a pythonic object
Returns: a dict ready to be sent over the wire
>>> from dict_validator import Field, List, Dict, serialize
Each custom field must be a Field subclass and should implement a serialize method to enable value transformations. By default, the value is returned as is.
See the Field docs for details.
>>> class AnyValue(Field):
...
...     def _validate(self, value):
...         pass
...
...     def serialize(self, value):
...         return "SERIALIZED {}".format(super(AnyValue, self)
...                                       .serialize(value))
>>> class Child:
...     items = List(AnyValue("String item"),
...                  description="A collection of important items")
>>> class Parent:
...     child = Dict(Child, description="Dict child")
...     plain_field = AnyValue(description="Pure string",
...                            required=False)
To construct a tree of Python objects to be serialized later, one can use the Namespace class from the standard library (argparse).
>>> from argparse import Namespace
>>> payload = Namespace(
...     plain_field="OUTGOING",
...     child=Namespace(
...         items=["OUTGOING"]
...     )
... )
>>> from pprint import pprint
>>> pprint(serialize(Parent, payload))
{'child': {'items': ['SERIALIZED OUTGOING']},
 'plain_field': 'SERIALIZED OUTGOING'}
When serializing optional fields, the missing values are set to None.
>>> class Schema:
...     field = AnyValue(required=False)
>>> serialize(Schema, {})
{'field': None}
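The general idea of turning a tree of Python objects into wire-ready data can be sketched without a schema. The to_wire function below is a hypothetical helper written for this example (it is not part of dict_validator), and the date handling is an assumed example of a Python-specific type that needs conversion.

```python
from argparse import Namespace
from datetime import date


def to_wire(value):
    """Recursively convert Namespace trees into plain dicts and lists."""
    if isinstance(value, Namespace):
        return {key: to_wire(item) for key, item in vars(value).items()}
    if isinstance(value, list):
        return [to_wire(item) for item in value]
    if isinstance(value, date):
        # Python-specific type turned into a wire-friendly string.
        return value.isoformat()
    return value


payload = Namespace(
    plain_field="OUTGOING",
    child=Namespace(items=[date(2020, 1, 2)]),
)
wire = to_wire(payload)
print(wire)
```

A schema-driven serializer would delegate the leaf conversion to each field's serialize method instead of hard-coding type checks as this sketch does.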
dict_validator.deserialize(schema, value)
Deserialize a value after sending it over the wire into a pythonic object.
Understands primitive, list and dict fields.
Parameters:
- schema – a class representing the structure to be used for deserialization
- value – a dict sent over the wire
Returns: a pythonic object
>>> from dict_validator import Field, List, Dict, deserialize
Each custom field must be a Field subclass and should implement a deserialize method to enable value transformations. By default, the value is returned as is.
See the Field docs for details.
>>> class AnyValue(Field):
...
...     def _validate(self, value):
...         pass
...
...     def deserialize(self, value):
...         return "DESERIALIZED {}".format(super(AnyValue, self)
...                                         .deserialize(value))
>>> class Child:
...     items = List(AnyValue("String item"),
...                  description="A collection of important items")
>>> class Parent:
...     child = Dict(Child, description="Dict child")
...     plain_field = AnyValue(description="Pure string",
...                            required=False)
>>> parent = deserialize(Parent, {
...     "child": {
...         "items": ["INCOMING"]
...     },
...     "plain_field": "INCOMING"
... })
>>> parent.plain_field
'DESERIALIZED INCOMING'
>>> parent.child.items[0]
'DESERIALIZED INCOMING'
When deserializing optional fields, the missing values are set to None.
>>> class Schema:
...     field = AnyValue(required=False)
>>> deserialize(Schema, {}).field
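Attribute-style access on the deserialized result, and None for missing optional values, can be sketched as follows. from_wire and its optional parameter are hypothetical names written for this example; the library derives optional fields from the schema rather than from an explicit list.

```python
from argparse import Namespace


def from_wire(value, optional=()):
    """Recursively convert dicts into Namespace objects.

    Names listed in `optional` (top level only in this sketch) are set to
    None when they are absent from the incoming dict.
    """
    if isinstance(value, dict):
        attrs = {name: None for name in optional}
        attrs.update((key, from_wire(item)) for key, item in value.items())
        return Namespace(**attrs)
    if isinstance(value, list):
        return [from_wire(item) for item in value]
    return value


parent = from_wire(
    {"child": {"items": ["INCOMING"]}},
    optional=["plain_field"],
)
print(parent.child.items[0])  # INCOMING
print(parent.plain_field)     # None
```

Pre-filling optional names with None means callers can read the attribute directly instead of guarding every access with hasattr.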
class dict_validator.Field(description=None, required=True)
Bases: object
The "leaf" primitive data type in a schema, e.g. string, integer, float, etc.
Parameters:
- description – textual explanation of what the field represents
- required – True by default. If False, the field is optional
Each field subclass must implement the _validate() abstract method.
Each field may also implement the _describe() method.
Apart from that, if custom serialization mechanisms should be in place, the serialize and deserialize methods can be overridden to provide non-default behaviour.
See helper functions for reference implementations of the class.
_describe()
Implement to supply extra info for the field’s public description.
Returns: str:%JSON-SERIALIZABLE% key:value pairs
Return type: dict
_validate(value)
Validate the incoming value; return an error message, or yield several error messages, if there are errors.
describe()
Do not override.
Yield: (path, {…description…}), e.g. (["parent", "child"], {"required": False})
deserialize(value)
Parameters: value – a payload sent over the wire
Returns: a payload with Python-specific data types
required
Do not override.
Returns: True if the field has to be present in the incoming dict
class dict_validator.Dict(schema, *args, **kwargs)
Bases: dict_validator.field.Field
A dict of values.
Parameters: schema (dict_validator.Schema) – class to be used for validation/serialization of the incoming values
describe()
Do not override.
Yield: (path, {…description…}), e.g. (["parent", "child"], {"required": False})
class dict_validator.List(schema, *args, **kwargs)
Bases: dict_validator.field.Field
A collection of arbitrary items.
Parameters: schema – Field subclass to be used to validate/serialize/deserialize individual items of the collection
describe()
Do not override.
Yield: (path, {…description…}), e.g. (["parent", "child"], {"required": False})
dict_validator.serialize_errors(validation_results)
Transform a denormalized generator over the collection of errors into a serializable normalized dict.
Parameters: validation_results – a collection of errors returned by validate()
Returns: dict with field paths as keys and lists of errors corresponding to those paths as values
>>> from dict_validator import serialize_errors
>>> error_collection = iter([
...     (['field', 0], 'Error #1'),
...     (['field', 2], 'Error #1'),
...     (['field', 2], 'Error #2')
... ])
>>> from pprint import pprint
>>> pprint(serialize_errors(error_collection))
{'field.0': ['Error #1'], 'field.2': ['Error #1', 'Error #2']}
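The flattening shown above can be approximated in a few lines. The flatten_errors function below is a hypothetical sketch written for this example, assuming path elements are joined with dots; it is not the library's own code.

```python
def flatten_errors(errors):
    """Group (path, message) tuples into {dotted_path: [messages]}."""
    result = {}
    for path, message in errors:
        # Integer list indexes are converted to strings before joining.
        key = ".".join(str(part) for part in path)
        result.setdefault(key, []).append(message)
    return result


error_collection = iter([
    (["field", 0], "Error #1"),
    (["field", 2], "Error #1"),
    (["field", 2], "Error #2"),
])
flat = flatten_errors(error_collection)
print(flat)  # {'field.0': ['Error #1'], 'field.2': ['Error #1', 'Error #2']}
```

In this sketch an empty path, which is how top-level problems are reported, maps to the empty string key.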
Subpackages
Submodules
class dict_validator.dict_field.Dict(schema, *args, **kwargs)
Bases: dict_validator.field.Field
A dict of values.
Parameters: schema (dict_validator.Schema) – class to be used for validation/serialization of the incoming values
describe()
Do not override.
Yield: (path, {…description…}), e.g. (["parent", "child"], {"required": False})
class dict_validator.field.Field(description=None, required=True)
Bases: object
The "leaf" primitive data type in a schema, e.g. string, integer, float, etc.
Parameters:
- description – textual explanation of what the field represents
- required – True by default. If False, the field is optional
Each field subclass must implement the _validate() abstract method.
Each field may also implement the _describe() method.
Apart from that, if custom serialization mechanisms should be in place, the serialize and deserialize methods can be overridden to provide non-default behaviour.
See helper functions for reference implementations of the class.
_describe()
Implement to supply extra info for the field’s public description.
Returns: str:%JSON-SERIALIZABLE% key:value pairs
Return type: dict
_validate(value)
Validate the incoming value; return an error message, or yield several error messages, if there are errors.
describe()
Do not override.
Yield: (path, {…description…}), e.g. (["parent", "child"], {"required": False})
deserialize(value)
Parameters: value – a payload sent over the wire
Returns: a payload with Python-specific data types
required
Do not override.
Returns: True if the field has to be present in the incoming dict
dict_validator.helpers.describe(schema)
Describe a given schema.
Understands primitive, list and dict fields.
Parameters: schema – a class representing the structure to be documented
Yield: (path, {…description…}), e.g. (["parent", "child"], {"required": False})
dict_validator.helpers.deserialize(schema, value)
Deserialize a value after sending it over the wire into a pythonic object.
Understands primitive, list and dict fields.
Parameters:
- schema – a class representing the structure to be used for deserialization
- value – a dict sent over the wire
Returns: a pythonic object
dict_validator.helpers.serialize(schema, value)
Serialize a value before sending it over the wire.
Understands primitive, list and dict fields.
Parameters:
- schema – a class representing the structure to be used for serialization
- value – a pythonic object
Returns: a dict ready to be sent over the wire
dict_validator.helpers.serialize_errors(validation_results)
Transform a denormalized generator over the collection of errors into a serializable normalized dict.
Parameters: validation_results – a collection of errors returned by validate()
Returns: dict with field paths as keys and lists of errors corresponding to those paths as values
dict_validator.helpers.validate(schema, value)
Validate value against a given schema.
Parameters:
- schema – a class representing the structure to be used for validation
- value – a dict
Yield: (path, error_msg), e.g. (["parent", "child"], "Error message")
class dict_validator.list_field.List(schema, *args, **kwargs)
Bases: dict_validator.field.Field
A collection of arbitrary items.
Parameters: schema – Field subclass to be used to validate/serialize/deserialize individual items of the collection
describe()
Do not override.
Yield: (path, {…description…}), e.g. (["parent", "child"], {"required": False})