Pydantic Tutorial: Data Validation in Python Made Simple – KDnuggets


Image by Author

 

 

Python is a dynamically typed language, so you can create variables without explicitly specifying the data type, and you can always assign a completely different value to the same variable. While this makes things easier for beginners, it also makes it just as easy to create invalid objects in your Python application.

Well, you can create data classes, which allow defining fields with type hints. But they don't offer direct support for validating data. Enter Pydantic, a popular library that provides out-of-the-box support for data validation and serialization. That means you can:

  • leverage Python's type hints to validate fields,
  • use the custom fields and built-in validators Pydantic provides, and
  • define custom validators as needed.
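To see why type hints alone are not enough, here is a minimal sketch using only the standard library. The `EmployeeRecord` data class is a hypothetical name used for illustration; it is not part of the tutorial's code.

```python
from dataclasses import dataclass


@dataclass
class EmployeeRecord:  # hypothetical class, for illustration only
    # Type hints document intent, but data classes do not enforce them.
    name: str
    age: int


# Both calls succeed; the second stores a string where an int was declared.
valid = EmployeeRecord(name="John Doe", age=30)
invalid = EmployeeRecord(name="Jane Smith", age="twenty-five")

print(type(valid.age).__name__)    # int
print(type(invalid.age).__name__)  # str
```

No error is raised for the second object, which is the gap a validation library is meant to close.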

In this tutorial, we'll model a simple 'Employee' class and validate the values of the different fields using the data validation functionality of Pydantic. Let's get started!

 

 

If you have Python 3.8 or a later version, you can install Pydantic using pip:

$ pip install pydantic

If you need email validation in your application, you can install the optional email-validator dependency when installing Pydantic like so:

$ pip install pydantic[email]

 

Alternatively, you can run the following command to install email-validator:

$ pip install email-validator

 

Note: In our example, we'll use email validation. So please install the dependency if you'd like to code along.

 

 

Now let's create a simple Employee class. First, we create a class that inherits from the BaseModel class. The various fields and the expected types are specified as shown:

# main.py

from pydantic import BaseModel, EmailStr

class Employee(BaseModel):
    name: str
    age: int
    email: EmailStr
    department: str
    employee_id: str

 

Notice that we've specified email to be of the EmailStr type that Pydantic supports instead of a regular Python string. This is because not all valid strings are valid emails.

 

 

Because the Employee class is simple, let's add validation for the following fields:

  • email: should be a valid email. Specifying the EmailStr type accounts for this, and we run into errors when creating objects with an invalid email.
  • employee_id: should be a valid employee ID. We'll implement a custom validation for this field.

 

Implementing Custom Validation

 

For this example, let's say the employee_id should be a string of length 6 containing only alphanumeric characters.
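Before wiring this rule into a model, it can help to try the check on its own. The sketch below expresses the same rule as a plain function; `is_valid_employee_id` is a hypothetical helper name, not something Pydantic provides.

```python
def is_valid_employee_id(value: str) -> bool:
    """Return True if value is exactly 6 alphanumeric characters."""
    return value.isalnum() and len(value) == 6


# Try the rule on a few candidate IDs.
for candidate in ["EMP001", "EMP0034", "EMP-01", ""]:
    print(f"{candidate!r}: {is_valid_employee_id(candidate)}")
```

Only "EMP001" passes: "EMP0034" has seven characters, "EMP-01" contains a hyphen, and the empty string is not alphanumeric.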

We can use the @validator decorator with the employee_id field as the argument and define the validate_employee_id method as shown. Note that in Pydantic v2, @validator still works but is deprecated in favor of @field_validator.

# main.py

from pydantic import BaseModel, EmailStr, validator

...

    # Defined inside the Employee class
    @validator("employee_id")
    def validate_employee_id(cls, v):
        if not v.isalnum() or len(v) != 6:
            raise ValueError("Employee ID must be exactly 6 alphanumeric characters")
        return v

 

Now this method checks if the employee_id is valid for the Employee objects we try to create.

At this point, your script should look like so:

# main.py

from pydantic import BaseModel, EmailStr, validator

class Employee(BaseModel):
    name: str
    age: int
    email: EmailStr
    department: str
    employee_id: str

    @validator("employee_id")
    def validate_employee_id(cls, v):
        if not v.isalnum() or len(v) != 6:
            raise ValueError("Employee ID must be exactly 6 alphanumeric characters")
        return v

 

 

In practice, it's very common to parse JSON responses from APIs into data structures like Python dictionaries. Say we have an 'employees.json' file (in the current directory) with the following data:

# employees.json

[
	{
    	"name": "John Doe",
    	"age": 30,
    	"email": "john.doe@example.com",
    	"department": "Engineering",
    	"employee_id": "EMP001"
	},
	{
    	"name": "Jane Smith",
    	"age": 25,
    	"email": "jane.smith@example.com",
    	"department": "Marketing",
    	"employee_id": "EMP002"
	},
	{
    	"name": "Alice Brown",
    	"age": 35,
    	"email": "invalid-email",
    	"department": "Finance",
    	"employee_id": "EMP0034"
	},
	{
    	"name": "Dave West",
    	"age": 40,
    	"email": "dave.west@example.com",
    	"department": "HR",
    	"employee_id": "EMP005"
	}
]
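As a quick sketch of what parsing such data produces, the snippet below uses the built-in json module on a shortened, inline version of one record. The `summarize` function is a hypothetical stand-in that shows how a record's keys can be passed as keyword arguments with `**`.

```python
import json

# A shortened, inline version of one record (assumption: trimmed for brevity).
raw = '[{"name": "John Doe", "age": 30, "employee_id": "EMP001"}]'

records = json.loads(raw)  # a JSON array becomes a list of dicts


def summarize(name: str, age: int, employee_id: str) -> str:
    """Hypothetical helper that accepts a record's keys as keyword arguments."""
    return f"{name} ({age}), ID {employee_id}"


first = records[0]
summary = summarize(**first)  # each key in the dict becomes a keyword argument

print(type(records).__name__, type(first).__name__)  # list dict
print(summary)  # John Doe (30), ID EMP001
```

The same unpacking works with any callable whose parameter names match the keys, including a class constructor.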

 

We can see that in the third record, corresponding to 'Alice Brown', we have two fields that are invalid: the email and the employee_id.

 

Because we've specified that email should be EmailStr, the email string will be automatically validated. We've also added the validate_employee_id class method to check if the objects have a valid employee ID.

Now let's add the code to parse the JSON file and create employee objects (we'll use the built-in json module for this). We also import the ValidationError class from Pydantic. In essence, we try to create objects, handle ValidationError exceptions when the data validation fails, and also print out the errors:

# main.py

import json
from pydantic import BaseModel, EmailStr, ValidationError, validator
...

# Load and parse the JSON data
with open("employees.json", "r") as f:
    data = json.load(f)

# Validate each employee record
for record in data:
    try:
        employee = Employee(**record)
        print(f"Valid employee record: {employee.name}")
    except ValidationError as e:
        print(f"Invalid employee record: {record['name']}")
        print(f"Errors: {e.errors()}")

 

When you run the script, you should see a similar output:

Output >>>

Valid employee record: John Doe
Valid employee record: Jane Smith
Invalid employee record: Alice Brown
Errors: [{'type': 'value_error', 'loc': ('email',), 'msg': 'value is not a valid email address: The email address is not valid. It must have exactly one @-sign.', 'input': 'invalid-email', 'ctx': {'reason': 'The email address is not valid. It must have exactly one @-sign.'}}, {'type': 'value_error', 'loc': ('employee_id',), 'msg': 'Value error, Employee ID must be exactly 6 alphanumeric characters', 'input': 'EMP0034', 'ctx': {'error': ValueError('Employee ID must be exactly 6 alphanumeric characters')}, 'url': 'https://errors.pydantic.dev/2.6/v/value_error'}]
Valid employee record: Dave West

 

As expected, only the record corresponding to 'Alice Brown' is not a valid employee object. Zooming in to the relevant part of the output, you can see a detailed message on why the email and employee_id fields are invalid.
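The error list is fairly dense when printed as is. As a rough sketch, assuming each error is a dictionary with 'loc' and 'msg' keys as in the output above, a small helper can reduce it to one line per field. The `summarize_errors` name is hypothetical, and `sample_errors` is a trimmed copy of the output rather than a live result.

```python
# Trimmed copy of the error list from the output above (only 'loc' and 'msg' kept).
sample_errors = [
    {
        "loc": ("email",),
        "msg": "value is not a valid email address: The email address is "
               "not valid. It must have exactly one @-sign.",
    },
    {
        "loc": ("employee_id",),
        "msg": "Value error, Employee ID must be exactly 6 alphanumeric characters",
    },
]


def summarize_errors(errors):
    """Return one 'field: message' string per error dictionary."""
    return [
        f"{'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
        for err in errors
    ]


for line in summarize_errors(sample_errors):
    print(line)
```

In the script, the same helper could be called on e.errors() inside the except block.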

Here's the complete code:

# main.py

import json
from pydantic import BaseModel, EmailStr, ValidationError, validator

class Employee(BaseModel):
    name: str
    age: int
    email: EmailStr
    department: str
    employee_id: str

    @validator("employee_id")
    def validate_employee_id(cls, v):
        if not v.isalnum() or len(v) != 6:
            raise ValueError("Employee ID must be exactly 6 alphanumeric characters")
        return v

# Load and parse the JSON data
with open("employees.json", "r") as f:
    data = json.load(f)

# Validate each employee record
for record in data:
    try:
        employee = Employee(**record)
        print(f"Valid employee record: {employee.name}")
    except ValidationError as e:
        print(f"Invalid employee record: {record['name']}")
        print(f"Errors: {e.errors()}")

 

 

That's all for this introductory tutorial on Pydantic! I hope you learned the basics of modeling your data and using both the built-in and custom validations that Pydantic offers. All the code used in this tutorial is on GitHub.

Next, you may try using Pydantic in your Python projects and also explore its serialization capabilities. Happy coding!
 
 

Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.
