How to Build a Lightweight Data Pipeline with Airtable and Python


Image by Editor | ChatGPT

 

Introduction

 
Airtable not only offers a flexible, spreadsheet-like interface for data storage and analysis, it also provides an API for programmatic interaction. In other words, you can connect it to external tools and technologies (for instance, Python) to build data pipelines or processing workflows, bringing your results back to your Airtable database (or simply “base”, in Airtable jargon).

This article demonstrates how to create a simple, ETL-like pipeline using the Airtable Python API. We will stick to the free tier, ensuring the approach works without paid features.

 

Airtable Dataset Setup

 
While the pipeline built in this article can be easily adapted to a variety of datasets, readers who are new to Airtable and need a project and stored dataset as a starting point can follow this recent introductory tutorial to Airtable and create a tabular dataset called “Customers”, containing 200 rows and the following columns (see image):

 

Customers dataset/table in Airtable | Image by Author

 

Airtable-Python Data Pipeline

 
In Airtable, go to your user avatar (at the time of writing, it’s the circled avatar located in the bottom-left corner of the app interface) and select “Builder Hub”. In the new screen (see screenshot below), click on “Personal access tokens”, then on “Create token”. Give it a name, and make sure you add at least these two scopes: data.records:read and data.records:write. Likewise, select the base where your customers table is located in the “Access” section, so that your token is granted access to this base.

 

Creating an Airtable API token | Image by Author

 

Once the token has been created, copy it and store it carefully in a safe place, as it will be shown only once. We’ll need it later. The token starts with pat followed by a long alphanumeric code.

Another key piece of information we will need to build our Python-based pipeline is the ID of our base. Return to your base in the Airtable web interface, and once there, you should see that its URL in the browser has a syntax like: https://airtable.com/app[xxxxxx]/xxxx/xxxx. The part we’re interested in copying is the app[xxxx] ID contained between two consecutive slashes (/): this is the base ID we’ll need.
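
If you would rather not copy the ID by hand, the extraction can be sketched in a few lines of Python. This is only an illustration: extract_base_id is a hypothetical helper, not part of pyairtable, and it assumes the URL layout described above, where the base ID is the path segment beginning with app. The IDs in the example are placeholders.

```python
from urllib.parse import urlparse


def extract_base_id(url):
    """Return the first URL path segment starting with 'app', or None if absent."""
    # Hypothetical helper: assumes the base ID is the path segment prefixed with "app".
    for segment in urlparse(url).path.split("/"):
        if segment.startswith("app"):
            return segment
    return None


# Placeholder URL for illustration only
print(extract_base_id("https://airtable.com/appXXXXXX/xxxx/xxxx"))
```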

With this in hand, and assuming you already have a populated table called “Customers” in your base, we’re ready to start our Python program. I will be using a notebook for coding it. If you are using an IDE, you may need to slightly change the part where the three Airtable environment variables are defined, so that they are read from an .env file instead. In this version, for simplicity and ease of illustration, we’ll define them directly in our notebook. Let’s start by installing the necessary dependencies:

!pip install pyairtable pandas python-dotenv

 

Next, we define the Airtable environment variables. Notice that for the first two, you need to replace the value with your actual access token and base ID, respectively:

import os
from dotenv import load_dotenv  # Needed only if reading variables from a .env file
from pyairtable import Api
import pandas as pd

PAT = "pat-xxx"  # Paste your PAT (Personal Access Token) here
BASE_ID = "app-xxx"  # Paste your Airtable base ID here
TABLE_NAME = "Customers"

api = Api(PAT)
# api.table() is used here because the older Table(PAT, BASE_ID, TABLE_NAME)
# form is deprecated in recent pyairtable releases
table = api.table(BASE_ID, TABLE_NAME)
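
For readers working in an IDE, one possible way to keep the token out of the source code is sketched below. It assumes a .env file in the working directory defining three variables whose names (AIRTABLE_PAT, AIRTABLE_BASE_ID, AIRTABLE_TABLE_NAME) are chosen here purely for illustration. read_settings and load_settings are hypothetical helpers; load_dotenv comes from the python-dotenv package installed earlier.

```python
import os

# Variable names are assumptions made for this sketch; use whatever your .env file defines.
REQUIRED = ("AIRTABLE_PAT", "AIRTABLE_BASE_ID", "AIRTABLE_TABLE_NAME")


def read_settings(env):
    """Pick the required settings out of a mapping, failing loudly if any is missing."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED}


def load_settings():
    """Load the .env file into the process environment, then read the settings."""
    from dotenv import load_dotenv  # provided by the python-dotenv package
    load_dotenv()
    return read_settings(os.environ)


# Example usage (requires a .env file next to your script):
# settings = load_settings()
# api = Api(settings["AIRTABLE_PAT"])
```

Keeping the validation in read_settings separate from the file loading makes it easy to check the logic with a plain dictionary, without touching a real .env file.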

 

We have just set up an instance of the Python Airtable API and instantiated a connection point to the customers table in our base. Now, this is how we read the entire dataset contained in our Airtable table and load it into a Pandas DataFrame. You just need to be careful to use the exact column names from the source table for the string arguments inside the get() method calls:

rows = []
for rec in table.all():  # Airtable allows 5 requests per second; pyairtable auto-retries on 429s
    fields = rec.get("fields", {})
    rows.append({
        "id": rec["id"],
        "CustomerID": fields.get("CustomerID"),
        "Gender": fields.get("Gender"),
        "Age": fields.get("Age"),
        "Annual Income (k$)": fields.get("Annual Income (k$)"),
        "Spending Score (1-100)": fields.get("Spending Score (1-100)"),
        "Income class": fields.get("Income Class"),
    })

df = pd.DataFrame(rows)
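
The loop above relies on the shape of the records returned by the API: each one is a dictionary with an "id" key and a "fields" dictionary, and Airtable leaves empty cells out of "fields" altogether, which is why get() is used rather than direct indexing. The sketch below illustrates this with hand-made sample records instead of a live API call. records_to_rows is a hypothetical helper and the sample values are invented for the example.

```python
def records_to_rows(records, columns):
    """Flatten Airtable-style records into flat dictionaries, one per row."""
    rows = []
    for rec in records:
        fields = rec.get("fields", {})
        row = {"id": rec["id"]}
        for col in columns:
            row[col] = fields.get(col)  # None when the cell is empty
        rows.append(row)
    return rows


# Invented sample records that mimic the structure of an API response
sample = [
    {"id": "rec1", "fields": {"CustomerID": 1, "Age": 19}},
    {"id": "rec2", "fields": {"CustomerID": 2}},  # Age cell left empty
]
rows = records_to_rows(sample, ["CustomerID", "Age"])
print(rows[1])  # {'id': 'rec2', 'CustomerID': 2, 'Age': None}
```

The resulting list of dictionaries can be passed to pd.DataFrame() in the same way as the rows list built above.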

 

Once the data has been loaded, it’s time to apply a simple transformation. For simplicity, we’ll apply just one, but we could apply as many as needed, just as we would usually do when preprocessing or cleaning datasets with Pandas. We will create a new binary attribute, called Is High Value, to denote high-value customers, i.e., those whose income and spending score are both high:

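def high_value(row):
    # High value means both spending score and annual income are at least 70
    try:
        return (row["Spending Score (1-100)"] >= 70) and (row["Annual Income (k$)"] >= 70)
    except TypeError:
        # Missing values (None) cannot be compared with numbers
        return False

df["Is High Value"] = df.apply(high_value, axis=1)
df.head()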
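
To make the rule concrete, here is the same check written for plain values. It is a minimal sketch rather than a replacement for the DataFrame code: the function name and the threshold parameter are illustrative, while the cutoff of 70 is the one used above. A missing value (None) makes the comparison raise TypeError, which is treated as "not high value".

```python
def is_high_value(spending_score, annual_income, threshold=70):
    """Return True only when both values reach the threshold."""
    try:
        return spending_score >= threshold and annual_income >= threshold
    except TypeError:
        # Comparing None with a number raises TypeError in Python 3
        return False


print(is_high_value(81, 78))    # True: both values are at least 70
print(is_high_value(81, 40))    # False: income is below 70
print(is_high_value(None, 90))  # False: spending score is missing
```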

 

Resulting dataset:

 

Airtable data transformation with Python and Pandas | Image by Author

 

Finally, it’s time to write the changes back to Airtable by incorporating the new data associated with the new column. There is one caveat: we first need to manually create a new column named “High Value” in our Airtable customers table, with its type set to “Checkbox” (the equivalent of a binary categorical attribute). Once this blank column has been created, run the following code in your Python program, and the new data will be automatically added to Airtable!

updates = []
for _, r in df.iterrows():
    if pd.isna(r["id"]):
        continue  # Skip rows without an Airtable record ID
    updates.append({
        "id": r["id"],
        "fields": {
            "High Value": bool(r["Is High Value"])
        }
    })

if updates:
    table.batch_update(updates)
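
Airtable's API accepts at most 10 records per update request, and, as far as I am aware, pyairtable's batch_update splits the list into such groups on your behalf. The sketch below only illustrates that grouping so you can estimate how many requests a write will take. chunk is a hypothetical helper, not a pyairtable function, and the record IDs are placeholders.

```python
def chunk(items, size=10):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 200 placeholder payloads shaped like the entries of the updates list above
updates = [{"id": f"rec{i}", "fields": {"High Value": False}} for i in range(200)]
batches = list(chunk(updates))
print(len(batches))  # 20 groups of 10 records each
```

At Airtable's limit of 5 requests per second, 20 requests for a 200-row table should complete within a few seconds.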

 

Time to go back to Airtable and see what changed in our source customers table! If at first glance you see no changes and the new column still looks empty, don’t panic just yet. Not many customers are labeled as “high value”, and you may need to scroll down a little to see some marked with a green tick:

 

Updated customers table | Image by Author

 

That’s it! You just built your own lightweight, ETL-like data pipeline based on a bidirectional interaction between Airtable and Python. Well done!

 

Wrapping Up

 
This article focused on showcasing data capabilities with Airtable, a versatile and user-friendly cloud-based platform for data management and analysis that combines features of spreadsheets and relational databases with AI-powered functions. Specifically, we showed how to run a lightweight data transformation pipeline with the Airtable Python API that reads data from Airtable, transforms it, and loads it back to Airtable, all within the capabilities and limitations of Airtable’s free version.
 
 

Iván Palomares Carrascosa is a leader, writer, speaker, and adviser in AI, machine learning, deep learning & LLMs. He trains and guides others in harnessing AI in the real world.
