lakeFS High-Level Python SDK

The lakeFS High-Level SDK for Python provides developers with the following features:

  1. Simpler programming interface with less configuration

  2. Inferring identity from environment

  3. Better abstractions for common, more complex operations (I/O, transactions, imports)


Requirements

Python 3.9+

Installation & Usage

pip install

pip install lakefs

Import the package

import lakefs

Getting Started

After installing the package, use the following snippet as a quick start:

import lakefs
from lakefs.client import Client

# The default client authenticates with the lakeFS server using configured credentials,
# taken from environment variables or a .lakectl.yaml file, if either exists
repo = lakefs.repository(repository_id="my-repo")

# Or explicitly initialize and provide a Client object
clt = Client(username="<lakefs_access_key_id>", password="<lakefs_secret_access_key>", host="<lakefs_endpoint>")
repo = lakefs.Repository(repository_id="my-repo", client=clt)

# From this point, proceed using the package according to the documentation
main_branch = repo.create(storage_namespace="<storage_namespace>").branch(branch_id="main")
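
The default client depends on credentials being discoverable in the environment. As a rough pre-flight check, the sketch below reports which variables are unset. The variable names are an assumption based on lakectl conventions and are not stated in this README, and missing_lakectl_env is a hypothetical helper, not part of the SDK.

```python
import os

# Assumed variable names (lakectl conventions); verify against your lakeFS version.
LAKECTL_ENV_VARS = (
    "LAKECTL_SERVER_ENDPOINT_URL",
    "LAKECTL_CREDENTIALS_ACCESS_KEY_ID",
    "LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY",
)


def missing_lakectl_env(environ=None):
    """Return the assumed lakectl variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in LAKECTL_ENV_VARS if not env.get(name)]
```

If the returned list is non-empty and no .lakectl.yaml file exists, passing an explicit Client, as in the snippet above, is likely the safer route.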


Difference between two branches

for i in lakefs.Repository("repo").ref("main").diff("twig"):
   print(i)

You can also use ref expressions here; for instance, .diff("main~2") also works. Ref expressions are the lakeFS analogue of Git's revision syntax.
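
As a minimal sketch of building such an expression programmatically, the helpers below format an ancestor ref and pass it to diff. Both ancestor and diff_against_ancestor are hypothetical names introduced here for illustration; only the Repository(...).ref(...).diff(...) chain comes from the example above.

```python
def ancestor(ref_id: str, generations: int) -> str:
    """Format a ref expression for the commit `generations` steps before ref_id."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return ref_id if generations == 0 else f"{ref_id}~{generations}"


def diff_against_ancestor(repo_id: str, ref_id: str, generations: int):
    """Diff a ref against its own ancestor, e.g. main against main~2."""
    import lakefs  # imported lazily so ancestor() is usable without the package

    ref = lakefs.Repository(repo_id).ref(ref_id)
    return list(ref.diff(ancestor(ref_id, generations)))
```

For example, diff_against_ancestor("repo", "main", 2) issues the same call as .diff("main~2") on the main ref.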

Search a stored object for a string

with lakefs.Repository("repo").ref("main").object("path/to/data").reader(mode="r") as f:
   for line in f:
      # Print every line that contains the search string
      if "quick" in line:
         print(line)
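
The search loop can be factored into a small helper that works on any iterable of text lines, which should include the text-mode reader shown above. This is a sketch: matching_lines is a hypothetical name, and io.StringIO stands in for a lakeFS object so the example runs without a server.

```python
import io


def matching_lines(lines, needle):
    """Yield (line_number, line) for each line that contains needle."""
    for number, line in enumerate(lines, start=1):
        if needle in line:
            yield number, line.rstrip("\n")


# In-memory stand-in for an object opened with reader(mode="r")
sample = io.StringIO("the quick brown fox\njumps over\nthe lazy dog, quick!\n")
hits = list(matching_lines(sample, "quick"))
```

Here hits holds the first and third lines together with their line numbers.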

Upload and commit some data

with lakefs.Repository("golden").branch("main").object("path/to/new").writer(mode="wb") as f:
   f.write(b"my data")

# Returns a Reference
lakefs.Repository("golden").branch("main").commit("added my data using lakeFS high-level SDK")

# Prints "my data"
with lakefs.Repository("golden").branch("main").object("path/to/new").reader(mode="r") as f:
   for line in f:
      print(line)

Unlike references, branches are writable. This example couldn’t work if we used a ref, because a ref can only be read.
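
The write-then-commit sequence above can be wrapped in one function. The sketch below assumes only the calls already used in this section (object(...).writer(mode="wb") and commit(message)); write_and_commit itself is a hypothetical helper, not part of the SDK.

```python
def write_and_commit(branch, path, data, message):
    """Write bytes to path on the given branch, then commit with message.

    `branch` is expected to behave like the object returned by
    lakefs.Repository(...).branch(...).
    """
    with branch.object(path).writer(mode="wb") as f:
        f.write(data)
    # Per the example above, commit returns a Reference
    return branch.commit(message)
```

Taking the branch as a parameter keeps the helper free of any repository or client configuration, so a stub object can stand in for it in tests.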


Tests

To run the tests using pytest, first clone the lakeFS git repository:

git clone
cd lakefs/clients/python-wrapper

Unit Tests

Inside the tests folder, execute pytest utests to run the unit tests.

Integration Tests

See the testing documentation for more information.

