lakefs.object module

Module containing the lakeFS object implementation

class lakefs.object.LakeFSIOBase(obj, mode, pre_sign=None, client=None)[source]

Bases: _BaseLakeFSObject, IO

Base class for the lakeFS Reader and Writer classes

close()[source]

Finalizes any existing operations on the object and closes open descriptors. Inheriting classes need to override close() as needed.

Return type:

None

fileno()[source]

The file descriptor number as defined by the operating system. It has no meaning in the context of lakeFS.

Raises:

io.UnsupportedOperation – Always, since fileno is not supported for lakeFS objects

Return type:

int
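Because fileno() always raises io.UnsupportedOperation, code that may receive either a local file or a lakeFS stream can probe for a real descriptor before relying on one. The following is a minimal sketch, not part of the lakeFS API: has_os_descriptor is a hypothetical helper, and io.BytesIO is used as a stand-in because it also rejects fileno() with io.UnsupportedOperation.

```python
import io


def has_os_descriptor(stream) -> bool:
    """Return True if the stream is backed by an OS-level file descriptor.

    Hypothetical helper for illustration; it only relies on fileno()
    raising io.UnsupportedOperation when no descriptor exists.
    """
    try:
        stream.fileno()
    except io.UnsupportedOperation:
        # Streams without an OS descriptor land here (io.BytesIO does,
        # and so would a lakeFS reader or writer per the text above).
        return False
    return True


# io.BytesIO is a stand-in (assumption) for a lakeFS reader or writer.
in_memory = io.BytesIO(b"data")
print(has_os_descriptor(in_memory))
```

A caller can use the result to skip descriptor-based operations such as os.fstat and fall back to the stream API.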

abstract flush()[source]
Return type:

None

isatty()[source]

Irrelevant for the lakeFS implementation

Return type:

bool

property mode: str

Returns the open mode for this object

property name: str

Returns the name of the object relative to the repo and reference

abstract read(n=None)[source]
Return type:

str | bytes

abstract readable()[source]
Return type:

bool

readline(limit=-1)[source]

Must be explicitly implemented by inheriting class

readlines(hint=-1)[source]

Must be explicitly implemented by inheriting class

abstract seek(offset, whence=0)[source]
Return type:

int

abstract seekable()[source]
Return type:

bool

tell()[source]

For readers, the current read position. For writers, it can be used as an indication of the number of bytes written.

Return type:

int

truncate(size=None)[source]

Unsupported by lakeFS implementation

Return type:

int

abstract writable()[source]
Return type:

bool

abstract write(s)[source]
Return type:

int

writelines(lines)[source]

Unsupported by lakeFS implementation

Return type:

None

class lakefs.object.ObjectReader(obj, mode, pre_sign=None, client=None)[source]

Bases: LakeFSIOBase

ObjectReader provides read-only functionality for lakeFS objects with IO semantics. This Object is instantiated and returned for immutable reference types (Commit, Tag…)

close()[source]

Close open descriptors

Return type:

None

property closed: bool

Returns True after the object is closed

flush()[source]

Nothing to do for reader

Raises:

ValueError – if reader is closed

Return type:

None

property pre_sign

Returns whether the pre_sign mode is enabled

read(n=None)[source]

Read object data

Parameters:

n (int) – How many bytes to read. If n is None, reads from the current position to the end of the object. If current position + n > object size.

Return type:

str | bytes

Returns:

The bytes read

Raises:
  • ValueError – if reader is closed

  • OSError – if n is non-positive

  • ObjectNotFoundException – if repository id, reference id or object path does not exist

  • PermissionException – if user is not authorized to perform this operation, or operation is forbidden

  • ServerException – for any other errors
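The read(n) contract above lends itself to chunked reading. The sketch below is an illustration under stated assumptions rather than lakeFS code: read_in_chunks is a hypothetical helper, size is expected to come from something like stat().size_bytes, and io.BytesIO stands in for an ObjectReader. Each request is capped at the remaining byte count, so the loop never asks for more than the object holds.

```python
import io
from typing import BinaryIO, Iterator


def read_in_chunks(fd: BinaryIO, size: int, chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield chunks from fd until `size` bytes have been consumed.

    Hypothetical helper: `fd` is any binary stream with read() and tell().
    """
    if chunk_size <= 0:
        # The reader rejects non-positive n, so fail early here as well.
        raise ValueError("chunk_size must be positive")
    while fd.tell() < size:
        # Never request bytes beyond the end of the object.
        remaining = size - fd.tell()
        yield fd.read(min(chunk_size, remaining))


# io.BytesIO is a stand-in (assumption) for an ObjectReader opened in 'rb' mode.
payload = b"abcdefghij"
stand_in = io.BytesIO(payload)
chunks = list(read_in_chunks(stand_in, size=len(payload), chunk_size=4))
print(chunks)
```

The position check mirrors the loop condition used in the reader() usage example, so the sketch does not depend on how read() behaves at the end of the object.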

readable()[source]

Returns True always

Return type:

bool

readline(limit=-1)[source]

Read and return a line from the stream.

Parameters:

limit (int) – If limit > -1 returns at most limit bytes

Raises:

seek(offset, whence=0)[source]

Move the object’s reading position

Parameters:
  • offset (int) – The offset, relative to the position selected by whence (the beginning of the file by default)

  • whence (int) – Optional. Defaults to os.SEEK_SET or 0 (absolute file positioning); other values are os.SEEK_CUR or 1 (seek relative to the current position) and os.SEEK_END or 2 (seek relative to the file’s end)

Raises:
  • ValueError – if reader is closed

  • OSError – if calculated new position is negative

  • io.UnsupportedOperation – If whence value is unsupported

Return type:

int
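The positioning rules for seek() can be restated as a small pure function. This is a sketch of the documented behavior, not the library's implementation: resolve_seek and its current and size parameters are hypothetical names introduced for illustration.

```python
import io
import os


def resolve_seek(current: int, size: int, offset: int, whence: int = os.SEEK_SET) -> int:
    """Compute the position a seek() call would move to.

    Hypothetical helper mirroring the rules listed above.
    """
    if whence == os.SEEK_SET:
        new_pos = offset            # absolute positioning
    elif whence == os.SEEK_CUR:
        new_pos = current + offset  # relative to the current position
    elif whence == os.SEEK_END:
        new_pos = size + offset     # relative to the end of the object
    else:
        raise io.UnsupportedOperation(f"unsupported whence value: {whence}")
    if new_pos < 0:
        raise OSError("calculated position is negative")
    return new_pos


print(resolve_seek(current=5, size=100, offset=10, whence=os.SEEK_CUR))   # 15
print(resolve_seek(current=5, size=100, offset=-10, whence=os.SEEK_END))  # 90
```

Note that io.UnsupportedOperation is itself a subclass of OSError, so a handler that needs to tell the two failures apart should catch io.UnsupportedOperation first.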

seekable()[source]

Returns True always

Return type:

bool

writable()[source]

Unsupported - read only object

Return type:

bool

write(s)[source]

Unsupported for reader object

Return type:

int

class lakefs.object.ObjectWriter(obj, mode, pre_sign=None, content_type=None, metadata=None, client=None)[source]

Bases: LakeFSIOBase

ObjectWriter provides write-only functionality for lakeFS objects with IO semantics. This object is instantiated and returned from the WriteableObject writer method. The data is written to the lakeFS server only when close() is invoked, either explicitly or implicitly when the writer is used as a context manager.

close()[source]

Write the data to the lakeFS server

Return type:

None

property closed: bool

Returns True after the object is closed

discard()[source]

Discards the write buffer and closes the writer

Return type:

None

flush()[source]

Flushes the buffer to file. The flush is skipped while the total write size is still smaller than _BUFFER_SIZE, to avoid an unnecessary write to disk.

Raises:

ValueError – if writer is closed

Return type:

None

property pre_sign: bool

Returns whether the pre_sign mode is enabled

read(n=None)[source]

Unsupported for writer class

Return type:

str | bytes

readable()[source]

ObjectWriter is write-only - return False always

Return type:

bool

seek(offset, whence=0)[source]

Unsupported for writer class

Return type:

int

seekable()[source]

ObjectWriter is not seekable. Returns False always

Return type:

bool

writable()[source]

Returns True always

Return type:

bool

write(s)[source]

Write data to buffer

Parameters:

s (AnyStr) – The data to write

Return type:

int

Returns:

The number of bytes written to buffer

Raises:

ValueError – if writer is closed
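The ObjectWriter life cycle described above (writes are buffered, close() sends the data, discard() drops it) can be modeled with a small stand-in class. Everything in this sketch is hypothetical: BufferingWriter and its upload callback are not lakeFS names, and details such as close() being idempotent and the context manager closing on exit regardless of errors are assumptions made for the illustration.

```python
from typing import Callable, List


class BufferingWriter:
    """Hypothetical stand-in that buffers writes and hands them off on close()."""

    def __init__(self, upload: Callable[[bytes], None]) -> None:
        self._upload = upload
        self._buffer = bytearray()
        self.closed = False

    def write(self, data: bytes) -> int:
        if self.closed:
            raise ValueError("I/O operation on closed writer")
        self._buffer.extend(data)
        return len(data)

    def discard(self) -> None:
        # Drop buffered data and close without uploading anything.
        self._buffer.clear()
        self.closed = True

    def close(self) -> None:
        if self.closed:
            return  # assumption: closing twice is a no-op
        self._upload(bytes(self._buffer))
        self.closed = True

    def __enter__(self) -> "BufferingWriter":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()


uploaded: List[bytes] = []
with BufferingWriter(uploaded.append) as writer:
    writer.write(b"hello ")
    writer.write(b"world")
    assert uploaded == []  # nothing is sent while the writer is open
print(uploaded)
```

The point of the model is the ordering: data becomes visible to the destination only after the with block exits, which is why a writer that is never closed leaves nothing behind.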

class lakefs.object.StoredObject(repository_id, reference_id, path, client=None)[source]

Bases: _BaseLakeFSObject

Class representing an object in lakeFS.

copy(destination_branch_id, destination_path)[source]

Copy the object to a destination branch

Parameters:
  • destination_branch_id (str) – The destination branch to copy the object to

  • destination_path (str) – The path of the copied object in the destination branch

Return type:

WriteableObject

Returns:

The newly copied Object

Raises:

exists()[source]

Returns True if object exists in lakeFS, False otherwise

Return type:

bool

property path: str

Returns the object’s path relative to repository and reference ids

reader(mode='rb', pre_sign=None)[source]

Context manager which provides a file-descriptor-like object that allows reading the given object.

Usage Example:

import os

import lakefs

obj = lakefs.repository("<repository_name>").branch("<branch_name>").object("file.txt")
file_size = obj.stat().size_bytes

with obj.reader(mode='r', pre_sign=True) as fd:
    # Print every other 10 chars: read 10, then skip the next 10
    while fd.tell() < file_size:
        print(fd.read(10))
        fd.seek(10, os.SEEK_CUR)

Parameters:
  • mode (Literal['r', 'rb']) – Read mode - as supported by ReadModes

  • pre_sign (Optional[bool]) – (Optional) Enforce the pre_sign mode on the lakeFS server. If not set, the server is probed for this information.

Return type:

ObjectReader

Returns:

A Reader object

property ref: str

Returns the object’s reference id

property repo: str

Returns the object’s repository id

stat()[source]

Return the Stat object representing this object

Return type:

ObjectInfo

class lakefs.object.WriteableObject(repository_id, reference_id, path, client=None)[source]

Bases: StoredObject

WriteableObject inherits from StoredObject and provides read/write functionality for lakeFS objects using IO semantics. This object is instantiated and returned upon invoking object() on a Branch reference type.

delete()[source]

Delete object from lakeFS

Raises:

Return type:

None

upload(data, mode='w', pre_sign=None, content_type=None, metadata=None)[source]

Uploads a new object or overwrites an existing one

Parameters:
  • data (str | bytes) – The contents of the object to write (can be bytes or string)

  • mode (WriteModes) –

    Write mode:

    ’x’ - Open for exclusive creation

    ’xb’ - Open for exclusive creation in binary mode

    ’w’ - Create a new object or truncate if exists

    ’wb’ - Create or truncate in binary mode

  • pre_sign (Optional[bool]) – (Optional) Explicitly state whether to use pre_sign mode when uploading the object. If None, will be taken from pre_sign property.

  • content_type (Optional[str]) – (Optional) Explicitly set the object Content-Type

  • metadata (Optional[dict[str, str]]) – (Optional) User metadata

Return type:

WriteableObject

Returns:

The WriteableObject representing the newly created object

Raises:

writer(mode='wb', pre_sign=None, content_type=None, metadata=None)[source]

Context manager which provides a file-descriptor-like object that allows writing the given object to lakeFS. Writes are saved in a buffer as long as the writer is open; the data is written to lakeFS only when the writer closes. The optional parameters can be modified by accessing the respective fields as long as the writer is still open.

Usage example of reading a file from local file system and writing it to lakeFS:

import lakefs

obj = lakefs.repository("<repository_name>").branch("<branch_name>").object("my_image")

with open("my_local_image", mode='rb') as reader, obj.writer("wb") as writer:
    writer.write(reader.read())

Parameters:
  • mode (Literal['x', 'xb', 'w', 'wb']) – Write mode - as supported by WriteModes

  • pre_sign (Optional[bool]) – (Optional) Enforce the pre_sign mode on the lakeFS server. If not set, the server is probed for this information.

  • content_type (Optional[str]) – (Optional) Specify the data media type

  • metadata (Optional[dict[str, str]]) – (Optional) User defined metadata to save on the object

Return type:

ObjectWriter

Returns:

A Writer object
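The usage example above reads the whole local file into memory with a single read() call. For large files, a chunked copy into the writer may be preferable. The following is a minimal sketch under stated assumptions: copy_stream is a hypothetical helper, not part of lakeFS, and io.BytesIO objects stand in for both the local file and the ObjectWriter.

```python
import io
from typing import BinaryIO


def copy_stream(src: BinaryIO, dst: BinaryIO, chunk_size: int = 1024 * 1024) -> int:
    """Copy src to dst in fixed-size chunks and return the bytes copied.

    Hypothetical helper: `src` is a regular binary file object and `dst`
    is anything with a write() method, such as a writer opened in 'wb' mode.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break  # a local file returns b"" at end of file
        dst.write(chunk)
        total += len(chunk)
    return total


# Both streams are in-memory stand-ins (assumption) for a local file
# and a lakeFS ObjectWriter.
source = io.BytesIO(b"x" * 2500)
target = io.BytesIO()
copied = copy_stream(source, target, chunk_size=1024)
print(copied)
```

With lakeFS, the same helper would presumably be called inside the with statement from the usage example, passing the local file as src and the object returned by obj.writer("wb") as dst, so that the upload still happens when the writer closes.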