lakefs.object module
Module containing lakeFS reference implementation
- class lakefs.object.LakeFSIOBase(obj, mode, pre_sign=None, client=None)
Bases:
_BaseLakeFSObject, IO
Base class for the lakeFS Reader and Writer classes
- close()
Finalizes any existing operations on the object and closes open descriptors. Inheriting classes should override close() as needed.
- Return type:
None
- fileno()
The file descriptor number as defined by the operating system. In the context of lakeFS it has no meaning
- Raises:
io.UnsupportedOperation – Always, since fileno is not supported for lakeFS objects
- Return type:
int
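Because fileno() always raises for lakeFS objects, code that accepts arbitrary streams may want to probe for a descriptor before relying on one. The following is a minimal sketch; has_os_descriptor is a hypothetical helper name, not part of the lakeFS SDK, and io.BytesIO is used as a stand-in stream because it raises the same exception.

```python
import io


def has_os_descriptor(stream) -> bool:
    """Return True only if the stream exposes a real OS file descriptor."""
    try:
        stream.fileno()
    except io.UnsupportedOperation:
        # Raised by in-memory streams and, per the documentation above,
        # by lakeFS reader and writer objects.
        return False
    return True


# Stand-in stream so the sketch runs without a lakeFS server.
in_memory = io.BytesIO(b"data")
supports_fd = has_os_descriptor(in_memory)
```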
- property mode: str
Returns the open mode for this object
- property name: str
Returns the name of the object relative to the repo and reference
- class lakefs.object.ObjectReader(obj, mode, pre_sign=None, client=None)
Bases:
LakeFSIOBase
ObjectReader provides read-only functionality for lakeFS objects with IO semantics. This Object is instantiated and returned for immutable reference types (Commit, Tag…)
- property closed: bool
Returns True after the object is closed
- flush()
Nothing to do for reader
- Raises:
ValueError – if reader is closed
- Return type:
None
- property pre_sign
Returns whether the pre_sign mode is enabled
- read(n=None)
Read object data
- Parameters:
n (int) – How many bytes to read. If n is None, reads from the current position to the end of the object. If the current position + n exceeds the object size, reads up to the end of the object.
- Return type:
str | bytes
- Returns:
The bytes read
- Raises:
ValueError – if reader is closed
OSError – if n is non-positive
ObjectNotFoundException – if repository id, reference id or object path does not exist
PermissionException – if user is not authorized to perform this operation, or operation is forbidden
ServerException – for any other errors
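Large objects are often easier to process in bounded chunks than with a single read(). The sketch below assumes the reader follows the usual file-like convention of returning an empty result at the end of the object; iter_chunks is a hypothetical helper, and io.BytesIO stands in for an ObjectReader so the example runs without a server.

```python
import io
from typing import Iterator


def iter_chunks(reader, chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield successive chunks of at most chunk_size bytes until the stream ends."""
    if chunk_size <= 0:
        # read() is documented to reject non-positive sizes, so fail early here.
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = reader.read(chunk_size)
        if not chunk:  # assumption: an empty result signals end of object
            break
        yield chunk


# Stand-in stream in place of obj.reader(mode='rb').
demo = io.BytesIO(b"abcdefghij")
chunks = list(iter_chunks(demo, chunk_size=4))
```

With a real object, the same helper would be called as iter_chunks(fd) inside a with obj.reader(mode='rb') as fd: block.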
- readline(limit=-1)
Read and return a line from the stream.
- Parameters:
limit (int) – If limit > -1, returns at most limit bytes
- Raises:
ValueError – if reader is closed
ObjectNotFoundException – if repository id, reference id or object path does not exist
PermissionException – if user is not authorized to perform this operation, or operation is forbidden
ServerException – for any other errors
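A common use of readline() is iterating over an object line by line. This is a minimal sketch under the assumption that readline() returns an empty result once the object is exhausted, as built-in file objects do; iter_lines is a hypothetical helper and io.BytesIO is a stand-in for an ObjectReader.

```python
import io


def iter_lines(reader, limit: int = -1):
    """Yield lines from a file-like reader until readline() returns nothing."""
    while True:
        line = reader.readline(limit)
        if not line:  # assumption: empty bytes/str means end of object
            break
        yield line


# Stand-in stream in place of obj.reader(mode='rb').
demo = io.BytesIO(b"first\nsecond\nthird")
lines = list(iter_lines(demo))
```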
- seek(offset, whence=0)
Move the object’s reading position
- Parameters:
offset (int) – The offset, in bytes, measured from the position selected by whence (the beginning of the file by default)
whence (int) – Optional. Defaults to os.SEEK_SET or 0 (absolute file positioning); other values are os.SEEK_CUR or 1 (seek relative to the current position) and os.SEEK_END or 2 (seek relative to the file’s end)
- Raises:
ValueError – if reader is closed
OSError – if calculated new position is negative
io.UnsupportedOperation – If whence value is unsupported
- Return type:
int
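The three whence values follow the standard Python file positioning rules. The sketch below illustrates them on an io.BytesIO stand-in rather than a real ObjectReader, on the assumption that the reader follows the same semantics as described above.

```python
import io
import os

stream = io.BytesIO(b"0123456789")  # stand-in for an ObjectReader

pos_abs = stream.seek(4, os.SEEK_SET)   # absolute: position 4
pos_rel = stream.seek(3, os.SEEK_CUR)   # relative to current: 4 + 3 = 7
pos_end = stream.seek(-2, os.SEEK_END)  # relative to end: 10 - 2 = 8
tail = stream.read()                    # reads from position 8 to the end
```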
- class lakefs.object.ObjectWriter(obj, mode, pre_sign=None, content_type=None, metadata=None, client=None)
Bases:
LakeFSIOBase
ObjectWriter provides write-only functionality for lakeFS objects with IO semantics. This object is instantiated and returned from the WriteableObject writer method. For the data to actually be written to the lakeFS server, the close() method must be invoked, either explicitly or implicitly by using the writer as a context manager.
- property closed: bool
Returns True after the object is closed
- flush()
Flushes the buffer to file. The flush is skipped while the total write size is still smaller than _BUFFER_SIZE, avoiding unnecessary writes to disk.
- Raises:
ValueError – if writer is closed
- Return type:
None
- property pre_sign: bool
Returns whether the pre_sign mode is enabled
- class lakefs.object.StoredObject(repository_id, reference_id, path, client=None)
Bases:
_BaseLakeFSObject
Class representing an object in lakeFS.
- copy(destination_branch_id, destination_path)
Copy the object to a destination branch
- Parameters:
destination_branch_id (str) – The destination branch to copy the object to
destination_path (str) – The path of the copied object in the destination branch
- Return type:
WriteableObject
- Returns:
The newly copied Object
- Raises:
ObjectNotFoundException – if repo id, reference id, destination branch id or object path does not exist
PermissionException – if user is not authorized to perform this operation, or operation is forbidden
ServerException – for any other errors
- property path: str
Returns the object’s path relative to repository and reference ids
- reader(mode='rb', pre_sign=None)
Context manager which provides a file-descriptor-like object that allows reading the given object.
Usage Example:
    import os

    import lakefs

    obj = lakefs.repository("<repository_name>").branch("<branch_name>").object("file.txt")
    file_size = obj.stat().size_bytes

    with obj.reader(mode='r', pre_sign=True) as fd:
        # print every other 10 chars
        while fd.tell() < file_size:
            print(fd.read(10))
            fd.seek(10, os.SEEK_CUR)
- Parameters:
mode (Literal['r', 'rb']) – Read mode - as supported by ReadModes
pre_sign (Optional[bool]) – (Optional), enforce the pre_sign mode on the lakeFS server. If not set, will probe server for information.
- Return type:
ObjectReader
- Returns:
A Reader object
- property ref: str
Returns the object’s reference id
- property repo: str
Returns the object’s repository id
- class lakefs.object.WriteableObject(repository_id, reference_id, path, client=None)
Bases:
StoredObject
WriteableObject inherits from StoredObject and provides read/write functionality for lakeFS objects using IO semantics. This object is instantiated and returned upon invoking object() on a Branch reference type.
- delete()
Delete object from lakeFS
- Raises:
ObjectNotFoundException – if repo id, reference id or object path does not exist
PermissionException – if user is not authorized to perform this operation, or operation is forbidden
ServerException – for any other errors
- Return type:
None
- upload(data, mode='w', pre_sign=None, content_type=None, metadata=None)
Uploads a new object or overwrites an existing object
- Parameters:
data (str | bytes) – The contents of the object to write (can be bytes or string)
mode (WriteModes) –
Write mode:
’x’ - Open for exclusive creation
’xb’ - Open for exclusive creation in binary mode
’w’ - Create a new object or truncate if exists
’wb’ - Create or truncate in binary mode
pre_sign (Optional[bool]) – (Optional) Explicitly state whether to use pre_sign mode when uploading the object. If None, will be taken from pre_sign property.
content_type (Optional[str]) – (Optional) Explicitly set the object Content-Type
metadata (Optional[dict[str, str]]) – (Optional) User metadata
- Return type:
WriteableObject
- Returns:
The WriteableObject instance representing the newly created object
- Raises:
ObjectExistsException – if object exists and mode is exclusive (‘x’)
ObjectNotFoundException – if repo id, reference id or object path does not exist
PermissionException – if user is not authorized to perform this operation, or operation is forbidden
ServerException – for any other errors
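The mode, content_type and metadata parameters can be combined in a small wrapper. The sketch below is illustrative only: upload_json is a hypothetical helper, the metadata key is invented for the example, and the helper relies solely on the upload() signature documented above.

```python
import json


def upload_json(obj, payload: dict, overwrite: bool = True):
    """Serialize payload to JSON and upload it with an explicit Content-Type."""
    # 'w' creates or truncates; 'x' requests exclusive creation, which is
    # documented to raise ObjectExistsException if the object already exists.
    mode = "w" if overwrite else "x"
    return obj.upload(
        data=json.dumps(payload),
        mode=mode,
        content_type="application/json",
        metadata={"source": "upload_json"},  # illustrative user metadata
    )
```

A call might look like upload_json(lakefs.repository("<repository_name>").branch("<branch_name>").object("config.json"), {"enabled": True}, overwrite=False).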
- writer(mode='wb', pre_sign=None, content_type=None, metadata=None)
Context manager which provides a file-descriptor-like object that allows writing the given object to lakeFS. Writes are saved in a buffer as long as the writer is open; the data is written to lakeFS only when the writer closes. The optional parameters can be modified by accessing the respective fields as long as the writer is still open.
Usage example of reading a file from local file system and writing it to lakeFS:
    import lakefs

    obj = lakefs.repository("<repository_name>").branch("<branch_name>").object("my_image")

    with open("my_local_image", mode='rb') as reader, obj.writer("wb") as writer:
        writer.write(reader.read())
- Parameters:
mode (Literal['x', 'xb', 'w', 'wb']) – Write mode - as supported by WriteModes
pre_sign (Optional[bool]) – (Optional), enforce the pre_sign mode on the lakeFS server. If not set, will probe server for information.
content_type (Optional[str]) – (Optional) Specify the data media type
metadata (Optional[dict[str, str]]) – (Optional) User defined metadata to save on the object
- Return type:
ObjectWriter
- Returns:
A Writer object
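The usage example above reads the whole local file into memory before writing. For large files, copying in fixed-size chunks keeps memory use bounded. This is a minimal sketch: stream_to_writer is a hypothetical helper, and io.BytesIO objects stand in for both the local file and the lakeFS writer so it can run without a server.

```python
import io


def stream_to_writer(source, writer, chunk_size: int = 1024 * 1024) -> int:
    """Copy source into writer in fixed-size chunks; return the bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty read means the source is exhausted
            break
        writer.write(chunk)
        total += len(chunk)
    return total


src = io.BytesIO(b"x" * 2500)  # stand-in for open("my_local_image", mode='rb')
dst = io.BytesIO()             # stand-in for obj.writer("wb")
copied = stream_to_writer(src, dst, chunk_size=1000)
```

With a real object, the helper would be called inside the same with block as in the usage example, passing the local file as source and the lakeFS writer as writer.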