pub trait ObjectStoreRegistry: Send + Sync + Debug + 'static {
    // Required methods
    fn register_store(
        &self,
        url: &Url,
        store: Arc<dyn ObjectStore>,
    ) -> Option<Arc<dyn ObjectStore>>;
    fn get_store(&self, url: &Url) -> Result<Arc<dyn ObjectStore>>;
}
ObjectStoreRegistry maps a URL to an ObjectStore instance, and allows DataFusion to read from different ObjectStore instances. For example, DataFusion might be configured so that:
- s3://my_bucket/lineitem/ is mapped to the /lineitem path on an AWS S3 object store bound to my_bucket
- s3://my_other_bucket/lineitem/ is mapped to the (same) /lineitem path on a different AWS S3 object store bound to my_other_bucket
When given a ListingTableUrl, DataFusion tries to find an appropriate ObjectStore. For example:
create external table unicorns stored as parquet location 's3://my_bucket/lineitem/';
In this particular case, the URL s3://my_bucket/lineitem/ will be provided to ObjectStoreRegistry::get_store and one of three things will happen:
- If an ObjectStore has been registered with ObjectStoreRegistry::register_store with s3://my_bucket, that ObjectStore will be returned.
- If an AWS S3 object store can be ad-hoc discovered by the URL s3://my_bucket/lineitem/, this object store will be registered with key s3://my_bucket and returned.
- Otherwise an error will be returned, indicating that no suitable ObjectStore could be found.
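The three outcomes above can be sketched with standard-library types only. This is a simplified stand-in, not DataFusion's implementation: SketchRegistry, Store, and url_key are hypothetical names, a store is represented by a label string instead of Arc<dyn ObjectStore>, URLs are plain strings rather than Url values, and "discovery" is assumed to succeed for any s3:// URL.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Stand-in for `Arc<dyn ObjectStore>`: only a descriptive label is stored.
pub type Store = Arc<String>;

/// Hypothetical registry mirroring the three outcomes described above.
#[derive(Debug, Default)]
pub struct SketchRegistry {
    stores: RwLock<HashMap<String, Store>>,
}

/// Reduce a URL string to `scheme://authority` (simplified, no validation).
fn url_key(url: &str) -> Option<String> {
    let (scheme, rest) = url.split_once("://")?;
    let authority = rest.split('/').next().unwrap_or("");
    Some(format!("{scheme}://{authority}"))
}

impl SketchRegistry {
    /// Register `store` under the key of `url`, returning any previous store.
    pub fn register_store(&self, url: &str, store: Store) -> Option<Store> {
        let key = url_key(url).expect("sketch assumes scheme://authority/...");
        self.stores.write().unwrap().insert(key, store)
    }

    /// Look up a store for `url`, following the three outcomes in order.
    pub fn get_store(&self, url: &str) -> Result<Store, String> {
        let key = url_key(url).ok_or_else(|| format!("invalid URL: {url}"))?;

        // Outcome 1: a store was registered explicitly under this key.
        let existing = self.stores.read().unwrap().get(&key).cloned();
        if let Some(store) = existing {
            return Ok(store);
        }

        // Outcome 2: ad-hoc discovery (assumed here: any s3:// URL works),
        // followed by registration so later lookups hit outcome 1.
        if key.starts_with("s3://") {
            let store: Store = Arc::new(format!("discovered {key}"));
            self.stores.write().unwrap().insert(key, Arc::clone(&store));
            return Ok(store);
        }

        // Outcome 3: nothing registered and nothing discoverable.
        Err(format!("no suitable object store found for {url}"))
    }
}
```

In this sketch, a store registered under s3://my_bucket is returned for s3://my_bucket/lineitem/, an unregistered s3:// bucket is created on first use and cached, and any other scheme yields an error.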
This allows for two different use-cases:
- Systems where object store buckets are explicitly created using DDL can register these buckets using ObjectStoreRegistry::register_store.
- Systems relying on ad-hoc discovery, without corresponding DDL, can create ObjectStore instances lazily by providing a custom implementation of ObjectStoreRegistry.
Required Methods
fn register_store(
    &self,
    url: &Url,
    store: Arc<dyn ObjectStore>,
) -> Option<Arc<dyn ObjectStore>>
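Registers the store under the key derived from the URL. If a store with the same key was already registered, it is replaced and the previous store is returned; otherwise None is returned.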
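The replace-and-return behaviour is the same contract as HashMap::insert. A minimal sketch, assuming a plain map keyed by string and a label standing in for Arc<dyn ObjectStore> (the register function and Store alias are hypothetical, not part of DataFusion):

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for `Arc<dyn ObjectStore>`.
type Store = Arc<str>;

/// Insert `store` under `key`, returning whatever was registered there before.
fn register(map: &mut HashMap<String, Store>, key: &str, store: Store) -> Option<Store> {
    map.insert(key.to_string(), store)
}
```

A caller can inspect the returned Option to detect that an earlier registration for the same key was overwritten.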
fn get_store(&self, url: &Url) -> Result<Arc<dyn ObjectStore>>
Get a suitable store for the provided URL. For example:
- URL with scheme file:/// or no scheme will return the default LocalFS store
- URL with scheme s3://bucket/ will return the S3 store
- URL with scheme hdfs://hostname:port/ will return the hdfs store
If no ObjectStore is found for the URL, ad-hoc discovery may be executed depending on the URL and the ObjectStoreRegistry implementation. An ObjectStore may be lazily created and registered.
Implementors
impl ObjectStoreRegistry for DefaultObjectStoreRegistry
Stores are registered based on the scheme, host and port of the provided URL, with a LocalFileSystem::new automatically registered for file:// (if the target arch is not wasm32).
For example:
- file:///my_path will return the default LocalFS store
- s3://bucket/path will return a store registered with s3://bucket, if any
- hdfs://host:port/path will return a store registered with hdfs://host:port, if any
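The mapping from a URL to its registration key (scheme, host and port, with the path dropped) can be illustrated as follows. This is a sketch using plain string handling rather than the Url type; store_key is a hypothetical helper, and treating a string without a scheme as file:// is an assumption that follows the description of get_store above.

```rust
/// Hypothetical helper: reduce a URL string to `scheme://host[:port]`.
/// A string without `://` is treated as a local path (assumption: key `file://`).
fn store_key(url: &str) -> String {
    match url.split_once("://") {
        Some((scheme, rest)) => {
            // Everything before the first `/` is the authority:
            // the host plus an optional `:port`. The path is discarded.
            let authority = rest.split('/').next().unwrap_or("");
            format!("{scheme}://{authority}")
        }
        None => "file://".to_string(),
    }
}
```

Under these assumptions, file:///my_path maps to the key file://, s3://bucket/path to s3://bucket, and hdfs://host:9000/path to hdfs://host:9000, so every path within one bucket or host resolves to the same store.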