Commit 110b613

[HOPSWORKS-2206] HSFS profile to install with and without Hive dependencies (#200)
1 parent 8a59a4e commit 110b613

File tree

4 files changed: +22 −7 lines changed


docs/integrations/python.md

Lines changed: 5 additions & 1 deletion

@@ -30,9 +30,13 @@ Create a file called `featurestore.key` in your designated Python environment an
 To be able to access the Hopsworks Feature Store, the `HSFS` Python library needs to be installed in the environment from which you want to connect to the Feature Store. You can install the library through pip. We recommend using a Python environment manager such as *virtualenv* or *conda*.

 ```
-pip install hsfs~=[HOPSWORKS_VERSION]
+pip install hsfs[hive]~=[HOPSWORKS_VERSION]
 ```

+!!! attention "Hive Dependencies"
+
+    By default, `HSFS` assumes Spark/EMR is used as the execution engine and therefore Hive dependencies are not installed. Hence, in a local Python environment, if you are planning to use a regular Python kernel **without Spark/EMR**, make sure to install the **"hive"** extra dependencies (`hsfs[hive]`).
+
 !!! attention "Matching Hopsworks version"

     The **major version of `HSFS`** needs to match the **major version of Hopsworks**.
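The note above can be checked programmatically before connecting. A minimal sketch, assuming the `hive` extra pulls in `sqlalchemy` and `PyMySQL` (as listed in this commit's `setup.py`); the helper name `hive_extras_available` is hypothetical, not part of the HSFS API:

```python
# Sketch: detect whether the optional "hive" extras appear to be installed.
# Probes for sqlalchemy and PyMySQL, two of the packages that
# `pip install hsfs[hive]` pulls in per this commit's setup.py.
from importlib.util import find_spec


def hive_extras_available() -> bool:
    """Return True if the packages from the 'hive' extra are importable."""
    return all(find_spec(name) is not None for name in ("sqlalchemy", "pymysql"))


if __name__ == "__main__":
    print("hive extras available:", hive_extras_available())
```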

docs/integrations/sagemaker.md

Lines changed: 5 additions & 1 deletion

@@ -141,9 +141,13 @@ You have two options to make your API key accessible from SageMaker:
 To be able to access the Hopsworks Feature Store, the `HSFS` Python library needs to be installed. One way of achieving this is by opening a Python notebook in SageMaker and installing `HSFS` with a magic command and pip:

 ```
-!pip install hsfs~=[HOPSWORKS_VERSION]
+!pip install hsfs[hive]~=[HOPSWORKS_VERSION]
 ```

+!!! attention "Hive Dependencies"
+
+    By default, `HSFS` assumes Spark/EMR is used as the execution engine and therefore Hive dependencies are not installed. Hence, on AWS SageMaker, if you are planning to use a regular Python kernel **without Spark/EMR**, make sure to install the **"hive"** extra dependencies (`hsfs[hive]`).
+
 !!! attention "Matching Hopsworks version"

     The **major version of `HSFS`** needs to match the **major version of Hopsworks**.
149153

python/hsfs/engine/__init__.py

Lines changed: 10 additions & 1 deletion

@@ -14,7 +14,8 @@
 # limitations under the License.
 #

-from hsfs.engine import spark, hive
+from hsfs.engine import spark
+from hsfs.client import exceptions

 _engine = None

@@ -25,6 +26,14 @@ def init(engine_type, host=None, cert_folder=None, project=None, cert_key=None):
     if engine_type == "spark":
         _engine = spark.Engine()
     elif engine_type == "hive":
+        try:
+            from hsfs.engine import hive
+        except ImportError:
+            raise exceptions.FeatureStoreException(
+                "Trying to instantiate Hive as engine, but 'hive' extras are "
+                "missing in HSFS installation. Install with `pip install "
+                "hsfs[hive]`."
+            )
         _engine = hive.Engine(host, cert_folder, project, cert_key)
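The change above is the standard optional-dependency pattern: move the import of the optional backend into the code path that needs it, and convert the `ImportError` into an actionable error naming the extra to install. A minimal self-contained sketch of the same pattern; the module and exception names below are illustrative, not HSFS's:

```python
# Sketch of the lazy-import guard used in hsfs/engine/__init__.py:
# only import the optional backend when it is actually requested,
# and fail with an actionable message otherwise.
# Names below are illustrative, not HSFS's API.
class MissingExtraError(Exception):
    """Raised when an optional extra's packages are not installed."""


def init_engine(engine_type: str):
    if engine_type == "hive":
        try:
            # Stands in for `from hsfs.engine import hive`; the module
            # deliberately does not exist, so the guard fires.
            import nonexistent_hive_backend
        except ImportError:
            raise MissingExtraError(
                "Hive engine requested but the 'hive' extras are not "
                "installed. Install with `pip install hsfs[hive]`."
            )
        return nonexistent_hive_backend
    # Default path needs no optional imports.
    return "spark-engine"


if __name__ == "__main__":
    print(init_engine("spark"))
    try:
        init_engine("hive")
    except MissingExtraError as err:
        print("caught:", err)
```

The key design point is that the base install stays importable: `import hsfs` never touches the Hive stack, so users on Spark/EMR pay no dependency cost.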

python/setup.py

Lines changed: 2 additions & 4 deletions

@@ -22,10 +22,7 @@ def read(fname):
         "boto3",
         "pandas",
         "numpy",
-        "pyhopshive[thrift]",
-        "PyMySQL",
         "pyjks",
-        "sqlalchemy",
         "mock",
     ],
     extras_require={
@@ -37,7 +34,8 @@ def read(fname):
             "mkdocs",
             "mkdocs-material",
             "keras-autodoc",
-            "markdown-include"]
+            "markdown-include"],
+        "hive": ["pyhopshive[thrift]", "sqlalchemy", "PyMySQL"],
     },
     author="Logical Clocks AB",
     author_email="[email protected]",
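The `extras_require` mechanism is what makes the Hive stack opt-in: a plain `pip install hsfs` resolves only `install_requires`, while `pip install hsfs[hive]` additionally resolves the `hive` list. A minimal sketch of that resolution logic, using the dependency lists from this commit's `setup.py` (the helper `requirements_for` is hypothetical, for illustration only):

```python
# Sketch: how extras_require separates a base install from optional extras.
# Dependency lists mirror the setup.py change above; the helper is
# hypothetical and only illustrates what pip resolves in each case.
install_requires = ["boto3", "pandas", "numpy", "pyjks", "mock"]
extras_require = {
    "hive": ["pyhopshive[thrift]", "sqlalchemy", "PyMySQL"],
}


def requirements_for(extras=()):
    """Return the full dependency list for a given set of extras."""
    deps = list(install_requires)
    for extra in extras:
        deps.extend(extras_require.get(extra, []))
    return deps


if __name__ == "__main__":
    print(requirements_for())          # plain `pip install hsfs`
    print(requirements_for(["hive"]))  # `pip install hsfs[hive]`
```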
