Closed
Description
I've walked through the getting started guide. Here are my suggestions for a few improvements to the overall user experience (fairly low-hanging fruit, IMO):
- Use minikube (or another local cluster). The getting started guide uses a custom cluster with GPUs, which in most cases is impractical for newcomers to set up just to try out katib.
- Do not use ingress. As with the point above, we should lower the barrier to trying out katib; setting up ingress, along with hostname resolution, costs time. Using a NodePort service should suffice.
- Use MNIST as the example to reduce waiting time (the existing guide creates a CIFAR-10 training job). Faster turnaround makes it easier to observe the outcome and experiment with katib features.
- Use versioned images and source code. Since the project is in its early stage, breaking changes are almost always possible. Errors in the getting started guide can scare off potential users and contributors, so let's pin the guide to a particular version (and update it when new versions become available).
- Finish the getting started guide by providing instructions on modeldb visualization. After creating the Study using `random-cpu.yml`, katib creates two jobs. However, after training, they both disappear and I can't see the result in either Kubernetes or modeldb. It may be that I'm not familiar with the system, or that the guide is incomplete. In either case, I think we should provide guidance on what to do after creating a Study, to avoid confusion for people new to the system like me :)
/cc @gaocegege @YujiOshima
/area documentation