Visual & Interactive NLP pipeline editor
Spark NLP pipelines are highly configurable, enabling data scientists to quickly experiment with different algorithms, models, and configurations. Because doing so requires an in-depth understanding of the library's design and APIs, some of that flexibility and power is under-utilized. The goal of this project is to build a visual, interactive, web-based NLP pipeline editor that lets users build, edit, test, and publish Spark NLP pipelines by drag and drop. The user interface should come with the 30+ pre-trained pipelines that the library already ships with, making it easy to understand and test their functionality, and should also let users create their own pipelines or clone and edit existing ones to fit their specific needs.
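As an illustration only, here is a minimal sketch of how such an editor might represent a pipeline internally and check that its stages are wired together before running it. Nothing here is part of Spark NLP or the Spark NLP Server: the names Stage, validate_wiring, and to_json are hypothetical, and the sketch assumes only that each stage reads named input columns and writes one output column, as Spark NLP annotators conventionally do.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Sequence


@dataclass
class Stage:
    """One box in the (hypothetical) visual editor."""
    name: str               # annotator name shown in the UI, e.g. "Tokenizer"
    input_cols: List[str]   # columns this stage reads
    output_col: str         # column this stage writes


def validate_wiring(stages: Sequence[Stage],
                    initial_cols: Sequence[str] = ("text",)) -> List[str]:
    """Return one error message per stage whose inputs are not yet available.

    An empty list means every stage only reads columns that either exist in
    the input data or were produced by an earlier stage.
    """
    available = set(initial_cols)
    errors = []
    for position, stage in enumerate(stages):
        missing = [col for col in stage.input_cols if col not in available]
        if missing:
            errors.append(
                f"stage {position} ({stage.name}): missing input column(s) {missing}"
            )
        available.add(stage.output_col)
    return errors


def to_json(stages: Sequence[Stage]) -> str:
    """Serialize the pipeline so the editor could save, clone, or share it."""
    return json.dumps([asdict(stage) for stage in stages], indent=2)


# A small pipeline as a user might assemble it by drag and drop.
pipeline = [
    Stage("DocumentAssembler", ["text"], "document"),
    Stage("SentenceDetector", ["document"], "sentence"),
    Stage("Tokenizer", ["sentence"], "token"),
]

# The same stages in the wrong order: the tokenizer runs before sentences exist.
misordered = [pipeline[0], pipeline[2], pipeline[1]]

print(validate_wiring(pipeline))
print(validate_wiring(misordered))
```

A check of this kind would let the editor flag a misplaced stage as soon as it is dropped onto the canvas, rather than when the pipeline is executed.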
With one or two engineers working on it, the timeline would be either 6 or 12 months, depending on what gets funded.
Beyond serving as an educational tool, the editor will be a pragmatic testing and debugging tool when training NLP pipelines and models for a new application. The user interface will be an extension of the (currently private) Spark NLP Server, which powers the Spark NLP demo UI and also enables publishing Spark NLP pipelines as REST APIs. Delivering this project will therefore provide the open source community with a Spark NLP Server, the ability to easily publish NLP models and pipelines as APIs, the ability to visually edit and test them, and the ability to publish and explain them easily to others.