As you can see in the history, many new features were donated.
If we have enough time or receive donations, we will implement more features:
- Machine learning for automatic classification
- Machine learning for topic modelling (learn and explore structure/connected concepts of unknown contents)
- Easy-to-use interfaces for indexing the main text first, so that it is searchable sooner, before running time-consuming plugins such as OCR
- Web user interface for configuration
- Integration of data visualizations for quantitative data (maybe integration of Apache Zeppelin, HUE search or Kibana)
- A web crawler for whole websites, since at the moment you can only index single web pages unless you use the additional framework ManifoldCF. Since Tika extracts links automatically, this may become our own crawler based on reading the links facet of already indexed pages.
- Roadmap of Open Semantic ETL for crawling and importing data
- Roadmap of Open Semantic Search Appliance (Search server VM)
- Visual graph search interface
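
The crawler idea above (following the links that were already extracted from indexed pages) could look roughly like the following sketch. It is not an existing implementation: `crawl_site`, `get_links` and `max_pages` are hypothetical names, and `get_links` stands in for whatever lookup would read the links facet of an indexed page.

```python
from collections import deque
from typing import Callable, Iterable, List
from urllib.parse import urldefrag, urljoin, urlparse


def crawl_site(start_url: str,
               get_links: Callable[[str], Iterable[str]],
               max_pages: int = 100) -> List[str]:
    """Breadth-first walk over the pages of one website.

    get_links is a hypothetical callback returning the links found
    on a page (for example, read from the links facet of the index).
    """
    start_host = urlparse(start_url).netloc
    seen = {start_url}          # URLs already queued, to avoid loops
    queue = deque([start_url])  # URLs waiting to be processed
    visited: List[str] = []     # URLs in the order they were processed

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)  # a real crawler would index the page here
        for link in get_links(url):
            # Resolve relative links and drop "#fragment" parts
            absolute, _fragment = urldefrag(urljoin(url, link))
            # Stay on the website we started from
            if urlparse(absolute).netloc != start_host:
                continue
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

Restricting the walk to the host of the start URL and capping it with `max_pages` are assumptions made to keep the sketch safe to run; a real crawler would also need politeness rules such as robots.txt handling and rate limiting.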