Research tools for easier searching, analytics, discovery & text mining of large, heterogeneous document sets with free software on your own computer or server

Search engine
(Fulltext search)

Easy full-text search across many data sources and formats: just enter a search query (which can include powerful search operators) and navigate through the results.
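As a rough illustration of what full-text search does under the hood, the following minimal sketch builds an inverted index and answers queries where all terms must match. It is not the search engine used by the software; the function names and the sample documents are assumptions made up for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every query term (implicit AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents (file name -> extracted text)
docs = {
    "report.pdf": "Annual budget report for the city council",
    "mail.eml": "Meeting about the budget next week",
    "notes.txt": "Council meeting notes",
}
index = build_index(docs)
print(sorted(search(index, "budget council")))
```

A production search engine adds ranking, phrase queries and further operators on top of this basic lookup structure.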

Thesaurus & Grammar
(Semantic search)

The semantic search engine also finds synonyms, hyponyms and aliases. Using grammar heuristics such as stemming, it finds other word forms as well.
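The idea can be sketched as query expansion: a term is extended with its thesaurus entries, and both query and text are reduced to word stems before comparison. The thesaurus content and the very naive suffix-stripping stemmer below are assumptions for illustration only; real stemmers apply far more careful rules.

```python
import re

# Hypothetical thesaurus: term -> synonyms, hyponyms or aliases
THESAURUS = {
    "car": {"automobile", "vehicle"},
}


def stem(word):
    """Naive stemming: strip a few common English suffixes."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand(term):
    """Return the stems of the term and of all its thesaurus entries."""
    term = term.lower()
    variants = {term}
    variants |= THESAURUS.get(term, set())
    variants |= THESAURUS.get(stem(term), set())
    return {stem(v) for v in variants}


def matches(term, text):
    """True if any word of the text shares a stem with the expanded term."""
    wanted = expand(term)
    return any(stem(w) in wanted for w in re.findall(r"\w+", text.lower()))


print(matches("cars", "Two automobiles parked outside"))
print(matches("car", "A bicycle ride"))
```

With these assumptions, a search for "cars" also finds a text that only mentions "automobiles", while unrelated text is not matched.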

Interactive filters
(Faceted search)

Easy navigation through many results with interactive filters (faceted search), which aggregate an overview of (meta)data such as authors, dates, tags or document types.
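A minimal sketch of how such facets can be computed: count the values of selected metadata fields over the current result list, then narrow the list when the user clicks a value. The field names and sample records are hypothetical.

```python
from collections import Counter


def as_list(value):
    """Treat single values and lists of values uniformly."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def facet_counts(results, fields):
    """Count how often each value occurs per metadata field."""
    counts = {field: Counter() for field in fields}
    for doc in results:
        for field in fields:
            counts[field].update(as_list(doc.get(field)))
    return counts


def apply_filter(results, field, value):
    """Keep only the documents whose field contains the selected value."""
    return [doc for doc in results if value in as_list(doc.get(field))]


# Hypothetical search results with metadata
results = [
    {"id": 1, "author": "Smith", "type": "pdf", "tags": ["budget", "2020"]},
    {"id": 2, "author": "Smith", "type": "email", "tags": ["budget"]},
    {"id": 3, "author": "Jones", "type": "pdf"},
]
facets = facet_counts(results, ["author", "type", "tags"])
for field, counter in facets.items():
    print(field, counter.most_common())
```

Calling apply_filter(results, "type", "pdf") then corresponds to clicking the "pdf" entry in the document type filter.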

Exploration, browsing & preview
(Exploratory search)

Explore your data or search results with an overview of search results aggregated by different facets and named entities (e.g. paths, tags, persons, locations or organisations), while browsing comfortably through search results or document sets.
View previews (e.g. PDF, extracted text, table rows or images).
Analyze or review document sets by preview, extracted text or word lists for text mining.

Collaborative annotation and tagging

Tag your documents with keywords, categories, named entities or text notes that are not included in the original content, so that they are easier to find in other research or search contexts. You can also rate or assess documents (e.g. for validation).

Data visualization (Dataviz)

Visualize data such as document dates as trend charts, text analysis as word clouds, or results with geodata as interactive maps.
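The aggregation behind a trend chart can be sketched as counting documents per year. The function names and dates below are made up for illustration, and the text-based chart is only a stand-in for a real charting component.

```python
from collections import Counter
from datetime import date


def documents_per_year(doc_dates):
    """Aggregate document dates into a sorted (year, count) series."""
    counts = Counter(d.year for d in doc_dates)
    return sorted(counts.items())


def text_chart(series):
    """Render the series as a simple text bar chart, one line per year."""
    return "\n".join(f"{year} {'#' * n} ({n})" for year, n in series)


# Hypothetical document dates
doc_dates = [date(2019, 5, 1), date(2020, 1, 15), date(2020, 11, 3)]
series = documents_per_year(doc_dates)
print(text_chart(series))
```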

Alerts & Watchlists (Newsfeeds)

Stay informed via watchlists, activity streams or news alerts: subscribe to searches as RSS newsfeeds and get notified when there are new or changed results.
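Conceptually, an alert compares the results of a saved search between two runs. The sketch below assumes each run is stored as a mapping from document id to a modification stamp; that data model and the function name are assumptions for illustration, not the software's actual implementation.

```python
def detect_changes(previous, current):
    """Compare two runs of a saved search.

    Both arguments map a document id to a modification stamp
    (for example a timestamp or a content hash).
    """
    new = sorted(doc for doc in current if doc not in previous)
    changed = sorted(
        doc for doc in current
        if doc in previous and current[doc] != previous[doc]
    )
    return {"new": new, "changed": changed}


# Hypothetical results of the same saved search on two days
yesterday = {"a.pdf": "2024-01-01", "b.docx": "2024-01-02"}
today = {"a.pdf": "2024-01-05", "b.docx": "2024-01-02", "c.txt": "2024-01-05"}

alert = detect_changes(yesterday, today)
if alert["new"] or alert["changed"]:
    print("Notify subscriber:", alert)
```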

Supports different file formats

Whether structured data such as databases, tables or spreadsheets, or unstructured data such as text documents, e-mails or even scanned legacy documents: search in many different formats and content types (text files, Word and other Microsoft Office documents or OpenOffice documents, Excel or LibreOffice Calc tables, PDF, e-mail, CSV, doc, images, photos, pictures, JPG, TIFF, videos and many more file formats).

Supports multiple data sources

Find all your data in one place: search in many different data sources such as files and directories, file servers, file shares, databases, websites, content management systems, RSS feeds and many more.

The connectors and importers of the Extract, Transform, Load (ETL) framework for data integration connect all data sources, and the data enrichment framework enhances the data with the results of various analytics tools.
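The interplay of connectors, enrichment and indexing can be sketched as a small pipeline: a connector yields raw documents, each enricher adds analysis results as extra fields, and the loader stores the result in an index. All names here (file_connector, add_word_count, add_email_addresses, run_pipeline) are hypothetical and do not reflect the framework's real plugin API.

```python
import re


def file_connector(files):
    """Extract: yield one raw document per (path, text) pair."""
    for path, text in files.items():
        yield {"id": path, "content": text}


def add_word_count(doc):
    """Enrich: add a simple statistic about the content."""
    doc["word_count"] = len(doc["content"].split())
    return doc


def add_email_addresses(doc):
    """Enrich: add e-mail addresses found in the content."""
    doc["emails"] = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", doc["content"])
    return doc


def run_pipeline(source, enrichers, index):
    """Run every document through all enrichers, then load it into the index."""
    for doc in source:
        for enrich in enrichers:
            doc = enrich(doc)
        index[doc["id"]] = doc
    return index


# Hypothetical input: file path -> already extracted text
files = {"memo.txt": "Contact alice@example.org about the contract"}
index = run_pipeline(file_connector(files), [add_word_count, add_email_addresses], {})
print(index["memo.txt"])
```

Because every enricher takes a document and returns a document, further analytics steps can be added to the list without touching the connectors.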

Automatic text recognition

Optical character recognition (OCR), or automatic text recognition, makes images and text content stored in graphical formats searchable, such as scanned legacy documents, screenshots or photographed documents, whether they are image files or embedded in PDF files.

Open-Source enterprise search technology based on interoperable open standards

Mobile (Responsive Design)

Open Semantic Search is not limited to desktops (Linux, Windows or Mac) and their web browsers. Thanks to its responsive design and open standards like HTML5, you can also search from tablets, smartphones and other mobile devices.

Metadata management (RDF)

Structure your research, investigation, navigation, metadata forms or notes in a Semantic Wiki or another CMS with taxonomies and custom fields for tagging documents, annotations, linking relationships, mapping and structured notes. This way you integrate powerful and flexible metadata management tools based on interoperable open standards (Resource Description Framework).

Filesystem monitoring

With file monitoring, new or changed files are indexed within seconds, without frequent recrawls (which are often impractical when there are many files).
Colleagues can find new data immediately, without (often forgotten) uploads to a data or document management system (DMS) and without filling out a data registration form for each new or changed document or dataset.
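To illustrate what the indexer needs to learn from file monitoring, the sketch below compares two snapshots of a directory tree and reports added, modified and removed files. Note that this is a simple polling approach chosen to keep the example self-contained; monitoring that reacts within seconds typically relies on operating system notifications (such as inotify on Linux) instead of rescanning. The function names are hypothetical.

```python
import os


def snapshot(root):
    """Record modification time and size for every file below root."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                stat = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            state[path] = (stat.st_mtime_ns, stat.st_size)
    return state


def diff(old, new):
    """Return (added, modified, removed) paths between two snapshots."""
    added = sorted(p for p in new if p not in old)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    removed = sorted(p for p in old if p not in new)
    return added, modified, removed


def files_to_index(old, new):
    """Only added and modified files need to be (re)indexed."""
    added, modified, _removed = diff(old, new)
    return added + modified
```

A caller would take a snapshot, wait, take another one and pass the result of files_to_index to the indexer, so that unchanged files are never processed again.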