Corpus Sense

Making sense of text the easy way

What is Corpus Sense?

Corpus Sense is a corpus query tool that allows you to easily analyze a collection of texts. It is specially designed for content and discourse analysis, although it also features functionalities commonly found in other corpus tools. Corpus Sense combines quantitative, qualitative and AI features to jumpstart your text analysis.

What’s special about Corpus Sense?

Corpus Sense has a number of smart features aimed at making sense of your corpus easily:

  • The Snapshot gives you a quick overview of the contents of your corpus. Keywords and named entities are automatically identified using Natural Language Processing techniques. If you work with social media texts, you also get frequencies and distributions of emojis and hashtags.
  • Corpus Sense has Semantic Search built in. Unlike traditional lexical search (which is also available), semantic search retrieves snippets of text that are related in meaning to your search expression, even if the precise words you use are not present in the text.
  • The Topics feature allows you to dive deeper into the contents of your corpus by listing its main themes.
  • The Insights feature uses Large Language Models to query your corpus about very specific aspects, such as style, readability or emotions. Insights can be generated in many languages (regardless of the corpus language), so even if your texts are in a language you don’t understand, you will be able to make sense of them.
  • Corpus Sense has been designed with the average user in mind. Its interface makes user interaction intuitive and fluid. At the same time, it offers a number of features for power users, such as regular expressions and morpho-syntactic pattern search.
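
To give a rough idea of what a regular-expression search can express, here is a minimal sketch in plain Python. It is not Corpus Sense's own query syntax or implementation, which may differ. The function name `kwic` and the sample sentence are made up for illustration. The pattern matches any word starting with "analys" or "analyz" and shows each hit with some surrounding context.

```python
import re

def kwic(text, pattern, width=30):
    """Return (left context, match, right context) for every regex match.

    Hypothetical helper for illustration only; not part of Corpus Sense.
    """
    results = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        results.append((left, m.group(0), right))
    return results

# Made-up sample text
sample = "She analyzed the data. They analyse texts daily. Analysis takes time."

# One pattern covers British and American spellings and all inflections
hits = kwic(sample, r"\banaly[sz]\w*")
for left, match, right in hits:
    print(f"{left:>30} [{match}] {right}")
```

A plain lexical search for "analyze" would miss "analyse" and "Analysis"; a single pattern retrieves all three forms.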

What languages are supported?

Corpus Sense supports the following languages:

  • Catalan
  • Chinese
  • Croatian
  • Danish
  • Dutch
  • English
  • Finnish
  • French
  • German
  • Greek
  • Italian
  • Japanese
  • Korean
  • Lithuanian
  • Macedonian
  • Polish
  • Portuguese
  • Romanian
  • Russian
  • Slovenian
  • Spanish
  • Swedish
  • Ukrainian

When you upload your corpus, Corpus Sense will automatically identify the language and process the texts accordingly. Multilingual corpora are not supported.

Please be aware that Corpus Sense has not been thoroughly tested with all of these languages, so your mileage may vary!

How large can my corpus be?

The maximum corpus size is 2.5M words, both for users with an academic license and for general users.
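
If you want a rough estimate of your corpus size before uploading, a sketch like the following may help. It assumes a zip archive of UTF-8 `.txt` files and counts whitespace-separated tokens. Corpus Sense may count words differently, so treat the result as an approximation. The names `count_words`, `count_words_in_zip` and `WORD_LIMIT` are hypothetical.

```python
import io
import zipfile

WORD_LIMIT = 2_500_000  # the 2.5M-word limit stated above

def count_words(text):
    """Count whitespace-separated tokens (an assumption, not the tool's tokenizer)."""
    return len(text.split())

def count_words_in_zip(zip_bytes):
    """Sum the word counts of all .txt files in a zip archive given as bytes."""
    total = 0
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".txt"):
                text = zf.read(name).decode("utf-8", errors="replace")
                total += count_words(text)
    return total

# Demo with a small in-memory archive instead of a real corpus
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "one two three")
    zf.writestr("b.txt", "four five")

total = count_words_in_zip(buf.getvalue())
print(f"{total} words; within limit: {total <= WORD_LIMIT}")
```

For a real archive, read the file's bytes (for example with `open(path, "rb").read()`) and pass them to `count_words_in_zip`.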

How do I use it?

To get started, simply upload a text file (in txt/pdf/docx/xml/html format) or a zip file containing text files. Corpus Sense is designed to be intuitive, and most functionalities will be self-explanatory. If you are new to corpus analysis, some of the terms may be a little intimidating. Do read the blue boxes: they contain the basic information that explains many of the options and how to interpret what you are seeing. The Help Page contains more detailed explanations of how to do things.

How can I get an account?

Currently, only pre-authorized users can register. If you think your research could benefit from our software, tell us about how you will use it.

Contact us