Harvest was designed to alleviate the opaqueness of complex data models and large data sets. Each comes with its own set of challenges.
Complex Data Models
Complex sounds more attractive, but even simple data models can benefit from a really good data access layer API.
Perceivably flat data access layer
It is simpler to choose or search from a list of items than to attempt to traverse a relational data model or document store. Often you simply don't know what you're looking for, and even when you do, you don't (and shouldn't need to) know where to look.
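As a minimal sketch of this idea (the model and names here are illustrative, not Harvest's actual API), a nested data model can be flattened into a single list of addressable concepts, so users pick from a flat list instead of traversing tables and joins:

```python
# Hypothetical nested data model: tables containing groups of fields.
MODEL = {
    "patient": {
        "demographics": ["gender", "birth_date"],
        "visits": ["admit_date", "diagnosis"],
    },
}

def flatten(model):
    """Yield (path, field) pairs for every field in the nested model."""
    for table, groups in model.items():
        for group, fields in groups.items():
            for field in fields:
                yield f"{table}.{group}.{field}", field

# The user sees one flat, sorted list rather than the underlying structure.
concepts = sorted(path for path, _ in flatten(MODEL))
```

The traversal happens once, behind the scenes; the interface only ever exposes the flat list.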
Descriptive metadata (and domain specificity)
Merely having a data model and its constraints is not enough. The first barrier for users is figuring out what aspects of the data they can search for. Humanize the data model by adding some descriptive love.
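One way to sketch this (the field names and metadata keys below are assumptions for illustration) is to pair raw schema fields with human-readable labels and descriptions, so users search by meaning rather than by column name:

```python
# Hypothetical descriptive metadata keyed by raw field name.
METADATA = {
    "dob": {
        "label": "Date of Birth",
        "description": "The patient's date of birth as recorded at intake.",
    },
    "hgb": {
        "label": "Hemoglobin",
        "description": "Hemoglobin concentration in g/dL.",
    },
}

def describe(field):
    """Return a (label, description) pair, falling back to the raw name."""
    meta = METADATA.get(field, {})
    return meta.get("label", field), meta.get("description", "")
```

A cryptic column like `hgb` now presents itself as "Hemoglobin" with units, which is what makes the data model searchable by domain terms in the first place.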
Free-text data model search
An intuitive and powerful search depends on the first two points mentioned above. The (highly descriptive) metadata as well as the structural metadata can be indexed and searched against directly. A match in this case would be a particular data point, whether it is used for query or display purposes.
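A toy version of such a search (a sketch under assumed field metadata, not a real indexing engine) matches a query when every token appears somewhere in a field's name, label, or description:

```python
# Hypothetical field metadata to search against.
FIELDS = [
    {"name": "dob", "label": "Date of Birth", "description": "Recorded at intake"},
    {"name": "gender", "label": "Gender", "description": "Administrative gender"},
]

def search(query, fields=FIELDS):
    """Return names of fields whose metadata contains every query token."""
    tokens = query.lower().split()
    hits = []
    for f in fields:
        haystack = " ".join(f.values()).lower()
        if all(t in haystack for t in tokens):
            hits.append(f["name"])
    return hits
```

Searching for "birth" surfaces `dob` through its label, even though the raw field name shares no characters with the query. A production system would use a proper full-text index, but the matching principle is the same.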
Expand the search by including the data itself
To make the search more robust, discrete data values can be indexed and associated with each data point as well. As an example, typing male may surface gender among the available query or view options. This enables users to find what they are looking for by directly searching for a known data value. One caveat concerns permissions: if certain end users are not allowed to view certain data, a match occurring from typing male would reveal that male data values exist.
Humans aren't constrained, database schemas are (and for good reason)
Databases have data types to allow for fast and effective search on data. For example, you cannot query the string hello world using a numerical operator (at least not in a way that makes sense). For this reason, data are split up into multiple fields suiting the needs of the data. For example, when you view a cooking recipe, you would expect to read an ingredient such as 2 teaspoons of salt. What would happen if you only knew the ingredient name, i.e. salt? You wouldn't know how much of the ingredient you need. Likewise, if you only saw teaspoons without 2, you would not know the quantity of salt to add.
The power of the database comes from storing and indexing discrete values, which enables fast search and sorting capabilities. Humans, however, need to be able to view these discrete values in a way that means something to them.
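The recipe example above can be sketched in a few lines (the field names are illustrative): quantity, unit, and ingredient name are stored as discrete typed fields, which stay queryable and sortable, and a display function recombines them for humans:

```python
# Ingredients stored as discrete, typed fields.
ingredients = [
    {"quantity": 2, "unit": "teaspoons", "name": "salt"},
    {"quantity": 1, "unit": "cup", "name": "flour"},
]

def display(ing):
    """Compose the discrete fields into a human-readable string."""
    return f"{ing['quantity']} {ing['unit']} of {ing['name']}"

# Discrete fields allow numeric queries...
heavy = [i for i in ingredients if i["quantity"] >= 2]

# ...while display recombines them for humans: "2 teaspoons of salt".
display(ingredients[0])
```

Storing the pre-composed string "2 teaspoons of salt" instead would make the numeric filter above impossible; keeping the fields discrete and composing only at display time gives you both.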
Large Data Sets
Large data sets should only benefit users in the sense that they have more data to explore. Similar to the comment on complexity above, small data sets will work just fine here as well.
Usability must have an O(1) relationship to data size
The scale of the data must not tax its usability. Interfaces must be able to scale with the data transparently and not burden the user with too many options at once.
Most data should be viewed at an aggregate level. If you choose to view gender data, the appropriate view is a series of aggregate counts for each value, such as male, female, and unknown. This immediately gives the user a sense of the data. For example, if they are interested in the male population, but the data set only has a few such records, they can make the decision to continue or not.
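The aggregate view described above amounts to a count per distinct value rather than a row listing; a minimal sketch with made-up data:

```python
from collections import Counter

# Hypothetical column of gender values pulled from the data set.
rows = ["male", "female", "female", "unknown", "female"]

# One count per distinct value: the user sees the shape of the data,
# not the individual rows, regardless of how many rows there are.
counts = Counter(rows)
counts.most_common()  # value/count pairs, most frequent first
```

The size of this summary is bounded by the number of distinct values, not the number of rows, which is exactly the O(1)-with-respect-to-data-size usability property described above.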
These statistics can be thought of as another set of metadata, this time computed from the data itself.
This goes hand-in-hand with displaying aggregate statistics. This is particularly important for continuous data where simply listing a bunch of data value counts would be overwhelming. Again, the goal is for a user to get a sense of the data before having to query or view it.
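For continuous data, where per-value counts would be overwhelming, the computed metadata can instead be a handful of summary statistics; a sketch with made-up values:

```python
import statistics

# Hypothetical continuous column, e.g. patient ages.
ages = [34, 41, 29, 58, 47, 33]

# A compact statistical summary stands in for thousands of distinct values.
summary = {
    "count": len(ages),
    "min": min(ages),
    "max": max(ages),
    "mean": round(statistics.mean(ages), 1),
    "median": statistics.median(ages),
}
```

Five numbers give the user a sense of the distribution before any query is run, which is the point of treating computed statistics as metadata.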