
Clustify 4.0 adds Real-Time Predictive Coding


Clustify updates the predicted relevance scores for the entire document population each time a document is categorized, showing the impact on the progress pie and the precision-recall curve instantly.



HAVERTOWN, Pa. - (PRLog) - January 21, 2014 - Hot Neuron LLC announces version 4.0 of its Clustify™ software, the first technology-assisted review tool to offer real-time predictive coding.

Predictive coding is a machine learning technique in which software learns to predict appropriate issue codes, tags, or categories for documents from examples provided by a human reviewer, significantly reducing the time and expense of the document review phase of e-discovery. The real-time predictive coding capability added in Clustify 4.0 updates the predicted relevance scores for the entire document population each time a document is reviewed, showing the impact on the progress pie and the precision-recall curve instantly. If the tags applied to a document seem inconsistent with the tags the reviewer applied to other documents, the software warns the user immediately, while the user's reasoning about the document is still fresh, helping to avoid errors.
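The announcement does not describe how Clustify computes or updates its scores, so the following is only a minimal sketch of the general idea, assuming an online logistic regression over numeric document feature vectors. The class `OnlineRelevanceModel`, the function `review`, and the `warn_threshold` parameter are hypothetical names chosen for illustration, not part of the product.

```python
import math


def sigmoid(z):
    """Map a raw linear score to a relevance probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


class OnlineRelevanceModel:
    """Hypothetical online logistic regression (an assumption, not Clustify's method)."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def score(self, features):
        """Predicted relevance of one document."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return sigmoid(z)

    def update(self, features, relevant):
        """Take one gradient step using a single reviewed document."""
        error = (1.0 if relevant else 0.0) - self.score(features)
        self.bias += self.learning_rate * error
        for i, x in enumerate(features):
            self.weights[i] += self.learning_rate * error * x

    def score_all(self, population):
        """Re-score every document in the population."""
        return [self.score(features) for features in population]


def review(model, population, doc_index, relevant, warn_threshold=0.9):
    """Record one reviewer decision, then refresh all scores.

    Returns (scores, inconsistent). The flag is True when the model was
    already confident in the opposite tag before seeing this decision.
    """
    prior = model.score(population[doc_index])
    inconsistent = (prior >= warn_threshold and not relevant) or (
        prior <= 1.0 - warn_threshold and relevant
    )
    model.update(population[doc_index], relevant)
    return model.score_all(population), inconsistent
```

Each call to `review` performs one cheap model update followed by a full pass over the population. The inconsistency check runs before the update, so the warning reflects what the model had learned from the reviewer's earlier tags.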

Clustify 4.0 offers powerful sampling capabilities. It allows both random and judgmental sampling when choosing documents to train the algorithm. It offers several different active learning algorithms that suggest training documents for review that will help the system to learn efficiently. It also allows the user to specify a diversity level when choosing training documents to ensure that no training documents are too similar to documents that have already been reviewed.
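The release does not name the active learning algorithms or define the diversity level, so the sketch below substitutes common stand-ins: uncertainty sampling (pick the document whose score is closest to 0.5) combined with a cosine-similarity cap against already-reviewed documents. The function names and the `max_similarity` parameter are assumptions made for illustration.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def random_sample(unreviewed_ids, k, seed=None):
    """Random sampling: draw k unreviewed document ids uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(sorted(unreviewed_ids), k)


def pick_training_document(scores, vectors, reviewed_ids, max_similarity=0.8):
    """Uncertainty sampling with a diversity constraint (hypothetical helper).

    Candidates are tried from most to least uncertain. A candidate is
    rejected if it is more similar than max_similarity to any reviewed
    document. Returns a document index, or None if nothing qualifies.
    """
    candidates = [i for i in range(len(scores)) if i not in reviewed_ids]
    candidates.sort(key=lambda i: abs(scores[i] - 0.5))
    for i in candidates:
        if all(
            cosine_similarity(vectors[i], vectors[j]) <= max_similarity
            for j in reviewed_ids
        ):
            return i
    return None
```

Lowering `max_similarity` in this sketch plays the role of raising the diversity level: near-duplicates of reviewed documents are skipped even when the model is uncertain about them. Judgmental sampling needs no code here, since the reviewer simply chooses the documents.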

In a test on 1.3 million documents totaling 3.3 gigabytes of text on a modest desktop computer, Clustify took an average of a tenth of a second to update the relevance scores for the population when a training document was reviewed. Speed will depend on the details of the document set.
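To put those figures in perspective, the arithmetic can be worked through directly. This is a back-of-the-envelope calculation from the numbers quoted above, assuming a gigabyte means 10^9 bytes; it comes to roughly 13 million documents scored per second and about 2.5 KB of text per document.

```python
# Figures quoted in the announcement.
documents = 1_300_000
seconds_per_update = 0.1

# Assumption: "3.3 gigabytes" is read as decimal gigabytes (10**9 bytes).
text_bytes = 3.3e9

docs_scored_per_second = documents / seconds_per_update
average_bytes_per_document = text_bytes / documents

print(f"{docs_scored_per_second:,.0f} documents scored per second")
print(f"{average_bytes_per_document:,.0f} bytes of text per document on average")
```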
