How SCIO works

Automate Information Extraction

sc.io uses sophisticated machine learning to automatically turn “dark” data into machine-readable datasets. Dark data is unstructured data buried in text, tables or figures that existing software and analytics platforms cannot process. It is found in documents such as legal and commercial contracts, regulatory filings, web pages, news articles and annual reports.

sc.io automates complex screening and analysis processes by extracting relevant data points and producing a predefined structured data output, replacing tasks that would otherwise require substantial manual effort.


What makes sc.io special


Quality at scale


sc.io achieves better-than-human extraction quality while scaling to very large numbers of documents.

Extraction process automation via machine learning


sc.io leverages the latest advances in ML models to extract complex document-level information, whether it is expressed as free text, in tables or in visually distinctive ways.

Easy to set up


sc.io is designed as an end-to-end, integrated workflow, from data collection to the production of structured results, accessible via simple REST API calls.


How does sc.io work?


1. Define the specific data points (name, date, entity, tables, etc.) that you need to retrieve.

2. Train an ML model on a subset of the documents (text, PDFs, articles, web pages, etc.).

3. Once the ML model is ready and deployed, send the documents to our hosted infrastructure or process them locally.

4. Retrieve a JSON/XML result file containing the extracted data points in structured form via an API call.
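To make the four steps above more concrete, here is a minimal Python sketch of the client side of such a workflow: defining data points, bundling them with a document for submission, and reading a JSON result. This is an illustration only, not the actual sc.io API. The payload layout, the field names ("document", "data_points", "extracted") and the helper names are all assumptions, and the sketch makes no network calls.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DataPoint:
    """A data point to extract (step 1). Field names are hypothetical."""
    name: str  # label for the value, e.g. "effective_date"
    kind: str  # illustrative type tag: "name", "date", "entity", "table"


def build_extraction_request(document_text, data_points):
    """Bundle a document with the data point definitions (steps 1 and 3).

    The payload layout is an assumption, not the real sc.io schema.
    """
    return {
        "document": document_text,
        "data_points": [asdict(dp) for dp in data_points],
    }


def parse_result(raw_json, data_points):
    """Read a JSON result and keep only the requested data points (step 4).

    Data points missing from the result are returned as None.
    The "extracted" key is a hypothetical field name.
    """
    result = json.loads(raw_json)
    extracted = result.get("extracted", {})
    return {dp.name: extracted.get(dp.name) for dp in data_points}


if __name__ == "__main__":
    points = [DataPoint("party", "entity"), DataPoint("effective_date", "date")]
    request = build_extraction_request("This agreement is made by Acme Ltd.", points)
    print(json.dumps(request, indent=2))

    # A made-up response, standing in for what an API call might return.
    sample_response = '{"extracted": {"party": "Acme Ltd"}}'
    print(parse_result(sample_response, points))
```

In a real integration, the request would be sent to the service over HTTP and the response body passed to a parser like the one above. Model training (step 2) happens on the service side and is not shown.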