The CIS platform crawls web content such as news articles, press releases, long-form pieces, social media, and other unstructured information. Crawled content is fed into a database in real time and stored for later analysis. Once content is in the system, the platform uses Apache Airflow to schedule tasks, workflows, and recurring processes, which are triggered either when certain types of content are found or when relevant information is discovered. The results feed into the AI engine, a modular system built on several libraries, including scikit-learn and spaCy. The AI engine determines the type of content being crawled and extracts useful, relevant information. All of this is then available through our Postgres-based database or our custom API.
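
To make the flow above concrete, here is a minimal, standard-library-only sketch of the store, classify, extract, and trigger steps. It is an illustration under stated assumptions, not the CIS implementation: the `Document` type, the keyword rules, and the `process` function are hypothetical names, and the keyword matching and capitalized-word extraction merely stand in for trained scikit-learn and spaCy models. An in-memory dict stands in for the Postgres database, and a list of task names stands in for Airflow workflows.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """A crawled item plus the fields the AI engine would fill in (hypothetical)."""
    url: str
    text: str
    content_type: str = "unknown"
    entities: List[str] = field(default_factory=list)


# Assumption: simple keyword rules standing in for a trained content classifier.
KEYWORDS = {
    "press_release": ("announces", "press release"),
    "news": ("reported", "according to"),
}


def classify(doc: Document) -> str:
    """Return the first content type whose keywords appear in the text."""
    lowered = doc.text.lower()
    for label, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return label
    return "unknown"


def extract_entities(doc: Document) -> List[str]:
    """Naive stand-in for entity extraction: keep capitalized tokens."""
    return [tok.strip(".,") for tok in doc.text.split() if tok[:1].isupper()]


def process(doc: Document, store: Dict[str, Document],
            triggers: Dict[str, List[str]]) -> List[str]:
    """Label and store a document, then return the task names its type triggers."""
    doc.content_type = classify(doc)
    doc.entities = extract_entities(doc)
    store[doc.url] = doc  # stand-in for a database write
    return list(triggers.get(doc.content_type, []))
```

In this sketch, passing a press release through `process` with `triggers={"press_release": ["summarise"]}` returns `["summarise"]`, mirroring the idea that follow-up work is scheduled when a certain type of content is found.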

A secondary task queue runs events that enable custom actions, such as sending personalized e-mails, generating alerts, or firing webhooks to services that depend on the crawled content or the extracted information. This queue powers user experiences that depend on model results or on other parts of the AI and analytics stack.
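As a rough illustration of how such an event queue can fan out to custom actions, the sketch below registers handlers per event type and drains a queue of events. Everything here is an assumption made for the example: the `on` decorator, the `entity_extracted` event type, and both handlers are hypothetical, and the handlers return descriptive strings instead of actually sending an alert or making an HTTP request.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], str]

# Registry mapping an event type to the actions that should run for it.
handlers: Dict[str, List[Handler]] = {}


def on(event_type: str) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler for one event type."""
    def register(fn: Handler) -> Handler:
        handlers.setdefault(event_type, []).append(fn)
        return fn
    return register


@on("entity_extracted")
def send_alert(event: Event) -> str:
    # A real handler would notify a user; this one only describes the alert.
    return f"alert: {event['entity']} mentioned in {event['url']}"


@on("entity_extracted")
def fire_webhook(event: Event) -> str:
    # A real handler would POST to a subscriber's endpoint.
    return f"webhook: POST payload for {event['url']}"


def drain(queue: Deque[Event]) -> List[str]:
    """Process events in arrival order, running every registered handler."""
    results: List[str] = []
    while queue:
        event = queue.popleft()
        for handler in handlers.get(event["type"], []):
            results.append(handler(event))
    return results
```

Keeping the handlers in a registry means a new action can be added for an event type without touching the code that drains the queue.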
We believe in world-class customer experiences and world-class technologies. Our platform is designed for speed, scalability, and timely reporting. All crawled content, labelled data, and model results are available via APIs or via direct connections to the Postgres databases that hold the content. Through Airflow, we monitor scheduled tasks, tracking both whether they complete successfully and how long it takes to generate and apply model results. We proactively monitor model application time so that our modeling infrastructure does not become a bottleneck, even while models are being retrained or reapplied.
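The two things monitored above, successful completion and elapsed time, can be sketched in a few lines of plain Python. This is not how Airflow records task state, and it does not use Airflow at all; `run_timed`, the `budget_seconds` threshold, and the injectable `clock` parameter are hypothetical names chosen for the illustration.

```python
import time
from typing import Any, Callable, Dict


def run_timed(name: str, task: Callable[[], Any], budget_seconds: float,
              clock: Callable[[], float] = time.monotonic) -> Dict[str, Any]:
    """Run a task and report whether it succeeded and whether it ran over budget."""
    start = clock()
    try:
        task()
        succeeded = True
    except Exception:
        # Record the failure instead of propagating it, so it can be reported.
        succeeded = False
    elapsed = clock() - start
    return {
        "task": name,
        "succeeded": succeeded,
        "elapsed": elapsed,
        "over_budget": elapsed > budget_seconds,
    }
```

A report with `over_budget` set to true is the kind of signal that would prompt a closer look at model application time before it slows the rest of the pipeline.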
In addition to the architecture outlined above, CIS offers custom development services to white-label the software and to give your business and employees custom content-based experiences, analysis, and applications. Contact us to learn more.
© Chimera Information Systems Inc., 2021