data parsing

  • Scrape data from millions of pages and documents across five separate judicial websites without overloading them.
  • Collect updated data from the judicial case files every day: not only structured data but also embedded PDF, Word, and JPG files, among others.
  • Keep the scripts running continuously, adding new case files and updating existing ones whenever something has changed.
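
One common way to decide whether an already-collected case file "has changed" is to compare a hash of its content between runs. The sketch below illustrates that idea only; it is not taken from the project, and the names `content_fingerprint`, `classify_case_file`, and the in-memory `seen` dictionary are hypothetical stand-ins for whatever storage is actually used.

```python
from __future__ import annotations

import hashlib


def content_fingerprint(payload: bytes) -> str:
    """Return a stable SHA-256 hex digest of a downloaded page or file."""
    return hashlib.sha256(payload).hexdigest()


def classify_case_file(case_id: str, payload: bytes, seen: dict[str, str]) -> str:
    """Report 'new', 'updated', or 'unchanged' and remember the latest digest.

    `seen` maps a case identifier to the digest stored on the previous run
    (an assumption; a real system would likely keep this in a database).
    """
    digest = content_fingerprint(payload)
    previous = seen.get(case_id)
    seen[case_id] = digest
    if previous is None:
        return "new"
    return "unchanged" if previous == digest else "updated"
```

Hashing works the same way for structured data and for binary attachments such as PDF or JPG files, since only the raw bytes are compared.
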
customer location

A law consulting company based in South America. The client automates, manages, classifies, and stores court case files, documents, and contracts of all kinds using AI algorithms.

*all information is published solely with the consent of the customer

our solution

data parsing dashboard

Created a distributed system architecture of Linux nodes with a dynamic pipeline, which makes it possible to handle peak loads and set priorities.
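
To give a rough idea of what priority handling in such a pipeline can look like, here is a minimal single-process sketch built on a heap. It is an illustration under assumptions, not the production design: the class `PriorityPipeline`, the `ScrapeTask` fields, and the convention that a lower number means higher urgency are all hypothetical, and a real distributed setup would use a shared queue between nodes.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class ScrapeTask:
    priority: int                    # lower value = more urgent (assumed convention)
    sequence: int                    # tie-breaker that keeps insertion order stable
    url: str = field(compare=False)  # excluded from ordering


class PriorityPipeline:
    """In-memory priority queue that hands out the most urgent tasks first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def submit(self, url, priority):
        heapq.heappush(self._heap, ScrapeTask(priority, next(self._counter), url))

    def next_batch(self, size):
        """Pop up to `size` URLs, most urgent first."""
        batch = []
        while self._heap and len(batch) < size:
            batch.append(heapq.heappop(self._heap).url)
        return batch
```

During a peak, workers simply keep asking for the next batch, so urgent tasks are served first while lower-priority work waits in the queue.
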

Created an algorithm that scrapes new files immediately during the daytime, depending on traffic, and runs large-scale updates at night.
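
The day and night split can be pictured as a small decision function. The sketch below is only a guess at the shape of such logic: the night window (22:00 to 06:00), the `busy_threshold` value, the mode names, and the reading of "traffic" as requests per minute are all assumptions made for illustration.

```python
from datetime import time

# Assumed night window; the real boundaries are not stated in the case study.
NIGHT_START = time(22, 0)
NIGHT_END = time(6, 0)


def is_night(now: time) -> bool:
    """True when `now` falls inside the window that wraps around midnight."""
    return now >= NIGHT_START or now < NIGHT_END


def choose_mode(now: time, requests_per_minute: float, busy_threshold: float = 100.0) -> str:
    """Pick a scraping mode from the time of day and the current traffic level."""
    if is_night(now):
        return "bulk_update"         # large-scale refresh of existing case files
    if requests_per_minute >= busy_threshold:
        return "throttled_new_only"  # heavy traffic: fetch new files, but slowly
    return "new_files"               # normal daytime: fetch new files immediately
```

A scheduler could call `choose_mode` once per cycle and adjust worker counts or request delays according to the returned mode.
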

Implemented proxies and AI technologies to overcome bot protection and process 14.8 million pages daily. We download about 14 GB of important data every day.
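
For scale, 14.8 million pages per day averages out to roughly 171 pages per second, so spreading requests across proxies while capping the rate per website matters. The sketch below shows one simple way to plan requests with round-robin proxy selection and a minimum delay per host. It is an assumption-laden illustration: `PoliteFetchPlanner`, the proxy labels, and the delay value are hypothetical, and it does not attempt to represent the AI-based part of the solution.

```python
import itertools
from urllib.parse import urlsplit


class PoliteFetchPlanner:
    """Assign each request a proxy and a start time that respects a per-host delay."""

    def __init__(self, proxies, min_delay_seconds):
        self._proxies = itertools.cycle(proxies)  # simple round-robin rotation
        self._min_delay = min_delay_seconds
        self._next_allowed = {}                   # host -> earliest next request time

    def plan(self, url, now):
        """Return (proxy, start_time); start_time is delayed if the host was hit recently."""
        host = urlsplit(url).hostname
        start = max(now, self._next_allowed.get(host, now))
        self._next_allowed[host] = start + self._min_delay
        return next(self._proxies), start
```

Because the clock value is passed in as `now`, the planner can be driven by `time.monotonic()` in a worker or by fixed numbers when testing.
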

Created a cloud SQL database with daily dumps to Elasticsearch to store the data we use; files are uploaded directly to Elasticsearch.
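
A daily dump from SQL into Elasticsearch is often sent through the Bulk API, which expects newline-delimited JSON: one action line followed by one document line per record. The sketch below only builds that payload from rows that have already been read from the database. The function name `build_bulk_payload`, the index name, and the assumption that every row carries an `id` key are illustrative, not details from the project.

```python
import json


def build_bulk_payload(index_name, rows):
    """Build an NDJSON body for the Elasticsearch Bulk API from SQL rows.

    `rows` is an iterable of dicts, each with an 'id' key (an assumption here)
    that becomes the document `_id`; the remaining keys form the document.
    """
    lines = []
    for row in rows:
        document = {key: value for key, value in row.items() if key != "id"}
        action = {"index": {"_index": index_name, "_id": str(row["id"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(document))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""
```

Using the database key as `_id` makes the dump repeatable: re-running it overwrites existing documents instead of creating duplicates.
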
