Behind the scenes: The training data pipeline for SAP's Document Information Extraction service
Overcoming the challenges of data labeling efficiency and quality in the face of highly complex ontologies.
January 26th, 2023 - 5:00 p.m. CET / 11:00 a.m. EST - English
Register to participate!
Would you like to learn how to build an automated document information extraction application with SAP's principal data scientist?
To do so, join our upcoming webinar and get insights into SAP's approach to Intelligent Document Processing!
On January 26th, at 5:00 p.m. CET / 11:00 a.m. EST, Manuel Zeise will share his insider tips on SAP’s:
- management of complex annotation problems and how they've been mitigated through the Kili Technology data labeling platform;
- usage of ML and Kili Technology to scale data labeling through OCR and pre-annotation;
- prioritization of labeling tasks to reduce the time to first model.
Principal Data Scientist
"We have to spend a lot of time preparing high-quality data before we train the model. Just as a chef would spend a lot of time to source and prepare high-quality ingredients before they cook a meal, a lot of the emphasis of the work in AI should shift to systematic data preparation."
Co-Founder @Google Brain
"Even after 4 years I still haven't "solved" labeling workflows. Labeling, QA, final QA, auto-labeling, error-spotting, diversity massaging, labeling docs & versioning, ppl training, escalations, data cleaning, throughput & quality stats, etc."
“But the real-world experience of those who put them into production shows that (...) it's often the quality of data (...) that makes your AI project succeed or fail.”
Co-founder & CTO