
Data has often been compared to oil: the new fuel that business runs on. IBM CEO Ginni Rometty unpacked the metaphor at the World Economic Forum in Davos in 2019. “I think the real point to that metaphor,” Rometty said, “is value goes to those that actually refine it, not to those that just hold it.”
A different view came from Alphabet CFO Ruth Porat. “Data is actually more like sunlight than it is like oil because it is actually unlimited,” she said during a panel discussion in Davos. “It can be applied to multiple applications at one time. We keep using it and regenerating.”
An article entitled “Are data more like oil or sunlight?”, published in The Economist in February 2020, highlighted both sides: data is regarded as the “most valuable resource,” yet at the same time it can be a public asset that people should share and make the most of collectively.
AI is booming, yet the data labeling behind it is inefficient
Many industries are actively embracing AI as part of their structural transformation. From autonomous driving to drones, from diagnosis-assisting medical systems to digital marketing, AI has made more and more fields more efficient and intelligent.
Turing Award winner Yann LeCun has pointed out that developers need labeled data to train AI models, and that more high-quality labeled data leads to more accurate AI systems, both commercially and technically. LeCun is one of the godfathers of deep learning and the inventor of convolutional neural networks (CNNs), one of the key elements that have spurred a revolution in AI in the past decade.
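To make that point concrete, the minimal sketch below shows the kind of structured annotation a labeling pipeline typically produces for an image and how it becomes a training target for a supervised model. The field names, labels, and helper function are purely illustrative assumptions, not any vendor's actual format.

```python
# A minimal, hypothetical example of labeled data for object detection.
# Field names and values are illustrative only, not a specific vendor's format.
labeled_example = {
    "image": "frame_000123.jpg",
    "annotations": [
        # Each box: pixel coordinates (x_min, y_min, x_max, y_max) plus a class label.
        {"bbox": [102, 54, 310, 240], "label": "car"},
        {"bbox": [412, 88, 470, 230], "label": "pedestrian"},
    ],
}

def to_training_target(example):
    """Convert one labeled example into the (boxes, labels) pair a detector trains on."""
    boxes = [ann["bbox"] for ann in example["annotations"]]
    labels = [ann["label"] for ann in example["annotations"]]
    return boxes, labels

print(to_training_target(labeled_example))
```

The quality of these labels directly bounds the quality of the model: mislabeled boxes or classes become the ground truth the network is optimized against.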
Facing this AI blue ocean, a large number of data providers have poured in. Data service companies recruit large numbers of data labelers, train them on each specific task, and distribute the workload across different teams. Alternatively, they subcontract the labeling project to smaller data factories, which in turn recruit people to process the divided datasets intensively. These subcontractors or data factories are usually located in India or China, where labor is cheaper. Once the subcontractors complete the first round of data inspection, they collect the labeled datasets and pass them on to the primary data service provider, which runs its own inspection again and delivers the results to the AI team.
Complicated, right? For an industry as fast-moving as AI, such a traditional working process is inefficient: it means longer turnaround times and higher overhead costs, much of which is wasted in the secondary and tertiary distribution stages. ML companies are forced to pay high prices, yet the small teams that actually do the labeling hardly benefit.
ByteBridge: an automated data annotation platform to empower AI
ByteBridge.io has made a breakthrough with its automated data labeling platform, built to serve data scientists and AI companies in an effective and engaging way.
With a fully automated data service system, ByteBridge.io has developed a mature and transparent workflow. On ByteBridge’s dashboard, developers can create projects themselves and monitor progress in real time, on a pay-per-task model with a clear estimated time and price.
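As a rough illustration of how a pay-per-task model translates into an upfront estimate, the sketch below computes cost and turnaround from a task count. The per-task price, throughput, and labeler count are assumptions for illustration, not ByteBridge’s actual pricing or API.

```python
# Hypothetical pay-per-task estimate: cost scales with the number of tasks,
# turnaround scales inversely with how many labelers work in parallel.
# All figures are illustrative assumptions, not actual ByteBridge pricing.

def estimate_project(num_tasks, price_per_task=0.05,
                     tasks_per_labeler_per_hour=120, labelers=50):
    cost = num_tasks * price_per_task                              # total price in dollars
    hours = num_tasks / (tasks_per_labeler_per_hour * labelers)   # wall-clock hours
    return cost, hours

cost, hours = estimate_project(num_tasks=100_000)
print(f"Estimated cost: ${cost:,.2f}, estimated time: {hours:.1f} hours")
```

The appeal of such a model is that the developer sees both numbers before committing, rather than negotiating opaque contracts through layers of subcontractors.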
ByteBridge.io places great emphasis on application scenarios such as autonomous driving, retail, agriculture and smart homes. It is dedicated to providing the best data solutions for AI development and unlocking the full value of data. “We focus on addressing practical issues in different application scenarios for AI development through one-stop, automated data solutions. The data labeling industry should make technology its core competitiveness, with advantages in efficiency and cost,” said Brian Cheong, CEO and founder of ByteBridge.io.
It is undeniable that data has become a precious social resource. Whatever metaphor we choose, new gold, oil, currency or sunlight, raw data is meaningless at first: it needs to be collected, cleaned and labeled before it becomes something valuable. ByteBridge.io understands this power of data and aims to provide the best data labeling service to accelerate the development of AI with accuracy and efficiency.