Showing posts with label covid-19. Show all posts

Tuesday, November 3, 2020

How an Automated Data Labeling Platform Accelerates AI industry’s Development During COVID-19

The impact of AI on COVID-19 has been widely reported across the globe, yet the impact of COVID-19 on AI has received far less attention. As a direct result of COVID-19, AI enterprises are stepping up their strategies for digital transformation and business automation.

Data is the core of any AI/ML development. The quality and depth of the data determine how capable an AI application can be: the better the data that goes into training an ML model, the better the output. ML teams therefore need to go through proper data preparation, including data collection, cleansing and labeling.

Data labeling is a simple but difficult task

Data labeling is the essential step of processing raw data (images, text files, videos, etc.) so that machine learning models, for example in computer vision, can learn from the labeled dataset. Due to the pandemic, some data labeling companies were forced to move to a work-from-home model, which has posed challenges in terms of communication, data quality and inspection. For example, Google Cloud has officially announced that its data labeling services are limited or unavailable until further notice. Users can only request data labeling tasks through email and cannot start new data labeling tasks through the Cloud Console, Google Cloud SDK, or the API.

Insiders say that data labeling is a simple but difficult task. On one hand, once the labeling standard is set, data labelers just need to follow the rules with patience and professionalism. On the other hand, labeling for ML must be both accurate and efficient, and given the massive amount of data to be labeled, it carries a high cost in labor and time.

A majority of AI organizations said the process of training AI with data has been more difficult than expected, according to a report released by Alegion. Lack of data and data quality issues are their main obstacles to AI application.

An automated data labeling platform aims to transform the industry

To deal with such issues, Bytebridge.io has launched its automated data labeling platform this year. It aims to provide high-quality data efficiently through real-time workflow management, freeing AI developers from the pressure of data preparation.

An autonomous driving company in Korea needed to label roadblocks and draw 2D bounding boxes for cars. For data security reasons, it had built an in-house labeling team. However, the team ran into unexpected problems due to improper labeling tools and low efficiency. After trying Bytebridge, its project managers were able to improve working efficiency through Bytebridge’s online real-time monitoring function. The number of images labeled per month increased from 600k to 750k, and the company saved 60% of its budget.

On Bytebridge’s dashboard, developers can upload raw data and create labeling projects by themselves. They can check the labeling status and quality at any time, as well as the estimated price and time required. Such an automated online platform helps ensure labeling efficiency and quality. Bytebridge’s easy-to-integrate API enables continuous feeding of high-quality data into machine learning systems. Data can be processed 24/7 by global contractors, in-house experts and AI technology.

“We want to create an automated data labeling platform that helps AI/ML companies to accelerate their data project and generate high-quality work,” said Brian Cheong, CEO and founder of Bytebridge.io.

Monday, September 21, 2020

AI Creates More Jobs, but is Conditional without ByteBridge.io

 The Rise of Machine Learning

Machine learning is creating job opportunities across the world. According to the World Economic Forum (WEF), Artificial Intelligence (AI) could create a net increase of 58 million jobs by 2022. Besides tech giants such as Apple, Facebook, Google and Amazon, which are hiring Machine Learning engineers, other industries are increasingly leveraging this emerging technology at advanced levels. The huge demand for Machine Learning skills spans all kinds of fields, including but not limited to Financial Analysis, Smart Farming, Health Services and Online Education.

COVID-19’s AI Boom

During the current COVID-19 pandemic, the appeal of remote work is surging globally, and the spread of telecommuting is driving a growing need for intelligent machines in the workforce. Particularly when facing huge datasets, AI allows smooth cybersecurity checks, optimizes the process of pulling data, and supports telework with significant gains in efficiency and time savings.

Apart from benefiting the remote workforce, this powerful emerging technology has been deployed to fight the virus. Microsoft AI can be used for early detection of COVID-19 and for allocating limited resources, such as medical supplies and hospital space, through smart decision making.


The future of intelligent machines is promising. According to Acumen Research and Consulting, global investment in Machine-Learning-related products and services is expected to reach up to US$76.8 billion by 2026. The application of AI technology will soon break new ground as businesses keep fueling the world market.

Skill-biased Opportunity

While AI is widely agreed to have great potential, the application of this emerging digital technology marks a major shift in the quality, location and requirements of new roles. Machine learning, a sub-field of AI that automates analytical model building, is now increasingly adopted across industries. But not everyone stands to benefit automatically.

According to Uria-Recio’s TEDxIMU Talk, AI will continuously push human professionals up the skillset ladder toward cognitive human skills. Process-oriented employment, i.e., jobs with repetitive activities such as machine operation, is now declining. Over the next decades, more than 80% of these jobs will be done by intelligent machines.


In the meantime, a large number of the job opportunities created involve cross-functional reasoning skills. When routine jobs are replaced by AI systems, businesses start looking for educated workers for new roles. Accordingly, “human-machine collaboration” favors applicants with advanced cognitive skills.

Workforce Transition

Given the increasing demand for creative and reasoning work, job seekers should now upgrade their skills to adapt to new opportunities. ByteBridge.io, a tech startup providing data training solutions, facilitates this workforce transition for workers.

ByteBridge.io reduces the advanced cognitive skills otherwise required of well-trained labelers. By interpreting complicated annotation rules, organizing the model into multiple stages, and dividing a big task into small pieces, ByteBridge.io lowers the need for well-educated workers. The design of ByteBridge’s platform is clear and easy to use.

Moreover, unlike traditional machine learning companies that hire trained employees or managed teams for data labelling, ByteBridge.io incorporates blockchain technology into its data training solutions. ByteBridge’s algorithm borrows the idea of a consensus mechanism from cryptocurrency, distributing tasks to all users on the data platform.

ByteBridge.io replaces the technical quality check of trained labelers with a general agreement system. The platform assigns several people to do the same work, and the correct answer is taken to be the one that comes back from the majority of labelers. A single task may therefore be completed multiple times by different users. As a result, this process draws on the contributions of hundreds of thousands of participants who work on the verification and authentication of data labelling.
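The agreement rule described above can be sketched in a few lines of Python. This is only an illustration of a strict majority vote, not ByteBridge's actual implementation; the function name `majority_label` and the sample votes are hypothetical.

```python
from collections import Counter


def majority_label(votes):
    """Return (label, share) for one task labeled by several people.

    `label` is the answer given by more than half of the labelers,
    or None when no answer has a strict majority.
    `share` is the fraction of labelers who gave the most common answer.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    label, count = Counter(votes).most_common(1)[0]
    share = count / len(votes)
    # A strict majority means more than half of all votes.
    return (label if count * 2 > len(votes) else None), share


# Hypothetical example: five labelers annotate the same image.
votes = ["car", "car", "truck", "car", "car"]
print(majority_label(votes))
```

When no answer wins more than half of the votes, the function returns `None` for the label, which a real system might treat as a signal to collect more votes.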

For businesses with data training needs, ByteBridge.io provides a choice of accuracy levels. The benchmark score measures the consistency among users: a score of 75 indicates that 75% of users agree the label is correct. Requiring a higher benchmark score therefore improves the accuracy of the data labeling task and implies better data quality. The consensus mechanism also greatly improves distribution efficiency, so the customer can get a large amount of accurate data in a very short time.
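As a rough sketch of how such a score could be computed and used as a filter, the snippet below treats the benchmark score as the percentage of labelers who chose the most common label. The function names, the item identifiers and the filtering step are assumptions made for illustration and are not taken from ByteBridge's platform.

```python
from collections import Counter


def benchmark_score(votes):
    """Percentage (0-100) of labelers who chose the most common label."""
    if not votes:
        raise ValueError("at least one vote is required")
    top_count = Counter(votes).most_common(1)[0][1]
    return 100 * top_count / len(votes)


def accepted_items(votes_by_item, required_score):
    """Keep only items whose agreement meets the customer's required score."""
    scores = {item: benchmark_score(v) for item, v in votes_by_item.items()}
    return {item: s for item, s in scores.items() if s >= required_score}


# Hypothetical votes from four labelers on two images.
votes_by_item = {
    "img_1": ["cat", "cat", "cat", "dog"],   # 3 of 4 agree
    "img_2": ["cat", "dog", "dog", "bird"],  # 2 of 4 agree
}
print(accepted_items(votes_by_item, required_score=75))
```

Raising `required_score` accepts fewer items but with stronger agreement behind each accepted label.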


Addressing Worldwide Technological Unemployment

ByteBridge.io not only provides work locally but is also tackling the skill-biased Machine Learning Revolution for individuals around the world. With more than 100,000 registered users across Asia, North America, the EU, and Africa, ByteBridge.io is offering tens of millions of online job opportunities based on big data and recommendation services all around the world.

Importantly, ByteBridge.io provides mutual benefit for business customers as well. This worldwide data factory “hires” all kinds of employees. Based on the education level, the language used, and competency-based assessment scores, the workforce can cover a wide range of needs from customers.

ByteBridge.io, as an intermediate data solution provider, bridges the massive number of new advanced roles and less-skilled workers, bringing working opportunities around the globe.

How Data Labeling Contributes to the War against Covid-19

The healthcare industry is under enormous pressure, especially during the Covid-19 period. The unexpected global pandemic has presented overwhelming challenges to humanity. Scientists, medical experts, doctors and nurses across the globe have taken up the responsibility of fighting the disease. However, with a shortage of healthcare workers, we cannot deny how limited current medical capacity is.

On December 30, 2019, Healthmap, an artificial intelligence (AI) data-driven system that scans data sources for signs of disease outbreaks, detected unusual activity around an outbreak of a new type of pneumonia in China. One day later, BlueDot, an AI outbreak risk software, raised a similar alarm after scanning thousands of Chinese news reports with its machine learning algorithms.

There’s no doubt that Covid-19 has been a catalyst for strengthening the connection and cooperation between AI and the healthcare industry.

Medical image diagnosis for future healthcare

AI and ML can be powerful tools across healthcare: medical research, diagnosis, disease prevention and control, patient treatment, even administrative and personnel management. AI/ML-enabled systems improve capability and effectiveness by automating the most repetitive and homogeneous activities. The technology is currently moving out of the labs and into real-world applications in the health sector.

When it comes to medical images, ML’s applications can cover the entire cycle from image creation and reconstruction to diagnosis and outcome prediction. AI-backed machines use computer vision to detect patterns that the human eye can’t catch and correlate them with similar medical image data to identify possible diseases and prepare reports after analysis. X-ray, computed tomography (CT), magnetic resonance imaging (MRI) and other image-based test reports can be screened to predict various illnesses in an automated, accurate, and fast way.

Some healthcare companies are now using ML technology to detect organ anomalies, such as identifying tumors from an MRI scan of the brain. They use millions of labeled medical images that mark the affected area to train ML algorithms to detect such diseases. For example, semantic segmentation can be used in liver and brain diagnosis, polygon annotation in dentistry, bounding boxes for kidney stones, and annotation for detecting cancer cells. Medical image annotations support more accurate results in the early detection, diagnosis and treatment of disease, as well as in understanding what is normal. Medical imaging diagnosis is seen as a powerful method for future applications in the health sector.

Bottlenecks of medical image labeling

High-quality training data is the key to building ML models and helps to improve medical image-based diagnosis. However, a great challenge in this field is the lack of high-quality data and annotation. Specifically, medical imaging annotations have to be performed by clinical specialists, which is costly and time-consuming.


As DJ Patil and Hilary Mason write in Data Driven, “Cleaning the data is often the most taxing part of data science, and is frequently 80% of the work.” The lack of precise, high-quality data presents an overwhelming challenge for the machine learning industry, limiting its ability to provide the “right data” to answer specific questions. Currently, most medical research organizations have limited access to data samples from certain geographic areas.

The hardest part of building AI products is not the AI or algorithms but data preparation and labeling. For example, retinal images are used to develop automated diagnostic systems for conditions such as diabetic retinopathy and age-related macular degeneration. To do that, millions of medical images need to be labeled by condition in a structured way. This is laborious, as it requires identifying very small structures, and it usually takes experts hours to annotate them carefully.

Turning points

Aware of those challenges, ByteBridge.io takes a big step forward with its automated data collection and labeling platform. It gives researchers access to high-quality labeled datasets related to health care and public health.

ByteBridge’s innovative data training platform empowers healthcare researchers and ML medical companies to use data cost-effectively and improve healthcare outcomes. From data collection, to data labeling, to machine learning applications, ByteBridge.io provides professional data annotation service on medical images with the highest quality and maximum accuracy.

Unlike traditional data labeling companies, ByteBridge’s dashboard lets researchers create data projects by themselves, upload raw data, download processed results and check ongoing labeling progress at the same time. It works on a pay-per-task model with a clear time estimate and more control over the project status.

Compared to existing Western data annotation outsourcing companies, Bytebridge.io charges 90% less. Its prices are 50% lower than those of its competitors in China and India. Moreover, ByteBridge’s data processing speed is more than 10 times faster than that of existing data annotation companies.

“I believe that we can achieve great innovation in this field based on our product development capabilities and underlying blockchain-based technology. ByteBridge.io is aimed at accelerating the development of ML industry and seamlessly transforming it into other essential areas such as healthcare,” said Brian Cheong, CEO of ByteBridge.io.

Imagine one day, patients can simply go through a fast AI scan as diagnosis; smart wearable devices, such as Apple Watch, can analyze physical data, note abnormality and generate an alarm before you are about to have a heart attack or a stroke; medical detection and prediction can be fully automated and supervised with little human intervention. Such scenes can definitely be realized in the coming future, thanks to ML and AI technology.

Machine Learning has achieved unprecedented success in computer vision and other fields so far. Now it is revolutionizing healthcare with indispensable support from automated data labeling services.

Invisible Workforce of the AI era

The Surging Demand for Data Labelling Services

Thirty years ago computer vision systems could hardly recognize hand-written digits. But now AI-powered machines are able to facilitate self-driving vehicles, detect malignant tumors in medical imaging, and review legal contracts. Along with advanced algorithms and powerful compute resources, labeled datasets help to fuel AI’s development.

AI runs on data. Unstructured raw data first needs to be labeled so that machine learning algorithms can understand it. Given the rapid expansion of digital transformation, there is a surging demand for high-quality data labeling services. According to Fractovia, the data annotation tools market was valued at $650 million in 2019 and is projected to surpass $5 billion by 2026. The expected market growth reflects the increasing conversion of raw unlabeled data into useful Business Intelligence (BI) by machine learning with human guidance.


AI’s new workforce

Data labelers are referred to as “AI’s new workforce” or “invisible workers of the AI era”. They annotate tremendous amounts of raw data for model training, which enables the public to enjoy goods and services powered by machine learning. In this hugely lucrative market, there is more than one way for the data labelling industry to organize its workforce.

In-house

Enterprises hire part-time or full-time data labelling teams and directly oversee the whole tagging process. When annotation projects are quite specific, the team can adjust to changing needs. As a rule of thumb, in-house teams are more common for long-term AI projects, where data flows continuously over prolonged periods.

The cons of an in-house data labeling team are quite obvious. It’s expensive to hire and train a professional labeling team, develop software with the right tools and maintain a secure working environment.

Outsourcing

Hiring a third-party annotation service can be another option. Outsourcing companies have experienced annotators who finish tasks with higher speed and efficiency. Specialized labelers can process a large volume of data within a shorter period.

On the other hand, outsourcing results in less control over the project process and the communication cost is comparably high. A clear set of instructions is necessary for the labeling team to understand what the task is about and make annotations correctly. Tasks may also change as developers optimize their models. Besides that, it takes extra time to check the quality of the completed tasks.

Crowdsourcing

Crowdsourcing means sending data labelling tasks to many individual labelers at once. It breaks down large and complex projects into smaller and simpler parts for a large distributed workforce. A crowdsourcing labelling platform also tends to have the lowest cost, which makes it a top choice under a tight budget constraint.
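To make the idea of breaking a project into smaller parts concrete, here is a minimal sketch that splits a dataset into batches and hands each batch to several labelers in round-robin order. The function names, the batch size and the `copies` parameter are hypothetical and only illustrate one simple way to distribute work.

```python
def split_into_batches(items, batch_size):
    """Split a list of items into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def assign_batches(items, batch_size, workers, copies=1):
    """Give every batch to `copies` different workers, round-robin.

    Returns (batches, assignments), where assignments maps each worker
    to the list of batch indices that worker should label.
    """
    if not 1 <= copies <= len(workers):
        raise ValueError("copies must be between 1 and the number of workers")
    batches = split_into_batches(items, batch_size)
    assignments = {worker: [] for worker in workers}
    next_worker = 0
    for batch_id in range(len(batches)):
        # Consecutive workers in the cycle are distinct because copies <= len(workers).
        for _ in range(copies):
            worker = workers[next_worker % len(workers)]
            assignments[worker].append(batch_id)
            next_worker += 1
    return batches, assignments


# Hypothetical example: 7 images, batches of 3, each batch labeled twice.
batches, assignments = assign_batches(list(range(7)), 3, ["a", "b", "c"], copies=2)
print(batches)
print(assignments)
```

Assigning each batch to more than one labeler costs more labeling effort, but it produces the overlapping answers that agreement-based quality checks rely on.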

While crowdsourcing is considerably cheaper than other approaches, its biggest challenge, as one might expect, is the accuracy of the completed tasks. According to a report studying the quality of crowdsourced workers, the error rate is significantly related to the annotation type. For basic description tasks, crowdsourced workers’ error rate is around 6%, much lower than the 40% for sentiment analysis tasks.
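A quick calculation shows why these per-worker error rates matter when several workers label the same item and the majority answer is kept. The sketch below assumes a binary label, an odd number of workers, and errors that are independent between workers. These are simplifying assumptions for illustration only and do not come from the cited report.

```python
from math import comb


def majority_error_rate(p, n):
    """Probability that more than half of n independent labelers are wrong.

    p is the per-labeler error rate; n must be odd so that ties cannot occur.
    """
    if n % 2 == 0:
        raise ValueError("use an odd number of labelers to avoid ties")
    need = n // 2 + 1  # number of wrong votes that makes the majority wrong
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


# Per-worker error rates quoted above, with five labelers per item.
for task, p in [("basic description", 0.06), ("sentiment analysis", 0.40)]:
    print(f"{task}: {majority_error_rate(p, 5):.4f}")
```

Under these assumptions, five-way voting brings a 6% error rate down to about 0.2%, while a 40% error rate only drops to about 32%, so harder task types benefit much less from simple redundancy.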

A turning point during COVID-19

Crowdsourcing has proven beneficial during the COVID-19 crisis, as in-house and outsourced data labelers are affected by lockdowns. Meanwhile, people stuck indoors are turning to more flexible jobs. Millions of unemployed or part-time workers are taking up crowdsourced labelling tasks from anywhere with an internet connection.


Bytebridge.io, a tech startup for data service, has also recognized this workforce. It provides high-quality, cost-effective data labeling services for AI companies and job opportunities for labelers, who can work without any limit on time and place.

Bytebridge.io employs a consensus mechanism to optimize the labelling system. Before distributing individual tasks to labelers, the system first sets a consensus index, such as 90%. If 90% of the labeling results for the same part of a task are basically the same, the system judges that a consensus has been reached and moves on to the next part of the task. If the machine learning model requires higher annotation accuracy, the platform can switch to “multi-round consensus”, repeating tasks to improve the accuracy of the final data delivery.
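The consensus index check can be sketched as follows. The post does not spell out how “multi-round consensus” combines repeated rounds, so the second function shows just one possible reading: a label is accepted only if every round reaches the index on the same label. All names here are hypothetical and this is not Bytebridge's actual code.

```python
from collections import Counter


def round_consensus(labels, consensus_index=0.9):
    """Return the agreed label if its share meets the consensus index, else None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) >= consensus_index else None


def multi_round_consensus(rounds, consensus_index=0.9):
    """Assumed reading of multi-round consensus.

    `rounds` is a list of label lists, one per repeated round of the same task.
    A label is returned only if every round reaches consensus on that label.
    """
    results = [round_consensus(labels, consensus_index) for labels in rounds]
    if results and results[0] is not None and all(r == results[0] for r in results):
        return results[0]
    return None


# Hypothetical example: ten labelers per round, two rounds.
round_1 = ["car"] * 10
round_2 = ["car"] * 9 + ["bus"]
print(multi_round_consensus([round_1, round_2]))
```

A task part that returns `None` has not reached consensus and would need further labeling before the system moves on.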

Developers can create their own projects on Bytebridge’s dashboard. The automated platform allows developers to write down their specific requirements for the labeling projects, upload raw datasets and control the labeling process in a transparent and dynamic way. Developers can check the processed data, speed, estimated price and time, even when working from home.

By cutting down intermediary costs and time, Bytebridge.io charges 90% less than Google and any other provider in Silicon Valley and shows data processing speeds 10 times or more faster. Bytebridge.io is devoted to gearing up the AI revolution and digital transformation through its premium data processing service, its automated data platform and its connection to a cost-effective, fragmented international labor force.


ByteBridge.io Provides Language Opportunity Across Globe to Help Local Economy Against Covid-19

ByteBridge.io, an automated service provider for collecting, managing, and processing datasets for the AI and machine learning industries, has partnered with over 10 different language-speaking communities across the globe. The aim is to help local economies against Covid-19 by providing language job opportunities, while delivering the best quality data services to its clients and minimizing the effect of Covid-19 on the local economy.

ByteBridge.io now provides language services covering regions from Asia and Europe to South America, in languages such as Chinese, Korean, Bengali, Vietnamese, Indonesian, Turkish, Arabic, Spanish, and more. Through close partnership with these communities, it improves its data quality in different languages while expanding its service scope.

“We are honored to have support from these different communities, during this global pandemic. We believe the non-English speaking populations are part of the most vulnerable ones. By providing language services, we can leverage the expertise of our global network to help them expand their opportunities they serve,” said Brian Cheong, founder of ByteBridge.io.

Providing services in over 10 languages, ByteBridge.io is dedicated to improving the quality of its data service. By partnering with local communities in different native languages, it can ensure data service quality with the help of thousands of workers across these regions, and the processing time to finish tasks will also be shortened.

Currently, anyone can use ByteBridge.io for free. Clients are charged only once they hit a certain usage threshold; once the free credits run out, they are charged based on the volume of data they upload and the breadth of services they use.

About ByteBridge.io:

ByteBridge.io is a self-service platform for managing and monitoring the overall data processing. It provides data collecting and labeling services for organizations and convenient toolkits for machine learning companies to initiate tasks, manage the data they are receiving, and ensure the quality of the data meets their requirements.

No Bias Labeled Data — the New Bottleneck in Machine Learning

  The Performance of an AI System Depends More on the Training Data Than the Code Over the last few years, there has been a burst of excitem...