Microsoft has a new auto-captioning system for images. The system will launch first in Azure Cognitive Services. However, Microsoft has indicated that the same technology will later reach Microsoft Word, Outlook, and PowerPoint.

How Does The New AI-Driven Image Captioning System Work?

Any AI-driven system first has to be trained on relevant datasets. The algorithms learn from the data points and gain the ability to reproduce the expected patterns. Microsoft’s new image captioning system was reportedly pre-trained on a huge dataset of images paired with word tags, with each tag mapped to a distinct object in an image. After this initial training, researchers fine-tuned the pre-trained model for captioning on a dataset of images that already had captions. The pre-training taught the model a visual vocabulary, and the fine-tuning taught it how to compose an understandable sentence. The model then draws on that visual vocabulary to generate accurate captions for images containing novel objects, meaning objects that did not appear in the captioned training data. The emphasis appears to be on the object that is specific or unique in the image.
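
The two-stage idea described above can be illustrated with a deliberately tiny sketch. This is not Microsoft's model, which is a large neural network. The function names, the two-number "features", and the template approach are all assumptions made up for illustration. Stage 1 learns a visual vocabulary from tagged image regions, stage 2 learns a sentence pattern from captioned images, and the result can caption an object ("guitar") that never appeared in any caption.

```python
from collections import Counter, defaultdict

def learn_visual_vocabulary(tagged_regions):
    """Stage 1 (toy): average the feature vectors seen for each word tag."""
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for (x, y), tag in tagged_regions:
        sums[tag][0] += x
        sums[tag][1] += y
        counts[tag] += 1
    return {tag: (s[0] / counts[tag], s[1] / counts[tag]) for tag, s in sums.items()}

def tag_region(features, vocabulary):
    """Return the tag whose average feature vector is closest to the region."""
    def squared_distance(centroid):
        return (features[0] - centroid[0]) ** 2 + (features[1] - centroid[1]) ** 2
    return min(vocabulary, key=lambda tag: squared_distance(vocabulary[tag]))

def learn_caption_template(captioned_images, vocabulary):
    """Stage 2 (toy): turn each caption into a pattern with numbered slots
    where the detected tags appear, and keep the most common pattern."""
    patterns = Counter()
    for regions, caption in captioned_images:
        for i, region in enumerate(regions):
            caption = caption.replace(tag_region(region, vocabulary), "{%d}" % i)
        patterns[caption] += 1
    return patterns.most_common(1)[0][0]

def caption_image(regions, vocabulary, template):
    """Fill the learned sentence pattern with the tags detected in a new image."""
    return template.format(*(tag_region(r, vocabulary) for r in regions))

# Hypothetical stage 1 data: (region features, word tag). "guitar" is tagged here
# but never appears in a caption below.
tagged_regions = [
    ((1.0, 0.0), "dog"), ((0.9, 0.1), "dog"),
    ((0.0, 1.0), "frisbee"), ((0.1, 0.9), "frisbee"),
    ((1.0, 1.0), "guitar"), ((0.9, 0.9), "guitar"),
]
# Hypothetical stage 2 data: (regions in the image, human-written caption).
captioned_images = [([(1.0, 0.05), (0.0, 0.95)], "a dog next to a frisbee")]

vocabulary = learn_visual_vocabulary(tagged_regions)
template = learn_caption_template(captioned_images, vocabulary)
novel_caption = caption_image([(0.95, 0.0), (1.0, 0.95)], vocabulary, template)
print(novel_caption)
```

The point of the split is that word tags are far cheaper to collect than full captions, so the vocabulary stage can cover many more objects than the captioned data does.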

As with all AI models, Microsoft’s image captioning system isn’t 100 percent accurate or perfect. However, Microsoft says the new AI model is twice as good as the image captioning model currently used in the company’s products and services. Internal testing indicates the new model can create captions that are more descriptive and accurate than captions written manually by humans, claims Xuedong Huang, a Microsoft technical fellow and the chief technology officer of Azure AI Cognitive Services in Redmond, Washington. “We’re taking this AI breakthrough to Azure as a platform to serve a broader set of customers,” Huang said. “It is not just a breakthrough in the research; the time it took to turn that breakthrough into production on Azure is also a breakthrough.”

What Huang indicated was that Microsoft has been able to significantly accelerate the development, refinement, and deployment of AI models that can compete against human-generated content. However, it is important to note that these models usually follow a specific set of guidelines and rely heavily on their training datasets. Microsoft has been working hard for the last few years to infuse the power of AI across several of its products and services. AI holds the power to boost productivity while freeing humans to do more creative tasks. Microsoft also aims to use the new automatic image captioning system to help all users, including people with vision impairment, access the important content in any image.