Computer vision is becoming a major focus for the giants of the technology industry, enabling machines to speed up operations and accomplish tasks that until now could only be done by humans.
A few months ago, eBay announced that it would add new search features allowing users to find similar products using existing photos, and online apparel retailer ASOS has made a similar move in fashion. Last week, Shutterstock released a new experimental feature that lets users search for stock photos based on their layout. A few days later, the Google Photos app released a new pet image recognition feature.
In short, developments in computer vision are becoming more and more exciting, and the heavy investment people have made in artificial intelligence is visibly bearing fruit.
At present, most progress in computer vision is happening with static images, but we are also beginning to see results in video. For example, Russian authorities have applied face recognition technology to a nationwide real-time surveillance network. Pornhub is doing something similar, automatically categorizing its “adult entertainment” videos, including training systems to identify specific sexual positions. In addition, there is the booming self-driving car industry, which relies heavily on a machine’s ability to understand real-world behavior.
Against this backdrop, Google has launched a new video dataset, hoping to advance research on recognizing human actions in video. “AVA” is a dataset that provides multiple action labels for the people appearing in video sequences.
The difficulty of action recognition in video is that scenes are complex and intertwined, and that several actions may be performed by different people at the same time.
Google software engineers Gu Chunhui and David Ross explained in a blog post: “Teaching machines to recognize human actions in video is a major problem in the development of computer vision, and it matters for applications such as personal video search and discovery, sports analysis, and gesture interfaces. Despite the exciting breakthroughs in classifying images and finding objects over the past few years, recognizing human actions is still a huge challenge.”
In essence, AVA is a collection of YouTube videos annotated with 80 atomic actions. It spans nearly 58,000 video clips and covers many everyday activities, such as shaking hands, kicking, hugging, kissing, drinking, playing, and walking.
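To make the idea of multiple labels per person more concrete, here is a minimal Python sketch of how such annotations might be represented and grouped. It is only an illustration: the `ActionLabel` record, its field names, the `actions_by_person` helper, and the sample rows are all hypothetical and do not reflect the actual AVA file format or any Google tooling.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionLabel:
    """One (person, action) label at one moment of a clip.

    Hypothetical schema for illustration; not the real AVA format.
    """
    clip_id: str
    timestamp_s: int  # whole seconds from the start of the clip
    person_id: int    # distinguishes people visible at the same moment
    action: str       # one atomic action, e.g. "walk" or "hug"


def actions_by_person(labels, clip_id, timestamp_s):
    """Group the actions seen at one moment of one clip by person."""
    grouped = defaultdict(set)
    for label in labels:
        if label.clip_id == clip_id and label.timestamp_s == timestamp_s:
            grouped[label.person_id].add(label.action)
    return dict(grouped)


# Made-up sample rows: person 0 walks and drinks while person 1 hugs.
EXAMPLE_LABELS = [
    ActionLabel("clip_001", 12, 0, "walk"),
    ActionLabel("clip_001", 12, 0, "drink"),
    ActionLabel("clip_001", 12, 1, "hug"),
    ActionLabel("clip_001", 13, 1, "kiss"),
]

if __name__ == "__main__":
    print(actions_by_person(EXAMPLE_LABELS, "clip_001", 12))
```

Storing one row per person and action keeps the structure flat, so a single person can carry several labels at the same moment and several people can be labeled in the same clip, which is exactly the situation that makes action recognition in video hard.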
By opening up the dataset, Google hopes to improve machines’ “social visual intelligence” so that they can understand what humans are doing and predict what they will do next.
“We hope that the release of AVA will help improve the development of human action recognition systems and provide opportunities to model complex activities based on spatio-temporal labels,” the company said.