This week, Google announced the launch of Cloud Vision, which opens the company’s image recognition technology to developers. The API is currently available in limited preview on Google Cloud Platform.
Google’s image recognition technology is among the strongest available, and it covers tasks such as optical character recognition (OCR), face detection, and object recognition.
Google previously offered its Prediction API, which had no image support. With the launch of Cloud Vision, Google joins a public cloud services market led by Amazon Web Services and Microsoft’s Project Oxford.
In the press release, Google Product Manager Ram Ramanathan said, “Advances in machine learning, powered by platforms like TensorFlow, have enabled models that can learn and predict the content of an image. Our limited preview of Cloud Vision API encapsulates these sophisticated models as an easy-to-use REST API.”
Developers can sign up for the limited preview through Google Cloud Platform.
According to Google, the following set of Google Cloud Vision API features can be applied in any combination on an image:
- Label/Entity Detection picks out the dominant entity (e.g., a car, a cat) within an image, from a broad set of object categories. You can use the API to easily build metadata on your image catalog, enabling new scenarios like image based searches or recommendations.
- Optical Character Recognition to retrieve text from an image. Cloud Vision API provides automatic language identification, and supports a wide variety of languages.
- Safe Search Detection to detect inappropriate content within your image. Powered by Google SafeSearch, the feature enables you to easily moderate crowd-sourced content.
- Facial Detection can detect when a face appears in photos, along with associated facial features such as eye, nose and mouth placement, and likelihood of over 8 attributes like joy and sorrow. We don't support facial recognition and we don’t store facial detection information on any Google server.
- Landmark Detection to identify popular natural and manmade structures, along with the associated latitude and longitude of the landmark.
- Logo Detection to identify product logos within an image. Cloud Vision API returns the identified product brand logo, with the associated bounding polybox.
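To make "applied in any combination" concrete, here is a minimal sketch of how a single request might bundle several of these features. It only builds the JSON body and sends nothing over the network. The field names and feature type strings follow the later public v1 REST API (`images:annotate`), which the announcement itself does not spell out, so treat them as assumptions for the limited preview. The helper `build_annotate_request` and the short keys in `FEATURE_TYPES` are hypothetical names chosen for illustration.

```python
import base64
import json

# Assumption: these type names come from the later public v1 REST API,
# not from the announcement. The short keys on the left are hypothetical.
FEATURE_TYPES = {
    "label": "LABEL_DETECTION",
    "text": "TEXT_DETECTION",
    "safe_search": "SAFE_SEARCH_DETECTION",
    "face": "FACE_DETECTION",
    "landmark": "LANDMARK_DETECTION",
    "logo": "LOGO_DETECTION",
}


def build_annotate_request(image_bytes, features, max_results=5):
    """Build a request body asking for several features on one image."""
    unknown = [name for name in features if name not in FEATURE_TYPES]
    if unknown:
        raise ValueError(f"unknown feature(s): {unknown}")
    return {
        "requests": [
            {
                # Image bytes are sent inline as base64 text.
                "image": {
                    "content": base64.b64encode(image_bytes).decode("ascii")
                },
                # One entry per requested feature, in the order given.
                "features": [
                    {"type": FEATURE_TYPES[name], "maxResults": max_results}
                    for name in features
                ],
            }
        ]
    }


# Placeholder bytes stand in for a real image file.
body = build_annotate_request(b"fake image bytes", ["label", "text", "logo"])
print(json.dumps(body, indent=2))
```

In practice a body like this would be sent as an authenticated POST to the API's annotate endpoint, and the response would carry one result section per requested feature.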