August 1, 2024

What is Google Cloud Vision?

The Google Cloud Vision API is a Google Cloud Platform tool that lets you analyze large numbers of images and extract valuable information that helps interpret them.

Google launched the service in 2015 with developer-oriented options such as easy integration of vision detection features into applications, image labeling, and explicit content detection, among others. It is designed to simplify developers' work and supports a range of file formats, including JPEG and RAW.

Google Cloud Vision also detects faces and landmarks and offers optical character recognition (OCR).

Google Cloud Vision API Features

The main features of Google Cloud Vision include face detection, landmark detection, text detection, and logo detection, among others. Each one is invoked by sending a request to the API. We describe them below:
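Before going through the features one by one, here is a rough sketch of what such a request can look like. It builds the JSON body for the REST method `images:annotate` and asks for several features at once. The bucket path is a placeholder and the helper name `build_annotate_request` is ours, not part of any Google library. Actually sending the body also requires credentials, which are left out here.

```python
import json

# Public REST endpoint for synchronous image annotation.
ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_uri, feature_types, max_results=5):
    """Build an images:annotate body for one remote image (hypothetical helper)."""
    return {
        "requests": [
            {
                # The image is passed by reference, not uploaded.
                "image": {"source": {"imageUri": image_uri}},
                # One entry per feature to run on the image.
                "features": [
                    {"type": feature_type, "maxResults": max_results}
                    for feature_type in feature_types
                ],
            }
        ]
    }


body = build_annotate_request(
    "gs://example-bucket/photo.jpg",  # placeholder path, not a real object
    ["FACE_DETECTION", "LANDMARK_DETECTION", "TEXT_DETECTION", "LOGO_DETECTION"],
)
print(json.dumps(body, indent=2))
```

The body would then be sent as an authenticated POST to `ANNOTATE_URL`; the response contains one result object per entry in `requests`.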

Face detection

Face detection in Google Cloud Vision finds one or more human faces in an image, together with key facial attributes such as emotional state, the coordinates of the face's position, its reference points (facial landmarks), and its orientation. This feature does not support recognizing specific individuals.

Facial recognition tools, by contrast, match the biometric information associated with a detected face against stored, labeled face data, so they work best when the face is seen from the front.

For face detection, the distance between the pupils, measured in pixels, matters because it affects how accurate detection can be. Google Cloud Vision, also called Google Vision, gives reliable results when this distance is at least 32 pixels.
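The pupil-distance check can be sketched as follows. The sample dictionary is hand-written to resemble one entry of the `faceAnnotations` array in a response, with invented coordinates; the function name and the constant are ours, and the 32-pixel threshold is simply the figure quoted above.

```python
import math

MIN_PUPIL_DISTANCE_PX = 32  # threshold quoted in the text above

# Hand-written sample shaped like one "faceAnnotations" entry;
# the coordinates are invented for illustration.
sample_face = {
    "joyLikelihood": "VERY_LIKELY",
    "landmarks": [
        {"type": "LEFT_EYE_PUPIL", "position": {"x": 120.0, "y": 200.0, "z": 0.0}},
        {"type": "RIGHT_EYE_PUPIL", "position": {"x": 168.0, "y": 214.0, "z": 0.0}},
    ],
}


def pupil_distance(face):
    """Return the pixel distance between the two pupil landmarks."""
    points = {lm["type"]: lm["position"] for lm in face["landmarks"]}
    left, right = points["LEFT_EYE_PUPIL"], points["RIGHT_EYE_PUPIL"]
    return math.hypot(right["x"] - left["x"], right["y"] - left["y"])


distance = pupil_distance(sample_face)
print(f"{distance:.1f} px, large enough: {distance >= MIN_PUPIL_DISTANCE_PX}")
```

A check like this could be used to flag faces that are too small in the frame before relying on their attributes.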

Landmark detection

Landmark detection in Google Cloud Vision detects popular natural and man-made structures within an image. The API can detect them in a local image file, as long as the user sends the contents of that file as a base64-encoded string in the body of the request.

It can also detect landmarks in remote files, that is, directly from an image file stored on a platform such as Google Cloud Storage or hosted on the web, without having to send the file's contents in the body of the request.
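The difference between the two ways of supplying an image can be sketched like this. The helpers `image_field` and `landmark_request` are hypothetical names of our own, the byte string stands in for a real file's contents, and the `gs://` URI is a placeholder.

```python
import base64


def image_field(source):
    """Return the "image" part of a request (hypothetical helper).

    bytes are treated as the contents of a local file and base64-encoded;
    a str is treated as a Cloud Storage or web URI and passed by reference.
    """
    if isinstance(source, bytes):
        return {"content": base64.b64encode(source).decode("ascii")}
    return {"source": {"imageUri": source}}


def landmark_request(source):
    """Wrap an image in a request body that asks for landmark detection."""
    return {
        "requests": [
            {
                "image": image_field(source),
                "features": [{"type": "LANDMARK_DETECTION"}],
            }
        ]
    }


# Stand-in bytes; in practice these would come from open(path, "rb").read().
local = landmark_request(b"\xff\xd8\xff\xe0 fake image bytes")
remote = landmark_request("gs://example-bucket/landmark.jpg")  # placeholder URI

print(local["requests"][0]["image"])
print(remote["requests"][0]["image"])
```

Only the local variant carries the image data itself; the remote variant stays small no matter how large the file is.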

For each landmark it identifies, the Google Cloud Vision API returns the landmark's latitude and longitude.
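Reading those coordinates out of a response might look like the sketch below. The sample response is hand-written with illustrative values and only mimics the `landmarkAnnotations` structure; `landmark_coordinates` is a name we chose.

```python
# Hand-written sample; the name, score and coordinates are illustrative only.
sample_response = {
    "responses": [
        {
            "landmarkAnnotations": [
                {
                    "description": "Eiffel Tower",
                    "score": 0.97,
                    "locations": [
                        {"latLng": {"latitude": 48.858, "longitude": 2.294}}
                    ],
                }
            ]
        }
    ]
}


def landmark_coordinates(response):
    """Yield (name, latitude, longitude) for every detected landmark location."""
    for result in response.get("responses", []):
        for landmark in result.get("landmarkAnnotations", []):
            for location in landmark.get("locations", []):
                lat_lng = location["latLng"]
                yield landmark["description"], lat_lng["latitude"], lat_lng["longitude"]


for name, lat, lng in landmark_coordinates(sample_response):
    print(f"{name}: {lat}, {lng}")
```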

Label detection

Google Cloud Vision offers label detection, which identifies and extracts information about the entities in an image across a broad group of categories. Labels can identify general objects, locations, activities, animals, products, and more.

Users can also create custom labels through Cloud AutoML Vision, where you can train a custom machine learning model to classify images.

Note that these labels are returned only in English, in a JSON structure, along with a confidence score for each one.
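As an illustration of that JSON structure, the sketch below parses a hand-written response and turns each score into a percentage. The labels and scores are invented, and the function name and the 0.5 cut-off are arbitrary choices of ours.

```python
import json

# Hand-written sample that mimics the "labelAnnotations" array of a response.
sample_json = """
{"responses": [{"labelAnnotations": [
  {"description": "Dog", "score": 0.96},
  {"description": "Grass", "score": 0.81},
  {"description": "Toy", "score": 0.42}
]}]}
"""


def labels_above(response, min_score=0.5):
    """Return (label, percentage) pairs whose score reaches min_score."""
    annotations = response["responses"][0].get("labelAnnotations", [])
    return [
        (label["description"], round(label["score"] * 100))
        for label in annotations
        if label["score"] >= min_score
    ]


labels = labels_above(json.loads(sample_json))
for description, percent in labels:
    print(f"{description}: {percent}%")
```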

Text detection

This Google Cloud Vision feature detects and extracts text from images using optical character recognition (OCR), which supports several languages. Combined with certain semantic rules, its algorithms can be used for tasks such as license plate recognition. The API returns a text string with its coordinates, as well as the individual words and their bounding boxes.
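A minimal sketch of reading such a result, assuming the usual layout of the `textAnnotations` array, in which the first entry holds the whole detected text and the remaining entries hold individual words. The sample data is invented and `split_text_annotations` is our own name.

```python
def box(x0, y0, x1, y1):
    """Build a rectangular boundingPoly for the sample data below."""
    return {"vertices": [
        {"x": x0, "y": y0}, {"x": x1, "y": y0},
        {"x": x1, "y": y1}, {"x": x0, "y": y1},
    ]}


# Hand-written sample with invented text and coordinates.
sample_response = {
    "responses": [
        {
            "textAnnotations": [
                {"locale": "en", "description": "ABC 1234", "boundingPoly": box(10, 20, 210, 60)},
                {"description": "ABC", "boundingPoly": box(10, 20, 90, 60)},
                {"description": "1234", "boundingPoly": box(110, 20, 210, 60)},
            ]
        }
    ]
}


def split_text_annotations(response):
    """Return (full_text, [(word, [(x, y), ...]), ...]) from a response."""
    annotations = response["responses"][0].get("textAnnotations", [])
    if not annotations:
        return "", []
    full_text = annotations[0]["description"]
    words = [
        (
            entry["description"],
            # .get guards against a coordinate missing from the JSON.
            [(v.get("x", 0), v.get("y", 0)) for v in entry["boundingPoly"]["vertices"]],
        )
        for entry in annotations[1:]
    ]
    return full_text, words


full_text, words = split_text_annotations(sample_response)
print(full_text)
for word, vertices in words:
    print(word, vertices)
```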

A variation of text detection is document text detection, which also extracts text from the image but returns a response optimized for dense text and documents. The JSON response includes page, block, paragraph, word, and break information.
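The nested structure of a document text result can be walked as in the sketch below. It assumes the `fullTextAnnotation` layout of pages containing blocks, paragraphs, words, and symbols; the sample is hand-written and heavily trimmed, and `paragraphs_as_text` is a hypothetical helper.

```python
# Hand-written, trimmed sample shaped like a "fullTextAnnotation" object.
sample_full_text = {
    "text": "Hi all",
    "pages": [
        {
            "blocks": [
                {
                    "paragraphs": [
                        {
                            "words": [
                                {"symbols": [{"text": "H"}, {"text": "i"}]},
                                {"symbols": [{"text": "a"}, {"text": "l"}, {"text": "l"}]},
                            ]
                        }
                    ]
                }
            ]
        }
    ],
}


def paragraphs_as_text(full_text_annotation):
    """Rebuild one string per paragraph by joining symbols into words."""
    paragraphs = []
    for page in full_text_annotation.get("pages", []):
        for block in page.get("blocks", []):
            for paragraph in block.get("paragraphs", []):
                words = [
                    "".join(symbol["text"] for symbol in word.get("symbols", []))
                    for word in paragraph.get("words", [])
                ]
                # Words are joined with a single space for simplicity; a real
                # response carries break information that this sketch ignores.
                paragraphs.append(" ".join(words))
    return paragraphs


print(paragraphs_as_text(sample_full_text))
```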

Logo detection

Logo detection in Google Cloud Vision detects popular logos in an image, whether in a local file or a remote image. The user can run detection on an image that is already specified or specify their own image through the customization option.

What is the next step?

Now that you know what the Google Cloud Vision service is and what its most important features are, keep learning with our DevOps & Cloud Computing Full Stack Bootcamp, where in less than 6 months you will learn what you need to become an expert in the IT sector. Sign up now!
