According to the official documentation, the API is defined as follows:
Text recognition is the process of detecting text in images and video streams and recognizing the text contained therein. Once detected, the recognizer determines the actual text in each block and segments it into lines and words. The Text API detects text in Latin-based languages (French, German, English, etc.), in real time, on device.
Text Structure
- a Block is a contiguous set of text lines, such as a paragraph or column,
- a Line is a contiguous set of words on the same vertical axis,
- a Word is a contiguous set of alphanumeric characters on the same vertical axis.
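These three levels map onto the API's classes (TextBlock, Line, and Element), and each level exposes the next through getComponents(). As a minimal sketch of walking this hierarchy, assuming you already have a detected TextBlock (we'll obtain them later in this article; the helper name logTextStructure is my own):

import android.util.Log;

import com.google.android.gms.vision.text.Text;
import com.google.android.gms.vision.text.TextBlock;

// Walk the Block -> Line -> Word hierarchy of one detected TextBlock.
private void logTextStructure(TextBlock block) {
    Log.d("TextStructure", "Block: " + block.getValue());
    for (Text line : block.getComponents()) {        // a block's components are Lines
        Log.d("TextStructure", "  Line: " + line.getValue());
        for (Text word : line.getComponents()) {     // a line's components are Elements (words)
            Log.d("TextStructure", "    Word: " + word.getValue());
        }
    }
}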
Project configuration
Add the play-services-vision library to the dependencies block of your app module's build.gradle:
compile 'com.google.android.gms:play-services-vision:9.8.0'
Because you will be using the device's camera to capture text, add the CAMERA permission to your AndroidManifest.xml:
<uses-permission android:name="android.permission.CAMERA"/>
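Note that on Android 6.0 (API level 23) and above, CAMERA is a dangerous permission, so the manifest entry alone is not enough: you must also request it at runtime before starting the camera. A minimal sketch using the v4 support library (the request code constant is my own choice):

import android.Manifest;
import android.content.pm.PackageManager;
import android.support.v4.app.ActivityCompat;
import android.support.v4.content.ContextCompat;

// Hypothetical request code; any app-unique int works.
private static final int REQUEST_CAMERA_PERMISSION = 100;

// Returns true if the CAMERA permission is already granted,
// otherwise asks the user and returns false for now.
private boolean checkCameraPermission() {
    if (ContextCompat.checkSelfPermission(this, Manifest.permission.CAMERA)
            != PackageManager.PERMISSION_GRANTED) {
        // The user's answer arrives in onRequestPermissionsResult().
        ActivityCompat.requestPermissions(this,
                new String[]{Manifest.permission.CAMERA}, REQUEST_CAMERA_PERMISSION);
        return false;
    }
    return true;
}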
Defining the activity layout
The layout contains a SurfaceView to display the preview frames captured by the camera. I also add a TextView to display the contents of the recognized text:
activity_main.xml
<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:padding="16dp">

    <SurfaceView
        android:id="@+id/surface_view"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:layout_alignParentLeft="true"
        android:layout_centerVertical="true" />

    <TextView
        android:id="@+id/text_value"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="No text"
        android:layout_alignParentBottom="true"
        android:textColor="@android:color/white"
        android:textSize="20sp" />

</RelativeLayout>
Capturing text with the camera device
In MainActivity, declare fields for the two views and the CameraSource, then bind the views in onCreate():
private SurfaceView cameraView;
private TextView textBlockContent;
private CameraSource cameraSource;

@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_main);

    cameraView = (SurfaceView) findViewById(R.id.surface_view);
    textBlockContent = (TextView) findViewById(R.id.text_value);
}
Now, we're going to create a TextRecognizer object. This detector processes images and determines what text appears within them. Once it's initialized, a TextRecognizer can be used to detect text in all kinds of images:
TextRecognizer textRecognizer = new TextRecognizer.Builder(getApplicationContext()).build();
Just like that, the TextRecognizer is built. However, it might not work yet: if the device does not have enough storage, or Google Play services can't download the OCR dependencies, the TextRecognizer object may not be operational. Before we start using it to recognize text, we should check that it's ready. We'll add this check right after initializing the TextRecognizer:

if (!textRecognizer.isOperational()) {
    Log.w("MainActivity", "Detector dependencies are not yet available.");
}
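Once the recognizer is operational, it is not limited to camera streams: you can run it on any single image by wrapping a Bitmap in a Frame and calling detect() directly. A minimal sketch (loadBitmap() is a hypothetical helper standing in for however you obtain the image):

import android.graphics.Bitmap;
import android.util.SparseArray;
import com.google.android.gms.vision.Frame;
import com.google.android.gms.vision.text.TextBlock;

// Recognize text in a single still image instead of a camera stream.
Bitmap bitmap = loadBitmap();  // hypothetical helper; any Bitmap source works
Frame frame = new Frame.Builder().setBitmap(bitmap).build();
SparseArray<TextBlock> blocks = textRecognizer.detect(frame);
for (int i = 0; i < blocks.size(); i++) {
    Log.d("MainActivity", "Detected: " + blocks.valueAt(i).getValue());
}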
To fetch a stream of images from the device's camera and display them in the SurfaceView, create a new instance of the CameraSource class using CameraSource.Builder. Because the CameraSource needs a TextRecognizer, we initialize it with the instance we've just built above:
cameraSource = new CameraSource.Builder(getApplicationContext(), textRecognizer)
        .setFacing(CameraSource.CAMERA_FACING_BACK)
        .setRequestedPreviewSize(1280, 1024)
        .setRequestedFps(2.0f)
        .setAutoFocusEnabled(true)
        .build();
The low requested frame rate (2 fps) keeps CPU usage down, since running OCR on every preview frame is expensive.
Next, add a callback to the SurfaceHolder of the SurfaceView so that you know when you can start drawing the preview frames. The callback should implement the SurfaceHolder.Callback interface. Inside the surfaceCreated() method, call the start() method of the CameraSource to start drawing the preview frames; in surfaceDestroyed(), call stop() to close the camera and stop sending frames to the underlying frame detector:
cameraView.getHolder().addCallback(new SurfaceHolder.Callback() {
    @Override
    public void surfaceCreated(SurfaceHolder holder) {
        try {
            // Requires the CAMERA permission to be granted (see above).
            //noinspection MissingPermission
            cameraSource.start(cameraView.getHolder());
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }

    @Override
    public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
    }

    @Override
    public void surfaceDestroyed(SurfaceHolder holder) {
        cameraSource.stop();
    }
});
Most importantly, we need to tell the TextRecognizer what to do when it detects a text block. Create an instance of a class that implements the Detector.Processor interface and pass it to the setProcessor() method of the TextRecognizer. Inside the receiveDetections() method, which we must override, we display the detected SparseArray of TextBlock objects in the TextView like this:
textRecognizer.setProcessor(new Detector.Processor<TextBlock>() {
    @Override
    public void release() {
    }

    @Override
    public void receiveDetections(Detector.Detections<TextBlock> detections) {
        Log.d("Main", "receiveDetections");
        final SparseArray<TextBlock> items = detections.getDetectedItems();
        if (items.size() != 0) {
            textBlockContent.post(new Runnable() {
                @Override
                public void run() {
                    StringBuilder value = new StringBuilder();
                    for (int i = 0; i < items.size(); ++i) {
                        TextBlock item = items.valueAt(i);
                        value.append(item.getValue());
                        value.append("\n");
                    }
                    // Update the TextView with the recognized text blocks.
                    textBlockContent.setText(value.toString());
                }
            });
        }
    }
});
Finally, override the onDestroy() method of the Activity to stop the camera and release the resources of the camera and the underlying detector:
@Override
protected void onDestroy() {
    super.onDestroy();
    cameraSource.release();
}
Run the activity and scan a block of text on paper; you should see a result like this:
Conclusions
With a SurfaceView to preview camera frames, a CameraSource to feed them to the detector, and a TextRecognizer with a Detector.Processor to handle the results, the Mobile Vision Text API gives you real-time, on-device OCR for Latin-based languages in just a few dozen lines of code.
Read more:
- Face detection with Mobile Vision API
- Barcode/QR code reading with Mobile Vision API