Capture2Text - Alternative (Capture text from Screen Directly) in Ubuntu Mate

Question

I found a few similar questions on this site, but I could not complete the process.

I followed the answers to "How can instantaneously extract text from a screen area using OCR tools?" and "How can I use OCR on a partial screen capture to get text?"

First, I installed the dependencies:

sudo apt-get install tesseract-ocr
sudo apt-get install imagemagick
sudo apt-get install scrot
sudo apt-get install xsel

Then I put the following script in /home/blueray/Documents/Translate/screen_ts.sh

#!/bin/bash 
# Dependencies: tesseract-ocr imagemagick scrot xsel

SCR_IMG=`mktemp`
trap "rm $SCR_IMG*" EXIT

scrot -s $SCR_IMG.png -q 100    
# increase image quality with option -q from default 75 to 100

mogrify -modulate 100,0 -resize 400% $SCR_IMG.png 
#should increase detection rate

tesseract $SCR_IMG.png $SCR_IMG &> /dev/null
cat $SCR_IMG.txt | xsel -bi

exit

Please note that I removed

select tesseract_lang in eng rus equ ;do break;done
# Quick language menu, add more if you need other languages.

I removed it in the hope that tesseract will then only consider English. Please let me know if this is not the case.

Now when I run

bash /home/blueray/Documents/Translate/screen_ts.sh

It works as I wanted.

On Windows, with Capture2Text, I used Win+Q to capture part of the screen as text. So I checked "How do I set a custom keyboard shortcut to control volume?"

I went to Menu, searched for Keyboard Shortcuts, and opened it.

Then I clicked Add
Name: Capture2Text
Command: bash /home/blueray/Documents/Translate/screen_ts.sh
Clicked Apply
Clicked on Shortcut on the right.
Pressed Win+Q

Now when I press Win+Q, nothing happens. What am I doing wrong?

anthony · Accepted Answer · 2021-07-12T00:54:26.703

You don't need "scrot". ImageMagick (which provides "mogrify") can do the screen capture itself. You also don't need to save an intermediate image, as "tesseract" can accept an image on standard input.

As such the above simplifies to...

convert x: -modulate 100,0 -resize 400% -set density 300 png:- |
  tesseract stdin stdout | xsel -bi

However, I also added the following to my version of the script, to pop up the text on screen so you can check it.

xsel -po | xless - &
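
As a rough sketch, the convert | tesseract | xsel pipeline above could be wrapped in a small script that first checks that the tools it needs are on the PATH, since a script started from a hotkey has no terminal to show an error in. The function names missing_deps and capture_ocr_main are made up for this illustration; they are not part of any of the tools.

```bash
#!/usr/bin/env bash
# Sketch only: missing_deps and capture_ocr_main are hypothetical names.

# Print the name of every command given as an argument that is not on PATH.
missing_deps() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Run the capture -> OCR -> clipboard pipeline from the answer,
# but only if all three tools are available.
capture_ocr_main() {
    local missing
    missing=$(missing_deps convert tesseract xsel)
    if [ -n "$missing" ]; then
        printf 'missing commands:\n%s\n' "$missing" >&2
        return 1
    fi
    convert x: -modulate 100,0 -resize 400% -set density 300 png:- |
        tesseract stdin stdout | xsel -bi
}
```

Calling capture_ocr_main as the last line of the script would then run the capture, and a non-zero exit status tells you a tool is missing.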

Of course tesseract could use some improvements for some fonts! For example 'f's in some fonts have a small hook that makes tesseract think they are 'P's! Arrghhhh...

EDIT: Full script I use is located at...

https://antofthy.gitlab.io/software/#capture_ocr

I link this to a 'hotkey' (Meta-Print) using my window manager (openbox), so I can use it at any time.

If you can't use a hotkey and need to uncover the part of the screen containing the text first, you can always launch it with a delay...

sleep 5; capture_ocr
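
If you want the delay as an optional argument rather than typing sleep each time, something along these lines might do. It is only a sketch: delay_seconds is a hypothetical helper name, and capture_ocr is assumed to be the script from the link above, installed somewhere on your PATH.

```bash
#!/usr/bin/env bash
# Sketch only: delay_seconds is a hypothetical helper name.

# Print the number of seconds to wait: the first argument, or 0 if none given.
# Fails (and prints nothing on stdout) if the argument is not a whole number.
delay_seconds() {
    local d=${1:-0}
    case $d in
        *[!0-9]*)
            echo "delay must be a whole number of seconds" >&2
            return 1
            ;;
    esac
    printf '%s\n' "$d"
}

# Intended use at the end of a wrapper script (left commented out here):
#   d=$(delay_seconds "$1") && sleep "$d" && capture_ocr
```

With that in a wrapper, running it with the argument 5 would wait five seconds before the capture starts, and running it with no argument would start immediately.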

Enjoy

score 1 · Answer 2 · answered May 22 '20 at 13:49

I had to tweak @anthony's script so it would work on my box (Kubuntu 18.04):

Instead of the convert line, I used:

import -resize 300% +dither png:- |

Also, I removed the trailing minus sign (-) from the last line, so:

xsel -ob | $XPAGER
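
Since $XPAGER may not be set on every box, a fallback could look like the sketch below. pick_pager and show_clipboard are hypothetical names, and the fallback to xless is just taken from the accepted answer; it also assumes $XPAGER holds a single command name with no extra arguments.

```bash
#!/usr/bin/env bash
# Sketch only: pick_pager and show_clipboard are hypothetical names.

# Print the pager to use: $XPAGER if it is set and non-empty, otherwise xless.
pick_pager() {
    printf '%s\n' "${XPAGER:-xless}"
}

# Show the clipboard contents in that pager, in the background.
# Assumes the pager reads standard input when given no file argument.
show_clipboard() {
    xsel -ob | "$(pick_pager)" &
}
```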

Working great.

Ahmad Ismail · Answer 3 · 2025-05-16T07:01:50.837

0

I use the following script:

#!/usr/bin/env bash
maim --hidecursor --format=png --select | tesseract - - -l eng --dpi 300 | xclip -selection clipboard
# First - → Input from stdin (instead of a file)
# Second - → Output to stdout (instead of a file)
notify-send "OCR" "Text copied to clipboard!"

Please bind it to a keyboard shortcut.
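
One possible variation, as a sketch only: the script above sends the notification even when the selection was cancelled or nothing was recognized. The version below only copies and reports success when the OCR output contains something other than whitespace. has_text and ocr_and_notify are hypothetical names, and the "No text recognized." message is my own wording.

```bash
#!/usr/bin/env bash
# Sketch only: has_text and ocr_and_notify are hypothetical names.

# Succeed only if the first argument contains a non-whitespace character.
has_text() {
    case $1 in
        *[![:space:]]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Capture a region, OCR it, and copy the result only if text was found.
ocr_and_notify() {
    local text
    text=$(maim --hidecursor --format=png --select |
        tesseract - - -l eng --dpi 300)
    if has_text "$text"; then
        printf '%s' "$text" | xclip -selection clipboard
        notify-send "OCR" "Text copied to clipboard!"
    else
        notify-send "OCR" "No text recognized."
    fi
}
```

Calling ocr_and_notify as the last line of the script would run it from the keyboard shortcut.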

edited May 16 '25 at 07:01

answered May 20 '18 at 04:27


score 0 · Answer 4 · answered May 16 '25 at 08:36

Here's my own version of the above script. It improves on several points:

uses a command line parameter for the 3-letter language code (default eng)
negates the image if it is white text on black, to improve tesseract's readability
removes the alpha channel, as some desktops use transparency and it messes with the OCR
uses a whitelist of characters to avoid weird characters in the OCR output

#!/bin/bash
# Dependencies: tesseract-ocr imagemagick maim xsel

# Quick language menu, add more if you need other languages.
# List possible ones with: tesseract --list-langs
#select tesseract_lang in eng fra ita; do break; done

# Use command line parameter. Default to English.
tesseract_lang=${1:-eng}

SCR_IMG=$(mktemp)
trap 'rm -f "$SCR_IMG"*' EXIT

# For X11 (-q increases image quality from default 75 to 100)
#scrot -s "$SCR_IMG.png" -q 100
# For Wayland (grim takes the region from slurp; -o would select an output name, not a file)
#grim -g "$(slurp)" "$SCR_IMG.png"
# For X11 without artifacts
#scrot -a $(slop -f '%x,%y,%w,%h') -s "$SCR_IMG.png" -q 100
maim -s "$SCR_IMG.png"

# Get the color of the 1st pixel: if black, negate the image below to increase detection rate
Pix=$(convert -type bilevel "$SCR_IMG.png" -format "%[pixel:u.p{0,0}]\n" info:)

# Also remove possible transparency
if [ "$Pix" = "gray(0)" ]; then
    mogrify -alpha off -negate -modulate 100,0 -resize 400% "$SCR_IMG.png"
else
    mogrify -alpha off         -modulate 100,0 -resize 400% "$SCR_IMG.png"
fi

# Use a whitelist of chars to improve detection
# You might want to make it language specific
tesseract -l "$tesseract_lang" -c 'tessedit_char_whitelist= !"&'"'"'(),-.0123456789:;?#@ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz' "$SCR_IMG.png" "$SCR_IMG" &> /dev/null
xsel -bi < "$SCR_IMG.txt"   # Copy to clipboard
cat "$SCR_IMG.txt"
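
Because the language now comes from the command line, a typo such as "en" instead of "eng" would only show up as an empty clipboard, since tesseract's errors go to /dev/null. A possible guard, sketched under the assumption that tesseract --list-langs prints one language code per line after a header line (older versions may print the list on stderr, hence the 2>&1 in the usage comment). lang_in_list is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch only: lang_in_list is a hypothetical helper name.

# Succeed if the language code in $1 appears as a complete line on stdin.
# -F: fixed string, -x: match the whole line, -q: no output.
lang_in_list() {
    grep -Fxq -- "$1"
}

# Intended use near the top of the script (left commented out here):
#   if ! tesseract --list-langs 2>&1 | lang_in_list "$tesseract_lang"; then
#       echo "tesseract language '$tesseract_lang' is not installed" >&2
#       exit 1
#   fi
```

Matching the whole line means "en" does not accidentally match "eng", and the header line is never mistaken for a language.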
