Home / Resources / GUI script and image recognition with sikuliX

Introduction

Python is a high-level, general-purpose programming language. Its design philosophy emphasizes code readability with the use of significant indentation. It is often described as a "batteries included" language due to its comprehensive standard library: Python's large standard library provides tools suited to many tasks and is commonly cited as one of its greatest strengths.

SikuliX automates anything you see on the screen of your desktop computer. It uses image recognition powered by OpenCV to identify GUI components. This is handy in cases when there is no easy access to a GUI's internals or the source code of the application or web page you want to act on. SikuliX supports Python as a scripting language.

Prerequisites

Before start, make sure to have Java or openJDK installed. Download the sikuliX API Java executable and install sikulix4python. Start sikuliX API with enable of a py4j server on port 25333. Example:

java -jar "sikulixapi-*.jar" -p

We are going to use image recognition, start with some screenshots of the button we want to click and of the pop-ups we may wait to appear. In this example, we print the content of a webpage from firefox web browser. As a consequence, our screenshots include:

In the end, GUI script requires a GUI to be visible. So, the last step is to run firefox and make sure that its window is visible.

Example with Python 3

The first step is to connect to sikuliX API and find the path to all of our screenshots.

# IMPORTANT! sikuliX API must be running py4j server on port 25333
# we must run: java -jar "sikulixapi-*.jar" -p
import sikulix4python

# where the script is:
theScript  = sikulix4python.os.path.realpath(__file__)
theDirname = sikulix4python.os.path.dirname(theScript)

# location of the images for pattern recognition:
img_file    = sikulix4python.os.path.join(theDirname, "file.png")
img_print   = sikulix4python.os.path.join(theDirname, "print.png")
img_save    = sikulix4python.os.path.join(theDirname, "save.png")
img_name    = sikulix4python.os.path.join(theDirname, "name.png")
img_save_as = sikulix4python.os.path.join(theDirname, "save_as.png")
img_replace = sikulix4python.os.path.join(theDirname, "replace.png")

GUI script is sensible to the speed of our OS and display. We can create a function to wait for some image to be visible.

def sikuli_waitExists(
  theScreen,    #!< sikulix4python screen instance
  theImagePath  #!< path to the image we wait to be visible
):
  while ( True ):
    theRegion = theScreen.exists(theImagePath)
    if ( theRegion is not None ):
      break
    else:
      print("wait for " + theImagePath)
  return theRegion

We are about to print the content of a webpage into a pdf file. The parent directory must exist.

theDestFile = "/tmp/test.pdf"
theDirname = sikulix4python.os.path.dirname(theDestFile)
sikulix4python.os.makedirs(theDirname, exist_ok=True)

Then, we can conduct all the GUI script. A condition might be added in case the destination file already exists and the OS asks to replace it.

# click File -> Print:
theRegion = sikuli_waitExists(sikulix4python.SCREEN.instance, img_file)
theRegion.click()
#
theRegion = sikuli_waitExists(sikulix4python.SCREEN.instance, img_print)
theRegion.click()

# click save button:
theRegion = sikuli_waitExists(sikulix4python.SCREEN.instance, img_save)
theRegion.click()

# wait for the "Name" pop-up to appear:
theRegion = sikuli_waitExists(sikulix4python.SCREEN.instance, img_name)
# add pixels to X of the region and click:
theRegion.setX(theRegion.getX() + 300)
theRegion.click()

# save as:
theRegion.type(sikulix4python.SXPKG.script.Key.BACKSPACE * 30)
theRegion.paste(theDestFile)
theRegion.type(sikulix4python.SXPKG.script.Key.ENTER)
#
theRegion = sikuli_waitExists(sikulix4python.SCREEN.instance, img_save_as)
theRegion.click()
#
theRegion = sikulix4python.SCREEN.instance.exists(img_replace)
if ( theRegion is not None ):
  theRegion.click()

We can make sure that the OS has finished loading when all images have disappeared from screen.

sikulix4python.SCREEN.instance.waitVanish(img_replace)
sikulix4python.SCREEN.instance.waitVanish(img_save_as)
sikulix4python.SCREEN.instance.waitVanish(img_save)

print("OK!")

Associated resources

Here is an archive which contains the discussed example. Screenshots have been taken from firefox and GNOME widgets in french.



no cookie, no javascript, no external resource, KISS!