GUI Automation

Controlling the Mouse from Python

Controlling the mouse and keyboard is called GUI automation. The PyAutoGUI third-party module has many functions to control the mouse and keyboard.

Installation: pip install pyautogui Documentation: https://pyautogui.readthedocs.io/en/latest/

In programming, Y-coordinates increase going down. This is the opposite of Y-coordinates in mathematics.

A Word of Caution & Fail-Safes

If your program gets out of control, quickly move the mouse cursor to the top-left

There are some programs that prohibit third party software from sending clicks to them. For example, many anti-virus programs will prevent this (otherwise the virus could click on the AV program to disable it). It sounds like this is what's happening. You can try running the Python script as an Administrator, but that might not work either. If not, then I don't think there's anything you can do about this.

To use PyAutoGUI

In [2]:
import pyautogui

pyautogui.size() returns the screen resolution. You can use the output from the function for multiple assignment

In [3]:
pyautogui.size()
Out[3]:
(1920, 1080)
In [4]:
width, height = pyautogui.size()
width
Out[4]:
1920
In [5]:
height
Out[5]:
1080

pyautogui.position() returns the mouse position, both as tuple of two ints. The tuple of (0, 0) refers to top-left corner of the screen

In [6]:
pyautogui.position()
Out[6]:
(723, 635)

pyautogui.moveTo() moves the mouse to an x,y coordinate. The mouse move is instant unless you pass an int for the duration keyword argument (in seconds)

In [7]:
pyautogui.moveTo(10, 10)
In [8]:
pyautogui.moveTo(10, 100, duration = 1.5)

pyautogui.moveRel() moves the mouse relative to its current position

In [9]:
pyautogui.moveRel(200, 0)
In [10]:
pyautogui.moveRel(200, 0, duration = 2)
In [11]:
pyautogui.moveRel(0, -100)

pyautogui's click(), doubleClick(), rightClick(), middleClick() click the mouse buttons

In [12]:
pyautogui.doubleClick()
In [13]:
pyautogui.position()
Out[13]:
(628, 878)
In [14]:
pyautogui.click(339, 38)
In [15]:
pyautogui.rightClick(339, 38)

dragTo() and dragRel() will move the mouse while holding down a mouse button

Scripts: Auto Draw Spiral in Windows Paint

file: draw.py

In [ ]:
# import pyautogui

# pyautogui.click() # click to put drawing program in focus
# distance = 200
# while distance > 0:
#   print(distance, 0)
#   pyautogui.dragRel(distance, 0, duration=0.1) # move right
#   distance = distance - 25
#   print(0, distance)
#   pyautogui.dragRel(0, distance, duration=0.1) # move down
#   print(-distance, 0)
#   pyautogui.dragRel(-distance, 0, duration=0.1) # move left
#   distance = distance - 25
#   print(0, -distance)
#   pyautogui.dragRel(0, -distance, duration=0.1)

Trick: Display Mouse Position Quickly

Find screen coordinates quickly. Run the following commands in terminal i.e. command prompt, NOT in IDLE!

In [16]:
# python
# import pyautogui
# pyautogui.displayMousePosition()

Controlling the Keyboard from Python

  • PyAutoGUI's virtual key presses will go to the window that current has focus. You can bring the target window to focus with pyautogui.click() first and then execute next command.

typewrite() can be passed a string of characters to type. It also has a interval keyword argument

In [17]:
# pyautogui.click(100, 100); pyautogui.typewrite('Hello world')
In [18]:
# pyautogui.click(100, 100); pyautogui.typewrite('Hello world', interval=0.2)

Passing a list of strings to typewrite() lets you use hard-to-type keyboard keys, like 'left', 'shift' or 'f1'

In [19]:
# pyautogui.click(100, 100); pyautogui.typewrite(['a', 'b' 'left', 'X', 'Y'], interval=1)

These keyboard key strings are in the pyautogui.KEYBOARD_KEYS list

In [22]:
# pyautogui.KEYBOARD_KEYS

pyautogui.press() will press a single key

In [23]:
pyautogui.press('f1')

This will open the Python Docs

pyautogui.hotkey() can be used for keyboard shortcuts, like Ctrl+O

In [24]:
pyautogui.hotkey('ctrl', 'o')

Screenshots and Image Recognition

PyAutoGUI is installed with the pillow imaging module to use its same image data type. Pillow and working with images are beyond the scope of this course. You can learn about it from chapter 17 of the Automate the Boring Stuff with Python book

Additional intallation (on Linux): sudo apt-get install scrot

A screenshot is an image of the screen's content. The pyautogui.screenshot() will return an Image object of the screen, or you can pass it a filename to save it in a file.

In [26]:
# pyautogui.screenshot()
In [27]:
# pyautogui.screenshot('Z:\\IT\\Python\\Snippets\\screenshot_example.png')

locateOnScreen() is passed a sample image file, and returns the coordinates of where it is on screen: x-coordinate, y-coordinate, width, height

In [28]:
# pyautogui.locateOnScreen('Z:\\IT\\Python\\Snippets\\calc7key.png')

locateCenterOnScreen() will return an (x, y) typle of where the image is on the screen

In [29]:
# pyautogui.locateCenterOnScreen('Z:\\IT\\Python\\Snippets\\calc7key.png')

# pyautogui.moveTo(907, 316, duration=1)

Combining the keyboard/mouse/screenshot functions let's you make awesome stuff!!. Example: Bot to Play SushiGoRound