Instrumentation of the Firefox web browser with Marionette

Introduction

Python is a high-level, general-purpose programming language. Its design philosophy emphasizes code readability with the use of significant indentation. It is often described as a "batteries included" language due to its comprehensive standard library: Python's large standard library provides tools suited to many tasks and is commonly cited as one of its greatest strengths.

Marionette is an automation driver for Mozilla's Gecko engine. It can remotely control either the UI or the internal JavaScript of a Gecko platform, such as Firefox. It can control both the chrome (i.e. menus and functions) and the content (the web page loaded inside the browsing context). In addition to performing actions on the browser, Marionette can also read the properties and attributes of the DOM.

Prerequisites

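Before starting, make sure that the Firefox web browser and the Marionette driver for Mozilla's Gecko engine (the marionette_driver Python package used in the example) are installed. Start Firefox with the -marionette argument.

By default, Firefox then listens for Marionette connections on port 2828.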
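
Before connecting, it can help to check that something is actually listening on that port. The following is a minimal sketch using only the standard library; the helper name is_port_open and the one-second timeout are assumptions made for illustration and are not part of Marionette.

```python
import socket


def is_port_open(host="localhost", port=2828, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection resolves the host and applies the timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # connection refused, timed out, or host unreachable
        return False
```

Calling is_port_open() with no arguments probes localhost on port 2828. A True result only says that the port accepts connections, not that the listener is Marionette.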

Example with Python 3

The first step is to connect to Firefox using the Marionette driver. Then we can start a session, navigate to a specific URL, and read DOM and CSS properties.

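from marionette_driver import By
from marionette_driver.marionette import Marionette

# connects to the Marionette server of the running Firefox instance:
theClient = Marionette('localhost', port=2828)
theClient.start_session()

# loads the page and looks up the logo element by its CSS class:
theClient.navigate("https://en.wikipedia.org/wiki/Main_Page")
theLogo = theClient.find_element(By.CLASS_NAME, "mw-wiki-logo")

# the CSS value looks like url("..."); keeps only the part between the quotes:
theUrl = theLogo.value_of_css_property("background-image")
theUrl = theUrl.split("\"")[1]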
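
Splitting on the double quote assumes that the browser always returns the value in the form url("..."). A slightly more tolerant variant is sketched below; the helper name extract_css_url is hypothetical, and the pattern only covers simple url(...) values with optional single or double quotes.

```python
import re

# Matches url(...) with optional single or double quotes around the address.
_CSS_URL = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""")


def extract_css_url(value):
    """Return the address inside a CSS url(...) value, or None if absent."""
    match = _CSS_URL.search(value)
    return match.group(2) if match else None
```

With this helper, a value such as none (an element without a background image) yields None instead of raising an IndexError.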

It is also possible to read the preferences of Firefox and change their values.

# saves the current preference values to restore them later:
save_pref_DownloadDir    = theClient.get_pref("browser.download.dir")
save_pref_FolderList     = theClient.get_pref("browser.download.folderList")
save_pref_UseDownloadDir = theClient.get_pref("browser.download.useDownloadDir")
save_pref_neverAskToSave = theClient.get_pref("browser.helperApps.neverAsk.saveToDisk")

# sets the download destination and downloads without confirmation
# (depending on the MIME type):
theClient.set_pref("browser.download.dir",                   "/tmp"     )
theClient.set_pref("browser.download.folderList",            2          )
theClient.set_pref("browser.download.useDownloadDir",        True       )
theClient.set_pref("browser.helperApps.neverAsk.saveToDisk", "image/png")
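
The save-then-set pattern above can be wrapped so that the previous values are put back automatically. This is a sketch under stated assumptions: temporary_prefs is a hypothetical helper, and it only relies on the client offering get_pref and set_pref as used in the example. How a preference that previously had no value is restored depends on the client.

```python
from contextlib import contextmanager


@contextmanager
def temporary_prefs(client, prefs):
    """Apply prefs (a dict) on client, then restore the previous values."""
    # remembers the current value of every preference about to be changed
    saved = {name: client.get_pref(name) for name in prefs}
    try:
        for name, value in prefs.items():
            client.set_pref(name, value)
        yield client
    finally:
        # runs even if the body raises, so the old values are written back
        for name, value in saved.items():
            client.set_pref(name, value)
```

It would be used as: with temporary_prefs(theClient, {"browser.download.dir": "/tmp"}): followed by the download code in the indented body.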

Another powerful feature is the ability to execute JavaScript within the current window.

# JavaScript to be executed in a web page to download an element. Just replace
# the markers THE_OUTPUT_NAME and THE_URL:
theJavascript = """
  // creates an invisible link element and adds it to the HTML body:
  let a   = document.createElement("a");
  a.style = "display: none";
  document.body.appendChild(a);

  // configures the link (URL and file name) and clicks it:
  a.href     = \"THE_URL\";
  a.download = \"THE_OUTPUT_NAME\";
  a.click();

  // removes the link from the HTML body:
  a.remove();
"""

# fills in the markers and executes the script:
theScript = theJavascript
theScript = theScript.replace("THE_OUTPUT_NAME", "a.png")
theScript = theScript.replace("THE_URL", theUrl)
theClient.execute_script(theScript)
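
Plain text replacement breaks if the URL or file name contains a double quote or a backslash. One possible alternative, sketched here with hypothetical names (DOWNLOAD_TEMPLATE, build_download_script), is to let json.dumps produce the JavaScript string literals, since a JSON string is also a valid JavaScript string.

```python
import json

# Same script as above, with placeholders instead of quoted markers.
DOWNLOAD_TEMPLATE = """
  let a = document.createElement("a");
  a.style = "display: none";
  document.body.appendChild(a);
  a.href     = {url};
  a.download = {name};
  a.click();
  a.remove();
"""


def build_download_script(url, output_name):
    """Return the download script with both values embedded as JS string literals."""
    # json.dumps adds the surrounding quotes and escapes quotes and backslashes
    return DOWNLOAD_TEMPLATE.format(url=json.dumps(url), name=json.dumps(output_name))
```

The result can be passed to theClient.execute_script in the same way as theScript.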

Associated resources

Here is an archive which contains the discussed example.


