A Case Study of Java's Dynamic Proxies (and other Reflection Bits): Selenium's PageFactory


I spend a lot of time studying the source code of libraries that I really enjoy using. Since my day job is developing automated web UI tests at Red Hat, Selenium is one of those libraries. At the time I started writing this, I was not very familiar with Java’s reflection capabilities, nor what the heck a “Dynamic Proxy” was. The Proxy pattern is particularly powerful, and java.lang.reflect’s dynamic proxies provide a fairly nice way to do it. Dissecting PageFactory shows how.

What is PageFactory?

Selenium WebDriver comes with a fancy class called PageFactory. It allows you to write your page objects like this:

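class LoginPage {
  private WebDriver driver; // Assumed to be set elsewhere, e.g. in a constructor
  private WebElement username;
  private WebElement password;
  @FindBy(id = "submitLogin")
  private WebElement submit;

  public HomePage login(String usernameToType, String passwordToType) {
    username.sendKeys(usernameToType);
    password.sendKeys(passwordToType);
    submit.click();
    return new HomePage(driver);
  }
}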

Those familiar with Selenium and the Page Object Pattern will notice right away our members are not By types, but WebElements. And by the looks of it, we can act on them straight away without having to findElement all over the place, despite appearing uninstantiated. And what’s that annotation? This is where the PageFactory automagic happens!

If you’d like to learn more about PageFactory and how to use it, check out the Selenium wiki. If you’re like me and have to know what crazy Java wizardry is behind those factory gates, here’s your golden ticket…

Charlie and the Page Factory

The magic starts when you initialize the page class via PageFactory’s static initElements method. It sprinkles reflection and dynamic proxy fairy dust on your WebElement fields so that rather than throwing NullPointerException, they do stuff. This post covers the details of that process, so that you too may wield such black magic!

Pass initElements either a WebDriver and a page object’s class:

// If you want to instance a new page from test code.
public void funWithPageFactory() {
  WebDriver driver = new FirefoxDriver();
  LoginPage loginPage = PageFactory.initElements(driver, LoginPage.class);
  // Do stuff
}

Or a WebDriver and the page instance itself.

public class LoginPage {
  // Elements go here...

  public LoginPage(WebDriver driver) {
    // Passing an already instantiated instance is just as cool
    PageFactory.initElements(driver, this);
  }

  // Methods to do stuff on elements go here...
}

Either way, you start with the most reduced set of data you need: a WebDriver and some class that has WebElement fields (classic constructor-style dependency injection). These elements have optional annotations that describe how to construct a By instance, and as you very well know, a driver uses those to find elements. So, really, those are the three essential ingredients: a WebDriver, WebElement fields, and By objects. We make the By objects from annotations, or we’ll assume them from the names of the WebElement fields.

So if you can’t use an element before finding it, when does driver.findElement(by) actually get called? The end result of a chain of Oompa Loompa shenanigans is that elements “find themselves” when they are called upon. Behind the scenes, findElement is not being called until you actually try to “do stuff” on that element. That’s the real drama of PageFactory and where the bulk of interesting work happens. Let’s take a look at that flow.

A Factory in Chicago that Makes Miniature Models… of Factories

The following steps are a little hard to follow (perhaps because of a little pattern overload… but I won’t argue with Google). Ultimately, we need to sheath those WebElement fields with Proxy instances that implement WebElement, and in between calling your desired method on the desired element, do that driver.findElement call we know needs to happen in order to get the element we want to work with. PageFactory wraps that need in ElementLocators. We’ll need a locator for each element, and so Selenium divvies out that duty to, you guessed it, ElementLocatorFactory.

1. Instance an ElementLocatorFactory by giving it a SearchContext

An ElementLocatorFactory can take a WebElement field from the page object, combo it with that SearchContext (in a typical case this is the WebDriver we passed initElements), and spit out ElementLocators. We reference the individual fields via the reflection API, which gives us actual Field objects that the locator factory can accept in order to create locators.

Here’s some code to illustrate that:

// ElementLocatorFactory instantiation and usage.
public void demonstrateLocatorFactory() {
    WebDriver driver = new FirefoxDriver();
    // Some page modeled with the PageFactory pattern.
    LoginPage loginPage = new LoginPage();

    // ElementLocatorFactory is an interface, and 
    // DefaultElementLocatorFactory is Selenium's stock implementation.
    ElementLocatorFactory locatorFactory = 
            new DefaultElementLocatorFactory(driver);

    // Here's the reflection bit. It just does exactly what it looks like.
    // Field types have a method to access their annotations (if any), so
    // the ElementLocators will use the Field to get the annotations, which
    // have all the info to create a By object, as we'll discuss.
    // Note: ".class" only works on a type name; an instance uses getClass().
    Field[] fields = loginPage.getClass().getDeclaredFields();

    for (Field field : fields) {
        // Assume for brevity these are all WebElement fields
        ElementLocator locator = locatorFactory.createLocator(field);
        // Do stuff with the locator...
    }
}

So the Locator Factory creates Locators from Fields. If you look at all these interfaces (SearchContext, ElementLocator, By), you’ll start to see they look really similar. What’s special about an ElementLocator instance specifically is that it can find an element on demand without any parameters. A SearchContext needs a By to do that. A By needs a SearchContext. Like Doc Brown needs both a flux capacitor and some plutonium to travel through time, we need both a DOM context and a means-of-finding-something-in-that-context to reference an element.

Now, we’ve got a SearchContext (from the driver we passed initElements), but what about the By? Well, actually, we have that already too. More specifically, we have enough information to make a By. We have our page object, and that page object has fields, and each field may have an annotation that says, “Hey this is how you make a By for me,” and if not, we assume that the name of the field is the id (or, failing that, the name attribute, in HTML terms) of the element we’re looking for. And so, an ElementLocator constructs a By itself, given the information attached to a field. And now we’ve got a SearchContext and a By wrapped up in this ElementLocator guy that we can pass around like it’s a WebElement… Effectively it is! That is, without the shortcomings of a direct WebElement reference. In order to get a WebElement we have to find it first, and if it can’t be found, we can’t get a reference to it (findElement would throw a NoSuchElementException). An ElementLocator on the other hand is as good as pointing to a specific element, but we can hold off on actually finding that element until we’re ready to assert that the element should actually be there in the driver’s current context. An ElementLocator can even cache an element once it’s found it for the first time, and just reuse it on subsequent lookups.
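To make the “annotation first, field name second” rule concrete, here is a minimal sketch that doesn’t touch Selenium at all. The @Find annotation, the Object-typed fields, and the idsFor method are hypothetical stand-ins invented for this illustration; Selenium’s real @FindBy supports many more strategies than an id.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

final class ByFromFieldSketch {
    // Hypothetical stand-in for @FindBy. RUNTIME retention is required,
    // otherwise the annotation is invisible to reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Find {
        String id();
    }

    // Stand-in page object; Object replaces WebElement to stay self-contained.
    static class LoginPage {
        Object username;              // no annotation: the field name is used

        @Find(id = "submitLogin")
        Object submit;                // annotation wins over the field name
    }

    // Maps each declared field name to the id we would search for.
    static Map<String, String> idsFor(Class<?> pageClass) {
        Map<String, String> ids = new HashMap<>();
        for (Field field : pageClass.getDeclaredFields()) {
            Find find = field.getAnnotation(Find.class);
            ids.put(field.getName(), find != null ? find.id() : field.getName());
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, String> ids = idsFor(LoginPage.class);
        System.out.println(ids.get("username")); // prints "username"
        System.out.println(ids.get("submit"));   // prints "submitLogin"
    }
}
```

Nothing here needs an instance of the page: the annotation lives on the Field, which is why a locator can be built from a Field alone.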

In summary, an ElementLocatorFactory takes a SearchContext and a Field, smashes them together and makes a portable ElementLocator. An ElementLocator constructs the By from the Field, looking at its annotations if it has any, and from there it has all the ingredients to reference a specific element without additional parameters. This useful feature is going to be essential in a minute.
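Here’s a rough sketch of that idea, again without Selenium. Context, Locator, and FakeDom are hypothetical stand-ins (for SearchContext, ElementLocator, and a browser, respectively), “elements” are plain strings, and java.util.NoSuchElementException stands in for Selenium’s exception of the same name. The point is only that the locator carries both ingredients, so findElement takes no parameters and nothing is looked up until it’s called.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

final class LocatorSketch {
    // Stand-in for SearchContext: something that can find an element by id.
    interface Context {
        String findById(String id);
    }

    // Stand-in for ElementLocator: context + id bundled together.
    static final class Locator {
        private final Context context;
        private final String id;
        private final boolean cache;
        private String cached;

        Locator(Context context, String id, boolean cache) {
            this.context = context;
            this.id = id;
            this.cache = cache;
        }

        // Finds the element on demand, reusing a cached result if allowed.
        String findElement() {
            if (cache && cached != null) {
                return cached;
            }
            String found = context.findById(id);
            if (cache) {
                cached = found;
            }
            return found;
        }
    }

    // A fake page that counts how often it is searched.
    static final class FakeDom implements Context {
        final Map<String, String> elements = new HashMap<>();
        int lookups = 0;

        @Override
        public String findById(String id) {
            lookups++;
            String element = elements.get(id);
            if (element == null) {
                throw new NoSuchElementException("No element with id " + id);
            }
            return element;
        }
    }

    public static void main(String[] args) {
        FakeDom dom = new FakeDom();
        Locator submit = new Locator(dom, "submitLogin", true);
        System.out.println(dom.lookups); // prints 0: creating a locator finds nothing

        dom.elements.put("submitLogin", "<button>");
        submit.findElement();
        submit.findElement();
        System.out.println(dom.lookups); // prints 1: the second call used the cache
    }
}
```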

2. Instance a FieldDecorator by giving it the ElementLocatorFactory

The next type we encounter is a FieldDecorator. A field decorator is the thing that actually uses the ElementLocatorFactory, so we instantiate the decorator by passing along the locator factory in the decorator’s constructor. It’s going to need the ability to generate locators for a given field (which is what the factory does), because it has the core task of actually assigning WebElements to the fields of our page object – that is, “decorating” those fields.

3. Use the FieldDecorator to assign references to the page object’s WebElement fields

The decorate method of our FieldDecorator takes a ClassLoader and a Field. The ClassLoader is just what it sounds like: every Java class is loaded by “something.” To load a class is to take a class definition in some form (typically the bytes of a compiled .class file) and turn it into a Class object the JVM can actually run: the real working bits. There are different ClassLoader implementations depending on the platform or source of the Java class. PageFactory will always just reuse the ClassLoader that loaded our page object’s class. Any Class<?> object has a getClassLoader method for this purpose.
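If you want to poke at this yourself, a tiny sketch is enough. LoginPage here is an empty hypothetical stand-in, and loaderOf is a helper made up for the example.

```java
final class ClassLoaderSketch {
    // Empty stand-in for a page object.
    static class LoginPage {
    }

    // The same expression PageFactory uses: object -> class -> class loader.
    static ClassLoader loaderOf(Object page) {
        return page.getClass().getClassLoader();
    }

    public static void main(String[] args) {
        ClassLoader loader = loaderOf(new LoginPage());
        // Application classes are normally loaded by a real ClassLoader object.
        System.out.println(loader != null);
        // Every instance of a class reports the same loader as the class itself.
        System.out.println(loader == LoginPage.class.getClassLoader());
    }
}
```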

More important is the Field, which the FieldDecorator will use to generate an ElementLocator for the particular field we are attempting to decorate. Cool, but how do we get a Field object? The Class<?> type also provides this facility, via getDeclaredFields. This is reflection. With a Field, you can examine its modifiers, and also set or get it for a particular instance of the class that declares the field. This is what PageFactory does, as seen here:

private static void proxyFields(FieldDecorator decorator, Object page, 
    Class<?> proxyIn) {
  Field[] fields = proxyIn.getDeclaredFields(); // proxyIn is the class of the
                                                // page object being initialized.
  for (Field field : fields) {
    Object value = decorator.decorate(page.getClass().getClassLoader(), 
        field);
    if (value != null) {
      try {
        field.setAccessible(true); // Fields accessed via reflection still 
                                   // obey Java's visibility rules, however 
                                   // this can be overridden by setting the 
                                   // "accessible" flag.
        field.set(page, value);
      } catch (IllegalAccessException e) {
        throw new RuntimeException(e);
      }
    }
  }
}
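The same reflection calls can be tried in isolation. This is only a sketch: Page and injectStrings are hypothetical names, and a String takes the place of the proxy that PageFactory would assign. It shows getDeclaredFields, a modifier check, and setAccessible plus set on a private field.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

final class FieldInjectionSketch {
    // Stand-in page object with a private field that nobody assigns directly.
    static class Page {
        private String title;
        private static String shared = "untouched"; // skipped: static

        String title() {
            return title;
        }

        static String shared() {
            return shared;
        }
    }

    // Sets every non-static String field declared by target's class to value
    // and returns how many fields were assigned.
    static int injectStrings(Object target, String value) {
        int assigned = 0;
        for (Field field : target.getClass().getDeclaredFields()) {
            boolean isStatic = Modifier.isStatic(field.getModifiers());
            if (field.getType() != String.class || isStatic) {
                continue;
            }
            try {
                field.setAccessible(true); // bypass "private" for this Field object
                field.set(target, value);
                assigned++;
            } catch (IllegalAccessException e) {
                throw new RuntimeException(e);
            }
        }
        return assigned;
    }

    public static void main(String[] args) {
        Page page = new Page();
        System.out.println(page.title());                // prints "null"
        System.out.println(injectStrings(page, "Login")); // prints 1
        System.out.println(page.title());                // prints "Login"
    }
}
```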

As you can see, FieldDecorator.decorate(...) returns an object, and we set that object as the value of the field we passed to it. What is that value? You might be able to guess at this point. Recall we instantiated the DefaultFieldDecorator by passing along the ElementLocatorFactory, so this thing knows how to make element locators. Perhaps it just used the field’s locator to find the element and returned that? But then what if the element couldn’t be found at initialization?

Enter the proxy pattern. Instead of assigning those fields a WebElement directly, we assign each one a “proxy” instance of a WebElement. That is, an object that implements the WebElement interface, but not by way of a conventional class. Instead, when methods are called on the proxy, that method and its arguments are passed to an intercepting method as arguments (as in Method method, Object[] args). That intercepting method is ours to implement by way of an InvocationHandler. When we implement the InvocationHandler interface, we implement that intercepting method. There, we can do whatever we want, provided it returns a value that complies with the method’s signature. Due to that constraint, it usually involves calling invoke of the original method object (say click()), on some other WebElement object. See where this is going? That “other” WebElement is the one our ElementLocator can track down independently. By implementing WebElement via a proxy, we defer calling SearchContext.findElement (and potentially throwing an exception), until we actually try to do something on that element. Magic!

Instantiating and implementing a Proxy is quite simple. Here’s a contrived example:

// The interface(s) that the proxy will implement governs this type.
WebElement proxyElement;

// java.lang.reflect.Proxy has a static method, newProxyInstance. At compile 
// time we can only say that this returns an Object type, but it's really 
// returning an instance of a new class that implements whatever interfaces
// we say it does. So we can safely cast to WebElement.
proxyElement = (WebElement) Proxy.newProxyInstance(
        // We have to pass a ClassLoader here so the proxy class can be 
        // defined. Recall this is why decorate accepts a ClassLoader (with 
        // which we pass the page object's ClassLoader). "loader" is a
        // placeholder name for that ClassLoader.
        loader,
        // This is an array of Class types. These are the interfaces that 
        // this object supports, governing the cast rules and the methods we 
        // have to be able to handle in our InvocationHandler. This is why we
        // can cast to WebElement.
        new Class<?>[] {WebElement.class}, 
        // This is our invocation handler. "handler" is a placeholder name
        // for any InvocationHandler instance.
        handler);
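For something you can actually run, here is a self-contained sketch of the same call. Element is a hypothetical stand-in for WebElement, and the handler just records which methods were called instead of forwarding them anywhere.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

final class ProxySketch {
    // Stand-in for WebElement, trimmed to two methods.
    interface Element {
        void click();

        String getText();
    }

    // Builds an Element with no implementing class: every call lands in the
    // handler, which logs the method name and fabricates a return value.
    static Element recordingElement(List<String> calls) {
        InvocationHandler handler = (proxy, method, args) -> {
            calls.add(method.getName());
            // The result must fit the method's return type; null is fine for void.
            return method.getReturnType() == String.class ? "hello" : null;
        };
        return (Element) Proxy.newProxyInstance(
                Element.class.getClassLoader(),
                new Class<?>[] {Element.class},
                handler);
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        Element element = recordingElement(calls);
        element.click();
        System.out.println(element.getText()); // prints "hello"
        System.out.println(calls);             // prints "[click, getText]"
    }
}
```

One thing to watch for when experimenting: equals, hashCode, and toString on the proxy are routed to the handler too, so a handler this naive will misbehave if you call them.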

And this is what the InvocationHandler looks like that PageFactory uses, with my own comments added:

public class LocatingElementHandler implements InvocationHandler {
  private final ElementLocator locator;

  // Inject the thing that can look up a specific element at a later time
  public LocatingElementHandler(ElementLocator locator) {
    this.locator = locator;
  }

  public Object invoke(Object object, Method method, Object[] objects) 
      throws Throwable {
    // The lazy look up!
    WebElement element = locator.findElement();

    // This proxy also implements "WrapsElement" and must implement its single
    // method manually, like so:
    if ("getWrappedElement".equals(method.getName())) {
      return element;
    }

    try {
      return method.invoke(element, objects);
    } catch (InvocationTargetException e) {
      // If the method that is reflectively invoked throws an exception, it's 
      // rethrown as an "InvocationTargetException". We can throw the original,
      // more interesting exception by "unwrapping" it.

      // Unwrap the underlying exception
      throw e.getCause();
    }
  }
}
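The unwrapping step is easy to see in isolation. In this sketch, Button and failureOfClick are hypothetical names; the method returns whatever was thrown so you can compare the wrapper with its cause.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

final class UnwrapSketch {
    // Stand-in for an element whose method fails when called.
    static class Button {
        public void click() {
            throw new IllegalStateException("not clickable");
        }
    }

    // Calls click() reflectively and returns what was thrown: either the
    // InvocationTargetException wrapper or, if unwrap is true, its cause.
    static Throwable failureOfClick(Object target, boolean unwrap) {
        try {
            Method click = target.getClass().getDeclaredMethod("click");
            click.invoke(target);
            return null; // click() completed normally
        } catch (InvocationTargetException e) {
            return unwrap ? e.getCause() : e;
        } catch (ReflectiveOperationException e) {
            // NoSuchMethodException or IllegalAccessException: the reflective
            // call itself could not be made.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Button button = new Button();
        // prints "InvocationTargetException"
        System.out.println(failureOfClick(button, false).getClass().getSimpleName());
        // prints "IllegalStateException"
        System.out.println(failureOfClick(button, true).getClass().getSimpleName());
    }
}
```

Without the unwrap, callers of a proxied element would have to dig through a reflection exception to find out what actually went wrong.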

All of this work is encapsulated inside of the field decorator. You construct it with an ElementLocatorFactory, pass it a field (and a ClassLoader), and it returns a proxy, which we then assign to the respective field. Tada!
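Putting the pieces together, here is a toy version of the whole flow in one file. It is a sketch under heavy assumptions, not Selenium’s implementation: Element, TextBox, Dom, and this initElements are all hypothetical stand-ins, fields are located by name only, and java.util.NoSuchElementException stands in for Selenium’s.

```java
import java.lang.reflect.Field;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

final class MiniPageFactory {
    interface Element {            // stand-in for WebElement
        void type(String text);

        String value();
    }

    static final class TextBox implements Element {
        private final StringBuilder text = new StringBuilder();

        public void type(String t) { text.append(t); }

        public String value() { return text.toString(); }
    }

    static final class Dom {       // stand-in for the driver / SearchContext
        final Map<String, Element> byId = new HashMap<>();

        Element find(String id) {
            Element element = byId.get(id);
            if (element == null) {
                throw new NoSuchElementException("No element with id " + id);
            }
            return element;
        }
    }

    static class LoginPage {
        private Element username;  // never assigned by hand
        private Element password;

        void login(String user, String pass) {
            username.type(user);
            password.type(pass);
        }
    }

    // Assigns a lazy proxy to every Element field declared by the page's class.
    static <T> T initElements(Dom dom, T page) {
        for (Field field : page.getClass().getDeclaredFields()) {
            if (field.getType() != Element.class) {
                continue;
            }
            String id = field.getName();
            InvocationHandler handler = (proxy, method, args) -> {
                Element real = dom.find(id); // the lookup happens only now
                try {
                    return method.invoke(real, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause();      // surface the original exception
                }
            };
            Object value = Proxy.newProxyInstance(
                    page.getClass().getClassLoader(),
                    new Class<?>[] {Element.class},
                    handler);
            try {
                field.setAccessible(true);
                field.set(page, value);
            } catch (IllegalAccessException e) {
                throw new RuntimeException(e);
            }
        }
        return page;
    }

    public static void main(String[] args) {
        Dom dom = new Dom();
        // The "page" is still empty here, yet initialization succeeds.
        LoginPage page = initElements(dom, new LoginPage());
        dom.byId.put("username", new TextBox());
        dom.byId.put("password", new TextBox());
        page.login("alice", "secret");
        System.out.println(dom.find("username").value()); // prints "alice"
    }
}
```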


With Java’s reflection and dynamic proxies you can:

  • Retrieve a class’s fields and modify them at runtime
  • Pose as an implementation of an interface by intercepting method calls and implementing your own logic (i.e. a proxy)

And this is how PageFactory does its magic.

Happy reflecting!

