Web Automation: Why I Hate Page Object Model

itay zohar
4 min read · Apr 4, 2020

Click-bait title aside, after 5 years of automation work I remain strongly opposed to the Page Object Model (POM). Here’s why.

Article summary

POM is a good QA approach but a bad software architecture. The division shouldn't be just between test scripts and web pages, but mainly between logic and HTML.

This is a sample project built on the article’s principles.

The concept

POM derives its inspiration from web development. There, an app is traditionally structured by pages or, in modern apps, by components.

This approach gives your automation project a guaranteed structure. Since a web app is the sum of its components, test them all and you have full test coverage.

The fallacy

The problem is, it’s a hoax: that inspiration is misleading.
What we have here is a mix-up between two worldviews of QA automation: the tester’s view and the developer’s view.

The tester’s view takes a “black box” look at the outline of the web application and sees the components. That is all QA should think about when planning the tests. This doesn't have to change; it’s just not the scope of POM.

POM is not made to plan tests, but to design software. Automation software, but software nonetheless.

With that in mind, the developer’s view shows that the main architectural principle on the web today is MVC (Model-View-Controller).

While components and modularity are important, the essence of MVC is completely missing from POM:
the separation of concerns between UI design, business logic, and user flow.

A bad software architecture

With implementation in mind, we need to take a fresh look and apply software design principles to our test development.

A “Page Object” has these responsibilities:

  1. Querying all elements of the webpage
    XPath, CSS selectors and other identifiers, plus any further processing if needed.
  2. Handling all functionality that takes place on the webpage
    This includes simple things like clicks and hovers, but also table search, form manipulation and so on.
  3. Handling all user flows and “business logic” included in the webpage
    Encapsulated high-level methods such as “log in” or “create new user”.
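
To make the three responsibilities concrete, here is a minimal sketch of a typical page object. Everything in it is an assumption for illustration: `FakeDriver` is a hypothetical stand-in for a real browser driver (it only records calls), and the selectors and the `LoginPage` class are invented, not taken from any real application.

```python
class FakeDriver:
    """Hypothetical stand-in for a browser driver; it only records requests."""

    def __init__(self):
        self.log = []

    def find(self, css):
        self.log.append(("find", css))
        return css  # the "element" is just its selector string here

    def type(self, element, text):
        self.log.append(("type", element, text))

    def click(self, element):
        self.log.append(("click", element))


class LoginPage:
    """A typical page object: selectors, interactions and flow in one class."""

    # Responsibility 1: querying elements (invented selectors)
    USERNAME = "#username"
    PASSWORD = "#password"
    SUBMIT = "button.btn-blue"

    def __init__(self, driver):
        self.driver = driver

    # Responsibility 2: page functionality
    def fill_username(self, name):
        self.driver.type(self.driver.find(self.USERNAME), name)

    def fill_password(self, password):
        self.driver.type(self.driver.find(self.PASSWORD), password)

    def submit(self):
        self.driver.click(self.driver.find(self.SUBMIT))

    # Responsibility 3: user flow / business logic
    def log_in(self, name, password):
        self.fill_username(name)
        self.fill_password(password)
        self.submit()


driver = FakeDriver()
LoginPage(driver).log_in("alice", "s3cret")
```

Note that the `log_in` flow lives in the same class as `button.btn-blue`, so a purely cosmetic change to the button touches the file that holds the business flow.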

To any experienced developer the trouble already presents itself. A module or a class should have one responsibility, one area of expertise.
What does screen querying have to do with the app's business logic?
Why does the “create user” flow depend on the fact that the button that creates the user is blue?

To make things clear, let’s review some of the issues formally:

  • No separation of concerns
    This monstrous page object scope is the automation equivalent of a React component querying the database or handling back-end calculations.
  • Strong coupling
    If a single element changes its class name while all functionality and flows remain untouched, the page object still has to be rewritten. The same goes for changes to responsibilities (2) and (3).
  • Not generic
    In reality, the same components are used on multiple pages; it is probably the same table across the whole application. A page-by-page structure makes them inconvenient to reuse.
  • Not modular
    A whole process can be very similar between different components: “create user” and “create new user-group” can be almost the same. From a design perspective we might want to create one generic method, but POM prevents us from doing so.
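
As a rough illustration of the last point, the sketch below shows one generic “create” flow shared by users and user-groups. The names (`FormSelectors`, `create_entity`, the selector strings) are hypothetical, and the flow only collects the steps a driver would perform rather than driving a real browser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormSelectors:
    """Where a 'create something' form lives on screen (invented selectors)."""
    open_button: str
    name_field: str
    save_button: str


USER_FORM = FormSelectors("#new-user", "#user-name", "#user-save")
GROUP_FORM = FormSelectors("#new-group", "#group-name", "#group-save")


def create_entity(actions, form, name):
    """One generic flow; `actions` collects the steps a driver would perform."""
    actions.append(("click", form.open_button))
    actions.append(("type", form.name_field, name))
    actions.append(("click", form.save_button))
    return actions


user_steps = create_entity([], USER_FORM, "alice")
group_steps = create_entity([], GROUP_FORM, "admins")
```

With page objects, this flow would usually be written twice, once in a users page class and once in a groups page class.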

I think it is clear by now that POM, simply put, is a bad software architecture.
The division shouldn’t be between “pages” and tests, but between logic and HTML.

The alternative

As with any piece of software, we should divide our projects into layers.
First, three basic layers should exist even with POM:

  1. Data layer (mainly test data related).
  2. Web automation framework (like “pom over selenium”).
  3. Tests (the scripts).

Now, instead of POM, I use a framework that itself has two or three layers:

Web automation architecture:
  1. Data layer.
  2. Selectors layer: interacts with the screen.
  3. Web app functionality layer, with two responsibilities:
    a. Group layer (2) interactions into meaningful actions.
    b. Control business flows using these actions.
  4. Finally, the test scripts: perform flows and validate them.

Note: you may prefer to split 3.a and 3.b into separate layers, and that’s fine. In my experience this depends heavily on the application under test.
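
The layering can be sketched in a single file as follows. This is a minimal illustration under stated assumptions, not the sample project's code: `RecordingBrowser` is a hypothetical stand-in for a real driver, and the selectors and function names are invented. In a real project each numbered part would live in its own module.

```python
class RecordingBrowser:
    """Hypothetical stand-in for a real driver; records calls only."""

    def __init__(self):
        self.calls = []

    def type(self, selector, text):
        self.calls.append(("type", selector, text))

    def click(self, selector):
        self.calls.append(("click", selector))


# 1. Data layer: test data only.
TEST_USER = {"name": "alice", "password": "s3cret"}

# 2. Selectors layer: the only place that knows the markup (invented values).
LOGIN_SELECTORS = {
    "username": "#username",
    "password": "#password",
    "submit": "button[type=submit]",
}


# 3a. Functionality layer, actions: group selector interactions.
def enter_credentials(browser, selectors, name, password):
    browser.type(selectors["username"], name)
    browser.type(selectors["password"], password)


def submit_login(browser, selectors):
    browser.click(selectors["submit"])


# 3b. Functionality layer, flows: business flows built from actions.
def log_in(browser, user, selectors=LOGIN_SELECTORS):
    enter_credentials(browser, selectors, user["name"], user["password"])
    submit_login(browser, selectors)


# 4. Test script: performs the flow and validates it, with no selectors in sight.
def test_login_submits_form():
    browser = RecordingBrowser()
    log_in(browser, TEST_USER)
    assert browser.calls[-1][0] == "click"
    return browser.calls


calls = test_login_submits_form()
```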

Short case study: Facebook

To demonstrate, I created a basic project structure.

Selectors files represent screen interactions. If Facebook decided to rewrite their React application in Angular, only those files would have to change.
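
A small sketch of that idea, assuming two interchangeable selector sets: the class names and selector strings below are invented for illustration and do not reflect Facebook's actual markup. The functionality code asks for “the like button” by name, so swapping the selector set does not touch it.

```python
class ReactSelectors:
    """Hypothetical selectors for the current markup."""
    like_button = "div[data-testid='like']"
    post_text = "div[data-testid='post-text']"


class AngularSelectors:
    """Hypothetical selectors after a rewrite: same names, different markup."""
    like_button = "fb-like-button > button"
    post_text = "fb-post .text"


def like_post(selectors, clicked):
    """Functionality code: asks for 'the like button', never for a CSS path."""
    clicked.append(selectors.like_button)
    return clicked


before = like_post(ReactSelectors, [])
after = like_post(AngularSelectors, [])
```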

Functionality files like feed.py are modular and convenient. No matter whether you’re in a group, on the news feed or on a timeline, you use feed.py to scroll and manipulate the currently displayed feed. The same goes for posts.py and friends.py.

The friends.py file uses methods from other functionality files to navigate, send a friend request, check notifications and approve a request.
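
Here is one way such composition might look, squeezed into a single block. It is only a sketch: `FakeSession` is a hypothetical in-memory stand-in for a logged-in browser session, and the `navigation.py` and `notifications.py` file names are assumptions, since the article only names feed.py, posts.py and friends.py.

```python
class FakeSession:
    """Hypothetical in-memory stand-in for a logged-in browser session."""

    def __init__(self, user, network):
        self.user = user
        self.network = network  # shared state: pending requests and friendships
        self.page = "feed"


# navigation.py (assumed functionality file)
def go_to_profile(session, name):
    session.page = f"profile:{name}"


# notifications.py (assumed functionality file)
def pending_requests(session):
    return [s for s, target in session.network["pending"] if target == session.user]


# friends.py: builds flows out of the other functionality files
def send_friend_request(session, name):
    go_to_profile(session, name)
    session.network["pending"].append((session.user, name))


def approve_request(session, sender):
    if sender not in pending_requests(session):
        raise ValueError(f"no pending request from {sender}")
    session.network["pending"].remove((sender, session.user))
    session.network["friends"].add(frozenset((sender, session.user)))


# Test script: only functionality calls, no selectors.
network = {"pending": [], "friends": set()}
alice = FakeSession("alice", network)
bob = FakeSession("bob", network)
send_friend_request(alice, "bob")
approve_request(bob, "alice")
```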

Finally, test script writers aren’t even aware of the selectors. They only know of the functional files and can use them to perform their test steps.

A sample project

This is a sample project built on the article’s principles.

Conclusion

In a broader sense, it seems basic software principles are sometimes absent from discussions of test automation.

This article is my first attempt at steering the discussion from automation that merely works to automation that is well designed.

I hope I made my case about POM. Your comments and thoughts are most welcome!

itay zohar

Jack of all trades, master of some: | Software Developer | Entrepreneur | QA Automation Manager | Various Technologies | OOP & procedural |