
How POM-based Architecture Boosts your Automated Testing

Recognizing the enormous effort that manual UI testing requires, organizations have opted to automate their UI test activity to speed up the development pipeline. To make automation pay off, one must approach it with the same mindset that we apply to developing production code.

Automation tools provide the infrastructure to automate testing, but structuring the automation code itself falls on the shoulders of the automation engineer. The page object model (POM) has been widely accepted as an architectural style for structuring and building automation frameworks. The basic idea of a POM-based architecture is to model each page in the application as a class in the language used to implement the automation framework, such as C# or Java.

Advantages of POM-based architecture

A POM-based architecture has many advantages. First, it provides a clear separation between the automated test cases (TCs) and the automation code. Second, the TCs become much easier to write because they are expressed using the business domain language. Third, the TCs require no updates when the UI changes, for example when elements are rearranged on the pages of the application, or when the properties used to locate them change, such as switching from ‘Find Element By Class’ to ‘Find Element By Id’. A POM does, however, take an upfront time investment to structure the framework and to identify the application pages, or components of a page, to be modeled as page objects.

To gain a better understanding of a POM-based automation framework, let’s use a simple example of a test. A user logs in by entering ‘User name’ and ‘Password’ in their respective textboxes and then clicks on the ‘Login’ button. The user is then directed to a page displaying their work items.

We’ll contrast two approaches for automating the following test case. We assume that the input data is provided by an external data source to support a data-driven automation approach.

Step | Action                | Input data | Expected results
1    | Navigate to home page | <AppUrl>   | Home page is displayed
2    | Enter user name       | <username> | Home page is displayed
3    | Enter password        | <password> | Home page is displayed
4    | Click Login           |            | Employee page is displayed


Approach 1: Accessing UI elements without a POM

In this approach, the automation code interacts directly with the pages of the application. It interacts with two pages, the ‘Home Page’ and the ‘Employee Page’. The test case locates the HTML elements on each of the pages and interacts with them directly. Here is the code for automating the test case using C#, where all HTML elements are located by “ID”.

[code language="csharp"]

[TestMethod, DataSource(
    "Dsn=Excel Files;" +
    "Driver={Microsoft Excel Driver(*.xlsx)};" +
    "dbq=|DataDirectory|\\EmployeeDataSource.xlsx;defaultdir=.;" +
    "driver=790;" +
    "maxbuffersize=2048;" +
    "pagetimeout=5;"
    // NOTE: the remaining DataSource arguments are missing in the source.
)]
public void EmployeeCanViewAssignedWorkItemsCodeSmell()
{
    // Read the input data for this test iteration from the data source
    string UserName = TestContext.DataRow["Username"].ToString();
    string Password = TestContext.DataRow["Password"].ToString();
    string EmployeeName = TestContext.DataRow["EmployeeName"].ToString();

    // Navigate to the Home Page
    BrowserWindow browser = new BrowserWindow();
    browser.NavigateToUrl(new Uri(ApplicationUrl));

    // Find textbox and enter user name
    HtmlEdit userNameTextbox = new HtmlEdit(browser);
    userNameTextbox.SearchProperties[HtmlEdit.PropertyNames.Id] = "TextBoxUserID";
    userNameTextbox.Text = UserName;

    // Find textbox and enter password
    HtmlEdit passwordTextbox = new HtmlEdit(browser);
    passwordTextbox.SearchProperties[HtmlEdit.PropertyNames.Id] = "PasswordID";
    passwordTextbox.Text = Password;

    // Find login button and click it
    HtmlInputButton loginButton = new HtmlInputButton(browser);
    loginButton.SearchProperties[HtmlInputButton.PropertyNames.Id] = "LoginID";
    Mouse.Click(loginButton);

    // Verify that the employee page is displayed
    HtmlInputButton eNameBtn = new HtmlInputButton(browser);
    eNameBtn.SearchProperties[HtmlInputButton.PropertyNames.Id] = "ButtonEmployeeNameID";
    string name = eNameBtn.DisplayText;
    // The assertion is missing in the source; comparing against the
    // EmployeeName read above is the presumed check.
    Assert.AreEqual(EmployeeName, name);
}

[/code]

A major drawback of this approach is maintenance. If the properties for locating a UI element, such as the ‘Username’ textbox, change, then every test case that uses that element needs to be updated. Another drawback is that we are mixing the steps of a test case with the logic of locating and interacting with the HTML elements, thereby violating the “separation of concerns” principle.

Approach 2: Accessing UI elements using a POM

A better approach is to model the application pages using the POM architecture. By designing an interface, we abstract the mechanics of finding the HTML elements and provide an API to the tests. The POM sits between the test cases and the application pages. From the test case perspective, the model provides an abstraction of the functions that the test case uses; from the application perspective, the POM implements the details of locating and interacting with the application pages.

This approach enforces the “separation of concerns” principle by separating the “what” to test from the “how” of implementing the test. For example, a test case step such as ‘LoginWithUsernameAndpassword(user, password)’ expresses what to test, and it requires no changes even if the properties for locating the user name and password text boxes change.
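To make the “how” side concrete, here is a minimal sketch of a page object that could sit behind a step like ‘LoginWithUsernameAndpassword’. It is not the original framework's code. The `IBrowser` interface, the `FakeBrowser` class, the `NavigateTo` method, and the `EmployeePage` class with its `EmployeeName` property are assumptions made for illustration, so that the sketch runs without Coded UI or a real browser. The element IDs are the ones used in the Approach 1 listing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the automation tool (Coded UI, Selenium, ...).
// It is an assumption of this sketch, not a real library API.
public interface IBrowser
{
    void NavigateTo(string url);
    void TypeInto(string elementId, string text);
    void Click(string elementId);
    string ReadText(string elementId);
}

// Page object for the home page: the only place that knows the locators.
public class HomePage
{
    private const string UserNameTextboxId = "TextBoxUserID";
    private const string PasswordTextboxId = "PasswordID";
    private const string LoginButtonId = "LoginID";

    private readonly IBrowser browser;

    public HomePage(IBrowser browser) { this.browser = browser; }

    public HomePage NavigateTo(string url)
    {
        browser.NavigateTo(url);
        return this;
    }

    // Expresses the business step; returns the page the user lands on.
    public EmployeePage LoginWithUsernameAndpassword(string userName, string password)
    {
        browser.TypeInto(UserNameTextboxId, userName);
        browser.TypeInto(PasswordTextboxId, password);
        browser.Click(LoginButtonId);
        return new EmployeePage(browser);
    }
}

// Page object for the employee page (hypothetical shape).
public class EmployeePage
{
    private const string EmployeeNameButtonId = "ButtonEmployeeNameID";
    private readonly IBrowser browser;

    public EmployeePage(IBrowser browser) { this.browser = browser; }

    public string EmployeeName => browser.ReadText(EmployeeNameButtonId);
}

// In-memory fake that plays the role of the application under test.
public class FakeBrowser : IBrowser
{
    private readonly Dictionary<string, string> fields = new Dictionary<string, string>();
    public List<string> Log { get; } = new List<string>();

    public void NavigateTo(string url) => Log.Add("navigate " + url);

    public void TypeInto(string elementId, string text)
    {
        fields[elementId] = text;
        Log.Add("type " + elementId);
    }

    public void Click(string elementId)
    {
        Log.Add("click " + elementId);
        // Simulated login: the employee page shows the entered user name.
        if (elementId == "LoginID" && fields.TryGetValue("TextBoxUserID", out var user))
            fields["ButtonEmployeeNameID"] = user;
    }

    public string ReadText(string elementId) =>
        fields.TryGetValue(elementId, out var text) ? text : "";
}

public static class Program
{
    public static void Main()
    {
        EmployeePage page = new HomePage(new FakeBrowser())
            .NavigateTo("http://example.test/")
            .LoginWithUsernameAndpassword("jdoe", "secret");
        Console.WriteLine(page.EmployeeName); // prints "jdoe"
    }
}
```

If the user name textbox were later located by a different ID, only the `UserNameTextboxId` constant in `HomePage` would change (in this sketch the fake application would change too, since it stands in for the real page); the calling code in `Main` would stay as it is.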

We can rewrite our earlier test so that it focuses strictly on the ‘what’: all the details of interacting with the elements of the application page move to another class, which implements the ‘how’.

Here is the resulting version. It is much more concise, more readable, and more maintainable than the first approach.

[code language="csharp"]

[TestMethod, DataSource(… same as before …)]
public void EmployeeCanViewAssignedWorkItems()
{
    string UserName = TestContext.DataRow["Username"].ToString();
    string Password = TestContext.DataRow["Password"].ToString();
    string EmployeeName = TestContext.DataRow["EmployeeName"].ToString();

    // Open browser and navigate to the homepage
    BrowserWindow browser = new BrowserWindow();
    HomePage homePage = new HomePage(browser);

    // NOTE: the source drops the lines around the login call (the navigation
    // step before it and the verification after it). Only this call survives;
    // attaching it to homePage is an assumption.
    homePage
        .LoginWithUsernameAndpassword(UserName, Password);
}

[/code]

The mapping from application pages to POM pages is largely an architectural decision. In many cases, it makes sense to map different components of an application page to their own page objects, then use class inheritance or class composition to construct other page objects. For example, you can map a footer, a header, and a navigation bar to their own page objects.
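The following sketch illustrates one way such a mapping could look, combining composition and inheritance. It is an illustration under stated assumptions, not a prescribed design: the `IBrowser` interface, the `FakeBrowser` class, the `BasePage` class, and the element IDs `HeaderTitleID` and `FooterTextID` are all hypothetical names introduced here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the automation tool; not a real library API.
public interface IBrowser
{
    string ReadText(string elementId);
    void Click(string elementId);
}

// Component page objects: each models one shared region of a page.
public class Header
{
    private readonly IBrowser browser;
    public Header(IBrowser browser) { this.browser = browser; }
    public string Title => browser.ReadText("HeaderTitleID"); // assumed ID
}

public class NavigationBar
{
    private readonly IBrowser browser;
    public NavigationBar(IBrowser browser) { this.browser = browser; }
    public void Open(string linkId) => browser.Click(linkId);
}

public class Footer
{
    private readonly IBrowser browser;
    public Footer(IBrowser browser) { this.browser = browser; }
    public string Text => browser.ReadText("FooterTextID"); // assumed ID
}

// Base page: builds the shared components once (composition).
public abstract class BasePage
{
    protected readonly IBrowser Browser;
    public Header Header { get; }
    public NavigationBar Navigation { get; }
    public Footer Footer { get; }

    protected BasePage(IBrowser browser)
    {
        Browser = browser;
        Header = new Header(browser);
        Navigation = new NavigationBar(browser);
        Footer = new Footer(browser);
    }
}

// Concrete page: inherits the shared components and adds its own elements.
public class EmployeePage : BasePage
{
    public EmployeePage(IBrowser browser) : base(browser) { }
    public string EmployeeName => Browser.ReadText("ButtonEmployeeNameID");
}

// In-memory fake so the sketch runs without a real browser.
public class FakeBrowser : IBrowser
{
    private readonly Dictionary<string, string> texts = new Dictionary<string, string>
    {
        { "HeaderTitleID", "Work Items" },
        { "FooterTextID", "Example footer" },
        { "ButtonEmployeeNameID", "Jane Doe" }
    };

    public string LastClicked { get; private set; }

    public string ReadText(string elementId) =>
        texts.TryGetValue(elementId, out var text) ? text : "";

    public void Click(string elementId) { LastClicked = elementId; }
}

public static class Program
{
    public static void Main()
    {
        var page = new EmployeePage(new FakeBrowser());
        Console.WriteLine(page.Header.Title);  // prints "Work Items"
        Console.WriteLine(page.EmployeeName);  // prints "Jane Doe"
    }
}
```

With this layout, a change to the header markup touches only the `Header` class, and every page object derived from `BasePage` picks it up.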

In summary, the page object model architecture offers many advantages when designing an automation framework. (1) It supports the separation of concerns principle, where the automation code is designed as an API to the automated test cases. (2) It supports a functional abstraction of the application pages, where all the inner details of the pages are expressed as separate objects. (3) It produces maintainable code, where changes to how UI elements are located are confined to the automation code and do not affect the automated test cases. (4) Finally, it supports code reuse, since many functions across the pages can be factored out to build a generic code library.