Recognizing the enormous effort that manual UI testing requires, many organizations have automated their UI tests to speed up the development pipeline. To make automation pay off, we must approach it with the same mindset we bring to developing production code.
Automation tools provide the infrastructure for automated testing, but structuring the automation code itself falls to the automation engineer. The page object model (POM) has been widely accepted as an architectural style for structuring and building automation frameworks. The basic idea of a POM-based architecture is to model each page of the application as a class in the language used to implement the automation framework, such as C# or Java.
Advantages of POM-based architecture
A POM-based architecture has many advantages. First, it provides a clear separation between the automated test cases (TCs) and the automation code. Second, the TCs become much easier to write because they are expressed using the business domain language. Third, the TCs require no updates when the UI changes, such as re-arranging elements on the pages of the application, or even changing the properties used to locate them, such as changing ‘Find Element By Class’ to ‘Find Element By Id’. It does, however, take an upfront time investment to structure the framework and identify the application pages or components of the page to be modeled as page objects.
To gain a better understanding of a POM-based automation framework, let’s use a simple example of a test. A user logs in by entering ‘User name’ and ‘Password’ in their respective textboxes and then clicks on the ‘Login’ button. The user is then directed to a page displaying their work items.
We’ll contrast two approaches for automating the following test case. We assume that the input data is provided by an external data source to support a data-driven automation approach.
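Step 1: Navigate to home page. Expected result: Home page is displayed.
Step 2: Enter user name. Expected result: Home page is displayed.
Step 3: Enter password. Expected result: Home page is displayed.
Step 4: Click the ‘Login’ button. Expected result: Employee page is displayed.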
Approach 1: Accessing UI elements without a POM
In this approach, the automation code interacts directly with the pages of the application. It interacts with two pages, the ‘Home Page’ and the ‘Employee Page’. The test case locates the HTML elements on each of the pages and directly interacts with them. Here is the code for automating the test case using C#, where all HTML elements are located by “ID”.
// Navigate to the Home Page
BrowserWindow browser = new BrowserWindow();
browser.NavigateToUrl(new Uri(ApplicationUrl));

// Find textbox and enter user name
HtmlEdit userNameTextbox = new HtmlEdit(browser);
userNameTextbox.SearchProperties[HtmlEdit.PropertyNames.Id] = "TextBoxUserID";
userNameTextbox.DrawHighlight();
userNameTextbox.Text = UserName; // Employee.uName(TestContext);

// Find textbox and enter password
HtmlEdit passwordTextbox = new HtmlEdit(browser);
passwordTextbox.SearchProperties[HtmlEdit.PropertyNames.Id] = "PasswordID";
passwordTextbox.DrawHighlight();
passwordTextbox.Text = Password; // Employee.Password(TestContext);

// Find login button and click it
HtmlInputButton loginButton = new HtmlInputButton(browser);
loginButton.SearchProperties[HtmlInputButton.PropertyNames.Id] = "LoginID";
loginButton.DrawHighlight();
Mouse.Click(loginButton);
// Verify that the employee page is displayed
HtmlInputButton eNameBtn = new HtmlInputButton(browser);
eNameBtn.SearchProperties[HtmlInputButton.PropertyNames.Id] = "ButtonEmployeeNameID";
// Checking that the employee name button exists confirms that the employee page loaded
Assert.IsTrue(eNameBtn.Exists, "Employee page was not displayed.");
A major drawback of this approach is maintenance. If the properties used to locate a UI element, such as the ‘User name’ textbox, change, then every test case that uses that element must be updated. Another drawback is that the steps of the test case are mixed with the logic for locating and interacting with the HTML elements, which violates the “separation of concerns” principle.
Approach 2: Accessing UI elements using a POM
A better approach is to model the application pages using the POM architecture. By designing an interface, we abstract the mechanics of finding the HTML elements and provide an API to the tests. The POM resides in the middle between the test cases and the application pages. From the test case perspective, the model provides an abstraction of the functions that the test case uses, and from the application perspective, the POM implements the details of locating and interacting with the application pages.
This approach enforces the “separation of concerns” principle by separating “What” to test from “How” to implement the test. For example, a test case step such as ‘LoginWithUsernameAndPassword(user, password)’ expresses what to test, and it requires no changes even if the properties for locating the user name and password text boxes change.
We can rewrite our earlier test so that it focuses strictly on ‘What’ to test. All the details of interacting with the elements of the application page are abstracted away and moved to another class, which implements the ‘How’.
Here is the resulting version. It is much more concise, more readable, and more maintainable than the first approach.
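The listing below is a minimal sketch of how the rewritten test and its page objects could look, not the original framework code. The `IBrowser` interface, the `FakeBrowser` stand-in, the `HomePage.Open` and `IsDisplayed` members, and the placeholder URL are assumptions introduced so the example runs on its own; in a real framework, `IBrowser` would be backed by the element lookups from Approach 1. The element IDs are the ones used in the first approach.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over the UI automation tool (an assumption of this
// sketch). A real implementation would wrap the element lookups from Approach 1.
public interface IBrowser
{
    void NavigateTo(string url);
    void SetText(string elementId, string text);
    void Click(string elementId);
    bool Exists(string elementId);
}

// The "How": the only class that knows the home page's element IDs.
public class HomePage
{
    private const string UserNameId = "TextBoxUserID";
    private const string PasswordId = "PasswordID";
    private const string LoginButtonId = "LoginID";

    private readonly IBrowser browser;

    public HomePage(IBrowser browser) { this.browser = browser; }

    public static HomePage Open(IBrowser browser, string url)
    {
        browser.NavigateTo(url);
        return new HomePage(browser);
    }

    // Business-level step; returns the page object for the page the user lands on.
    public EmployeePage LoginWithUsernameAndPassword(string user, string password)
    {
        browser.SetText(UserNameId, user);
        browser.SetText(PasswordId, password);
        browser.Click(LoginButtonId);
        return new EmployeePage(browser);
    }
}

public class EmployeePage
{
    private const string EmployeeNameButtonId = "ButtonEmployeeNameID";
    private readonly IBrowser browser;

    public EmployeePage(IBrowser browser) { this.browser = browser; }

    public bool IsDisplayed() { return browser.Exists(EmployeeNameButtonId); }
}

// In-memory stand-in (hypothetical) so the sketch runs without a real browser.
public class FakeBrowser : IBrowser
{
    private readonly Dictionary<string, string> fields = new Dictionary<string, string>();
    private bool loggedIn;

    public void NavigateTo(string url) { loggedIn = false; fields.Clear(); }
    public void SetText(string elementId, string text) { fields[elementId] = text; }

    public void Click(string elementId)
    {
        // Simulate a successful login once both fields have been filled in.
        if (elementId == "LoginID")
            loggedIn = fields.ContainsKey("TextBoxUserID") && fields.ContainsKey("PasswordID");
    }

    public bool Exists(string elementId)
    {
        return elementId == "ButtonEmployeeNameID" && loggedIn;
    }
}

public static class LoginTest
{
    // The "What": the test reads like the steps of the test case.
    public static bool UserCanLogIn(IBrowser browser, string url, string user, string password)
    {
        HomePage homePage = HomePage.Open(browser, url);
        EmployeePage employeePage = homePage.LoginWithUsernameAndPassword(user, password);
        return employeePage.IsDisplayed();
    }

    public static void Main()
    {
        // Placeholder URL and credentials; a data-driven test would read them from its data source.
        bool passed = UserCanLogIn(new FakeBrowser(), "http://localhost/app", "jdoe", "secret");
        Console.WriteLine(passed ? "PASS" : "FAIL");
    }
}
```

If the ID of the user name textbox changes, only the `UserNameId` constant in `HomePage` needs to change; `UserCanLogIn` stays exactly as it is.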
The mapping from application pages to POM pages is largely an architectural decision. In many cases, it makes sense to map different components of an application page to their own page objects, then use class inheritance or class composition to construct other page objects. For example, you can map a footer, a header, and a navigation bar to their own page objects.
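As an illustration of this mapping, the sketch below models a navigation bar and a footer as their own component objects and composes them into a page object through a shared base class. It is only one possible layout: the `IBrowser` interface, the `RecordingBrowser` stand-in, the `BasePage` class, and the IDs `NavWorkItemsID` and `FooterCopyrightID` are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical driver abstraction; a real framework would back it with the automation tool.
public interface IBrowser
{
    void Click(string elementId);
    string ReadText(string elementId);
}

// Shared component: modeled once and reused by every page that shows it.
public class NavigationBar
{
    private readonly IBrowser browser;
    public NavigationBar(IBrowser browser) { this.browser = browser; }
    public void GoToWorkItems() { browser.Click("NavWorkItemsID"); } // hypothetical ID
}

public class Footer
{
    private readonly IBrowser browser;
    public Footer(IBrowser browser) { this.browser = browser; }
    public string CopyrightText() { return browser.ReadText("FooterCopyrightID"); } // hypothetical ID
}

// Inheritance gives every page the common layout; composition supplies the parts.
public abstract class BasePage
{
    protected readonly IBrowser Browser;
    public NavigationBar Navigation { get; }
    public Footer Footer { get; }

    protected BasePage(IBrowser browser)
    {
        Browser = browser;
        Navigation = new NavigationBar(browser);
        Footer = new Footer(browser);
    }
}

public class EmployeePage : BasePage
{
    public EmployeePage(IBrowser browser) : base(browser) { }
    public string EmployeeName() { return Browser.ReadText("ButtonEmployeeNameID"); }
}

// In-memory stand-in (hypothetical) that records clicks so the sketch runs without a browser.
public class RecordingBrowser : IBrowser
{
    public readonly List<string> Clicks = new List<string>();
    public void Click(string elementId) { Clicks.Add(elementId); }
    public string ReadText(string elementId) { return "text of " + elementId; }
}

public static class CompositionDemo
{
    public static void Main()
    {
        var browser = new RecordingBrowser();
        var page = new EmployeePage(browser);

        // The test reaches shared components through the page object.
        page.Navigation.GoToWorkItems();
        Console.WriteLine(string.Join(",", browser.Clicks));
        Console.WriteLine(page.Footer.CopyrightText());
    }
}
```

With this layout, a change to the navigation bar's locators touches only `NavigationBar`, no matter how many page objects include it.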
In summary, the page object model architecture offers many advantages when designing an automation framework. (1) It supports the separation of concerns principle, where the automation code is designed as an API to the automated test cases. (2) It supports a functional abstraction of the application pages, where all the inner details of the pages are expressed as separate objects. (3) It produces maintainable code, where UI changes that affect how elements are located are confined to the automation code, without affecting the automated test cases. (4) Finally, it supports code reuse, since many functions across the pages can be factored out to build a generic code library.