How to Use Windows Functions in SQL to Analyze Data Like a Pro(10)
Windows functions are a powerful feature of SQL that allow you to perform calculations or transformations on a set of rows that are related to the current row. They are also known as analytical functions or windowing functions. In this blog post, I will explain what windows functions are, how they work, and how you can use them to solve common data analysis problems. I will also show you some examples of windows functions using the Row_Number(), Rank(), Dense_Rank(), and NTILE() functions. This is part of my SQL Learning series, where I share my tips and tricks on how to master SQL. If you want to learn more, you can follow me on Medium and LinkedIn, and check out my other posts on SQL topics.
What are Windows Functions?
Windows functions are a type of function that operate on a subset of rows, called a window, that are related to the current row. A window can be defined by using the OVER clause, which specifies how to partition and order the rows in the window. For example, you can define a window that partitions the rows by a certain column, such as customer_id, and orders them by another column, such as order_date. This way, you can apply a windows function to each group of rows that have the same customer_id, and within each group, the rows are ordered by order_date.
Windows functions can be used to perform various calculations or transformations on the rows in the window, such as:
- Aggregations: You can use windows functions to calculate the sum, average, count, min, max, or other aggregate functions on the rows in the window, without collapsing the rows into a single group. This allows you to retain the original detail of the rows, and compare the aggregated values with the individual values.
- Rankings: You can use windows functions to assign ranks or numbers to the rows in the window, based on their order or value. This allows you to identify the top or bottom rows, or divide the rows into groups or percentiles.
- Offsets: You can use windows functions to access the values of the previous or next rows in the window, or calculate the difference or ratio between them. This allows you to perform calculations that depend on the relative position of the rows, such as moving averages, cumulative sums, or growth rates.
How to Use Windows Functions?
To use a windows function, you need to specify the function name, the arguments (if any), and the OVER clause. The OVER clause consists of two optional parts: the PARTITION BY clause and the ORDER BY clause. The PARTITION BY clause defines how to split the rows into groups, or partitions, that have the same values for the specified columns. The ORDER BY clause defines how to sort the rows within each partition, by the specified columns. You can also use the ROWS or RANGE clause to further limit the window to a certain number of rows before or after the current row, but this is beyond the scope of this post.
The syntax of a windows function is as follows:
SQL
function_name(arguments) OVER (
PARTITION BY partition_columns
ORDER
For example, the following query uses the SUM() windows function to calculate the total amount of orders for each customer, and the ROW_NUMBER() windows function to assign a sequential number to each order within each customer, ordered by the order date.
SQL
SELECT customer_id, order_id, order_date, amount,
SUM(amount) OVER (PARTITION BY customer_id) AS total_amount,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS order_number
FROM orders;
The result of this query might look something like this:
Table:

As you can see, the windows functions allow you to perform calculations that are based on the context of the current row, and compare the values across different rows.
Examples of Windows Functions:

Now that you have a basic understanding of what windows functions are and how they work, let’s look at some examples of how you can use them to solve common data analysis problems. I will use the following sample table of employee data for the examples:

1)Row_Number():
The Row_Number() function assigns a sequential number to each row in the window, starting from 1. You can use this function to rank the rows by a certain order, or to generate unique identifiers for the rows.
For example, the following query uses the Row_Number() function to rank the employees by their salary, within each department.
SQL
SELECT employee_id, name, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees;
The result of this query might look something like this:

As you can see, the Row_Number() function assigns a unique rank to each employee, based on their salary and department.
2)Rank():
The Rank() function assigns a rank to each row in the window, based on the order of the values in the specified column. The rank of a row is equal to the number of rows that have a higher or equal value than the current row. If two or more rows have the same value, they are assigned the same rank, and the next rank is skipped.
For example, the following query uses the Rank() function to rank the employees by their salary, within each department.
SQL
SELECT employee_id, name, department, salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees;
The result of this query might look something like this:

As you can see, the Rank() function assigns the same rank to employees who have the same salary, and skips the next rank accordingly.
3)Dense_Rank():
The Dense_Rank() function is similar to the Rank() function, except that it does not skip any ranks when there are ties. The rank of a row is equal to the number of distinct values that are higher or equal than the current value.
For example, the following query uses the Dense_Rank() function to rank the employees by their salary, within each department.
SQL
SELECT employee_id, name, department, salary,
DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees;
The result of this query might look something like this:

As you can see, the Dense_Rank() function assigns the same rank to employees who have the same salary, but does not skip any ranks when there are ties.
4)NTILE():
The NTILE() function divides the rows in the window into a specified number of groups, or tiles, and assigns each row a tile number, starting from 1. You can use this function to split the rows into equal or approximately equal groups, such as quartiles, quintiles, deciles, etc.
For example, the following query uses the NTILE() function to divide the employees into four groups, or quartiles, based on their salary, within each department.
SQL
SELECT employee_id, name, department, salary,
NTILE(4) OVER (PARTITION BY department ORDER BY salary DESC) AS salary_quartile
FROM employees;
The result of this query might look something like this:

As you can see, the NTILE() function assigns each employee a quartile number, based on their salary and department. The first quartile contains the highest salaries, and the fourth quartile contains the lowest salaries.
Conclusion:
Windows functions are a powerful and versatile feature of SQL that allow you to perform complex calculations or transformations on a set of rows that are related to the current row. They can help you solve many common data analysis problems, such as aggregations, rankings, offsets, and more. In this blog post, I showed you some examples of how to use windows functions, such as Row_Number(), Rank(), Dense_Rank(), and NTILE(), to analyze data like a pro. I hope you found this post useful and interesting. If you did, please like, share, and clap for this post, and follow me on here and LinkedIn for more SQL tips and tricks. Also, if you want to learn more about SQL, you can check out the following resources:
- LearnSQL: Practice & Test Skills A to Z — A comprehensive online course that covers everything you need to know about SQL, from basic to advanced topics, with interactive exercises and quizzes.
- Designgurus: One-Stop Portal For Tech Interviews.

- SQL for Students — Get my free ebook SQL for Students on Gumroad by liking and commenting below.
Thank you for reading, and happy learning! ๐
“Data is the new oil. It’s valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value.” — Clive Humby
If you enjoyed this post, please leave a comment below and let me know what you think. Also, if you have any questions or suggestions for future topics, feel free to ask me. I would love to hear from you. Until next time, keep learning and keep coding! ๐
Comments
Post a Comment