
You are learning Conditional Formatting in MS Excel

How do I create conditional formatting rules to identify outliers in a dataset?

There are two main approaches to creating conditional formatting rules for identifying outliers in Excel:

1. Using Standard Deviation:

This method identifies data points that fall outside a certain number of standard deviations from the mean (average).

* Steps:
1. Select the data range you want to analyze.
2. Go to the "Home" tab and click "Conditional Formatting" in the Styles group.
3. Choose "New Rule" from the dropdown menu.
4. In the "New Formatting Rule" window, select "Format only values that are above or below average" under "Select a Rule Type."
5. In the "Format values that are" dropdown menu, choose a standard deviation option. For example, "2 std dev above" highlights values beyond 2 standard deviations above the mean (upper outliers), and "2 std dev below" highlights values beyond 2 standard deviations below it (lower outliers). The built-in options cover 1, 2, or 3 standard deviations.
6. Click "Format" to choose the formatting style you want to apply to outlier cells (e.g., red fill color).
7. Click "OK" on all open windows.
8. Repeat steps 3-7 with the opposite option if you also want to highlight outliers on the other side of the mean.
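
To see which values a 2-standard-deviation rule should flag, the threshold arithmetic can be sketched outside Excel. The Python below is only a cross-check, not part of the Excel workflow: the sample data and the helper name `std_dev_outliers` are made up for illustration, and it assumes the sample standard deviation (the equivalent of `STDEV.S`), which may differ slightly from what Excel's built-in rule uses.

```python
from statistics import mean, stdev


def std_dev_outliers(values, k=2):
    """Return (low, high): values more than k standard deviations from the mean."""
    avg = mean(values)
    sd = stdev(values)  # assumption: sample standard deviation, like STDEV.S
    lower, upper = avg - k * sd, avg + k * sd
    low = [v for v in values if v < lower]
    high = [v for v in values if v > upper]
    return low, high


# Hypothetical data standing in for the selected range
sample = [12, 14, 15, 15, 16, 17, 18, 19, 20, 95]
low, high = std_dev_outliers(sample, k=2)
print("low outliers:", low)
print("high outliers:", high)
```

With this sample, only 95 lies beyond 2 standard deviations of the mean, so it is the only cell the rule would be expected to highlight.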

2. Using Interquartile Range (IQR):

This method focuses on identifying outliers based on the quartiles of your data set.

* Steps:
1. Calculate the Quartiles:
- In an empty cell outside your data (for example, C1), enter the formula `=QUARTILE.INC($A$1:$A$20,1)` (replace $A$1:$A$20 with your actual data range) to find the first quartile (Q1). In the cell below it (C2), enter `=QUARTILE.INC($A$1:$A$20,3)` to find the third quartile (Q3). The absolute references keep the range from shifting if you copy the formula. The cell addresses C1 to C5 used in these steps are only examples.
2. Calculate the Interquartile Range (IQR):
- In another empty cell (C3), enter the formula `=C2-C1`, which is Q3 minus Q1. Reference the cells that hold the quartiles: typed literally, `Q3` and `Q1` are the addresses of cells in column Q.
3. Calculate the Upper and Lower Fences:
- In two separate cells (C4 and C5), enter the formulas `=C1-1.5*C3` and `=C2+1.5*C3` to calculate the lower and upper fences, respectively. These represent the boundaries beyond which data points are considered outliers.
4. Select the data range you want to analyze.
5. Apply Conditional Formatting:
- Go to the "Home" tab, click "Conditional Formatting," choose "New Rule," and select "Format only cells that contain." Under "Format only cells with," choose "Cell Value" and "less than," then enter a reference to the lower fence cell (e.g., `=$C$4`). Click "Format" to choose a style, then click "OK." Repeat the process for upper outliers using "greater than" and the upper fence cell (e.g., `=$C$5`).
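
The fence calculation above can be checked outside Excel with a short Python sketch. It is illustrative only: the sample data and the helper name `iqr_fences` are hypothetical, and `method="inclusive"` is used on the assumption that it mirrors the interpolation of `QUARTILE.INC`.

```python
from statistics import quantiles


def iqr_fences(values, multiplier=1.5):
    """Return (lower_fence, upper_fence) based on the interquartile range."""
    # quantiles() with n=4 returns the three cut points Q1, median, Q3
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr


# Hypothetical data standing in for the worksheet range
sample = [12, 14, 15, 15, 16, 17, 18, 19, 20, 95]
lower_fence, upper_fence = iqr_fences(sample)
outliers = [v for v in sample if v < lower_fence or v > upper_fence]
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

For this sample, Q1 is 15 and Q3 is 18.75, so the fences are 9.375 and 24.375, and 95 is the only value outside them.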

Choosing the Right Method:

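- Standard Deviation is a simpler approach, but the mean and standard deviation are themselves pulled by extreme values, so in skewed or small data sets a large outlier can widen the threshold and let other outliers go unflagged.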
- IQR is more robust for skewed data as it focuses on the middle half of your data distribution.
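
A small made-up example can show the difference. In the Python sketch below (an illustration outside Excel, with hypothetical data and the sample standard deviation assumed), one extreme value inflates the standard deviation enough that a second, milder outlier stays inside the 2-standard-deviation band, while the IQR fences still catch both.

```python
from statistics import mean, stdev, quantiles

# Hypothetical right-skewed data with two unusually large values
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20, 60]

# Standard deviation method: flag values beyond 2 standard deviations
avg, sd = mean(data), stdev(data)  # assumption: sample standard deviation
sd_flagged = [v for v in data if abs(v - avg) > 2 * sd]

# IQR method: flag values beyond 1.5 * IQR outside the quartiles
q1, _median, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1
iqr_flagged = [v for v in data if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print("standard deviation method:", sd_flagged)
print("IQR method:", iqr_flagged)
```

Here the standard deviation method flags only 60, because that value alone pushes the upper threshold past 20. The IQR method flags both 20 and 60.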

Additional Tips:

- You can use different formatting styles for upper and lower outliers for easier visual distinction.
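- If you keep a threshold (such as the number of standard deviations or the 1.5 IQR multiplier) in its own cell and reference it from your formulas, consider using data validation on that cell to prevent invalid criteria.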
- Remember, these methods are starting points. You can adjust the thresholds or explore more advanced conditional formatting techniques based on your specific needs.
