Algorithm Analysis

Asymptotic analysis is the backbone of algorithm analysis, it attempts to estimate the resource consumption of an algorithm by comparing the relative costs of two or more algorithms for solving the same problem. After reading this post, you should be familiar with three concepts: growth rate, upper and lowers bounds of a growth rate, and how to calculate the Running Time of an algorithm. But before we get to that, here is a recap of what you learned last week.

Recap: last week‘s material
Last week, we covered the main zest of data structures: what it is about and how we should utilize it to solve problems. Design patterns were briefly touched upon and good attributes any algorithm should have were covered as well.

ASYMPTOTIC ANALYSIS
Asymptotic Analysis is an estimation technique that measures the efficiency of an algorithm, or its implementation as the input size of a program increases. Generally, you will need to analyze the time required for an algorithm and the space required for a data structure. Here, is [The Big Five] which are a bunch of functions I will be elaborating on shortly: (I thought I’d mention them before I start as they are practically what this whole section is about; ps. Ignore the stuff in red for now)

1. f(n) < g(n)   |   f(n) = o(g(n)) ≡ ∀c > 0,   n0 > 0 ∋∀n ≥ n0,   0 ≤ f(n) ≤ cg(n)
2. f(n) ≤ g(n)   |   f(n) = O(g(n)) ≡ ∃c > 0,   n0 > 0 ∋∀n ≥ n0,     0 ≤ f(n) ≤ cg(n)
3. f(n) = g(n)   |   f(n) = Θ(g(n)) ≡ f(n) = O(g(n))   and   f(n) = Ω(g(n))
4. f(n) ≥ g(n)   |   f(n) = Ω(g(n)) ≡ ∃c > 0,   n0 > 0 ∋∀n ≥ n0,     0 ≤ cg(n) ≤ f(n)
5. f(n) > g(n)   |   f(n) = ω(g(n)) ≡ ∀c > 0,  ∃n0 > 0 ∋∀n ≥ n0,  0 ≤ cg(n) ≤ f(n)

Growth rate
The growth rate for an algorithm is the rate at which the cost of the algorithm grows as the size of its input grows. Checkout the table below (obtained from Princeton.edu)

 Complexity Description 1 – Constant growth does not depend on the input size. log N – Logarithmic growth gets slightly slower as N grows. N – Linear growth is optimal if you need to process N inputs. As N doubles, so does time. N log N – Linearithmic growth scales to huge problems. As N doubles, time more than doubles. N^2 – Quadratic growth practical for use only on relatively small problems. As N doubles, time increases four-fold. N^3 – Cubic growth is practical for use on only small problems. As N doubles, time increases eight-fold. 2^N – Exponential growth is not appropriate for practical use. As N doubles, time squares. N! – Factorial growth is worse than exponential. As N increases by 1, time increases by a factor of N.

Upper & Lower bounds + Θ notation
Various terms describe the running-time equation of an algorithm. These terms along with their associated symbols, indicate what aspect of the algorithm’s behavior is being described. One is the upper bound and another is the lower bound. When describing the upper bound of an algorithm, our statement should be about some class of inputs of size n on either the best-case, average-case, or worst-case scenario. An appropriate description would be: “this algorithm has an upper bound to its growth rate of n^2 in the average case.” However, the phrase “this algorithm has an upper bound to its growth rate of f(n)” is too long and is generally replaced with the big-Oh notationThe previous statement is now: O(n^2) in the average case”. Keep in mind that, the big-Oh notation gives us the running time of an algorithm in the worst case scenario.

The lower bound is the exact opposite of big-Oh, it is the lowest amount of some resource (usually time) that is required by an algorithm for some class of inputs of size n. The lower bound for an algorithm is denoted by the symbol Ω and is pronounced “big-Omega” or just “Omega.” On the other hand, when both the upper and lower bounds are the same within a constant factor, we indicate this by using Θ (big-Theta) notation. Lastly, in time complexity analysis we should always try to find the tight bound aka Θ (big-Theta) notation as it provides the best idea of the time taken.

o(g(n)) = f(n) ≡ ∀c > 0,   n0 > 0 ∋∀n ≥ n0,   0 ≤ f(n) ≤ cg(n)   //not important, just 4 ur info
Translation in layman’s terms: the set of functions o(g(n)) [pronounced as little-oh of g(n)] is defined to be the set of functions f(n) such that ‘n’ falls in the following conditions: There is any positive constant and some ‘no’ exists such that for all ‘n’ is bigger than ‘n0’, f(n) is no bigger than the constant times g(n). In other words, f(n) < g(n).

O(g(n)) = f(n) ≡ ∃c > 0,   n0 > 0 ∋∀n ≥ n0,     0 ≤ f(n) ≤ cg(n)
Translation in layman’s terms: the set of functions O(g(n)) is defined to be the set of functions f(n) such that ‘n’ falls in the following conditions: There exists some positive constant and some ‘no’ such that for all ‘n’ is bigger than ‘no’, f(n) is no bigger than the constant times g(n). In other words, f(n) ≤ g(n).

Θ(g(n)) =f(n) ≡ f(n) = O(g(n))   and   f(n) = Ω(g(n))
This is straight forward, the set of functions Θ(g(n)) is defined to be the set of functions f(n) when both the upper and lower bounds are the same within a constant factor. In other words, f(n) = g(n).

Ω(g(n)) = f(n) ≡ ∃c > 0,   n0 > 0 ∋∀n ≥ n0,     0 ≤ cg(n) ≤ f(n)
Translation in layman’s terms: the set of functions Ω(g(n)) is defined to be the set of functions f(n) such that ‘n’ falls in the following conditions: There exists some positive constant and some ‘no’ such that for all ‘n’ is bigger than ‘no’, f(n) is at least as big as the constant times g(n). In other words, f(n) ≥ g(n).

ω(g(n)) = f(n) ≡ ∀c > 0,  ∃n0 > 0 ∋∀n ≥ n0,  0 ≤ cg(n) ≤ f(n)   //not important, just 4 ur info
Translation in layman’s terms: the set of functions ω(g(n)) [pronounced as little-omega of g(n)] is defined to be the set of functions f(n) such that n falls in the following conditions: There is any positive constant and some ‘no’ exists such that for all ‘n’ is bigger than ‘no’, f(n) is at least as big as the constant times g(n). In other words, f(n) > g(n).

ALGORITHM PERFORMANCE
The primary consideration when estimating an algorithm’s performance is the number of basic operations required by the algorithm to process an input of a certain size (the number of inputs processed). Keep in mind that a basic operation must have the property that its time to complete does not depend on the particular values of its operands. Basic operations include: evaluating an expression, assigning a value to a variable, indexing into an array, calling a method, returning from a method, etc.

Generally, an algorithm may run faster on some inputs than it does on others of the same size. We might be eager to just average everything up and move on but that usually doesn’t work. An average case analysis requires that we understand how the actual inputs to the program (and their costs) are distributed with respect to the set of all possible inputs to the program.

Programmers usually stick with the worst case analysis because it is an indication that the algorithm must perform at least that well. Sticking with the average or best case is usually risky. This case is easier than an average case analysis since it requires only the ability to identify the worst case input, ultimately leading to better algorithms. The best case analysis on the other hand, is used to describe the way an algorithm behaves under optimal conditions (not very realistic).

RUNNING TIME OF AN ALGORITHM
Generally, the running time of any algorithm depends upon a bunch of factors. Like, whether your machine is a single processor machine or a multiple processor machine. The generation of the machine also counts and whether it is 32-bit or 64-bit. Another factor is the read or write speed onto the memory/disk of your machine. Lastly, it also depends upon the kind of input you are giving to your program, which is the main factor. All you should be concerned about is how your program behaves for various inputs or how the time taken by the program grows. Hence, we are interested in the rate of growth taken with respect to the input.

In order to estimate the running time of an algorithm you need to read each line of your given code segment and assign the cost for each operation and mention the number of times that particular operation will repeat. You will see what I mean, from the example on Matrix Multiplication:

 Algorithm int n = A.length; for (int i = 0; i < n; i++) { for (int j = 0; j < n; j++) { sum = 0; for k = 0; k < n; k++) sum = sum + A[i][k]*B[k][j]; C[i][j] = sum; } } Cost × c0 c1 c2 c3 c4 c5 c6 # of Repetition × 1 time × n time × n*n time × n*n time × n*n*n time × n*n*n time × n*n time Total Number of operations: = c0 + c1*n + (c2+c3+c6)*n*n + (c4+c5)*n*n*n = O(n^3)

Time Complexity Analysis – General Rules
We analyze time complexity in most cases for very large input-size and worst case scenario. For instance, if you want to get the big Oh notation for a function, let’s say a polynomial. Rule 1: you will need to drop all the lower terms. Rule 2: you will need to drop the constant multiplier as well. Example: T(n) = 17n^4 + 3n^3 + 4n + 8, all terms will be insignificant for higher values of n; hence, your big Oh is O(n^4). The same thing applies with ‘log’. For instance, T(n) = 16n + log n = O(n).

Moreover, you can calculate the running time of an algorithm by summing up the running times of the fragments of the program; which brings us to Rule 3: Running time = ∑ running time of all fragments. Example:

 int a; a=5 a++;Simple statements fragment 1 = O(1) for (i=o; i < n; i++) { //Simple statements } Single loop fragment 2 = O(n) for (i=o; i < n; i++) { for (j=o; j < n; j++) { //Simple statements }}Nested loop Fragment 3 = O(n^2) Now, just add all the fragments. T(n) = O(1) +O(n) + O(n^2) = O(n^2)

However, when we have some conditional statements, like an if and else control structure. One fragment of the code has a particular growth rate of O(n) for instance, while another fragment has a growth rate of O(n^2). Will it still be valid to sum all fragments? Example:

Explanation: When we have some conditional statements, the program goes like: if some condition is true then we have a single loop with a time complexity of O(n) and if the condition is not true, then we will have a nested double loop with a time complexity of O(n^2). Now if the control of the program goes to the first part, then it will execute with O(n); but, if the control goes to the else part, then it will execute with O(n^2). Regardless of which part executes, we always follow the worst case scenario, which is T(n) = O(n^2) in this example. Hence, in the case of conditional statements, you don’t just simply add the fragments, but pickup the maximum of the two. Rule 4: For conditional statements, pick the complexity of the condition which is the worst case.

You have reached the end of this blog post. Thanks for reading, I will leave you now with some recurrence relations to remember, have a great day!

 Recurrence Relations Recurrence Algorithm Big-Oh Solution T(n) = T(n/2) + O(1) Binary Search O(log n) T(n) = T(n-1) + O(1) Sequential Search O(n) T(n) = 2*T(n/2) + O(1) Tree Traversal O(n) T(n) = T(n-1) + O(n) Selection Sort O(n^2) T(n) = 2*T(n/2) + O(n) Merge Sort O(n log n)

Checklist aka Things you should [kinda] know after reading this post:

• Basic concepts of Asymptotic analysis
• Big O (O), Big Omega (Ω), Big Theta (Θ)
• The best, worst, and average case performance of a particular algorithm
• Calculating the Running Time of an Algorithm

References:
[Textbook] Shaffer C.A (2011). A Practical Introduction to Data Structure and Algorithm analysis. Retrieved from http://courses.cs.vt.edu/cs3114/Spring09/book.pdf
[Videos] Asymptotic Definitions Part 1 by David TaylorAsymptotic Definitions Part 2 by David TaylorAlgorithm Analysis for Different Control Structures by S. Saurabh | Big-O notation by Jamal Thorne | Time complexity analysis by MyCodeSchool
[Webpages] Analysis of Algorithms (Java) | Analysis of Algorithms by Daisy Tang
[
Lecture Slides] Asymptotic Running Time of Algorithms

Background Story: Hey, I’m Yasmin, a soon-to-be junior CS Major at UoPeople with 80 credit hours down. I will be using my blog as a platform to help me study and share what I learn on a weekly basis with my fellow classmates and readers. Follow the Studying CS at UoPeople section for weekly blog posts published every Friday Saturday. Please note, this series will not interfere with the weekly Monday posts. Thanks for reading. Cheers!